[linux-yocto] [PATCH] tracing/x86: Save CR2 before tracing irqsoff on error_entry
He Zhe
zhe.he at windriver.com
Mon Apr 29 19:41:12 PDT 2019
On 4/30/19 3:37 AM, Bruce Ashfield wrote:
>
>
> On Mon, Apr 29, 2019 at 8:34 AM Bruce Ashfield <bruce.ashfield at gmail.com <mailto:bruce.ashfield at gmail.com>> wrote:
>
>
>
> On Sun, Apr 28, 2019 at 6:50 AM He Zhe <zhe.he at windriver.com <mailto:zhe.he at windriver.com>> wrote:
>
>
>
> On 4/26/19 11:14 PM, Bruce Ashfield wrote:
> >
> >
> > On Wed, Apr 24, 2019 at 9:38 PM He Zhe <zhe.he at windriver.com <mailto:zhe.he at windriver.com> <mailto:zhe.he at windriver.com <mailto:zhe.he at windriver.com>>> wrote:
> >
> >
> >
> > On 4/24/19 8:34 PM, Bruce Ashfield wrote:
> > >
> > >
> > > On Wed, Apr 24, 2019 at 3:47 AM He Zhe <zhe.he at windriver.com <mailto:zhe.he at windriver.com> <mailto:zhe.he at windriver.com <mailto:zhe.he at windriver.com>> <mailto:zhe.he at windriver.com <mailto:zhe.he at windriver.com> <mailto:zhe.he at windriver.com <mailto:zhe.he at windriver.com>>>> wrote:
> > >
> > > This is for standard/base and all sub-level branches. For explanation, see the
> > > bottom of the commit log I append.
> > >
> > >
> > > Which kernel versions ? I didn't notice a version it the shortlog or temporary section, but I may have overlooked it.
> >
> > >From 4.19 to 5.0
> >
> >
> > Thanks, this is now merged.
>
> This is missing on v5.0/standard/intel-x86.
>
>
> Not in my tree, but I'll double check as I'm merging more changes later today.
>
>
> I confirmed that this really is in the branches that I pushed, are you really not seeing commit b470fb994ebd5b9ae8e520c85029449bf847289c in that branch ?
Oops, I see it now after re-sync, there must be something wrong in the sync process back then. Thanks.
Zhe
>
> https://git.yoctoproject.org/cgit/cgit.cgi/linux-yocto/commit/?h=v5.0/standard/intel-x86&id=b470fb994ebd5b9ae8e520c85029449bf847289c
>
> Bruce
>
>
>
>
> Bruce
>
>
>
> Zhe
>
> >
> > Bruce
> >
> >
> >
> > Zhe
> >
> > >
> > > Bruce
> > >
> > >
> > >
> > >
> > > Zhe
> > >
> > > On 4/24/19 3:42 PM, zhe.he at windriver.com <mailto:zhe.he at windriver.com> <mailto:zhe.he at windriver.com <mailto:zhe.he at windriver.com>> <mailto:zhe.he at windriver.com <mailto:zhe.he at windriver.com> <mailto:zhe.he at windriver.com <mailto:zhe.he at windriver.com>>> wrote:
> > > > From: "Steven Rostedt (VMware)" <rostedt at goodmis.org <mailto:rostedt at goodmis.org> <mailto:rostedt at goodmis.org <mailto:rostedt at goodmis.org>> <mailto:rostedt at goodmis.org <mailto:rostedt at goodmis.org> <mailto:rostedt at goodmis.org <mailto:rostedt at goodmis.org>>>>
> > > >
> > > > He Zhe reported a crash by enabling trace events and selecting
> > > > "userstacktrace" which will read the stack of userspace for every trace
> > > > event recorded. Zhe narrowed it down to:
> > > >
> > > > c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage")
> > > >
> > > > With the problem config, I was able to also reproduce the error. I
> > > > narrowed it down to just having to do the following:
> > > >
> > > > # cd /sys/kernel/tracing
> > > > # echo 1 > options/userstacktrace
> > > > # echo 1 > events/preemptirq/irq_disable/enable
> > > >
> > > > And sure enough, I triggered a crash. Well, it was systemd crashing
> > > > with a bad memory access??
> > > >
> > > > systemd-journal[537]: segfault at ed8cb8 ip 00007f7fffc9fef5 sp 00007ffc4062cb10 error 7
> > > >
> > > > And it would crash similarly each time I tried it, but always at a
> > > > different place. After spending the day on this, I finally figured it
> > > > out. The bug is happening in entry_64.S right after error_entry.
> > > > There's two TRACE_IRQS_OFF in that code path, which if I comment out,
> > > > the bug goes away. Then it dawned on me that the crash always happens
> > > > when systemd does a normal page fault. We had this bug before, and it
> > > > was with the exception trace points.
> > > >
> > > > The issue is that a tracepoint can fault (reading vmalloc or whatever).
> > > > And doing a userspace stack trace most definitely will fault. But if we
> > > > are coming from a legitimate page fault, the address of that fault (in
> > > > the CR2 register) will be lost if we fault before we get to the page
> > > > fault handler. That's exactly what is happening.
> > > >
> > > > To solve this, a TRACE_IRQS_OFF_CR2 (and ON for consistency) was added
> > > > that saves the CR2 register. A new trace_hardirqs_off_thunk_cr2 is
> > > > created that stores the cr2 register, calls the
> > > > trace_hardirqs_off_caller, then on return restores the cr2 register if
> > > > it changed, before returning.
> > > >
> > > > On my tests this fixes the issue. I just want to know if this is a
> > > > legitimate fix or if someone can come up with a better fix?
> > > >
> > > > Note: this also saves the exception context just like the
> > > > do_page_fault() function does.
> > > >
> > > > Note2: This only gets enabled when lockdep or irq tracing is enabled,
> > > > which is not recommended for production environments.
> > > >
> > > > Link: http://lkml.kernel.org/r/897cf5cf-fc24-8a64-cb28-847f2d2e63d2@windriver.com
> > > >
> > > > Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage")
> > > > Signed-off-by: Steven Rostedt (VMware) <rostedt at goodmis.org <mailto:rostedt at goodmis.org> <mailto:rostedt at goodmis.org <mailto:rostedt at goodmis.org>> <mailto:rostedt at goodmis.org <mailto:rostedt at goodmis.org> <mailto:rostedt at goodmis.org <mailto:rostedt at goodmis.org>>>>
> > > >
> > > > Link: https://lore.kernel.org/lkml/20190320221534.165ab87b@oasis.local.home/
> > > >
> > > > This might not be the final solution. But the upstream thread has stopped over
> > > > a month and there is unlikely a final solution in the near future.
> > > >
> > > > Since the diff looks quite clear and does not affect other functions. It should
> > > > be worth adding this initial patch from the maintainer.
> > > >
> > > > Signed-off-by: He Zhe <zhe.he at windriver.com <mailto:zhe.he at windriver.com> <mailto:zhe.he at windriver.com <mailto:zhe.he at windriver.com>> <mailto:zhe.he at windriver.com <mailto:zhe.he at windriver.com> <mailto:zhe.he at windriver.com <mailto:zhe.he at windriver.com>>>>
> > > > ---
> > > > arch/x86/entry/common.c | 26 ++++++++++++++++++++++++++
> > > > arch/x86/entry/entry_64.S | 4 ++--
> > > > arch/x86/entry/thunk_64.S | 2 ++
> > > > arch/x86/include/asm/irqflags.h | 4 ++++
> > > > 4 files changed, 34 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> > > > index 7bc105f..7edffec 100644
> > > > --- a/arch/x86/entry/common.c
> > > > +++ b/arch/x86/entry/common.c
> > > > @@ -292,6 +292,32 @@ __visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
> > > >
> > > > syscall_return_slowpath(regs);
> > > > }
> > > > +
> > > > +extern void trace_hardirqs_on_caller(unsigned long caller_addr);
> > > > +__visible void trace_hardirqs_on_caller_cr2(unsigned long caller_addr)
> > > > +{
> > > > + unsigned long address = read_cr2(); /* Get the faulting address */
> > > > + enum ctx_state prev_state;
> > > > +
> > > > + prev_state = exception_enter();
> > > > + trace_hardirqs_on_caller(caller_addr);
> > > > + if (address != read_cr2())
> > > > + write_cr2(address);
> > > > + exception_exit(prev_state);
> > > > +}
> > > > +
> > > > +extern void trace_hardirqs_off_caller(unsigned long caller_addr);
> > > > +__visible void trace_hardirqs_off_caller_cr2(unsigned long caller_addr)
> > > > +{
> > > > + unsigned long address = read_cr2(); /* Get the faulting address */
> > > > + enum ctx_state prev_state;
> > > > +
> > > > + prev_state = exception_enter();
> > > > + trace_hardirqs_off_caller(caller_addr);
> > > > + if (address != read_cr2())
> > > > + write_cr2(address);
> > > > + exception_exit(prev_state);
> > > > +}
> > > > #endif
> > > >
> > > > #if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION)
> > > > diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> > > > index 1f0efdb..73ddf24 100644
> > > > --- a/arch/x86/entry/entry_64.S
> > > > +++ b/arch/x86/entry/entry_64.S
> > > > @@ -1248,12 +1248,12 @@ ENTRY(error_entry)
> > > > * we fix gsbase, and we should do it before enter_from_user_mode
> > > > * (which can take locks).
> > > > */
> > > > - TRACE_IRQS_OFF
> > > > + TRACE_IRQS_OFF_CR2
> > > > CALL_enter_from_user_mode
> > > > ret
> > > >
> > > > .Lerror_entry_done:
> > > > - TRACE_IRQS_OFF
> > > > + TRACE_IRQS_OFF_CR2
> > > > ret
> > > >
> > > > /*
> > > > diff --git a/arch/x86/entry/thunk_64.S b/arch/x86/entry/thunk_64.S
> > > > index be36bf4..1300b53 100644
> > > > --- a/arch/x86/entry/thunk_64.S
> > > > +++ b/arch/x86/entry/thunk_64.S
> > > > @@ -41,6 +41,8 @@
> > > > #ifdef CONFIG_TRACE_IRQFLAGS
> > > > THUNK trace_hardirqs_on_thunk,trace_hardirqs_on_caller,1
> > > > THUNK trace_hardirqs_off_thunk,trace_hardirqs_off_caller,1
> > > > + THUNK trace_hardirqs_on_thunk_cr2,trace_hardirqs_on_caller_cr2,1
> > > > + THUNK trace_hardirqs_off_thunk_cr2,trace_hardirqs_off_caller_cr2,1
> > > > #endif
> > > >
> > > > #ifdef CONFIG_DEBUG_LOCK_ALLOC
> > > > diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
> > > > index 058e40f..dd51174 100644
> > > > --- a/arch/x86/include/asm/irqflags.h
> > > > +++ b/arch/x86/include/asm/irqflags.h
> > > > @@ -172,9 +172,13 @@ static inline int arch_irqs_disabled(void)
> > > > #ifdef CONFIG_TRACE_IRQFLAGS
> > > > # define TRACE_IRQS_ON call trace_hardirqs_on_thunk;
> > > > # define TRACE_IRQS_OFF call trace_hardirqs_off_thunk;
> > > > +# define TRACE_IRQS_ON_CR2 call trace_hardirqs_on_thunk_cr2;
> > > > +# define TRACE_IRQS_OFF_CR2 call trace_hardirqs_off_thunk_cr2;
> > > > #else
> > > > # define TRACE_IRQS_ON
> > > > # define TRACE_IRQS_OFF
> > > > +# define TRACE_IRQS_ON_CR2
> > > > +# define TRACE_IRQS_OFF_CR2
> > > > #endif
> > > > #ifdef CONFIG_DEBUG_LOCK_ALLOC
> > > > # ifdef CONFIG_X86_64
> > >
> > >
> > >
> > > --
> > > - Thou shalt not follow the NULL pointer, for chaos and madness await thee at its end
> > > - "Use the force Harry" - Gandalf, Star Trek II
> > >
> >
> >
> >
> > --
> > - Thou shalt not follow the NULL pointer, for chaos and madness await thee at its end
> > - "Use the force Harry" - Gandalf, Star Trek II
> >
>
>
>
> --
> - Thou shalt not follow the NULL pointer, for chaos and madness await thee at its end
> - "Use the force Harry" - Gandalf, Star Trek II
>
>
>
> --
> - Thou shalt not follow the NULL pointer, for chaos and madness await thee at its end
> - "Use the force Harry" - Gandalf, Star Trek II
>
More information about the linux-yocto
mailing list