[linux-yocto] [PATCH 1/1] xilinx-zyqn: Move disable_nonboot_cpus() in front of local_irq_disable()

Quanyang Wang quanyang.wang at windriver.com
Thu Oct 24 04:57:52 PDT 2019


Hi Michal,

On 10/23/19 6:55 PM, Michal Simek wrote:
> On 21. 10. 19 14:13, qwang2 wrote:
>> Hi Michal,
>>
>> On 2019/10/21 6:45 PM, Michal Simek wrote:
>>> On 21. 10. 19 10:45, Quanyang Wang wrote:
>>>> Hi Michal,
>>>>
>>>> On 10/21/19 4:16 PM, Michal Simek wrote:
>>>>> On 21. 10. 19 7:50, quanyang.wang at windriver.com wrote:
>>>>>> From: Quanyang Wang <quanyang.wang at windriver.com>
>>>>>>
>>>>>> When running kdump with CONFIG_DEBUG_PREEMPT enabled, the following
>>>>>> call trace appears:
>>>>>>
>>>>>> BUG: using smp_processor_id() in preemptible [00000000] code: sh/303
>>>>>> caller is machine_crash_shutdown+0x2c/0xe8
>>>>>> CPU: 0 PID: 303 Comm: sh Kdump: loaded Not tainted
>>>>>> 5.2.20-yocto-standard #1
>>>>>> Hardware name: Xilinx Zynq Platform
>>>>>> [<80112ff4>] (unwind_backtrace) from [<8010ca4c>]
>>>>>> (show_stack+0x18/0x1c)
>>>>>> [<8010ca4c>] (show_stack) from [<809b000c>] (dump_stack+0x70/0x8c)
>>>>>> [<809b000c>] (dump_stack) from [<80549a14>]
>>>>>> (debug_smp_processor_id+0xd4/0x118)
>>>>>> [<80549a14>] (debug_smp_processor_id) from [<80111428>]
>>>>>> (machine_crash_shutdown+0x2c/0xe8)
>>>>>> [<80111428>] (machine_crash_shutdown) from [<801afe24>]
>>>>>> (__crash_kexec+0x70/0xd0)
>>>>>> [<801afe24>] (__crash_kexec) from [<801259b4>] (panic+0x110/0x324)
>>>>>> [<801259b4>] (panic) from [<805f7018>] (sysrq_handle_crash+0x18/0x1c)
>>>>>> [<805f7018>] (sysrq_handle_crash) from [<805f7584>]
>>>>>> (__handle_sysrq+0x9c/0x14c)
>>>>>> [<805f7584>] (__handle_sysrq) from [<805f79e8>]
>>>>>> (write_sysrq_trigger+0x5c/0x6c)
>>>>>> [<805f79e8>] (write_sysrq_trigger) from [<8031e850>]
>>>>>> (proc_reg_write+0x78/0x8c)
>>>>>> [<8031e850>] (proc_reg_write) from [<802b1b28>] (vfs_write+0xc0/0x154)
>>>>>> [<802b1b28>] (vfs_write) from [<802b2a64>] (ksys_write+0x6c/0xd4)
>>>>>> [<802b2a64>] (ksys_write) from [<80101000>]
>>>>>> (ret_fast_syscall+0x0/0x54)
>>>>>> Exception stack(0xba157fa8 to 0xba157ff0)
>>>>>> 7fa0: 00000002 005ab930 00000001 005ab930 00000002 00000000
>>>>>> 7fc0: 00000002 005ab930 76fa2290 00000004 76f3d124 76f3cc8c 00000000
>>>>>> 00000000
>>>>>> 7fe0: 00000004 7edec940 76edbfff 76e67d16
>>>>>>
>>>>>> This is because disable_nonboot_cpus() is called to make sure that
>>>>>> the crash kernel runs on the boot CPU (cpu0), and it re-enables local
>>>>>> IRQs through the following call chain:
>>>>>>
>>>>>> disable_nonboot_cpus
>>>>>>     -> freeze_secondary_cpus
>>>>>>      -> _cpu_down
>>>>>>       -> percpu_down_write
>>>>>>        -> rcu_sync_enter
>>>>>>         -> spin_unlock_irq(&rsp->rss_lock)
>>>>>>          -> local_irq_enable()
>>>>>>
>>>>>> As a result, the code that runs after disable_nonboot_cpus(),
>>>>>> including smp_processor_id(), executes with IRQs enabled in a
>>>>>> preemptible context, which triggers the call trace.
>>>>>>
>>>>>> So move disable_nonboot_cpus() in front of local_irq_disable() to
>>>>>> avoid this, since disable_nonboot_cpus() does not need to run in
>>>>>> atomic context.
>>>>>>
>>>>>> Signed-off-by: Quanyang Wang <quanyang.wang at windriver.com>
>>>>>> ---
>>>>>>     arch/arm/kernel/machine_kexec.c | 3 ++-
>>>>>>     1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/arch/arm/kernel/machine_kexec.c
>>>>>> b/arch/arm/kernel/machine_kexec.c
>>>>>> index 654f2b1f9ac0..83d2025a4ab1 100644
>>>>>> --- a/arch/arm/kernel/machine_kexec.c
>>>>>> +++ b/arch/arm/kernel/machine_kexec.c
>>>>>> @@ -145,9 +145,10 @@ static void machine_kexec_mask_interrupts(void)
>>>>>>       void machine_crash_shutdown(struct pt_regs *regs)
>>>>>>     {
>>>>>> -    local_irq_disable();
>>>>>>         disable_nonboot_cpus();
>>>>>>     +    local_irq_disable();
>>>>>> +
>>>>>>         crash_smp_send_stop();
>>>>>>           crash_save_cpu(regs, smp_processor_id());
>>>>>>
>>>>> OK. Before this, can you please check whether your use cases work
>>>>> without disable_nonboot_cpus()? That patch was done a pretty long time
>>>>> ago, when there was an issue with kexec. A long time ago I talked to the
>>>>> arm-soc maintainers about this, and they told me that mainline code
>>>>> should work fine without any need to call disable_nonboot_cpus().
>>>>> That means that if kexec is working fine, we can revert the original
>>>>> patch and use what mainline is using.
>>>> It seems that the issue is still there. When the crash happens on cpu1
>>>> and the crash kernel runs on cpu1, it hangs. The log is as below:
>>>>
>>>> root@xilinx-zynq:~# sh 1.sh
>>>> syscall kexec_file_load not available.
>>>> sysrq: Trigger a crash
>>>> Kernel panic - not syncing: sysrq triggered crash
>>>> CPU: 1 PID: 308 Comm: sh Kdump: loaded Not tainted
>>>> 5.2.20-yocto-standard #4
>>>> Hardware name: Xilinx Zynq Platform
>>>> [<80112eb0>] (unwind_backtrace) from [<8010cc04>] (show_stack+0x18/0x1c)
>>>> [<8010cc04>] (show_stack) from [<8094f8f4>] (dump_stack+0x70/0x8c)
>>>> [<8094f8f4>] (dump_stack) from [<801256f4>] (panic+0xf8/0x320)
>>>> [<801256f4>] (panic) from [<805dbeb0>] (sysrq_handle_crash+0x18/0x1c)
>>>> [<805dbeb0>] (sysrq_handle_crash) from [<805dc3b8>]
>>>> (__handle_sysrq+0x9c/0x148)
>>>> [<805dc3b8>] (__handle_sysrq) from [<805dc804>]
>>>> (write_sysrq_trigger+0x5c/0x6c)
>>>> [<805dc804>] (write_sysrq_trigger) from [<8031b040>]
>>>> (proc_reg_write+0x78/0x8c)
>>>> [<8031b040>] (proc_reg_write) from [<802aeec4>] (vfs_write+0xc0/0x154)
>>>> [<802aeec4>] (vfs_write) from [<802afd18>] (ksys_write+0x64/0xc8)
>>>> [<802afd18>] (ksys_write) from [<80101000>] (ret_fast_syscall+0x0/0x54)
>>>> Exception stack(0xb905bfa8 to 0xb905bff0)
>>>> bfa0:                   00000002 0059afa0 00000001 0059afa0 00000002
>>>> 00000000
>>>> bfc0: 00000002 0059afa0 76f8e290 00000004 76f29124 76f28c8c 00000000
>>>> 00000000
>>>> bfe0: 00000004 7eb858c0 76ec7fff 76e53d16
>>>> CPU 0 will stop doing anything useful since another CPU has crashed
>>>> Loading crashdump kernel...
>>>> Bye!
>>>> Booting Linux on physical CPU 0x1
>>>> Linux version 5.2.20-yocto-standard (oe-user@oe-host) (gcc version 9.2.0
>>>> (GCC)) #1 SMP PREEMPT Thu Oct 17 08:15:14 UTC 2019
>>>> CPU: ARMv7 Processor [413fc090] revision 0 (ARMv7), cr=18c5387d
>>>> CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
>>>> OF: fdt: Machine model: Xilinx ZC706 board
>>>> OF: fdt: Ignoring memory range 0x0 - 0x8000000
>>>> printk: debug: ignoring loglevel setting.
>>>> printk: bootconsole [earlycon0] enabled
>>>> Memory policy: Data cache writealloc
>>>> cma: Reserved 16 MiB at 0x16c00000
>>>> On node 0 totalpages: 65280
>>>>     Normal zone: 574 pages used for memmap
>>>>     Normal zone: 0 pages reserved
>>>>     Normal zone: 65280 pages, LIFO batch:15
>>>> percpu: Embedded 19 pages/cpu s47756 r8192 d21876 u77824
>>>> pcpu-alloc: s47756 r8192 d21876 u77824 alloc=19*4096
>>>> pcpu-alloc: [0] 0 [0] 1
>>>> Built 1 zonelists, mobility grouping on.  Total pages: 64706
>>>> Kernel command line: console=ttyPS0,115200n8 root=/dev/nfs rw
>>>> nfsroot=128.224.165.20:/export/pxeboot/vlm-boards/22009/rootfs,v3,tcp
>>>> ip=128.224.179.217:128.224.165.20:128.224.178.1:255.255.254.0:zc702:eth0:off
>>>>
>>>> ignore_loglevel earlyprintk noinitrd selinux=0 enforcing=0 kmemleak=on
>>>> elfcorehdr=0x17f00000 mem=261120K
>>>> Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
>>>> Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
>>>> Memory: 227332K/261120K available (9216K kernel code, 725K rwdata, 2284K
>>>> rodata, 1024K init, 567K bss, 17404K reserved, 16384K cma-reserved)
>>>> SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
>>>> ftrace: allocating 35203 entries in 69 pages
>>>> rcu: Preemptible hierarchical RCU implementation.
>>>> rcu:    RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.
>>>>           Tasks RCU enabled.
>>>> rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
>>>> rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2
>>>> NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
>>>> efuse mapped to (ptrval)
>>>> slcr mapped to (ptrval)
>>>> L2C: platform provided aux values match the hardware, so have no
>>>> effect.  Please remove them.
>>>> L2C-310 erratum 769419 enabled
>>>> L2C-310 enabling early BRESP for Cortex-A9
>>>> L2C-310: enabling full line of zeros but not enabled in Cortex-A9
>>>> L2C-310 ID prefetch enabled, offset 1 lines
>>>> L2C-310 dynamic clock gating enabled, standby mode enabled
>>>> L2C-310 cache controller enabled, 8 ways, 512 kB
>>>> L2C-310: CACHE_ID 0x410000c8, AUX_CTRL 0x76760001
>>>> random: get_random_bytes called from start_kernel+0x2b0/0x4c4 with
>>>> crng_init=0
>>>> zynq_clock_init: clkc starts at (ptrval)
>>>> Zynq clock init
>>>> sched_clock: 64 bits at 333MHz, resolution 3ns, wraps every
>>>> 4398046511103ns
>>>> clocksource: arm_global_timer: mask: 0xffffffffffffffff max_cycles:
>>>> 0x4ce07af025, max_idle_ns: 440795209040 ns
>>>> Switching to timer-based delay loop, resolution 3ns
>>>> clocksource: ttc_clocksource: mask: 0xffff max_cycles: 0xffff,
>>>> max_idle_ns: 537538477 ns
>>>> timer #0 at (ptrval), irq=17
>>>> Console: colour dummy device 80x30
>>>> Calibrating delay loop (skipped), value calculated using timer
>>>> frequency.. 666.66 BogoMIPS (lpj=3333333)
>>>> pid_max: default: 32768 minimum: 301
>>>> LSM: Security Framework initializing
>>>> Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
>>>> Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
>>>> CPU: Testing write buffer coherency: ok
>>>> CPU0: Spectre v2: using BPIALL workaround
>>>> CPU0: thread -1, cpu 1, socket 0, mpidr 80000001
>>>> Setting up static identity map for 0x8100000 - 0x8100060
>>>> rcu: Hierarchical SRCU implementation.
>>>> smp: Bringing up secondary CPUs ...
>>> OK. Can you send the content of your 1.sh script?
>> The 1.sh and kdump_args are as below:
>>
>> root@xilinx-zynq:~# cat 1.sh
>> #!/bin/sh
>> kexec -p /boot/zImage --append="$(<kdump_args)"
>> echo c > /proc/sysrq-trigger
>>
>> root@xilinx-zynq:~# cat kdump_args
>> console=ttyPS0,115200n8 root=/dev/nfs rw
>> nfsroot=128.224.165.20:/export/pxeboot/vlm-boards/22009/rootfs,v3,tcp
>> ip=128.224.179.217:128.224.165.20:128.224.178.1:255.255.254.0:zc702:eth0:off
>> ignore_loglevel earlyprintk noinitrd selinux=0 enforcing=0 kmemleak=on
>> root@xilinx-zynq:~#
> I have looked at it and I have two observations.
> I am using the latest kexec-tools and the latest mainline kernel.
> 1. I can't use the --append parameter to specify the command line, but I
> have to force it in the kernel.

I am using 2.0.19, and it works well.

root@xilinx-zynq:/# kexec -v
kexec-tools 2.0.19

root@xilinx-zynq:~# kexec  -p /boot/zImage --append="$(<kexec.args)"
syscall kexec_file_load not available.
root@xilinx-zynq:~#
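
As a sanity check before triggering the crash, one can confirm that the crash kernel really is loaded. A minimal sketch, assuming the kernel exposes /sys/kernel/kexec_crash_loaded; the helper name crash_kernel_loaded and its optional path argument are made up for illustration:

```sh
#!/bin/sh
# Hypothetical helper: succeeds only if a crash kernel is currently loaded.
# $1 is an optional path to the status file (assumption: the default is the
# sysfs file /sys/kernel/kexec_crash_loaded, which contains "1" when loaded).
crash_kernel_loaded() {
    status_file="${1:-/sys/kernel/kexec_crash_loaded}"
    [ -r "$status_file" ] && [ "$(cat "$status_file")" = "1" ]
}
```

It could be called in 1.sh between the kexec -p line and the sysrq trigger, e.g. "crash_kernel_loaded || exit 1".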

> 2. The new kernel starts at the location specified by the crashkernel=
> parameter, which means that running this in a loop will crash at some
> point because, IIRC, arm32 does not work with memory in front of the
> kernel. It means that on every attempt you are losing the region located
> before the kernel.

Do you mean that kdump can't be run repeatedly in a loop?

I am using the following cmdline to boot the first kernel:

console=ttyPS0,115200n8 root=/dev/nfs rw 
nfsroot=128.224.165.20:/export/pxeboot/vlm-boards/21122/rootfs,v3,tcp 
ip=128.224.165.199:128.224.165.20:128.224.178.1:255.255.255.0:zc706:eth0:off 
crashkernel=256M@128M

But in the cmdline for the crash kernel (kdump_args),
"crashkernel=256M@128M" must be deleted, or else the memory allocation
will fail.
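
To avoid editing kdump_args by hand, the crashkernel= option can be filtered out of the first kernel's cmdline. This is only a sketch; the function name strip_crashkernel is hypothetical, and it assumes the cmdline options are separated by whitespace and contain no quoted spaces:

```sh
#!/bin/sh
# Hypothetical helper: print the cmdline given in $1 without any
# crashkernel=... option. Runs in a subshell so that "set -f" (no glob
# expansion during word splitting) does not leak into the caller.
strip_crashkernel() (
    set -f
    out=""
    for arg in $1; do
        case "$arg" in
            crashkernel=*) ;;                  # drop the reservation option
            *) out="${out:+$out }$arg" ;;      # keep everything else
        esac
    done
    printf '%s\n' "$out"
)
```

For example: strip_crashkernel "$(cat /proc/cmdline)" > kdump_args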

Thanks,

Quanyang


>
> Thanks,
> Michal