From: "Paul E. McKenney" <paulmck@kernel.org>
To: Petr Mladek <pmladek@suse.com>
Cc: "Zhang, Qiang1" <qiang1.zhang@intel.com>,
"Liu, Yujie" <yujie.liu@intel.com>,
John Ogness <john.ogness@linutronix.de>,
"oe-lkp@lists.linux.dev" <oe-lkp@lists.linux.dev>,
lkp <lkp@intel.com>, Thomas Gleixner <tglx@linutronix.de>,
Miguel Ojeda <ojeda@kernel.org>,
Linux Memory Management List <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [linux-next:master] [printk] 8bdbdd7f43: BUG:scheduling_while_atomic
Date: Thu, 1 Dec 2022 12:29:59 -0800 [thread overview]
Message-ID: <20221201202959.GJ4001@paulmck-ThinkPad-P17-Gen-1> (raw)
In-Reply-To: <Y4jSfGxkP3HoOU75@alley>
On Thu, Dec 01, 2022 at 05:12:44PM +0100, Petr Mladek wrote:
> On Thu 2022-12-01 05:27:13, Zhang, Qiang1 wrote:
> > Greeting,
> >
> > FYI, we noticed BUG:scheduling_while_atomic due to commit (built with clang-14):
> >
> > commit: 8bdbdd7f43cd74c7faca6add8a62d541503ae21d ("printk: Prepare for SRCU console list protection")
> > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> >
> > in testcase: boot
> >
> > on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> >
> > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> >
> >
> > [ 8.561823][ T0] BUG: scheduling while atomic: swapper/0/0x00000002
> > [ 8.569154][ T0] no locks held by swapper/0.
> > [ 8.571934][ T0] Modules linked in:
> > [ 8.573001][ T0] CPU: 0 PID: 0 Comm: swapper Not tainted 6.1.0-rc1-00015-g8bdbdd7f43cd #1 9e431b4e696756e99f94acf7e74ac6e512df80ce
> > [ 8.576740][ T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
> > [ 8.579942][ T0] Call Trace:
> > [ 8.581143][ T0] <TASK>
> > [ 8.582054][ T0] dump_stack_lvl (??:?)
> > [ 8.583312][ T0] ? netdev_notice (??:?)
> > [ 8.584753][ T0] ? lockdep_print_held_locks (lockdep.c:?)
> > [ 8.586563][ T0] __schedule_bug (core.c:?)
> > [ 8.588171][ T0] ? trace_sched_switch (core.c:?)
> > [ 8.589753][ T0] ? save_trace (lockdep.c:?)
> > [ 8.591135][ T0] schedule_debug (core.c:?)
> > [ 8.592582][ T0] __schedule (core.c:?)
> > [ 8.593902][ T0] ? __sched_text_start (core.c:?)
> > [ 8.595356][ T0] ? add_chain_block (lockdep.c:?)
> > [ 8.596847][ T0] ? find_held_lock (lockdep.c:?)
> > [ 8.598368][ T0] schedule (??:?)
> > [ 8.599564][ T0] schedule_timeout (??:?)
> > [ 8.600937][ T0] ? console_conditional_schedule (??:?)
> > [ 8.602773][ T0] do_wait_for_common (build_utility.c:?)
> > [ 8.604522][ T0] ? console_conditional_schedule (??:?)
> > [ 8.606462][ T0] ? bit_wait_io_timeout (build_utility.c:?)
> > [ 8.608196][ T0] ? _raw_spin_lock_irq (??:?)
> > [ 8.609935][ T0] ? lockdep_hardirqs_on (??:?)
> > [ 8.611646][ T0] wait_for_completion (??:?)
> > [ 8.613253][ T0] synchronize_srcu (??:?)
>
> Hmm, I do not understand it. It seems that this is called
> in normal context. There was console_unlock() right
> before synchronize_srcu(). It means that we took
> semaphore in the same context.
>
> Also all the calls above looks correct. wait_for_completion()
> seems to be called in normal context as well.
Is it possible that this is happening between the time that the scheduler
is running, but before workqueues are fully up and running?
If so, I can adjust synchronize_srcu() to operate during that time.
But I would need confirmation that this in fact the situation.
Thanx, Paul
> Best Regards,
> Petr
>
> > [ 8.614825][ T0] ? srcu_gp_start_if_needed (??:?)
> > [ 8.616664][ T0] ? rcu_read_lock_any_held (??:?)
> > [ 8.618281][ T0] ? console_trylock_spinning (??:?)
> > [ 8.620181][ T0] unregister_console (??:?)
> > [ 8.621875][ T0] register_console (??:?)
> > [ 8.623401][ T0] ? serial8250_isa_init_ports (8250_core.c:?)
> > [ 8.625281][ T0] ? parse_options (super.c:?)
> > [ 8.626887][ T0] univ8250_console_init (8250_core.c:?)
> > [ 8.628583][ T0] console_init (??:?)
> > [ 8.630025][ T0] start_kernel (??:?)
> > [ 8.631558][ T0] secondary_startup_64_no_verify (??:?)
> > [ 8.633502][ T0] </TASK>
> > [ 8.634624][ T0] ------------[ cut here ]------------
> > [ 8.636289][ T0] releasing a pinned lock
> > [ 8.638693][ T0] WARNING: CPU: 0 PID: 0 at kernel/locking/lockdep.c:5352 lock_release (??:?)
> > [ 8.641591][ T0] Modules linked in:
> > [ 8.642864][ T0] CPU: 0 PID: 0 Comm: swapper Tainted: G W 6.1.0-rc1-00015-g8bdbdd7f43cd #1 9e431b4e696756e99f94acf7e74ac6e512df80ce
> > [ 8.646469][ T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
> > [ 8.649578][ T0] RIP: 0010:lock_release (??:?)
> > [ 8.651243][ T0] Code: 00 00 e9 0f fe ff ff 48 83 05 06 f6 ff 06 01 e8 91 4e 2d 03 e9 67 fe ff ff 48 c7 c7 20 8c cf 84 e8 90 3f ec ff 48 8b 54 24 08 <0f> 0b 48 83 05 a9 f6 ff 06 01 e9 66 fc ff ff e8 67 c9 99 01 85 c0
> > All code
> > ========
> > 0: 00 00 add %al,(%rax)
> > 2: e9 0f fe ff ff jmpq 0xfffffffffffffe16
> > 7: 48 83 05 06 f6 ff 06 addq $0x1,0x6fff606(%rip) # 0x6fff615
> > e: 01
> > f: e8 91 4e 2d 03 callq 0x32d4ea5
> > 14: e9 67 fe ff ff jmpq 0xfffffffffffffe80
> > 19: 48 c7 c7 20 8c cf 84 mov $0xffffffff84cf8c20,%rdi
> > 20: e8 90 3f ec ff callq 0xffffffffffec3fb5
> > 25: 48 8b 54 24 08 mov 0x8(%rsp),%rdx
> > 2a:* 0f 0b ud2 <-- trapping instruction
> > 2c: 48 83 05 a9 f6 ff 06 addq $0x1,0x6fff6a9(%rip) # 0x6fff6dd
> > 33: 01
> > 34: e9 66 fc ff ff jmpq 0xfffffffffffffc9f
> > 39: e8 67 c9 99 01 callq 0x199c9a5
> > 3e: 85 c0 test %eax,%eax
> >
> > Code starting with the faulting instruction
> > ===========================================
> > 0: 0f 0b ud2
> > 2: 48 83 05 a9 f6 ff 06 addq $0x1,0x6fff6a9(%rip) # 0x6fff6b3
> > 9: 01
> > a: e9 66 fc ff ff jmpq 0xfffffffffffffc75
> > f: e8 67 c9 99 01 callq 0x199c97b
> > 14: 85 c0 test %eax,%eax
> > [ 8.656638][ T0] RSP: 0000:ffffffff862077c0 EFLAGS: 00010086
> > [ 8.658249][ T0] RAX: 0000000000000017 RBX: ffffffff86244244 RCX: ffffffff8631a420
> > [ 8.660383][ T0] RDX: ffffffff86244218 RSI: 0000000000000008 RDI: 0000000000000001
> > [ 8.662798][ T0] RBP: ffffffff862078e0 R08: dffffc0000000000 R09: 00000000035545d9
> > [ 8.665205][ T0] R10: dffff7fff0c40e7a R11: 0000000000000001 R12: ffffffff862fcf58
> > [ 8.667420][ T0] R13: 1ffffffff0c40f06 R14: ffffffff86244220 R15: dffffc0000000000
> > [ 8.669824][ T0] FS: 0000000000000000(0000) GS:ffffffff862ab000(0000) knlGS:0000000000000000
> > [ 8.672552][ T0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 8.674557][ T0] CR2: ffff88843ffff000 CR3: 0000000006235000 CR4: 00000000000406b0
> > [ 8.677082][ T0] Call Trace:
> > [ 8.678220][ T0] <TASK>
> > [ 8.679230][ T0] ? __lock_acquire (??:?)
> > [ 8.680709][ T0] ? kvm_sched_clock_read (kvmclock.c:?)
> > [ 8.682309][ T0] ? sched_clock_cpu (??:?)
> > [ 8.683854][ T0] ? kvm_sched_clock_read (kvmclock.c:?)
> > [ 8.685268][ T0] _raw_spin_unlock (??:?)
> > [ 8.686730][ T0] dequeue_task_idle (build_policy.c:?)
> > [ 8.688259][ T0] ? set_next_task_idle (build_policy.c:?)
> > [ 8.689822][ T0] ? update_rq_clock (??:?)
> > [ 8.691387][ T0] __schedule (core.c:?)
> > [ 8.692782][ T0] ? __sched_text_start (core.c:?)
> > [ 8.694287][ T0] ? add_chain_block (lockdep.c:?)
> > [ 8.695869][ T0] ? find_held_lock (lockdep.c:?)
> > [ 8.697314][ T0] schedule (??:?)
> > [ 8.698529][ T0] schedule_timeout (??:?)
> > [ 8.701929][ T0] ? console_conditional_schedule (??:?)
> > [ 8.703826][ T0] do_wait_for_common (build_utility.c:?)
> > [ 8.705395][ T0] ? console_conditional_schedule (??:?)
> > [ 8.707189][ T0] ? bit_wait_io_timeout (build_utility.c:?)
> > [ 8.708779][ T0] ? _raw_spin_lock_irq (??:?)
> > [ 8.710371][ T0] ? lockdep_hardirqs_on (??:?)
> > [ 8.711982][ T0] wait_for_completion (??:?)
> > [ 8.713487][ T0] synchronize_srcu (??:?)
> > [ 8.715020][ T0] ? srcu_gp_start_if_needed (??:?)
> > [ 8.716772][ T0] ? rcu_read_lock_any_held (??:?)
> > [ 8.718477][ T0] ? console_trylock_spinning (??:?)
> > [ 8.720179][ T0] unregister_console (??:?)
> > [ 8.721644][ T0] register_console (??:?)
> > [ 8.728846][ T0] ? serial8250_isa_init_ports (8250_core.c:?)
> > [ 8.730692][ T0] ? parse_options (super.c:?)
> > [ 8.732180][ T0] univ8250_console_init (8250_core.c:?)
> > [ 8.733762][ T0] console_init (??:?)
> > [ 8.735096][ T0] start_kernel (??:?)
> > [ 8.736425][ T0] secondary_startup_64_no_verify (??:?)
> > [ 8.738088][ T0] </TASK>
> > [ 8.739057][ T0] irq event stamp: 494
> > [ 8.740376][ T0] hardirqs last enabled at (493): dump_stack_lvl (??:?)
> > [ 8.743081][ T0] hardirqs last disabled at (494): __schedule (core.c:?)
> > [ 8.745797][ T0] softirqs last enabled at (0): 0x0
> > [ 8.747882][ T0] softirqs last disabled at (0): 0x0
> > [ 8.749889][ T0] ---[ end trace 0000000000000000 ]---
> > [ 8.751520][ T0] bad: scheduling from the idle thread!
> > [ 8.753351][ T0] CPU: 0 PID: 0 Comm: swapper Tainted: G W 6.1.0-rc1-00015-g8bdbdd7f43cd #1 9e431b4e696756e99f94acf7e74ac6e512df80ce
> > [ 8.757566][ T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
> > [ 8.760783][ T0] Call Trace:
> > [ 8.761970][ T0] <TASK>
> > [ 8.763067][ T0] dump_stack_lvl (??:?)
> > [ 8.764611][ T0] ? netdev_notice (??:?)
> > [ 8.766213][ T0] ? lockdep_hardirqs_on_prepare (??:?)
> > [ 8.768214][ T0] ? print_irqtrace_events (??:?)
> > [ 8.769911][ T0] ? kvm_sched_clock_read (kvmclock.c:?)
> > [ 8.771550][ T0] dequeue_task_idle (build_policy.c:?)
> > [ 8.773087][ T0] ? set_next_task_idle (build_policy.c:?)
> > [ 8.774647][ T0] ? update_rq_clock (??:?)
> > [ 8.776280][ T0] __schedule (core.c:?)
> > [ 8.777729][ T0] ? __sched_text_start (core.c:?)
> > [ 8.779238][ T0] ? add_chain_block (lockdep.c:?)
> > [ 8.780922][ T0] ? find_held_lock (lockdep.c:?)
> > [ 8.782560][ T0] schedule (??:?)
> > [ 8.783988][ T0] schedule_timeout (??:?)
> > [ 8.785515][ T0] ? console_conditional_schedule (??:?)
> > [ 8.787500][ T0] do_wait_for_common (build_utility.c:?)
> > [ 8.789162][ T0] ? console_conditional_schedule (??:?)
> > [ 8.790997][ T0] ? bit_wait_io_timeout (build_utility.c:?)
> > [ 8.792567][ T0] ? _raw_spin_lock_irq (??:?)
> > [ 8.794292][ T0] ? lockdep_hardirqs_on (??:?)
> > [ 8.796008][ T0] wait_for_completion (??:?)
> > [ 8.797520][ T0] synchronize_srcu (??:?)
> > [ 8.799001][ T0] ? srcu_gp_start_if_needed (??:?)
> > [ 8.800696][ T0] ? rcu_read_lock_any_held (??:?)
> > [ 8.802508][ T0] ? console_trylock_spinning (??:?)
> > [ 8.804316][ T0] unregister_console (??:?)
> > [ 8.805810][ T0] register_console (??:?)
> > [ 8.807364][ T0] ? serial8250_isa_init_ports (8250_core.c:?)
> > [ 8.811167][ T0] ? parse_options (super.c:?)
> > [ 8.812693][ T0] univ8250_console_init (8250_core.c:?)
> > [ 8.814269][ T0] console_init (??:?)
> > [ 8.815765][ T0] start_kernel (??:?)
> > [ 8.817296][ T0] secondary_startup_64_no_verify (??:?)
> > [ 8.819246][ T0] </TASK>
> >
> >
> >
> > Through the config file, it can be found that the following options are enabled
> >
> > CONFIG_TINY_RCU=y
> > CONFIG_RCU_EXPERT=y
> > CONFIG_SRCU=y
> > CONFIG_TINY_SRCU=y
> >
> > This problem should have been fixed:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=0bf43fcbf8ebbf84a2b1b8770eaefcdb4a99afd6
> >
> > commit dbc6ca150842650d513705f26e3e6b7a4e182ce9
> > Author: Zqiang <qiang1.zhang@intel.com>
> > Date: Wed Nov 9 15:36:38 2022 +0800
> >
> > srcu: Make Tiny synchronize_srcu() check for readers
> >
> > This commit adds lockdep checks for illegal use of synchronize_srcu()
> > within same-type SRCU read-side critical sections and within normal
> > RCU read-side critical sections. It also makes synchronize_srcu()
> > be a no-op during early boot.
> >
> > These changes bring Tiny synchronize_srcu() into line with both Tree
> > synchronize_srcu() and Tiny synchronize_rcu().
> >
> > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> >
> >
> > Thanks
> > Zqiang
> >
> >
> >
> >
> >
> >
> >
> > If you fix the issue, kindly add following tag
> > | Reported-by: kernel test robot <yujie.liu@intel.com>
> > | Link: https://lore.kernel.org/oe-lkp/202211302358.ef0db537-yujie.liu@intel.com
> >
> >
> > To reproduce:
> >
> > # build kernel
> > cd linux
> > cp config-6.1.0-rc1-00015-g8bdbdd7f43cd .config
> > make HOSTCC=clang-14 CC=clang-14 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
> > make HOSTCC=clang-14 CC=clang-14 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
> > cd <mod-install-dir>
> > find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
> >
> >
> > git clone https://github.com/intel/lkp-tests.git
> > cd lkp-tests
> > bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
> >
> > # if come across any failure that blocks the test,
> > # please remove ~/.lkp and /lkp dir to run from a clean state.
> >
> >
> > --
> > 0-DAY CI Kernel Test Service
> > https://01.org/lkp
prev parent reply other threads:[~2022-12-01 20:30 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-30 17:06 kernel test robot
2022-12-01 5:27 ` Zhang, Qiang1
2022-12-01 16:12 ` Petr Mladek
2022-12-01 20:29 ` Paul E. McKenney [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20221201202959.GJ4001@paulmck-ThinkPad-P17-Gen-1 \
--to=paulmck@kernel.org \
--cc=john.ogness@linutronix.de \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lkp@intel.com \
--cc=oe-lkp@lists.linux.dev \
--cc=ojeda@kernel.org \
--cc=pmladek@suse.com \
--cc=qiang1.zhang@intel.com \
--cc=tglx@linutronix.de \
--cc=yujie.liu@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox