From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
To: Thomas Gleixner <tglx@linutronix.de>,
Zqiang <qiang.zhang@windriver.com>,
Kees Cook <keescook@chromium.org>,
Stephen Boyd <swboyd@chromium.org>, Tejun Heo <tj@kernel.org>,
Lai Jiangshan <jiangshanlai@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>,
Waiman Long <longman@redhat.com>,
Boqun Feng <boqun.feng@gmail.com>, linux-mm <linux-mm@kvack.org>
Subject: [5.14] problem with dynamic memory allocation for object debugging
Date: Sun, 15 Aug 2021 19:23:03 +0900
Message-ID: <b9cadd1b-ed1f-8eab-5de5-36c79c709c67@i-love.sakura.ne.jp>
Hello.

I'm frequently hitting "BUG: Invalid wait context" on 5.14-rc kernels,
and it seems that this problem has not been fixed yet.
This is a conflict between spinlock and raw_spinlock: check_wait_context()
complains if we try to take a normal spinlock while we already hold a
raw_spinlock. raw_spinlock is preferred for debugging code, while normal
spinlock is preferred for regular code. But sometimes debugging code wants
to allocate memory.
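
For illustration, here is a minimal, hypothetical pattern (not actual kernel
code) that trips check_wait_context() when CONFIG_PROVE_RAW_LOCK_NESTING is
enabled:

/* Hypothetical example: spinlock_t nested inside raw_spinlock_t. */
static DEFINE_RAW_SPINLOCK(outer_lock); /* stays a spinning lock on PREEMPT_RT */
static DEFINE_SPINLOCK(inner_lock);     /* becomes a sleeping lock on PREEMPT_RT */

static void invalid_nesting(void)
{
	raw_spin_lock(&outer_lock);
	spin_lock(&inner_lock); /* check_wait_context() complains here */
	spin_unlock(&inner_lock);
	raw_spin_unlock(&outer_lock);
}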
fill_pool() (a function in the debugobjects code) uses a raw_spinlock but
also does kmalloc(GFP_ATOMIC | __GFP_NORETRY | __GFP_NOWARN). That works as
long as fill_pool() is called without any raw_spinlock held. However,
functions in the workqueue code (e.g. __queue_work()) also use raw_spinlocks
and trigger fill_pool() via a debug_object_activate() call. When fill_pool()
hits the kmalloc() path, lockdep complains because get_partial_node() in the
slab allocator takes a normal spinlock (n->list_lock).
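
Condensed from the backtrace below, the failing chain is roughly:

  kfree_rcu_monitor()                /* holds krc.lock (raw_spinlock_t) */
    queue_rcu_work()
      call_rcu()
        debug_object_activate()
          __debug_object_init()
            fill_pool()              /* pool below watermark */
              kmem_cache_alloc(GFP_ATOMIC | __GFP_NORETRY | __GFP_NOWARN)
                ___slab_alloc()
                  get_partial_node() /* takes n->list_lock (spinlock_t) */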
[ 17.726419][ T41]
[ 17.727873][ T41] =============================
[ 17.730788][ T41] [ BUG: Invalid wait context ]
[ 17.733519][ T41] 5.14.0-rc5+ #742 Not tainted
[ 17.735111][ T41] -----------------------------
[ 17.736747][ T41] kworker/2:1/41 is trying to lock:
[ 17.738468][ T41] ffff888005854d18 (&n->list_lock){-.-.}-{3:3}, at: get_partial_node.isra.0.part.0+0x48/0x4c0
[ 17.741896][ T41] other info that might help us debug this:
[ 17.743970][ T41] context-{5:5}
[ 17.745132][ T41] 3 locks held by kworker/2:1/41:
[ 17.746798][ T41] #0: ffff888005871548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x21e/0x600
[ 17.750250][ T41] #1: ffffc90000823e58 ((work_completion)(&(&krcp->monitor_work)->work)){+.+.}-{0:0}, at: process_one_work+0x21e/0x600
[ 17.754416][ T41] #2: ffff88807cfebe40 (krc.lock){....}-{2:2}, at: kfree_rcu_monitor+0x1d/0x150
[ 17.757462][ T41] stack backtrace:
[ 17.757464][ T41] CPU: 2 PID: 41 Comm: kworker/2:1 Not tainted 5.14.0-rc5+ #742
[ 17.757466][ T41] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020
[ 17.757468][ T41] Workqueue: events kfree_rcu_monitor
[ 17.757474][ T41] Call Trace:
[ 17.757476][ T41] dump_stack_lvl+0x57/0x7d
[ 17.757481][ T41] __lock_acquire.cold+0x213/0x2f2
[ 17.757486][ T41] lock_acquire+0xd0/0x300
[ 17.757490][ T41] ? get_partial_node.isra.0.part.0+0x48/0x4c0
[ 17.757494][ T41] ? __bfs+0x10a/0x240
[ 17.757496][ T41] _raw_spin_lock+0x2c/0x70
[ 17.757500][ T41] ? get_partial_node.isra.0.part.0+0x48/0x4c0
[ 17.757503][ T41] get_partial_node.isra.0.part.0+0x48/0x4c0
[ 17.757506][ T41] ? ret_from_fork+0x1f/0x30
[ 17.757510][ T41] ___slab_alloc+0x4a0/0x9c0
[ 17.757513][ T41] ? fill_pool+0x14d/0x220
[ 17.757517][ T41] ? check_prev_add+0x32f/0xc00
[ 17.757519][ T41] ? __slab_alloc.isra.0+0x22/0x80
[ 17.757522][ T41] __slab_alloc.isra.0+0x22/0x80
[ 17.757524][ T41] kmem_cache_alloc+0x334/0x360
[ 17.757527][ T41] ? fill_pool+0x14d/0x220
[ 17.757530][ T41] fill_pool+0x14d/0x220
[ 17.757533][ T41] ? debug_object_activate+0xb2/0x210
[ 17.757536][ T41] ? find_held_lock+0x2b/0x80
[ 17.757537][ T41] __debug_object_init+0x2e/0x3a0
[ 17.757540][ T41] debug_object_activate+0x12a/0x210
[ 17.757543][ T41] ? delayed_work_timer_fn+0x10/0x10
[ 17.757547][ T41] call_rcu+0x2c/0x330
[ 17.757550][ T41] queue_rcu_work+0x22/0x30
[ 17.757553][ T41] kfree_rcu_monitor+0xff/0x150
[ 17.757555][ T41] process_one_work+0x2aa/0x600
[ 17.757559][ T41] worker_thread+0x48/0x3d0
[ 17.757562][ T41] ? process_one_work+0x600/0x600
[ 17.757564][ T41] kthread+0x13e/0x170
[ 17.757567][ T41] ? set_kthread_struct+0x40/0x40
[ 17.757570][ T41] ret_from_fork+0x1f/0x30
The question is how to solve this problem.

Can all locks held in the atomic memory allocation path be converted to
raw_spinlock? If so, that would be nice.

Can the debugobjects code preallocate memory asynchronously, so that
debug_object_activate() can be called from any context? (A rough sketch of
this idea follows below.)

Can the workqueue code stop using raw_spinlocks, e.g. by reverting commit
a9b8a985294debae ("workqueue: Convert the pool::lock and wq_mayday_lock to
raw_spinlock_t")?