From: Vlastimil Babka <vbabka@suse.cz>
To: Harry Yoo <harry.yoo@oracle.com>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Christoph Lameter <cl@gentwo.org>,
David Rientjes <rientjes@google.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
Alexei Starovoitov <ast@kernel.org>, Hao Li <hao.li@linux.dev>,
linux-mm <linux-mm@kvack.org>, stable <stable@vger.kernel.org>
Subject: Re: [PATCH 1/2] mm/slab: skip get_from_any_partial() if !allow_spin
Date: Mon, 9 Feb 2026 20:03:25 +0100
Message-ID: <a972a203-d4c5-4161-b9e6-42fbf733d75a@suse.cz>
In-Reply-To: <aYlR_KW8xj4LJaYt@hyeyoo>

On 2/9/26 04:18, Harry Yoo wrote:
> On Fri, Feb 06, 2026 at 11:19:01AM -0800, Alexei Starovoitov wrote:
>> On Fri, Feb 6, 2026 at 10:10 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>> >
>> > On 2/6/26 18:13, Harry Yoo wrote:
>> > > Lockdep complains when get_from_any_partial() is called in an NMI
>> > > context, because current->mems_allowed_seq is seqcount_spinlock_t and
>> > > not NMI-safe:
>> > >
>> > > ================================
>> > > WARNING: inconsistent lock state
>> > > 6.19.0-rc5-kfree-rcu+ #315 Tainted: G N
>> > > --------------------------------
>> > > inconsistent {INITIAL USE} -> {IN-NMI} usage.
>> > > kunit_try_catch/9989 [HC1[1]:SC0[0]:HE0:SE1] takes:
>> > > ffff889085799820 (&____s->seqcount#3){.-.-}-{0:0}, at: ___slab_alloc+0x58f/0xc00
>> > > {INITIAL USE} state was registered at:
>> > > lock_acquire+0x185/0x320
>> > > kernel_init_freeable+0x391/0x1150
>> > > kernel_init+0x1f/0x220
>> > > ret_from_fork+0x736/0x8f0
>> > > ret_from_fork_asm+0x1a/0x30
>> > > irq event stamp: 56
>> > > hardirqs last enabled at (55): [<ffffffff850a68d7>] _raw_spin_unlock_irq+0x27/0x70
>> > > hardirqs last disabled at (56): [<ffffffff850858ca>] __schedule+0x2a8a/0x6630
>> > > softirqs last enabled at (0): [<ffffffff81536711>] copy_process+0x1dc1/0x6a10
>> > > softirqs last disabled at (0): [<0000000000000000>] 0x0
>> > >
>> > > other info that might help us debug this:
>> > > Possible unsafe locking scenario:
>> > >
>> > > CPU0
>> > > ----
>> > > lock(&____s->seqcount#3);
>> > > <Interrupt>
>> > > lock(&____s->seqcount#3);
>> > >
>> > > *** DEADLOCK ***
>> > >
>> > > According to Documentation/locking/seqlock.rst, seqcount_t is not
>> > > NMI-safe and seqcount_latch_t should be used when the read path can interrupt
>> > > the write-side critical section. In this case, return NULL and fall back
>> > > to slab allocation if !allow_spin.
>> > >
>> > > Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
>> > > Cc: stable@vger.kernel.org
>> > > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
>> > > ---
>> > > mm/slub.c | 8 ++++++++
>> > > 1 file changed, 8 insertions(+)
>> > >
>> > > diff --git a/mm/slub.c b/mm/slub.c
>> > > index 102fb47ae013..d46464654c15 100644
>> > > --- a/mm/slub.c
>> > > +++ b/mm/slub.c
>> > > @@ -3789,6 +3789,14 @@ static void *get_from_any_partial(struct kmem_cache *s, struct partial_context *
>> > >  	enum zone_type highest_zoneidx = gfp_zone(pc->flags);
>> > >  	unsigned int cpuset_mems_cookie;
>> > >
>> > > +	/*
>> > > +	 * read_mems_allowed_begin() accesses current->mems_allowed_seq,
>> > > +	 * a seqcount_spinlock_t that is not NMI-safe. Skip allocation
>> > > +	 * when GFP flags indicate spinning is not allowed.
>> > > +	 */
>> > > +	if (!gfpflags_allow_spinning(pc->flags))
>> > > +		return NULL;
>> >
>> > I think it would be less restrictive to just continue,
>
> Ack.
>
>> > but skip the
>> > read_mems_allowed_retry() part in the do-while loop, so just make it one
>> > iteration for !allow_spin.
>
> Makes sense.
>
>> > If lockdep doesn't like even the
>> > read_mems_allowed_begin() (not clear to me), skip it too?
>
> Yes, lockdep doesn't like read_mems_allowed_begin(), and thus
> we should skip both.
>
>>
>> +1
>> Just unconditional return NULL seems too restrictive.
>
> Ack.
>
> I'll do something like this:

Looks good!

>
> diff --git a/mm/slub.c b/mm/slub.c
> index 102fb47ae013..cc686ab929fe 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3788,6 +3788,7 @@ static void *get_from_any_partial(struct kmem_cache *s, struct partial_context *
>  	struct zone *zone;
>  	enum zone_type highest_zoneidx = gfp_zone(pc->flags);
>  	unsigned int cpuset_mems_cookie;
> +	bool allow_spin = gfpflags_allow_spinning(pc->flags);
>
>  	/*
>  	 * The defrag ratio allows a configuration of the tradeoffs between
> @@ -3812,7 +3813,15 @@ static void *get_from_any_partial(struct kmem_cache *s, struct partial_context *
>  		return NULL;
>
>  	do {
> -		cpuset_mems_cookie = read_mems_allowed_begin();
> +		/*
> +		 * read_mems_allowed_begin() accesses current->mems_allowed_seq,
> +		 * a seqcount_spinlock_t that is not NMI-safe. Do not access
> +		 * current->mems_allowed_seq and avoid retry when GFP flags
> +		 * indicate spinning is not allowed.
> +		 */
> +		if (allow_spin)
> +			cpuset_mems_cookie = read_mems_allowed_begin();
> +
>  		zonelist = node_zonelist(mempolicy_slab_node(), pc->flags);
>  		for_each_zone_zonelist(zone, z, zonelist, highest_zoneidx) {
>  			struct kmem_cache_node *n;
> @@ -3836,7 +3845,7 @@ static void *get_from_any_partial(struct kmem_cache *s, struct partial_context *
>  				}
>  			}
>  		}
> -	} while (read_mems_allowed_retry(cpuset_mems_cookie));
> +	} while (allow_spin && read_mems_allowed_retry(cpuset_mems_cookie));
>  #endif /* CONFIG_NUMA */
>  	return NULL;
>  }
>
>