From: Harry Yoo <harry.yoo@oracle.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Lameter <cl@gentwo.org>,
David Rientjes <rientjes@google.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
Alexei Starovoitov <ast@kernel.org>, Hao Li <hao.li@linux.dev>,
linux-mm <linux-mm@kvack.org>, stable <stable@vger.kernel.org>
Subject: Re: [PATCH 1/2] mm/slab: skip get_from_any_partial() if !allow_spin
Date: Mon, 9 Feb 2026 12:18:20 +0900
Message-ID: <aYlR_KW8xj4LJaYt@hyeyoo>
In-Reply-To: <CAADnVQ+1RBXBWNQtshEfFNZEp0tDZOFKf_vedyjgdz=wqWdG8A@mail.gmail.com>
On Fri, Feb 06, 2026 at 11:19:01AM -0800, Alexei Starovoitov wrote:
> On Fri, Feb 6, 2026 at 10:10 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> >
> > On 2/6/26 18:13, Harry Yoo wrote:
> > > Lockdep complains when get_from_any_partial() is called in an NMI
> > > context, because current->mems_allowed_seq is seqcount_spinlock_t and
> > > not NMI-safe:
> > >
> > > ================================
> > > WARNING: inconsistent lock state
> > > 6.19.0-rc5-kfree-rcu+ #315 Tainted: G N
> > > --------------------------------
> > > inconsistent {INITIAL USE} -> {IN-NMI} usage.
> > > kunit_try_catch/9989 [HC1[1]:SC0[0]:HE0:SE1] takes:
> > > ffff889085799820 (&____s->seqcount#3){.-.-}-{0:0}, at: ___slab_alloc+0x58f/0xc00
> > > {INITIAL USE} state was registered at:
> > > lock_acquire+0x185/0x320
> > > kernel_init_freeable+0x391/0x1150
> > > kernel_init+0x1f/0x220
> > > ret_from_fork+0x736/0x8f0
> > > ret_from_fork_asm+0x1a/0x30
> > > irq event stamp: 56
> > > hardirqs last enabled at (55): [<ffffffff850a68d7>] _raw_spin_unlock_irq+0x27/0x70
> > > hardirqs last disabled at (56): [<ffffffff850858ca>] __schedule+0x2a8a/0x6630
> > > softirqs last enabled at (0): [<ffffffff81536711>] copy_process+0x1dc1/0x6a10
> > > softirqs last disabled at (0): [<0000000000000000>] 0x0
> > >
> > > other info that might help us debug this:
> > > Possible unsafe locking scenario:
> > >
> > > CPU0
> > > ----
> > > lock(&____s->seqcount#3);
> > > <Interrupt>
> > > lock(&____s->seqcount#3);
> > >
> > > *** DEADLOCK ***
> > >
> > > According to Documentation/locking/seqlock.rst, seqcount_t is not
> > > NMI-safe and seqcount_latch_t should be used when read path can interrupt
> > > the write-side critical section. In this case, return NULL and fall back
> > > to slab allocation if !allow_spin.
> > >
> > > Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> > > ---
> > > mm/slub.c | 8 ++++++++
> > > 1 file changed, 8 insertions(+)
> > >
> > > diff --git a/mm/slub.c b/mm/slub.c
> > > index 102fb47ae013..d46464654c15 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -3789,6 +3789,14 @@ static void *get_from_any_partial(struct kmem_cache *s, struct partial_context *
> > > enum zone_type highest_zoneidx = gfp_zone(pc->flags);
> > > unsigned int cpuset_mems_cookie;
> > >
> > > + /*
> > > + * read_mems_allowed_begin() accesses current->mems_allowed_seq,
> > > + * a seqcount_spinlock_t that is not NMI-safe. Skip allocation
> > > + * when GFP flags indicate spinning is not allowed.
> > > + */
> > > + if (!gfpflags_allow_spinning(pc->flags))
> > > + return NULL;
> >
> > I think it would be less restrictive to just continue,

Ack.

> > but skip the
> > read_mems_allowed_retry() part in the do-while loop, so just make it one
> > iteration for !allow_spin.

Makes sense.

> > If lockdep doesn't like even the
> > read_mems_allowed_begin() (not clear to me), skip it too?

Yes, lockdep doesn't like read_mems_allowed_begin(), and thus
we should skip both.
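
As background, the hazard behind the lockdep report can be illustrated with a toy, single-threaded model of a sequence counter. This is only a sketch: struct toy_seqcount and the toy_*() helpers are hypothetical names, not the kernel's seqcount API. It relies on one known property of seqcount readers, namely that a read section waits while the sequence is odd (writer in progress). The model reports that condition instead of spinning, so it terminates.

```c
#include <stdbool.h>

/* Toy model of a sequence counter; NOT the kernel's seqcount_t. */
struct toy_seqcount {
	unsigned int sequence;
};

/* Writer enters its critical section: the sequence becomes odd. */
static void toy_write_begin(struct toy_seqcount *s)
{
	s->sequence++;
}

/* Writer leaves its critical section: the sequence is even again. */
static void toy_write_end(struct toy_seqcount *s)
{
	s->sequence++;
}

/*
 * A real reader would spin here while the sequence is odd. The model
 * only reports whether it would have to spin.
 */
static bool toy_read_begin_would_spin(const struct toy_seqcount *s)
{
	return s->sequence & 1;
}

/*
 * Hypothetical "NMI handler" that tries to open a read section on the
 * CPU it interrupted. If it interrupted the writer, the writer cannot
 * run again until the handler returns, so an odd sequence seen here
 * can never become even: the reader would spin forever.
 */
static bool toy_nmi_reader_would_hang(const struct toy_seqcount *s)
{
	return toy_read_begin_would_spin(s);
}
```

Calling toy_nmi_reader_would_hang() between toy_write_begin() and toy_write_end() returns true. That is the "lock(...); <Interrupt>; lock(...)" scenario from the report, and it is why both read_mems_allowed_begin() and read_mems_allowed_retry() are skipped for !allow_spin.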

>
> +1
> Just unconditional return NULL seems too restrictive.

Ack.

I'll do something like this:

diff --git a/mm/slub.c b/mm/slub.c
index 102fb47ae013..cc686ab929fe 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3788,6 +3788,7 @@ static void *get_from_any_partial(struct kmem_cache *s, struct partial_context *
 	struct zone *zone;
 	enum zone_type highest_zoneidx = gfp_zone(pc->flags);
 	unsigned int cpuset_mems_cookie;
+	bool allow_spin = gfpflags_allow_spinning(pc->flags);

 	/*
 	 * The defrag ratio allows a configuration of the tradeoffs between
@@ -3812,7 +3813,15 @@ static void *get_from_any_partial(struct kmem_cache *s, struct partial_context *
 		return NULL;

 	do {
-		cpuset_mems_cookie = read_mems_allowed_begin();
+		/*
+		 * read_mems_allowed_begin() accesses current->mems_allowed_seq,
+		 * a seqcount_spinlock_t that is not NMI-safe. Do not access
+		 * current->mems_allowed_seq and avoid retry when GFP flags
+		 * indicate spinning is not allowed.
+		 */
+		if (allow_spin)
+			cpuset_mems_cookie = read_mems_allowed_begin();
+
 		zonelist = node_zonelist(mempolicy_slab_node(), pc->flags);
 		for_each_zone_zonelist(zone, z, zonelist, highest_zoneidx) {
 			struct kmem_cache_node *n;
@@ -3836,7 +3845,7 @@ static void *get_from_any_partial(struct kmem_cache *s, struct partial_context *
 				}
 			}
 		}
-	} while (read_mems_allowed_retry(cpuset_mems_cookie));
+	} while (allow_spin && read_mems_allowed_retry(cpuset_mems_cookie));
 #endif /* CONFIG_NUMA */
 	return NULL;
 }
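
For illustration only, the control flow the diff above aims for can be modelled in userspace roughly as follows. This is a sketch under assumptions: fake_read_begin(), fake_read_retry(), scan_once(), scan_partial_model() and reset_model() are hypothetical stand-ins, not kernel functions, and the zonelist walk is reduced to a stub that finds nothing. The point is only that with allow_spin == false the seqcount helpers are never called and the loop body runs exactly once.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters so the behaviour can be observed. */
static int begin_calls, retry_calls, passes;
/* How many times the stub retry helper reports a concurrent update. */
static int pending_updates;

/* Stand-in for read_mems_allowed_begin(): returns a dummy cookie. */
static unsigned int fake_read_begin(void)
{
	begin_calls++;
	return 0;
}

/* Stand-in for read_mems_allowed_retry(): true if a retry is needed. */
static bool fake_read_retry(unsigned int cookie)
{
	(void)cookie;
	retry_calls++;
	if (pending_updates > 0) {
		pending_updates--;
		return true;
	}
	return false;
}

/* Stand-in for the zonelist walk; it never finds an object here. */
static void *scan_once(void)
{
	passes++;
	return NULL;
}

/* Models the loop shape only, not the real get_from_any_partial(). */
static void *scan_partial_model(bool allow_spin)
{
	/* Initialized only to keep this sketch warning-free. */
	unsigned int cookie = 0;
	void *obj;

	do {
		if (allow_spin)
			cookie = fake_read_begin();

		obj = scan_once();
		if (obj)
			return obj;
		/* && short-circuits: no retry helper call when !allow_spin. */
	} while (allow_spin && fake_read_retry(cookie));

	return NULL;
}

/* Reset the counters and choose how many retries the stub requests. */
static void reset_model(int updates)
{
	begin_calls = retry_calls = passes = 0;
	pending_updates = updates;
}
```

With reset_model(0) followed by scan_partial_model(false), the stub helpers are not called and passes ends up at 1. With reset_model(1) followed by scan_partial_model(true), the loop makes two passes because the first retry check asks for another iteration.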
--
Cheers,
Harry / Hyeonggon
Thread overview: 12+ messages
2026-02-06 17:13 [PATCH 0/2] mm/slab: fix lockdep warnings with kmalloc_nolock() Harry Yoo
2026-02-06 17:13 ` [PATCH 1/2] mm/slab: skip get_from_any_partial() if !allow_spin Harry Yoo
2026-02-06 18:10 ` Vlastimil Babka
2026-02-06 19:19 ` Alexei Starovoitov
2026-02-09 3:18 ` Harry Yoo [this message]
2026-02-09 19:03 ` Vlastimil Babka
2026-02-06 17:13 ` [PATCH 2/2] mm/slab: use prandom " Harry Yoo
2026-02-06 18:27 ` Vlastimil Babka
2026-02-06 19:22 ` Alexei Starovoitov
2026-02-07 1:25 ` Harry Yoo
2026-02-06 17:37 ` [PATCH 0/2] mm/slab: fix lockdep warnings with kmalloc_nolock() Harry Yoo
2026-02-09 19:03 ` Vlastimil Babka