linux-mm.kvack.org archive mirror
From: Harry Yoo <harry.yoo@oracle.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@gentwo.org>,
	David Rientjes <rientjes@google.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Alexei Starovoitov <ast@kernel.org>, Hao Li <hao.li@linux.dev>,
	linux-mm <linux-mm@kvack.org>, stable <stable@vger.kernel.org>
Subject: Re: [PATCH 1/2] mm/slab: skip get_from_any_partial() if !allow_spin
Date: Mon, 9 Feb 2026 12:18:20 +0900
Message-ID: <aYlR_KW8xj4LJaYt@hyeyoo>
In-Reply-To: <CAADnVQ+1RBXBWNQtshEfFNZEp0tDZOFKf_vedyjgdz=wqWdG8A@mail.gmail.com>

On Fri, Feb 06, 2026 at 11:19:01AM -0800, Alexei Starovoitov wrote:
> On Fri, Feb 6, 2026 at 10:10 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> >
> > On 2/6/26 18:13, Harry Yoo wrote:
> > > Lockdep complains when get_from_any_partial() is called in an NMI
> > > context, because current->mems_allowed_seq is seqcount_spinlock_t and
> > > not NMI-safe:
> > >
> > >   ================================
> > >   WARNING: inconsistent lock state
> > >   6.19.0-rc5-kfree-rcu+ #315 Tainted: G                 N
> > >   --------------------------------
> > >   inconsistent {INITIAL USE} -> {IN-NMI} usage.
> > >   kunit_try_catch/9989 [HC1[1]:SC0[0]:HE0:SE1] takes:
> > >   ffff889085799820 (&____s->seqcount#3){.-.-}-{0:0}, at: ___slab_alloc+0x58f/0xc00
> > >   {INITIAL USE} state was registered at:
> > >     lock_acquire+0x185/0x320
> > >     kernel_init_freeable+0x391/0x1150
> > >     kernel_init+0x1f/0x220
> > >     ret_from_fork+0x736/0x8f0
> > >     ret_from_fork_asm+0x1a/0x30
> > >   irq event stamp: 56
> > >   hardirqs last  enabled at (55): [<ffffffff850a68d7>] _raw_spin_unlock_irq+0x27/0x70
> > >   hardirqs last disabled at (56): [<ffffffff850858ca>] __schedule+0x2a8a/0x6630
> > >   softirqs last  enabled at (0): [<ffffffff81536711>] copy_process+0x1dc1/0x6a10
> > >   softirqs last disabled at (0): [<0000000000000000>] 0x0
> > >
> > >   other info that might help us debug this:
> > >    Possible unsafe locking scenario:
> > >
> > >          CPU0
> > >          ----
> > >     lock(&____s->seqcount#3);
> > >     <Interrupt>
> > >       lock(&____s->seqcount#3);
> > >
> > >    *** DEADLOCK ***
> > >
> > > According to Documentation/locking/seqlock.rst, seqcount_t is not
> > > NMI-safe and seqcount_latch_t should be used when read path can interrupt
> > > the write-side critical section. In this case, return NULL and fall back
> > > to slab allocation if !allow_spin.
> > >
> > > Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> > > ---
> > >  mm/slub.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/mm/slub.c b/mm/slub.c
> > > index 102fb47ae013..d46464654c15 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -3789,6 +3789,14 @@ static void *get_from_any_partial(struct kmem_cache *s, struct partial_context *
> > >       enum zone_type highest_zoneidx = gfp_zone(pc->flags);
> > >       unsigned int cpuset_mems_cookie;
> > >
> > > +     /*
> > > +      * read_mems_allowed_begin() accesses current->mems_allowed_seq,
> > > +      * a seqcount_spinlock_t that is not NMI-safe. Skip allocation
> > > +      * when GFP flags indicate spinning is not allowed.
> > > +      */
> > > +     if (!gfpflags_allow_spinning(pc->flags))
> > > +             return NULL;
> >
> > I think it would be less restrictive to just continue,

Ack.

> > but skip the
> > read_mems_allowed_retry() part in the do-while loop, so just make it one
> > iteration for !allow_spin.

Makes sense.

> > If lockdep doesn't like even the
> > read_mems_allowed_begin() (not clear to me), skip it too?

Yes, lockdep complains about read_mems_allowed_begin() as well, so
we should skip both it and read_mems_allowed_retry().
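
For illustration only (not kernel code, and not part of the patch): a
minimal user-space sketch of why the read side of a sequence counter is
unsafe when it can interrupt the writer on the same CPU. All demo_*
names are hypothetical. A real reader keeps looping while the sequence
is odd; the sketch bounds the loop with max_spins only so that the
demonstration terminates.

```c
#include <stdbool.h>

/* Hypothetical user-space model of a sequence counter. */
struct demo_seqcount {
	unsigned int sequence;	/* odd while a write is in progress */
};

static void demo_seq_write_begin(struct demo_seqcount *s)
{
	s->sequence++;		/* becomes odd: update in progress */
}

static void demo_seq_write_end(struct demo_seqcount *s)
{
	s->sequence++;		/* becomes even: update complete */
}

/*
 * Try to start a read section. Returns true and stores the cookie if the
 * sequence was even within max_spins attempts. Returns false where an
 * unbounded reader would still be spinning.
 */
static bool demo_seq_read_begin_bounded(const struct demo_seqcount *s,
					unsigned int max_spins,
					unsigned int *cookie)
{
	for (unsigned int i = 0; i < max_spins; i++) {
		unsigned int seq = s->sequence;

		if (!(seq & 1)) {
			*cookie = seq;
			return true;
		}
	}
	return false;
}
```

If the "reader" runs between demo_seq_write_begin() and
demo_seq_write_end() on the same CPU, as an NMI handler interrupting the
writer would, the sequence stays odd for as long as the reader runs, so
the bounded helper returns false and an unbounded one would never
return.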

> 
> +1
> Just unconditional return NULL seems too restrictive.

Ack.

I'll do something like this:

diff --git a/mm/slub.c b/mm/slub.c
index 102fb47ae013..cc686ab929fe 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3788,6 +3788,7 @@ static void *get_from_any_partial(struct kmem_cache *s, struct partial_context *
 	struct zone *zone;
 	enum zone_type highest_zoneidx = gfp_zone(pc->flags);
 	unsigned int cpuset_mems_cookie;
+	bool allow_spin = gfpflags_allow_spinning(pc->flags);

 	/*
 	 * The defrag ratio allows a configuration of the tradeoffs between
@@ -3812,7 +3813,15 @@ static void *get_from_any_partial(struct kmem_cache *s, struct partial_context *
 		return NULL;

 	do {
-		cpuset_mems_cookie = read_mems_allowed_begin();
+		/*
+		 * read_mems_allowed_begin() accesses current->mems_allowed_seq,
+		 * a seqcount_spinlock_t that is not NMI-safe. Do not access
+		 * current->mems_allowed_seq and avoid retry when GFP flags
+		 * indicate spinning is not allowed.
+		 */
+		if (allow_spin)
+			cpuset_mems_cookie = read_mems_allowed_begin();
+
 		zonelist = node_zonelist(mempolicy_slab_node(), pc->flags);
 		for_each_zone_zonelist(zone, z, zonelist, highest_zoneidx) {
 			struct kmem_cache_node *n;
@@ -3836,7 +3845,7 @@ static void *get_from_any_partial(struct kmem_cache *s, struct partial_context *
 				}
 			}
 		}
-	} while (read_mems_allowed_retry(cpuset_mems_cookie));
+	} while (allow_spin && read_mems_allowed_retry(cpuset_mems_cookie));
 #endif	/* CONFIG_NUMA */
 	return NULL;
 }
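
To make the intended control flow easier to check, here is a rough
user-space model of the loop above (a sketch under assumptions, not the
kernel implementation). struct model_ctx and all model_* helpers are
hypothetical stand-ins; the scan never finds an object, so only the
retry behaviour is exercised.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state around current->mems_allowed_seq. */
struct model_ctx {
	unsigned int seq;		/* simulated sequence counter */
	unsigned int seq_reads;		/* how often the counter was read */
	unsigned int passes;		/* how many scan passes were made */
	unsigned int updates_left;	/* simulated concurrent updates */
};

static unsigned int model_read_begin(struct model_ctx *c)
{
	c->seq_reads++;
	return c->seq;
}

static bool model_read_retry(struct model_ctx *c, unsigned int cookie)
{
	c->seq_reads++;
	return c->seq != cookie;
}

/* One pass over the candidate nodes; never finds an object in this model. */
static void *model_scan(struct model_ctx *c)
{
	c->passes++;
	if (c->updates_left) {
		c->updates_left--;
		c->seq += 2;	/* pretend mems_allowed changed mid-scan */
	}
	return NULL;
}

static void *model_get_from_any_partial(struct model_ctx *c, bool allow_spin)
{
	unsigned int cookie = 0;
	void *obj;

	do {
		/* Only touch the sequence counter when spinning is allowed. */
		if (allow_spin)
			cookie = model_read_begin(c);

		obj = model_scan(c);
		if (obj)
			return obj;
		/* Short-circuit: no retry, and no counter read, if !allow_spin. */
	} while (allow_spin && model_read_retry(c, cookie));

	return NULL;
}
```

With allow_spin == false the model makes exactly one pass and never
reads the counter. Note that the model initializes cookie to 0, whereas
the diff above leaves cpuset_mems_cookie unassigned in the !allow_spin
case; that is harmless there because the && short-circuits before the
cookie is read.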


-- 
Cheers,
Harry / Hyeonggon



Thread overview: 12+ messages
2026-02-06 17:13 [PATCH 0/2] mm/slab: fix lockdep warnings with kmalloc_nolock() Harry Yoo
2026-02-06 17:13 ` [PATCH 1/2] mm/slab: skip get_from_any_partial() if !allow_spin Harry Yoo
2026-02-06 18:10   ` Vlastimil Babka
2026-02-06 19:19     ` Alexei Starovoitov
2026-02-09  3:18       ` Harry Yoo [this message]
2026-02-09 19:03         ` Vlastimil Babka
2026-02-06 17:13 ` [PATCH 2/2] mm/slab: use prandom " Harry Yoo
2026-02-06 18:27   ` Vlastimil Babka
2026-02-06 19:22     ` Alexei Starovoitov
2026-02-07  1:25       ` Harry Yoo
2026-02-06 17:37 ` [PATCH 0/2] mm/slab: fix lockdep warnings with kmalloc_nolock() Harry Yoo
2026-02-09 19:03   ` Vlastimil Babka
