linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Harry Yoo <harry.yoo@oracle.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	ranxiaokai627@163.com, Andrew Morton <akpm@linux-foundation.org>,
	cl@gentwo.org, David Rientjes <rientjes@google.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Alexei Starovoitov <ast@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	ran.xiaokai@zte.com.cn
Subject: Re: [PATCH] slab: Fix using this_cpu_ptr() in preemptible context
Date: Thu, 2 Oct 2025 18:00:45 +0900	[thread overview]
Message-ID: <aN4_PeDcDQUhv6N-@hyeyoo> (raw)
In-Reply-To: <b9fb0a69-6cfb-4285-8118-ef5301115948@suse.cz>

On Thu, Oct 02, 2025 at 10:14:55AM +0200, Vlastimil Babka wrote:
> On 9/30/25 13:19, Alexei Starovoitov wrote:
> > On Tue, Sep 30, 2025 at 12:54 PM Harry Yoo <harry.yoo@oracle.com> wrote:
> >>
> >> On Tue, Sep 30, 2025 at 08:34:02AM +0000, ranxiaokai627@163.com wrote:
> >> > From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> >> >
> >> > defer_free() maybe called in preemptible context, this will
> >> > trigger the below warning message:
> >> >
> >> > BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
> >> > caller is defer_free+0x1b/0x60
> >> > Call Trace:
> >> >  <TASK>
> >> >  dump_stack_lvl+0xac/0xc0
> >> >  check_preemption_disabled+0xbe/0xe0
> >> >  defer_free+0x1b/0x60
> >> >  kfree_nolock+0x1eb/0x2b0
> >> >  alloc_slab_obj_exts+0x356/0x390
> > 
> > Please share config and repro details, since the stack trace
> > looks theoretical, but you somehow got it?
> > This is not CONFIG_SLUB_TINY, but kfree_nolock()
> > sees locked per-cpu slab?
> 
> Could it be just the "slab != c->slab" condition in do_slab_free()? That's
> more likely.

Agreed that we're seeing the case.

> However...
> > Is this PREEMPT_RT ?
> > 
> >> >  __alloc_tagging_slab_alloc_hook+0xa0/0x300
> >> >  __kmalloc_cache_noprof+0x1c4/0x5c0
> >> >  __set_page_owner+0x10d/0x1c0
> 
> This is the part that puzzles me, where do we call kmalloc from
> __set_page_owner()?

It's

__set_page_owner()
-> inc_stack_record_count()
-> add_stack_record_to_list()
-> kmalloc().

> And in a way that it loses the GFP_KERNEL passed all the
> way? I don't even see a lib/stackdepot function here.

Oh wait, we clear __GFP_RECLAIM on the first attempt to allocate
high-order slabs. so gfpflags_allow_spinning() returns false.

> >> >  post_alloc_hook+0x84/0xf0
> >> >  get_page_from_freelist+0x73b/0x1380
> >> >  __alloc_frozen_pages_noprof+0x110/0x2c0
> >> >  alloc_pages_mpol+0x44/0x140
> >> >  alloc_slab_page+0xac/0x150
> >> >  allocate_slab+0x78/0x3a0
> >> >  ___slab_alloc+0x76b/0xed0
> >> >  __slab_alloc.constprop.0+0x5a/0xb0
> >> >  __kmalloc_noprof+0x3dc/0x6d0
> >> >  __list_lru_init+0x6c/0x210
> 
> This has a kcalloc(GFP_KERNEL).

Right.
 
> >> >  alloc_super+0x3b6/0x470
> >> >  sget_fc+0x5f/0x3a0
> >> >  get_tree_nodev+0x27/0x90
> >> >  vfs_get_tree+0x26/0xc0
> >> >  vfs_kern_mount.part.0+0xb6/0x140
> >> >  kern_mount+0x24/0x40
> >> >  init_pipe_fs+0x4f/0x70
> >> >  do_one_initcall+0x62/0x2e0
> >> >  kernel_init_freeable+0x25b/0x4b0
> 
> Here we've set the full gfp_allowed_mask already so it's not masking our
> GFP_KERNEL.

Right.

-- 
Cheers,
Harry / Hyeonggon

> >> >  kernel_init+0x1a/0x1c0
> >> >  ret_from_fork+0x290/0x2e0
> >> >  ret_from_fork_asm+0x11/0x20
> >> > </TASK>
> >> >
> >> > Replace this_cpu_ptr with raw_cpu_ptr to eliminate
> >> > the above warning message.
> >> >
> >> > Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
> >>
> >> There's no mainline commit hash yet, should be adjusted later.
> >>
> >> > Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> >> > ---
> >> >  mm/slub.c | 4 ++--
> >> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/mm/slub.c b/mm/slub.c
> >> > index 1433f5b988f7..67c57f1b5a86 100644
> >> > --- a/mm/slub.c
> >> > +++ b/mm/slub.c
> >> > @@ -6432,7 +6432,7 @@ static void free_deferred_objects(struct irq_work *work)
> >> >
> >> >  static void defer_free(struct kmem_cache *s, void *head)
> >> >  {
> >> > -     struct defer_free *df = this_cpu_ptr(&defer_free_objects);
> >> > +     struct defer_free *df = raw_cpu_ptr(&defer_free_objects);
> >>
> >> This suppresses warning, but let's answer the question;
> >> Is it actually safe to not disable preemption here?
> >>
> >> >       if (llist_add(head + s->offset, &df->objects))
> >>
> >> Let's say a task was running on CPU X and migrated to a different CPU
> >> (say, Y) after returning from llist_add() or before calling llist_add(),
> >> then we're queueing the irq_work of CPU X on CPU Y.
> >>
> >> I think technically this should be safe because, although we're using
> >> per-cpu irq_work here, the irq_work framework itself is designed to handle
> >> concurrent access from multiple CPUs (otherwise it won't be safe to use
> >> a global irq_work like in other places) by using lockless list, which
> >> uses try_cmpxchg() and xchg() for atomic update.
> >>
> >> So if I'm not missing something it should be safe, but it was very
> >> confusing to confirm that it's safe as we're using per-cpu irq_work...
> >>
> >> I don't think these paths are very performance critical, so why not disable
> >> preemption instead of replacing it with raw_cpu_ptr()?
> > 
> > +1.
> > Though irq_work_queue() works for any irq_work it should
> > be used for current cpu, since it IPIs itself.
> > So pls use guard(preempt)(); instead.
> 
> Agreed. But we should fix it like this. But the report is strange.


  reply	other threads:[~2025-10-02  9:01 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-30  8:34 ranxiaokai627
2025-09-30 10:53 ` Harry Yoo
2025-09-30 11:19   ` Alexei Starovoitov
2025-10-02  8:14     ` Vlastimil Babka
2025-10-02  9:00       ` Harry Yoo [this message]
2025-10-03  6:52         ` Vlastimil Babka
2025-10-03  7:49 ` Vlastimil Babka
2025-10-03 15:01   ` Alexei Starovoitov
2025-10-13  7:00   ` Harry Yoo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aN4_PeDcDQUhv6N-@hyeyoo \
    --to=harry.yoo@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ast@kernel.org \
    --cc=cl@gentwo.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ran.xiaokai@zte.com.cn \
    --cc=ranxiaokai627@163.com \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox