* [PATCH] slab: Fix using this_cpu_ptr() in preemptible context
@ 2025-09-30 8:34 ranxiaokai627
2025-09-30 10:53 ` Harry Yoo
2025-10-03 7:49 ` Vlastimil Babka
From: ranxiaokai627 @ 2025-09-30 8:34 UTC
To: vbabka, akpm, cl, rientjes, roman.gushchin, harry.yoo, ast
Cc: linux-kernel, linux-mm, ran.xiaokai, ranxiaokai627
From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
defer_free() may be called in preemptible context, which triggers
the warning below:
BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
caller is defer_free+0x1b/0x60
Call Trace:
<TASK>
dump_stack_lvl+0xac/0xc0
check_preemption_disabled+0xbe/0xe0
defer_free+0x1b/0x60
kfree_nolock+0x1eb/0x2b0
alloc_slab_obj_exts+0x356/0x390
__alloc_tagging_slab_alloc_hook+0xa0/0x300
__kmalloc_cache_noprof+0x1c4/0x5c0
__set_page_owner+0x10d/0x1c0
post_alloc_hook+0x84/0xf0
get_page_from_freelist+0x73b/0x1380
__alloc_frozen_pages_noprof+0x110/0x2c0
alloc_pages_mpol+0x44/0x140
alloc_slab_page+0xac/0x150
allocate_slab+0x78/0x3a0
___slab_alloc+0x76b/0xed0
__slab_alloc.constprop.0+0x5a/0xb0
__kmalloc_noprof+0x3dc/0x6d0
__list_lru_init+0x6c/0x210
alloc_super+0x3b6/0x470
sget_fc+0x5f/0x3a0
get_tree_nodev+0x27/0x90
vfs_get_tree+0x26/0xc0
vfs_kern_mount.part.0+0xb6/0x140
kern_mount+0x24/0x40
init_pipe_fs+0x4f/0x70
do_one_initcall+0x62/0x2e0
kernel_init_freeable+0x25b/0x4b0
kernel_init+0x1a/0x1c0
ret_from_fork+0x290/0x2e0
ret_from_fork_asm+0x11/0x20
</TASK>
Replace this_cpu_ptr() with raw_cpu_ptr() to eliminate the warning.
Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
---
mm/slub.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 1433f5b988f7..67c57f1b5a86 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -6432,7 +6432,7 @@ static void free_deferred_objects(struct irq_work *work)
 
 static void defer_free(struct kmem_cache *s, void *head)
 {
-	struct defer_free *df = this_cpu_ptr(&defer_free_objects);
+	struct defer_free *df = raw_cpu_ptr(&defer_free_objects);
 
 	if (llist_add(head + s->offset, &df->objects))
 		irq_work_queue(&df->work);
@@ -6440,7 +6440,7 @@ static void defer_free(struct kmem_cache *s, void *head)
 
 static void defer_deactivate_slab(struct slab *slab, void *flush_freelist)
 {
-	struct defer_free *df = this_cpu_ptr(&defer_free_objects);
+	struct defer_free *df = raw_cpu_ptr(&defer_free_objects);
 
 	slab->flush_freelist = flush_freelist;
 	if (llist_add(&slab->llnode, &df->slabs))
2.25.1
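
[A side-by-side sketch of the two approaches weighed in the replies
below, based on the hunks above; illustrative only, not a complete
patch. Option (a) is what this patch proposes; option (b), disabling
preemption, is what the thread converges on.]

	/* (a) this patch: the raw accessor skips the debug check, but
	 * the task may still migrate between computing df and using it */
	struct defer_free *df = raw_cpu_ptr(&defer_free_objects);

	/* (b) alternative: pin the task to one CPU for the section, so
	 * that this_cpu_ptr() is legitimate again */
	preempt_disable();
	df = this_cpu_ptr(&defer_free_objects);
	if (llist_add(head + s->offset, &df->objects))
		irq_work_queue(&df->work);
	preempt_enable();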
* Re: [PATCH] slab: Fix using this_cpu_ptr() in preemptible context

From: Harry Yoo @ 2025-09-30 10:53 UTC
To: ranxiaokai627
Cc: vbabka, akpm, cl, rientjes, roman.gushchin, ast, linux-kernel,
    linux-mm, ran.xiaokai

On Tue, Sep 30, 2025 at 08:34:02AM +0000, ranxiaokai627@163.com wrote:
> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>
> defer_free() may be called in preemptible context, which triggers
> the warning below:
>
> BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
> caller is defer_free+0x1b/0x60
> [stack trace snipped]
>
> Replace this_cpu_ptr() with raw_cpu_ptr() to eliminate the warning.
>
> Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")

There's no mainline commit hash yet, should be adjusted later.

> Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> ---
>  mm/slub.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 1433f5b988f7..67c57f1b5a86 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -6432,7 +6432,7 @@ static void free_deferred_objects(struct irq_work *work)
>
>  static void defer_free(struct kmem_cache *s, void *head)
>  {
> -	struct defer_free *df = this_cpu_ptr(&defer_free_objects);
> +	struct defer_free *df = raw_cpu_ptr(&defer_free_objects);

This suppresses the warning, but let's answer the question:
is it actually safe to not disable preemption here?

>  	if (llist_add(head + s->offset, &df->objects))

Let's say a task was running on CPU X and migrated to a different CPU
(say, Y) after returning from llist_add() or before calling
llist_add(); then we're queueing the irq_work of CPU X on CPU Y.

I think technically this should be safe because, although we're using
a per-cpu irq_work here, the irq_work framework itself is designed to
handle concurrent access from multiple CPUs (otherwise it wouldn't be
safe to use a global irq_work as other places do), using a lockless
list that relies on try_cmpxchg() and xchg() for atomic updates.

So unless I'm missing something it should be safe, but it was quite
confusing to confirm that, given that we're using a per-cpu irq_work...

I don't think these paths are very performance critical, so why not
disable preemption instead of replacing this_cpu_ptr() with
raw_cpu_ptr()?

> 		irq_work_queue(&df->work);
> @@ -6440,7 +6440,7 @@ static void defer_free(struct kmem_cache *s, void *head)
>
>  static void defer_deactivate_slab(struct slab *slab, void *flush_freelist)
>  {
> -	struct defer_free *df = this_cpu_ptr(&defer_free_objects);
> +	struct defer_free *df = raw_cpu_ptr(&defer_free_objects);
>
>  	slab->flush_freelist = flush_freelist;
>  	if (llist_add(&slab->llnode, &df->slabs))
> --
> 2.25.1

-- 
Cheers,
Harry / Hyeonggon
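
[Harry's point about the lockless list can be seen in the llist core.
A paraphrased sketch of the push operation behind llist_add(), based
on lib/llist.c; details may differ by kernel version:]

	/*
	 * Lockless push: safe even when several CPUs add to the same
	 * list concurrently, which is why a cross-CPU llist_add() does
	 * not corrupt anything by itself.
	 */
	static inline bool sketch_llist_add(struct llist_node *new,
					    struct llist_head *head)
	{
		struct llist_node *first = READ_ONCE(head->first);

		do {
			new->next = first;
		} while (!try_cmpxchg(&head->first, &first, new));

		/* true if the list was empty: caller must queue the work */
		return !first;
	}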
* Re: [PATCH] slab: Fix using this_cpu_ptr() in preemptible context

From: Alexei Starovoitov @ 2025-09-30 11:19 UTC
To: Harry Yoo
Cc: ranxiaokai627, Vlastimil Babka, Andrew Morton, cl, David Rientjes,
    Roman Gushchin, Alexei Starovoitov, LKML, linux-mm, ran.xiaokai

On Tue, Sep 30, 2025 at 12:54 PM Harry Yoo <harry.yoo@oracle.com> wrote:
>
> On Tue, Sep 30, 2025 at 08:34:02AM +0000, ranxiaokai627@163.com wrote:
> > From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> >
> > defer_free() may be called in preemptible context, which triggers
> > the warning below:
> >
> > BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
> > caller is defer_free+0x1b/0x60
> > Call Trace:
> > <TASK>
> > dump_stack_lvl+0xac/0xc0
> > check_preemption_disabled+0xbe/0xe0
> > defer_free+0x1b/0x60
> > kfree_nolock+0x1eb/0x2b0
> > alloc_slab_obj_exts+0x356/0x390

Please share config and repro details, since the stack trace
looks theoretical, but you somehow got it?
This is not CONFIG_SLUB_TINY, but kfree_nolock()
sees a locked per-cpu slab?
Is this PREEMPT_RT?

> > [rest of trace and patch snipped]
>
> This suppresses the warning, but let's answer the question:
> is it actually safe to not disable preemption here?
>
> Let's say a task was running on CPU X and migrated to a different CPU
> (say, Y) after returning from llist_add() or before calling
> llist_add(); then we're queueing the irq_work of CPU X on CPU Y.
>
> [...]
>
> I don't think these paths are very performance critical, so why not
> disable preemption instead of replacing this_cpu_ptr() with
> raw_cpu_ptr()?

+1.
Though irq_work_queue() works for any irq_work, it should
be used for the current cpu, since it IPIs itself.
So pls use guard(preempt)(); instead.
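
[guard(preempt)() is the scope-based form of preempt_disable()/
preempt_enable(). A sketch of how it is declared and used, paraphrased
from include/linux/preempt.h and include/linux/cleanup.h; details may
vary by kernel version:]

	/* declared approximately as: */
	DEFINE_LOCK_GUARD_0(preempt, preempt_disable(), preempt_enable())

	/* so a caller can write: */
	static void example(void)
	{
		guard(preempt)();	/* preempt_disable() here ... */

		/* ... this_cpu_ptr() is safe anywhere in this scope ... */
	}				/* ... preempt_enable() on scope exit */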
* Re: [PATCH] slab: Fix using this_cpu_ptr() in preemptible context

From: Vlastimil Babka @ 2025-10-02 8:14 UTC
To: Alexei Starovoitov, Harry Yoo
Cc: ranxiaokai627, Andrew Morton, cl, David Rientjes, Roman Gushchin,
    Alexei Starovoitov, LKML, linux-mm, ran.xiaokai

On 9/30/25 13:19, Alexei Starovoitov wrote:
> On Tue, Sep 30, 2025 at 12:54 PM Harry Yoo <harry.yoo@oracle.com> wrote:
>>
>> On Tue, Sep 30, 2025 at 08:34:02AM +0000, ranxiaokai627@163.com wrote:
>> > From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>> >
>> > defer_free() may be called in preemptible context, which triggers
>> > the warning below:
>> >
>> > BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
>> > caller is defer_free+0x1b/0x60
>> > Call Trace:
>> > <TASK>
>> > dump_stack_lvl+0xac/0xc0
>> > check_preemption_disabled+0xbe/0xe0
>> > defer_free+0x1b/0x60
>> > kfree_nolock+0x1eb/0x2b0
>> > alloc_slab_obj_exts+0x356/0x390
>
> Please share config and repro details, since the stack trace
> looks theoretical, but you somehow got it?
> This is not CONFIG_SLUB_TINY, but kfree_nolock()
> sees a locked per-cpu slab?

Could it be just the "slab != c->slab" condition in do_slab_free()?
That's more likely. However...

> Is this PREEMPT_RT?
>
>> > __alloc_tagging_slab_alloc_hook+0xa0/0x300
>> > __kmalloc_cache_noprof+0x1c4/0x5c0
>> > __set_page_owner+0x10d/0x1c0

This is the part that puzzles me, where do we call kmalloc from
__set_page_owner()? And in a way that it loses the GFP_KERNEL passed
all the way? I don't even see a lib/stackdepot function here.

>> > post_alloc_hook+0x84/0xf0
>> > get_page_from_freelist+0x73b/0x1380
>> > __alloc_frozen_pages_noprof+0x110/0x2c0
>> > alloc_pages_mpol+0x44/0x140
>> > alloc_slab_page+0xac/0x150
>> > allocate_slab+0x78/0x3a0
>> > ___slab_alloc+0x76b/0xed0
>> > __slab_alloc.constprop.0+0x5a/0xb0
>> > __kmalloc_noprof+0x3dc/0x6d0
>> > __list_lru_init+0x6c/0x210

This has a kcalloc(GFP_KERNEL).

>> > alloc_super+0x3b6/0x470
>> > sget_fc+0x5f/0x3a0
>> > get_tree_nodev+0x27/0x90
>> > vfs_get_tree+0x26/0xc0
>> > vfs_kern_mount.part.0+0xb6/0x140
>> > kern_mount+0x24/0x40
>> > init_pipe_fs+0x4f/0x70
>> > do_one_initcall+0x62/0x2e0
>> > kernel_init_freeable+0x25b/0x4b0

Here we've set the full gfp_allowed_mask already so it's not masking
our GFP_KERNEL.

>> > [rest of trace and patch snipped]
>>
>> I don't think these paths are very performance critical, so why not
>> disable preemption instead of replacing this_cpu_ptr() with
>> raw_cpu_ptr()?
>
> +1.
> Though irq_work_queue() works for any irq_work, it should
> be used for the current cpu, since it IPIs itself.
> So pls use guard(preempt)(); instead.

Agreed, we should fix it like this. But the report is strange.
* Re: [PATCH] slab: Fix using this_cpu_ptr() in preemptible context

From: Harry Yoo @ 2025-10-02 9:00 UTC
To: Vlastimil Babka
Cc: Alexei Starovoitov, ranxiaokai627, Andrew Morton, cl,
    David Rientjes, Roman Gushchin, Alexei Starovoitov, LKML,
    linux-mm, ran.xiaokai

On Thu, Oct 02, 2025 at 10:14:55AM +0200, Vlastimil Babka wrote:
> On 9/30/25 13:19, Alexei Starovoitov wrote:
> > On Tue, Sep 30, 2025 at 12:54 PM Harry Yoo <harry.yoo@oracle.com> wrote:
> >> [...]
> >
> > Please share config and repro details, since the stack trace
> > looks theoretical, but you somehow got it?
> > This is not CONFIG_SLUB_TINY, but kfree_nolock()
> > sees a locked per-cpu slab?
>
> Could it be just the "slab != c->slab" condition in do_slab_free()?
> That's more likely.

Agreed, that is likely the case we're seeing.

> However...
> > Is this PREEMPT_RT?
> >
> >> > __alloc_tagging_slab_alloc_hook+0xa0/0x300
> >> > __kmalloc_cache_noprof+0x1c4/0x5c0
> >> > __set_page_owner+0x10d/0x1c0
>
> This is the part that puzzles me, where do we call kmalloc from
> __set_page_owner()?

It's

__set_page_owner()
  -> inc_stack_record_count()
    -> add_stack_record_to_list()
      -> kmalloc().

> And in a way that it loses the GFP_KERNEL passed all the
> way? I don't even see a lib/stackdepot function here.

Oh wait, we clear __GFP_RECLAIM on the first attempt to allocate
high-order slabs, so gfpflags_allow_spinning() returns false.

> >> > __kmalloc_noprof+0x3dc/0x6d0
> >> > __list_lru_init+0x6c/0x210
>
> This has a kcalloc(GFP_KERNEL).

Right.

> >> > kernel_init_freeable+0x25b/0x4b0
>
> Here we've set the full gfp_allowed_mask already so it's not masking
> our GFP_KERNEL.

Right.

-- 
Cheers,
Harry / Hyeonggon
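
[To make the chain Harry describes concrete, a paraphrased sketch
based on mm/slub.c:allocate_slab() and include/linux/gfp.h; the exact
flags and helper bodies differ by kernel version, so treat this as an
approximation:]

	/* gfp.h, roughly: spinning is allowed only if reclaim bits remain */
	static inline bool sketch_gfpflags_allow_spinning(gfp_t gfp_flags)
	{
		return !!(gfp_flags & __GFP_RECLAIM);
	}

	/* allocate_slab(), roughly: the opportunistic high-order attempt
	 * strips reclaim so that failure is cheap and silent */
	static gfp_t sketch_highorder_gfp(gfp_t flags /* GFP_KERNEL here */)
	{
		return (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_RECLAIM;
	}

	/*
	 * Hooks running under that allocation (__set_page_owner ->
	 * add_stack_record_to_list -> kmalloc -> alloc_slab_obj_exts)
	 * then see a mask for which spinning is "not allowed" and take
	 * the kfree_nolock() -> defer_free() path, even though the
	 * top-level caller passed GFP_KERNEL and is fully preemptible.
	 */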
* Re: [PATCH] slab: Fix using this_cpu_ptr() in preemptible context

From: Vlastimil Babka @ 2025-10-03 6:52 UTC
To: Harry Yoo
Cc: Alexei Starovoitov, ranxiaokai627, Andrew Morton, cl,
    David Rientjes, Roman Gushchin, Alexei Starovoitov, LKML,
    linux-mm, ran.xiaokai

On 10/2/25 11:00, Harry Yoo wrote:
> On Thu, Oct 02, 2025 at 10:14:55AM +0200, Vlastimil Babka wrote:
>> However...
>> > Is this PREEMPT_RT?
>> >
>> >> > __alloc_tagging_slab_alloc_hook+0xa0/0x300
>> >> > __kmalloc_cache_noprof+0x1c4/0x5c0
>> >> > __set_page_owner+0x10d/0x1c0
>>
>> This is the part that puzzles me, where do we call kmalloc from
>> __set_page_owner()?
>
> It's
>
> __set_page_owner()
>   -> inc_stack_record_count()
>     -> add_stack_record_to_list()
>       -> kmalloc().

Thanks, missed that.

>> And in a way that it loses the GFP_KERNEL passed all the
>> way? I don't even see a lib/stackdepot function here.
>
> Oh wait, we clear __GFP_RECLAIM on the first attempt to allocate
> high-order slabs, so gfpflags_allow_spinning() returns false.

Ah right! Dang, that's suboptimal: we intend to make an opportunistic
attempt, but end up limiting the allocations of supplementary objects
this way. But I don't see how to avoid this without inventing new gfp
flags.
* Re: [PATCH] slab: Fix using this_cpu_ptr() in preemptible context

From: Vlastimil Babka @ 2025-10-03 7:49 UTC
To: ranxiaokai627, akpm, cl, rientjes, roman.gushchin, harry.yoo, ast
Cc: linux-kernel, linux-mm, ran.xiaokai

On 9/30/25 10:34, ranxiaokai627@163.com wrote:
> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>
> defer_free() may be called in preemptible context, which triggers
> the warning below:
>
> [stack trace snipped]
>
> Replace this_cpu_ptr() with raw_cpu_ptr() to eliminate the warning.
>
> Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
> Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>

Since the slab PR was merged, we should ideally fix this before rc1,
so I updated the code myself per the discussion and added it to
slab/for-next.

Thanks!

----8<----
From d6306a3d5577769b179ae4e448fd144e2b0f7717 Mon Sep 17 00:00:00 2001
From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Date: Tue, 30 Sep 2025 08:34:02 +0000
Subject: [PATCH] slab: Fix using this_cpu_ptr() in preemptible context

defer_free() may be called in preemptible context, which triggers the
warning below:

BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
caller is defer_free+0x1b/0x60
Call Trace:
<TASK>
dump_stack_lvl+0xac/0xc0
check_preemption_disabled+0xbe/0xe0
defer_free+0x1b/0x60
kfree_nolock+0x1eb/0x2b0
alloc_slab_obj_exts+0x356/0x390
[...]
ret_from_fork+0x290/0x2e0
ret_from_fork_asm+0x11/0x20
</TASK>

Disable preemption in defer_free() and also defer_deactivate_slab() to
make it safe.

[vbabka@suse.cz: disable preemption instead of using raw_cpu_ptr() per
 the discussion]
Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Link: https://lore.kernel.org/r/20250930083402.782927-1-ranxiaokai627@163.com
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slub.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 1433f5b988f7..44aa0e3f48ee 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -6432,17 +6432,24 @@ static void free_deferred_objects(struct irq_work *work)
 
 static void defer_free(struct kmem_cache *s, void *head)
 {
-	struct defer_free *df = this_cpu_ptr(&defer_free_objects);
+	struct defer_free *df;
 
+	guard(preempt)();
+
+	df = this_cpu_ptr(&defer_free_objects);
 	if (llist_add(head + s->offset, &df->objects))
 		irq_work_queue(&df->work);
 }
 
 static void defer_deactivate_slab(struct slab *slab, void *flush_freelist)
 {
-	struct defer_free *df = this_cpu_ptr(&defer_free_objects);
+	struct defer_free *df;
 
 	slab->flush_freelist = flush_freelist;
+
+	guard(preempt)();
+
+	df = this_cpu_ptr(&defer_free_objects);
 	if (llist_add(&slab->llnode, &df->slabs))
 		irq_work_queue(&df->work);
 }
-- 
2.51.0
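
[For clarity, the two functions as they read with the patch applied,
derived from the diff above:]

	static void defer_free(struct kmem_cache *s, void *head)
	{
		struct defer_free *df;

		guard(preempt)();

		df = this_cpu_ptr(&defer_free_objects);
		if (llist_add(head + s->offset, &df->objects))
			irq_work_queue(&df->work);
	}

	static void defer_deactivate_slab(struct slab *slab, void *flush_freelist)
	{
		struct defer_free *df;

		slab->flush_freelist = flush_freelist;

		guard(preempt)();

		df = this_cpu_ptr(&defer_free_objects);
		if (llist_add(&slab->llnode, &df->slabs))
			irq_work_queue(&df->work);
	}

Preemption stays disabled through irq_work_queue() and is re-enabled
on function exit, so the work is always queued on the CPU whose
defer_free_objects instance was picked.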
* Re: [PATCH] slab: Fix using this_cpu_ptr() in preemptible context

From: Alexei Starovoitov @ 2025-10-03 15:01 UTC
To: Vlastimil Babka
Cc: ranxiaokai627, Andrew Morton, cl, David Rientjes, Roman Gushchin,
    Harry Yoo, Alexei Starovoitov, LKML, linux-mm, ran.xiaokai

On Fri, Oct 3, 2025 at 12:50 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> Disable preemption in defer_free() and also defer_deactivate_slab() to
> make it safe.
>
> [vbabka@suse.cz: disable preemption instead of using raw_cpu_ptr() per
>  the discussion]
> Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
> Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> Link: https://lore.kernel.org/r/20250930083402.782927-1-ranxiaokai627@163.com
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Acked-by: Alexei Starovoitov <ast@kernel.org>
* Re: [PATCH] slab: Fix using this_cpu_ptr() in preemptible context

From: Harry Yoo @ 2025-10-13 7:00 UTC
To: Vlastimil Babka
Cc: ranxiaokai627, akpm, cl, rientjes, roman.gushchin, ast,
    linux-kernel, linux-mm, ran.xiaokai

On Fri, Oct 03, 2025 at 09:49:57AM +0200, Vlastimil Babka wrote:
> On 9/30/25 10:34, ranxiaokai627@163.com wrote:
> > [original patch snipped]
>
> Since the slab PR was merged, we should ideally fix this before rc1,
> so I updated the code myself per the discussion and added it to
> slab/for-next.
>
> Thanks!
>
> [updated patch snipped]

Looks good to me,
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>

-- 
Cheers,
Harry / Hyeonggon
end of thread, other threads: [~2025-10-13 7:00 UTC | newest]

Thread overview: 9+ messages
2025-09-30  8:34 [PATCH] slab: Fix using this_cpu_ptr() in preemptible context ranxiaokai627
2025-09-30 10:53 ` Harry Yoo
2025-09-30 11:19   ` Alexei Starovoitov
2025-10-02  8:14     ` Vlastimil Babka
2025-10-02  9:00       ` Harry Yoo
2025-10-03  6:52         ` Vlastimil Babka
2025-10-03  7:49 ` Vlastimil Babka
2025-10-03 15:01   ` Alexei Starovoitov
2025-10-13  7:00   ` Harry Yoo