linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Catalin Marinas <catalin.marinas@arm.com>
To: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
Cc: Harry Yoo <harry.yoo@oracle.com>,
	Qing Wang <wangqing7171@gmail.com>,
	syzbot+cae7809e9dc1459e4e63@syzkaller.appspotmail.com,
	Liam.Howlett@oracle.com, akpm@linux-foundation.org,
	chao@kernel.org, jaegeuk@kernel.org, jannh@google.com,
	linkinjeon@kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, lorenzo.stoakes@oracle.com, pfalcato@suse.de,
	sj1557.seo@samsung.com, syzkaller-bugs@googlegroups.com,
	vbabka@suse.cz, Hao Li <hao.li@linux.dev>
Subject: Re: [syzbot] [mm?] [f2fs?] [exfat?] memory leak in __kfree_rcu_sheaf
Date: Fri, 6 Mar 2026 19:35:01 +0000	[thread overview]
Message-ID: <aassZV5PjgFx8dSI@arm.com> (raw)
In-Reply-To: <925a916a-6dfb-48c0-985c-0bdfb96ebd26@kernel.org>

On Wed, Mar 04, 2026 at 02:39:47PM +0100, Vlastimil Babka (SUSE) wrote:
> On 3/4/26 2:30 AM, Harry Yoo wrote:
> > [+Cc adding Catalin for kmemleak bits]
> > On Mon, Mar 02, 2026 at 09:39:48AM +0100, Vlastimil Babka (SUSE) wrote:
> >> On 3/2/26 04:41, Qing Wang wrote:
> >>> #syz test
> >>>
> >>> diff --git a/mm/slub.c b/mm/slub.c
> >>> index cdc1e652ec52..387979b89120 100644
> >>> --- a/mm/slub.c
> >>> +++ b/mm/slub.c
> >>> @@ -6307,15 +6307,21 @@ bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj)
> >>>  			goto fail;
> >>>  
> >>>  		if (!local_trylock(&s->cpu_sheaves->lock)) {
> >>> -			barn_put_empty_sheaf(barn, empty);
> >>> +			if (barn && data_race(barn->nr_empty) < MAX_EMPTY_SHEAVES)
> >>> +				barn_put_empty_sheaf(barn, empty);
> >>> +			else
> >>> +				free_empty_sheaf(s, empty);
> >>>  			goto fail;
> >>>  		}
> >>>  
> >>>  		pcs = this_cpu_ptr(s->cpu_sheaves);
> >>>  
> >>> -		if (unlikely(pcs->rcu_free))
> >>> -			barn_put_empty_sheaf(barn, empty);
> >>> -		else
> >>> +		if (unlikely(pcs->rcu_free)) {
> >>> +			if (barn && data_race(barn->nr_empty) < MAX_EMPTY_SHEAVES)
> >>> +				barn_put_empty_sheaf(barn, empty);
> >>> +			else
> >>> +				free_empty_sheaf(s, empty);
> >>> +		} else
> >>>  			pcs->rcu_free = empty;
> >>>  	}
> >>
> >> I don't think this would fix any leak, and syzbot agrees. It would limit the
> >> empty sheaves in barn more strictly, but they are not leaked.
> >> Hm I don't see any leak in __kfree_rcu_sheaf() or rcu_free_sheaf(). Wonder
> >> if kmemleak lacks visibility into barns or pcs's as roots for searching what
> >> objects are considered referenced, or something?
> > 
> > Objects that are allocated from slab and percpu allocator should be
> > properly tracked by kmemleak. But those allocated with
> > gfpflags_allow_spinning() == false are not tracked by kmemleak.
> > 
> > When barns and sheaves are allocated early (!gfpflags_allow_spinning()
> > due to gfp_allowed_mask) and it skips kmemleak_alloc_recursive(), 
> > it could produce false positives because from kmemleak's point of view,
> > the objects are not reachable from the root set (data section, stack,
> > etc.).
> 
> Good point.
> 
> > To me it seems kmemleak should gain allow_spin == false support
> > sooner or later.
> 
> Or we figure out how to deal with the false allow_spin == false during
> boot. Here I'm a bit confused how exactly it happens because AFAICS in
> slub we apply gfp_allowed_mask only when allocating a new slab, and in
> slab_post_alloc_hook() we apply it to init_mask. That is indeed passed
> to kmemleak_alloc_recursive() but not used for the
> gfpflags_allow_spinning() decision. kmemleak_alloc_recursive() should
> succeed because nobody should be holding any locks that would require
> spinning.

I don't fully understand what goes on. If kmemleak_alloc_recursive()
failed to allocate for some reason (other than SLAB_NOLEAKTRACE), it
would loudly disable kmemleak altogether and stop reporting leaks. Also
kmemleak doesn't care about allow_spin, it's only the slub code which
avoids calling kmemleak if spinning not allowed (as it takes some locks,
may call back into the slab allocator).

I wonder whether some early kmem_cache_node allocations like the ones in
early_kmem_cache_node_alloc() are not tracked and then kmemleak cannot
find n->barn. I got lost in the slub code, but something like this:

-----------8<-----------------------------------
diff --git a/mm/slub.c b/mm/slub.c
index 0c906fefc31b..401557ff5487 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -7513,6 +7513,7 @@ static void early_kmem_cache_node_alloc(int node)
 	slab->freelist = get_freepointer(kmem_cache_node, n);
 	slab->inuse = 1;
 	kmem_cache_node->node[node] = n;
+	kmemleak_alloc(n, sizeof(*n), 1, GFP_NOWAIT);
 	init_kmem_cache_node(n, NULL);
 	inc_slabs_node(kmem_cache_node, node, slab->objects);
 
-------------8<----------------------------------------

Another thing I noticed, not sure it's related but we should probably
ignore an object once it has been passed to kvfree_call_rcu(), similar
to what we do on the main path in this function. Also see commit
5f98fd034ca6 ("rcu: kmemleak: Ignore kmemleak false positives when
RCU-freeing objects") when we added this kmemleak_ignore().

---------8<-----------------------------------
diff --git a/mm/slab_common.c b/mm/slab_common.c
index d5a70a831a2a..73f4668d870d 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1954,8 +1954,14 @@ void kvfree_call_rcu(struct rcu_head *head, void *ptr)
 	if (!head)
 		might_sleep();
 
-	if (!IS_ENABLED(CONFIG_PREEMPT_RT) && kfree_rcu_sheaf(ptr))
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT) && kfree_rcu_sheaf(ptr)) {
+		/*
+		 * The object is now queued for deferred freeing via an RCU
+		 * sheaf. Tell kmemleak to ignore it.
+		 */
+		kmemleak_ignore(ptr);
 		return;
+	}
 
 	// Queue the object but don't yet schedule the batch.
 	if (debug_rcu_head_queue(ptr)) {
----------------8<-----------------------------------

-- 
Catalin


      reply	other threads:[~2026-03-06 19:35 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-09 18:26 syzbot
2026-03-02  3:41 ` Qing Wang
2026-03-02  3:57   ` syzbot
2026-03-02  8:39   ` Vlastimil Babka (SUSE)
2026-03-04  1:30     ` Harry Yoo
2026-03-04 13:39       ` Vlastimil Babka (SUSE)
2026-03-06 19:35         ` Catalin Marinas [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aassZV5PjgFx8dSI@arm.com \
    --to=catalin.marinas@arm.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=chao@kernel.org \
    --cc=hao.li@linux.dev \
    --cc=harry.yoo@oracle.com \
    --cc=jaegeuk@kernel.org \
    --cc=jannh@google.com \
    --cc=linkinjeon@kernel.org \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=pfalcato@suse.de \
    --cc=sj1557.seo@samsung.com \
    --cc=syzbot+cae7809e9dc1459e4e63@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=vbabka@kernel.org \
    --cc=vbabka@suse.cz \
    --cc=wangqing7171@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox