[PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk()

* [PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk()
@ 2025-11-11 12:53 Harry Yoo
  2025-11-11 13:13 ` Vlastimil Babka
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Harry Yoo @ 2025-11-11 12:53 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka
  Cc: Liam R . Howlett, Tytus Rogalewski, Darrick J . Wong,
	Christoph Lameter, David Rientjes, Roman Gushchin, Harry Yoo,
	linux-mm

The commit 989b09b73978 ("slab: skip percpu sheaves for remote object
freeing") introduced the remote_objects array in free_to_pcs_bulk() to
skip sheaves when objects from a remote node are freed.

However, the array is flushed only when:
  1) the array becomes full (++remote_nr >= PCS_BATCH_MAX), or
  2) slab_free_hook() returns false and size becomes zero.

When neither of the conditions is met, objects in the array are leaked.
This resulted in a memory leak [1], where 82 GiB of memory was allocated
for the maple_node cache.

Flush the array after successfully freeing objects to sheaves
in the do_free: path.

In the meantime, move the snippet if (!size) goto flush_remote; outside
the while loop for readability. Let's say all objects in the array are
from a remote node: then we acquire s->cpu_sheaves->lock and try to free
an object even when size is zero. This doesn't appear to be harmful,
but isn't really readable.

Reported-by: Tytus Rogalewski <tytanick@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220765 [1]
Closes: https://lore.kernel.org/linux-mm/20251107094809.12e9d705b7bf4815783eb184@linux-foundation.org
Closes: https://lore.kernel.org/all/aRGDTwbt2EIz2CYn@hyeyoo
Fixes: 989b09b73978 ("slab: skip percpu sheaves for remote object freeing")
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
---
 mm/slub.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index f1a5373eee7b..a787687a0d59 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -6332,8 +6332,6 @@ static void free_to_pcs_bulk(struct kmem_cache *s, size_t size, void **p)

 		if (unlikely(!slab_free_hook(s, p[i], init, false))) {
 			p[i] = p[--size];
-			if (!size)
-				goto flush_remote;
 			continue;
 		}

@@ -6348,6 +6346,9 @@ static void free_to_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
 		i++;
 	}

+	if (!size)
+		goto flush_remote;
+
 next_batch:
 	if (!local_trylock(&s->cpu_sheaves->lock))
 		goto fallback;
@@ -6402,6 +6403,9 @@ static void free_to_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
 		goto next_batch;
 	}

+	if (remote_nr)
+		goto flush_remote;
+
 	return;

 no_empty:
-- 
2.43.0
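
The leak pattern described in the commit message can be illustrated with a small userspace sketch. This is not the kernel code: `struct obj`, `BATCH_MAX`, `flush_staged()` and `bulk_free()` are hypothetical stand-ins, the "local" path simply marks objects freed instead of using sheaves, and the batch size of 4 is an arbitrary assumption. It only shows why a staging array that is flushed solely when full loses whatever is left in it at the end, and how a final flush (as the patch adds after the do_free: path) avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH_MAX 4	/* hypothetical stand-in for PCS_BATCH_MAX */

/* Hypothetical object: "remote" models an object from another NUMA node. */
struct obj {
	bool remote;
	bool freed;
};

/* Release every staged object and reset the staging count. */
static void flush_staged(struct obj **staged, size_t *nr)
{
	assert(*nr <= BATCH_MAX);
	for (size_t j = 0; j < *nr; j++)
		staged[j]->freed = true;
	*nr = 0;
}

/*
 * Free "size" objects from p[]. Remote objects are moved to a staging
 * array that is flushed whenever it becomes full; local objects are
 * freed directly. With final_flush == false the function mimics the
 * buggy flow: a partially filled staging array is never flushed.
 * Returns the number of objects left staged (that is, leaked).
 */
size_t bulk_free(struct obj **p, size_t size, bool final_flush)
{
	struct obj *staged[BATCH_MAX];
	size_t staged_nr = 0;
	size_t i = 0;

	while (i < size) {
		if (p[i]->remote) {
			staged[staged_nr] = p[i];
			p[i] = p[--size];	/* compact the array */
			if (++staged_nr >= BATCH_MAX)
				flush_staged(staged, &staged_nr);
			continue;
		}
		i++;
	}

	/* Local objects: freed directly in this sketch. */
	for (i = 0; i < size; i++)
		p[i]->freed = true;

	/* The step the patch adds: flush whatever is still staged. */
	if (final_flush && staged_nr)
		flush_staged(staged, &staged_nr);

	return staged_nr;
}
```

In this sketch, calling bulk_free() with final_flush set to false on six objects, three of them remote, leaves three objects staged and never released; with final_flush set to true none are left behind.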

* Re: [PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk()
  2025-11-11 12:53 [PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk() Harry Yoo
@ 2025-11-11 13:13 ` Vlastimil Babka
  2025-11-11 15:37 ` Liam R. Howlett
  2025-11-12 18:46 ` Darrick J. Wong
  2 siblings, 0 replies; 12+ messages in thread
From: Vlastimil Babka @ 2025-11-11 13:13 UTC (permalink / raw)
  To: Harry Yoo, Andrew Morton
  Cc: Liam R . Howlett, Tytus Rogalewski, Darrick J . Wong,
	Christoph Lameter, David Rientjes, Roman Gushchin, linux-mm

On 11/11/25 13:53, Harry Yoo wrote:
> The commit 989b09b73978 ("slab: skip percpu sheaves for remote object
> freeing") introduced the remote_objects array in free_to_pcs_bulk() to
> skip sheaves when objects from a remote node are freed.
> 
> However, the array is flushed only when:
>   1) the array becomes full (++remote_nr >= PCS_BATCH_MAX), or
>   2) slab_free_hook() returns false and size becomes zero.
> 
> When neither of the conditions is met, objects in the array are leaked.
> This resulted in a memory leak [1], where 82 GiB of memory was allocated
> for the maple_node cache.
> 
> Flush the array after successfully freeing objects to sheaves
> in the do_free: path.
> 
> In the meantime, move the snippet if (!size) goto flush_remote; outside
> the while loop for readability. Let's say all objects in the array are
> from a remote node: then we acquire s->cpu_sheaves->lock and try to free
> an object even when size is zero. This doesn't appear to be harmful,
> but isn't really readable.
> 
> Reported-by: Tytus Rogalewski <tytanick@gmail.com>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220765 [1]
> Closes: https://lore.kernel.org/linux-mm/20251107094809.12e9d705b7bf4815783eb184@linux-foundation.org
> Closes: https://lore.kernel.org/all/aRGDTwbt2EIz2CYn@hyeyoo
> Fixes: 989b09b73978 ("slab: skip percpu sheaves for remote object freeing")
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>

Thanks a lot! Adding to slab/for-next-fixes
Vlastimil

* Re: [PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk()
  2025-11-11 12:53 [PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk() Harry Yoo
  2025-11-11 13:13 ` Vlastimil Babka
@ 2025-11-11 15:37 ` Liam R. Howlett
  2025-11-11 16:48   ` Tytus Rogalewski
  2025-11-12 18:46 ` Darrick J. Wong
  2 siblings, 1 reply; 12+ messages in thread
From: Liam R. Howlett @ 2025-11-11 15:37 UTC (permalink / raw)
  To: Harry Yoo
  Cc: Andrew Morton, Vlastimil Babka, Tytus Rogalewski,
	Darrick J . Wong, Christoph Lameter, David Rientjes,
	Roman Gushchin, linux-mm

* Harry Yoo <harry.yoo@oracle.com> [251111 07:55]:
> The commit 989b09b73978 ("slab: skip percpu sheaves for remote object
> freeing") introduced the remote_objects array in free_to_pcs_bulk() to
> skip sheaves when objects from a remote node are freed.
> 
> However, the array is flushed only when:
>   1) the array becomes full (++remote_nr >= PCS_BATCH_MAX), or
>   2) slab_free_hook() returns false and size becomes zero.
> 
> When neither of the conditions is met, objects in the array are leaked.
> This resulted in a memory leak [1], where 82 GiB of memory was allocated
> for the maple_node cache.
> 
> Flush the array after successfully freeing objects to sheaves
> in the do_free: path.
> 
> In the meantime, move the snippet if (!size) goto flush_remote; outside
> the while loop for readability. Let's say all objects in the array are
> from a remote node: then we acquire s->cpu_sheaves->lock and try to free
> an object even when size is zero. This doesn't appear to be harmful,
> but isn't really readable.
> 
> Reported-by: Tytus Rogalewski <tytanick@gmail.com>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220765 [1]
> Closes: https://lore.kernel.org/linux-mm/20251107094809.12e9d705b7bf4815783eb184@linux-foundation.org
> Closes: https://lore.kernel.org/all/aRGDTwbt2EIz2CYn@hyeyoo
> Fixes: 989b09b73978 ("slab: skip percpu sheaves for remote object freeing")
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>


Thanks Harry.

Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>

* Re: [PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk()
  2025-11-11 15:37 ` Liam R. Howlett
@ 2025-11-11 16:48   ` Tytus Rogalewski
  2025-11-11 18:26     ` Harry Yoo
  0 siblings, 1 reply; 12+ messages in thread
From: Tytus Rogalewski @ 2025-11-11 16:48 UTC (permalink / raw)
  To: Liam R. Howlett, Harry Yoo, Andrew Morton, Vlastimil Babka,
	Tytus Rogalewski, Darrick J . Wong, Christoph Lameter,
	David Rientjes, Roman Gushchin, linux-mm

Do you guys still need that debug then?
I think this happens only while a QEMU VM is running.

I can get results within 1-2 days.

* Re: [PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk()
  2025-11-11 16:48   ` Tytus Rogalewski
@ 2025-11-11 18:26     ` Harry Yoo
  2025-11-12 14:47       ` Tytus Rogalewski
  0 siblings, 1 reply; 12+ messages in thread
From: Harry Yoo @ 2025-11-11 18:26 UTC (permalink / raw)
  To: Tytus Rogalewski
  Cc: Liam R. Howlett, Andrew Morton, Vlastimil Babka,
	Darrick J . Wong, Christoph Lameter, David Rientjes,
	Roman Gushchin, linux-mm

On Tue, Nov 11, 2025 at 05:48:35PM +0100, Tytus Rogalewski wrote:
> Do you guys still need that debug then?
> I think this happens only while a QEMU VM is running.
> 
> I can get results within 1-2 days.

Hi Tytus!

Really appreciate you reporting the bug and testing it.

Now that I know what went wrong, I realize that the `slab_debug=U` parameter
will hide the bug, since we disable the "sheaves" feature for
debug caches.

Instead of testing with the `slab_debug=U` parameter, could you please
apply this patch on top of Linux v6.18-rc5, build & install it,
and verify that the memory leak is indeed resolved on your machine?

-- 
Cheers,
Harry / Hyeonggon


* Re: [PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk()
  2025-11-11 18:26     ` Harry Yoo
@ 2025-11-12 14:47       ` Tytus Rogalewski
  2025-11-13  0:42         ` Harry Yoo
  0 siblings, 1 reply; 12+ messages in thread
From: Tytus Rogalewski @ 2025-11-12 14:47 UTC (permalink / raw)
  To: Harry Yoo
  Cc: Liam R. Howlett, Andrew Morton, Vlastimil Babka,
	Darrick J . Wong, Christoph Lameter, David Rientjes,
	Roman Gushchin, linux-mm

We won't make it until next week.
Maybe you guys can compile the newest rc5 kernel with that patch?
We are using https://prebuiltkernels.com/ and did not compile 6.18
ourselves. We can do that next week. This week is full of emergencies lol
If you can provide me with two debs like the prebuilt kernels, I could deploy them and
leave them for testing for 1-2 days.


* Re: [PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk()
  2025-11-11 12:53 [PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk() Harry Yoo
  2025-11-11 13:13 ` Vlastimil Babka
  2025-11-11 15:37 ` Liam R. Howlett
@ 2025-11-12 18:46 ` Darrick J. Wong
  2025-11-13  0:43   ` Harry Yoo
  2 siblings, 1 reply; 12+ messages in thread
From: Darrick J. Wong @ 2025-11-12 18:46 UTC (permalink / raw)
  To: Harry Yoo
  Cc: Andrew Morton, Vlastimil Babka, Liam R . Howlett,
	Tytus Rogalewski, Christoph Lameter, David Rientjes,
	Roman Gushchin, linux-mm

On Tue, Nov 11, 2025 at 09:53:31PM +0900, Harry Yoo wrote:
> The commit 989b09b73978 ("slab: skip percpu sheaves for remote object
> freeing") introduced the remote_objects array in free_to_pcs_bulk() to
> skip sheaves when objects from a remote node are freed.
> 
> However, the array is flushed only when:
>   1) the array becomes full (++remote_nr >= PCS_BATCH_MAX), or
>   2) slab_free_hook() returns false and size becomes zero.
> 
> When neither of the conditions is met, objects in the array are leaked.
> This resulted in a memory leak [1], where 82 GiB of memory was allocated
> for the maple_node cache.
> 
> Flush the array after successfully freeing objects to sheaves
> in the do_free: path.
> 
> In the meantime, move the snippet if (!size) goto flush_remote; outside
> the while loop for readability. Let's say all objects in the array are
> from a remote node: then we acquire s->cpu_sheaves->lock and try to free
> an object even when size is zero. This doesn't appear to be harmful,
> but isn't really readable.

I'll put this on my test fleet this evening.  Thank you for the quick
fix! :)

--D

* Re: [PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk()
  2025-11-12 14:47       ` Tytus Rogalewski
@ 2025-11-13  0:42         ` Harry Yoo
  0 siblings, 0 replies; 12+ messages in thread
From: Harry Yoo @ 2025-11-13  0:42 UTC (permalink / raw)
  To: Tytus Rogalewski
  Cc: Liam R. Howlett, Andrew Morton, Vlastimil Babka,
	Darrick J . Wong, Christoph Lameter, David Rientjes,
	Roman Gushchin, linux-mm

On Wed, Nov 12, 2025 at 03:47:52PM +0100, Tytus Rogalewski wrote:
> We won't make it until next week.
> Maybe you guys can compile the newest rc5 kernel with that patch?
> We are using https://prebuiltkernels.com/ and did not compile 6.18
> ourselves. We can do that next week.

I built it and uploaded it to my personal server:
http://download.kerneltesting.org/linux-6.18.0-rc5-fix.zip

But if you prefer to test images from prebuiltkernels.com, I think it's
fine to wait for a week and test 6.18.0-rc6. I guess this will land in -rc6
anyway.

> This week is full of emergencies lol

Haha I see, I can imagine what'll happen when you test the latest kernels...

> If you can provide me with two debs like the prebuilt kernels, I could deploy them and
> leave them for testing for 1-2 days.

Thanks a lot!
 
-- 
Cheers,
Harry / Hyeonggon


* Re: [PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk()
  2025-11-12 18:46 ` Darrick J. Wong
@ 2025-11-13  0:43   ` Harry Yoo
  2025-11-13 17:01     ` Darrick J. Wong
  2025-11-13 17:02     ` Tytus Rogalewski
  0 siblings, 2 replies; 12+ messages in thread
From: Harry Yoo @ 2025-11-13  0:43 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Andrew Morton, Vlastimil Babka, Liam R . Howlett,
	Tytus Rogalewski, Christoph Lameter, David Rientjes,
	Roman Gushchin, linux-mm

On Wed, Nov 12, 2025 at 10:46:45AM -0800, Darrick J. Wong wrote:
> On Tue, Nov 11, 2025 at 09:53:31PM +0900, Harry Yoo wrote:
> > The commit 989b09b73978 ("slab: skip percpu sheaves for remote object
> > freeing") introduced the remote_objects array in free_to_pcs_bulk() to
> > skip sheaves when objects from a remote node are freed.
> > 
> > However, the array is flushed only when:
> >   1) the array becomes full (++remote_nr >= PCS_BATCH_MAX), or
> >   2) slab_free_hook() returns false and size becomes zero.
> > 
> > When neither of the conditions is met, objects in the array are leaked.
> > This resulted in a memory leak [1], where 82 GiB of memory was allocated
> > for the maple_node cache.
> > 
> > Flush the array after successfully freeing objects to sheaves
> > in the do_free: path.
> > 
> > In the meantime, move the snippet if (!size) goto flush_remote; outside
> > the while loop for readability. Let's say all objects in the array are
> > from a remote node: then we acquire s->cpu_sheaves->lock and try to free
> > an object even when size is zero. This doesn't appear to be harmful,
> > but isn't really readable.
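
A minimal userspace sketch of the control flow described above, for readers
who want to see the leak in isolation. This is not the kernel code:
model_bulk_free(), struct model_stats and MODEL_BATCH_MAX are hypothetical
names, objects are reduced to their NUMA node ids, and the slab_free_hook()
path (condition 2) is left out. It only assumes that remote objects are
parked in an array that is flushed when full, and that the patch adds a
flush of whatever remains.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_BATCH_MAX 4	/* stand-in for PCS_BATCH_MAX; value is arbitrary */

struct model_stats {
	size_t to_sheaf;	/* local objects that would go to a sheaf */
	size_t flushed;		/* remote objects passed to the flush path */
	size_t leaked;		/* remote objects still in the array on return */
};

/*
 * Hypothetical model: node[i] is the node of object i, local_node is the
 * node of the freeing CPU, flush_tail selects the patched behaviour.
 */
struct model_stats model_bulk_free(const int *node, size_t size,
				   int local_node, bool flush_tail)
{
	struct model_stats st = { 0, 0, 0 };
	size_t remote_nr = 0;
	size_t i;

	assert(node != NULL || size == 0);

	for (i = 0; i < size; i++) {
		if (node[i] == local_node) {
			st.to_sheaf++;
			continue;
		}
		/* Condition 1: the array is flushed once it becomes full. */
		if (++remote_nr >= MODEL_BATCH_MAX) {
			st.flushed += remote_nr;
			remote_nr = 0;
		}
	}

	/* Behaviour added by the patch: flush a partially filled array. */
	if (flush_tail && remote_nr) {
		st.flushed += remote_nr;
		remote_nr = 0;
	}

	st.leaked = remote_nr;
	return st;
}
```

With node ids {0, 1, 0, 1, 1, 0} and local_node 0, the array never fills up,
so the model without the tail flush reports three leaked objects; with the
tail flush all three are flushed and none are leaked.
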
> 
> I'll put this on my test fleet this evening.  Thank you for the quick
> fix! :)

Thanks for testing, Darrick!

-- 
Cheers,
Harry / Hyeonggon



* Re: [PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk()
  2025-11-13  0:43   ` Harry Yoo
@ 2025-11-13 17:01     ` Darrick J. Wong
  2025-11-13 17:02     ` Tytus Rogalewski
  1 sibling, 0 replies; 12+ messages in thread
From: Darrick J. Wong @ 2025-11-13 17:01 UTC (permalink / raw)
  To: Harry Yoo
  Cc: Andrew Morton, Vlastimil Babka, Liam R . Howlett,
	Tytus Rogalewski, Christoph Lameter, David Rientjes,
	Roman Gushchin, linux-mm

On Thu, Nov 13, 2025 at 09:43:20AM +0900, Harry Yoo wrote:
> On Wed, Nov 12, 2025 at 10:46:45AM -0800, Darrick J. Wong wrote:
> > On Tue, Nov 11, 2025 at 09:53:31PM +0900, Harry Yoo wrote:
> > > The commit 989b09b73978 ("slab: skip percpu sheaves for remote object
> > > freeing") introduced the remote_objects array in free_to_pcs_bulk() to
> > > skip sheaves when objects from a remote node are freed.
> > > 
> > > However, the array is flushed only when:
> > >   1) the array becomes full (++remote_nr >= PCS_BATCH_MAX), or
> > >   2) slab_free_hook() returns false and size becomes zero.
> > > 
> > > When neither of the conditions is met, objects in the array are leaked.
> > > This resulted in a memory leak [1], where 82 GiB of memory was allocated
> > > for the maple_node cache.
> > > 
> > > Flush the array after successfully freeing objects to sheaves
> > > in the do_free: path.
> > > 
> > > In the meantime, move the snippet if (!size) goto flush_remote; outside
> > > the while loop for readability. Let's say all objects in the array are
> > > from a remote node: then we acquire s->cpu_sheaves->lock and try to free
> > > an object even when size is zero. This doesn't appear to be harmful,
> > > but isn't really readable.
> > 
> > I'll put this on my test fleet this evening.  Thank you for the quick
> > fix! :)
> 
> Thanks for testing, Darrick!

Nothing OOMed overnight, so I think this patch is ok.

Tested-by: "Darrick J. Wong" <djwong@kernel.org>

--D


> -- 
> Cheers,
> Harry / Hyeonggon



* Re: [PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk()
  2025-11-13  0:43   ` Harry Yoo
  2025-11-13 17:01     ` Darrick J. Wong
@ 2025-11-13 17:02     ` Tytus Rogalewski
  2025-11-13 17:10       ` Vlastimil Babka
  1 sibling, 1 reply; 12+ messages in thread
From: Tytus Rogalewski @ 2025-11-13 17:02 UTC (permalink / raw)
  To: Harry Yoo
  Cc: Darrick J. Wong, Andrew Morton, Vlastimil Babka,
	Liam R . Howlett, Christoph Lameter, David Rientjes,
	Roman Gushchin, linux-mm


Hi,
OK, it seems that this indeed fixed the issue, and I do not see the leak now.
Thanks for fixing it :)
By the way, if you need a VPS with GPU resources, I can give you a free ride on
SimplePod.ai for helping to fix this issue :)




--

tel. 790 202 300

*Tytus Rogalewski*

Dolina Krzemowa 6A

83-010 Jagatowo

NIP: 9570976234


On Thu, 13 Nov 2025 at 01:43, Harry Yoo <harry.yoo@oracle.com> wrote:

> On Wed, Nov 12, 2025 at 10:46:45AM -0800, Darrick J. Wong wrote:
> > On Tue, Nov 11, 2025 at 09:53:31PM +0900, Harry Yoo wrote:
> > > The commit 989b09b73978 ("slab: skip percpu sheaves for remote object
> > > freeing") introduced the remote_objects array in free_to_pcs_bulk() to
> > > skip sheaves when objects from a remote node are freed.
> > >
> > > However, the array is flushed only when:
> > >   1) the array becomes full (++remote_nr >= PCS_BATCH_MAX), or
> > >   2) slab_free_hook() returns false and size becomes zero.
> > >
> > > When neither of the conditions is met, objects in the array are leaked.
> > > > This resulted in a memory leak [1], where 82 GiB of memory was allocated
> > > > for the maple_node cache.
> > >
> > > Flush the array after successfully freeing objects to sheaves
> > > in the do_free: path.
> > >
> > > In the meantime, move the snippet if (!size) goto flush_remote; outside
> > > the while loop for readability. Let's say all objects in the array are
> > > > from a remote node: then we acquire s->cpu_sheaves->lock and try to free
> > > > an object even when size is zero. This doesn't appear to be harmful,
> > > but isn't really readable.
> >
> > I'll put this on my test fleet this evening.  Thank you for the quick
> > fix! :)
>
> Thanks for testing, Darrick!
>
> --
> Cheers,
> Harry / Hyeonggon
>



* Re: [PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk()
  2025-11-13 17:02     ` Tytus Rogalewski
@ 2025-11-13 17:10       ` Vlastimil Babka
  0 siblings, 0 replies; 12+ messages in thread
From: Vlastimil Babka @ 2025-11-13 17:10 UTC (permalink / raw)
  To: Tytus Rogalewski, Harry Yoo
  Cc: Darrick J. Wong, Andrew Morton, Liam R . Howlett,
	Christoph Lameter, David Rientjes, Roman Gushchin, linux-mm

On 11/13/25 18:02, Tytus Rogalewski wrote:
> Hi,
> OK, it seems that this indeed fixed the issue, and I do not see the leak now.
> Thanks for fixing it :)

Great, thanks a lot! Can I add this then?

Tested-by: Tytus Rogalewski <tytanick@gmail.com>




end of thread, other threads:[~2025-11-13 17:10 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
2025-11-11 12:53 [PATCH V1] mm/slub: fix memory leak in free_to_pcs_bulk() Harry Yoo
2025-11-11 13:13 ` Vlastimil Babka
2025-11-11 15:37 ` Liam R. Howlett
2025-11-11 16:48   ` Tytus Rogalewski
2025-11-11 18:26     ` Harry Yoo
2025-11-12 14:47       ` Tytus Rogalewski
2025-11-13  0:42         ` Harry Yoo
2025-11-12 18:46 ` Darrick J. Wong
2025-11-13  0:43   ` Harry Yoo
2025-11-13 17:01     ` Darrick J. Wong
2025-11-13 17:02     ` Tytus Rogalewski
2025-11-13 17:10       ` Vlastimil Babka
