linux-mm.kvack.org archive mirror
* [PATCH] mm/slab: avoid allocating slabobj_ext array from its own slab
@ 2026-01-24 10:46 Harry Yoo
  2026-01-24 10:53 ` Harry Yoo
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Harry Yoo @ 2026-01-24 10:46 UTC (permalink / raw)
  To: akpm, vbabka
  Cc: linux-mm, cl, rientjes, surenb, harry.yoo, hao.li,
	kernel test robot, stable

When allocating slabobj_ext array in alloc_slab_obj_exts(), the array
can be allocated from the same slab we're allocating the array for.
This led to obj_exts_in_slab() incorrectly returning true [1],
although the array is not allocated from wasted space of the slab.
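
As an illustration (the exact numbers depend on the configuration): on
a 64-bit kernel with both memcg and allocation profiling enabled,
sizeof(struct slabobj_ext) is 16 bytes, so an order-0 kmalloc-256 slab
holding 16 objects needs a 16 * 16 = 256 byte array -- and a 256-byte
kmalloc() is then served from kmalloc-256 itself, possibly from the
very slab the array is being set up for.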

Vlastimil Babka observed that this problem should be fixed even when
ignoring its incompatibility with obj_exts_in_slab(), because it creates
slabs that are never freed as there is always at least one allocated
object.

To avoid this, use the next kmalloc size or large kmalloc when
kmalloc_slab() returns the same cache we're allocating the array for.

In case of random kmalloc caches, there are multiple kmalloc caches for
the same size and the cache is selected based on the caller address.
Because it is fragile to ensure the same caller address is passed to
kmalloc_slab(), kmalloc_noprof(), and kmalloc_node_noprof(), fall back
to (s->object_size + 1) when the sizes are equal.
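
Schematically, the random cache selection looks like this (hypothetical
names -- bucket_of(), hash_caller(), NR_COPIES and normal_kmalloc_caches
are illustrative only, not the actual kernel symbols):

static struct kmem_cache *pick_kmalloc_cache(size_t size, unsigned long caller)
{
	unsigned int copy   = hash_caller(caller) % NR_COPIES;	/* depends on the caller   */
	unsigned int bucket = bucket_of(size);			/* e.g. 256 -> kmalloc-256 */

	return normal_kmalloc_caches[copy][bucket];
}

Here kmalloc_slab() is called with a caller address of 0, while the
later kmalloc_node_noprof() uses its own return address, so the two may
pick different copies of the same bucket.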

Note that this doesn't happen when memory allocation profiling is
disabled, as when the allocation of the array is triggered by memory
cgroup (KMALLOC_CGROUP), the array is allocated from KMALLOC_NORMAL.

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202601231457.f7b31e09-lkp@intel.com [1]
Cc: stable@vger.kernel.org
Fixes: 4b8736964640 ("mm/slab: add allocation accounting into slab allocation and free paths")
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
---
 mm/slub.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 55 insertions(+), 7 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 3ff1c475b0f1..43ddb96c4081 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2104,6 +2104,52 @@ static inline void init_slab_obj_exts(struct slab *slab)
 	slab->obj_exts = 0;
 }
 
+/*
+ * Calculate the allocation size for slabobj_ext array.
+ *
+ * When memory allocation profiling is enabled, the obj_exts array
+ * could be allocated from the same slab cache it's being allocated for.
+ * This would prevent the slab from ever being freed because it would
+ * always contain at least one allocated object (its own obj_exts array).
+ *
+ * To avoid this, increase the allocation size when we detect the array
+ * would come from the same cache, forcing it to use a different cache.
+ */
+static inline size_t obj_exts_alloc_size(struct kmem_cache *s,
+					 struct slab *slab, gfp_t gfp)
+{
+	size_t sz = sizeof(struct slabobj_ext) * slab->objects;
+	struct kmem_cache *obj_exts_cache;
+
+	/*
+	 * slabobj_ext arrays for KMALLOC_CGROUP allocations
+	 * are served from KMALLOC_NORMAL caches.
+	 */
+	if (!mem_alloc_profiling_enabled())
+		return sz;
+
+	if (sz > KMALLOC_MAX_CACHE_SIZE)
+		return sz;
+
+	obj_exts_cache = kmalloc_slab(sz, NULL, gfp, 0);
+	if (s == obj_exts_cache)
+		return obj_exts_cache->object_size + 1;
+
+	/*
+	 * Random kmalloc caches have multiple caches per size, and the cache
+	 * is selected by the caller address. Since caller address may differ
+	 * between kmalloc_slab() and actual allocation, bump size when both
+	 * are normal kmalloc caches of same size.
+	 */
+	if (IS_ENABLED(CONFIG_RANDOM_KMALLOC_CACHES) &&
+			is_kmalloc_normal(s) &&
+			is_kmalloc_normal(obj_exts_cache) &&
+			(s->object_size == obj_exts_cache->object_size))
+		return obj_exts_cache->object_size + 1;
+
+	return sz;
+}
+
 int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
 		        gfp_t gfp, bool new_slab)
 {
@@ -2112,26 +2158,26 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
 	unsigned long new_exts;
 	unsigned long old_exts;
 	struct slabobj_ext *vec;
+	size_t sz;
 
 	gfp &= ~OBJCGS_CLEAR_MASK;
 	/* Prevent recursive extension vector allocation */
 	gfp |= __GFP_NO_OBJ_EXT;
 
+	sz = obj_exts_alloc_size(s, slab, gfp);
+
 	/*
 	 * Note that allow_spin may be false during early boot and its
 	 * restricted GFP_BOOT_MASK. Due to kmalloc_nolock() only supporting
 	 * architectures with cmpxchg16b, early obj_exts will be missing for
 	 * very early allocations on those.
 	 */
-	if (unlikely(!allow_spin)) {
-		size_t sz = objects * sizeof(struct slabobj_ext);
-
+	if (unlikely(!allow_spin))
 		vec = kmalloc_nolock(sz, __GFP_ZERO | __GFP_NO_OBJ_EXT,
 				     slab_nid(slab));
-	} else {
-		vec = kcalloc_node(objects, sizeof(struct slabobj_ext), gfp,
-				   slab_nid(slab));
-	}
+	else
+		vec = kmalloc_node(sz, gfp | __GFP_ZERO, slab_nid(slab));
+
 	if (!vec) {
 		/*
 		 * Try to mark vectors which failed to allocate.
@@ -2145,6 +2191,8 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
 		return -ENOMEM;
 	}
 
+	VM_WARN_ON_ONCE(virt_to_slab(vec)->slab_cache == s);
+
 	new_exts = (unsigned long)vec;
 	if (unlikely(!allow_spin))
 		new_exts |= OBJEXTS_NOSPIN_ALLOC;
-- 
2.43.0




* Re: [PATCH] mm/slab: avoid allocating slabobj_ext array from its own slab
  2026-01-24 10:46 [PATCH] mm/slab: avoid allocating slabobj_ext array from its own slab Harry Yoo
@ 2026-01-24 10:53 ` Harry Yoo
  2026-01-26  0:51 ` Hao Li
  2026-01-26  7:36 ` Vlastimil Babka
  2 siblings, 0 replies; 10+ messages in thread
From: Harry Yoo @ 2026-01-24 10:53 UTC (permalink / raw)
  To: akpm, vbabka
  Cc: linux-mm, cl, rientjes, surenb, hao.li, kernel test robot, stable

On Sat, Jan 24, 2026 at 07:46:14PM +0900, Harry Yoo wrote:
> When allocating slabobj_ext array in alloc_slab_obj_exts(), the array
> can be allocated from the same slab we're allocating the array for.
> This led to obj_exts_in_slab() incorrectly returning true [1],
> although the array is not allocated from wasted space of the slab.
> 
> Vlastimil Babka observed that this problem should be fixed even when
> ignoring its incompatibility with obj_exts_in_slab(), because it creates
> slabs that are never freed as there is always at least one allocated
> object.
> 
> To avoid this, use the next kmalloc size or large kmalloc when
> kmalloc_slab() returns the same cache we're allocating the array for.
> 
> In case of random kmalloc caches, there are multiple kmalloc caches for
> the same size and the cache is selected based on the caller address.
> Because it is fragile to ensure the same caller address is passed to
> kmalloc_slab(), kmalloc_noprof(), and kmalloc_node_noprof(), fall back
> to (s->object_size + 1) when the sizes are equal.
> 
> Note that this doesn't happen when memory allocation profiling is
> disabled, as when the allocation of the array is triggered by memory
> cgroup (KMALLOC_CGROUP), the array is allocated from KMALLOC_NORMAL.
> 
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202601231457.f7b31e09-lkp@intel.com [1]
> Cc: stable@vger.kernel.org
> Fixes: 4b8736964640 ("mm/slab: add allocation accounting into slab allocation and free paths")
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> ---

I don't think this is urgent enough to be fixed as part of -rcX,
as it's been there since the introduction of memory allocation profiling.

Perhaps it could be the first patch of the slab/for-7.0/obj_metadata branch,
and the -stable folks will pick it up after it lands in mainline?



* Re: [PATCH] mm/slab: avoid allocating slabobj_ext array from its own slab
  2026-01-24 10:46 [PATCH] mm/slab: avoid allocating slabobj_ext array from its own slab Harry Yoo
  2026-01-24 10:53 ` Harry Yoo
@ 2026-01-26  0:51 ` Hao Li
  2026-01-26 13:00   ` Harry Yoo
  2026-01-26  7:36 ` Vlastimil Babka
  2 siblings, 1 reply; 10+ messages in thread
From: Hao Li @ 2026-01-26  0:51 UTC (permalink / raw)
  To: Harry Yoo
  Cc: akpm, vbabka, linux-mm, cl, rientjes, surenb, kernel test robot, stable

On Sat, Jan 24, 2026 at 07:46:14PM +0900, Harry Yoo wrote:
> When allocating slabobj_ext array in alloc_slab_obj_exts(), the array
> can be allocated from the same slab we're allocating the array for.
> This led to obj_exts_in_slab() incorrectly returning true [1],
> although the array is not allocated from wasted space of the slab.

This is indeed a tricky issue to uncover.

> 
> Vlastimil Babka observed that this problem should be fixed even when
> ignoring its incompatibility with obj_exts_in_slab(), because it creates
> slabs that are never freed as there is always at least one allocated
> object.
> 
> To avoid this, use the next kmalloc size or large kmalloc when
> kmalloc_slab() returns the same cache we're allocating the array for.

Nice approach.

> 
> In case of random kmalloc caches, there are multiple kmalloc caches for
> the same size and the cache is selected based on the caller address.
> Because it is fragile to ensure the same caller address is passed to
> kmalloc_slab(), kmalloc_noprof(), and kmalloc_node_noprof(), fall back
> to (s->object_size + 1) when the sizes are equal.

Good catch on this corner case!

> 
> Note that this doesn't happen when memory allocation profiling is
> disabled, as when the allocation of the array is triggered by memory
> cgroup (KMALLOC_CGROUP), the array is allocated from KMALLOC_NORMAL.
> 
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202601231457.f7b31e09-lkp@intel.com [1]
> Cc: stable@vger.kernel.org
> Fixes: 4b8736964640 ("mm/slab: add allocation accounting into slab allocation and free paths")
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>

Looks good to me!
Reviewed-by: Hao Li <hao.li@linux.dev>

-- 
Thanks,
Hao

> ---
>  mm/slub.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 55 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 3ff1c475b0f1..43ddb96c4081 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2104,6 +2104,52 @@ static inline void init_slab_obj_exts(struct slab *slab)
>  	slab->obj_exts = 0;
>  }
>  
> +/*
> + * Calculate the allocation size for slabobj_ext array.
> + *
> + * When memory allocation profiling is enabled, the obj_exts array
> + * could be allocated from the same slab cache it's being allocated for.
> + * This would prevent the slab from ever being freed because it would
> + * always contain at least one allocated object (its own obj_exts array).
> + *
> + * To avoid this, increase the allocation size when we detect the array
> + * would come from the same cache, forcing it to use a different cache.
> + */
> +static inline size_t obj_exts_alloc_size(struct kmem_cache *s,
> +					 struct slab *slab, gfp_t gfp)
> +{
> +	size_t sz = sizeof(struct slabobj_ext) * slab->objects;
> +	struct kmem_cache *obj_exts_cache;
> +
> +	/*
> +	 * slabobj_ext array for KMALLOC_CGROUP allocations
> +	 * are served from KMALLOC_NORMAL caches.
> +	 */
> +	if (!mem_alloc_profiling_enabled())
> +		return sz;
> +
> +	if (sz > KMALLOC_MAX_CACHE_SIZE)
> +		return sz;
> +
> +	obj_exts_cache = kmalloc_slab(sz, NULL, gfp, 0);
> +	if (s == obj_exts_cache)
> +		return obj_exts_cache->object_size + 1;
> +
> +	/*
> +	 * Random kmalloc caches have multiple caches per size, and the cache
> +	 * is selected by the caller address. Since caller address may differ
> +	 * between kmalloc_slab() and actual allocation, bump size when both
> +	 * are normal kmalloc caches of same size.
> +	 */
> +	if (IS_ENABLED(CONFIG_RANDOM_KMALLOC_CACHES) &&
> +			is_kmalloc_normal(s) &&
> +			is_kmalloc_normal(obj_exts_cache) &&
> +			(s->object_size == obj_exts_cache->object_size))
> +		return obj_exts_cache->object_size + 1;
> +
> +	return sz;
> +}
> +
>  int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>  		        gfp_t gfp, bool new_slab)
>  {
> @@ -2112,26 +2158,26 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>  	unsigned long new_exts;
>  	unsigned long old_exts;
>  	struct slabobj_ext *vec;
> +	size_t sz;
>  
>  	gfp &= ~OBJCGS_CLEAR_MASK;
>  	/* Prevent recursive extension vector allocation */
>  	gfp |= __GFP_NO_OBJ_EXT;
>  
> +	sz = obj_exts_alloc_size(s, slab, gfp);
> +
>  	/*
>  	 * Note that allow_spin may be false during early boot and its
>  	 * restricted GFP_BOOT_MASK. Due to kmalloc_nolock() only supporting
>  	 * architectures with cmpxchg16b, early obj_exts will be missing for
>  	 * very early allocations on those.
>  	 */
> -	if (unlikely(!allow_spin)) {
> -		size_t sz = objects * sizeof(struct slabobj_ext);
> -
> +	if (unlikely(!allow_spin))
>  		vec = kmalloc_nolock(sz, __GFP_ZERO | __GFP_NO_OBJ_EXT,
>  				     slab_nid(slab));
> -	} else {
> -		vec = kcalloc_node(objects, sizeof(struct slabobj_ext), gfp,
> -				   slab_nid(slab));
> -	}
> +	else
> +		vec = kmalloc_node(sz, gfp | __GFP_ZERO, slab_nid(slab));
> +
>  	if (!vec) {
>  		/*
>  		 * Try to mark vectors which failed to allocate.
> @@ -2145,6 +2191,8 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>  		return -ENOMEM;
>  	}
>  
> +	VM_WARN_ON_ONCE(virt_to_slab(vec)->slab_cache == s);
> +
>  	new_exts = (unsigned long)vec;
>  	if (unlikely(!allow_spin))
>  		new_exts |= OBJEXTS_NOSPIN_ALLOC;
> -- 
> 2.43.0
> 



* Re: [PATCH] mm/slab: avoid allocating slabobj_ext array from its own slab
  2026-01-24 10:46 [PATCH] mm/slab: avoid allocating slabobj_ext array from its own slab Harry Yoo
  2026-01-24 10:53 ` Harry Yoo
  2026-01-26  0:51 ` Hao Li
@ 2026-01-26  7:36 ` Vlastimil Babka
  2026-01-26  8:30   ` Harry Yoo
  2 siblings, 1 reply; 10+ messages in thread
From: Vlastimil Babka @ 2026-01-26  7:36 UTC (permalink / raw)
  To: Harry Yoo, akpm
  Cc: linux-mm, cl, rientjes, surenb, hao.li, kernel test robot, stable

On 1/24/26 11:46, Harry Yoo wrote:
> When allocating slabobj_ext array in alloc_slab_obj_exts(), the array
> can be allocated from the same slab we're allocating the array for.
> This led to obj_exts_in_slab() incorrectly returning true [1],
> although the array is not allocated from wasted space of the slab.
> 
> Vlastimil Babka observed that this problem should be fixed even when
> ignoring its incompatibility with obj_exts_in_slab(), because it creates
> slabs that are never freed as there is always at least one allocated
> object.
> 
> To avoid this, use the next kmalloc size or large kmalloc when
> kmalloc_slab() returns the same cache we're allocating the array for.
> 
> In case of random kmalloc caches, there are multiple kmalloc caches for
> the same size and the cache is selected based on the caller address.
> Because it is fragile to ensure the same caller address is passed to
> kmalloc_slab(), kmalloc_noprof(), and kmalloc_node_noprof(), fall back
> to (s->object_size + 1) when the sizes are equal.
> 
> Note that this doesn't happen when memory allocation profiling is
> disabled, as when the allocation of the array is triggered by memory
> cgroup (KMALLOC_CGROUP), the array is allocated from KMALLOC_NORMAL.
> 
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202601231457.f7b31e09-lkp@intel.com [1]
> Cc: stable@vger.kernel.org
> Fixes: 4b8736964640 ("mm/slab: add allocation accounting into slab allocation and free paths")
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>

Thanks! Just wondering if we could simplify a bit.

> ---
>  mm/slub.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 55 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 3ff1c475b0f1..43ddb96c4081 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2104,6 +2104,52 @@ static inline void init_slab_obj_exts(struct slab *slab)
>  	slab->obj_exts = 0;
>  }
>  
> +/*
> + * Calculate the allocation size for slabobj_ext array.
> + *
> + * When memory allocation profiling is enabled, the obj_exts array
> + * could be allocated from the same slab cache it's being allocated for.
> + * This would prevent the slab from ever being freed because it would
> + * always contain at least one allocated object (its own obj_exts array).
> + *
> + * To avoid this, increase the allocation size when we detect the array
> + * would come from the same cache, forcing it to use a different cache.
> + */
> +static inline size_t obj_exts_alloc_size(struct kmem_cache *s,
> +					 struct slab *slab, gfp_t gfp)
> +{
> +	size_t sz = sizeof(struct slabobj_ext) * slab->objects;
> +	struct kmem_cache *obj_exts_cache;
> +
> +	/*
> +	 * slabobj_ext array for KMALLOC_CGROUP allocations
> +	 * are served from KMALLOC_NORMAL caches.
> +	 */
> +	if (!mem_alloc_profiling_enabled())
> +		return sz;
> +
> +	if (sz > KMALLOC_MAX_CACHE_SIZE)
> +		return sz;

Could we bail out here immediately if !is_kmalloc_normal(s)?

> +
> +	obj_exts_cache = kmalloc_slab(sz, NULL, gfp, 0);

Then do this.

> +	if (s == obj_exts_cache)
> +		return obj_exts_cache->object_size + 1;

But not this.

> +	/*
> +	 * Random kmalloc caches have multiple caches per size, and the cache
> +	 * is selected by the caller address. Since caller address may differ
> +	 * between kmalloc_slab() and actual allocation, bump size when both
> +	 * are normal kmalloc caches of same size.
> +	 */
> +	if (IS_ENABLED(CONFIG_RANDOM_KMALLOC_CACHES) &&

Instead just compare object_size unconditionally.

> +			is_kmalloc_normal(s) &&

This we already checked.

> +			is_kmalloc_normal(obj_exts_cache) &&

I think this is guaranteed thanks to "gfp &= ~OBJCGS_CLEAR_MASK;" below so
we don't need it.

> +			(s->object_size == obj_exts_cache->object_size))
> +		return obj_exts_cache->object_size + 1;
> +
> +	return sz;
> +}
> +
>  int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>  		        gfp_t gfp, bool new_slab)
>  {
> @@ -2112,26 +2158,26 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>  	unsigned long new_exts;
>  	unsigned long old_exts;
>  	struct slabobj_ext *vec;
> +	size_t sz;
>  
>  	gfp &= ~OBJCGS_CLEAR_MASK;
>  	/* Prevent recursive extension vector allocation */
>  	gfp |= __GFP_NO_OBJ_EXT;
>  
> +	sz = obj_exts_alloc_size(s, slab, gfp);
> +
>  	/*
>  	 * Note that allow_spin may be false during early boot and its
>  	 * restricted GFP_BOOT_MASK. Due to kmalloc_nolock() only supporting
>  	 * architectures with cmpxchg16b, early obj_exts will be missing for
>  	 * very early allocations on those.
>  	 */
> -	if (unlikely(!allow_spin)) {
> -		size_t sz = objects * sizeof(struct slabobj_ext);
> -
> +	if (unlikely(!allow_spin))
>  		vec = kmalloc_nolock(sz, __GFP_ZERO | __GFP_NO_OBJ_EXT,
>  				     slab_nid(slab));
> -	} else {
> -		vec = kcalloc_node(objects, sizeof(struct slabobj_ext), gfp,
> -				   slab_nid(slab));
> -	}
> +	else
> +		vec = kmalloc_node(sz, gfp | __GFP_ZERO, slab_nid(slab));
> +
>  	if (!vec) {
>  		/*
>  		 * Try to mark vectors which failed to allocate.
> @@ -2145,6 +2191,8 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
>  		return -ENOMEM;
>  	}
>  
> +	VM_WARN_ON_ONCE(virt_to_slab(vec)->slab_cache == s);
> +
>  	new_exts = (unsigned long)vec;
>  	if (unlikely(!allow_spin))
>  		new_exts |= OBJEXTS_NOSPIN_ALLOC;




* Re: [PATCH] mm/slab: avoid allocating slabobj_ext array from its own slab
  2026-01-26  7:36 ` Vlastimil Babka
@ 2026-01-26  8:30   ` Harry Yoo
  2026-01-26  8:37     ` Vlastimil Babka
  0 siblings, 1 reply; 10+ messages in thread
From: Harry Yoo @ 2026-01-26  8:30 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: akpm, linux-mm, cl, rientjes, surenb, hao.li, kernel test robot, stable

On Mon, Jan 26, 2026 at 08:36:16AM +0100, Vlastimil Babka wrote:
> On 1/24/26 11:46, Harry Yoo wrote:
> > When allocating slabobj_ext array in alloc_slab_obj_exts(), the array
> > can be allocated from the same slab we're allocating the array for.
> > This led to obj_exts_in_slab() incorrectly returning true [1],
> > although the array is not allocated from wasted space of the slab.
> > 
> > Vlastimil Babka observed that this problem should be fixed even when
> > ignoring its incompatibility with obj_exts_in_slab(), because it creates
> > slabs that are never freed as there is always at least one allocated
> > object.
> > 
> > To avoid this, use the next kmalloc size or large kmalloc when
> > kmalloc_slab() returns the same cache we're allocating the array for.
> > 
> > In case of random kmalloc caches, there are multiple kmalloc caches for
> > the same size and the cache is selected based on the caller address.
> > Because it is fragile to ensure the same caller address is passed to
> > kmalloc_slab(), kmalloc_noprof(), and kmalloc_node_noprof(), fall back
> > to (s->object_size + 1) when the sizes are equal.
> > 
> > Note that this doesn't happen when memory allocation profiling is
> > disabled, as when the allocation of the array is triggered by memory
> > cgroup (KMALLOC_CGROUP), the array is allocated from KMALLOC_NORMAL.
> > 
> > Reported-by: kernel test robot <oliver.sang@intel.com>
> > Closes: https://lore.kernel.org/oe-lkp/202601231457.f7b31e09-lkp@intel.com
> > Cc: stable@vger.kernel.org
> > Fixes: 4b8736964640 ("mm/slab: add allocation accounting into slab allocation and free paths")
> > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> 
> Thanks! Just wondering if we could simplify a bit.
> 
> > ---
> >  mm/slub.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++-------
> >  1 file changed, 55 insertions(+), 7 deletions(-)
> > 
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 3ff1c475b0f1..43ddb96c4081 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -2104,6 +2104,52 @@ static inline void init_slab_obj_exts(struct slab *slab)
> >  	slab->obj_exts = 0;
> >  }
> >  
> > +/*
> > + * Calculate the allocation size for slabobj_ext array.
> > + *
> > + * When memory allocation profiling is enabled, the obj_exts array
> > + * could be allocated from the same slab cache it's being allocated for.
> > + * This would prevent the slab from ever being freed because it would
> > + * always contain at least one allocated object (its own obj_exts array).
> > + *
> > + * To avoid this, increase the allocation size when we detect the array
> > + * would come from the same cache, forcing it to use a different cache.
> > + */
> > +static inline size_t obj_exts_alloc_size(struct kmem_cache *s,
> > +					 struct slab *slab, gfp_t gfp)
> > +{
> > +	size_t sz = sizeof(struct slabobj_ext) * slab->objects;
> > +	struct kmem_cache *obj_exts_cache;
> > +
> > +	/*
> > +	 * slabobj_ext array for KMALLOC_CGROUP allocations
> > +	 * are served from KMALLOC_NORMAL caches.
> > +	 */
> > +	if (!mem_alloc_profiling_enabled())
> > +		return sz;
> > +
> > +	if (sz > KMALLOC_MAX_CACHE_SIZE)
> > +		return sz;
> 
> Could we bail out here immediately if !is_kmalloc_normal(s)?

Yes.

> > +
> > +	obj_exts_cache = kmalloc_slab(sz, NULL, gfp, 0);
> 
> Then do this.

Yes.

> > +	if (s == obj_exts_cache)
> > +		return obj_exts_cache->object_size + 1;
> 
> But not this.

Yes.

> > +	/*
> > +	 * Random kmalloc caches have multiple caches per size, and the cache
> > +	 * is selected by the caller address. Since caller address may differ
> > +	 * between kmalloc_slab() and actual allocation, bump size when both
> > +	 * are normal kmalloc caches of same size.
> > +	 */
> > +	if (IS_ENABLED(CONFIG_RANDOM_KMALLOC_CACHES) &&
> 
> Instead just compare object_size unconditionally.

Yes. Will do.
But perhaps the comment is still worth it?

> > +			is_kmalloc_normal(s) &&
> 
> This we already checked.
> 
> > +			is_kmalloc_normal(obj_exts_cache) &&
> 
> I think this is guaranteed thanks to "gfp &= ~OBJCGS_CLEAR_MASK;" below so
> we don't need it.

Right.

Will respin soon after testing. Thanks!

The end result will be:

/*
 * Calculate the allocation size for slabobj_ext array.
 *
 * When memory allocation profiling is enabled, the obj_exts array
 * could be allocated from the same slab it's being allocated for.
 * This would prevent the slab from ever being freed because it would
 * always contain at least one allocated object (its own obj_exts array).
 *
 * To avoid this, increase the allocation size when we detect the array
 * would come from the same cache, forcing it to use a different cache.
 */
static inline size_t obj_exts_alloc_size(struct kmem_cache *s,
                                         struct slab *slab, gfp_t gfp)
{
        size_t sz = sizeof(struct slabobj_ext) * slab->objects;
        struct kmem_cache *obj_exts_cache;

        /*
         * slabobj_ext arrays for KMALLOC_CGROUP allocations
         * are served from KMALLOC_NORMAL caches.
         */
        if (!mem_alloc_profiling_enabled())
                return sz;

        if (sz > KMALLOC_MAX_CACHE_SIZE)
                return sz;

        if (!is_kmalloc_normal(s))
                return sz;

        obj_exts_cache = kmalloc_slab(sz, NULL, gfp, 0);
        /*
         * Random kmalloc caches have multiple caches per size, and the cache
         * is selected by the caller address. Since caller address may differ
         * between kmalloc_slab() and actual allocation, bump size when both
         * are normal kmalloc caches of same size.
         */
        if (s->size == obj_exts_cache->size)
                return s->object_size + 1;

        return sz;
}
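
For instance (illustrative numbers), if sz were 256 and s were a
kmalloc-256 cache, kmalloc_slab() would return a kmalloc-256 copy with
the same size, so the function would return object_size + 1 = 257 and
the array would then be served from kmalloc-512 instead.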

-- 
Cheers,
Harry / Hyeonggon



* Re: [PATCH] mm/slab: avoid allocating slabobj_ext array from its own slab
  2026-01-26  8:30   ` Harry Yoo
@ 2026-01-26  8:37     ` Vlastimil Babka
  2026-01-26  8:57       ` Harry Yoo
  0 siblings, 1 reply; 10+ messages in thread
From: Vlastimil Babka @ 2026-01-26  8:37 UTC (permalink / raw)
  To: Harry Yoo
  Cc: akpm, linux-mm, cl, rientjes, surenb, hao.li, kernel test robot, stable

On 1/26/26 09:30, Harry Yoo wrote:
> On Mon, Jan 26, 2026 at 08:36:16AM +0100, Vlastimil Babka wrote:
> /*
>  * Calculate the allocation size for slabobj_ext array.
>  *
>  * When memory allocation profiling is enabled, the obj_exts array
>  * could be allocated from the same slab it's being allocated for.
>  * This would prevent the slab from ever being freed because it would
>  * always contain at least one allocated object (its own obj_exts array).
>  *
>  * To avoid this, increase the allocation size when we detect the array
>  * would come from the same cache, forcing it to use a different cache.
>  */
> static inline size_t obj_exts_alloc_size(struct kmem_cache *s,
>                                          struct slab *slab, gfp_t gfp)
> {
>         size_t sz = sizeof(struct slabobj_ext) * slab->objects;
>         struct kmem_cache *obj_exts_cache;
> 
>         /*
>          * slabobj_ext array for KMALLOC_CGROUP allocations
>          * are served from KMALLOC_NORMAL caches.
>          */
>         if (!mem_alloc_profiling_enabled())
>                 return sz;
> 
>         if (sz > KMALLOC_MAX_CACHE_SIZE)
>                 return sz;
> 
>         if (!is_kmalloc_normal(s))
>                 return sz;
> 
>         obj_exts_cache = kmalloc_slab(sz, NULL, gfp, 0);
>         /*
>          * Random kmalloc caches have multiple caches per size, and the cache

Maybe start with something like "We can't simply compare s with
obj_exts_cache, because..."

>          * is selected by the caller address. Since caller address may differ
>          * between kmalloc_slab() and actual allocation, bump size when both
>          * are normal kmalloc caches of same size.

As we don't test the other for normal kmalloc() anymore, this now reads as
if we forgot to.

>          */
>         if (s->size == obj_exts_cache->size)
>                 return s->object_size + 1;

Why switch to size from object_size for the checks? I'd be worried that due
to debugging etc this can yield wrong results?

> 
>         return sz;
> }
> 




* Re: [PATCH] mm/slab: avoid allocating slabobj_ext array from its own slab
  2026-01-26  8:37     ` Vlastimil Babka
@ 2026-01-26  8:57       ` Harry Yoo
  2026-01-26  9:10         ` Vlastimil Babka
  0 siblings, 1 reply; 10+ messages in thread
From: Harry Yoo @ 2026-01-26  8:57 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: akpm, linux-mm, cl, rientjes, surenb, hao.li, kernel test robot, stable

On Mon, Jan 26, 2026 at 09:37:29AM +0100, Vlastimil Babka wrote:
> On 1/26/26 09:30, Harry Yoo wrote:
> > On Mon, Jan 26, 2026 at 08:36:16AM +0100, Vlastimil Babka wrote:
> > /*
> >  * Calculate the allocation size for slabobj_ext array.
> >  *
> >  * When memory allocation profiling is enabled, the obj_exts array
> >  * could be allocated from the same slab it's being allocated for.
> >  * This would prevent the slab from ever being freed because it would
> >  * always contain at least one allocated object (its own obj_exts array).
> >  *
> >  * To avoid this, increase the allocation size when we detect the array
> >  * would come from the same cache, forcing it to use a different cache.
> >  */
> > static inline size_t obj_exts_alloc_size(struct kmem_cache *s,
> >                                          struct slab *slab, gfp_t gfp)
> > {
> >         size_t sz = sizeof(struct slabobj_ext) * slab->objects;
> >         struct kmem_cache *obj_exts_cache;
> > 
> >         /*
> >          * slabobj_ext array for KMALLOC_CGROUP allocations
> >          * are served from KMALLOC_NORMAL caches.
> >          */
> >         if (!mem_alloc_profiling_enabled())
> >                 return sz;
> > 
> >         if (sz > KMALLOC_MAX_CACHE_SIZE)
> >                 return sz;
> > 
> >         if (!is_kmalloc_normal(s))
> >                 return sz;
> > 
> >         obj_exts_cache = kmalloc_slab(sz, NULL, gfp, 0);
> >         /*
> >          * Random kmalloc caches have multiple caches per size, and the cache
> 
> Maybe start with something like "We can't simply compare s with
> obj_exts_cache, because..."
> 
> >          * is selected by the caller address. Since caller address may differ
> >          * between kmalloc_slab() and actual allocation, bump size when both
> >          * are normal kmalloc caches of same size.
> 
> As we don't test the other for normal kmalloc(), anymore this now reads as
> if we forgot to.

Ok, something like this:

"We can't simply compare s with obj_exts_cache, because random kmalloc
caches have multiple caches per size, selected by caller address.
Since caller address may differ between kmalloc_slab() and actual
allocation, bump size when sizes are equal."

> >          */
> >         if (s->size == obj_exts_cache->size)
> >                 return s->object_size + 1;
> 
> Why switch to size from object_size for the checks? I'd be worried that due
> to debugging etc this can yield wrong results?

Oops, sorry. I copied that from the wrong version of the patch.
Yeah, I initially compared size and then switched to object_size for the
reason you mentioned.

> > 
> >         return sz;
> > }

-- 
Cheers,
Harry / Hyeonggon



* Re: [PATCH] mm/slab: avoid allocating slabobj_ext array from its own slab
  2026-01-26  8:57       ` Harry Yoo
@ 2026-01-26  9:10         ` Vlastimil Babka
  0 siblings, 0 replies; 10+ messages in thread
From: Vlastimil Babka @ 2026-01-26  9:10 UTC (permalink / raw)
  To: Harry Yoo
  Cc: akpm, linux-mm, cl, rientjes, surenb, hao.li, kernel test robot, stable

On 1/26/26 09:57, Harry Yoo wrote:
>> 
>> Maybe start with something like "We can't simply compare s with
>> obj_exts_cache, because..."
>> 
>> >          * is selected by the caller address. Since caller address may differ
>> >          * between kmalloc_slab() and actual allocation, bump size when both
>> >          * are normal kmalloc caches of same size.
>> 
>> As we don't test the other for normal kmalloc(), anymore this now reads as
>> if we forgot to.
> 
> Ok, something like this:
> 
> "We can't simply compare s with obj_exts_cache, because random kmalloc
> caches have multiple caches per size, selected by caller address.
> Since caller address may differ between kmalloc_slab() and actual
> allocation, bump size when sizes are equal."

LGTM!




* Re: [PATCH] mm/slab: avoid allocating slabobj_ext array from its own slab
  2026-01-26  0:51 ` Hao Li
@ 2026-01-26 13:00   ` Harry Yoo
  2026-01-26 14:31     ` Hao Li
  0 siblings, 1 reply; 10+ messages in thread
From: Harry Yoo @ 2026-01-26 13:00 UTC (permalink / raw)
  To: Hao Li
  Cc: akpm, vbabka, linux-mm, cl, rientjes, surenb, kernel test robot, stable

On Mon, Jan 26, 2026 at 08:51:10AM +0800, Hao Li wrote:
> On Sat, Jan 24, 2026 at 07:46:14PM +0900, Harry Yoo wrote:
> > When allocating slabobj_ext array in alloc_slab_obj_exts(), the array
> > can be allocated from the same slab we're allocating the array for.
> > This led to obj_exts_in_slab() incorrectly returning true [1],
> > although the array is not allocated from wasted space of the slab.
> 
> This is indeed a tricky issue to uncover.
> 
> > 
> > Vlastimil Babka observed that this problem should be fixed even when
> > ignoring its incompatibility with obj_exts_in_slab(), because it creates
> > slabs that are never freed as there is always at least one allocated
> > object.
> > 
> > To avoid this, use the next kmalloc size or large kmalloc when
> > kmalloc_slab() returns the same cache we're allocating the array for.
> 
> Nice approach.
> 
> > 
> > In case of random kmalloc caches, there are multiple kmalloc caches for
> > the same size and the cache is selected based on the caller address.
> > Because it is fragile to ensure the same caller address is passed to
> > kmalloc_slab(), kmalloc_noprof(), and kmalloc_node_noprof(), fall back
> > to (s->object_size + 1) when the sizes are equal.
> 
> Good catch on this corner case!
> 
> > 
> > Note that this doesn't happen when memory allocation profiling is
> > disabled, as when the allocation of the array is triggered by memory
> > cgroup (KMALLOC_CGROUP), the array is allocated from KMALLOC_NORMAL.
> > 
> > Reported-by: kernel test robot <oliver.sang@intel.com>
> > Closes: https://lore.kernel.org/oe-lkp/202601231457.f7b31e09-lkp@intel.com
> > Cc: stable@vger.kernel.org
> > Fixes: 4b8736964640 ("mm/slab: add allocation accounting into slab allocation and free paths")
> > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> 
> Looks good to me!
> Reviewed-by: Hao Li <hao.li@linux.dev>

Hi Hao, thanks a lot for reviewing!

I was tempted to add your R-b tag, but since the implementation has
changed a bit, could you please provide R-b again if V2 [1] still looks
good to you?

[1] https://lore.kernel.org/linux-mm/20260126125714.88008-1-harry.yoo@oracle.com

-- 
Cheers,
Harry / Hyeonggon



* Re: [PATCH] mm/slab: avoid allocating slabobj_ext array from its own slab
  2026-01-26 13:00   ` Harry Yoo
@ 2026-01-26 14:31     ` Hao Li
  0 siblings, 0 replies; 10+ messages in thread
From: Hao Li @ 2026-01-26 14:31 UTC (permalink / raw)
  To: Harry Yoo
  Cc: akpm, vbabka, linux-mm, cl, rientjes, surenb, kernel test robot, stable

On Mon, Jan 26, 2026 at 10:00:53PM +0900, Harry Yoo wrote:
> On Mon, Jan 26, 2026 at 08:51:10AM +0800, Hao Li wrote:
> > On Sat, Jan 24, 2026 at 07:46:14PM +0900, Harry Yoo wrote:
> > > When allocating slabobj_ext array in alloc_slab_obj_exts(), the array
> > > can be allocated from the same slab we're allocating the array for.
> > > This led to obj_exts_in_slab() incorrectly returning true [1],
> > > although the array is not allocated from wasted space of the slab.
> > 
> > This is indeed a tricky issue to uncover.
> > 
> > > 
> > > Vlastimil Babka observed that this problem should be fixed even when
> > > ignoring its incompatibility with obj_exts_in_slab(), because it creates
> > > slabs that are never freed as there is always at least one allocated
> > > object.
> > > 
> > > To avoid this, use the next kmalloc size or large kmalloc when
> > > kmalloc_slab() returns the same cache we're allocating the array for.
> > 
> > Nice approach.
> > 
> > > 
> > > In case of random kmalloc caches, there are multiple kmalloc caches for
> > > the same size and the cache is selected based on the caller address.
> > > Because it is fragile to ensure the same caller address is passed to
> > > kmalloc_slab(), kmalloc_noprof(), and kmalloc_node_noprof(), fall back
> > > to (s->object_size + 1) when the sizes are equal.
> > 
> > Good catch on this corner case!
> > 
> > > 
> > > Note that this doesn't happen when memory allocation profiling is
> > > disabled, as when the allocation of the array is triggered by memory
> > > cgroup (KMALLOC_CGROUP), the array is allocated from KMALLOC_NORMAL.
> > > 
> > > Reported-by: kernel test robot <oliver.sang@intel.com>
> > > Closes: https://lore.kernel.org/oe-lkp/202601231457.f7b31e09-lkp@intel.com
> > > Cc: stable@vger.kernel.org
> > > Fixes: 4b8736964640 ("mm/slab: add allocation accounting into slab allocation and free paths")
> > > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> > 
> > Looks good to me!
> > Reviewed-by: Hao Li <hao.li@linux.dev>
> 
> Hi Hao, thanks a lot for reviewing!
> 
> I was tempted to add your R-b tag, but since the implementation has
> changed a bit, 

Hi Harry,

Thanks for letting me know!

> could you please provide R-b again if V2 [1] still looks
> good to you?
> 
> [1] https://lore.kernel.org/linux-mm/20260126125714.88008-1-harry.yoo@oracle.com

Sure - I've reviewed v2 and it still looks good to me.
I'll send my Reviewed-by on the v2 thread.
Thanks!

> 
> -- 
> Cheers,
> Harry / Hyeonggon


