linux-mm.kvack.org archive mirror
From: Vlastimil Babka <vbabka@suse.com>
To: "D, Suneeth" <Suneeth.D@amd.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Harry Yoo <harry.yoo@oracle.com>,
	Petr Tesarik <ptesarik@suse.com>,
	Christoph Lameter <cl@gentwo.org>,
	David Rientjes <rientjes@google.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Hao Li <hao.li@linux.dev>
Cc: Hao Li <hao.li@linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	Uladzislau Rezki <urezki@gmail.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Alexei Starovoitov <ast@kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-rt-devel@lists.linux.dev, bpf@vger.kernel.org,
	kasan-dev@googlegroups.com
Subject: Re: [PATCH v4 08/22] slab: make percpu sheaves compatible with kmalloc_nolock()/kfree_nolock()
Date: Mon, 2 Mar 2026 13:16:28 +0100
Message-ID: <9b0ae03c-8e93-422d-835c-3d4148a7550f@suse.com>
In-Reply-To: <df5a0dfd-01b7-48a9-8936-4d5e271e68e6@amd.com>

On 3/2/26 12:56, D, Suneeth wrote:
> Hi Vlastimil Babka,

Hi Suneeth!

> On 1/23/2026 12:22 PM, Vlastimil Babka wrote:
>> Before we enable percpu sheaves for kmalloc caches, we need to make sure
>> kmalloc_nolock() and kfree_nolock() will continue working properly and
>> not spin when not allowed to.
>> 
>> Percpu sheaves themselves use local_trylock() so they are already
>> compatible. We just need to be careful with the barn->lock spin_lock.
>> Pass a new allow_spin parameter where necessary to use
>> spin_trylock_irqsave().
>> 
>> In kmalloc_nolock_noprof() we can now attempt alloc_from_pcs() safely,
>> for now it will always fail until we enable sheaves for kmalloc caches
>> next. Similarly in kfree_nolock() we can attempt free_to_pcs().
>> 
> 
> We run the will-it-scale micro-benchmark as part of our weekly CI for kernel
> performance regression testing between a stable and an rc kernel. We

Great!

> observed that the will-it-scale-thread-page_fault3 variant regressed by
> 9-11% on AMD platforms (Turin and Bergamo) between kernels v6.19 and
> v7.0-rc1. Bisecting further led me to commit
> f1427a1d64156bb88d84f364855c364af6f67a3b (slab: make percpu sheaves
> compatible with kmalloc_nolock()/kfree_nolock()) as the first bad
> commit. The machine configurations and test parameters used were:
> 
> Model name:           AMD EPYC 128-Core Processor [Bergamo]
> Thread(s) per core:   2
> Core(s) per socket:   64
> Socket(s):            1
> Total online memory:  256G
> 
> Model name:           AMD EPYC 64-Core Processor [Turin]
> Thread(s) per core:   2
> Core(s) per socket:   64
> Socket(s):            2
> Total online memory:  258G
> 
> Test params:
> ------------
>       nr_task: [1 8 64 128 192 256]
>       mode: thread
>       test: page_fault3
>       kpi: per_thread_ops
>       cpufreq_governor: performance
> 
> The following are the stats after bisection:-
> (the KPI used here is per_thread_ops)
> 
> kernel_versions                                        per_thread_ops
> ---------------                                        --------------
> v6.19.0 (baseline)                                     2410188
> v7.0-rc1                                               2151474
> v6.19-rc5-f1427a1d6415                                 2263974
> v6.19-rc5-f3421f8d154c (one commit before culprit)     2323263

I suspect the bisection gave a wrong result here due to noise. Commit
f1427a1d6415 should not affect anything in this benchmark. The values for
the commit and its parent are rather close to each other, and both sit in
the middle of the range between the v6.19.0 and v7.0-rc1 numbers.

What I suspect instead is something we noticed recently: v7.0-rc1 enables
sheaves for all caches, but also removes cpu (partial) slabs. In v6.19 only
two caches (vma and maple nodes) have sheaves, but they still have cpu
(partial) slabs behind them, effectively caching many more objects than
either mechanism alone. will-it-scale-thread-page_fault3 is very sensitive
to vma and maple node allocation performance, so it notices this.

So unfortunately we now see it as a regression between v6.19 and v7.0, but it
should just be offsetting an improvement in 6.18, when sheaves were
introduced for vma and maple nodes with this unintended ~double caching.
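
As a very rough illustration of why the amount of per-cpu caching matters
for a bursty workload, here is a toy userspace model. It is not SLUB code:
the struct, the function names and the capacities are all hypothetical, and
a "cache" is just a counter with a limit. It only counts how often an
alloc or free misses the cache when the same burst is run against a smaller
and a larger capacity:

```c
#include <assert.h>

/*
 * Toy model only: a per-CPU cache is just a counter with a capacity.
 * None of the names or numbers correspond to real SLUB parameters.
 */
struct toy_cache {
	unsigned int cached;	/* objects currently held */
	unsigned int capacity;	/* most objects it can hold */
	unsigned long slow;	/* operations that missed the cache */
};

void toy_alloc(struct toy_cache *c)
{
	if (c->cached > 0)
		c->cached--;
	else
		c->slow++;	/* empty: go to the shared slow path */
}

void toy_free(struct toy_cache *c)
{
	if (c->cached < c->capacity)
		c->cached++;
	else
		c->slow++;	/* full: object goes back the slow way */
}

/* Allocate 'burst' objects, free them all, repeat 'rounds' times. */
unsigned long toy_run(unsigned int capacity, unsigned int burst,
		      unsigned int rounds)
{
	struct toy_cache c = { .cached = 0, .capacity = capacity, .slow = 0 };
	unsigned int r, i;

	assert(capacity > 0);
	for (r = 0; r < rounds; r++) {
		for (i = 0; i < burst; i++)
			toy_alloc(&c);
		for (i = 0; i < burst; i++)
			toy_free(&c);
	}
	return c.slow;
}
```

With a burst of 48 objects over 100 rounds, toy_run(32, 48, 100) counts 3232
slow-path operations, while toy_run(64, 48, 100) counts only the initial 48.
The real interaction between sheaves and cpu (partial) slabs is of course
more complicated than a single counter, so this only shows the direction of
the effect.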

> Recreation steps:
> -----------------
> 1) git clone https://github.com/antonblanchard/will-it-scale.git
> 2) git clone https://github.com/intel/lkp-tests.git
> 3) cd will-it-scale && git apply lkp-tests/programs/will-it-scale/pkg/will-it-scale.patch
> 4) make
> 5) python3 runtest.py page_fault3 25 thread 0 0 1 8 64 128 192 256
> 
> NOTE: step 5 is specific to the machine's architecture. The values starting
> from 1 are the list of task counts to run the testcase with, chosen here as
> the number of cores per CCX, per NUMA node/per socket, and nr_threads.
> 
> I also ran the micro-benchmark with ./tools/testing/perf record, and
> the following is the diff collected:
> 
> # ./perf diff perf.data.old perf.data
> Warning:
> 4 out of order events recorded.
> # Event 'cpu/cycles/P'
> #
> # Baseline  Delta Abs  Shared Object          Symbol
> # ........  .........  .....................  ...................................................
> #
>                +11.95%  [kernel.kallsyms]      [k] folio_pte_batch
>                +10.30%  [kernel.kallsyms]      [k] native_queued_spin_lock_slowpath
>                 +9.91%  [kernel.kallsyms]      [k] __block_write_begin_int
>       0.00%     +8.56%  [kernel.kallsyms]      [k] clear_page_erms
>       7.71%     -7.71%  [kernel.kallsyms]      [k] delay_halt
>                 +6.84%  [kernel.kallsyms]      [k] block_dirty_folio
>       1.58%     +4.90%  [kernel.kallsyms]      [k] unmap_page_range
>       0.00%     +4.78%  [kernel.kallsyms]      [k] folio_remove_rmap_ptes
>       3.17%     -3.17%  [kernel.kallsyms]      [k] __vmf_anon_prepare
>       0.00%     +3.09%  [kernel.kallsyms]      [k] ext4_page_mkwrite
>                 +2.32%  [kernel.kallsyms]      [k] ext4_dirty_folio
>       0.00%     +2.01%  [kernel.kallsyms]      [k] vm_normal_page
>       0.00%     +1.93%  [kernel.kallsyms]      [k] set_pte_range
>                 +1.84%  [kernel.kallsyms]      [k] block_commit_write
>                 +1.82%  [kernel.kallsyms]      [k] mod_node_page_state
>                 +1.68%  [kernel.kallsyms]      [k] lruvec_stat_mod_folio
>                 +1.56%  [kernel.kallsyms]      [k] mod_memcg_lruvec_state
>       1.40%     -1.39%  [kernel.kallsyms]      [k] mod_memcg_state
>                 +1.38%  [kernel.kallsyms]      [k] folio_add_file_rmap_ptes
>       5.01%     -0.87%  page_fault3_threads    [.] testcase
>                 +0.84%  [kernel.kallsyms]      [k] tlb_flush_rmap_batch
>                 +0.83%  [kernel.kallsyms]      [k] mark_buffer_dirty
>       1.66%     -0.75%  [kernel.kallsyms]      [k] flush_tlb_mm_range
>                 +0.72%  [kernel.kallsyms]      [k] css_rstat_updated
>       0.60%     -0.60%  [kernel.kallsyms]      [k] osq_unlock
>                 +0.57%  [kernel.kallsyms]      [k] _raw_spin_unlock
>                 +0.55%  [kernel.kallsyms]      [k] perf_iterate_ctx
>                 +0.54%  [kernel.kallsyms]      [k] __rcu_read_lock
>       0.11%     +0.53%  [kernel.kallsyms]      [k] osq_lock
>                 +0.46%  [kernel.kallsyms]      [k] finish_fault
>       0.46%     -0.46%  [kernel.kallsyms]      [k] do_wp_page
>                 +0.45%  [kernel.kallsyms]      [k] pte_val
>       1.10%     -0.41%  [kernel.kallsyms]      [k] filemap_fault
>                 +0.39%  [kernel.kallsyms]      [k] native_set_pte
>                 +0.36%  [kernel.kallsyms]      [k] rwsem_spin_on_owner
>       0.28%     -0.28%  [kernel.kallsyms]      [k] mas_topiary_replace
>                 +0.28%  [kernel.kallsyms]      [k] _raw_spin_lock_irqsave
>                 +0.27%  [kernel.kallsyms]      [k] percpu_counter_add_batch
>                 +0.27%  [kernel.kallsyms]      [k] memset
>       0.00%     +0.24%  [kernel.kallsyms]      [k] mas_walk
>       0.23%     -0.23%  [kernel.kallsyms]      [k] __pmd_alloc
>       0.23%     -0.22%  [kernel.kallsyms]      [k] rcu_core
>                 +0.21%  [kernel.kallsyms]      [k] __rcu_read_unlock
>       0.04%     +0.19%  [kernel.kallsyms]      [k] ext4_da_get_block_prep
>                 +0.19%  [kernel.kallsyms]      [k] lock_vma_under_rcu
>       0.01%     +0.19%  [kernel.kallsyms]      [k] prep_compound_page
>                 +0.18%  [kernel.kallsyms]      [k] filemap_get_entry
>                 +0.17%  [kernel.kallsyms]      [k] folio_mark_dirty
> 
> Would be happy to help with further testing and providing additional 
> data if required.
> 
> Thanks,
> Suneeth D
> 
>> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
>> Reviewed-by: Hao Li <hao.li@linux.dev>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>> ---
>>   mm/slub.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++-----------------
>>   1 file changed, 60 insertions(+), 22 deletions(-)
>> 
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 41e1bf35707c..4ca6bd944854 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -2889,7 +2889,8 @@ static void pcs_destroy(struct kmem_cache *s)
>>   	s->cpu_sheaves = NULL;
>>   }
>>   
>> -static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn)
>> +static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn,
>> +					       bool allow_spin)
>>   {
>>   	struct slab_sheaf *empty = NULL;
>>   	unsigned long flags;
>> @@ -2897,7 +2898,10 @@ static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn)
>>   	if (!data_race(barn->nr_empty))
>>   		return NULL;
>>   
>> -	spin_lock_irqsave(&barn->lock, flags);
>> +	if (likely(allow_spin))
>> +		spin_lock_irqsave(&barn->lock, flags);
>> +	else if (!spin_trylock_irqsave(&barn->lock, flags))
>> +		return NULL;
>>   
>>   	if (likely(barn->nr_empty)) {
>>   		empty = list_first_entry(&barn->sheaves_empty,
>> @@ -2974,7 +2978,8 @@ static struct slab_sheaf *barn_get_full_or_empty_sheaf(struct node_barn *barn)
>>    * change.
>>    */
>>   static struct slab_sheaf *
>> -barn_replace_empty_sheaf(struct node_barn *barn, struct slab_sheaf *empty)
>> +barn_replace_empty_sheaf(struct node_barn *barn, struct slab_sheaf *empty,
>> +			 bool allow_spin)
>>   {
>>   	struct slab_sheaf *full = NULL;
>>   	unsigned long flags;
>> @@ -2982,7 +2987,10 @@ barn_replace_empty_sheaf(struct node_barn *barn, struct slab_sheaf *empty)
>>   	if (!data_race(barn->nr_full))
>>   		return NULL;
>>   
>> -	spin_lock_irqsave(&barn->lock, flags);
>> +	if (likely(allow_spin))
>> +		spin_lock_irqsave(&barn->lock, flags);
>> +	else if (!spin_trylock_irqsave(&barn->lock, flags))
>> +		return NULL;
>>   
>>   	if (likely(barn->nr_full)) {
>>   		full = list_first_entry(&barn->sheaves_full, struct slab_sheaf,
>> @@ -3003,7 +3011,8 @@ barn_replace_empty_sheaf(struct node_barn *barn, struct slab_sheaf *empty)
>>    * barn. But if there are too many full sheaves, reject this with -E2BIG.
>>    */
>>   static struct slab_sheaf *
>> -barn_replace_full_sheaf(struct node_barn *barn, struct slab_sheaf *full)
>> +barn_replace_full_sheaf(struct node_barn *barn, struct slab_sheaf *full,
>> +			bool allow_spin)
>>   {
>>   	struct slab_sheaf *empty;
>>   	unsigned long flags;
>> @@ -3014,7 +3023,10 @@ barn_replace_full_sheaf(struct node_barn *barn, struct slab_sheaf *full)
>>   	if (!data_race(barn->nr_empty))
>>   		return ERR_PTR(-ENOMEM);
>>   
>> -	spin_lock_irqsave(&barn->lock, flags);
>> +	if (likely(allow_spin))
>> +		spin_lock_irqsave(&barn->lock, flags);
>> +	else if (!spin_trylock_irqsave(&barn->lock, flags))
>> +		return ERR_PTR(-EBUSY);
>>   
>>   	if (likely(barn->nr_empty)) {
>>   		empty = list_first_entry(&barn->sheaves_empty, struct slab_sheaf,
>> @@ -5008,7 +5020,8 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
>>   		return NULL;
>>   	}
>>   
>> -	full = barn_replace_empty_sheaf(barn, pcs->main);
>> +	full = barn_replace_empty_sheaf(barn, pcs->main,
>> +					gfpflags_allow_spinning(gfp));
>>   
>>   	if (full) {
>>   		stat(s, BARN_GET);
>> @@ -5025,7 +5038,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
>>   			empty = pcs->spare;
>>   			pcs->spare = NULL;
>>   		} else {
>> -			empty = barn_get_empty_sheaf(barn);
>> +			empty = barn_get_empty_sheaf(barn, true);
>>   		}
>>   	}
>>   
>> @@ -5165,7 +5178,8 @@ void *alloc_from_pcs(struct kmem_cache *s, gfp_t gfp, int node)
>>   }
>>   
>>   static __fastpath_inline
>> -unsigned int alloc_from_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
>> +unsigned int alloc_from_pcs_bulk(struct kmem_cache *s, gfp_t gfp, size_t size,
>> +				 void **p)
>>   {
>>   	struct slub_percpu_sheaves *pcs;
>>   	struct slab_sheaf *main;
>> @@ -5199,7 +5213,8 @@ unsigned int alloc_from_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
>>   			return allocated;
>>   		}
>>   
>> -		full = barn_replace_empty_sheaf(barn, pcs->main);
>> +		full = barn_replace_empty_sheaf(barn, pcs->main,
>> +						gfpflags_allow_spinning(gfp));
>>   
>>   		if (full) {
>>   			stat(s, BARN_GET);
>> @@ -5700,7 +5715,7 @@ void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
>>   	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_NOMEMALLOC | gfp_flags;
>>   	struct kmem_cache *s;
>>   	bool can_retry = true;
>> -	void *ret = ERR_PTR(-EBUSY);
>> +	void *ret;
>>   
>>   	VM_WARN_ON_ONCE(gfp_flags & ~(__GFP_ACCOUNT | __GFP_ZERO |
>>   				      __GFP_NO_OBJ_EXT));
>> @@ -5731,6 +5746,12 @@ void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
>>   		 */
>>   		return NULL;
>>   
>> +	ret = alloc_from_pcs(s, alloc_gfp, node);
>> +	if (ret)
>> +		goto success;
>> +
>> +	ret = ERR_PTR(-EBUSY);
>> +
>>   	/*
>>   	 * Do not call slab_alloc_node(), since trylock mode isn't
>>   	 * compatible with slab_pre_alloc_hook/should_failslab and
>> @@ -5767,6 +5788,7 @@ void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
>>   		ret = NULL;
>>   	}
>>   
>> +success:
>>   	maybe_wipe_obj_freeptr(s, ret);
>>   	slab_post_alloc_hook(s, NULL, alloc_gfp, 1, &ret,
>>   			     slab_want_init_on_alloc(alloc_gfp, s), size);
>> @@ -6087,7 +6109,8 @@ static void __pcs_install_empty_sheaf(struct kmem_cache *s,
>>    * unlocked.
>>    */
>>   static struct slub_percpu_sheaves *
>> -__pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
>> +__pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
>> +			bool allow_spin)
>>   {
>>   	struct slab_sheaf *empty;
>>   	struct node_barn *barn;
>> @@ -6111,7 +6134,7 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
>>   	put_fail = false;
>>   
>>   	if (!pcs->spare) {
>> -		empty = barn_get_empty_sheaf(barn);
>> +		empty = barn_get_empty_sheaf(barn, allow_spin);
>>   		if (empty) {
>>   			pcs->spare = pcs->main;
>>   			pcs->main = empty;
>> @@ -6125,7 +6148,7 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
>>   		return pcs;
>>   	}
>>   
>> -	empty = barn_replace_full_sheaf(barn, pcs->main);
>> +	empty = barn_replace_full_sheaf(barn, pcs->main, allow_spin);
>>   
>>   	if (!IS_ERR(empty)) {
>>   		stat(s, BARN_PUT);
>> @@ -6133,7 +6156,8 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
>>   		return pcs;
>>   	}
>>   
>> -	if (PTR_ERR(empty) == -E2BIG) {
>> +	/* sheaf_flush_unused() doesn't support !allow_spin */
>> +	if (PTR_ERR(empty) == -E2BIG && allow_spin) {
>>   		/* Since we got here, spare exists and is full */
>>   		struct slab_sheaf *to_flush = pcs->spare;
>>   
>> @@ -6158,6 +6182,14 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
>>   alloc_empty:
>>   	local_unlock(&s->cpu_sheaves->lock);
>>   
>> +	/*
>> +	 * alloc_empty_sheaf() doesn't support !allow_spin and it's
>> +	 * easier to fall back to freeing directly without sheaves
>> +	 * than add the support (and to sheaf_flush_unused() above)
>> +	 */
>> +	if (!allow_spin)
>> +		return NULL;
>> +
>>   	empty = alloc_empty_sheaf(s, GFP_NOWAIT);
>>   	if (empty)
>>   		goto got_empty;
>> @@ -6200,7 +6232,7 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
>>    * The object is expected to have passed slab_free_hook() already.
>>    */
>>   static __fastpath_inline
>> -bool free_to_pcs(struct kmem_cache *s, void *object)
>> +bool free_to_pcs(struct kmem_cache *s, void *object, bool allow_spin)
>>   {
>>   	struct slub_percpu_sheaves *pcs;
>>   
>> @@ -6211,7 +6243,7 @@ bool free_to_pcs(struct kmem_cache *s, void *object)
>>   
>>   	if (unlikely(pcs->main->size == s->sheaf_capacity)) {
>>   
>> -		pcs = __pcs_replace_full_main(s, pcs);
>> +		pcs = __pcs_replace_full_main(s, pcs, allow_spin);
>>   		if (unlikely(!pcs))
>>   			return false;
>>   	}
>> @@ -6333,7 +6365,7 @@ bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj)
>>   			goto fail;
>>   		}
>>   
>> -		empty = barn_get_empty_sheaf(barn);
>> +		empty = barn_get_empty_sheaf(barn, true);
>>   
>>   		if (empty) {
>>   			pcs->rcu_free = empty;
>> @@ -6453,7 +6485,7 @@ static void free_to_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
>>   		goto no_empty;
>>   
>>   	if (!pcs->spare) {
>> -		empty = barn_get_empty_sheaf(barn);
>> +		empty = barn_get_empty_sheaf(barn, true);
>>   		if (!empty)
>>   			goto no_empty;
>>   
>> @@ -6467,7 +6499,7 @@ static void free_to_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
>>   		goto do_free;
>>   	}
>>   
>> -	empty = barn_replace_full_sheaf(barn, pcs->main);
>> +	empty = barn_replace_full_sheaf(barn, pcs->main, true);
>>   	if (IS_ERR(empty)) {
>>   		stat(s, BARN_PUT_FAIL);
>>   		goto no_empty;
>> @@ -6719,7 +6751,7 @@ void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
>>   
>>   	if (likely(!IS_ENABLED(CONFIG_NUMA) || slab_nid(slab) == numa_mem_id())
>>   	    && likely(!slab_test_pfmemalloc(slab))) {
>> -		if (likely(free_to_pcs(s, object)))
>> +		if (likely(free_to_pcs(s, object, true)))
>>   			return;
>>   	}
>>   
>> @@ -6980,6 +7012,12 @@ void kfree_nolock(const void *object)
>>   	 * since kasan quarantine takes locks and not supported from NMI.
>>   	 */
>>   	kasan_slab_free(s, x, false, false, /* skip quarantine */true);
>> +
>> +	if (likely(!IS_ENABLED(CONFIG_NUMA) || slab_nid(slab) == numa_mem_id())) {
>> +		if (likely(free_to_pcs(s, x, false)))
>> +			return;
>> +	}
>> +
>>   	do_slab_free(s, slab, x, x, 0, _RET_IP_);
>>   }
>>   EXPORT_SYMBOL_GPL(kfree_nolock);
>> @@ -7532,7 +7570,7 @@ int kmem_cache_alloc_bulk_noprof(struct kmem_cache *s, gfp_t flags, size_t size,
>>   		size--;
>>   	}
>>   
>> -	i = alloc_from_pcs_bulk(s, size, p);
>> +	i = alloc_from_pcs_bulk(s, flags, size, p);
>>   
>>   	if (i < size) {
>>   		/*
>>
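
For readers skimming the quoted patch, the locking pattern it applies to
barn->lock (spin when allowed, otherwise try once and give up) can be
sketched in userspace like this. This is only an analogy under stated
assumptions: a pthread mutex stands in for the spinlock, and struct toy_barn
and toy_barn_get() are hypothetical names, not kernel code:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the barn: a lock protecting a count of items. */
struct toy_barn {
	pthread_mutex_t lock;
	int nr_items;
};

/*
 * Take one item. With allow_spin the caller may wait for the lock;
 * without it we try exactly once and report failure instead of waiting,
 * so the caller can fall back to another path.
 * Returns true if an item was taken.
 */
bool toy_barn_get(struct toy_barn *barn, bool allow_spin)
{
	bool got = false;

	assert(barn != NULL);

	if (allow_spin)
		pthread_mutex_lock(&barn->lock);
	else if (pthread_mutex_trylock(&barn->lock) != 0)
		return false;	/* contended: do not wait */

	if (barn->nr_items > 0) {
		barn->nr_items--;
		got = true;
	}

	pthread_mutex_unlock(&barn->lock);
	return got;
}
```

A caller that must not wait passes allow_spin=false and treats a false
return the same way as an empty barn, which mirrors how the quoted patch
returns NULL from barn_get_empty_sheaf() when the trylock fails.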


