From: Hao Li <hao.li@linux.dev>
To: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
Cc: Harry Yoo <harry.yoo@oracle.com>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Lameter <cl@linux.com>,
David Rientjes <rientjes@google.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Ming Lei <ming.lei@redhat.com>
Subject: Re: [PATCH slab/for-next-fixes] mm/slab: allow sheaf refill if blocking is not allowed
Date: Wed, 4 Mar 2026 15:44:18 +0800 [thread overview]
Message-ID: <3at546the4zbun7g7aoeqrirh46iwsw3vj5ncc4fjhz26gfbb2@tsgplt5o2ybu> (raw)
In-Reply-To: <20260302095536.34062-2-vbabka@kernel.org>
On Mon, Mar 02, 2026 at 10:55:37AM +0100, Vlastimil Babka (SUSE) wrote:
> Ming Lei reported [1] a regression in the ublk null target benchmark due
> to sheaves. The profile shows that the alloc_from_pcs() fastpath fails
> and allocations fall back to ___slab_alloc(). It also shows the
> allocations happen through mempool_alloc().
>
> The strategy of mempool_alloc() is to call the underlying allocator
> (here slab) without __GFP_DIRECT_RECLAIM first. This does not play well
> with __pcs_replace_empty_main() checking for gfpflags_allow_blocking()
> to decide if it should refill an empty sheaf or fallback to the
> slowpath, so we end up falling back.
>
> We could change the mempool strategy but there might be other paths
> doing the same thing. So instead allow sheaf refill when blocking is not
> allowed, changing the condition to gfpflags_allow_spinning(). The
> original condition was unnecessarily restrictive.
>
> Note this doesn't fully resolve the regression [1], as another component
> of it is memoryless nodes, which will be addressed separately.
>
> Reported-by: Ming Lei <ming.lei@redhat.com>
> Fixes: e47c897a2949 ("slab: add sheaves to most caches")
> Link: https://lore.kernel.org/all/aZ0SbIqaIkwoW2mB@fedora/ [1]
> Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
> ---
> mm/slub.c | 21 +++++++++------------
> 1 file changed, 9 insertions(+), 12 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index b1e9f16ba435..17b200695e9b 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4567,7 +4567,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
> struct slab_sheaf *empty = NULL;
> struct slab_sheaf *full;
> struct node_barn *barn;
> - bool can_alloc;
> + bool allow_spin;
>
> lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock));
>
> @@ -4588,8 +4588,9 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
> return NULL;
> }
>
> - full = barn_replace_empty_sheaf(barn, pcs->main,
> - gfpflags_allow_spinning(gfp));
> + allow_spin = gfpflags_allow_spinning(gfp);
> +
> + full = barn_replace_empty_sheaf(barn, pcs->main, allow_spin);
>
> if (full) {
> stat(s, BARN_GET);
> @@ -4599,9 +4600,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
>
> stat(s, BARN_GET_FAIL);
>
> - can_alloc = gfpflags_allow_blocking(gfp);
> -
> - if (can_alloc) {
> + if (allow_spin) {
> if (pcs->spare) {
> empty = pcs->spare;
> pcs->spare = NULL;
> @@ -4612,7 +4611,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
>
> local_unlock(&s->cpu_sheaves->lock);
>
> - if (!can_alloc)
> + if (!allow_spin)
> return NULL;
>
> if (empty) {
> @@ -4632,11 +4631,8 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
> if (!full)
> return NULL;
>
> - /*
> - * we can reach here only when gfpflags_allow_blocking
> - * so this must not be an irq
> - */
> - local_lock(&s->cpu_sheaves->lock);
> + if (!local_trylock(&s->cpu_sheaves->lock))
> + goto barn_put;
A quick question to make sure I understand this correctly.
My understanding is that after this patch there is a new case where
allocations with __GFP_KSWAPD_RECLAIM set (e.g. GFP_ATOMIC) can also reach
this lock-reacquire path.
If we kept using local_lock() here:
1. On non-RT kernels it seems fine, since alloc_from_pcs() already does a
local_trylock(&s->cpu_sheaves->lock) check.
2. But on PREEMPT_RT, local_lock() could schedule away, which may add
latency. So the idea of using local_trylock() here is to fail fast and
return without incurring that latency - is that the intent behind this
change?
--
Thanks,
Hao
Thread overview (6+ messages):
2026-03-02 9:55 Vlastimil Babka (SUSE)
2026-03-04 3:05 ` Harry Yoo
2026-03-04 9:58 ` Vlastimil Babka
2026-03-04 10:03 ` Harry Yoo
2026-03-04 7:44 ` Hao Li [this message]
2026-03-04 10:14 ` Vlastimil Babka (SUSE)