linux-mm.kvack.org archive mirror
From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: Harry Yoo <harry.yoo@oracle.com>
Cc: Hao Li <hao.li@linux.dev>, Marcelo Tosatti <mtosatti@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@gentwo.org>,
	David Rientjes <rientjes@google.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] slab: distinguish lock and trylock for sheaf_flush_main()
Date: Thu, 26 Feb 2026 15:50:16 +0100
Message-ID: <c6e94e2a-34ac-4515-b4df-222f2c08c992@kernel.org>
In-Reply-To: <20260211-b4-sheaf-flush-v1-1-4e7f492f0055@suse.cz>

On 2/11/26 10:42, Vlastimil Babka wrote:
> sheaf_flush_main() can be called from __pcs_replace_full_main() where
> the trylock can in theory fail, and pcs_flush_all() where it's not
> expected to and it would be actually a problem if it failed and left the
> main sheaf not flushed.

Thinking about this more, I no longer think this is only a theoretical
issue. On PREEMPT_RT, I think pcs_flush_all() can preempt a task holding the
lock (on PREEMPT_RT it doesn't have to be an irq handler preempting the
holder). The trylock then fails and the main sheaf is silently left
unflushed.
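
To illustrate the failure mode in isolation, here is a minimal user-space
sketch, not kernel code. A pthread mutex stands in for the local lock, and
toy_sheaf, toy_flush_batch(), toy_try_flush(), toy_flush() and TOY_BATCH_MAX
are made-up names that only mirror the shape of the patch below. Calling the
trylock variant while the mutex is already held models a flusher that
preempted the lock holder:

```c
#include <pthread.h>
#include <stdbool.h>

#define TOY_BATCH_MAX 4	/* hypothetical stand-in for PCS_BATCH_MAX */

/* Hypothetical model of a main sheaf protected by a lock. */
struct toy_sheaf {
	pthread_mutex_t lock;	/* stand-in for s->cpu_sheaves->lock */
	unsigned int size;	/* objects still cached in the sheaf */
	unsigned int flushed;	/* objects flushed so far */
};

/*
 * Must be called with sh->lock held, returns with it released.
 * Returns how many objects remain to be flushed.
 */
static unsigned int toy_flush_batch(struct toy_sheaf *sh)
{
	unsigned int batch = sh->size < TOY_BATCH_MAX ? sh->size : TOY_BATCH_MAX;
	unsigned int remaining;

	sh->size -= batch;
	sh->flushed += batch;
	remaining = sh->size;

	pthread_mutex_unlock(&sh->lock);

	return remaining;
}

/* Blocking variant: waits for the lock, so the sheaf always ends up empty. */
static void toy_flush(struct toy_sheaf *sh)
{
	unsigned int remaining;

	do {
		pthread_mutex_lock(&sh->lock);
		remaining = toy_flush_batch(sh);
	} while (remaining);
}

/*
 * Trylock variant: returns true if at least one batch was flushed.
 * If the lock is busy it gives up and leaves the sheaf as it is.
 */
static bool toy_try_flush(struct toy_sheaf *sh)
{
	unsigned int remaining;
	bool ret = false;

	do {
		if (pthread_mutex_trylock(&sh->lock) != 0)
			return ret;

		ret = true;
		remaining = toy_flush_batch(sh);
	} while (remaining);

	return ret;
}
```

With the mutex held, toy_try_flush() returns false and sh->size is
unchanged, which is harmless only if the caller checks the return value.
A caller that must end up with an empty sheaf needs the toy_flush() shape
instead.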

The impact is probably limited though. If this failure to flush happens in
__kmem_cache_shutdown(), someone was destroying a cache while still using
it, so that was already buggy. slab_mem_going_offline_callback() is where
this could matter, although it's unlikely someone would do memory hotplug
together with PREEMPT_RT.

But maybe still worth tagging this as Fixes: 2d517aa09bbc ("slab: add opt-in
caching layer of percpu sheaves") and Cc stable and sending it as a hotfix.

> To make this explicit, split the function into sheaf_flush_main() (using
> local_lock()) and sheaf_try_flush_main() (using local_trylock()) where
> both call __sheaf_flush_main_batch() to flush a single batch of objects.
> This will allow lockdep to verify our assumptions.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  mm/slub.c | 47 +++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 37 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 18c30872d196..12912b29f5bb 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2844,19 +2844,19 @@ static void __kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p);
>   * object pointers are moved to a on-stack array under the lock. To bound the
>   * stack usage, limit each batch to PCS_BATCH_MAX.
>   *
> - * returns true if at least partially flushed
> + * Must be called with s->cpu_sheaves->lock locked, returns with the lock
> + * unlocked.
> + *
> + * Returns how many objects are remaining to be flushed
>   */
> -static bool sheaf_flush_main(struct kmem_cache *s)
> +static unsigned int __sheaf_flush_main_batch(struct kmem_cache *s)
>  {
>  	struct slub_percpu_sheaves *pcs;
>  	unsigned int batch, remaining;
>  	void *objects[PCS_BATCH_MAX];
>  	struct slab_sheaf *sheaf;
> -	bool ret = false;
>  
> -next_batch:
> -	if (!local_trylock(&s->cpu_sheaves->lock))
> -		return ret;
> +	lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock));
>  
>  	pcs = this_cpu_ptr(s->cpu_sheaves);
>  	sheaf = pcs->main;
> @@ -2874,10 +2874,37 @@ static bool sheaf_flush_main(struct kmem_cache *s)
>  
>  	stat_add(s, SHEAF_FLUSH, batch);
>  
> -	ret = true;
> +	return remaining;
> +}
>  
> -	if (remaining)
> -		goto next_batch;
> +static void sheaf_flush_main(struct kmem_cache *s)
> +{
> +	unsigned int remaining;
> +
> +	do {
> +		local_lock(&s->cpu_sheaves->lock);
> +
> +		remaining = __sheaf_flush_main_batch(s);
> +
> +	} while (remaining);
> +}
> +
> +/*
> + * Returns true if the main sheaf was at least partially flushed.
> + */
> +static bool sheaf_try_flush_main(struct kmem_cache *s)
> +{
> +	unsigned int remaining;
> +	bool ret = false;
> +
> +	do {
> +		if (!local_trylock(&s->cpu_sheaves->lock))
> +			return ret;
> +
> +		ret = true;
> +		remaining = __sheaf_flush_main_batch(s);
> +
> +	} while (remaining);
>  
>  	return ret;
>  }
> @@ -5685,7 +5712,7 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
>  	if (put_fail)
>  		 stat(s, BARN_PUT_FAIL);
>  
> -	if (!sheaf_flush_main(s))
> +	if (!sheaf_try_flush_main(s))
>  		return NULL;
>  
>  	if (!local_trylock(&s->cpu_sheaves->lock))
> 
> ---
> base-commit: 27125df9a5d3b4cfd03bce3a8ec405a368cc9aae
> change-id: 20260211-b4-sheaf-flush-2eb99a9c8bfb
> 
> Best regards,

