From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: Harry Yoo <harry.yoo@oracle.com>
Cc: Hao Li <hao.li@linux.dev>, Marcelo Tosatti <mtosatti@redhat.com>,
    Andrew Morton <akpm@linux-foundation.org>,
    Christoph Lameter <cl@gentwo.org>,
    David Rientjes <rientjes@google.com>,
    Roman Gushchin <roman.gushchin@linux.dev>,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] slab: distinguish lock and trylock for sheaf_flush_main()
Date: Thu, 26 Feb 2026 15:50:16 +0100
Message-ID: <c6e94e2a-34ac-4515-b4df-222f2c08c992@kernel.org>
In-Reply-To: <20260211-b4-sheaf-flush-v1-1-4e7f492f0055@suse.cz>

On 2/11/26 10:42, Vlastimil Babka wrote:
> sheaf_flush_main() can be called from __pcs_replace_full_main() where
> the trylock can in theory fail, and pcs_flush_all() where it's not
> expected to and it would be actually a problem if it failed and left the
> main sheaf not flushed.

Thinking about this more, I now think it's not just a theoretical issue. On
PREEMPT_RT, I believe pcs_flush_all() can preempt a task holding the lock
(on PREEMPT_RT it doesn't have to be an irq handler preempting the holder),
in which case the trylock fails and the main sheaf is silently left unflushed.

The impact is probably limited though. If this failure to flush happens in
__kmem_cache_shutdown(), it means someone was destroying a cache while still
using it, so that was already buggy. slab_mem_going_offline_callback() is
where it could matter, although it's unlikely that anyone would combine
memory hotplug with PREEMPT_RT.

Still, it may be worth tagging this with Fixes: 2d517aa09bbc ("slab: add opt-in
caching layer of percpu sheaves"), Cc'ing stable and sending it as a hotfix.

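To make the failure mode concrete, here is a minimal user-space sketch, not
kernel code. toy_lock, toy_pcs and all toy_* functions are hypothetical
stand-ins invented for illustration. Preemption of a lock holder is modelled
by simply marking the lock as held, and the flush-all caller is assumed to
ignore the result of the flush.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the per-CPU lock; this is NOT the kernel's local_lock_t. */
struct toy_lock {
	bool held;
};

/* Hypothetical model of one CPU's sheaf state for a cache. */
struct toy_pcs {
	struct toy_lock lock;
	unsigned int main_size;	/* objects still cached in the main sheaf */
};

static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Models a trylock-based flush: gives up silently if the lock is taken. */
static bool toy_flush_trylock(struct toy_pcs *pcs)
{
	if (!toy_trylock(&pcs->lock))
		return false;
	pcs->main_size = 0;
	toy_unlock(&pcs->lock);
	return true;
}

/*
 * Models a flush-all path that runs while a preempted task on the same
 * CPU still holds the lock. Returns the number of objects left behind.
 */
static unsigned int toy_flush_all_while_preempted(struct toy_pcs *pcs)
{
	pcs->lock.held = true;		/* holder was preempted mid-section */
	(void)toy_flush_trylock(pcs);	/* the failure goes unnoticed */
	pcs->lock.held = false;		/* holder resumes and unlocks later */
	return pcs->main_size;
}
```

In this model, toy_flush_all_while_preempted() on a sheaf holding 5 objects
returns 5: nothing was flushed and nothing reported it. A blocking lock in
the flush-all path would wait for the holder instead.
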
> To make this explicit, split the function into sheaf_flush_main() (using
> local_lock()) and sheaf_try_flush_main() (using local_trylock()) where
> both call __sheaf_flush_main_batch() to flush a single batch of objects.
> This will allow lockdep to verify our assumptions.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> mm/slub.c | 47 +++++++++++++++++++++++++++++++++++++----------
> 1 file changed, 37 insertions(+), 10 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 18c30872d196..12912b29f5bb 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2844,19 +2844,19 @@ static void __kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p);
> * object pointers are moved to a on-stack array under the lock. To bound the
> * stack usage, limit each batch to PCS_BATCH_MAX.
> *
> - * returns true if at least partially flushed
> + * Must be called with s->cpu_sheaves->lock locked, returns with the lock
> + * unlocked.
> + *
> + * Returns how many objects are remaining to be flushed
> */
> -static bool sheaf_flush_main(struct kmem_cache *s)
> +static unsigned int __sheaf_flush_main_batch(struct kmem_cache *s)
> {
> struct slub_percpu_sheaves *pcs;
> unsigned int batch, remaining;
> void *objects[PCS_BATCH_MAX];
> struct slab_sheaf *sheaf;
> - bool ret = false;
>
> -next_batch:
> - if (!local_trylock(&s->cpu_sheaves->lock))
> - return ret;
> + lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock));
>
> pcs = this_cpu_ptr(s->cpu_sheaves);
> sheaf = pcs->main;
> @@ -2874,10 +2874,37 @@ static bool sheaf_flush_main(struct kmem_cache *s)
>
> stat_add(s, SHEAF_FLUSH, batch);
>
> - ret = true;
> + return remaining;
> +}
>
> - if (remaining)
> - goto next_batch;
> +static void sheaf_flush_main(struct kmem_cache *s)
> +{
> + unsigned int remaining;
> +
> + do {
> + local_lock(&s->cpu_sheaves->lock);
> +
> + remaining = __sheaf_flush_main_batch(s);
> +
> + } while (remaining);
> +}
> +
> +/*
> + * Returns true if the main sheaf was at least partially flushed.
> + */
> +static bool sheaf_try_flush_main(struct kmem_cache *s)
> +{
> + unsigned int remaining;
> + bool ret = false;
> +
> + do {
> + if (!local_trylock(&s->cpu_sheaves->lock))
> + return ret;
> +
> + ret = true;
> + remaining = __sheaf_flush_main_batch(s);
> +
> + } while (remaining);
>
> return ret;
> }
> @@ -5685,7 +5712,7 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
> if (put_fail)
> stat(s, BARN_PUT_FAIL);
>
> - if (!sheaf_flush_main(s))
> + if (!sheaf_try_flush_main(s))
> return NULL;
>
> if (!local_trylock(&s->cpu_sheaves->lock))
>
> ---
> base-commit: 27125df9a5d3b4cfd03bce3a8ec405a368cc9aae
> change-id: 20260211-b4-sheaf-flush-2eb99a9c8bfb
>
> Best regards,
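
The split in the quoted patch can also be modelled in user space. The sketch
below is only an illustration under stated assumptions: toy_sheaves,
TOY_BATCH_MAX and the toy_* functions are hypothetical names, a plain bool
stands in for the local lock, and assert() plays the role of
lockdep_assert_held().

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_BATCH_MAX 4u	/* stand-in for PCS_BATCH_MAX; value is arbitrary */

/* Hypothetical model of the per-CPU state touched by the flush. */
struct toy_sheaves {
	bool locked;
	unsigned int main_size;	/* objects in the main sheaf */
	unsigned int flushed;	/* objects flushed so far */
};

/* Must be entered with the lock held; returns with it released. */
static unsigned int toy_flush_batch(struct toy_sheaves *pcs)
{
	unsigned int batch, remaining;

	assert(pcs->locked);	/* models lockdep_assert_held() */

	batch = pcs->main_size < TOY_BATCH_MAX ? pcs->main_size : TOY_BATCH_MAX;
	pcs->main_size -= batch;
	remaining = pcs->main_size;

	pcs->locked = false;	/* unlock before "freeing" the batch */
	pcs->flushed += batch;

	return remaining;
}

/* Unconditional variant: the lock is expected to be available. */
static void toy_flush_main(struct toy_sheaves *pcs)
{
	unsigned int remaining;

	do {
		assert(!pcs->locked);	/* the toy model cannot block */
		pcs->locked = true;
		remaining = toy_flush_batch(pcs);
	} while (remaining);
}

/* Trylock variant: true if at least one batch was flushed. */
static bool toy_try_flush_main(struct toy_sheaves *pcs)
{
	unsigned int remaining;
	bool ret = false;

	do {
		if (pcs->locked)
			return ret;
		pcs->locked = true;
		ret = true;
		remaining = toy_flush_batch(pcs);
	} while (remaining);

	return ret;
}
```

With 10 objects and a batch size of 4, toy_flush_main() takes three
lock/unlock rounds, while toy_try_flush_main() on a held lock returns false
and leaves all 10 objects in place, so only that caller has to handle
failure.
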