From: Vlastimil Babka
Date: Thu, 23 Oct 2025 15:52:29 +0200
Subject: [PATCH RFC 07/19] slab: make percpu sheaves compatible with
 kmalloc_nolock()/kfree_nolock()
Message-Id: <20251023-sheaves-for-all-v1-7-6ffa2c9941c0@suse.cz>
References: <20251023-sheaves-for-all-v1-0-6ffa2c9941c0@suse.cz>
In-Reply-To: <20251023-sheaves-for-all-v1-0-6ffa2c9941c0@suse.cz>
To: Andrew Morton, Christoph Lameter, David Rientjes, Roman Gushchin,
 Harry Yoo
Cc: Uladzislau Rezki, "Liam R. Howlett", Suren Baghdasaryan,
 Sebastian Andrzej Siewior, Alexei Starovoitov, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, linux-rt-devel@lists.linux.dev,
 bpf@vger.kernel.org, kasan-dev@googlegroups.com, Vlastimil Babka

Before we enable percpu sheaves for kmalloc caches, we need to make sure
kmalloc_nolock() and kfree_nolock() will continue working properly and
not spin when not allowed to.

Percpu sheaves themselves use local_trylock() so they are already
compatible. We just need to be careful with the barn->lock spin_lock.
Pass a new allow_spin parameter where necessary to use
spin_trylock_irqsave().

In kmalloc_nolock_noprof() we can now attempt alloc_from_pcs() safely,
for now it will always fail until we enable sheaves for kmalloc caches
next. Similarly in kfree_nolock() we can attempt free_to_pcs().

Signed-off-by: Vlastimil Babka
---
 mm/slub.c | 74 ++++++++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 52 insertions(+), 22 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index ecb10ed5acfe..5d0b2cf66520 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2876,7 +2876,8 @@ static void pcs_destroy(struct kmem_cache *s)
 	s->cpu_sheaves = NULL;
 }
 
-static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn)
+static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn,
+					       bool allow_spin)
 {
 	struct slab_sheaf *empty = NULL;
 	unsigned long flags;
@@ -2884,7 +2885,10 @@ static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn)
 	if (!data_race(barn->nr_empty))
 		return NULL;
 
-	spin_lock_irqsave(&barn->lock, flags);
+	if (likely(allow_spin))
+		spin_lock_irqsave(&barn->lock, flags);
+	else if (!spin_trylock_irqsave(&barn->lock, flags))
+		return NULL;
 
 	if (likely(barn->nr_empty)) {
 		empty = list_first_entry(&barn->sheaves_empty,
@@ -2961,7 +2965,8 @@ static struct slab_sheaf *barn_get_full_or_empty_sheaf(struct node_barn *barn)
  * change.
  */
 static struct slab_sheaf *
-barn_replace_empty_sheaf(struct node_barn *barn, struct slab_sheaf *empty)
+barn_replace_empty_sheaf(struct node_barn *barn, struct slab_sheaf *empty,
+			 bool allow_spin)
 {
 	struct slab_sheaf *full = NULL;
 	unsigned long flags;
@@ -2969,7 +2974,10 @@ barn_replace_empty_sheaf(struct node_barn *barn, struct slab_sheaf *empty)
 	if (!data_race(barn->nr_full))
 		return NULL;
 
-	spin_lock_irqsave(&barn->lock, flags);
+	if (likely(allow_spin))
+		spin_lock_irqsave(&barn->lock, flags);
+	else if (!spin_trylock_irqsave(&barn->lock, flags))
+		return NULL;
 
 	if (likely(barn->nr_full)) {
 		full = list_first_entry(&barn->sheaves_full, struct slab_sheaf,
@@ -2990,7 +2998,8 @@ barn_replace_empty_sheaf(struct node_barn *barn, struct slab_sheaf *empty)
  * barn. But if there are too many full sheaves, reject this with -E2BIG.
  */
 static struct slab_sheaf *
-barn_replace_full_sheaf(struct node_barn *barn, struct slab_sheaf *full)
+barn_replace_full_sheaf(struct node_barn *barn, struct slab_sheaf *full,
+			bool allow_spin)
 {
 	struct slab_sheaf *empty;
 	unsigned long flags;
@@ -3001,7 +3010,10 @@ barn_replace_full_sheaf(struct node_barn *barn, struct slab_sheaf *full)
 	if (!data_race(barn->nr_empty))
 		return ERR_PTR(-ENOMEM);
 
-	spin_lock_irqsave(&barn->lock, flags);
+	if (likely(allow_spin))
+		spin_lock_irqsave(&barn->lock, flags);
+	else if (!spin_trylock_irqsave(&barn->lock, flags))
+		return ERR_PTR(-EBUSY);
 
 	if (likely(barn->nr_empty)) {
 		empty = list_first_entry(&barn->sheaves_empty, struct slab_sheaf,
@@ -5000,7 +5012,8 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
 		return NULL;
 	}
 
-	full = barn_replace_empty_sheaf(barn, pcs->main);
+	full = barn_replace_empty_sheaf(barn, pcs->main,
+					gfpflags_allow_spinning(gfp));
 
 	if (full) {
 		stat(s, BARN_GET);
@@ -5017,7 +5030,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
 			empty = pcs->spare;
 			pcs->spare = NULL;
 		} else {
-			empty = barn_get_empty_sheaf(barn);
+			empty = barn_get_empty_sheaf(barn, true);
 		}
 	}
 
@@ -5154,7 +5167,8 @@ void *alloc_from_pcs(struct kmem_cache *s, gfp_t gfp, int node)
 }
 
 static __fastpath_inline
-unsigned int alloc_from_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
+unsigned int alloc_from_pcs_bulk(struct kmem_cache *s, gfp_t gfp, size_t size,
+				 void **p)
 {
 	struct slub_percpu_sheaves *pcs;
 	struct slab_sheaf *main;
@@ -5188,7 +5202,8 @@ unsigned int alloc_from_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
 			return allocated;
 		}
 
-		full = barn_replace_empty_sheaf(barn, pcs->main);
+		full = barn_replace_empty_sheaf(barn, pcs->main,
+						gfpflags_allow_spinning(gfp));
 
 		if (full) {
 			stat(s, BARN_GET);
@@ -5693,7 +5708,7 @@ void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
 	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_NOMEMALLOC | gfp_flags;
 	struct kmem_cache *s;
 	bool can_retry = true;
-	void *ret = ERR_PTR(-EBUSY);
+	void *ret;
 
 	VM_WARN_ON_ONCE(gfp_flags & ~(__GFP_ACCOUNT | __GFP_ZERO |
 				      __GFP_NO_OBJ_EXT));
@@ -5720,6 +5735,13 @@ void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
 		 */
 		return NULL;
 
+	ret = alloc_from_pcs(s, alloc_gfp, node);
+
+	if (ret)
+		goto success;
+
+	ret = ERR_PTR(-EBUSY);
+
 	/*
 	 * Do not call slab_alloc_node(), since trylock mode isn't
 	 * compatible with slab_pre_alloc_hook/should_failslab and
@@ -5756,6 +5778,7 @@ void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
 		ret = NULL;
 	}
 
+success:
 	maybe_wipe_obj_freeptr(s, ret);
 	slab_post_alloc_hook(s, NULL, alloc_gfp, 1, &ret,
 			     slab_want_init_on_alloc(alloc_gfp, s), size);
@@ -6047,7 +6070,8 @@ static void __pcs_install_empty_sheaf(struct kmem_cache *s,
  * unlocked.
  */
 static struct slub_percpu_sheaves *
-__pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
+__pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
+			bool allow_spin)
 {
 	struct slab_sheaf *empty;
 	struct node_barn *barn;
@@ -6071,7 +6095,7 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
 	put_fail = false;
 
 	if (!pcs->spare) {
-		empty = barn_get_empty_sheaf(barn);
+		empty = barn_get_empty_sheaf(barn, allow_spin);
 		if (empty) {
 			pcs->spare = pcs->main;
 			pcs->main = empty;
@@ -6085,7 +6109,7 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
 		return pcs;
 	}
 
-	empty = barn_replace_full_sheaf(barn, pcs->main);
+	empty = barn_replace_full_sheaf(barn, pcs->main, allow_spin);
 
 	if (!IS_ERR(empty)) {
 		stat(s, BARN_PUT);
@@ -6093,6 +6117,11 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
 		return pcs;
 	}
 
+	if (!allow_spin) {
+		local_unlock(&s->cpu_sheaves->lock);
+		return NULL;
+	}
+
 	if (PTR_ERR(empty) == -E2BIG) {
 		/* Since we got here, spare exists and is full */
 		struct slab_sheaf *to_flush = pcs->spare;
@@ -6160,7 +6189,7 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
  * The object is expected to have passed slab_free_hook() already.
  */
 static __fastpath_inline
-bool free_to_pcs(struct kmem_cache *s, void *object)
+bool free_to_pcs(struct kmem_cache *s, void *object, bool allow_spin)
 {
 	struct slub_percpu_sheaves *pcs;
 
@@ -6171,7 +6200,7 @@ bool free_to_pcs(struct kmem_cache *s, void *object)
 
 	if (unlikely(pcs->main->size == s->sheaf_capacity)) {
 
-		pcs = __pcs_replace_full_main(s, pcs);
+		pcs = __pcs_replace_full_main(s, pcs, allow_spin);
 		if (unlikely(!pcs))
 			return false;
 	}
@@ -6278,7 +6307,7 @@ bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj)
 			goto fail;
 		}
 
-		empty = barn_get_empty_sheaf(barn);
+		empty = barn_get_empty_sheaf(barn, true);
 
 		if (empty) {
 			pcs->rcu_free = empty;
@@ -6398,7 +6427,7 @@ static void free_to_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
 			goto no_empty;
 
 		if (!pcs->spare) {
-			empty = barn_get_empty_sheaf(barn);
+			empty = barn_get_empty_sheaf(barn, true);
 			if (!empty)
 				goto no_empty;
 
@@ -6412,7 +6441,7 @@ static void free_to_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
 			goto do_free;
 		}
 
-		empty = barn_replace_full_sheaf(barn, pcs->main);
+		empty = barn_replace_full_sheaf(barn, pcs->main, true);
 		if (IS_ERR(empty)) {
 			stat(s, BARN_PUT_FAIL);
 			goto no_empty;
@@ -6659,7 +6688,7 @@ void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
 
 	if (likely(!IS_ENABLED(CONFIG_NUMA) || slab_nid(slab) == numa_mem_id())
 	    && likely(!slab_test_pfmemalloc(slab))) {
-		if (likely(free_to_pcs(s, object)))
+		if (likely(free_to_pcs(s, object, true)))
 			return;
 	}
 
@@ -6922,7 +6951,8 @@ void kfree_nolock(const void *object)
 	 * since kasan quarantine takes locks and not supported from NMI.
 	 */
 	kasan_slab_free(s, x, false, false, /* skip quarantine */true);
-	do_slab_free(s, slab, x, x, 0, _RET_IP_);
+	if (!free_to_pcs(s, x, false))
+		do_slab_free(s, slab, x, x, 0, _RET_IP_);
 }
 EXPORT_SYMBOL_GPL(kfree_nolock);
 
@@ -7465,7 +7495,7 @@ int kmem_cache_alloc_bulk_noprof(struct kmem_cache *s, gfp_t flags, size_t size,
 		size--;
 	}
 
-	i = alloc_from_pcs_bulk(s, size, p);
+	i = alloc_from_pcs_bulk(s, flags, size, p);
 
 	if (i < size) {
 		/*
-- 
2.51.1
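
A note on the locking pattern, as a standalone userspace sketch rather than kernel code. The barn helpers touched by this patch share one shape: spin on barn->lock when the caller allows it, otherwise try the lock once and give up. The C11 fragment below imitates that shape with an atomic_flag standing in for barn->lock. Every toy_* name and the TOY_* result codes are hypothetical and exist only for illustration; the three-way result is an assumption chosen to show why a caller may want to tell "nothing available" apart from "lock was contended".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes: got a sheaf, none available, lock contended. */
enum toy_result { TOY_GOT, TOY_NONE, TOY_BUSY };

/* Toy stand-in for the barn: a lock plus a count of empty sheaves. */
struct toy_barn {
	atomic_flag lock;
	unsigned int nr_empty;
};

static void toy_barn_init(struct toy_barn *b, unsigned int nr_empty)
{
	assert(b != NULL);
	atomic_flag_clear(&b->lock);
	b->nr_empty = nr_empty;
}

/* Single attempt; returns true if the lock was acquired. */
static bool toy_trylock(struct toy_barn *b)
{
	return !atomic_flag_test_and_set_explicit(&b->lock,
						  memory_order_acquire);
}

/* Busy-wait until the lock is acquired. */
static void toy_lock(struct toy_barn *b)
{
	while (!toy_trylock(b))
		;
}

static void toy_unlock(struct toy_barn *b)
{
	atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/*
 * Same control flow as the patched barn helpers: spin only when
 * allow_spin is true, otherwise bail out on the first failed trylock.
 */
static enum toy_result toy_barn_get_empty(struct toy_barn *b, bool allow_spin)
{
	enum toy_result ret = TOY_NONE;

	if (allow_spin)
		toy_lock(b);
	else if (!toy_trylock(b))
		return TOY_BUSY;

	if (b->nr_empty) {
		b->nr_empty--;
		ret = TOY_GOT;
	}

	toy_unlock(b);
	return ret;
}
```

A caller in a context that must not spin would pass allow_spin as false and treat TOY_BUSY like any other failure to use the fast path, falling back to a slower route, much as kfree_nolock() falls back to do_slab_free() when free_to_pcs() returns false.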