Date: Wed, 12 Mar 2025 19:16:51 +0100
Subject: Re: [PATCH RFC v2 06/10] slab: sheaf prefilling for guaranteed allocations
To: Harry Yoo
Cc: Suren Baghdasaryan, "Liam R. Howlett", Christoph Lameter, David Rientjes, Roman Gushchin, Hyeonggon Yoo <42.hyeyoo@gmail.com>, Uladzislau Rezki, linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, maple-tree@lists.infradead.org
References: <20250214-slub-percpu-caches-v2-0-88592ee0966a@suse.cz> <20250214-slub-percpu-caches-v2-6-88592ee0966a@suse.cz>
From: Vlastimil Babka

On 2/25/25 09:00, Harry Yoo wrote:
> On Fri, Feb 14, 2025 at 05:27:42PM +0100, Vlastimil Babka wrote:
>> Add functions for efficient guaranteed allocations e.g. in a critical
>> section that cannot sleep, when the exact number of allocations is not
>> known beforehand, but an upper limit can be calculated.
>>
>> kmem_cache_prefill_sheaf() returns a sheaf containing at least given
>> number of objects.
>>
>> kmem_cache_alloc_from_sheaf() will allocate an object from the sheaf
>> and is guaranteed not to fail until depleted.
>>
>> kmem_cache_return_sheaf() is for giving the sheaf back to the slab
>> allocator after the critical section. This will also attempt to refill
>> it to cache's sheaf capacity for better efficiency of sheaves handling,
>> but it's not strictly necessary to succeed.
>>
>> kmem_cache_refill_sheaf() can be used to refill a previously obtained
>> sheaf to requested size. If the current size is sufficient, it does
>> nothing. If the requested size exceeds cache's sheaf_capacity and the
>> sheaf's current capacity, the sheaf will be replaced with a new one,
>> hence the indirect pointer parameter.
>>
>> kmem_cache_sheaf_size() can be used to query the current size.
>>
>> The implementation supports requesting sizes that exceed cache's
>> sheaf_capacity, but it is not efficient - such sheaves are allocated
>> fresh in kmem_cache_prefill_sheaf() and flushed and freed immediately by
>> kmem_cache_return_sheaf(). kmem_cache_refill_sheaf() might be especially
>> ineffective when replacing a sheaf with a new one of a larger capacity.
>> It is therefore better to size cache's sheaf_capacity accordingly.
>>
>> Signed-off-by: Vlastimil Babka
>> ---
>>  include/linux/slab.h |  16 ++++
>>  mm/slub.c            | 227 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 243 insertions(+)
>
> [... snip ... ]
>
>> @@ -4831,6 +4857,207 @@ void *kmem_cache_alloc_node_noprof(struct kmem_cache *s, gfp_t gfpflags, int nod
>>  }
>>  EXPORT_SYMBOL(kmem_cache_alloc_node_noprof);
>>
>> +
>> +/*
>> + * returns a sheaf that has at least the requested size
>> + * when prefilling is needed, do so with given gfp flags
>> + *
>> + * return NULL if sheaf allocation or prefilling failed
>> + */
>> +struct slab_sheaf *
>> +kmem_cache_prefill_sheaf(struct kmem_cache *s, gfp_t gfp, unsigned int size)
>> +{
>> +	struct slub_percpu_sheaves *pcs;
>> +	struct slab_sheaf *sheaf = NULL;
>> +
>> +	if (unlikely(size > s->sheaf_capacity)) {
>> +		sheaf = kzalloc(struct_size(sheaf, objects, size), gfp);
>> +		if (!sheaf)
>> +			return NULL;
>> +
>> +		sheaf->cache = s;
>> +		sheaf->capacity = size;
>> +
>> +		if (!__kmem_cache_alloc_bulk(s, gfp, size,
>> +					     &sheaf->objects[0])) {
>> +			kfree(sheaf);
>> +			return NULL;
>> +		}
>> +
>> +		sheaf->size = size;
>> +
>> +		return sheaf;
>> +	}
>> +
>> +	localtry_lock(&s->cpu_sheaves->lock);
>> +	pcs = this_cpu_ptr(s->cpu_sheaves);
>> +
>> +	if (pcs->spare) {
>> +		sheaf = pcs->spare;
>> +		pcs->spare = NULL;
>> +	}
>> +
>> +	if (!sheaf)
>> +		sheaf = barn_get_full_or_empty_sheaf(pcs->barn);
>
> Can this be outside localtry lock?

Strictly speaking we'd have to save the barn pointer first, otherwise cpu
hotremove could bite us, I think.
But not worth the trouble, as localtry lock is just disabling preemption and
taking the barn lock would disable irqs anyway. So we're not increasing
contention by holding the localtry lock more than strictly necessary.

>
>> +
>> +	localtry_unlock(&s->cpu_sheaves->lock);
>> +
>> +	if (!sheaf) {
>> +		sheaf = alloc_empty_sheaf(s, gfp);
>> +	}
>> +
>> +	if (sheaf && sheaf->size < size) {
>> +		if (refill_sheaf(s, sheaf, gfp)) {
>> +			sheaf_flush(s, sheaf);
>> +			free_empty_sheaf(s, sheaf);
>> +			sheaf = NULL;
>> +		}
>> +	}
>> +
>> +	if (sheaf)
>> +		sheaf->capacity = s->sheaf_capacity;
>> +
>> +	return sheaf;
>> +}
>> +
>> +/*
>> + * Use this to return a sheaf obtained by kmem_cache_prefill_sheaf()
>> + * It tries to refill the sheaf back to the cache's sheaf_capacity
>> + * to avoid handling partially full sheaves.
>> + *
>> + * If the refill fails because gfp is e.g. GFP_NOWAIT, the sheaf is
>> + * instead dissolved
>> + */
>> +void kmem_cache_return_sheaf(struct kmem_cache *s, gfp_t gfp,
>> +			     struct slab_sheaf *sheaf)
>> +{
>> +	struct slub_percpu_sheaves *pcs;
>> +	bool refill = false;
>> +	struct node_barn *barn;
>> +
>> +	if (unlikely(sheaf->capacity != s->sheaf_capacity)) {
>> +		sheaf_flush(s, sheaf);
>> +		kfree(sheaf);
>> +		return;
>> +	}
>> +
>> +	localtry_lock(&s->cpu_sheaves->lock);
>> +	pcs = this_cpu_ptr(s->cpu_sheaves);
>> +
>> +	if (!pcs->spare) {
>> +		pcs->spare = sheaf;
>> +		sheaf = NULL;
>> +	} else if (pcs->barn->nr_full >= MAX_FULL_SHEAVES) {
>
> Did you mean (pcs->barn->nr_full < MAX_FULL_SHEAVES)?

Oops yeah, fixing this can potentially improve performance.

> Otherwise looks good to me.

Thanks a lot!
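
For reference, the return-path decision with the comparison flipped as agreed in this thread could be modelled in userspace roughly as below. This is only a sketch: the toy_* types, the return_action enum and the value of MAX_FULL_SHEAVES are assumptions made for illustration, and the flush-and-free fallback for a barn that already has enough full sheaves is a guess at the part of the function that was not quoted here, not a copy of mm/slub.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed value, for illustration only; the real constant is in mm/slub.c. */
#define MAX_FULL_SHEAVES 10

/* Hypothetical outcome labels; the kernel code acts directly instead. */
enum return_action {
	RETURN_AS_SPARE,	/* sheaf becomes the per-CPU spare */
	RETURN_REFILL_TO_BARN,	/* refill the sheaf, then give it to the barn */
	RETURN_FLUSH_AND_FREE,	/* assumed fallback: barn has enough full sheaves */
};

/* Toy stand-ins for struct node_barn and struct slub_percpu_sheaves. */
struct toy_barn {
	unsigned int nr_full;
};

struct toy_pcs {
	const void *spare;
	struct toy_barn *barn;
};

/*
 * Decide what to do with a returned sheaf of regular capacity.
 * Note the "<" comparison: refilling only makes sense while the barn
 * still has room for another full sheaf.
 */
static enum return_action toy_return_decision(const struct toy_pcs *pcs)
{
	if (!pcs->spare)
		return RETURN_AS_SPARE;

	if (pcs->barn->nr_full < MAX_FULL_SHEAVES)
		return RETURN_REFILL_TO_BARN;

	return RETURN_FLUSH_AND_FREE;
}
```

With the original ">=" the refill branch would only be taken once the barn was already at its limit, which is the opposite of the intent discussed above.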