Date: Mon, 23 Oct 2023 20:44:41 +0200
Subject: Re: [RFC PATCH v2 0/6] slub: Delay freezing of CPU partial slabs
To: "Christoph Lameter (Ampere)"
Cc: chengming.zhou@linux.dev, penberg@kernel.org, rientjes@google.com,
 iamjoonsoo.kim@lge.com, akpm@linux-foundation.org, roman.gushchin@linux.dev,
 42.hyeyoo@gmail.com, willy@infradead.org, pcc@google.com, tytso@mit.edu,
 maz@kernel.org, ruansy.fnst@fujitsu.com, vishal.moola@gmail.com,
 lrh2000@pku.edu.cn, hughd@google.com, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Chengming Zhou
References: <20231021144317.3400916-1-chengming.zhou@linux.dev>
 <4134b039-fa99-70cd-3486-3d0c7632e4a3@suse.cz>
From: Vlastimil Babka

On 10/23/23 19:00, Christoph Lameter (Ampere) wrote:
> On Mon, 23 Oct 2023, Vlastimil Babka wrote:
>
>>>
>>> The slab will be delay frozen when it's picked to actively use by the
>>> CPU, it becomes full at the same time, in which case we still need to
>>> rely on "frozen" bit to avoid manipulating its list. So the slab will
>>> be frozen only when activate use and be unfrozen only when deactivate.
>>
>> Interesting solution! I wonder if we could go a bit further and remove
>> acquire_slab() completely.
>> Because AFAICS even after your changes,
>> acquire_slab() is still attempted including freezing the slab, which means
>> still doing an cmpxchg_double under the list_lock, and now also handling the
>> special case when it failed, but we at least filled percpu partial lists.
>> What if we only filled the partial list without freezing, and then froze the
>> first slab outside of the list_lock?
>>
>> Or more precisely, instead of returning the acquired "object" we would
>> return the first slab removed from partial list. I think it would simplify
>> the code a bit, and further reduce list_lock holding times.
>>
>> I'll also point out a few more details, but it's not a full detailed review
>> as the suggestion above, and another for 4/5, could mean a rather
>> significant change for v3.
>
> This is not that easy. The frozen bit indicates that list management does
> not have to be done for a slab if its processed in free. If you take a
> slab off the list without setting that bit then something else needs to
> provide the information that "frozen" provided.

Yes, that's the new slab_node_partial flag in patch 1, protected by list_lock.

> If the frozen bit changes can be handled in a different way than
> with cmpxchg then that is a good optimization.

Frozen bit stays the same, but some scenarios can now avoid it.

> For much of the frozen handling we must be holding the node list lock
> anyways in order to add/remove from the list. So we already have a lock
> that could be used to protect flag operations.

I can see the following differences between the traditional frozen bit and
the new flag:

frozen bit advantage:
- __slab_free() on an already-frozen slab can ignore list operations and
  list_lock completely

frozen bit disadvantage:
- acquire_slab() trying to do cmpxchg_double() under list_lock (see commit
  9b1ea29bc0d7)

slab_node_partial flag advantage:
- we can take slabs off the node partial list without cmpxchg_double()
- probably fewer cmpxchg_double() operations overall

slab_node_partial flag disadvantage:
- a __slab_free() that encounters a slab that's not frozen (and whose
  slab_node_partial flag is not set) might have to do more work, including
  taking the list_lock only to find out that slab_node_partial flag is false
  (but AFAICS that happens only when the slab becomes fully free by the free
  operation, thus relatively rarely).

Put together, I think we might indeed get the best of both if the frozen
flag is kept for cpu slabs, and we rely on slab_node_partial flag for
cpu partial slabs, as the series does.
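
To make the trade-off above concrete, here is a rough user-space sketch in
C11. It is not the code from the series or from mm/slub.c: every toy_* name
is made up for illustration, the frozen bit is modelled as the top bit of a
32-bit word updated by compare-and-exchange (standing in for
cmpxchg_double()), and list_lock is a trivial spinlock. It also assumes that
every free on a non-frozen slab takes the lock, which is more pessimistic
than the real __slab_free().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy user-space model, not kernel code. Every toy_* name is hypothetical. */
#define TOY_FROZEN (UINT32_C(1) << 31)	/* frozen bit shares a word with the in-use count */

struct toy_slab {
	_Atomic uint32_t counters;	/* updated only by compare-and-exchange */
	bool node_partial;		/* models slab_node_partial; needs list_lock */
	struct toy_slab *next;
};

struct toy_node {
	atomic_flag list_lock;		/* trivial spinlock standing in for list_lock */
	unsigned long lock_taken;	/* how often list_lock was acquired */
	unsigned long cmpxchg_locked;	/* freezes performed while holding list_lock */
	struct toy_slab *partial;	/* node partial list, newest first */
};

enum toy_free_path {
	TOY_FREE_NO_LOCK,		/* frozen: no list work at all */
	TOY_FREE_LOCKED_ON_LIST,	/* lock taken, slab is on the node list */
	TOY_FREE_LOCKED_OFF_LIST,	/* lock taken only to find the flag clear */
};

void toy_lock(struct toy_node *n)
{
	while (atomic_flag_test_and_set(&n->list_lock))
		;	/* spin until the lock is free */
	n->lock_taken++;
}

void toy_unlock(struct toy_node *n) { atomic_flag_clear(&n->list_lock); }

void toy_node_init(struct toy_node *n)
{
	atomic_flag_clear(&n->list_lock);
	n->lock_taken = 0;
	n->cmpxchg_locked = 0;
	n->partial = NULL;
}

void toy_slab_init(struct toy_slab *s, uint32_t inuse)
{
	atomic_init(&s->counters, inuse);
	s->node_partial = false;
	s->next = NULL;
}

void toy_add_partial(struct toy_node *n, struct toy_slab *s)
{
	toy_lock(n);
	assert(!s->node_partial);	/* must not already be on the list */
	s->next = n->partial;
	n->partial = s;
	s->node_partial = true;
	toy_unlock(n);
}

/* Set the frozen bit with a compare-and-exchange loop. */
void toy_freeze(struct toy_slab *s)
{
	uint32_t old = atomic_load(&s->counters);

	while (!atomic_compare_exchange_strong(&s->counters, &old, old | TOY_FROZEN))
		;	/* 'old' was refreshed by the failed exchange; retry */
}

/* Unlink the first slab and clear its flag; caller holds list_lock. */
static struct toy_slab *toy_pop_locked(struct toy_node *n)
{
	struct toy_slab *s = n->partial;

	if (s) {
		n->partial = s->next;
		s->next = NULL;
		s->node_partial = false;
	}
	return s;
}

/* Traditional scheme: freeze while still holding list_lock. */
struct toy_slab *toy_acquire_frozen(struct toy_node *n)
{
	struct toy_slab *s;

	toy_lock(n);
	s = toy_pop_locked(n);
	if (s) {
		toy_freeze(s);
		n->cmpxchg_locked++;
	}
	toy_unlock(n);
	return s;
}

/* Delayed scheme: only clear the flag under the lock; caller freezes later. */
struct toy_slab *toy_acquire_delayed(struct toy_node *n)
{
	struct toy_slab *s;

	toy_lock(n);
	s = toy_pop_locked(n);
	toy_unlock(n);
	return s;
}

/* Which path would a free on this slab take in this model? */
enum toy_free_path toy_classify_free(struct toy_node *n, struct toy_slab *s)
{
	enum toy_free_path path;

	if (atomic_load(&s->counters) & TOY_FROZEN)
		return TOY_FREE_NO_LOCK;
	toy_lock(n);
	path = s->node_partial ? TOY_FREE_LOCKED_ON_LIST : TOY_FREE_LOCKED_OFF_LIST;
	toy_unlock(n);
	return path;
}
```

In this model toy_acquire_frozen() corresponds to the "frozen bit
disadvantage" (an atomic update with the lock held), toy_acquire_delayed()
followed by toy_freeze() outside the lock corresponds to the flag's
advantage, and TOY_FREE_LOCKED_OFF_LIST is the extra work listed as the
flag's disadvantage.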