From: Vlastimil Babka
Date: Thu, 3 Aug 2023 16:54:52 +0200
Subject: Re: [RFC 2/2] mm/slub: prefer NUMA locality over slight memory saving on NUMA machines
To: Hyeonggon Yoo <42.hyeyoo@gmail.com>, Christoph Lameter, Pekka Enberg, Joonsoo Kim, David Rientjes, Andrew Morton
Cc: Roman Gushchin, Feng Tang, "Sang, Oliver", Jay Patel, Binder Makin, aneesh.kumar@linux.ibm.com, tsahu@linux.ibm.com, piyushs@linux.ibm.com, fengwei.yin@intel.com, ying.huang@intel.com, lkp, "oe-lkp@lists.linux.dev", linux-mm@kvack.org, linux-kernel@vger.kernel.org
Message-ID: <1f88aff2-8027-1020-71b2-6a6528f82207@suse.cz>
In-Reply-To: <20230723190906.4082646-3-42.hyeyoo@gmail.com>
References: <20230723190906.4082646-1-42.hyeyoo@gmail.com> <20230723190906.4082646-3-42.hyeyoo@gmail.com>

On 7/23/23 21:09, Hyeonggon Yoo wrote:
> By default, SLUB sets remote_node_defrag_ratio to 1000, which makes it
> (in most cases) take slabs from remote nodes first before trying allocating
> new folios on the local node from buddy.
>
> Documentation/ABI/testing/sysfs-kernel-slab says:
>> The file remote_node_defrag_ratio specifies the percentage of
>> times SLUB will attempt to refill the cpu slab with a partial
>> slab from a remote node as opposed to allocating a new slab on
>> the local node.
>> This reduces the amount of wasted memory over
>> the entire system but can be expensive.
>
> Although this made sense when it was introduced, the portion of
> per node partial lists in the overall SLUB memory usage has been decreased
> since the introduction of per cpu partial lists. Therefore, it's worth
> reevaluating its overhead on performance and memory usage.
>
> [
> XXX: Add performance data. I tried to measure its impact on
> hackbench with a 2 socket NUMA machine. but it seems hackbench is
> too synthetic to benefit from this, because the skbuff_head_cache's
> size fits into the last level cache.
>
> Probably more realistic workloads like netperf would benefit
> from this?
> ]
>
> Set remote_node_defrag_ratio to zero by default, and the new behavior is:
> 1) try refilling per CPU partial list from the local node
> 2) try allocating new slabs from the local node without reclamation
> 3) try refilling per CPU partial list from remote nodes
> 4) try allocating new slabs from the local node or remote nodes
>
> If user specified remote_node_defrag_ratio, it probabilistically tries
> 3) first and then try 2) and 4) in order, to avoid unexpected behavioral
> change from user's perspective.

It makes sense to me, but as you note, it would be great to demonstrate the
benefits, because this adds complexity, especially in the already complex
___slab_alloc(). Networking has indeed historically been a workload very
sensitive to slab performance, so it seems a good candidate.

We could also postpone this until we have tried the percpu arrays
improvements discussed at LSF/MM.
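
For reference, the ordering described in the quoted changelog can be modelled
in a few lines of userspace C. This is only a rough sketch of how I read the
description, not the patch itself: the enum, prefer_remote() and
refill_order() are hypothetical names, the sample range of 1024 is an
assumption, and the caller passes the "random" sample in so the result is
deterministic. It also assumes that when the probabilistic check fails, the
default 1) 2) 3) 4) order applies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the four steps listed in the changelog. */
enum refill_step {
	LOCAL_PARTIAL = 1,	/* 1) partial slabs on the local node */
	LOCAL_NEW_NORECLAIM,	/* 2) new slab on the local node, no reclaim */
	REMOTE_PARTIAL,		/* 3) partial slabs on remote nodes */
	ANY_NEW,		/* 4) new slab on the local or a remote node */
};

#define NR_STEPS 4

/*
 * Assumed probabilistic check: a ratio of 0 never prefers remote nodes;
 * otherwise remote nodes are preferred when a sample reduced to [0, 1023]
 * does not exceed the ratio. The sample is a parameter so that the sketch
 * stays deterministic.
 */
static bool prefer_remote(unsigned int ratio, unsigned int sample)
{
	return ratio != 0 && (sample % 1024) <= ratio;
}

/* Write the order in which the four steps would be attempted. */
static void refill_order(unsigned int ratio, unsigned int sample,
			 enum refill_step order[NR_STEPS])
{
	assert(order != NULL);

	/* The local partial list is always tried first. */
	order[0] = LOCAL_PARTIAL;

	if (prefer_remote(ratio, sample)) {
		/* User-specified ratio hit: 3) moves ahead of 2). */
		order[1] = REMOTE_PARTIAL;
		order[2] = LOCAL_NEW_NORECLAIM;
	} else {
		/* Default order: stay on the local node as long as possible. */
		order[1] = LOCAL_NEW_NORECLAIM;
		order[2] = REMOTE_PARTIAL;
	}

	/* Last resort: a new slab from any node. */
	order[3] = ANY_NEW;
}
```

With a ratio of 0, refill_order() always yields 1) 2) 3) 4), so the only
behavioral difference a user-specified ratio makes in this model is whether
step 3) is swapped ahead of step 2).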