From: "chenjun (AM)"
To: Mike Rapoport, Vlastimil Babka
CC: "linux-kernel@vger.kernel.org", "linux-mm@kvack.org", "cl@linux.com", "penberg@kernel.org", "rientjes@google.com", "iamjoonsoo.kim@lge.com", "akpm@linux-foundation.org", Hyeonggon Yoo <42.hyeyoo@gmail.com>, "xuqiang (M)", "Wangkefeng (OS Kernel Lab)", Michal Hocko, Mel Gorman
Subject: Re: [PATCH] mm/slub: Reduce memory consumption in extreme scenarios
Date: Tue, 21 Mar 2023 09:30:23 +0000
References: <20230314123403.100158-1-chenjun102@huawei.com> <0cad1ff3-8339-a3eb-fc36-c8bda1392451@suse.cz> <344c7521d72e4107b451c19b329e9864@huawei.com> <8c700468-245d-72e9-99e7-b99d4547e6d8@suse.cz> <015855b3-ced3-8d84-e21d-cc6ce112b556@suse.cz>

On 2023/3/20 17:12, Mike Rapoport wrote:
> On Mon, Mar 20, 2023 at 09:05:57AM +0100, Vlastimil Babka wrote:
>> On 3/19/23 08:22, chenjun (AM) wrote:
>>> On 2023/3/17 20:06, Vlastimil Babka wrote:
>>>> On 3/17/23 12:32, chenjun (AM) wrote:
>>>>> On 2023/3/14 22:41, Vlastimil Babka wrote:
>>>>>>>   pc.flags = gfpflags;
>>>>>>> +
>>>>>>> + /*
>>>>>>> +  * when (node != NUMA_NO_NODE) && (gfpflags & __GFP_THISNODE)
>>>>>>> +  * 1) try to get a partial slab from target node with __GFP_THISNODE.
>>>>>>> +  * 2) if 1) failed, try to allocate a new slab from target node with
>>>>>>> +  *    __GFP_THISNODE.
>>>>>>> +  * 3) if 2) failed, retry 1) and 2) without __GFP_THISNODE constraint.
>>>>>>> +  */
>>>>>>> + if (node != NUMA_NO_NODE && !(gfpflags & __GFP_THISNODE) && try_thisnode)
>>>>>>> +         pc.flags |= __GFP_THISNODE;
>>>>>>
>>>>>> Hmm I'm thinking we should also perhaps remove direct reclaim possibilities
>>>>>> from the attempt 2). In your qemu test it should make no difference, as it
>>>>>> fills everything with kernel memory that is not reclaimable. But in practice
>>>>>> the target node might be filled with user memory, and I think it's better to
>>>>>> quickly allocate on a different node than spend time in direct reclaim. So
>>>>>> the following should work I think?
>>>>>>
>>>>>> pc.flags = GFP_NOWAIT | __GFP_NOWARN |__GFP_THISNODE
>>>>>>
>>>>>
>>>>> Hmm, Should it be that:
>>>>>
>>>>> pc.flags |= GFP_NOWAIT | __GFP_NOWARN |__GFP_THISNODE
>>>>
>>>> No, we need to ignore the other reclaim-related flags that the caller
>>>> passed, or it wouldn't work as intended.
>>>> The danger is that we ignore some flag that would be necessary to pass, but
>>>> I don't think there's any?
>>>>
>>>>
>>>
>>> If we ignore __GFP_ZERO passed by kzalloc, kzalloc will not work.
>>> Could we just unmask __GFP_RECLAIMABLE | __GFP_RECLAIM?
>>>
>>> pc.flags &= ~(__GFP_RECLAIMABLE | __GFP_RECLAIM)
>>> pc.flags |= __GFP_THISNODE
>>
>> __GFP_RECLAIMABLE would be wrong, but also ignored as new_slab() does:
>> flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK)
>>
>> which would filter out __GFP_ZERO as well. That's not a problem as kzalloc()
>> will zero out the individual allocated objects, so it doesn't matter if we
>> don't zero out the whole slab page.
>>
>> But I wonder, if we're not past due time for a helper e.g.
>> gfp_opportunistic(flags) that would turn any allocation flags to a
>> GFP_NOWAIT while keeping the rest of relevant flags intact, and thus there
>> would be one canonical way to do it - I'm sure there's a number of places
>> with their own variants now?
>> With such helper we'd just add __GFP_THISNODE to the result here as that's
>> specific to this particular opportunistic allocation.
>
> I like the idea, but maybe gfp_no_reclaim() would be clearer?
>

So the helper would be:

#define gfp_no_reclaim(gfpflag) ((gfpflag) & ~__GFP_DIRECT_RECLAIM)

And here,

pc.flags = gfp_no_reclaim(gfpflags) | __GFP_THISNODE;

Did I get that right?
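
To double-check the masking, here is a minimal userspace sketch of what I
think the helper would do. It is only an illustration: the MOCK_GFP_* bit
values are made up and do not match the kernel's real GFP bit layout, and
thisnode_attempt_flags() and or_only_flags() are hypothetical names that do
not exist in the kernel. Only the composition of MOCK_GFP_KERNEL (reclaim
bits plus IO and FS) mirrors the real definition.

```c
/*
 * Userspace mock of the proposed gfp_no_reclaim() helper.
 * ASSUMPTION: the bit values below are invented for this sketch and are
 * NOT the kernel's real GFP bits. The MOCK_ prefix marks them as such.
 */
typedef unsigned int gfp_t;

#define MOCK_GFP_DIRECT_RECLAIM 0x01u  /* stands in for __GFP_DIRECT_RECLAIM */
#define MOCK_GFP_KSWAPD_RECLAIM 0x02u  /* stands in for __GFP_KSWAPD_RECLAIM */
#define MOCK_GFP_ZERO           0x04u  /* stands in for __GFP_ZERO */
#define MOCK_GFP_THISNODE       0x08u  /* stands in for __GFP_THISNODE */
#define MOCK_GFP_IO             0x10u  /* stands in for __GFP_IO */
#define MOCK_GFP_FS             0x20u  /* stands in for __GFP_FS */

#define MOCK_GFP_RECLAIM (MOCK_GFP_DIRECT_RECLAIM | MOCK_GFP_KSWAPD_RECLAIM)
#define MOCK_GFP_KERNEL  (MOCK_GFP_RECLAIM | MOCK_GFP_IO | MOCK_GFP_FS)

/* Proposed helper: clear only the direct-reclaim bit, keep all other flags. */
static inline gfp_t gfp_no_reclaim(gfp_t flags)
{
	return flags & ~MOCK_GFP_DIRECT_RECLAIM;
}

/* Hypothetical wrapper: flags for the opportunistic this-node attempt. */
static inline gfp_t thisnode_attempt_flags(gfp_t gfpflags)
{
	return gfp_no_reclaim(gfpflags) | MOCK_GFP_THISNODE;
}

/*
 * Hypothetical counter-example: OR-ing bits into the caller's flags can
 * never clear direct reclaim, which is why "|=" alone is not enough.
 */
static inline gfp_t or_only_flags(gfp_t gfpflags)
{
	return gfpflags | MOCK_GFP_KSWAPD_RECLAIM | MOCK_GFP_THISNODE;
}
```

With a kzalloc()-like caller passing MOCK_GFP_KERNEL | MOCK_GFP_ZERO,
thisnode_attempt_flags() clears only the direct-reclaim bit and keeps the
kswapd, zero, IO and FS bits, while or_only_flags() still leaves direct
reclaim set.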