From: Vlastimil Babka
Date: Mon, 20 Mar 2023 09:05:57 +0100
Subject: Re: [PATCH] mm/slub: Reduce memory consumption in extreme scenarios
To: "chenjun (AM)", "linux-kernel@vger.kernel.org", "linux-mm@kvack.org", "cl@linux.com", "penberg@kernel.org", "rientjes@google.com", "iamjoonsoo.kim@lge.com", "akpm@linux-foundation.org", Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: "xuqiang (M)", "Wangkefeng (OS Kernel Lab)", Michal Hocko, Mike Rapoport, Mel Gorman
Message-ID: <015855b3-ced3-8d84-e21d-cc6ce112b556@suse.cz>
References: <20230314123403.100158-1-chenjun102@huawei.com> <0cad1ff3-8339-a3eb-fc36-c8bda1392451@suse.cz> <344c7521d72e4107b451c19b329e9864@huawei.com> <8c700468-245d-72e9-99e7-b99d4547e6d8@suse.cz>

On 3/19/23 08:22, chenjun (AM) wrote:
> On 2023/3/17 20:06, Vlastimil Babka wrote:
>> On 3/17/23 12:32, chenjun (AM) wrote:
>>> On 2023/3/14 22:41, Vlastimil Babka wrote:
>>>>>  	pc.flags = gfpflags;
>>>>> +
>>>>> +	/*
>>>>> +	 * when (node != NUMA_NO_NODE) && (gfpflags & __GFP_THISNODE)
>>>>> +	 * 1) try to get a partial slab from target node with __GFP_THISNODE.
>>>>> +	 * 2) if 1) failed, try to allocate a new slab from target node with
>>>>> +	 *    __GFP_THISNODE.
>>>>> +	 * 3) if 2) failed, retry 1) and 2) without __GFP_THISNODE constraint.
>>>>> +	 */
>>>>> +	if (node != NUMA_NO_NODE && !(gfpflags & __GFP_THISNODE) && try_thisnode)
>>>>> +		pc.flags |= __GFP_THISNODE;
>>>>
>>>> Hmm I'm thinking we should also perhaps remove direct reclaim possibilities
>>>> from the attempt 2). In your qemu test it should make no difference, as it
>>>> fills everything with kernel memory that is not reclaimable. But in practice
>>>> the target node might be filled with user memory, and I think it's better to
>>>> quickly allocate on a different node than spend time in direct reclaim. So
>>>> the following should work I think?
>>>>
>>>> pc.flags = GFP_NOWAIT | __GFP_NOWARN |__GFP_THISNODE
>>>>
>>>
>>> Hmm, Should it be that:
>>>
>>> pc.flags |= GFP_NOWAIT | __GFP_NOWARN |__GFP_THISNODE
>>
>> No, we need to ignore the other reclaim-related flags that the caller
>> passed, or it wouldn't work as intended.
>> The danger is that we ignore some flag that would be necessary to pass, but
>> I don't think there's any?
>>
>>
>
> If we ignore __GFP_ZERO passed by kzalloc, kzalloc will not work.
> Could we just unmask __GFP_RECLAIMABLE | __GFP_RECLAIM?
>
> pc.flags &= ~(__GFP_RECLAIMABLE | __GFP_RECLAIM)
> pc.flags |= __GFP_THISNODE

__GFP_RECLAIMABLE would be wrong, but it is also ignored, as new_slab() does:

	flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK)

which would filter out __GFP_ZERO as well. That's not a problem, as kzalloc()
will zero out the individual allocated objects, so it doesn't matter if we
don't zero out the whole slab page.

But I wonder if we're not past due time for a helper, e.g.
gfp_opportunistic(flags), that would turn any allocation flags into a
GFP_NOWAIT while keeping the rest of the relevant flags intact, so that there
would be one canonical way to do it - I'm sure there's a number of places
with their own variants now?
With such a helper we'd just add __GFP_THISNODE to the result here, as that's
specific to this particular opportunistic allocation.
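
For illustration only, here is a minimal user-space sketch of what such a helper might look like. Everything in it is an assumption made for the example: gfp_opportunistic() does not exist under that name as far as this thread shows, the flag bit values are invented stand-ins rather than the kernel's, the two masks list only the flags modelled here, and adding __GFP_NOWARN inside the helper is one possible choice rather than something decided above. thisnode_attempt_flags() and slab_page_flags() are hypothetical names too.

```c
#include <assert.h>
#include <stdint.h>

/*
 * User-space stand-ins. The bit values are invented for this sketch and are
 * NOT the kernel's; the two masks below list only the flags modelled here.
 */
typedef uint32_t gfp_t;

#define __GFP_DIRECT_RECLAIM ((gfp_t)1u << 0)
#define __GFP_KSWAPD_RECLAIM ((gfp_t)1u << 1)
#define __GFP_IO             ((gfp_t)1u << 2)
#define __GFP_FS             ((gfp_t)1u << 3)
#define __GFP_ZERO           ((gfp_t)1u << 4)
#define __GFP_NOWARN         ((gfp_t)1u << 5)
#define __GFP_THISNODE       ((gfp_t)1u << 6)

#define __GFP_RECLAIM        (__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_NOWAIT           (__GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL           (__GFP_RECLAIM | __GFP_IO | __GFP_FS)

/* Reduced versions of the masks used in the new_slab() filtering. */
#define GFP_RECLAIM_MASK     (__GFP_RECLAIM | __GFP_IO | __GFP_FS | __GFP_NOWARN)
#define GFP_CONSTRAINT_MASK  (__GFP_THISNODE)

/* The whole point of GFP_NOWAIT: it must not permit direct reclaim. */
static_assert((GFP_NOWAIT & __GFP_DIRECT_RECLAIM) == 0,
	      "GFP_NOWAIT must not allow direct reclaim");

/*
 * Hypothetical helper: make any request behave like GFP_NOWAIT while
 * leaving unrelated flags (e.g. __GFP_ZERO) untouched.
 */
static gfp_t gfp_opportunistic(gfp_t flags)
{
	flags &= ~__GFP_DIRECT_RECLAIM;			/* never block in direct reclaim */
	flags |= __GFP_KSWAPD_RECLAIM | __GFP_NOWARN;	/* wake kswapd, fail quietly */
	return flags;
}

/* Flags for the opportunistic same-node attempt discussed in this thread. */
static gfp_t thisnode_attempt_flags(gfp_t caller_flags)
{
	return gfp_opportunistic(caller_flags) | __GFP_THISNODE;
}

/* The filtering quoted above, applied with the reduced masks. */
static gfp_t slab_page_flags(gfp_t flags)
{
	return flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK);
}
```

With a kzalloc()-style request of GFP_KERNEL | __GFP_ZERO, thisnode_attempt_flags() drops only the direct-reclaim bit and keeps __GFP_ZERO, while slab_page_flags() then strips __GFP_ZERO before the page allocation, matching the point that zeroing happens per object rather than per slab page.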