Date: Wed, 4 Jan 2023 12:02:43 +0000
From: Mel Gorman <mgorman@techsingularity.net>
To: Vlastimil Babka
Cc: Linux-MM, Andrew Morton, Michal Hocko, NeilBrown, Thierry Reding,
 Matthew Wilcox, LKML
Subject: Re: [PATCH 4/6] mm/page_alloc: Explicitly define what alloc flags deplete min reserves
Message-ID: <20230104120243.5qopsnscdmuxqyap@techsingularity.net>
References: <20221129151701.23261-1-mgorman@techsingularity.net>
 <20221129151701.23261-5-mgorman@techsingularity.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline

On Thu, Dec 08, 2022 at 06:55:00PM +0100, Vlastimil Babka wrote:
> On 11/29/22 16:16, Mel Gorman wrote:
> > As there are more ALLOC_ flags that affect reserves, define what flags
> > affect reserves and clarify the effect of each flag.
> 
> Seems to me this does more than a clarification, but also some functional
> tweaks, so it could be helpful if those were spelled out in the changelog.
> 

I will take out the problematic parts that need clarification. There are
two, one I'll drop and the other will be split. More details below.

> > @@ -3976,25 +3975,36 @@ bool __zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
> >  {
> >  	long min = mark;
> >  	int o;
> > -	const bool alloc_harder = (alloc_flags & (ALLOC_HARDER|ALLOC_OOM));
> >  
> >  	/* free_pages may go negative - that's OK */
> >  	free_pages -= __zone_watermark_unusable_free(z, order, alloc_flags);
> >  
> > -	if (alloc_flags & ALLOC_MIN_RESERVE)
> > -		min -= min / 2;
> > +	if (alloc_flags & ALLOC_RESERVES) {
> 
> Do we want to keep this unlikely() as alloc_harder did before?
> 

Added back in.

> > +		/*
> > +		 * __GFP_HIGH allows access to 50% of the min reserve as well
> > +		 * as OOM.
> > +		 */
> > +		if (alloc_flags & ALLOC_MIN_RESERVE)
> > +			min -= min / 2;
> >  
> > -	if (unlikely(alloc_harder)) {
> >  		/*
> > -		 * OOM victims can try even harder than normal ALLOC_HARDER
> > +		 * Non-blocking allocations can access some of the reserve
> > +		 * with more access if also __GFP_HIGH. The reasoning is that
> > +		 * a non-blocking caller may incur a more severe penalty
> > +		 * if it cannot get memory quickly, particularly if it's
> > +		 * also __GFP_HIGH.
> > +		 */
> > +		if (alloc_flags & (ALLOC_HARDER|ALLOC_HIGHATOMIC))
> > +			min -= min / 4;
> 
> For example this seems to change the allowed dip to reserves for
> ALLOC_HIGHATOMIC.
> 

You're right and this could cause problems. If high-order atomic
allocation failures start appearing again, this change would help but it
should be a standalone patch in response to a bug. I'll drop it for now.
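
As an aside, the compounding of those reductions is easier to see in
isolation. Below is a minimal userspace sketch of the scaling logic with
made-up flag values (not the kernel's definitions), just to illustrate
how far each combination dips into the reserve:

#include <stdio.h>

/* Hypothetical stand-ins for the kernel's ALLOC_* flags, values invented */
#define ALLOC_MIN_RESERVE	0x1
#define ALLOC_HARDER		0x2
#define ALLOC_OOM		0x4
#define ALLOC_RESERVES		(ALLOC_MIN_RESERVE|ALLOC_HARDER|ALLOC_OOM)

/* Mirrors the watermark scaling discussed above for __zone_watermark_ok() */
static long effective_min(long mark, unsigned int alloc_flags)
{
	long min = mark;

	if (alloc_flags & ALLOC_RESERVES) {
		if (alloc_flags & ALLOC_MIN_RESERVE)
			min -= min / 2;
		if (alloc_flags & ALLOC_HARDER)
			min -= min / 4;
		if (alloc_flags & ALLOC_OOM)
			min -= min / 2;
	}

	return min;
}

int main(void)
{
	/* With mark = 100: no reserve flags leaves min at 100,
	 * ALLOC_MIN_RESERVE halves it to 50 and adding ALLOC_HARDER
	 * compounds to 38, i.e. the reductions multiply rather than add.
	 */
	printf("none: %ld\n", effective_min(100, 0));
	printf("min_reserve: %ld\n", effective_min(100, ALLOC_MIN_RESERVE));
	printf("min_reserve|harder: %ld\n",
	       effective_min(100, ALLOC_MIN_RESERVE|ALLOC_HARDER));
	return 0;
}
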
> > +
> > +		/*
> > +		 * OOM victims can try even harder than the normal reserve
> >  		 * users on the grounds that it's definitely going to be in
> >  		 * the exit path shortly and free memory. Any allocation it
> >  		 * makes during the free path will be small and short-lived.
> >  		 */
> >  		if (alloc_flags & ALLOC_OOM)
> >  			min -= min / 2;
> > -	else
> > -		min -= min / 4;
> >  	}
> 
> (noted that this patch doesn't seem to change the concern I raised in
> previous patch)
> 

This might be addressed now with the changes to the patch that caused
you concerns about OOM handling.

> >  	/*
> > @@ -5293,7 +5303,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> >  	 * could deplete whole memory reserves which would just make
> >  	 * the situation worse
> >  	 */
> > -	page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_HARDER, ac);
> > +	page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_MIN_RESERVE|ALLOC_HARDER, ac);
> 
> And this AFAICS seems to give __GFP_NOFAIL 3/4 of min reserves instead of
> 1/4, which seems like a significant change (but hopefully ok) so worth
> noting at least.
> 

It deserves a standalone patch. Below is the diff I intend to apply to
this patch and the standalone patch.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 58e01a31492e..6f41b84a97ac 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3984,7 +3984,7 @@ bool __zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
 	/* free_pages may go negative - that's OK */
 	free_pages -= __zone_watermark_unusable_free(z, order, alloc_flags);
 
-	if (alloc_flags & ALLOC_RESERVES) {
+	if (unlikely(alloc_flags & ALLOC_RESERVES)) {
 		/*
 		 * __GFP_HIGH allows access to 50% of the min reserve as well
 		 * as OOM.
@@ -3999,7 +3999,7 @@ bool __zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
 		 * if it cannot get memory quickly, particularly if it's
 		 * also __GFP_HIGH.
 		 */
-		if (alloc_flags & (ALLOC_HARDER|ALLOC_HIGHATOMIC))
+		if (alloc_flags & ALLOC_HARDER)
 			min -= min / 4;
 
 		/*
@@ -5308,7 +5308,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	 * could deplete whole memory reserves which would just make
 	 * the situation worse
 	 */
-	page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_MIN_RESERVE|ALLOC_HARDER, ac);
+	page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_HARDER, ac);
 
 	if (page)
 		goto got_pg;

The patch to allow __GFP_NOFAIL deeper access is this

--8<--
mm/page_alloc.c: Allow __GFP_NOFAIL requests deeper access to reserves

Currently __GFP_NOFAIL allocations without any other flags can access 25%
of the reserves but these requests imply that the system cannot make
forward progress until the allocation succeeds. Allow __GFP_NOFAIL access
to 75% of the min reserve.

Signed-off-by: Mel Gorman

---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6f41b84a97ac..d2df78f5baa2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5308,7 +5308,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	 * could deplete whole memory reserves which would just make
 	 * the situation worse
 	 */
-	page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_HARDER, ac);
+	page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_MIN_RESERVE|ALLOC_HARDER, ac);
 
 	if (page)
 		goto got_pg;
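
For completeness, a worked example of what that change means at the
fallback site. Again a userspace sketch with invented numbers rather than
the kernel code: with mark = 100, ALLOC_HARDER alone scales min to 75 (a
25% dip), while ALLOC_MIN_RESERVE|ALLOC_HARDER compounds to 38:

#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical simplification of the order-0 watermark check: the
 * allocation may proceed if free pages stay above the scaled min.
 */
static bool watermark_ok(long free_pages, long scaled_min)
{
	return free_pages > scaled_min;
}

int main(void)
{
	long free_pages = 40;	/* invented figure for illustration */

	/* ALLOC_HARDER only: min = 100 - 100/4 = 75, so the check fails */
	printf("harder only: %d\n", watermark_ok(free_pages, 75));

	/* ALLOC_MIN_RESERVE|ALLOC_HARDER: min = (100 - 50) - 50/4 = 38,
	 * so the same __GFP_NOFAIL request now succeeds.
	 */
	printf("min_reserve|harder: %d\n", watermark_ok(free_pages, 38));
	return 0;
}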