Date: Thu, 12 Jan 2023 09:24:52 +0000
From: Mel Gorman
To: Michal Hocko
Cc: Linux-MM, Andrew Morton, NeilBrown, Thierry Reding, Matthew Wilcox, Vlastimil Babka, LKML
Subject: Re: [PATCH 6/7] mm/page_alloc: Give GFP_ATOMIC and non-blocking allocations access to reserves
Message-ID: <20230112092452.rtvo6tkp4rpmxm7v@techsingularity.net>
References: <20230109151631.24923-1-mgorman@techsingularity.net> <20230109151631.24923-7-mgorman@techsingularity.net> <20230111170552.5b7z5hetc2lcdwmb@techsingularity.net>

On Thu, Jan 12, 2023 at 09:11:06AM +0100, Michal Hocko wrote:
> On Wed 11-01-23 17:05:52, Mel Gorman wrote:
> > On Wed, Jan 11, 2023 at 04:58:02PM +0100, Michal Hocko wrote:
> > > On Mon 09-01-23 15:16:30, Mel Gorman wrote:
> > > > Explicit GFP_ATOMIC allocations get flagged ALLOC_HARDER which is a bit
> > > > vague. In preparation for removing __GFP_ATOMIC, give GFP_ATOMIC and
> > > > other non-blocking allocation requests equal access to reserve. Rename
> > > > ALLOC_HARDER to ALLOC_NON_BLOCK to make it more clear what the flag
> > > > means.
> > >
> > > GFP_NOWAIT can be also used for opportunistic allocations which can and
> > > should fail quickly if the memory is tight and more elaborate path
> > > should be taken (e.g.
> > > try higher order allocation first but fall back to
> > > smaller request if the memory is fragmented). Do we really want to give
> > > those access to memory reserves as well?
> >
> > Good question. Without __GFP_ATOMIC, GFP_NOWAIT only differs from GFP_ATOMIC
> > by __GFP_HIGH but that is not enough to distinguish between a caller that
> > cannot sleep versus one that is speculatively attempting an allocation but
> > has other options. That changelog is misleading, it's not equal access
> > as GFP_NOWAIT ends up with 25% of the reserves which is less than what
> > GFP_ATOMIC gets.
> >
> > Because it becomes impossible to distinguish between non-blocking and
> > atomic without __GFP_ATOMIC, there is some justification for allowing
> > access to reserves for GFP_NOWAIT. bio for example attempts an allocation
> > (clears __GFP_DIRECT_RECLAIM) before falling back to mempool but delays
> > in IO can also lead to further allocation pressure. mmu gather failing
> > GFP_NOWAIT slows the rate memory can be freed. NFS failing GFP_NOWAIT will
> > have to retry IOs multiple times. The examples were picked at random but
> > the point is that there are cases where failing GFP_NOWAIT can degrade
> > the system, particularly delay the cleaning of pages before reclaim.
>
> Fair points.
>
> > A lot of the truly speculative users appear to use GFP_NOWAIT | __GFP_NOWARN
> > so one compromise would be to avoid using reserves if __GFP_NOWARN is
> > also specified.
> >
> > Something like this as a separate patch?
>
> I cannot say I would be happy about adding more side effects to
> __GFP_NOWARN. You are right that it should be used for those optimistic
> allocation requests but historically many of these subtle side effects
> have kicked back at some point.

True.

> Wouldn't it make sense to explicitly
> mark those places which really benefit from reserves instead?

That would be __GFP_HIGH and would require context from every caller on
whether they need reserves or not and to determine what the consequences
are if there is a stall. Is there immediate local fallout or wider fallout
such as a variable delay before pages can be cleaned?

> This is
> more work but it should pay off long term. Your examples above would use
> GFP_ATOMIC instead of GFP_NOWAIT.
>

Yes, although it would confuse the meaning of GFP_ATOMIC as a result.
It's described as "%GFP_ATOMIC users can not sleep and need the allocation to
succeed" and something like the bio callsite does not *need* the allocation
to succeed. It can fall back to the mempool and performance simply degrades
temporarily. No doubt there are a few abuses of GFP_ATOMIC just to get
non-blocking behaviour already.

> The semantic would be easier to explain as well. GFP_ATOMIC - non
> sleeping allocations which are important so they have access to memory
> reserves. GFP_NOWAIT - non sleeping allocations.
>

People's definition of "important" will vary wildly. The following would
avoid reserve access for GFP_NOWAIT for now. It would need to be folded
into this patch and given a new changelog.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7244ab522028..aa20165224cf 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3989,18 +3989,19 @@ bool __zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
 		 * __GFP_HIGH allows access to 50% of the min reserve as well
 		 * as OOM.
 		 */
-		if (alloc_flags & ALLOC_MIN_RESERVE)
+		if (alloc_flags & ALLOC_MIN_RESERVE) {
 			min -= min / 2;
 
-		/*
-		 * Non-blocking allocations can access some of the reserve
-		 * with more access if also __GFP_HIGH. The reasoning is that
-		 * a non-blocking caller may incur a more severe penalty
-		 * if it cannot get memory quickly, particularly if it's
-		 * also __GFP_HIGH.
-		 */
-		if (alloc_flags & ALLOC_NON_BLOCK)
-			min -= min / 4;
+			/*
+			 * Non-blocking allocations (e.g. GFP_ATOMIC) can
+			 * access more reserves than just __GFP_HIGH. Other
+			 * non-blocking allocations requests such as GFP_NOWAIT
+			 * or (GFP_KERNEL & ~__GFP_DIRECT_RECLAIM) do not get
+			 * access to the min reserve.
+			 */
+			if (alloc_flags & ALLOC_NON_BLOCK)
+				min -= min / 4;
+		}
 
 		/*
 		 * OOM victims can try even harder than the normal reserve

-- 
Mel Gorman
SUSE Labs
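
For anyone following the arithmetic, here is a rough userspace sketch of how
the effective min watermark changes with and without the hunk above. It is
only an illustration: the flag values are made up for the example (they are
not the kernel's definitions), the helper names min_before(), min_after() and
check_examples() are hypothetical, and the surrounding reserve check and the
OOM handling in __zone_watermark_ok() are left out.

```c
#include <assert.h>

/* Hypothetical flag values for illustration only, not the kernel's. */
#define ALLOC_MIN_RESERVE 0x1u	/* stands in for __GFP_HIGH callers */
#define ALLOC_NON_BLOCK   0x2u	/* stands in for non-blocking callers */

/* Before the hunk: the two adjustments are applied independently. */
long min_before(long min, unsigned int alloc_flags)
{
	if (alloc_flags & ALLOC_MIN_RESERVE)
		min -= min / 2;
	if (alloc_flags & ALLOC_NON_BLOCK)
		min -= min / 4;
	return min;
}

/* After the hunk: non-blocking only helps if ALLOC_MIN_RESERVE is also set. */
long min_after(long min, unsigned int alloc_flags)
{
	if (alloc_flags & ALLOC_MIN_RESERVE) {
		min -= min / 2;
		if (alloc_flags & ALLOC_NON_BLOCK)
			min -= min / 4;
	}
	return min;
}

/* Worked numbers for a min watermark of 1000 pages. */
void check_examples(void)
{
	/* GFP_NOWAIT-like: 25% of the reserve before, none after. */
	assert(min_before(1000, ALLOC_NON_BLOCK) == 750);
	assert(min_after(1000, ALLOC_NON_BLOCK) == 1000);

	/* GFP_ATOMIC-like (both flags): unchanged by the hunk. */
	assert(min_before(1000, ALLOC_MIN_RESERVE | ALLOC_NON_BLOCK) == 375);
	assert(min_after(1000, ALLOC_MIN_RESERVE | ALLOC_NON_BLOCK) == 375);
}
```

With a min watermark of 1000 pages, a caller that is only non-blocking goes
from a threshold of 750 back to the full 1000, while a caller with both flags
keeps its threshold of 375 either way.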