Date: Thu, 12 Jan 2023 10:45:07 +0100
From: Michal Hocko
To: Mel Gorman
Cc: Linux-MM, Andrew Morton, NeilBrown, Thierry Reding, Matthew Wilcox, Vlastimil Babka, LKML
Subject: Re: [PATCH 6/7] mm/page_alloc: Give GFP_ATOMIC and non-blocking allocations access to reserves
References: <20230109151631.24923-1-mgorman@techsingularity.net> <20230109151631.24923-7-mgorman@techsingularity.net> <20230111170552.5b7z5hetc2lcdwmb@techsingularity.net> <20230112092452.rtvo6tkp4rpmxm7v@techsingularity.net>
In-Reply-To: <20230112092452.rtvo6tkp4rpmxm7v@techsingularity.net>

On Thu 12-01-23 09:24:52, Mel Gorman wrote:
> On Thu, Jan 12, 2023 at 09:11:06AM +0100, Michal Hocko wrote:
> > On Wed 11-01-23 17:05:52, Mel Gorman wrote:
> > > On Wed, Jan 11, 2023 at 04:58:02PM +0100, Michal Hocko wrote:
> > > > On Mon 09-01-23 15:16:30, Mel Gorman wrote:
> > > > > Explicit GFP_ATOMIC allocations get flagged ALLOC_HARDER which is a bit
> > > > > vague. In preparation for removing __GFP_ATOMIC, give GFP_ATOMIC and
> > > > > other non-blocking allocation requests equal access to reserve. Rename
> > > > > ALLOC_HARDER to ALLOC_NON_BLOCK to make it more clear what the flag
> > > > > means.
> > > >
> > > > GFP_NOWAIT can be also used for opportunistic allocations which can and
> > > > should fail quickly if the memory is tight and more elaborate path
> > > > should be taken (e.g. try higher order allocation first but fall back to
> > > > smaller request if the memory is fragmented). Do we really want to give
> > > > those access to memory reserves as well?
> > >
> > > Good question. Without __GFP_ATOMIC, GFP_NOWAIT only differs from GFP_ATOMIC
> > > by __GFP_HIGH but that is not enough to distinguish between a caller that
> > > cannot sleep versus one that is speculatively attempting an allocation but
> > > has other options. That changelog is misleading, it's not equal access
> > > as GFP_NOWAIT ends up with 25% of the reserves which is less than what
> > > GFP_ATOMIC gets.
> > >
> > > Because it becomes impossible to distinguish between non-blocking and
> > > atomic without __GFP_ATOMIC, there is some justification for allowing
> > > access to reserves for GFP_NOWAIT. bio for example attempts an allocation
> > > (clears __GFP_DIRECT_RECLAIM) before falling back to mempool but delays
> > > in IO can also lead to further allocation pressure. mmu gather failing
> > > GFP_NOWAIT slows the rate memory can be freed.
> > > NFS failing GFP_NOWAIT will have to retry IOs multiple times. The
> > > examples were picked at random but the point is that there are cases
> > > where failing GFP_NOWAIT can degrade the system, particularly delay
> > > the cleaning of pages before reclaim.
> >
> > Fair points.
> >
> > > A lot of the truly speculative users appear to use GFP_NOWAIT | __GFP_NOWARN
> > > so one compromise would be to avoid using reserves if __GFP_NOWARN is
> > > also specified.
> > >
> > > Something like this as a separate patch?
> >
> > I cannot say I would be happy about adding more side effects to
> > __GFP_NOWARN. You are right that it should be used for those optimistic
> > allocation requests but historically many of these subtle side effects
> > have kicked back at some point.
>
> True.
>
> > Wouldn't it make sense to explicitly
> > mark those places which really benefit from reserves instead?
>
> That would be __GFP_HIGH and would require context from every caller on
> whether they need reserves or not and to determine what the consequences
> are if there is a stall. Is there immediate local fallout or wider fallout
> such as a variable delay before pages can be cleaned?

Yes, and I will not hide that I do not mind putting the burden on the
caller to justify adding the requirement and eating from the otherwise
shared pool that the memory reserves are.

> > This is
> > more work but it should pay off long term. Your examples above would use
> > GFP_ATOMIC instead of GFP_NOWAIT.
> >
>
> Yes, although it would confuse the meaning of GFP_ATOMIC as a result.
> It's described as "%GFP_ATOMIC users can not sleep and need the allocation to
> succeed" and something like the bio callsite does not *need* the allocation
> to succeed. It can fallback to the mempool and performance simply degrades
> temporarily. No doubt there are a few abuses of GFP_ATOMIC just to get
> non-blocking behaviour already.

I am afraid GFP_ATOMIC will eventually require a closer look.
Many users are simply confused by the name and use it from spin lock
context. Others use it from IRQ context because that is the right thing
to do (TM).

> > The semantic would be easier to explain as well. GFP_ATOMIC - non
> > sleeping allocations which are important so they have access to memory
> > reserves. GFP_NOWAIT - non sleeping allocations.
> >
>
> People's definition of "important" will vary wildly. The following would
> avoid reserve access for GFP_NOWAIT for now. It would need to be folded
> into this patch and a new changelog

OK, so that effectively means that the __GFP_HIGH modifier will give more
reserves to non-sleepable allocations than to sleepable ones. That is a
better semantic than other special casing because when the two allocations
are competing, the non-sleepable one should get priority because it simply
cannot reclaim. That hierarchy makes sense to me.

Thanks for bearing with me here. Changing gfp flags semantic is a PITA.
I wish we could design the whole thing from scratch (and screw it up in
yet another way).

I will ack the patch once you post the full version of it.
-- 
Michal Hocko
SUSE Labs
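
The hierarchy agreed on above can be sketched as a small user-space C
model. This is only an illustration and not the patch under discussion:
the MODEL_* names and bit values are hypothetical, MODEL_GFP_KERNEL leaves
out the IO/FS bits, and the fractions (half of the min watermark for
__GFP_HIGH, a further quarter of the remainder when the request cannot
block) are assumptions chosen only to show the ordering. The point is that
plain GFP_NOWAIT gets no reserve access, while __GFP_HIGH reaches deeper
for non-sleepable requests than for sleepable ones.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for illustration only; not the kernel's values. */
#define MODEL_GFP_HIGH           0x1u  /* caller asks for reserve access */
#define MODEL_GFP_DIRECT_RECLAIM 0x2u  /* caller may sleep and reclaim   */
#define MODEL_GFP_KSWAPD_RECLAIM 0x4u  /* caller may wake kswapd         */

/* Simplified stand-ins for the masks named in the thread. */
#define MODEL_GFP_ATOMIC (MODEL_GFP_HIGH | MODEL_GFP_KSWAPD_RECLAIM)
#define MODEL_GFP_NOWAIT (MODEL_GFP_KSWAPD_RECLAIM)
#define MODEL_GFP_KERNEL (MODEL_GFP_DIRECT_RECLAIM | MODEL_GFP_KSWAPD_RECLAIM)

/*
 * Return the watermark a request must stay above. A lower result means
 * the request may dig further into the reserves.
 */
unsigned long model_effective_min(unsigned int gfp, unsigned long min)
{
	bool can_block = (gfp & MODEL_GFP_DIRECT_RECLAIM) != 0;

	if (gfp & MODEL_GFP_HIGH) {
		/* Assumed fraction: __GFP_HIGH may use half of the reserve. */
		min -= min / 2;

		/*
		 * Assumed fraction: a request that cannot reclaim for itself
		 * gets a further quarter of what is left.
		 */
		if (!can_block)
			min -= min / 4;
	}

	/* Without __GFP_HIGH there is no reserve access, blocking or not. */
	return min;
}

/* Checks the ordering only: NOWAIT == KERNEL > KERNEL|HIGH > ATOMIC. */
void model_self_check(void)
{
	unsigned long min = 1024;

	assert(model_effective_min(MODEL_GFP_NOWAIT, min) ==
	       model_effective_min(MODEL_GFP_KERNEL, min));
	assert(model_effective_min(MODEL_GFP_KERNEL | MODEL_GFP_HIGH, min) <
	       model_effective_min(MODEL_GFP_KERNEL, min));
	assert(model_effective_min(MODEL_GFP_ATOMIC, min) <
	       model_effective_min(MODEL_GFP_KERNEL | MODEL_GFP_HIGH, min));
}
```

With a min watermark of 1024 pages, this model leaves GFP_NOWAIT and
GFP_KERNEL at 1024, lets GFP_KERNEL | __GFP_HIGH go down to 512, and lets
GFP_ATOMIC go down to 384.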