Date: Mon, 12 Feb 2024 12:20:32 +1100
From: Dave Chinner
To: "Vlastimil Babka (SUSE)"
Cc: Michal Hocko, Matthew Wilcox, lsf-pc@lists.linux-foundation.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-block@vger.kernel.org, linux-ide@vger.kernel.org,
	linux-scsi@vger.kernel.org, linux-nvme@lists.infradead.org,
	Kent Overstreet
Subject: Re: [LSF/MM/BPF TOPIC] Removing GFP_NOFS
References: <3ba0dffa-beea-478f-bb6e-777b6304fb69@kernel.org>
	<3aa399bb-5007-4d12-88ae-ed244e9a653f@kernel.org>
In-Reply-To: <3aa399bb-5007-4d12-88ae-ed244e9a653f@kernel.org>

On Thu, Feb 08, 2024 at 08:55:05PM +0100, Vlastimil Babka (SUSE) wrote:
> On 2/8/24 18:33, Michal Hocko wrote:
> > On Thu 08-02-24 17:02:07, Vlastimil Babka (SUSE) wrote:
> >> On 1/9/24 05:47, Dave Chinner wrote:
> >> > On Thu, Jan 04, 2024 at 09:17:16PM +0000, Matthew Wilcox wrote:
> >>
> >> Your points and Kent's proposal of scoped GFP_NOWAIT [1] suggests to me this
> >> is no longer FS-only topic as this isn't just about converting to the scoped
> >> apis, but also how they should be improved.
> >
> > Scoped GFP_NOFAIL context is slightly easier from the semantic POV than
> > scoped GFP_NOWAIT as it doesn't add a potentially unexpected failure
> > mode. It is still tricky to deal with GFP_NOWAIT requests inside the
> > NOFAIL scope because that makes it a non failing busy wait for an
> > allocation if we need to insist on scope NOFAIL semantic.
> >
> > On the other hand we can define the behavior similar to what you
> > propose with RETRY_MAYFAIL resp. NORETRY. Existing NOWAIT users should
> > better handle allocation failures regardless of the external allocation
> > scope.
> >
> > Overriding that scoped NOFAIL semantic with RETRY_MAYFAIL or NORETRY
> > resembles the existing PF_MEMALLOC and GFP_NOMEMALLOC semantic and I do
> > not see an immediate problem with that.
> >
> > Having more NOFAIL allocations is not great but if you need to
> > emulate those by implementing the nofail semantic outside of the
> > allocator then it is better to have those retries inside the allocator
> > IMO.
>
> I see potential issues in scoping both the NOWAIT and NOFAIL
>
> - NOFAIL - I'm assuming Dave is adding __GFP_NOFAIL to xfs allocations or
> adjacent layers where he knows they must not fail for his transaction. But
> could the scope affect also something else underneath that could fail
> without the failure propagating in a way that it affects xfs?

Memory allocation failures below the filesystem (i.e. in the IO
path) will fail the IO, and if that happens for a read IO within
a transaction then it will have the same effect as XFS failing a
memory allocation. i.e. it will shut down the filesystem.

The key point here is that the moment we go below the filesystem we
enter into a new scoped allocation context with a guaranteed method
of returning errors: NOIO and bio errors.

Once we cross an allocation scope boundary, NOFAIL is no
longer relevant to the code that is being run because there are
other errors that can occur that the filesystem must handle.
Hence memory allocation errors just don't matter at this point, and
the NOFAIL constraint is no longer relevant.

Hence we really need to consider NOFAIL differently to NOFS/NOIO.
NOFS/NOIO are about avoiding reclaim recursion deadlocks, so are
relevant all the way down the stack. NOFAIL is only relevant to a
specific subsystem to prevent subsystem allocations from failing,
but as soon as we cross into another subsystem that can (and does)
return errors for memory allocation failures, the NOFAIL context is
no longer relevant.

i.e. NOFAIL scopes are not relevant outside the subsystem that sets
it. Hence we likely need helpers to clear and restore NOFAIL when
we cross an allocation context boundary. e.g. as we cross from
filesystem to block layer in the IO stack via submit_bio(). Maybe
they should be doing something like:

	nofail_flags = memalloc_nofail_clear();
	noio_flags = memalloc_noio_save();
	....
	memalloc_noio_restore(noio_flags);
	memalloc_nofail_reinstate(nofail_flags);

> Maybe it's a
> high-order allocation with a low-order fallback that really should not be
> __GFP_NOFAIL? We would need to hope it has something like RETRY_MAYFAIL or
> NORETRY already. But maybe it just relies on >costly order being more likely
> to fail implicitly, and those costly orders should be kept excluded from the
> scoped NOFAIL? Maybe __GFP_NOWARN should also override the scoped nofail?

We definitely need NORETRY/RETRY_MAYFAIL to override scoped NOFAIL
at the filesystem layer (e.g. for readahead buffer allocations,
xlog_kvmalloc(), etc to correctly fail fast within XFS
transactions), but I don't think we should force every subsystem to
have to do this just in case a higher level subsystem had a scoped
NOFAIL set for it to work correctly.

> - NOWAIT - as said already, we need to make sure we're not turning an
> allocation that relied on too-small-to-fail into a null pointer exception or
> BUG_ON(!page).

Agreed. NOWAIT is removing allocation failure constraints and I
don't think that can be made to work reliably. Error injection
cannot prove the absence of errors and so we can never be certain
the code will always operate correctly and not crash when an
unexpected allocation failure occurs.

-Dave.
-- 
Dave Chinner
david@fromorbit.com
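
For anyone who wants to poke at the scope-boundary idea above, here is a
rough userspace model of it. It is only a sketch under assumptions, not
kernel code: memalloc_nofail_clear() and memalloc_nofail_reinstate() are
the hypothetical helpers named in the mail, the noio pair is a stand-in
that merely mimics the real memalloc_noio_save()/memalloc_noio_restore(),
and the MODEL_* bits, task_scope, alloc_may_fail() and
block_layer_alloc_may_fail() are invented purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: these bit values are made up for illustration. */
#define MODEL_NOFAIL        0x1u  /* scope: allocations must not fail */
#define MODEL_NOIO          0x2u  /* scope: reclaim must not issue IO */
#define MODEL_NORETRY       0x4u  /* per-call: fail fast */
#define MODEL_RETRY_MAYFAIL 0x8u  /* per-call: try hard, but may fail */

/* Stand-in for the per-task scope flags (single-threaded model). */
static unsigned int task_scope;

/* Hypothetical helper: drop the NOFAIL scope, return what was set. */
static unsigned int memalloc_nofail_clear(void)
{
	unsigned int old = task_scope & MODEL_NOFAIL;

	task_scope &= ~MODEL_NOFAIL;
	return old;
}

/* Hypothetical helper: put back what memalloc_nofail_clear() removed. */
static void memalloc_nofail_reinstate(unsigned int old)
{
	assert((old & ~MODEL_NOFAIL) == 0);
	task_scope = (task_scope & ~MODEL_NOFAIL) | old;
}

/* Stand-ins that mimic the kernel's memalloc_noio_save()/restore(). */
static unsigned int memalloc_noio_save(void)
{
	unsigned int old = task_scope & MODEL_NOIO;

	task_scope |= MODEL_NOIO;
	return old;
}

static void memalloc_noio_restore(unsigned int old)
{
	task_scope = (task_scope & ~MODEL_NOIO) | old;
}

/* True if an allocation with these per-call flags is allowed to fail. */
static bool alloc_may_fail(unsigned int call_flags)
{
	/* An explicit NORETRY/RETRY_MAYFAIL overrides a scoped NOFAIL. */
	if (call_flags & (MODEL_NORETRY | MODEL_RETRY_MAYFAIL))
		return true;
	return !((task_scope | call_flags) & MODEL_NOFAIL);
}

/* Model of crossing from the filesystem into the block layer. */
static bool block_layer_alloc_may_fail(void)
{
	unsigned int nofail_flags = memalloc_nofail_clear();
	unsigned int noio_flags = memalloc_noio_save();

	/* What a plain allocation in the IO path would see. */
	bool may_fail = alloc_may_fail(0);

	memalloc_noio_restore(noio_flags);
	memalloc_nofail_reinstate(nofail_flags);
	return may_fail;
}
```

With task_scope set to MODEL_NOFAIL, a plain allocation in the
"filesystem" may not fail, one tagged MODEL_NORETRY or
MODEL_RETRY_MAYFAIL may, and anything run via
block_layer_alloc_may_fail() sees no NOFAIL scope at all. The caller's
scope is intact again once it returns.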