From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 1 Mar 2024 19:02:19 -0500
From: Kent Overstreet
To: NeilBrown
Cc: Dave Chinner, Matthew Wilcox, Amir Goldstein, paulmck@kernel.org,
	lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org, linux-fsdevel,
	Jan Kara
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Reclamation interactions with RCU
References: <170925937840.24797.2167230750547152404@noble.neil.brown.name>
	<170933687972.24797.18406852925615624495@noble.neil.brown.name>
In-Reply-To: <170933687972.24797.18406852925615624495@noble.neil.brown.name>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

On Sat, Mar 02, 2024 at 10:47:59AM +1100, NeilBrown wrote:
> On Sat, 02 Mar 2024, Kent Overstreet wrote:
> > On Fri, Mar 01, 2024 at 04:54:55PM +1100, Dave Chinner wrote:
> > > On Fri, Mar 01, 2024 at 01:16:18PM +1100, NeilBrown wrote:
> > > > While we are considering revising mm rules, I would really like to
> > > > revise the rule that GFP_KERNEL allocations are allowed to fail.
> > > > I'm not at all sure that they ever do (except for large allocations - so
> > > > maybe we could leave that exception in - or warn if large allocations
> > > > are tried without a MAY_FAIL flag).
> > > >
> > > > Given that GFP_KERNEL can wait, and that the mm can kill off processes
> > > > and clear cache to free memory, there should be no case where failure is
> > > > needed or when simply waiting will eventually result in success. And if
> > > > there is, the machine is a goner anyway.
> > >
> > > Yes, please!
> > >
> > > XFS was designed and implemented on an OS that gave this exact
> > > guarantee for kernel allocations back in the early 1990s. Memory
> > > allocation simply blocked until it succeeded unless the caller
> > > indicated they could handle failure. That's what __GFP_NOFAIL does
> > > and XFS is still heavily dependent on this behaviour.
> >
> > I'm not saying we should get rid of __GFP_NOFAIL - actually, I'd say
> > let's remove the underscores and get rid of the silly two page limit.
> > GFP_NOFAIL|GFP_KERNEL is perfectly safe for larger allocations, as long
> > as you don't mind possibly waiting a bit.
> >
> > But it can't be the default because, like I mentioned to Neil, there are
> > a _lot_ of different places where we allocate memory in the kernel, and
> > they have to be able to fail instead of shoving everything else out of
> > memory.
> >
> > > This is the sort of thing I was thinking of in the "remove
> > > GFP_NOFS" discussion thread when I said this to Kent:
> > >
> > > "We need to start designing our code in a way that doesn't require
> > > extensive testing to validate it as correct. If the only way to
> > > validate new code is correct is via stochastic coverage via error
> > > injection, then that is a clear sign we've made poor design choices
> > > along the way."
> > >
> > > https://lore.kernel.org/linux-fsdevel/ZcqWh3OyMGjEsdPz@dread.disaster.area/
> > >
> > > If memory allocation doesn't fail by default, then we can remove the
> > > vast majority of allocation error handling from the kernel. Make the
> > > common case just work - remove the need for all that code to handle
> > > failures that is hard to exercise reliably and so are rarely tested.
> > >
> > > A simple change to make long standing behaviour an actual policy we
> > > can rely on means we can remove both code and test matrix overhead -
> > > it's a win-win IMO.
> >
> > We definitely don't want to make GFP_NOIO/GFP_NOFS allocations nofail by
> > default - a great many of those allocations have mempools in front of
> > them to avoid deadlocks, and if you do that you've made the mempools
> > useless.
> >
>
> Not strictly true. mempool_alloc() adds __GFP_NORETRY so the allocation
> will certainly fail if that is appropriate.

*nod*

> I suspect that most places where there is a non-error fallback already
> use NORETRY or RETRY_MAYFAIL or similar.

NORETRY and RETRY_MAYFAIL actually weren't on my radar, and I don't see
_tons_ of uses for either of them - more for NORETRY.

My go-to is NOWAIT in this scenario though; my common pattern is "try
nonblocking with locks held, then drop locks and retry GFP_KERNEL".

> But I agree that changing the meaning of GFP_KERNEL has a potential to
> cause problems. I support promoting "GFP_NOFAIL" which should work at
> least up to PAGE_ALLOC_COSTLY_ORDER (8 pages).

I'd support this change.

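For readers who have not seen the "try nonblocking with locks held, then
drop locks and retry" pattern mentioned above, here is a minimal userspace
sketch of the control flow. Everything in it is a hypothetical stand-in:
mock_alloc(), mock_lock(), ALLOC_NOWAIT and ALLOC_MAY_BLOCK only model
GFP_NOWAIT, GFP_KERNEL and a lock, and are not kernel APIs. The one point
it tries to show is that state must be rechecked after the lock is retaken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for GFP_NOWAIT / GFP_KERNEL; not the real flag bits. */
enum alloc_mode { ALLOC_NOWAIT, ALLOC_MAY_BLOCK };

static int locks_held;          /* models "we are in a context that must not sleep" */
static bool simulate_pressure;  /* when set, nonblocking attempts fail */

static void mock_lock(void)   { locks_held++; }
static void mock_unlock(void) { assert(locks_held > 0); locks_held--; }

/* Mock allocator: only a blocking request can ride out memory pressure. */
static void *mock_alloc(size_t size, enum alloc_mode mode)
{
	if (mode == ALLOC_MAY_BLOCK) {
		/* Blocking with a lock held is exactly the bug the pattern avoids. */
		assert(locks_held == 0);
	} else if (simulate_pressure) {
		return NULL;
	}
	return malloc(size);
}

struct cache {
	void *slot;      /* protected by the mock lock */
	int slow_paths;  /* how many times the lock had to be dropped */
};

/* Populate c->slot. Returns 0 on success, -1 if even the blocking attempt failed. */
static int cache_fill(struct cache *c, size_t size)
{
	void *buf;

	mock_lock();
	if (c->slot) {
		mock_unlock();
		return 0;
	}

	buf = mock_alloc(size, ALLOC_NOWAIT);      /* fast path: lock stays held */
	if (!buf) {
		mock_unlock();                     /* slow path: drop the lock first */
		buf = mock_alloc(size, ALLOC_MAY_BLOCK);
		if (!buf)
			return -1;
		mock_lock();
		c->slow_paths++;
		if (c->slot) {                     /* may have been filled while unlocked */
			free(buf);
			mock_unlock();
			return 0;
		}
	}

	c->slot = buf;
	mock_unlock();
	return 0;
}
```

In this model, setting simulate_pressure before calling cache_fill() forces
the slow path; the assertion inside mock_alloc() would fire if the blocking
attempt were ever made with the lock still held.
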
> I'm unsure how it should behave in PF_MEMALLOC_NOFS and
> PF_MEMALLOC_NOIO context. I suspect Dave would tell me it should work in
> these contexts, in which case I'm sure it should.
>
> Maybe we could then deprecate GFP_KERNEL.

What do you have in mind?

Deprecating GFP_NOFS and GFP_NOIO would be wonderful - those should
really just be PF_MEMALLOC_NOFS and PF_MEMALLOC_NOIO, now that we're
pushing for memalloc_flags_(save|restore) more.

Getting rid of those would be a really nice cleanup because then gfp
flags would mostly just be:
 - the type of memory to allocate (highmem, zeroed, etc.)
 - how hard to try (don't block at all, block some, block forever)
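
To make that split concrete, here is a rough userspace model, loosely
patterned on the existing memalloc_nofs_save()/memalloc_nofs_restore()
scoping helpers. All names in it (struct alloc_req, the MEM_*, EFFORT_* and
CTX_* constants, ctx_save(), ctx_restore()) are invented for illustration
and are not kernel APIs. The request carries only the two axes listed
above, while the NOFS/NOIO restriction lives in saved and restored context
state. It assumes, as with the current flags, that a NOIO scope also rules
out entering the filesystem.

```c
#include <assert.h>
#include <stdbool.h>

/* Axis 1: what kind of memory to allocate (hypothetical names). */
#define MEM_HIGHMEM (1u << 0)
#define MEM_ZEROED  (1u << 1)

/* Axis 2: how hard to try. */
enum effort { EFFORT_NOWAIT, EFFORT_MAY_FAIL, EFFORT_NOFAIL };

struct alloc_req {
	unsigned int mem;    /* MEM_* bits */
	enum effort effort;
};

/* Scope restrictions live in task context, not in the request. */
#define CTX_NOFS (1u << 0)
#define CTX_NOIO (1u << 1)

static unsigned int ctx_flags;  /* per-task in the kernel; a plain global here */

/* Enter a restricted scope; returns the previous state for ctx_restore(). */
static unsigned int ctx_save(unsigned int flags)
{
	unsigned int old = ctx_flags;

	ctx_flags |= flags;
	return old;
}

/* Leave a scope by putting back exactly what ctx_save() returned. */
static void ctx_restore(unsigned int old)
{
	ctx_flags = old;
}

/* What reclaim may do is derived from the scope alone. */
static bool may_do_io(void)
{
	return !(ctx_flags & CTX_NOIO);
}

static bool may_enter_fs(void)
{
	return !(ctx_flags & (CTX_NOFS | CTX_NOIO));
}

/* Whether the caller may sleep is derived from the request alone. */
static bool may_block(const struct alloc_req *req)
{
	return req->effort != EFFORT_NOWAIT;
}
```

Because ctx_restore() puts back the saved value rather than clearing a bit,
scopes nest: leaving an inner NOIO scope inside an outer NOFS scope leaves
the NOFS restriction in place.
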