From: "NeilBrown"
To: "Kent Overstreet"
Cc: "Dave Chinner", "Matthew Wilcox", "Amir Goldstein", paulmck@kernel.org, lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org, "linux-fsdevel", "Jan Kara"
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Reclamation interactions with RCU
Date: Sat, 02 Mar 2024 10:47:59 +1100
Message-id: <170933687972.24797.18406852925615624495@noble.neil.brown.name>

On Sat, 02 Mar 2024, Kent Overstreet wrote:
> On Fri, Mar 01, 2024 at 04:54:55PM +1100, Dave Chinner wrote:
> > On Fri, Mar 01, 2024 at 01:16:18PM +1100, NeilBrown wrote:
> > > While we are considering revising mm rules, I would really like to
> > > revised the rule that GFP_KERNEL allocations are allowed to fail.
> > > I'm not at all sure that they ever do (except for large allocations - so
> > > maybe we could leave that exception in - or warn if large allocations
> > > are tried without a MAY_FAIL flag).
> > >
> > > Given that GFP_KERNEL can wait, and that the mm can kill off processes
> > > and clear cache to free memory, there should be no case where failure is
> > > needed or when simply waiting will eventually result in success. And if
> > > there is, the machine is a gonner anyway.
> >
> > Yes, please!
> >
> > XFS was designed and implemented on an OS that gave this exact
> > guarantee for kernel allocations back in the early 1990s. Memory
> > allocation simply blocked until it succeeded unless the caller
> > indicated they could handle failure. That's what __GFP_NOFAIL does
> > and XFS is still heavily dependent on this behaviour.
>
> I'm not saying we should get rid of __GFP_NOFAIL - actually, I'd say
> let's remove the underscores and get rid of the silly two page limit.
> GFP_NOFAIL|GFP_KERNEL is perfectly safe for larger allocations, as long
> as you don't mind possibly waiting a bit.
>
> But it can't be the default because, like I mentioned to Neal, there are
> a _lot_ of different places where we allocate memory in the kernel, and
> they have to be able to fail instead of shoving everything else out of
> memory.
>
> > This is the sort of thing I was thinking of in the "remove
> > GFP_NOFS" discussion thread when I said this to Kent:
> >
> > "We need to start designing our code in a way that doesn't require
> > extensive testing to validate it as correct. If the only way to
> > validate new code is correct is via stochastic coverage via error
> > injection, then that is a clear sign we've made poor design choices
> > along the way."
> >
> > https://lore.kernel.org/linux-fsdevel/ZcqWh3OyMGjEsdPz@dread.disaster.area/
> >
> > If memory allocation doesn't fail by default, then we can remove the
> > vast majority of allocation error handling from the kernel. Make the
> > common case just work - remove the need for all that code to handle
> > failures that is hard to exercise reliably and so are rarely tested.
> >
> > A simple change to make long standing behaviour an actual policy we
> > can rely on means we can remove both code and test matrix overhead -
> > it's a win-win IMO.
>
> We definitely don't want to make GFP_NOIO/GFP_NOFS allocations nofail by
> default - a great many of those allocations have mempools in front of
> them to avoid deadlocks, and if you do that you've made the mempools
> useless.
>

Not strictly true. mempool_alloc() adds __GFP_NORETRY so the allocation
will certainly fail if that is appropriate.

I suspect that most places where there is a non-error fallback already
use NORETRY or RETRY_MAYFAIL or similar.

But I agree that changing the meaning of GFP_KERNEL has the potential to
cause problems. I support promoting "GFP_NOFAIL", which should work at
least up to PAGE_ALLOC_COSTLY_ORDER (8 pages).
I'm unsure how it should behave in PF_MEMALLOC_NOFS and
PF_MEMALLOC_NOIO context. I suspect Dave would tell me it should work in
these contexts, in which case I'm sure it should.

Maybe we could then deprecate GFP_KERNEL.

Thanks,
NeilBrown
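
To make the two technical points above concrete, here is a rough
userspace sketch. It is not kernel code. It only models how
mempool_alloc() ORs extra flags into the caller's mask (to my
understanding it sets __GFP_NOMEMALLOC and __GFP_NOWARN alongside
__GFP_NORETRY) and how PAGE_ALLOC_COSTLY_ORDER (3) corresponds to 8
pages. The M_* bit values, the type name and the helper names are
made up for illustration and do not match the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this model; the real kernel values differ. */
typedef unsigned int gfp_model_t;

#define M_GFP_DIRECT_RECLAIM 0x01u
#define M_GFP_IO             0x02u
#define M_GFP_FS             0x04u
#define M_GFP_NORETRY        0x08u
#define M_GFP_NOWARN         0x10u
#define M_GFP_NOMEMALLOC     0x20u

/* Simplified composites (the kernel also includes kswapd reclaim). */
#define M_GFP_KERNEL (M_GFP_DIRECT_RECLAIM | M_GFP_IO | M_GFP_FS)
#define M_GFP_NOIO   (M_GFP_DIRECT_RECLAIM)

/* Same numeric value as the kernel's PAGE_ALLOC_COSTLY_ORDER. */
#define M_PAGE_ALLOC_COSTLY_ORDER 3u

/*
 * Mask the underlying allocator sees when a request goes through a
 * mempool: the caller's flags plus "do not retry", "do not warn" and
 * "do not dip into emergency reserves". Because NORETRY is always
 * added, the underlying allocation stays able to fail and the pool's
 * reserve remains the fallback.
 */
static gfp_model_t mempool_effective_gfp(gfp_model_t gfp)
{
    return gfp | M_GFP_NOMEMALLOC | M_GFP_NORETRY | M_GFP_NOWARN;
}

/* Number of pages in an allocation of the given order (2^order). */
static unsigned long pages_for_order(unsigned int order)
{
    assert(order < 8 * sizeof(unsigned long));
    return 1UL << order;
}

/*
 * True when the order lies in the range where, per the suggestion
 * above, a NOFAIL request should at least be expected to work.
 */
static bool nofail_in_suggested_range(unsigned int order)
{
    return order <= M_PAGE_ALLOC_COSTLY_ORDER;
}
```

In this model, mempool_effective_gfp(M_GFP_NOIO) always carries
M_GFP_NORETRY, which is why a different default for plain GFP_NOIO
callers would not by itself change what the pool's backing allocation
does.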