Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Reclamation interactions with RCU


From: Kent Overstreet <kent.overstreet@linux.dev>
To: NeilBrown <neilb@suse.de>
Cc: Dave Chinner <david@fromorbit.com>,
	 Matthew Wilcox <willy@infradead.org>,
	Amir Goldstein <amir73il@gmail.com>,
	paulmck@kernel.org,  lsf-pc@lists.linux-foundation.org,
	linux-mm@kvack.org,
	 linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Jan Kara <jack@suse.cz>
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Reclamation interactions with RCU
Date: Fri, 1 Mar 2024 19:02:19 -0500
Message-ID: <xbjw7mn57qik3ica2k6o7ykt7twryod6rt3uvu73w6xahrrrql@iaplvz7t5tgv>
In-Reply-To: <170933687972.24797.18406852925615624495@noble.neil.brown.name>

On Sat, Mar 02, 2024 at 10:47:59AM +1100, NeilBrown wrote:
> On Sat, 02 Mar 2024, Kent Overstreet wrote:
> > On Fri, Mar 01, 2024 at 04:54:55PM +1100, Dave Chinner wrote:
> > > On Fri, Mar 01, 2024 at 01:16:18PM +1100, NeilBrown wrote:
> > > > While we are considering revising mm rules, I would really like to
> > > > revise the rule that GFP_KERNEL allocations are allowed to fail.
> > > > I'm not at all sure that they ever do (except for large allocations - so
> > > > maybe we could leave that exception in - or warn if large allocations
> > > > are tried without a MAY_FAIL flag).
> > > > 
> > > > Given that GFP_KERNEL can wait, and that the mm can kill off processes
> > > > and clear cache to free memory, there should be no case where failure is
> > > > needed or when simply waiting will eventually result in success.  And if
> > > > there is, the machine is a goner anyway.
> > > 
> > > Yes, please!
> > > 
> > > XFS was designed and implemented on an OS that gave this exact
> > > guarantee for kernel allocations back in the early 1990s.  Memory
> > > allocation simply blocked until it succeeded unless the caller
> > > indicated they could handle failure. That's what __GFP_NOFAIL does
> > > and XFS is still heavily dependent on this behaviour.
> > 
> > I'm not saying we should get rid of __GFP_NOFAIL - actually, I'd say
> > let's remove the underscores and get rid of the silly two page limit.
> > GFP_NOFAIL|GFP_KERNEL is perfectly safe for larger allocations, as long
> > as you don't mind possibly waiting a bit.
> > 
> > But it can't be the default because, like I mentioned to Neil, there are
> > a _lot_ of different places where we allocate memory in the kernel, and
> > they have to be able to fail instead of shoving everything else out of
> > memory.
> > 
> > > This is the sort of thing I was thinking of in the "remove
> > > GFP_NOFS" discussion thread when I said this to Kent:
> > > 
> > > 	"We need to start designing our code in a way that doesn't require
> > > 	extensive testing to validate it as correct. If the only way to
> > > 	validate new code is correct is via stochastic coverage via error
> > > 	injection, then that is a clear sign we've made poor design choices
> > > 	along the way."
> > > 
> > > https://lore.kernel.org/linux-fsdevel/ZcqWh3OyMGjEsdPz@dread.disaster.area/
> > > 
> > > If memory allocation doesn't fail by default, then we can remove the
> > > vast majority of allocation error handling from the kernel. Make the
> > > common case just work - remove the need for all that code to handle
> > > failures that are hard to exercise reliably and so are rarely tested.
> > > 
> > > A simple change to make long standing behaviour an actual policy we
> > > can rely on means we can remove both code and test matrix overhead -
> > > it's a win-win IMO.
> > 
> > We definitely don't want to make GFP_NOIO/GFP_NOFS allocations nofail by
> > default - a great many of those allocations have mempools in front of
> > them to avoid deadlocks, and if you do that you've made the mempools
> > useless.
> > 
> 
> Not strictly true.  mempool_alloc() adds __GFP_NORETRY so the allocation
> will certainly fail if that is appropriate.

*nod* 

> I suspect that most places where there is a non-error fallback already
> use NORETRY or RETRY_MAYFAIL or similar.

NORETRY and RETRY_MAYFAIL actually weren't on my radar, and I don't see
_tons_ of uses for either of them - more for NORETRY.

My go-to is NOWAIT in this scenario though; my common pattern is "try
nonblocking with locks held, then drop locks and retry GFP_KERNEL".
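
To make that pattern concrete, here is a minimal userspace sketch. It is
not kernel code: toy_alloc(), TOY_NOWAIT, TOY_KERNEL, toy_pressure and
struct toy_cache are all hypothetical stand-ins, with a pthread mutex
playing the role of the lock and a flag simulating memory pressure. The
point is only the control flow: try without blocking while locked, and on
failure unlock, allocate with blocking allowed, relock and recheck.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-ins for GFP_NOWAIT / GFP_KERNEL; not the kernel's definitions. */
enum toy_gfp { TOY_NOWAIT, TOY_KERNEL };

/* Set to true to simulate memory pressure. */
static bool toy_pressure;

/*
 * Hypothetical allocator: under pressure a non-blocking request fails,
 * while a blocking request "reclaims" (clears the pressure) and succeeds.
 */
static void *toy_alloc(size_t size, enum toy_gfp gfp)
{
	assert(size > 0);
	if (toy_pressure) {
		if (gfp == TOY_NOWAIT)
			return NULL;
		toy_pressure = false;	/* pretend we slept in reclaim */
	}
	return malloc(size);
}

struct toy_cache {
	pthread_mutex_t lock;
	void *slot;		/* protected by lock */
	unsigned int retries;	/* times we took the blocking fallback */
};

/* Fill c->slot if it is empty. Returns 0 on success, -1 on failure. */
static int toy_cache_fill(struct toy_cache *c, size_t size)
{
	void *new;
	int ret = 0;

	pthread_mutex_lock(&c->lock);
	if (c->slot)
		goto out;

	/* Fast path: the lock is held, so the allocation must not block. */
	new = toy_alloc(size, TOY_NOWAIT);
	if (!new) {
		/* Slow path: drop the lock so that blocking is safe. */
		pthread_mutex_unlock(&c->lock);
		new = toy_alloc(size, TOY_KERNEL);
		pthread_mutex_lock(&c->lock);
		c->retries++;

		if (!new) {
			ret = -1;
			goto out;
		}
		if (c->slot) {
			/* Someone else filled the slot while we were unlocked. */
			free(new);
			goto out;
		}
	}
	c->slot = new;
out:
	pthread_mutex_unlock(&c->lock);
	return ret;
}
```

The recheck after relocking is the part that is easy to forget: anything
the lock protects may have changed while it was dropped.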
 
> But I agree that changing the meaning of GFP_KERNEL has a potential to
> cause problems.  I support promoting "GFP_NOFAIL" which should work at
> least up to PAGE_ALLOC_COSTLY_ORDER (8 pages).

I'd support this change.

> I'm unsure how it should behave in PF_MEMALLOC_NOFS and
> PF_MEMALLOC_NOIO context.  I suspect Dave would tell me it should work in
> these contexts, in which case I'm sure it should.
> 
> Maybe we could then deprecate GFP_KERNEL.

What do you have in mind?

Deprecating GFP_NOFS and GFP_NOIO would be wonderful - those should
really just be PF_MEMALLOC_NOFS and PF_MEMALLOC_NOIO, now that we're
pushing for memalloc_flags_(save|restore) more.

Getting rid of those would be a really nice cleanup because then gfp
flags would mostly just be:
 - the type of memory to allocate (highmem, zeroed, etc.)
 - how hard to try (don't block at all, block some, block forever)
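
As a rough illustration of why the scoped flags can replace the gfp
variants, here is a toy userspace model. Everything prefixed toy_/TOY_ is
hypothetical and the bit values are made up; it only models the idea that
a per-task scope narrows whatever mask the caller passes, so call sites
can keep asking for the plain "kernel" mask.

```c
#include <assert.h>

/* Toy gfp bits; the values are illustrative, not the kernel's. */
#define TOY_GFP_IO	0x1u	/* allocation may start I/O */
#define TOY_GFP_FS	0x2u	/* allocation may call into the filesystem */
#define TOY_GFP_KERNEL	(TOY_GFP_IO | TOY_GFP_FS)

/* Toy per-task scope bits, standing in for PF_MEMALLOC_NOIO/NOFS. */
#define TOY_PF_NOIO	0x1u
#define TOY_PF_NOFS	0x2u

/* Per-thread scope state, standing in for current->flags. */
static _Thread_local unsigned int toy_task_flags;

/* Enter a scope. Returns only the bits this call newly set, so scopes nest. */
static unsigned int toy_memalloc_flags_save(unsigned int flags)
{
	unsigned int newly_set = ~toy_task_flags & flags;

	toy_task_flags |= flags;
	return newly_set;
}

/* Leave a scope by clearing exactly the bits the matching save set. */
static void toy_memalloc_flags_restore(unsigned int newly_set)
{
	/* The bits being cleared must currently be set. */
	assert((newly_set & ~toy_task_flags) == 0);
	toy_task_flags &= ~newly_set;
}

/* Narrow the caller's mask according to the scope the thread is in. */
static unsigned int toy_effective_gfp(unsigned int gfp)
{
	if (toy_task_flags & TOY_PF_NOIO)
		gfp &= ~(TOY_GFP_IO | TOY_GFP_FS);
	else if (toy_task_flags & TOY_PF_NOFS)
		gfp &= ~TOY_GFP_FS;
	return gfp;
}
```

In this model a nested save of a bit that is already set returns 0, so
the inner restore leaves the outer scope intact.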
