From: Dave Chinner <david@fromorbit.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, "Theodore Y. Ts'o" <tytso@mit.edu>,
	Chris Mason <clm@fb.com>, Josef Bacik <josef@toxicpanda.com>,
	Luis Chamberlain <mcgrof@kernel.org>
Subject: Re: [LSF/MM/BPF Topic] Filesystem reclaim & memory allocation BOF
Date: Thu, 27 Mar 2025 08:48:49 +1100
Message-ID: <Z-R2QcpHxwetMp5v@dread.disaster.area>
In-Reply-To: <Z-QcUwDHHfAXl9mK@casper.infradead.org>

On Wed, Mar 26, 2025 at 03:25:07PM +0000, Matthew Wilcox wrote:
> 
> We've got three reports now (two are syzkaller kiddie stuff, but one's a
> real workload) of a warning in the page allocator from filesystems
> doing reclaim.  Essentially they're using GFP_NOFAIL from reclaim
> context.  This got me thinking about bs>PS and I realised that if we fix
> this, then we're going to end up trying to do high order GFP_NOFAIL allocations
> in the memory reclaim path, and that is really no bueno.
> 
> https://lore.kernel.org/linux-mm/20250326105914.3803197-1-matt@readmodwrite.com/

Anything that does IO or blocking memory allocation from evict()
context is a deadlock vector. Such operations also cause unpredictable
memory allocation latency as direct reclaim can get stuck on them.

The case that was brought up here is overlay dropping the last
reference to an inode from dentry cache reclaim, and that inode
having evict() run on it.

The filesystems then make journal reservations (which can block
waiting on IO), allocate memory (which can block waiting on IO
and/or direct memory reclaim stalling), do IO directly from that
context, etc.

Memory reclaim is supposed to be a non-blocking operation, so inode
reclaim really needs to avoid blocking or doing complex stuff that
requires memory allocation or IO in the direct evict() path.

Indeed, people spent -years- complaining that XFS did IO from
evict() context from direct memory reclaim because this caused
unacceptable memory allocation latency variations. It required
significant architectural changes to XFS inode journalling and
writeback to avoid blocking RMW IO during inode reclaim. It's also
one of the driving reasons for XFS aggressively pushing *any*
XFS-specific inode reclaim work that could block to background
inodegc workers that run after ->destroy_inode has removed the inode
from VFS visibility.
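
To make the shape of that deferral concrete, here is a minimal
userspace sketch. It is not XFS code: struct demo_inode, struct
demo_gc_queue, demo_destroy_inode() and demo_gc_worker() are all
hypothetical names, and the sketch is single-threaded, so it leaves
out the locking or lockless list a real implementation would need.
The only point is the split between a reclaim-side step that cannot
block and a worker-side step that may.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a filesystem inode; not a kernel structure. */
struct demo_inode {
	int ino;
	bool visible;             /* still reachable through the "VFS" */
	bool cleaned;             /* blocking cleanup has been run */
	struct demo_inode *next;  /* link in the deferred-cleanup queue */
};

/* Hypothetical queue of inodes awaiting background cleanup. */
struct demo_gc_queue {
	struct demo_inode *head;
	size_t pending;
};

/*
 * Reclaim-side step: hide the inode and queue it. No allocation, no IO,
 * nothing that can block, so it is safe to call from a reclaim-like path.
 */
static void demo_destroy_inode(struct demo_gc_queue *q, struct demo_inode *ip)
{
	ip->visible = false;
	ip->next = q->head;
	q->head = ip;
	q->pending++;
}

/*
 * Worker-side step: drain the queue. Setting ->cleaned stands in for the
 * work that may block (journal reservations, memory allocation, IO).
 * Returns the number of inodes processed.
 */
static size_t demo_gc_worker(struct demo_gc_queue *q)
{
	struct demo_inode *ip = q->head;
	size_t done = 0;

	q->head = NULL;
	while (ip) {
		struct demo_inode *next = ip->next;

		ip->next = NULL;
		ip->cleaned = true;
		ip = next;
		done++;
	}
	q->pending -= done;
	return done;
}
```

In this sketch the caller of demo_destroy_inode() pays only for a
list insertion; the cost of the cleanup is moved to whoever runs
demo_gc_worker() later.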

As I understand it, Josef's recent inode reference counting changes
will help with this, allowing the filesystem to hold a passive
reference to the inode whilst it gets pushed to a background
context where the fs-specific cleanup code is allowed to block. This
is probably the direction we need to head to solve this problem in a
generic manner....
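
As a rough illustration of the passive reference idea, and only as
I have described it above, here is a small userspace sketch. It is
not taken from Josef's patches: struct demo_obj and the demo_*()
helpers are hypothetical, the counters are plain ints rather than
atomics, and the convention that the active users collectively hold
one passive reference is an assumption made for the example.

```c
#include <stdbool.h>

/* Hypothetical object with two reference counts; not a kernel structure. */
struct demo_obj {
	int active;    /* users that need the object fully live */
	int passive;   /* holders that only need the memory to stay valid */
	bool evicted;  /* removed from visibility, cleanup may be pending */
	bool freed;    /* memory released (simulated) */
};

/* Start with one active user, which implies one passive reference. */
static void demo_init(struct demo_obj *o)
{
	o->active = 1;
	o->passive = 1;
	o->evicted = false;
	o->freed = false;
}

static void demo_get_passive(struct demo_obj *o)
{
	o->passive++;
}

/* The memory goes away only when the last passive reference is dropped. */
static void demo_put_passive(struct demo_obj *o)
{
	if (--o->passive == 0)
		o->freed = true;
}

/*
 * Dropping the last active reference evicts the object without blocking,
 * then drops the passive reference held on behalf of the active users.
 */
static void demo_put_active(struct demo_obj *o)
{
	if (--o->active == 0) {
		o->evicted = true;
		demo_put_passive(o);
	}
}
```

A filesystem following this pattern would call demo_get_passive()
before queueing the object for background cleanup, so the final
demo_put_active() from reclaim context evicts it but does not free
it. The background worker calls demo_put_passive() once its blocking
cleanup has finished.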

-Dave.
-- 
Dave Chinner
david@fromorbit.com