Re: [LSF/MM/BPF TOPIC] Filesystem inode reclaim

From: Andreas Dilger <adilger@dilger.ca>
To: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	Matthew Wilcox <willy@infradead.org>,
	lsf-pc@lists.linux-foundation.org
Subject: Re: [LSF/MM/BPF TOPIC] Filesystem inode reclaim
Date: Fri, 10 Apr 2026 15:14:04 -0600
Message-ID: <1BDB4B6F-D8B8-4FCA-8B3C-FA0672108C75@dilger.ca>
In-Reply-To: <4q7d2bi2qjg6crznvr55yfnv2gcomfqdt5j2dgkrwp5hh3ynqo@cfgy5o53zjwr>



> On Apr 10, 2026, at 14:56, Jan Kara <jack@suse.cz> wrote:
> 
> On Fri 10-04-26 00:19:43, Christoph Hellwig wrote:
>> I think a patch is more useful than a discussion here, that idea has been
>> voiced multiple times, and effectively implemented in XFS.
> 
> I know but after thinking for some time I wanted to get some feedback
> before I start coding.
> 
>> Trying to lift the XFS logic into the VFS and finding other consumers
>> for it would be very helpful.
> 
> I hope not to get all the complexity of XFS but we'll see :)
> 
>>> 1) Filesystems will be required to mark inodes that have non-trivial
>>> cleanup work to do on reclaim with an inode flag I_RECLAIM_HARD (or
>>> whatever :)). Usually I expect this to happen on first inode modification
>>> or so. This will require some per-fs work but it shouldn't be that
>>> difficult and filesystems can be adapted one-by-one as they decide to
>>> address these warnings from reclaim.
>> 
>> I think otherwise we call this dirty :)
> 
> Yup :) I was considering for a while to use another kind of dirty flag for
> this and then clean it from flush worker but in the end I decided against
> it as it would be IMHO confusing.
> 
>>> There's also a simpler approach to this problem but with more radical
>>> changes to behavior. For example getting rid of inode LRU completely -
>>> inodes without dentries referencing them anymore should be rare and it
>>> isn't very useful to cache them. So we can always drop inodes on last
>>> iput() (as we currently do for example for unlinked inodes). But I have a
>>> nagging feeling that somebody is depending on inode LRU somewhere - I'd
>>> like to poll the collective knowledge of what could possibly go wrong here :)
>> 
>> I've heard this theory multiple times, but we really need to validate that
>> we don't need the LRU.  It also doesn't really solve the above problem,
>> as we still would not want to perform the expensive inode inactivation
>> work inline with the last dput.
>> 
>> So while this might be worth investigating, please keep it separate.
> 
> Ack. With the point Jeff made about NFS revalidations I agree it won't be
> straightforward.

Can this be opt-in, flagging an inode with `I_KEEP_UNREFERENCED` so that it
is not reaped immediately when NFS does iput() on it (or even setting the
flag on iget() by NFS for that matter, in case there are multiple users)?
And a sysfs parameter that makes this optional for other filesystems
(defaults to off)?
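
To make the suggestion concrete, here is a rough userspace model of the
decision on the last iput(). It is only a sketch: `I_KEEP_UNREFERENCED` is
the hypothetical flag named above, while `struct model_inode`,
`keep_unreferenced_default`, `should_cache_on_last_iput()` and
`model_iput()` are made-up names for illustration, not existing kernel
symbols. It also assumes one reading of the sysfs parameter, namely that
turning it on keeps unreferenced inodes for every filesystem.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit named after the proposal; not an existing kernel flag. */
#define I_KEEP_UNREFERENCED (1u << 0)

/* Simplified stand-in for struct inode; the fields are illustrative only. */
struct model_inode {
	unsigned int flags;
	unsigned int refcount;
	bool on_lru;	/* kept cached after the last reference went away */
	bool evicted;	/* torn down immediately on the last iput() */
};

/*
 * Stand-in for the proposed sysfs parameter (assumed semantics): when true,
 * every inode is kept on the LRU as today. Defaults to off.
 */
static bool keep_unreferenced_default = false;

/* An inode is cached only if it opted in or the global knob is set. */
static bool should_cache_on_last_iput(const struct model_inode *inode)
{
	return keep_unreferenced_default ||
	       (inode->flags & I_KEEP_UNREFERENCED);
}

/* Model of iput(): drop one reference, decide the inode's fate on the last one. */
static void model_iput(struct model_inode *inode)
{
	assert(inode->refcount > 0);

	if (--inode->refcount > 0)
		return;

	if (should_cache_on_last_iput(inode))
		inode->on_lru = true;
	else
		inode->evicted = true;
}
```

In this model an NFS inode would have the flag set (say, at iget() time) and
end up on the LRU after the last model_iput(), while an unflagged inode is
evicted right away unless the global knob is turned on.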

That way you could float a trial balloon of an LRU-less kernel, but leave
an escape hatch/debug mechanism if this turns out to kill some workloads.
It would take some time (a few years) to get feedback, but this and the
negative dcache growth have also been discussed for that long without
forward progress.  Having a runtime parameter with the intent to make it
permanent in the future at least moves the needle.

Cheers, Andreas

Thread overview: 15+ messages
2026-04-09  9:16 Jan Kara
2026-04-09 12:57 ` [Lsf-pc] " Amir Goldstein
2026-04-09 16:48   ` Boris Burkov
2026-04-10 10:00     ` Jan Kara
2026-04-10 11:08     ` Christoph Hellwig
2026-04-10 13:58       ` Jan Kara
2026-04-10  9:54   ` Jan Kara
2026-04-09 16:12 ` Darrick J. Wong
2026-04-09 17:37   ` Jeff Layton
2026-04-10  9:43     ` Jan Kara
2026-04-10  7:19 ` Christoph Hellwig
2026-04-10 20:56   ` Jan Kara
2026-04-10 21:14     ` Andreas Dilger [this message]
2026-04-10  9:23 ` Christian Brauner
2026-04-10 10:14   ` Jan Kara
