Re: [LSF/MM TOPIC] Making pseudo file systems inodes/dentries more like normal file systems

From: Christian Brauner <brauner@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>,
	 James Bottomley <James.Bottomley@hansenpartnership.com>,
	Amir Goldstein <amir73il@gmail.com>,
	 Steven Rostedt <rostedt@goodmis.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	 lsf-pc@lists.linux-foundation.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	 Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [LSF/MM TOPIC] Making pseudo file systems inodes/dentries more like normal file systems
Date: Mon, 29 Jan 2024 16:08:33 +0100
Message-ID: <20240129-umrechnen-kaiman-cb591bc22fc5@brauner>
In-Reply-To: <CAHk-=whXg6zAHWZ7f+CdOg5GOMffR3RSDVyvORTZhipxp5iAFQ@mail.gmail.com>

> But no. You should *not* look at a virtual filesystem as a guide how
> to write a filesystem, or how to use the VFS. Look at a real FS. A
> simple one, and preferably one that is built from the ground up to
> look like a POSIX one, so that you don't end up getting confused by
> all the nasty hacks to make it all look ok.
> 
> IOW, while FAT is a simple filesystem, don't look at that one, just
> because then you end up with all the complications that come from
> decades of non-UNIX filesystem history.
> 
> I'd say "look at minix or sysv filesystems", except those may be
> simple but they also end up being so legacy that they aren't good
> examples. You shouldn't use buffer-heads for anything new. But they
> are still probably good examples for one thing: if you want to
> understand the real power of dentries, look at either of the minix or
> sysv 'namei.c' files. Just *look* at how simple they are. Ignore the
> internal implementation of how a directory entry is then looked up on
> disk - because that's obviously filesystem-specific - and instead just
> look at the interface.

I agree and I have to say I'm getting annoyed with this thread.

And I want to fundamentally oppose the notion that it's too difficult to
write a virtual filesystem. Just look at how many virtual filesystems
we already have and how many are being proposed. A recent example is
that KVM wanted to implement restricted memory as a stacking layer on
top of tmpfs, which I luckily caught early and told them not to do.

If anything, a surprising number of people who have nothing to do with
filesystems manage to write filesystem drivers quickly and propose them
upstream. And I would hope people take a couple of months to write a
decently sized or complex (virtual) filesystem.

And virtual filesystems specifically often aren't alike at all. That's
got nothing to do with the VFS abstractions. It's simply because a
virtual filesystem is often used when developers think they want a
filesystem-like userspace interface but don't want all of the actual
filesystem semantics that come with it. So they all differ from each
other in what functionality they actually implement.

And I somewhat oppose the notion that the VFS isn't documented. We do
have extensive documentation for locking rules, and a constantly updated
changelog of fundamental changes to all VFS APIs and the expectations
around them, including very intricate details for the reader who really
needs to know everything. I wrote a whole document just on permission
checking and idmappings when we added that to the VFS, covering both
implementation and theoretical background.

And stuff like overlayfs or shiftfs is a completely separate story
because they're even more special: they're (virtual) stacking
filesystems that challenge the VFS in far more radical ways than regular
virtual filesystems.

And I think (may Amir forgive me) that stacking filesystems are
generally an absolutely terrible idea, as they complicate the VFS
massively and put us through an insane amount of pain. One just needs to
look at how much additional VFS machinery we have because of them and
how complicated our callchains can become as a result. It's just not
correct to even compare them to a boring virtual filesystem like
binderfs or bpffs.
