From: Dave Chinner <david@fromorbit.com>
To: Guenter Roeck <linux@roeck-us.net>
Cc: Christoph Hellwig <hch@infradead.org>,
ksummit@lists.linux.dev, linux-fsdevel@vger.kernel.org
Subject: Re: [MAINTAINERS/KERNEL SUMMIT] Trust and maintenance of file systems
Date: Thu, 7 Sep 2023 08:54:38 +1000
Message-ID: <ZPkDLp0jyteubQhh@dread.disaster.area>
In-Reply-To: <8718a8a3-1e62-0e2b-09d0-7bce3155b045@roeck-us.net>

On Wed, Sep 06, 2023 at 03:32:28PM -0700, Guenter Roeck wrote:
> On 8/30/23 07:07, Christoph Hellwig wrote:
> > Hi all,
> >
> > we have a lot of on-disk file system drivers in Linux, which I consider
> > a good thing as it allows a lot of interoperability. At the same time
> > maintaining them is a burden, and there is a lot of expectation on how
> > they are maintained.
> >
> > Part 1: untrusted file systems
> >
> > There has been a lot of syzbot fuzzing using generated file system
> > images, which I again consider a very good thing as syzbot is good
> > at finding bugs. Unfortunately it also finds a lot of bugs that no
> > one is interested in fixing. The reason for that is that file system
> > maintainers only consider a tiny subset of the file system drivers,
> > and for some of them a subset of the format options to be trusted vs
> > untrusted input. It thus is not just a waste of time for syzbot itself,
> > but even more so for the maintainers to report fuzzing bugs in other
> > implementations.
> >
> > What can we do to only mark certain file systems (and format options)
> > as trusted on untrusted input and remove a lot of the current tension
> > and make everyone work more efficiently? Note that this isn't even
> > getting into really trusted on-disk formats, which is a security
> > discussion on its own, but just into formats where the maintainers
> > are interested in dealing with fuzzed images.
> >
> > Part 2: unmaintained file systems
> >
> > A lot of our file system drivers are either de facto or formally
> > unmaintained. If we want to move the kernel forward by finishing
> > API transitions (new mount API, buffer_head removal for the I/O path,
> > ->writepage removal, etc) these file systems need to change as well
> > and need some kind of testing. The easiest way forward would be
> > to remove everything that is not fully maintained, but that would
> > remove a lot of useful features.
> >
> > E.g. the hfsplus driver is unmaintained despite collecting odd fixes.
> > It collects odd fixes because it is really useful for interoperating
> > with MacOS and it would be a pity to remove it. At the same time
> > it is impossible to test changes to hfsplus sanely as there is no
> > mkfs.hfsplus or fsck.hfsplus available for Linux. We used to have
> > one that was ported from the open source Darwin code drops, and
> > I managed to get xfstests to run on hfsplus with them, but this
> > old version doesn't compile on any modern Linux distribution and
> > new versions of the code aren't trivially portable to Linux.
> >
> > Do we have volunteers with old enough distros that we can list as
> > testers for this code? Do we have any other way to proceed?
> >
> > If we don't, are we just going to make untested API changes to these
> > code bases, or keep the old APIs around forever?
> >
>
> In this context, it might be worthwhile trying to determine if and when
> to call a file system broken.
>
> Case in point: After this e-mail, I tried playing with a few file systems.
> The most interesting exercise was with ntfsv3.
> Create it, mount it, copy a few files onto it, remove some of them, repeat.
> A script doing that only takes a few seconds to corrupt the file system.
> Trying to unmount it with the current upstream typically results in
> a backtrace and/or crash.
>
> Does that warrant marking it as BROKEN ? If not, what does ?

There's a bigger policy question around that.

I think that if we are going to have filesystems be "community
maintained" because they have no explicit maintainer, we need some
kind of standard policy to be applied.

I'd argue that the filesystem needs, at minimum, a working mkfs and
fsck implementation, and that it is supported by fstests so anyone
changing core infrastructure can simply run fstests against the
filesystem to smoke test the infrastructure changes they are making.

I'd suggest that syzbot coverage of such filesystems is not desired,
because nobody is going to be fixing problems related to on-disk
format verification. All we really care about is that a user can
read and write to the filesystem without trashing anything.

I'd also suggest that we mark filesystem support state via fstype
flags rather than config options. That way we aren't reliant on
distros setting config options correctly to include/indicate the
state of the filesystem implementation. We could also use similar
flags for indicating deprecation and obsolete state (i.e. pending
removal) and have code in the high level mount path issue the
relevant warnings.

This method of marking would also allow us to document and implement
a formal policy for removal of unmaintained and/or obsolete
filesystems without having to be dependent on distros juggling
config variables to allow users to continue using deprecated, broken
and/or obsolete filesystem implementations right up to the point
where they are removed from the kernel.

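[Editorial sketch, not part of the original message.] The fstype-flag
idea above could look roughly like the following user-space model. It
is only an illustration under stated assumptions: the FS_STATE_* flag
names, struct fs_type and both helper functions are hypothetical and
do not exist in the kernel; a real implementation would presumably
hang such bits off struct file_system_type and warn via the kernel
log rather than stderr.

```c
#include <stdio.h>

/* Hypothetical support-state flags; none of these exist in the kernel. */
#define FS_STATE_UNMAINTAINED  (1u << 0)  /* no explicit maintainer */
#define FS_STATE_DEPRECATED    (1u << 1)  /* use is discouraged */
#define FS_STATE_OBSOLETE      (1u << 2)  /* pending removal */

/* Simplified, hypothetical stand-in for struct file_system_type. */
struct fs_type {
	const char *name;
	unsigned int state_flags;
};

/*
 * Return a description of the most severe support state set on this
 * filesystem type, or NULL if no state flag is set.
 */
static const char *fs_state_warning(const struct fs_type *fs)
{
	if (fs->state_flags & FS_STATE_OBSOLETE)
		return "obsolete and pending removal";
	if (fs->state_flags & FS_STATE_DEPRECATED)
		return "deprecated";
	if (fs->state_flags & FS_STATE_UNMAINTAINED)
		return "community maintained, no explicit maintainer";
	return NULL;
}

/*
 * Model of a check in the high level mount path: print a warning if
 * the filesystem type carries a support-state flag. Returns 1 if a
 * warning was issued, 0 otherwise. The mount itself is not refused.
 */
static int mount_check(const struct fs_type *fs)
{
	const char *state = fs_state_warning(fs);

	if (!state)
		return 0;
	fprintf(stderr, "warning: %s filesystem is %s\n", fs->name, state);
	return 1;
}
```

Because the state lives in the filesystem type rather than in a config
option, the warning in this model fires at mount time regardless of
how the kernel was configured, which is the property argued for above.
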
And let's not forget: removing a filesystem from the kernel is not
removing end user support for extracting data from old filesystems.
We have VMs for that - we can run pretty much any kernel ever built
inside a VM, so users that need to extract data from a really old
filesystem we no longer support in a modern kernel can simply boot
up an old distro that did support it and extract the data that way.

We need to get away from the idea that we have to support old
filesystems forever because someone, somewhere might have an old
disk on the shelf with that filesystem on it and they might plug it
in one day. If that day ever happens, they can go to the effort of
booting an era-relevant distro in a VM to extract that data. It
makes no sense to put an ongoing burden on current development to
support this sort of rare, niche use case....

-Dave.
--
Dave Chinner
david@fromorbit.com