From: Jan Kara <jack@suse.cz>
To: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>, Jeff Layton <jlayton@kernel.org>,
Bruno Haible <bruno@clisp.org>,
Xi Ruoyao <xry111@linuxfromscratch.org>,
bug-gnulib@gnu.org, Alexander Viro <viro@zeniv.linux.org.uk>,
Eric Van Hensbergen <ericvh@kernel.org>,
Latchesar Ionkov <lucho@ionkov.net>,
Dominique Martinet <asmadeus@codewreck.org>,
Christian Schoenebeck <linux_oss@crudebyte.com>,
David Howells <dhowells@redhat.com>,
Marc Dionne <marc.dionne@auristor.com>, Chris Mason <clm@fb.com>,
Josef Bacik <josef@toxicpanda.com>,
David Sterba <dsterba@suse.com>, Xiubo Li <xiubli@redhat.com>,
Ilya Dryomov <idryomov@gmail.com>,
Jan Harkes <jaharkes@cs.cmu.edu>,
coda@cs.cmu.edu, Tyler Hicks <code@tyhicks.com>,
Gao Xiang <xiang@kernel.org>, Chao Yu <chao@kernel.org>,
Yue Hu <huyue2@coolpad.com>,
Jeffle Xu <jefflexu@linux.alibaba.com>,
Namjae Jeon <linkinjeon@kernel.org>,
Sungjong Seo <sj1557.seo@samsung.com>, Jan Kara <jack@suse.com>,
Theodore Ts'o <tytso@mit.edu>,
Andreas Dilger <adilger.kernel@dilger.ca>,
Jaegeuk Kim <jaegeuk@kernel.org>,
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>,
Miklos Szeredi <miklos@szeredi.hu>,
Bob Peterson <rpeterso@redhat.com>,
Andreas Gruenbacher <agruenba@redhat.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Tejun Heo <tj@kernel.org>,
Trond Myklebust <trond.myklebust@hammerspace.com>,
Anna Schumaker <anna@kernel.org>,
Konstantin Komarov <almaz.alexandrovich@paragon-software.com>,
Mark Fasheh <mark@fasheh.com>, Joel Becker <jlbec@evilplan.org>,
Joseph Qi <joseph.qi@linux.alibaba.com>,
Mike Marshall <hubcap@omnibond.com>,
Martin Brandenburg <martin@omnibond.com>,
Luis Chamberlain <mcgrof@kernel.org>,
Kees Cook <keescook@chromium.org>,
Iurii Zaikin <yzaikin@google.com>,
Steve French <sfrench@samba.org>,
Paulo Alcantara <pc@manguebit.com>,
Ronnie Sahlberg <ronniesahlberg@gmail.com>,
Shyam Prasad N <sprasad@microsoft.com>,
Tom Talpey <tom@talpey.com>,
Sergey Senozhatsky <senozhatsky@chromium.org>,
Richard Weinberger <richard@nod.at>,
Hans de Goede <hdegoede@redhat.com>,
Hugh Dickins <hughd@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Amir Goldstein <l@gmail.com>,
"Darrick J. Wong" <djwong@kernel.org>,
Benjamin Coddington <bcodding@redhat.com>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
v9fs@lists.linux.dev, linux-afs@lists.infradead.org,
linux-btrfs@vger.kernel.org, ceph-devel@vger.kernel.org,
codalist@coda.cs.cmu.edu, ecryptfs@vger.kernel.org,
linux-erofs@lists.ozlabs.org, linux-ext4@vger.kernel.org,
linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com,
linux-nfs@vger.kernel.org, ntfs3@lists.linux.dev,
ocfs2-devel@lists.linux.dev, devel@lists.orangefs.org,
linux-cifs@vger.kernel.org, samba-technical@lists.samba.org,
linux-mtd@lists.infradead.org, linux-mm@kvack.org,
linux-unionfs@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH v7 12/13] ext4: switch to multigrain timestamps
Date: Wed, 20 Sep 2023 15:03:43 +0200
Message-ID: <20230920130343.qs2kuzngoomy4s3r@quack3>
In-Reply-To: <20230920-kaulquappen-computer-0a4a0e4c3c71@brauner>

On Wed 20-09-23 12:30:52, Christian Brauner wrote:
> On Wed, Sep 20, 2023 at 12:17:31PM +0200, Jan Kara wrote:
> > On Wed 20-09-23 10:41:30, Christian Brauner wrote:
> > > > > f1 was last written to *after* f2 was last written to. If the timestamp of f1
> > > > > is then lower than the timestamp of f2, timestamps are fundamentally broken.
> > > > >
> > > > > Many things in user-space depend on timestamps, such as build systems
> > > > > centered around 'make', but also 'find ... -newer ...'.
> > > > >
> > > >
> > > >
> > > > What does breakage with make look like in this situation? The "fuzz"
> > > > here is going to be on the order of a jiffy. The typical case for make
> > > > timestamp comparisons is comparing source files vs. a build target. If
> > > > those are being written nearly simultaneously, then that could be an
> > > > issue, but is that a typical behavior? It seems like it would be hard to
> > > > rely on that anyway, esp. given filesystems like NFS that can do lazy
> > > > writeback.
> > > >
> > > > One of the operating principles with this series is that timestamps can
> > > > be of varying granularity between different files. Note that Linux
> > > > already violates this assumption when you're working across filesystems
> > > > of different types.
> > > >
> > > > As to potential fixes if this is a real problem:
> > > >
> > > > I don't really want to put this behind a mount or mkfs option (a'la
> > > > relatime, etc.), but that is one possibility.
> > > >
> > > > I wonder if it would be feasible to just advance the coarse-grained
> > > > current_time whenever we end up updating a ctime with a fine-grained
> > > > timestamp? It might produce some inode write amplification. Files that
> > >
> > > Less than ideal imho.
> > >
> > > If this risks breaking existing workloads by enabling it unconditionally
> > > and there isn't a clear way to detect and handle these situations
> > > without risk of regression then we should move this behind a mount
> > > option.
> > >
> > > So how about the following:
> > >
> > > From cb14add421967f6e374eb77c36cc4a0526b10d17 Mon Sep 17 00:00:00 2001
> > > From: Christian Brauner <brauner@kernel.org>
> > > Date: Wed, 20 Sep 2023 10:00:08 +0200
> > > Subject: [PATCH] vfs: move multi-grain timestamps behind a mount option
> > >
> > > While we initially thought we can do this unconditionally it turns out
> > > that this might break existing workloads that rely on timestamps in very
> > > specific ways and we always knew this was a possibility. Move
> > > multi-grain timestamps behind a vfs mount option.
> > >
> > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> >
> > Surely this is a safe choice as it moves the responsibility to the sysadmin
> > and the cases where finegrained timestamps are required. But I kind of
> > wonder how is the sysadmin going to decide whether mgtime is safe for his
> > system or not? Because the possible breakage needn't be obvious at the
> > first sight... If I were a sysadmin, I'd rather opt for something like
>
> I think you'll basically enable this because you want to export a
> filesystem via NFS.

OK, that's what I thought, but then you have to make a tough choice between:

1) Possibly inconsistent NFS caches on frequent changes.
2) Possibly broken builds on NFS.

Pick your poison ;)

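To make the "broken builds" case concrete, here is a minimal user-space model of the ordering inversion discussed in this thread. It is only a sketch: the 4 ms tick (HZ=250), the helper names coarse_time() and stamp(), and the chosen instants are assumptions made up for illustration, not kernel code.

```python
# Toy model of the timestamp ordering inversion; not kernel code.
# Assumption: the coarse clock advances once per 4 ms tick (HZ=250).
TICK_NS = 4_000_000


def coarse_time(now_ns: int) -> int:
    """Coarse clock: the time of the most recent tick."""
    return now_ns - now_ns % TICK_NS


def stamp(now_ns: int, fine: bool) -> int:
    """Timestamp that a write happening at now_ns would record.

    fine=True models an inode whose timestamp was queried since its last
    update, so it gets a fine-grained value; otherwise the coarse clock
    is used.
    """
    return now_ns if fine else coarse_time(now_ns)


# f2 is written first, 1 ms into a tick, and gets a fine-grained timestamp.
f2_mtime = stamp(8_000_000 + 1_000_000, fine=True)
# f1 is written 2 ms later, in the same tick, and gets a coarse timestamp.
f1_mtime = stamp(8_000_000 + 3_000_000, fine=False)

# f1 was written after f2, yet its recorded timestamp is older.
print("f1 looks older than f2:", f1_mtime < f2_mtime)
```

A tool such as make or 'find -newer' that compares these two values would conclude that f1 is older than f2, which is the breakage reported earlier in the thread.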
> > finegrained timestamps + lazytime (if I needed the finegrained timestamps
> > functionality). That should avoid the IO overhead of finegrained timestamps
>
> That would work with this patch, no? Or are you saying it would need
> something else?

Sorry, I was not really precise here. What I meant was that instead of
having multigrain timestamps, I (as a sysadmin) would want the filesystem
to set sb->s_time_gran to 1 ns and use lazytime to remove the IO overhead
of the frequent timestamp updates. But that is just me brainstorming
possible solutions to the original NFS problem.

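For readers unfamiliar with what a time granularity means here, the following is a rough user-space sketch of truncating a timestamp to a granularity given in nanoseconds. It is loosely modelled on what the kernel does with sb->s_time_gran; the function name truncate_to_gran() and the sample values are assumptions for illustration only.

```python
# Sketch of truncating a (sec, nsec) timestamp to a granularity in ns.
# truncate_to_gran() is a hypothetical helper, not a kernel interface.
NSEC_PER_SEC = 1_000_000_000


def truncate_to_gran(sec: int, nsec: int, gran_ns: int) -> tuple[int, int]:
    """Round the timestamp down to a multiple of gran_ns."""
    if gran_ns == 1:
        # 1 ns granularity: nothing is lost.
        return sec, nsec
    if gran_ns == NSEC_PER_SEC:
        # One-second granularity: drop the sub-second part entirely.
        return sec, 0
    # Anything in between: clear the remainder below the granularity.
    return sec, nsec - nsec % gran_ns


print(truncate_to_gran(5, 123_456_789, 1))
print(truncate_to_gran(5, 123_456_789, 100))
print(truncate_to_gran(5, 123_456_789, NSEC_PER_SEC))
```

With a granularity of 1 ns every update can produce a distinct timestamp, which is why something like lazytime would be wanted to avoid writing the inode out on each change.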
> > as well and I'd know I can have problems with timestamps only after a
> > system crash.
> >
> > I've just got another idea how we could solve the problem: Couldn't we
> > always just report coarsegrained timestamp to userspace and provide access
> > to finegrained value only to NFS which should know what it's doing?
>
> What changes would be involved for that?

See my other email. It should be fairly small...

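As a rough illustration of the idea (coarse-grained values for userspace, fine-grained values only for NFS), here is a small user-space model. The class and method names, the 4 ms tick, and the sample instants are all assumptions invented for this sketch; the actual change is described in the other email and may look different.

```python
# Toy model: keep whatever timestamp was recorded, but show userspace only
# a value rounded down to the coarse tick. Not kernel code.
TICK_NS = 4_000_000  # assumed coarse clock tick (HZ=250)


class ToyInode:
    def __init__(self) -> None:
        self._ctime_ns = 0

    def set_ctime(self, recorded_ns: int) -> None:
        """Record a timestamp, which may be fine- or coarse-grained."""
        self._ctime_ns = recorded_ns

    def ctime_for_userspace(self) -> int:
        """What stat() would report in this model: rounded down to a tick."""
        return self._ctime_ns - self._ctime_ns % TICK_NS

    def ctime_for_nfs(self) -> int:
        """What the NFS server would see in this model: the full value."""
        return self._ctime_ns


f2 = ToyInode()
f2.set_ctime(9_000_000)  # written first, fine-grained timestamp
f1 = ToyInode()
f1.set_ctime(8_000_000)  # written later in the same tick, coarse timestamp

# Userspace no longer sees f1 as older than f2; the two compare equal.
print(f1.ctime_for_userspace() >= f2.ctime_for_userspace())
# NFS still gets the extra precision for f2.
print(f2.ctime_for_nfs())
```

In this model the values visible to userspace never go backwards relative to write order, while NFS can still tell that f2 changed within the tick.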
> If this is invasive work and we decide this is something that we want to
> do then we should remove FS_MGTIME from btrfs, xfs, ext4, and tmpfs for
> v6.6.

... but let's see what Jeff thinks. I may be missing some problem with the
solution.

Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR