Date: Wed, 20 Sep 2023 17:45:27 +0200
From: Jan Kara
To: Jeff Layton
Cc: Jan Kara, Christian Brauner, Bruno Haible, Xi Ruoyao,
    bug-gnulib@gnu.org, Alexander Viro, Eric Van Hensbergen,
    Latchesar Ionkov, Dominique Martinet, Christian Schoenebeck,
    David Howells, Marc Dionne, Chris Mason, Josef Bacik, David Sterba,
    Xiubo Li, Ilya Dryomov, Jan Harkes, coda@cs.cmu.edu, Tyler Hicks,
    Gao Xiang, Chao Yu, Yue Hu, Jeffle Xu, Namjae Jeon, Sungjong Seo,
    Jan Kara, Theodore Ts'o, Andreas Dilger, Jaegeuk Kim, OGAWA Hirofumi,
    Miklos Szeredi, Bob Peterson, Andreas Gruenbacher, Greg Kroah-Hartman,
    Tejun Heo, Trond Myklebust, Anna Schumaker, Konstantin Komarov,
    Mark Fasheh, Joel Becker, Joseph Qi, Mike Marshall, Martin Brandenburg,
    Luis Chamberlain, Kees Cook, Iurii Zaikin, Steve French,
    Paulo Alcantara, Ronnie Sahlberg, Shyam Prasad N, Tom Talpey,
    Sergey Senozhatsky, Richard Weinberger, Hans de Goede, Hugh Dickins,
    Andrew Morton, Amir Goldstein, "Darrick J. Wong", Benjamin Coddington,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    v9fs@lists.linux.dev, linux-afs@lists.infradead.org,
    linux-btrfs@vger.kernel.org, ceph-devel@vger.kernel.org,
    codalist@coda.cs.cmu.edu, ecryptfs@vger.kernel.org,
    linux-erofs@lists.ozlabs.org, linux-ext4@vger.kernel.org,
    linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com,
    linux-nfs@vger.kernel.org, ntfs3@lists.linux.dev,
    ocfs2-devel@lists.linux.dev, devel@lists.orangefs.org,
    linux-cifs@vger.kernel.org, samba-technical@lists.samba.org,
    linux-mtd@lists.infradead.org, linux-mm@kvack.org,
    linux-unionfs@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH v7 12/13] ext4: switch to multigrain timestamps
Message-ID: <20230920154527.pkwot4nu2nzrnamd@quack3>
References: <20230807-mgctime-v7-0-d1dec143a704@kernel.org>
    <20230919110457.7fnmzo4nqsi43yqq@quack3>
    <1f29102c09c60661758c5376018eac43f774c462.camel@kernel.org>
    <4511209.uG2h0Jr0uP@nimes>
    <08b5c6fd3b08b87fa564bb562d89381dd4e05b6a.camel@kernel.org>
    <20230920-leerung-krokodil-52ec6cb44707@brauner>
    <20230920101731.ym6pahcvkl57guto@quack3>
    <317d84b1b909b6c6519a2406fcb302ce22dafa41.camel@kernel.org>
    <20230920124823.ghl6crb5sh4x2pmt@quack3>

On Wed 20-09-23 10:12:03, Jeff Layton wrote:
> On Wed, 2023-09-20 at 14:48 +0200, Jan Kara wrote:
> > On Wed 20-09-23 06:35:18, Jeff Layton wrote:
> > > On Wed, 2023-09-20 at 12:17 +0200, Jan Kara wrote:
> > > > If I were a sysadmin, I'd rather opt for something like
> > > > finegrained timestamps + lazytime (if I needed the finegrained timestamps
> > > > functionality). That should avoid the IO overhead of finegrained timestamps
> > > > as well and I'd know I can have problems with timestamps only after a
> > > > system crash.
> > >
> > > > I've just got another idea how we could solve the problem: Couldn't we
> > > > always just report coarsegrained timestamp to userspace and provide access
> > > > to finegrained value only to NFS which should know what it's doing?
> > > >
> > >
> > > I think that'd be hard. First of all, where would we store the second
> > > timestamp? We can't just truncate the fine-grained ones to come up with
> > > a coarse-grained one. It might also be confusing having nfsd and local
> > > filesystems present different attributes.
> >
> > So what I had in mind (and I definitely miss all the NFS intricacies so the
> > idea may be bogus) was that inode->i_ctime would be maintained exactly as
> > is now. There will be new (kernel internal at least for now) STATX flag
> > STATX_MULTIGRAIN_TS. fill_mg_cmtime() will return timestamp truncated to
> > sb->s_time_gran unless STATX_MULTIGRAIN_TS is set. Hence unless you set
> > STATX_MULTIGRAIN_TS, there is no difference in the returned timestamps
> > compared to the state before multigrain timestamps were introduced. With
> > STATX_MULTIGRAIN_TS we return full precision timestamp as stored in the
> > inode. Then NFS in fh_fill_pre_attrs() and fh_fill_post_attrs() needs to
> > make sure STATX_MULTIGRAIN_TS is set when calling vfs_getattr() to get
> > multigrain time.
>
> > I agree nfsd may now be presenting slightly different timestamps than user
> > is able to see with stat(2) directly on the filesystem. But is that a
> > problem? Essentially it is a similar solution as the mgtime mount option
> > but now sysadmin doesn't have to decide on filesystem mount how to report
> > timestamps but the stat caller knowingly opts into possibly inconsistent
> > (among files) but high precision timestamps. And in the particular NFS
> > usecase where stat is called all the time anyway, timestamps will likely
> > even be consistent among files.
> >
>
> I like this idea...
>
> Would we also need to raise sb->s_time_gran to something corresponding
> to HZ on these filesystems?

I was actually a bit confused about how timestamp_truncate() works. The
jiffy granularity is just a direct consequence of current_time() using
ktime_get_coarse_real_ts64() and not of timestamp_truncate().
sb->s_time_gran seems to be more about the on-disk format, so it doesn't
seem like a great idea to touch it. So we can probably just truncate
timestamps in generic_fillattr() to HZ granularity unconditionally.

> If we truncate the timestamps at a granularity corresponding to HZ before
> presenting them via statx and the like then that should work around the
> problem with programs that compare timestamps between inodes.

Exactly.

> With NFSv4, when a filesystem doesn't report a STATX_CHANGE_COOKIE, nfsd
> will fake one up using the ctime. It's fine for that to use a full fine-
> grained timestamp since we don't expect to be able to compare that value
> with one of a different inode.

Yes.

> I think we'd want nfsd to present the mtime/ctime values as truncated,
> just like we would with a local fs. We could hit the same problem of an
> earlier-looking timestamp with NFS if we try to present the actual fine-
> grained values to the clients. IOW, I'm convinced that we need to avoid
> this behavior in most situations.

I wasn't sure if there's a way to do this within NFS, i.e., whether the
value communicated via the NFSv3 protocol (I know v4 has a special change
cookie field for it) that gets used for detecting the need to revalidate
file contents can be something other than the one presented to the
client's userspace as ctime. If there's a way to do this then great, I'm
all for presenting truncated timestamps even for NFS.

> If we do this, then we technically don't need the mount option either.

Yes, that was my hope.

> We could still add it though, and have it govern whether fill_mg_cmtime
> truncates the timestamps before storing them in the kstat.

Well, if we decide these timestamps are useful for userspace as well, I'd
rather make that a userspace-visible STATX flag than a mount option. That
way applications aware of the pitfalls can get high precision timestamps
without possibly breaking unaware applications.

                                                                Honza
-- 
Jan Kara
SUSE Labs, CR