From: Linus Torvalds
Date: Mon, 30 Oct 2023 13:11:56 -1000
Subject: Re: [PATCH RFC 2/9] timekeeping: new interfaces for multigrain timestamp handing
To: Dave Chinner
Cc: Jeff Layton, Amir Goldstein, Kent Overstreet, Christian Brauner,
    Alexander Viro, John Stultz, Thomas Gleixner, Stephen Boyd,
    Chandan Babu R, "Darrick J. Wong", "Theodore Ts'o", Andreas Dilger,
    Chris Mason, Josef Bacik, David Sterba, Hugh Dickins, Andrew Morton,
    Jan Kara, David Howells, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
    linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org,
    linux-mm@kvack.org, linux-nfs@vger.kernel.org
References: <61b32a4093948ae1ae8603688793f07de764430f.camel@kernel.org>
    <2ef9ac6180e47bc9cc8edef20648a000367c4ed2.camel@kernel.org>
    <6df5ea54463526a3d898ed2bd8a005166caa9381.camel@kernel.org>
Content-Type: text/plain; charset="UTF-8"

On Mon, 30 Oct 2023 at 12:37, Dave Chinner wrote:
>
> If XFS can ignore relatime or lazytime persistent updates for given
> situations, then *we don't need to make periodic on-disk updates of
> atime*. This makes the whole problem of "persistent atime update bumps
> i_version" go away because then we *aren't making persistent atime
> updates* except when some other persistent modification that bumps
> [cm]time occurs.

Well, I think this should be split into two independent questions:

 (a) are relatime or lazytime atime updates persistent if nothing else
     changes?

 (b) do atime updates _ever_ update i_version *regardless* of relatime
     or lazytime?

and honestly, I think the best answer to (b) would be that "no,
i_version should simply not change for atime updates". And I think
that answer is what it is because no user of i_version seems to want
it.

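To make the split between (a) and (b) concrete, here is a toy C sketch.
It is not kernel code: struct toy_inode, toy_touch_atime() and
toy_modify() are hypothetical names invented for illustration, and the
sketch simply assumes two separate counters. Question (a) becomes the
"persist" argument, and question (b) is answered with "never" by leaving
i_version alone on the atime path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model, not the real struct inode or XFS inode. */
struct toy_inode {
	uint64_t i_version;      /* change cookie that NFS/IMA would sample */
	uint64_t di_changecount; /* counts persistent writes of the inode */
	int64_t atime, mtime, ctime;
};

/*
 * A pure read. Whether the new atime is persisted is question (a) and
 * is decided by the caller; i_version is never touched, which is the
 * "no" answer to question (b).
 */
static void toy_touch_atime(struct toy_inode *ip, int64_t now, bool persist)
{
	ip->atime = now;
	if (persist)
		ip->di_changecount++;
}

/* A real modification: [cm]time changes, so both counters move. */
static void toy_modify(struct toy_inode *ip, int64_t now)
{
	ip->mtime = now;
	ip->ctime = now;
	ip->i_version++;
	ip->di_changecount++;
}
```

In this model a persistent atime update advances di_changecount but
leaves i_version where it was, so the two counters can drift apart. That
drift is exactly the cost of keeping them independent.
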
Now, the reason it's a single question for you is that apparently for
XFS, the only thing that matters is "inode was written to disk" and
that "di_changecount" value is thus related to the persistence of
atime updates, but splitting di_changecount out to be a separate thing
from i_version seems to be on the table, so I think those two things
really could be independent issues.

> But I don't want to do this unconditionally - for systems not
> running anything that samples i_version we want relatime/lazytime
> to behave as they are supposed to and do periodic persistent updates
> as per normal. Principle of least surprise and all that jazz.

Well - see above: I think in a perfect world, we'd simply never change
i_version at all for any atime updates, and relatime/lazytime simply
wouldn't be an issue at all wrt i_version.

Wouldn't _that_ be the truly "least surprising" behavior? Considering
that nobody wants i_version to change for what are otherwise pure
reads (that's kind of the *definition* of atime, after all).

Now, the annoyance here is that *both* (a) and (b) then have that
impact of "i_version no longer tracks di_changecount".

So I don't think the issue here is "i_version" per se. I think in a
vacuum, the best option for i_version is pretty obvious. But if you
want i_version to track di_changecount, *then* you end up with that
situation where the persistence of atime matters, and i_version needs
to update whenever a (persistent) atime update happens.

> So we really need an indication for inodes that we should enable this
> mode for the inode. I have asked if we can have per-operation
> context flag to trigger this given the needs for io_uring to have
> context flags for timestamp updates to be added.

I really think some kind of new and even *more* complex and
non-intuitive behavior is the worst of both worlds.

Having atime updates be conditionally persistent - on top of already
being delayed by lazytime/relatime - and having the persistence
magically change depending on whether something wants to get i_version
updates - sounds just completely crazy.

Particularly as *none* of the people who want i_version updates
actually want them for atime at all.

So I really think this all boils down to "is xfs really willing to
split di_changecount from i_version"?

> I have asked if we can have an inode flag set by the VFS or
> application code for this. e.g. a flag set by nfsd whenever it accesses a
> given inode.
>
> I have asked if this inode flag can just be triggered if we ever see
> I_VERSION_QUERIED set or statx is used to retrieve a change cookie,
> and whether this is a reliable mechanism for setting such a flag.

See above: linking this to I_VERSION_QUERIED is horrific. The people
who set that bit do *NOT* want atime updates to change i_version under
any circumstances. It was always a mistake.

This really is all *entirely* an artifact of that "di_changecount" vs
"i_version" being tied together. You did seem to imply that you'd be
ok with having "di_changecount" be split from i_version, i.e. from an
earlier email in this thread:

 "Now that NFS is using a proper abstraction (i.e. vfs_statx()) to get
  the change cookie, we really don't need to expose di_changecount in
  i_version at all - we could simply copy an internal di_changecount
  value into the statx cookie field in xfs_vn_getattr() and there
  would be almost no change of behaviour from the perspective of NFS
  and IMA at all"

but while I suspect *that* part is easy and straightforward, the
problem then becomes one of "what about the persistence of i_version",
and then you'd need a new field for *that* anyway, and would want a
new on-disk format regardless.

                Linus