Date: Tue, 31 Oct 2023 09:37:10 +1100
From: Dave Chinner
To: Jeff Layton
Cc: Amir Goldstein, Linus Torvalds, Kent Overstreet, Christian Brauner,
 Alexander Viro, John Stultz, Thomas Gleixner, Stephen Boyd,
 Chandan Babu R, "Darrick J. Wong", Theodore Ts'o, Andreas Dilger,
 Chris Mason, Josef Bacik, David Sterba, Hugh Dickins, Andrew Morton,
 Jan Kara, David Howells, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
 linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org,
 linux-mm@kvack.org, linux-nfs@vger.kernel.org
Subject: Re: [PATCH RFC 2/9] timekeeping: new interfaces for multigrain timestamp handing
References: <61b32a4093948ae1ae8603688793f07de764430f.camel@kernel.org>
 <2ef9ac6180e47bc9cc8edef20648a000367c4ed2.camel@kernel.org>
 <6df5ea54463526a3d898ed2bd8a005166caa9381.camel@kernel.org>
In-Reply-To: <6df5ea54463526a3d898ed2bd8a005166caa9381.camel@kernel.org>

On Fri, Oct 27, 2023 at 06:35:58AM -0400, Jeff Layton wrote:
> On Thu, 2023-10-26 at 13:20 +1100, Dave Chinner wrote:
> > On Wed, Oct 25, 2023 at 08:25:35AM -0400, Jeff Layton wrote:
> > > On Wed, 2023-10-25 at 19:05 +1100, Dave Chinner wrote:
> > > > On Tue, Oct 24, 2023 at 02:40:06PM -0400, Jeff Layton wrote:
> > > In earlier discussions you alluded to some repair and/or analysis tools
> > > that depended on this counter.
> >
> > Yes, and one of those "tools" is *me*.
> >
> > I frequently look at the di_changecount when doing forensic and/or
> > failure analysis on filesystem corpses. SOE analysis, relative
> > modification activity, etc all give insight into what happened to
> > the filesystem to get it into the state it is currently in, and
> > di_changecount provides information no other metadata in the inode
> > contains.
> >
> > > I took a quick look in xfsprogs, but I
> > > didn't see anything there. Is there a library or something that these
> > > tools use to get at this value?
> >
> > xfs_db is the tool I use for this, such as:
> >
> > $ sudo xfs_db -c "sb 0" -c "a rootino" -c "p v3.change_count" /dev/mapper/fast
> > v3.change_count = 35
> > $
> >
> > The root inode in this filesystem has a change count of 35.
> > The root inode has 32 dirents in it, which means that no entries
> > have ever been removed or renamed. This sort of insight into the
> > past history of inode metadata is largely impossible to get any
> > other way, and it's been the difference between understanding
> > failure and having no clue more than once.
> >
> > Most block device parsing applications simply write their own
> > decoder that walks the on-disk format. That's pretty trivial to do,
> > developers can get all the information needed to do this from the
> > on-disk format specification documentation we keep on kernel.org...
> >
>
> Fair enough. I'm not here to tell you guys that you need to
> change how di_changecount works. If it's too valuable to keep it
> counting atime-only updates, then so be it.
>
> If that's the case however, and given that the multigrain timestamp work
> is effectively dead, then I don't see an alternative to growing the
> on-disk inode. Do you?

Yes, I do see alternatives. That's what I've been trying
(unsuccessfully) to describe and get consensus on. I feel like I'm
being ignored and railroaded here, because nobody is even
acknowledging that I'm proposing alternatives, and people keep
insisting that the only solution is a change of on-disk format.

So, I'll summarise the situation *yet again* in the hope that this
time I won't get people arguing about atime vs i_version and what
constitutes an on-disk format change, because that goes nowhere and
does nothing to determine which solution might be acceptable.

The basic situation is this:

If XFS can ignore relatime or lazytime persistent updates for given
situations, then *we don't need to make periodic on-disk updates of
atime*. This makes the whole problem of "persistent atime update
bumps i_version" go away because then we *aren't making persistent
atime updates* except when some other persistent modification that
bumps [cm]time occurs.
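
To make that concrete, here is a rough C sketch of the writeback
decision. It is illustration only: struct inode_sketch, its fields and
periodic_writeback() are made-up names, not XFS or VFS code, and a
single counter stands in for the persistent change count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical in-memory state; NOT real XFS or VFS structures. */
struct inode_sketch {
    bool atime_dirty;     /* in-memory atime is newer than the on-disk copy */
    bool cmtime_dirty;    /* a change that bumps ctime/mtime is pending */
    bool suppress_atime;  /* assumed flag: skip atime-only persistent updates */
    unsigned long long change_count; /* stand-in for the persistent counter */
};

/*
 * Decide what a periodic (relatime/lazytime style) writeback pass persists.
 * Returns true if the inode was written to disk.
 */
bool periodic_writeback(struct inode_sketch *ip)
{
    assert(ip);

    if (ip->cmtime_dirty) {
        /* A real modification: persist it, and atime rides along. */
        ip->change_count++;
        ip->cmtime_dirty = false;
        ip->atime_dirty = false;
        return true;
    }
    if (ip->atime_dirty && !ip->suppress_atime) {
        /* Normal behaviour: atime-only persistent update bumps the counter. */
        ip->change_count++;
        ip->atime_dirty = false;
        return true;
    }
    /* Suppressed (or clean): atime stays in memory, nothing hits the disk. */
    return false;
}
```

With suppress_atime set, an atime-only pass leaves change_count alone,
and the pending atime is written out with the next [cm]time change.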

But I don't want to do this unconditionally - for systems not running
anything that samples i_version we want relatime/lazytime to behave
as they are supposed to and do periodic persistent updates as per
normal. Principle of least surprise and all that jazz.

So we really need an indication of which inodes we should enable
this mode for.

I have asked if we can have a per-operation context flag to trigger
this, given that io_uring needs context flags added for timestamp
updates.

I have asked if we can have an inode flag set by the VFS or
application code for this. e.g. a flag set by nfsd whenever it
accesses a given inode.

I have asked if this inode flag can just be triggered if we ever see
I_VERSION_QUERIED set or statx is used to retrieve a change cookie,
and whether this is a reliable mechanism for setting such a flag.

I have suggested mechanisms for using masked off bits of timestamps
to encode sub-timestamp granularity change counts and keep them
invisible to userspace and then not using i_version at all for XFS.
This avoids all the problems that the multi-grain timestamp
infrastructure exposed due to variable granularity of user visible
timestamps and ordering across inodes with different granularity.
This is potentially a general solution, too.

So, yeah, there are *lots* of ways we can solve this problem without
needing to change on-disk formats.

-Dave.
-- 
Dave Chinner
david@fromorbit.com
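
A minimal sketch of the masked-timestamp-bits idea, for illustration
only. It assumes the low 10 bits (1024ns) of the nanosecond field carry
no user-visible meaning and can hold a hidden change count; the 10-bit
width, the macros and the ts_*() helpers are all hypothetical and do not
come from any kernel or XFS code.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000u
/* Assumption: the low 10 bits of the nanosecond field are free for a counter. */
#define COUNT_BITS   10u
#define COUNT_MASK   ((1u << COUNT_BITS) - 1u)

/* Stored form: coarse nanoseconds in the high bits, change count in the low bits. */
static inline uint32_t ts_pack(uint32_t nsec, uint32_t count)
{
    assert(nsec < NSEC_PER_SEC);
    /* The counter silently wraps after COUNT_MASK changes in one window. */
    return (nsec & ~COUNT_MASK) | (count & COUNT_MASK);
}

/* What userspace would be shown: counter bits masked off. */
static inline uint32_t ts_user_nsec(uint32_t stored)
{
    return stored & ~COUNT_MASK;
}

/* The hidden sub-granularity change count. */
static inline uint32_t ts_count(uint32_t stored)
{
    return stored & COUNT_MASK;
}

/*
 * Record a change at now_nsec (same second assumed, to keep the sketch short).
 * If the coarse timestamp moved, the count restarts at zero; otherwise the
 * visible timestamp is unchanged and only the hidden count is bumped.
 */
uint32_t ts_record_change(uint32_t stored, uint32_t now_nsec)
{
    uint32_t coarse = now_nsec & ~COUNT_MASK;

    if (coarse != ts_user_nsec(stored))
        return ts_pack(now_nsec, 0);
    return ts_pack(now_nsec, ts_count(stored) + 1);
}
```

Two changes inside the same 1024ns window report the same timestamp
through ts_user_nsec() but different stored values, which is the
property a change-cookie consumer would need. A real design would also
have to compare the seconds field and deal with counter wrap.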