Date: Wed, 12 Apr 2023 09:13:45 +1000
From: Dave Chinner
To: Jeff Layton
Cc: Alexander Viro, Christian Brauner, "Darrick J. Wong", Hugh Dickins,
	Andrew Morton, Chuck Lever, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-mm@kvack.org, linux-nfs@vger.kernel.org
Subject: Re: [RFC PATCH 0/3][RESEND] fs: opportunistic high-res file timestamps
Message-ID: <20230411231345.GB3223426@dread.disaster.area>
References: <20230411143702.64495-1-jlayton@kernel.org>
In-Reply-To: <20230411143702.64495-1-jlayton@kernel.org>

On Tue, Apr 11, 2023 at 10:36:59AM -0400, Jeff Layton wrote:
> (Apologies for the resend, but I didn't send this with a wide enough
> distribution list originally).
>
> A few weeks ago, during one of the discussions around i_version, Dave
> Chinner wrote this:
>
> "You've missed the part where I suggested lifting the "nfsd sampled
> i_version" state into an inode state flag rather than hiding it in
> the i_version field. At that point, we could optimise away the
> secondary ctime updates just like you are proposing we do with the
> i_version updates. Further, we could also use that state it to
> decide whether we need to use high resolution timestamps when
> recording ctime updates - if the nfsd has not sampled the
> ctime/i_version, we don't need high res timestamps to be recorded
> for ctime...."
>
> While I don't think we can practically optimize away ctime updates
> like we do with i_version, I do like the idea of using this scheme to
> indicate when we need to use a high-res timestamp.
>
> This patchset is a first stab at a scheme to do this. It declares a new
> i_state flag for this purpose and adds two new vfs-layer functions to
> implement conditional high-res timestamp fetching. It then converts both
> tmpfs and xfs to use it.
>
> This seems to behave fine under xfstests, but I haven't yet done
> any performance testing with it.
> I wouldn't expect it to create huge
> regressions though since we're only grabbing high res timestamps after
> each query.
>
> I like this scheme because we can potentially convert any filesystem to
> use it. No special storage requirements like with i_version field. I
> think it'd potentially improve NFS cache coherency with a whole swath of
> exportable filesystems, and helps out NFSv3 too.
>
> This is really just a proof-of-concept. There are a number of things we
> could change:
>
> 1/ We could use the top bit in the tv_sec field as the flag. That'd give
>    us different flags for ctime and mtime. We also wouldn't need to use
>    a spinlock.
>
> 2/ We could probably optimize away the high-res timestamp fetch in more
>    cases. Basically, always do a coarse-grained ts fetch and only fetch
>    the high-res ts when the QUERIED flag is set and the existing time
>    hasn't changed.
>
> If this approach looks reasonable, I'll plan to start working on
> converting more filesystems.

Seems reasonable to me.

In terms of testing, I suspect the main impact is going to be the
additional overhead of taking a spinlock in normal stat calls. In
which case, testing common tools like git would be useful. e.g.
`git status` runs about 170k stat calls on a typical kernel tree.
If anything is going to be noticed by users that actually care,
it'll be workloads like this...

If we manage to elide the spinlock altogether, then I don't think
we're going to be able to measure any sort of perf difference on
modern hardware short of high end NFS benchmarks that drive servers
to their CPU usage limits....

> One thing I'm not clear on is how widely available high res timestamps
> are. Is this something we need to gate on particular CONFIG_* options?

Don't think so - the kernel should always provide the highest resolution
it can through the get_time interfaces - the _coarse variants simply
return what was read from the high res timer at the last scheduler tick,
hence avoiding the hardware timer overhead when high res timer
resolution is not needed.....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
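
For anyone who wants to see the coarse vs. high-res distinction from
userspace, here is a minimal sketch. It is an illustration only, not
part of the patchset. It assumes Linux, where CLOCK_REALTIME_COARSE is
clock id 5 in <linux/time.h>; as far as I know Python's time module
does not export a name for it, so the raw value is hardcoded. The
helper names distinct_reads() and compare_clocks() are made up for
this example.

```python
import time

# Assumption: Linux clock id from <linux/time.h>. Hardcoded because the
# time module does not appear to provide a named constant for it.
CLOCK_REALTIME_COARSE = 5


def distinct_reads(clk_id, n=10000):
    """Count how many different values n back-to-back reads return."""
    return len({time.clock_gettime_ns(clk_id) for _ in range(n)})


def compare_clocks():
    """Return (resolution in seconds, distinct values seen) per clock."""
    fine = (time.clock_getres(time.CLOCK_REALTIME),
            distinct_reads(time.CLOCK_REALTIME))
    coarse = (time.clock_getres(CLOCK_REALTIME_COARSE),
              distinct_reads(CLOCK_REALTIME_COARSE))
    return fine, coarse


if __name__ == "__main__":
    fine, coarse = compare_clocks()
    print("CLOCK_REALTIME        res=%.9fs distinct=%d" % fine)
    print("CLOCK_REALTIME_COARSE res=%.9fs distinct=%d" % coarse)
```

On a typical system the coarse clock should report a resolution around
the tick period and return only a handful of distinct values across
the loop, while the fine-grained clock changes on nearly every read.
The exact numbers depend on the kernel configuration and clocksource.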