Date: Tue, 14 Mar 2023 16:00:41 -0700
From: Andrew Morton
To: Nhat Pham
Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, bfoster@redhat.com, willy@infradead.org, arnd@arndb.de, linux-api@vger.kernel.org, kernel-team@meta.com
Subject: Re: [PATCH v11 0/3] cachestat: a new syscall for page cache state of files
Message-Id: <20230314160041.960ede03d5f5ff3dbb3e3fd0@linux-foundation.org>
In-Reply-To: <20230308032748.609510-1-nphamcs@gmail.com>
References: <20230308032748.609510-1-nphamcs@gmail.com>

On Tue, 7 Mar 2023 19:27:45 -0800 Nhat Pham wrote:

> There is currently no good way to query the page cache
> state of large
> file sets and directory trees. There is mincore(), but it scales poorly:
> the kernel writes out a lot of bitmap data that userspace has to
> aggregate, when the user really does not care about per-page information
> in that case. The user also needs to mmap and unmap each file as it goes
> along, which can be quite slow as well.

A while ago I asked about the security implications - could cachestat()
be used to figure out what parts of a file another user is reading?
This also applies to mincore(), but cachestat() newly permits user A to
work out which parts of a file user B has *written* to.

I don't recall seeing a response to this, and there is no discussion in
the changelogs.

Secondly, I'm not seeing a description of any use cases. OK, it's faster
and better than mincore(), but who cares? In other words, what
end-user value compels us to add this feature to Linux?

> struct cachestat {
> 	__u64 nr_cache;
> 	__u64 nr_dirty;
> 	__u64 nr_writeback;
> 	__u64 nr_evicted;
> 	__u64 nr_recently_evicted;
> };

And these fields are really getting into the weedy details of internal
kernel implementation. Bear in mind that we must support this API for
ever.

Particularly the "evicted" things. The workingset code was implemented
eight years ago, which is actually relatively recent. It could be that
eight years from now it will have been removed and possibly replaced
with something else. Then what do we do?

For these reasons, and because of the lack of enthusiasm I have seen
from others, I don't think a case has yet been made for the addition of
this new syscall.