Date: Wed, 17 Jan 2024 15:35:28 +0100
From: Jan Kara
To: Christian Brauner
Cc: Jan Kara, lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, linux-btrfs@vger.kernel.org, linux-block@vger.kernel.org,
    Matthew Wilcox, Christoph Hellwig
Subject: Re: [LSF/MM/BPF TOPIC] Dropping page cache of individual fs
Message-ID: <20240117143528.idmyeadhf4yzs5ck@quack3>
References: <20240116-tagelang-zugnummer-349edd1b5792@brauner>
    <20240116114519.jcktectmk2thgagw@quack3>
    <20240117-tupfen-unqualifiziert-173af9bc68c8@brauner>
In-Reply-To: <20240117-tupfen-unqualifiziert-173af9bc68c8@brauner>

On Wed 17-01-24 13:53:20, Christian Brauner wrote:
> On Tue, Jan 16, 2024 at 12:45:19PM +0100, Jan Kara wrote:
> > On Tue 16-01-24 11:50:32, Christian Brauner wrote:
> >
> > > My initial reaction is to give userspace an API to drop the page cache
> > > of a specific filesystem which may have additional uses. I initially had
> > > started drafting an ioctl() and then got swayed towards a
> > > posix_fadvise() flag. I found out that this was already proposed a few
> > > years ago but got rejected as it was suspected this might just be
> > > someone toying around without a real world use-case. I think this here
> > > might qualify as a real-world use-case.
> > >
> > > This may at least help securing users with a regular dm-crypt setup
> > > where dm-crypt is the top layer. Users that stack additional layers on
> > > top of dm-crypt may still leak plaintext of course if they introduce
> > > additional caching. But that's on them.
> >
> > Well, your usecase has one substantial difference from drop_caches. You
> > actually *require* pages to be evicted from the page cache for security
> > purposes. And giving any kind of guarantees is going to be tough. Think for
> > example when someone grabs page cache folio reference through vmsplice(2),
> > then you initiate your dmSuspend and want to evict page cache. What are you
> > going to do?
> > You cannot free the folio while the refcount is elevated, you
> > could possibly detach it from the page cache so it isn't at least visible
> > but that has side effects too - after you resume the folio would remain
> > detached so it will not see changes happening to the file anymore. So IMHO
> > the only thing you could do without problematic side-effects is report
> > error. Which would be user unfriendly and could be actually surprisingly
> > frequent due to transient folio references taken by various code paths.
>
> I wonder though, if you start suspending userspace and the filesystem
> how likely are you to encounter these transient errors?

Yeah, my expectation is that it should not be frequent in that case. But
there could be surprises there - e.g. pages mapping running executable code
are practically unevictable. Userspace should be mostly sleeping, so there
shouldn't be many, but there would be some. In the worst case that could
result in always returning an error from the page cache eviction, which
would not be very useful.

> > Sure we could report error only if the page has pincount elevated, not only
> > refcount, but it needs some serious thinking how this would interact.
> >
> > Also what is going to be the interaction with mlock(2)?
> >
> > Overall this doesn't seem like "just tweak drop_caches a bit" kind of
> > work...
>
> So when I talked to the Gnome people they were interested in an optimal
> or a best-effort solution. So returning an error might actually be useful.

OK. So could we then define the effect of your desired call as calling
posix_fadvise(..., POSIX_FADV_DONTNEED) for every file? This is a kind of
best-effort eviction which is reasonably well understood by everybody.

                                                                Honza
-- 
Jan Kara
SUSE Labs, CR
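
To make the "posix_fadvise(..., POSIX_FADV_DONTNEED) for every file"
semantics concrete, here is a rough userspace sketch of what such a call
would be equivalent to. It is only an illustration and not a proposed
implementation: the function name drop_cache_best_effort() is made up, the
walk is limited to one filesystem by comparing st_dev, and the fsync() before
the advice is my assumption (DONTNEED does not discard dirty pages, so they
are written back first). Pages that are mapped, mlocked or otherwise
referenced may simply stay cached, which is the best-effort part.

```python
import os
import stat


def drop_cache_best_effort(root):
    """Advise the kernel to drop cached pages of regular files under root.

    Hypothetical helper: returns (advised, failed) counts. It gives no
    guarantee that any page was actually evicted.
    """
    root_dev = os.stat(root).st_dev
    advised = failed = 0

    for dirpath, dirnames, filenames in os.walk(root):
        # Stay on the filesystem we started on: prune other mounts.
        keep = []
        for d in dirnames:
            try:
                if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev:
                    keep.append(d)
            except OSError:
                pass  # directory vanished while walking
        dirnames[:] = keep

        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Only regular files; skip symlinks, FIFOs, devices.
                if not stat.S_ISREG(os.lstat(path).st_mode):
                    continue
                fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
            except OSError:
                failed += 1
                continue
            try:
                os.fsync(fd)  # write back dirty pages so they can be dropped
                # offset=0, len=0 means "the whole file".
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
                advised += 1
            except OSError:
                failed += 1
            finally:
                os.close(fd)

    return advised, failed
```

A caller would run drop_cache_best_effort("/some/mountpoint") and treat a
non-zero failed count as "some files could not even be advised"; a zero
count still says nothing about whether every page really left the cache.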