From: Kent Overstreet
To: Christian Brauner
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, linux-btrfs@vger.kernel.org,
    linux-block@vger.kernel.org, Matthew Wilcox, Jan Kara,
    Christoph Hellwig
Subject: Re: [LSF/MM/BPF TOPIC] Dropping page cache of individual fs
Date: Fri, 16 Feb 2024 23:04:28 -0500
References: <20240116-tagelang-zugnummer-349edd1b5792@brauner>
In-Reply-To: <20240116-tagelang-zugnummer-349edd1b5792@brauner>

On Tue, Jan 16, 2024 at 11:50:32AM +0100, Christian Brauner wrote:
> Hey,
>
> I'm not sure this even needs a full LSFMM discussion but since I
> currently don't have time to work on the patch I may as well submit it.
>
> Gnome recently got awarded 1M Euro by the Sovereign Tech Fund (STF). The
> STF was created by the German government to fund public infrastructure:
>
> "The Sovereign Tech Fund supports the development, improvement and
> maintenance of open digital infrastructure. Our goal is to sustainably
> strengthen the open source ecosystem. We focus on security, resilience,
> technological diversity, and the people behind the code." (cf. [1])
>
> Gnome has proposed various specific projects including integrating
> systemd-homed with Gnome. Systemd-homed provides various features and if
> you're interested in details then you might find it useful to read [2].
> It makes use of various new VFS and fs specific developments over the
> last years.
>
> One feature is encrypting the home directory via LUKS. An appropriate
> image or device must contain a GPT partition table. Currently there's
> only one partition which is a LUKS2 volume. Inside that LUKS2 volume is
> a Linux filesystem. Currently supported are btrfs (see [4] though),
> ext4, and xfs.
>
> The following issue isn't specific to systemd-homed. Gnome wants to be
> able to support locking encrypted home directories. For example, when
> the laptop is suspended. To do this the luksSuspend command can be used.
>
> The luksSuspend call is nothing else than a device mapper ioctl to
> suspend the block device and its owning superblock/filesystem. Which in
> turn is nothing but a freeze initiated from the block layer:
>
> dm_suspend()
> -> __dm_suspend()
>    -> lock_fs()
>       -> bdev_freeze()
>
> So when we say luksSuspend we really mean block layer initiated freeze.
> The overall goal or expectation of userspace is that after a luksSuspend
> call all sensitive material has been evicted from relevant caches to
> harden against various attacks. And luksSuspend does wipe the encryption
> key and suspend the block device. However, the encryption key can still
> be available clear-text in the page cache. To illustrate this problem
> more simply:
>
> truncate -s 500M /tmp/img
> echo password | cryptsetup luksFormat /tmp/img --force-password
> echo password | cryptsetup open /tmp/img test
> mkfs.xfs /dev/mapper/test
> mount /dev/mapper/test /mnt
> echo "secrets" > /mnt/data
> cryptsetup luksSuspend test
> cat /mnt/data
>
> This will still happily print the contents of /mnt/data even though the
> block device and the owning filesystem are frozen because the data is
> still in the page cache.
>
> To my knowledge, the only current way to get the contents of /mnt/data
> or the encryption key out of the page cache is via
> /proc/sys/vm/drop_caches which is a big hammer.
>
> My initial reaction is to give userspace an API to drop the page cache
> of a specific filesystem which may have additional uses. I initially had
> started drafting an ioctl() and then got swayed towards a
> posix_fadvise() flag. I found out that this was already proposed a few
> years ago but got rejected as it was suspected this might just be
> someone toying around without a real world use-case. I think this here
> might qualify as a real-world use-case.
>
> This may at least help securing users with a regular dm-crypt setup
> where dm-crypt is the top layer. Users that stack additional layers on
> top of dm-crypt may still leak plaintext of course if they introduce
> additional caching. But that's on them.
>
> Of course other ideas welcome.

This isn't entirely unlike snapshot deletion, where we also need to
shoot down the pagecache.
Technically, the code I have now for snapshot deletion isn't quite what
I want; snapshot deletion probably wants something closer to revoke()
instead of waiting for files to be closed. But maybe the code I have is
close to what you need - maybe we could turn this into a common shared
API?

https://evilpiepirate.org/git/bcachefs.git/tree/fs/bcachefs/fs.c#n1569

The need for page zeroing is pretty orthogonal; if you want page zeroing
you want that enabled for all page cache folios at all times.
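
For comparison with the posix_fadvise() idea in the quoted mail, here is
a minimal sketch of what userspace can already do per file today: fsync
the file, then call posix_fadvise() with POSIX_FADV_DONTNEED over the
whole range. It is written in Python only for brevity; drop_file_cache()
and demo() are hypothetical names, and the temporary file stands in for
/mnt/data. The call is advisory, so pages that are dirty, mapped, or
otherwise busy may stay cached, and it covers only the file it is given.

```python
import os
import tempfile


def drop_file_cache(path):
    """Ask the kernel to drop cached pages for one file (advisory only)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Dirty pages are not dropped, so write them back first.
        os.fsync(fd)
        # offset=0 and len=0 together mean "the whole file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)


def demo():
    # Hypothetical stand-in for /mnt/data from the quoted example.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"secrets\n")
        path = f.name
    try:
        drop_file_cache(path)
        # Dropping cache never loses data; a later read simply has to
        # go back to the backing store if the pages were evicted.
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)
```

This is not the per-filesystem API under discussion. It works one file
at a time, which is the gap a per-filesystem interface would close.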