From: Kent Overstreet <kent.overstreet@linux.dev>
To: Christian Brauner <brauner@kernel.org>
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	 linux-mm@kvack.org, linux-btrfs@vger.kernel.org,
	linux-block@vger.kernel.org,
	 Matthew Wilcox <willy@infradead.org>, Jan Kara <jack@suse.cz>,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [LSF/MM/BPF TOPIC] Dropping page cache of individual fs
Date: Fri, 16 Feb 2024 23:04:28 -0500
Message-ID: <h5wq7dsi6r7cjjmkpo2dvn5x662eseluzd2kmzbkzegntzlptd@ncjzyaurmiwb>
In-Reply-To: <20240116-tagelang-zugnummer-349edd1b5792@brauner>

On Tue, Jan 16, 2024 at 11:50:32AM +0100, Christian Brauner wrote:
> Hey,
> 
> I'm not sure this even needs a full LSFMM discussion but since I
> currently don't have time to work on the patch I may as well submit it.
> 
> Gnome recently got awarded 1M Euro by the Sovereign Tech Fund (STF). The
> STF was created by the German government to fund public infrastructure:
> 
> "The Sovereign Tech Fund supports the development, improvement and
>  maintenance of open digital infrastructure. Our goal is to sustainably
>  strengthen the open source ecosystem. We focus on security, resilience,
>  technological diversity, and the people behind the code." (cf. [1])
> 
> Gnome has proposed various specific projects including integrating
> systemd-homed with Gnome. Systemd-homed provides various features and if
> you're interested in details then you might find it useful to read [2].
> It makes use of various new VFS and fs specific developments over the
> last years.
> 
> One feature is encrypting the home directory via LUKS. An appropriate
> image or device must contain a GPT partition table. Currently there's
> only one partition which is a LUKS2 volume. Inside that LUKS2 volume is
> a Linux filesystem. Currently supported are btrfs (see [4] though),
> ext4, and xfs.
> 
> The following issue isn't specific to systemd-homed. Gnome wants to be
> able to support locking encrypted home directories. For example, when
> the laptop is suspended. To do this the luksSuspend command can be used.
> 
> The luksSuspend call is nothing else than a device mapper ioctl to
> suspend the block device and its owning superblock/filesystem. Which in
> turn is nothing but a freeze initiated from the block layer:
> 
> dm_suspend()
> -> __dm_suspend()
>    -> lock_fs()
>       -> bdev_freeze()
> 
> So when we say luksSuspend we really mean block layer initiated freeze.
> The overall goal or expectation of userspace is that after a luksSuspend
> call all sensitive material has been evicted from relevant caches to
> harden against various attacks. And luksSuspend does wipe the encryption
> key and suspend the block device. However, the encryption key can still
> be available clear-text in the page cache. To illustrate this problem
> more simply:
> 
> truncate -s 500M /tmp/img
> echo password | cryptsetup luksFormat /tmp/img --force-password
> echo password | cryptsetup open /tmp/img test
> mkfs.xfs /dev/mapper/test
> mount /dev/mapper/test /mnt
> echo "secrets" > /mnt/data
> cryptsetup luksSuspend test
> cat /mnt/data
> 
> This will still happily print the contents of /mnt/data even though the
> block device and the owning filesystem are frozen because the data is
> still in the page cache.
> 
> To my knowledge, the only current way to get the contents of /mnt/data
> or the encryption key out of the page cache is via
> /proc/sys/vm/drop_caches which is a big hammer.
> 
> My initial reaction is to give userspace an API to drop the page cache
> of a specific filesystem which may have additional uses. I initially had
> started drafting an ioctl() and then got swayed towards a
> posix_fadvise() flag. I found out that this was already proposed a few
> years ago but got rejected as it was suspected this might just be
> someone toying around without a real world use-case. I think this here
> might qualify as a real-world use-case.
> 
> This may at least help secure users with a regular dm-crypt setup
> where dm-crypt is the top layer. Users that stack additional layers on
> top of dm-crypt may still leak plaintext of course if they introduce
> additional caching. But that's on them.
> 
> Of course, other ideas are welcome.

This isn't entirely unlike snapshot deletion, where we also need to
shoot down the pagecache.

Technically, the code I have now for snapshot deletion isn't quite what
I want; snapshot deletion probably wants something closer to revoke()
instead of waiting for files to be closed. But maybe the code I have is
close to what you need - maybe we could turn this into a common shared
API?

https://evilpiepirate.org/git/bcachefs.git/tree/fs/bcachefs/fs.c#n1569

The need for page zeroing is pretty orthogonal; if you want page zeroing
you want that enabled for all page cache folios at all times.
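
For comparison, the closest thing userspace has today is per-file
posix_fadvise(POSIX_FADV_DONTNEED). Below is a minimal sketch, in Python
for brevity, of walking a mount point and asking the kernel to drop the
cached pages of each regular file. The helper names drop_file_cache() and
drop_tree_cache() are made up for illustration and are not taken from the
bcachefs code linked above. This is best effort only: pages that are
dirty, mapped, or otherwise in use can stay resident, and dentry/inode
caches and filesystem metadata are not touched.

```python
import os
import stat


def drop_file_cache(path):
    """Ask the kernel to drop cached pages for one file (best effort)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # DONTNEED does not discard dirty pages, so write them back first.
        os.fsync(fd)
        # offset=0, len=0 covers the file from start to end.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)


def drop_tree_cache(root):
    """Apply drop_file_cache() to every regular file under root.

    Returns the number of files the advice was issued for.
    """
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                # lstat() so symlinks are not followed out of the tree.
                if stat.S_ISREG(os.lstat(full).st_mode):
                    drop_file_cache(full)
                    count += 1
            except OSError:
                # Unreadable or vanished files are skipped in this sketch.
                continue
    return count
```

Something like drop_tree_cache("/mnt") would have to run before the
luksSuspend call, since writeback to a suspended device would presumably
block. It also only reaches files that are still linked and readable by
the caller, which is part of why a per-superblock interface in the kernel
looks more attractive.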

