Date: Thu, 18 Jan 2024 09:26:14 +1100
From: Dave Chinner
To: Christian Brauner
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, linux-btrfs@vger.kernel.org,
    linux-block@vger.kernel.org, Matthew Wilcox, Jan Kara,
    Christoph Hellwig, adrianvovk@gmail.com
Subject: Re: [LSF/MM/BPF TOPIC] Dropping page cache of individual fs
References: <20240116-tagelang-zugnummer-349edd1b5792@brauner>
    <20240117-yuppie-unflexibel-dbbb281cb948@brauner>
In-Reply-To: <20240117-yuppie-unflexibel-dbbb281cb948@brauner>

On Wed, Jan 17, 2024 at 02:19:43PM +0100, Christian Brauner wrote:
> On Wed, Jan 17, 2024 at 07:56:01AM +1100, Dave Chinner wrote:
> > On Tue, Jan 16, 2024 at 11:50:32AM +0100, Christian Brauner wrote:
> > > Hey,
> > >
> > > I'm not sure this even needs a full LSFMM discussion but since I
> > > currently don't have time to work on the patch I may as well submit it.
> > >
> > > Gnome recently got awarded 1M Euro by the Sovereign Tech Fund (STF). The
> > > STF was created by the German government to fund public infrastructure:
> > >
> > > "The Sovereign Tech Fund supports the development, improvement and
> > > maintenance of open digital infrastructure.
> > > Our goal is to sustainably strengthen the open source ecosystem.
> > > We focus on security, resilience, technological diversity, and the
> > > people behind the code." (cf. [1])
> > >
> > > Gnome has proposed various specific projects including integrating
> > > systemd-homed with Gnome. Systemd-homed provides various features and if
> > > you're interested in details then you might find it useful to read [2].
> > > It makes use of various new VFS and fs specific developments over the
> > > last years.
> > >
> > > One feature is encrypting the home directory via LUKS. An appropriate
> > > image or device must contain a GPT partition table. Currently there's
> > > only one partition which is a LUKS2 volume. Inside that LUKS2 volume is
> > > a Linux filesystem. Currently supported are btrfs (see [4] though),
> > > ext4, and xfs.
> > >
> > > The following issue isn't specific to systemd-homed. Gnome wants to be
> > > able to support locking encrypted home directories. For example, when
> > > the laptop is suspended. To do this the luksSuspend command can be used.
> > >
> > > The luksSuspend call is nothing else than a device mapper ioctl to
> > > suspend the block device and its owning superblock/filesystem. Which in
> > > turn is nothing but a freeze initiated from the block layer:
> > >
> > > dm_suspend()
> > > -> __dm_suspend()
> > >    -> lock_fs()
> > >       -> bdev_freeze()
> > >
> > > So when we say luksSuspend we really mean block layer initiated freeze.
> > > The overall goal or expectation of userspace is that after a luksSuspend
> > > call all sensitive material has been evicted from relevant caches to
> > > harden against various attacks. And luksSuspend does wipe the encryption
> > > key and suspend the block device. However, the encryption key can still
> > > be available clear-text in the page cache.
> >
> > The wiping of secrets is completely orthogonal to the freezing of
> > the device and filesystem - the freeze does not need to occur to
> > allow the encryption keys and decrypted data to be purged. They
> > should not be conflated; purging needs to be a completely separate
> > operation that can be run regardless of device/fs freeze status.
>
> Yes, I'm aware. I didn't mean to imply that these things are in any way
> necessarily connected. Just that there are use-cases where they are. And
> the encrypted home directory case is one. Once one has frozen the block
> device and filesystem, one would now also like to drop the page cache,
> which has most of the interesting data.
>
> The fact that after a block layer initiated freeze - again mostly a
> device mapper problem - one may or may not be able to successfully read
> from the filesystem is annoying. Of course one can't write, that will
> hang one immediately. But if one still has some data in the page cache
> one can still dump the contents of that file. That's at least odd
> behavior from a user's POV even if for us it's clear why that's the
> case.

A frozen filesystem doesn't prevent read operations from occurring.

> And a freeze does do a sync_filesystem() and a sync_blockdev() to flush
> out any dirty data for that specific filesystem.

Yes, it's required to do that - the whole point of freezing a
filesystem is to bring the filesystem into a *consistent physical
state on persistent storage* and to hold it in that state until it
is thawed.

> So it would be fitting
> to give users an api that allows them to also drop the page cache
> contents.

Not as part of a freeze operation.

Read operations have *always* been allowed from frozen filesystems;
they are intended to be allowed because one of the use cases for
freezing is to create a consistent filesystem state for backup of
the filesystem.
That requires that everything in the filesystem can be read
whilst it is frozen, and that means the page cache needs to remain
operational.

What the underlying device allows when it has been *suspended* is a
different issue altogether. The key observation here is that storage
device suspend != filesystem freeze and they can have very different
semantics depending on the operation being performed on the block
device while it is suspended.

IOWs, a device suspend implementation might freeze the filesystem to
bring the contents of the storage device whilst frozen into a
consistent, up-to-date state (e.g. for device level backups), but block
device level suspend does not *require* that the filesystem is
frozen whilst the device IO operations are suspended.

> For some use-cases like the Gnome use-case one wants to do a freeze and
> drop everything that one can from the page cache for that specific
> filesystem.

So they have to do an extra system call between FS_IOC_FREEZE and
FS_IOC_THAW. What's the problem with that? What are you trying to
optimise by colliding cache purging with FS_IOC_FREEZE?

If the user/application/infrastructure already has to iterate all
the mounted filesystems to freeze them, then it's trivial for them
to add a cache purging step to that infrastructure for the storage
configurations that might need it.

I just don't see why this needs to be part of a block device freeze
operation, especially as the "purge caches on this filesystem"
operation has potential use cases outside of the luksSuspend
context....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com