From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Ackerley Tng <ackerleytng@google.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org, kvm@vger.kernel.org,
linux-kselftest@vger.kernel.org
Cc: akpm@linux-foundation.org, lorenzo.stoakes@oracle.com,
Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
surenb@google.com, mhocko@suse.com, willy@infradead.org,
pbonzini@redhat.com, shuah@kernel.org, seanjc@google.com,
shivankg@amd.com, rick.p.edgecombe@intel.com,
yan.y.zhao@intel.com, rientjes@google.com, fvdl@google.com,
jthoughton@google.com, vannapurve@google.com,
pratyush@kernel.org, pasha.tatashin@soleen.com,
kalyazin@amazon.com, tabba@google.com, michael.roth@amd.com
Subject: Re: [RFC PATCH v1 00/10] guest_memfd: Track amount of memory allocated on inode
Date: Tue, 24 Feb 2026 16:26:30 +0100 [thread overview]
Message-ID: <9ef9a0bd-4cff-4518-b7fb-e65c9b761a5a@kernel.org> (raw)
In-Reply-To: <CAEvNRgFF0+g9pmp1yitX48ebK=fDpYKSOQDmRfOjzSHxM5UpeQ@mail.gmail.com>
On 2/24/26 00:42, Ackerley Tng wrote:
> "David Hildenbrand (Arm)" <david@kernel.org> writes:
>
>> On 2/23/26 08:04, Ackerley Tng wrote:
>>> Hi,
>>>
>>> Currently, guest_memfd doesn't update inode's i_blocks or i_bytes at
>>> all. Hence, st_blocks in the struct populated by a userspace fstat()
>>> call on a guest_memfd will always be 0. This patch series makes
>>> guest_memfd track the amount of memory allocated on an inode, which
>>> allows fstat() to accurately report that on requests from userspace.
>>>
>>> The inode's i_blocks and i_bytes fields are updated when the folio is
>>> associated or disassociated from the guest_memfd inode, which are at
>>> allocation and truncation times respectively.
>>>
>>> To update inode fields at truncation time, this series implements a
>>> custom truncation function for guest_memfd. An alternative would be to
>>> update truncate_inode_pages_range() to return the number of bytes
>>> truncated or add/use some hook.
>>>
>>> Implementing a custom truncation function was chosen to provide
>>> flexibility for handling truncations in future when guest_memfd
>>> supports sources of pages other than the buddy allocator. This
>>> approach of a custom truncation function also aligns with shmem, which
>>> has a custom shmem_truncate_range().
>>
>> Just wondered how shmem does it: it's through
>> dquot_alloc_block_nodirty() / dquot_free_block_nodirty().
>>
>> It's a shame we can't just use folio_free().
>
> Yup, Hugh pointed out that struct address_space *mapping (and inode) may already
> have been freed by the time .free_folio() is called [1].
>
> [1] https://lore.kernel.org/all/7c2677e1-daf7-3b49-0a04-1efdf451379a@google.com/
>
>> Could we maybe have a
>> different callback (when the mapping is still guaranteed to be around)
>> from where we could update i_blocks on the freeing path?
>
> Do you mean that we should add a new callback to struct
> address_space_operations?
If that avoids having to implement truncation completely ourselves, that might be one
option we could discuss, yes.
Something like:
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 7c753148af88..94f8bb81f017 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -764,6 +764,7 @@ cache in your filesystem. The following members are defined:
sector_t (*bmap)(struct address_space *, sector_t);
void (*invalidate_folio) (struct folio *, size_t start, size_t len);
bool (*release_folio)(struct folio *, gfp_t);
+ void (*remove_folio)(struct folio *folio);
void (*free_folio)(struct folio *);
ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter);
int (*migrate_folio)(struct mapping *, struct folio *dst,
@@ -922,6 +923,11 @@ cache in your filesystem. The following members are defined:
its release_folio will need to ensure this. Possibly it can
clear the uptodate flag if it cannot free private data yet.
+``remove_folio``
+ remove_folio is called just before the folio is removed from the
+ page cache in order to allow the cleanup of properties (e.g.,
+ accounting) that needs the address_space mapping.
+
``free_folio``
free_folio is called once the folio is no longer visible in the
page cache in order to allow the cleanup of any private data.
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8b3dd145b25e..f7f6930977a1 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -422,6 +422,7 @@ struct address_space_operations {
sector_t (*bmap)(struct address_space *, sector_t);
void (*invalidate_folio) (struct folio *, size_t offset, size_t len);
bool (*release_folio)(struct folio *, gfp_t);
+ void (*remove_folio)(struct folio *folio);
void (*free_folio)(struct folio *folio);
ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter);
/*
diff --git a/mm/filemap.c b/mm/filemap.c
index 6cd7974d4ada..5a810eaacab2 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -250,8 +250,14 @@ void filemap_free_folio(struct address_space *mapping, struct folio *folio)
void filemap_remove_folio(struct folio *folio)
{
struct address_space *mapping = folio->mapping;
+ void (*remove_folio)(struct folio *);
BUG_ON(!folio_test_locked(folio));
+
+ remove_folio = mapping->a_ops->remove_folio;
+ if (unlikely(remove_folio))
+ remove_folio(folio);
+
spin_lock(&mapping->host->i_lock);
xa_lock_irq(&mapping->i_pages);
__filemap_remove_folio(folio, NULL);
Ideally we'd perform it under the lock just after clearing folio->mapping, but I guess that
might be more controversial.
For accounting you need the above might be good enough, but I am not sure for how many
other use cases there might be.
--
Cheers,
David
prev parent reply other threads:[~2026-02-24 15:26 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-23 7:04 Ackerley Tng
2026-02-23 7:04 ` [RFC PATCH v1 01/10] KVM: guest_memfd: Don't set FGP_ACCESSED when getting folios Ackerley Tng
2026-02-23 7:04 ` [RFC PATCH v1 02/10] KVM: guest_memfd: Directly allocate folios with filemap_alloc_folio() Ackerley Tng
2026-02-23 7:04 ` [RFC PATCH v1 03/10] mm: truncate: Expose preparation steps for truncate_inode_pages_final() Ackerley Tng
2026-02-23 7:04 ` [RFC PATCH v1 04/10] KVM: guest_memfd: Implement evict_inode for guest_memfd Ackerley Tng
2026-02-23 7:04 ` [RFC PATCH v1 05/10] mm: Export unmap_mapping_folio() for KVM Ackerley Tng
2026-02-23 7:04 ` [RFC PATCH v1 06/10] mm: filemap: Export filemap_remove_folio() Ackerley Tng
2026-02-23 7:04 ` [RFC PATCH v1 07/10] KVM: guest_memfd: Implement custom truncation function Ackerley Tng
2026-02-23 7:04 ` [RFC PATCH v1 08/10] KVM: guest_memfd: Track amount of memory allocated on inode Ackerley Tng
2026-02-23 7:04 ` [RFC PATCH v1 09/10] KVM: selftests: Wrap fstat() to assert success Ackerley Tng
2026-02-23 7:04 ` [RFC PATCH v1 10/10] KVM: selftests: Test that st_blocks is updated on allocation Ackerley Tng
2026-02-23 15:23 ` [RFC PATCH v1 00/10] guest_memfd: Track amount of memory allocated on inode David Hildenbrand (Arm)
2026-02-23 23:42 ` Ackerley Tng
2026-02-24 15:26 ` David Hildenbrand (Arm) [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9ef9a0bd-4cff-4518-b7fb-e65c9b761a5a@kernel.org \
--to=david@kernel.org \
--cc=Liam.Howlett@oracle.com \
--cc=ackerleytng@google.com \
--cc=akpm@linux-foundation.org \
--cc=fvdl@google.com \
--cc=jthoughton@google.com \
--cc=kalyazin@amazon.com \
--cc=kvm@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=mhocko@suse.com \
--cc=michael.roth@amd.com \
--cc=pasha.tatashin@soleen.com \
--cc=pbonzini@redhat.com \
--cc=pratyush@kernel.org \
--cc=rick.p.edgecombe@intel.com \
--cc=rientjes@google.com \
--cc=rppt@kernel.org \
--cc=seanjc@google.com \
--cc=shivankg@amd.com \
--cc=shuah@kernel.org \
--cc=surenb@google.com \
--cc=tabba@google.com \
--cc=vannapurve@google.com \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
--cc=yan.y.zhao@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox