From: Ryan Roberts <ryan.roberts@arm.com>
To: Matthew Wilcox <willy@infradead.org>,
Linux-MM <linux-mm@kvack.org>,
linux-fsdevel@vger.kernel.org
Subject: Large folios for overlayfs (and potential BUG)
Date: Tue, 11 Feb 2025 16:07:58 +0000
Message-ID: <aea64a67-e236-4606-a330-1d53fed45bf9@arm.com>
Hi Matthew,
I'm interested in enabling large folio support in overlayfs. I've been doing
some digging and it looked like it should JustWork already, but testing suggests
that's not the case. If so, I think there may be a bug relating to BS > PS
(block size > page size) when XFS is providing one of the layers.
So overlayfs has a "lower layer" and an "upper layer". Both are just directories
on other file systems. The lower layer is read-only. The upper layer contains
any created files as well as any modified files (when modified, the whole file
is "copied up" to the upper layer and then modified). The upper layer also
contains "white-outs": metadata describing a deleted file. The final view is
the merged view of these two layers.
Anyway, overlayfs creates/maintains its own inodes (and mappings), but delegates
IO (.read_iter/.write_iter) down to the "real" file on the real filesystem; one
of the 2 layers. overlayfs never calls mapping_set_large_folios() for its
mappings. But it also doesn't implement any of the mapping ops (except direct_IO).
overlayfs's read_iter() will delegate into the real file's read_iter() (via
backing_file_read_iter()). For XFS that means it will end up calling
generic_file_read_iter() to interact with the page cache which will use the
mapping ops for the real mapping (i.e. XFS). Since XFS should have called
mapping_set_large_folios(), we should get large folios, right?
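To make that expectation concrete, here is a toy user-space C model of the delegation. It is not kernel code: struct toy_mapping, struct toy_file and both helpers are hypothetical names invented for illustration, and the model only encodes the mental model described above (a read follows the chain down to the real file and uses that file's mapping). It says nothing about what the kernel actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an address_space: tracks only the large-folio flag. */
struct toy_mapping {
    bool large_folios;   /* models the effect of mapping_set_large_folios() */
};

/* Toy stand-in for a file: its own mapping plus an optional backing file. */
struct toy_file {
    struct toy_mapping *mapping;
    struct toy_file *real;   /* NULL for a file on a "real" filesystem */
};

/*
 * Under the mental model above, a read is delegated down to the real file,
 * so the page cache that gets populated belongs to the real file's mapping.
 */
const struct toy_mapping *toy_read_mapping(const struct toy_file *f)
{
    assert(f != NULL);
    while (f->real != NULL)
        f = f->real;
    return f->mapping;
}

/* In this model, only the real mapping's flag decides the folio size. */
bool toy_read_may_use_large_folios(const struct toy_file *f)
{
    return toy_read_mapping(f)->large_folios;
}
```

With an overlay file whose own mapping has the flag clear and whose real file's mapping has it set, toy_read_may_use_large_folios() returns true in this model, which is the expectation stated above.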
Except, testing this from user space shows the folios are small when coming from
overlayfs, backed by XFS. The same test case shows the folios are large when the
file is pulled directly from XFS.
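The email does not say how the user-space test was done. One possible approach, sketched below as an assumption rather than a description of the actual test, is to mmap the file, touch a page, look up its PFN in /proc/self/pagemap, and check the compound-page bits in /proc/kpageflags. The helper names are hypothetical; the bit positions follow the kernel's pagemap documentation. Reading PFNs and /proc/kpageflags needs root, and the offset arithmetic assumes a 64-bit off_t.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

#define PM_PRESENT        (UINT64_C(1) << 63)        /* pagemap: page present */
#define PM_PFN_MASK       ((UINT64_C(1) << 55) - 1)  /* pagemap: bits 0-54 = PFN */
#define KPF_COMPOUND_HEAD 15                         /* kpageflags bit numbers */
#define KPF_COMPOUND_TAIL 16

bool pagemap_present(uint64_t entry) { return (entry & PM_PRESENT) != 0; }
uint64_t pagemap_pfn(uint64_t entry) { return entry & PM_PFN_MASK; }

/* A page inside a large folio is either a compound head or a compound tail. */
bool kpf_in_large_folio(uint64_t flags)
{
    uint64_t mask = (UINT64_C(1) << KPF_COMPOUND_HEAD) |
                    (UINT64_C(1) << KPF_COMPOUND_TAIL);
    return (flags & mask) != 0;
}

/*
 * Returns 1 if the page mapped at addr is part of a large folio, 0 if it is
 * an order-0 page, and -1 if that cannot be determined (not root, page not
 * present, or a read failed). The page must already be faulted in.
 */
int addr_in_large_folio(const void *addr)
{
    long psz = sysconf(_SC_PAGESIZE);
    uint64_t entry = 0, flags = 0;
    int ret = -1;
    int pm, kf;

    assert(addr != NULL);
    pm = open("/proc/self/pagemap", O_RDONLY);
    kf = open("/proc/kpageflags", O_RDONLY);

    /* Both files are arrays of 64-bit words, indexed by virtual page / PFN. */
    if (pm >= 0 && kf >= 0 && psz > 0 &&
        pread(pm, &entry, sizeof(entry),
              (off_t)((uintptr_t)addr / (uintptr_t)psz * sizeof(entry)))
            == (ssize_t)sizeof(entry) &&
        pagemap_present(entry) && pagemap_pfn(entry) != 0 &&
        pread(kf, &flags, sizeof(flags),
              (off_t)(pagemap_pfn(entry) * sizeof(flags)))
            == (ssize_t)sizeof(flags))
        ret = kpf_in_large_folio(flags) ? 1 : 0;

    if (pm >= 0)
        close(pm);
    if (kf >= 0)
        close(kf);
    return ret;
}
```

Running the same check against a file opened through overlayfs and against the same file opened directly on XFS would be one way to reproduce the comparison described above.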
So I guess my understanding of this is wrong and for some reason we need to call
mapping_set_large_folios() for overlayfs's mapping, or do something else?
Although I don't really get why overlayfs's mapping would even be used...
But the fact that this doesn't all JustWork makes me concerned that this is all
broken for BS > PS? If the underlying FS requires all folios to be bigger than
order-0, but overlayfs is somehow fixing it so that all folios are order-0,
don't we have a mismatch?
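For reference, the BS > PS constraint can be put in numbers: a folio has to cover at least one filesystem block, so the smallest usable order is the log2 of block size over page size. The helper below is a hypothetical user-space sketch of that arithmetic, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Smallest folio order whose size (page_size << order) covers one block.
 * Returns 0 whenever the block is no bigger than a page.
 */
unsigned int min_folio_order(size_t block_size, size_t page_size)
{
    unsigned int order = 0;

    assert(page_size > 0);
    while ((page_size << order) < block_size)
        order++;
    return order;
}
```

For example, 16 KiB blocks on 4 KiB pages give a minimum order of 2, so an order-0 folio could not hold a whole block in that configuration.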
FWIW, ChatGPT was suggesting that mapping_set_large_folios() DOES need to be
called for the overlayfs mapping, but from code inspection I don't see why. If
needed, it also opens up the problem that if the file needs to be copied up to
the upper layer in future, we don't know if that can support large folios (or
more generally we don't know if there is a single configuration that can be
supported by both layers). So it would suggest a need to change the large folio
configuration on an active mapping, which I don't think is currently allowed.
Anyway, from my rambling, you can probably tell there are a bunch of holes in my
mental model. Any clarifications/suggestions you have would be gratefully received!
Thanks
Ryan