From: Matthew Wilcox <willy@infradead.org>
To: David Frank <david@davidfrank.ch>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: Efficient mapping of sparse file holes to zero-pages
Date: Thu, 20 Feb 2025 13:47:46 +0000
Message-ID: <Z7cygtpjGDJadgg0@casper.infradead.org>
In-Reply-To: <CAOR27cSr9yxodkctfp-Yjybh1NsKBeSkhdbZYeK7O5M87PfEYw@mail.gmail.com>
On Thu, Feb 20, 2025 at 01:48:18PM +0100, David Frank wrote:
> I'd like to efficiently mmap a large sparse file (ext4), 95% of which
> is holes. I was unsatisfied with the performance and after profiling,
> I found that most of the time is spent in filemap_add_folio and
> filemap_alloc_folio - much more than in my algorithm:
>
> - 97.87% filemap_fault
> - 97.57% do_sync_mmap_readahead
> - page_cache_ra_order
> - 97.28% page_cache_ra_unbounded
> - 40.80% filemap_add_folio
> + 21.93% __filemap_add_folio
> + 8.88% folio_add_lru
> + 7.56% workingset_refault
> + 28.73% filemap_alloc_folio
> + 22.34% read_pages
> + 3.29% xa_load

Yes, this is expected.

The fundamental problem is that we don't have the sparseness information
at the right point. So the read request (or pagefault) comes in, the
VFS allocates a page, puts it in the pagecache, then asks the filesystem
to fill it. The filesystem knows, so could theoretically tell the VFS
"Oh, this is a hole", but by this point the "damage" is done -- the page
has been allocated and added to the page cache.

Of course, this is a soluble problem. The VFS could ask the filesystem
for its sparseness information (as you do in userspace), but unlike your
particular use case, the kernel must handle attackers who are trying to
make it do the wrong thing as well as ill-timed writes. So the VFS has
to ensure it does not use stale data from the filesystem.
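
For readers following along, a minimal sketch of what asking for
sparseness information from userspace can look like, assuming a Linux
filesystem that supports SEEK_DATA and SEEK_HOLE. The helper name
data_extents is hypothetical, and this is not necessarily how the
original poster does it. Note that the result is only a snapshot: a
concurrent write or truncate can invalidate it, which is exactly the
staleness problem described above.

```python
import errno
import os


def data_extents(fd, size):
    """Return [(start, end), ...] byte ranges the filesystem reports as data.

    Hypothetical helper for illustration. The answer is a snapshot and may
    be stale as soon as another process writes to or truncates the file.
    """
    extents = []
    offset = 0
    while offset < size:
        try:
            # Find the next offset >= `offset` that contains data.
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:
                # No data at or after `offset`: the rest is a hole.
                break
            raise
        if start >= size:
            break
        # Find the end of this data run. There is always an implicit
        # hole at end of file, so this succeeds for any valid `start`.
        end = min(os.lseek(fd, start, os.SEEK_HOLE), size)
        extents.append((start, end))
        offset = end
    return extents
```

A caller could then touch only the reported data ranges and treat
everything else as zeros without faulting it in. On a filesystem without
hole support, the whole file is simply reported as one data range.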

This is a problem I'm somewhat interested in solving, but I'm a bit
busy with folios right now. And once that project is done, improving
the page cache for reflinked files is next on my list, so I'm not likely
to get to this problem for a few years.