From: Yafang Shao <laoar.shao@gmail.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH] mm: Add large folio support for async readahead
Date: Tue, 5 Nov 2024 11:01:01 +0800 [thread overview]
Message-ID: <CALOAHbDMUqGnv3veo2CwLmH=cnyt4kWTPKLiX+fq20xfViW5=Q@mail.gmail.com> (raw)
In-Reply-To: <Zyj9qGEEIS5xRn2t@casper.infradead.org>
On Tue, Nov 5, 2024 at 1:00 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, Nov 04, 2024 at 10:30:15PM +0800, Yafang Shao wrote:
> > When testing large folio support with XFS on our servers, we observed that
> > only a few large folios are mapped when reading large files via mmap. This
> > behavior occurs because large folio support is currently implemented only
> > for sync readahead, not for async readahead. Consequently, while the
> > first filemap fault may map to a large folio, subsequent filemap faults are
> > mapped to regular folios. This can be verified with a simple test case, as
> > shown below:
>
> I awear I tested this and it worked at the time. Here's what's supposed
> to happen:
>
> You touch the first byte of the mapping, and there's no page in the page
> cache. So we go into the sync readahead path, and force it to read two
> PMDs. From your email, I assume this is succeeding. The readahead
> flag gets set on the second PMD so that when we get to 2MB, we go into
> the 'time to do more readahead' path, ie the 'do_async_mmap_readahead'
> function you patch below.
>
> Now, page_cache_async_ra() does this:
>
> unsigned int order = folio_order(folio);
> ...
> if (index == expected) {
> ra->start += ra->size;
> ra->size = get_next_ra_size(ra, max_pages);
> ra->async_size = ra->size;
> goto readit;
> }
> readit:
> ractl->_index = ra->start;
> page_cache_ra_order(ractl, ra, order);
>
> So it should already be doing readahead of PMD size. Can you dig into
> what's going wrong, now that you understand that this is supposed to
> work already?
Upon further analysis, it appears that the behavior depends on the
/sys/block/*/queue/read_ahead_kb setting. On my test server, this
parameter is set to 128KB. For the current approach to work without
modification, it would need to be 2MB-aligned.
However, I believe MADV_HUGEPAGE behavior should not be dependent on
the value of read_ahead_kb. It would be more robust if the kernel
automatically aligned it to 2MB when MADV_HUGEPAGE is enabled. The
following changes ensure compatibility with non-2MB-aligned
read_ahead_kb values:
diff --git a/mm/readahead.c b/mm/readahead.c
index 3dc6c7a128dd..c024245497cc 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -641,9 +641,9 @@ void page_cache_async_ra(struct readahead_control *ractl,
expected = round_down(ra->start + ra->size - ra->async_size,
1UL << order);
if (index == expected) {
- ra->start += ra->size;
- ra->size = get_next_ra_size(ra, max_pages);
- ra->async_size = ra->size;
+ ra->start += ALIGN(ra->size, HPAGE_PMD_NR);
+ ra->size = ALIGN(get_next_ra_size(ra, max_pages), HPAGE_PMD_NR);
+ ra->async_size = ALIGN(ra->size, HPAGE_PMD_NR);
goto readit;
}
In addition, adjustments may be needed in the sync readahead path to
allow MADV_HUGEPAGE readahead sizes to be tunable based on
read_ahead_kb. I will continue to analyze this and submit a revised
version accordingly.
--
Regards
Yafang
next prev parent reply other threads:[~2024-11-05 3:01 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-04 14:30 Yafang Shao
2024-11-04 17:00 ` Matthew Wilcox
2024-11-05 3:01 ` Yafang Shao [this message]
2024-11-05 4:21 ` Matthew Wilcox
2024-11-05 5:48 ` Yafang Shao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CALOAHbDMUqGnv3veo2CwLmH=cnyt4kWTPKLiX+fq20xfViW5=Q@mail.gmail.com' \
--to=laoar.shao@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=linux-mm@kvack.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox