From: David Hildenbrand <david@redhat.com>
To: Yafang Shao <laoar.shao@gmail.com>,
willy@infradead.org, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, stable@vger.kernel.org
Subject: Re: [PATCH v2] mm/readahead: Fix large folio support in async readahead
Date: Mon, 11 Nov 2024 11:33:38 +0100
Message-ID: <85cfc467-320f-4388-b027-2cbad85dfbed@redhat.com>
In-Reply-To: <20241108141710.9721-1-laoar.shao@gmail.com>
On 08.11.24 15:17, Yafang Shao wrote:
> When testing large folio support with XFS on our servers, we observed that
> only a few large folios are mapped when reading large files via mmap.
> After a thorough analysis, I identified it was caused by the
> `/sys/block/*/queue/read_ahead_kb` setting. On our test servers, this
> parameter is set to 128KB. After I tuned it to 2MB, large folios worked
> as expected. However, I believe the large folio behavior should not
> depend on the value of read_ahead_kb. It would be more robust if the
> kernel could automatically adapt to it.
Now I am extremely confused.
Documentation/ABI/stable/sysfs-block:
"[RW] Maximum number of kilobytes to read-ahead for filesystems on this
block device."
So, with your patch, will we also be changing the readahead size to
exceed that, or will we simply allocate larger folios without exceeding
the readahead size (e.g., leaving them partially unfilled)?
If you're also changing the readahead behavior to exceed the
configuration parameter, it would sound to me like "I am pushing the
brake pedal and my car brakes; fix the brakes to adapt whether to brake
automatically" :)
Likely I am missing something here, in particular how the read_ahead_kb
parameter is used after your patch.
>
> With /sys/block/*/queue/read_ahead_kb set to 128KB and performing a
> sequential read on a 1GB file using MADV_HUGEPAGE, the differences in
> /proc/meminfo are as follows:
>
> - before this patch
> FileHugePages: 18432 kB
> FilePmdMapped: 4096 kB
>
> - after this patch
> FileHugePages: 1067008 kB
> FilePmdMapped: 1048576 kB
>
> This shows that after applying the patch, the entire 1GB file is mapped to
> huge pages. The stable list is CCed, as without this patch, large folios
> don’t function optimally in the readahead path.
>
> It's worth noting that if read_ahead_kb is set to a larger value that isn't
> aligned with huge page sizes (e.g., 4MB + 128KB), it may still fail to map
> to hugepages.
>
> Fixes: 4687fdbb805a ("mm/filemap: Support VM_HUGEPAGE for file mappings")
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> Cc: stable@vger.kernel.org
>
> ---
> mm/readahead.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> Changes:
> v1->v2:
> - Drop the align (Matthew)
> - Improve commit log (Andrew)
>
> RFC->v1: https://lore.kernel.org/linux-mm/20241106092114.8408-1-laoar.shao@gmail.com/
> - Simplify the code as suggested by Matthew
>
> RFC: https://lore.kernel.org/linux-mm/20241104143015.34684-1-laoar.shao@gmail.com/
>
> diff --git a/mm/readahead.c b/mm/readahead.c
> index 3dc6c7a128dd..9b8a48e736c6 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -385,6 +385,8 @@ static unsigned long get_next_ra_size(struct file_ra_state *ra,
>  		return 4 * cur;
>  	if (cur <= max / 2)
>  		return 2 * cur;
> +	if (cur > max)
> +		return cur;
>  	return max;
Maybe something like
	return max_t(unsigned long, cur, max);
might be more readable (likely "max()" cannot be used because of the
local variable name "max" ...).
... but it's rather weird having a "max" and then returning something
larger than the "max" ... especially with code like
"ra->size = get_next_ra_size(ra, max_pages);"
Maybe we can improve that by renaming "max_pages" / "max" to what it
actually is supposed to be (which I haven't quite understood yet).
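
To make the comparison concrete, here is a minimal userspace sketch of
the two variants, not the kernel code itself. Assumptions: "cur" is
passed in directly instead of being read from ra->size, the first
"cur < max / 16" check is taken from my memory of mm/readahead.c (it is
not visible in the quoted hunk), and max_t() is a simplified stand-in
for the kernel macro. The function names are hypothetical.

```c
/* Simplified userspace stand-in for the kernel's max_t(); assumption. */
#define max_t(type, a, b) ((type)(a) > (type)(b) ? (type)(a) : (type)(b))

/* Variant from the patch: keep "cur" when it already exceeds "max". */
unsigned long next_ra_size_patched(unsigned long cur, unsigned long max)
{
	if (cur < max / 16)	/* assumed; not shown in the quoted hunk */
		return 4 * cur;
	if (cur <= max / 2)
		return 2 * cur;
	if (cur > max)
		return cur;
	return max;
}

/* Variant suggested above: fold the new check into the final return. */
unsigned long next_ra_size_max_t(unsigned long cur, unsigned long max)
{
	if (cur < max / 16)	/* assumed; not shown in the quoted hunk */
		return 4 * cur;
	if (cur <= max / 2)
		return 2 * cur;
	return max_t(unsigned long, cur, max);
}
```

For example, assuming 4KB pages, read_ahead_kb = 128KB gives max = 32
pages and a 2MB window gives cur = 512 pages. Both variants then return
512, so the window stays at 2MB instead of being clamped to 32 pages,
which is exactly the "returning something larger than the max" case.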
--
Cheers,
David / dhildenb