linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Yafang Shao <laoar.shao@gmail.com>
To: willy@infradead.org, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, Yafang Shao <laoar.shao@gmail.com>
Subject: [PATCH] mm/readahead: Fix large folio support in async readahead
Date: Wed,  6 Nov 2024 17:21:14 +0800	[thread overview]
Message-ID: <20241106092114.8408-1-laoar.shao@gmail.com> (raw)

When testing large folio support with XFS on our servers, we observed that
only a few large folios are mapped when reading large files via mmap.
After a thorough analysis, I identified it was caused by the
`/sys/block/*/queue/read_ahead_kb` setting. On our test servers, this
parameter is set to 128KB. After I tune it to 2MB, the large folio can
work as expected. However, I believe the large folio behavior should not be
dependent on the value of read_ahead_kb. It would be more robust if the
kernel can automatically adopt to it.

With `/sys/block/*/queue/read_ahead_kb` set to a non-2MB aligned size,
this issue can be verified with a simple test case, as shown below:

      #define LEN (1024 * 1024 * 1024) // 1GB file
      int main(int argc, char *argv[])
      {
          char *addr;
          int fd, i;

          fd = open("data", O_RDWR);
          if (fd < 0) {
              perror("open");
              exit(-1);
          }

          addr = mmap(NULL, LEN, PROT_READ|PROT_WRITE,
                      MAP_SHARED, fd, 0);
          if (addr == MAP_FAILED) {
              perror("mmap");
              exit(-1);
          }

          if (madvise(addr, LEN, MADV_HUGEPAGE)) {
              perror("madvise");
              exit(-1);
          }

          for (i = 0; i < LEN / 4096; i++)
                memset(addr + i * 4096, 1, 1);

          while (1) {} // Verifiable with /proc/meminfo

          munmap(addr, LEN);
          close(fd);
          exit(0);
      }

When large folio support is enabled and read_ahead_kb is set to a smaller
value, ra->size (4MB) may exceed the maximum allowed size (e.g., 128KB). To
address this, we need to add a conditional check for such cases. However,
this alone is insufficient, as users might set read_ahead_kb to a larger,
non-hugepage-aligned value (e.g., 4MB + 128KB). In these instances, it is
essential to explicitly align ra->size with the hugepage size.

Fixes: 4687fdbb805a ("mm/filemap: Support VM_HUGEPAGE for file mappings")
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
---
 mm/readahead.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Changes: 
RFC->v1:
- Simplify the code as suggested by Matthew

RFC: https://lore.kernel.org/linux-mm/20241104143015.34684-1-laoar.shao@gmail.com/

diff --git a/mm/readahead.c b/mm/readahead.c
index 3dc6c7a128dd..9e2c6168ebfa 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -385,6 +385,8 @@ static unsigned long get_next_ra_size(struct file_ra_state *ra,
 		return 4 * cur;
 	if (cur <= max / 2)
 		return 2 * cur;
+	if (cur > max)
+		return cur;
 	return max;
 }
 
@@ -642,7 +644,7 @@ void page_cache_async_ra(struct readahead_control *ractl,
 			1UL << order);
 	if (index == expected) {
 		ra->start += ra->size;
-		ra->size = get_next_ra_size(ra, max_pages);
+		ra->size = ALIGN(get_next_ra_size(ra, max_pages), 1 << order);
 		ra->async_size = ra->size;
 		goto readit;
 	}
-- 
2.43.5



             reply	other threads:[~2024-11-06  9:21 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-06  9:21 Yafang Shao [this message]
2024-11-06 21:03 ` Andrew Morton
2024-11-07  3:39   ` Yafang Shao
2024-11-07  4:06     ` Andrew Morton
2024-11-07  6:01       ` Yafang Shao
2024-11-07  4:52 ` Matthew Wilcox
2024-11-07  5:55   ` Yafang Shao
2024-11-07 15:00     ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241106092114.8408-1-laoar.shao@gmail.com \
    --to=laoar.shao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-mm@kvack.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox