linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Yafang Shao <laoar.shao@gmail.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm/readahead: Fix large folio support in async readahead
Date: Thu, 7 Nov 2024 13:55:52 +0800	[thread overview]
Message-ID: <CALOAHbAYx-L2dH_97XhD1HTMwevBNvJrxNxa82qfKzwtKoOQBQ@mail.gmail.com> (raw)
In-Reply-To: <ZyxHc5Uukh47CO2R@casper.infradead.org>

On Thu, Nov 7, 2024 at 12:52 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Wed, Nov 06, 2024 at 05:21:14PM +0800, Yafang Shao wrote:
> > When testing large folio support with XFS on our servers, we observed that
> > only a few large folios are mapped when reading large files via mmap.
> > After a thorough analysis, I identified it was caused by the
> > `/sys/block/*/queue/read_ahead_kb` setting. On our test servers, this
> > parameter is set to 128KB. After I tune it to 2MB, the large folio can
> > work as expected. However, I believe the large folio behavior should not be
> > dependent on the value of read_ahead_kb. It would be more robust if the
> > kernel can automatically adopt to it.
> >
> > With `/sys/block/*/queue/read_ahead_kb` set to a non-2MB aligned size,
> > this issue can be verified with a simple test case, as shown below:
>
> I don't like to see these programs in commit messages.  If it's a
> valuable program, it should go into tools/testing.  If not, it shouldn't
> be in the commit message.

I just want to make it easy for others to verify this change. I'm okay
with dropping this program.

>
> > When large folio support is enabled and read_ahead_kb is set to a smaller
> > value, ra->size (4MB) may exceed the maximum allowed size (e.g., 128KB). To
> > address this, we need to add a conditional check for such cases. However,
> > this alone is insufficient, as users might set read_ahead_kb to a larger,
> > non-hugepage-aligned value (e.g., 4MB + 128KB). In these instances, it is
> > essential to explicitly align ra->size with the hugepage size.
>
> I wish you'd discussed this in the earlier thread instead of just
> smashing it into this patch.  Because your solution is wrong.
>
> > @@ -642,7 +644,7 @@ void page_cache_async_ra(struct readahead_control *ractl,
> >                       1UL << order);
> >       if (index == expected) {
> >               ra->start += ra->size;
> > -             ra->size = get_next_ra_size(ra, max_pages);
> > +             ra->size = ALIGN(get_next_ra_size(ra, max_pages), 1 << order);
>
> Let's suppose that someone sets read_ahead_kb to 192kB.  If the previous
> readahead did 128kB, we now try to align that to 128kB, so we'll readahead
> 256kB which is larger than max.  We were only intending to breach the
> 'max' for the MADV_HUGE case, not for all cases.

In the non-MADV_HUGE case, the order can only be 0, correct?

>
> Honestly, I don't know if we should try to defend a stupid sysadmin
> against the consequences of their misconfiguration like this.  I'd be in
> favour of getting rid of the configuration knob entirely (or just ignoring
> what the sysadmin set it to), but if we do that, we need to replace it
> with something that can automatically figure out what the correct setting
> for the readahead_max_kb should be (which is probably a function of the
> bandwidth, latency and seek time of the underlying device).
>
> But that's obviously not part of this patch.  I'd be in favour of just
> dropping this ALIGN and leaving just the first hunk of the patch.

I’m okay with removing this ALIGN, since we won’t be setting a large
read_ahead_kb value in our production environment. ;)

-- 
Regards
Yafang


  reply	other threads:[~2024-11-07  5:56 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-06  9:21 Yafang Shao
2024-11-06 21:03 ` Andrew Morton
2024-11-07  3:39   ` Yafang Shao
2024-11-07  4:06     ` Andrew Morton
2024-11-07  6:01       ` Yafang Shao
2024-11-07  4:52 ` Matthew Wilcox
2024-11-07  5:55   ` Yafang Shao [this message]
2024-11-07 15:00     ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALOAHbAYx-L2dH_97XhD1HTMwevBNvJrxNxa82qfKzwtKoOQBQ@mail.gmail.com \
    --to=laoar.shao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-mm@kvack.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox