From: Johannes Weiner <hannes@cmpxchg.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Mel Gorman <mgorman@techsingularity.net>, Zi Yan <ziy@nvidia.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/5] mm: compaction: push watermark into compaction_suitable() callers
Date: Thu, 10 Apr 2025 16:17:18 -0400
Message-ID: <20250410201718.GA366747@cmpxchg.org>
In-Reply-To: <2025de6c-a25b-42f2-8ff2-da2bad0e0aa0@suse.cz>

On Thu, Apr 10, 2025 at 05:19:06PM +0200, Vlastimil Babka wrote:
> On 3/13/25 22:05, Johannes Weiner wrote:
> > compaction_suitable() hardcodes the min watermark, with a boost to the
> > low watermark for costly orders. However, compaction_ready() requires
> > order-0 at the high watermark. It currently checks the marks twice.
> >
> > Make the watermark a parameter to compaction_suitable() and have the
> > callers pass in what they require:
> >
> > - compaction_zonelist_suitable() is used by the direct reclaim path,
> > so use the min watermark.
> >
> > - compaction_suit_allocation_order() has a watermark in context derived
> > from cc->alloc_flags.
> >
> > The only quirk is that kcompactd doesn't initialize cc->alloc_flags
> > explicitly. There is a direct check in kcompactd_do_work() that
> > passes ALLOC_WMARK_MIN, but there is another check downstack in
> > compact_zone() that ends up passing the unset alloc_flags. Since
> > they default to 0, and that coincides with ALLOC_WMARK_MIN, it is
> > correct. But it's subtle. Set cc->alloc_flags explicitly.
> >
> > - should_continue_reclaim() is direct reclaim, use the min watermark.
> >
> > - Finally, consolidate the two checks in compaction_ready() to a
> > single compaction_suitable() call passing the high watermark.
> >
> > There is a tiny change in behavior: before, compaction_suitable()
> > would check order-0 against min or low, depending on costly
> > order. Then there'd be another high watermark check.
> >
> > Now, the high watermark is passed to compaction_suitable(), and the
> > costly order-boost (low - min) is added on top. This means
> > compaction_ready() sets a marginally higher target for free pages.
> >
> > In a kernelbuild + THP pressure test, though, this didn't show any
> > measurable negative effects on memory pressure or reclaim rates. As
> > the comment above the check says, reclaim is usually stopped short
> > on should_continue_reclaim(), and this just defines the worst-case
> > reclaim cutoff in case compaction is not making any headway.
> >
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>
> <snip>
>
> > @@ -2513,13 +2516,13 @@ compaction_suit_allocation_order(struct zone *zone, unsigned int order,
> > */
> > if (order > PAGE_ALLOC_COSTLY_ORDER && async &&
> > !(alloc_flags & ALLOC_CMA)) {
> > - watermark = low_wmark_pages(zone) + compact_gap(order);
> > - if (!__zone_watermark_ok(zone, 0, watermark, highest_zoneidx,
> > - 0, zone_page_state(zone, NR_FREE_PAGES)))
> > + if (!__zone_watermark_ok(zone, 0, watermark + compact_gap(order),
> > + highest_zoneidx, 0,
> > + zone_page_state(zone, NR_FREE_PAGES)))
> > return COMPACT_SKIPPED;
>
> The watermark here is no longer recalculated as low_wmark_pages() but the
> value from above based on alloc_flags is reused.
> It's probably ok, maybe it's even more correct, just wasn't mentioned in the
> changelog as another tiny change of behavior so I wanted to point it out.

Ah yes, it would have made sense to point out.

I was wondering about this check. It was introduced to bail on
compaction if there are not enough free non-CMA pages. But if there
are, we still fall through and check the superset of regular + CMA
pages against the watermarks as well. We know this will succeed, so
this seems moot.
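
For reference, the flow in compaction_suit_allocation_order() after this
patch is roughly the following - I'm paraphrasing from memory here, so
take the details with a grain of salt:

	/* Succeed right away if the requested watermark is already met */
	watermark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
	if (zone_watermark_ok(zone, order, watermark, highest_zoneidx,
			      alloc_flags))
		return COMPACT_SUCCESS;

	/* The check in question: bail if free non-CMA memory is short */
	if (order > PAGE_ALLOC_COSTLY_ORDER && async &&
	    !(alloc_flags & ALLOC_CMA)) {
		if (!__zone_watermark_ok(zone, 0, watermark + compact_gap(order),
					 highest_zoneidx, 0,
					 zone_page_state(zone, NR_FREE_PAGES)))
			return COMPACT_SKIPPED;
	}

	/* ... and then free pages *including* CMA are checked as well */
	if (!compaction_suitable(zone, order, watermark, highest_zoneidx))
		return COMPACT_SKIPPED;

	return COMPACT_CONTINUE;
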
It's also a little odd that compaction_suitable() hardcodes ALLOC_CMA
with the explanation that "CMA are migration targets", but then this
check says "actually, it doesn't help us if blocks are formed in CMA".
Does it make more sense to plumb alloc_flags to compaction_suitable()?
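
Something like the below is what I have in mind - a rough, untested
sketch just to illustrate the direction (costly-order boost left out
for brevity), not an actual proposal:

static bool __compaction_suitable(struct zone *zone, int order,
				  unsigned long watermark,
				  int highest_zoneidx,
				  unsigned int alloc_flags,
				  unsigned long free_pages)
{
	/*
	 * Let the caller's alloc_flags decide whether CMA free pages
	 * count as migration targets, instead of hardcoding ALLOC_CMA.
	 */
	watermark += compact_gap(order);
	return __zone_watermark_ok(zone, 0, watermark, highest_zoneidx,
				   alloc_flags & ALLOC_CMA, free_pages);
}

compaction_suit_allocation_order() could then pass its own alloc_flags
down, at which point the separate non-CMA check above might not be
needed at all.
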
There is more head-scratching, though. The check is meant to test
whether compaction has a chance of forming non-CMA blocks. But free
pages are targets. You could have plenty of free non-CMA memory that
just isn't contiguous - compaction will then form blocks in CMA by
migrating occupied CMA pages into those free non-CMA targets.

The longer I look at this, the more I feel like this just hard-coded
the very specific scenario the patch author had a problem with: CMA is
massive. The page allocator fills up regular memory first. Once
regular memory is full, non-CMA requests stall on compaction making
CMA blocks. So just bail on compaction then.

It's a valid problem, but I don't see how this code makes any general
sense outside of this exact sequence of events. Especially once
compaction has moved stuff around between regular and CMA memory, the
issue will be back, and the check does nothing to prevent it.

Thread overview: 32+ messages
2025-03-13 21:05 [PATCH 0/5] mm: reliable huge page allocator Johannes Weiner
2025-03-13 21:05 ` [PATCH 1/5] mm: compaction: push watermark into compaction_suitable() callers Johannes Weiner
2025-03-14 15:08 ` Zi Yan
2025-03-16 4:28 ` Hugh Dickins
2025-03-17 18:18 ` Johannes Weiner
2025-03-21 6:21 ` kernel test robot
2025-03-21 13:55 ` Johannes Weiner
2025-04-10 15:19 ` Vlastimil Babka
2025-04-10 20:17 ` Johannes Weiner [this message]
2025-04-11 7:32 ` Vlastimil Babka
2025-03-13 21:05 ` [PATCH 2/5] mm: page_alloc: trace type pollution from compaction capturing Johannes Weiner
2025-03-14 18:36 ` Zi Yan
2025-03-13 21:05 ` [PATCH 3/5] mm: page_alloc: defrag_mode Johannes Weiner
2025-03-14 18:54 ` Zi Yan
2025-03-14 20:50 ` Johannes Weiner
2025-03-14 22:54 ` Zi Yan
2025-03-22 15:05 ` Brendan Jackman
2025-03-23 0:58 ` Johannes Weiner
2025-03-23 1:34 ` Johannes Weiner
2025-03-23 3:46 ` Johannes Weiner
2025-03-23 18:04 ` Brendan Jackman
2025-03-31 15:55 ` Johannes Weiner
2025-03-13 21:05 ` [PATCH 4/5] mm: page_alloc: defrag_mode kswapd/kcompactd assistance Johannes Weiner
2025-03-13 21:05 ` [PATCH 5/5] mm: page_alloc: defrag_mode kswapd/kcompactd watermarks Johannes Weiner
2025-03-14 21:05 ` Johannes Weiner
2025-04-11 8:19 ` Vlastimil Babka
2025-04-11 15:39 ` Johannes Weiner
2025-04-11 16:51 ` Vlastimil Babka
2025-04-11 18:21 ` Johannes Weiner
2025-04-13 2:20 ` Johannes Weiner
2025-04-15 7:31 ` Vlastimil Babka
2025-04-15 7:44 ` Vlastimil Babka