From: Johannes Weiner <hannes@cmpxchg.org>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: linux-mm@kvack.org, Kaiyang Zhao <kaiyang2@cs.cmu.edu>,
Mel Gorman <mgorman@techsingularity.net>,
Vlastimil Babka <vbabka@suse.cz>,
David Rientjes <rientjes@google.com>,
linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [RFC PATCH 20/26] mm: vmscan: use compaction_suitable() check in kswapd
Date: Wed, 26 Apr 2023 11:22:06 -0400 [thread overview]
Message-ID: <20230426152206.GA30064@cmpxchg.org> (raw)
In-Reply-To: <87o7nbe8gg.fsf@yhuang6-desk2.ccr.corp.intel.com>
On Wed, Apr 26, 2023 at 09:30:23AM +0800, Huang, Ying wrote:
> Johannes Weiner <hannes@cmpxchg.org> writes:
>
> > On Tue, Apr 25, 2023 at 11:12:28AM +0800, Huang, Ying wrote:
> >> Johannes Weiner <hannes@cmpxchg.org> writes:
> >>
> >> > Kswapd currently bails on higher-order allocations with an open-coded
> >> > check for whether it's reclaimed the compaction gap.
> >> >
> >> > compaction_suitable() is the customary interface to coordinate reclaim
> >> > with compaction.
> >> >
> >> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> >> > ---
> >> > mm/vmscan.c | 67 ++++++++++++++++++-----------------------------------
> >> > 1 file changed, 23 insertions(+), 44 deletions(-)
> >> >
> >> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> >> > index ee8c8ca2e7b5..723705b9e4d9 100644
> >> > --- a/mm/vmscan.c
> >> > +++ b/mm/vmscan.c
> >> > @@ -6872,12 +6872,18 @@ static bool pgdat_balanced(pg_data_t *pgdat, int order, int highest_zoneidx)
> >> > if (!managed_zone(zone))
> >> > continue;
> >> >
> >> > + /* Allocation can succeed in any zone, done */
> >> > if (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING)
> >> > mark = wmark_pages(zone, WMARK_PROMO);
> >> > else
> >> > mark = high_wmark_pages(zone);
> >> > if (zone_watermark_ok_safe(zone, order, mark, highest_zoneidx))
> >> > return true;
> >> > +
> >> > + /* Allocation can't succeed, but enough order-0 to compact */
> >> > + if (compaction_suitable(zone, order,
> >> > + highest_zoneidx) == COMPACT_CONTINUE)
> >> > + return true;
> >>
> >> Should we check the following first?
> >>
> >> order > 0 && zone_watermark_ok_safe(zone, 0, mark, highest_zoneidx)
> >
> > That's what compaction_suitable() does. It checks whether there are
> > enough migration targets for compaction (COMPACT_CONTINUE) or whether
> > reclaim needs to do some more work (COMPACT_SKIPPED).
>
> Yes. And I found that the watermark used in compaction_suitable() is
> low_wmark_pages() or min_wmark_pages(), which doesn't match the
> watermark here. Or did I miss something?
Ahh, you're right, kswapd will bail prematurely. Compaction cannot
reliably meet the high watermark with a low watermark scratch space.
I'll add the order check before the suitable test, for clarity, and so
that order-0 requests don't check the same thing twice.
For the watermark, I'd make it an arg to compaction_suitable() and use
whatever the reclaimer targets (high for kswapd, min for direct).
However, there is a minor snag. compaction_suitable() currently has
its own smarts regarding the watermark:
/*
* Watermarks for order-0 must be met for compaction to be able to
* isolate free pages for migration targets. This means that the
* watermark and alloc_flags have to match, or be more pessimistic than
* the check in __isolate_free_page(). We don't use the direct
* compactor's alloc_flags, as they are not relevant for freepage
* isolation. We however do use the direct compactor's highest_zoneidx
* to skip over zones where lowmem reserves would prevent allocation
* even if compaction succeeds.
* For costly orders, we require low watermark instead of min for
* compaction to proceed to increase its chances.
* ALLOC_CMA is used, as pages in CMA pageblocks are considered
* suitable migration targets
*/
watermark = (order > PAGE_ALLOC_COSTLY_ORDER) ?
low_wmark_pages(zone) : min_wmark_pages(zone);
Historically it has always checked low instead of min. Then Vlastimil
changed it to min for non-costly orders here:
commit 8348faf91f56371d4bada6fc5915e19580a15ffe
Author: Vlastimil Babka <vbabka@suse.cz>
Date: Fri Oct 7 16:58:00 2016 -0700
mm, compaction: require only min watermarks for non-costly orders
The __compaction_suitable() function checks the low watermark plus a
compact_gap() gap to decide if there's enough free memory to perform
compaction. Then __isolate_free_page uses low watermark check to decide
if particular free page can be isolated. In the latter case, using low
watermark is needlessly pessimistic, as the free page isolations are
only temporary. For __compaction_suitable() the higher watermark makes
sense for high-order allocations where more freepages increase the
chance of success, and we can typically fail with some order-0 fallback
when the system is struggling to reach that watermark. But for
low-order allocation, forming the page should not be that hard. So
using low watermark here might just prevent compaction from even trying,
and eventually lead to OOM killer even if we are above min watermarks.
So after this patch, we use min watermark for non-costly orders in
__compaction_suitable(), and for all orders in __isolate_free_page().
Lowering to min wasn't an issue for non-costly, but AFAICS there was
no explicit testing for whether min would work for costly orders too.
I'd propose trying it with min even for costly and see what happens.
If it does regress, a better place to boost scratch space for costly
orders might be compact_gap(), so I'd move it there.
Does that sound reasonable?
next prev parent reply other threads:[~2023-04-26 15:22 UTC|newest]
Thread overview: 76+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-18 19:12 [RFC PATCH 00/26] mm: reliable huge page allocator Johannes Weiner
2023-04-18 19:12 ` [RFC PATCH 01/26] block: bdev: blockdev page cache is movable Johannes Weiner
2023-04-19 4:07 ` Matthew Wilcox
2023-04-21 12:25 ` Mel Gorman
2023-04-18 19:12 ` [RFC PATCH 02/26] mm: compaction: avoid GFP_NOFS deadlocks Johannes Weiner
2023-04-21 12:27 ` Mel Gorman
2023-04-21 14:17 ` Johannes Weiner
2023-04-18 19:12 ` [RFC PATCH 03/26] mm: make pageblock_order 2M per default Johannes Weiner
2023-04-19 0:01 ` Kirill A. Shutemov
2023-04-19 2:55 ` Johannes Weiner
2023-04-19 3:44 ` Johannes Weiner
2023-04-19 11:10 ` David Hildenbrand
2023-04-19 10:36 ` Vlastimil Babka
2023-04-19 11:09 ` David Hildenbrand
2023-04-21 12:37 ` Mel Gorman
2023-04-18 19:12 ` [RFC PATCH 04/26] mm: page_isolation: write proper kerneldoc Johannes Weiner
2023-04-21 12:39 ` Mel Gorman
2023-04-18 19:12 ` [RFC PATCH 05/26] mm: page_alloc: per-migratetype pcplist for THPs Johannes Weiner
2023-04-21 12:47 ` Mel Gorman
2023-04-21 15:06 ` Johannes Weiner
2023-04-28 10:29 ` Mel Gorman
2023-04-18 19:12 ` [RFC PATCH 06/26] mm: page_alloc: consolidate free page accounting Johannes Weiner
2023-04-21 12:54 ` Mel Gorman
2023-04-21 15:08 ` Johannes Weiner
2023-04-18 19:12 ` [RFC PATCH 07/26] mm: page_alloc: move capture_control to the page allocator Johannes Weiner
2023-04-21 12:59 ` Mel Gorman
2023-04-18 19:12 ` [RFC PATCH 08/26] mm: page_alloc: claim blocks during compaction capturing Johannes Weiner
2023-04-21 13:12 ` Mel Gorman
2023-04-25 14:39 ` Johannes Weiner
2023-04-18 19:12 ` [RFC PATCH 09/26] mm: page_alloc: move expand() above compaction_capture() Johannes Weiner
2023-04-18 19:12 ` [RFC PATCH 10/26] mm: page_alloc: allow compaction capturing from larger blocks Johannes Weiner
2023-04-21 14:14 ` Mel Gorman
2023-04-25 15:40 ` Johannes Weiner
2023-04-28 10:41 ` Mel Gorman
2023-04-18 19:12 ` [RFC PATCH 11/26] mm: page_alloc: introduce MIGRATE_FREE Johannes Weiner
2023-04-21 14:25 ` Mel Gorman
2023-04-18 19:12 ` [RFC PATCH 12/26] mm: page_alloc: per-migratetype free counts Johannes Weiner
2023-04-21 14:28 ` Mel Gorman
2023-04-21 15:35 ` Johannes Weiner
2023-04-21 16:03 ` Mel Gorman
2023-04-21 16:32 ` Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 13/26] mm: compaction: remove compaction result helpers Johannes Weiner
2023-04-21 14:32 ` Mel Gorman
2023-04-18 19:13 ` [RFC PATCH 14/26] mm: compaction: simplify should_compact_retry() Johannes Weiner
2023-04-21 14:36 ` Mel Gorman
2023-04-25 2:15 ` Johannes Weiner
2023-04-25 0:56 ` Huang, Ying
2023-04-25 2:11 ` Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 15/26] mm: compaction: simplify free block check in suitable_migration_target() Johannes Weiner
2023-04-21 14:39 ` Mel Gorman
2023-04-18 19:13 ` [RFC PATCH 16/26] mm: compaction: improve compaction_suitable() accuracy Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 17/26] mm: compaction: refactor __compaction_suitable() Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 18/26] mm: compaction: remove unnecessary is_via_compact_memory() checks Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 19/26] mm: compaction: drop redundant watermark check in compaction_zonelist_suitable() Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 20/26] mm: vmscan: use compaction_suitable() check in kswapd Johannes Weiner
2023-04-25 3:12 ` Huang, Ying
2023-04-25 14:26 ` Johannes Weiner
2023-04-26 1:30 ` Huang, Ying
2023-04-26 15:22 ` Johannes Weiner [this message]
2023-04-27 5:41 ` Huang, Ying
2023-04-18 19:13 ` [RFC PATCH 21/26] mm: compaction: align compaction goals with reclaim goals Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 22/26] mm: page_alloc: manage free memory in whole pageblocks Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 23/26] mm: page_alloc: kill highatomic Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 24/26] mm: page_alloc: kill watermark boosting Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 25/26] mm: page_alloc: disallow fallbacks when 2M defrag is enabled Johannes Weiner
2023-04-21 14:56 ` Mel Gorman
2023-04-21 15:24 ` Johannes Weiner
2023-04-21 15:55 ` Mel Gorman
2023-04-18 19:13 ` [RFC PATCH 26/26] mm: page_alloc: add sanity checks for migratetypes Johannes Weiner
2023-04-18 23:54 ` [RFC PATCH 00/26] mm: reliable huge page allocator Kirill A. Shutemov
2023-04-19 2:08 ` Johannes Weiner
2023-04-19 10:56 ` Vlastimil Babka
2023-04-19 4:11 ` Matthew Wilcox
2023-04-21 16:11 ` Mel Gorman
2023-04-21 17:14 ` Matthew Wilcox
2023-05-02 15:21 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230426152206.GA30064@cmpxchg.org \
--to=hannes@cmpxchg.org \
--cc=kaiyang2@cs.cmu.edu \
--cc=kernel-team@fb.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@techsingularity.net \
--cc=rientjes@google.com \
--cc=vbabka@suse.cz \
--cc=ying.huang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox