From: "Huang, Ying" <ying.huang@intel.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org, Kaiyang Zhao <kaiyang2@cs.cmu.edu>,
Mel Gorman <mgorman@techsingularity.net>,
Vlastimil Babka <vbabka@suse.cz>,
David Rientjes <rientjes@google.com>,
linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [RFC PATCH 20/26] mm: vmscan: use compaction_suitable() check in kswapd
Date: Thu, 27 Apr 2023 13:41:38 +0800 [thread overview]
Message-ID: <87cz3peval.fsf@yhuang6-desk2.ccr.corp.intel.com> (raw)
In-Reply-To: <20230426152206.GA30064@cmpxchg.org> (Johannes Weiner's message of "Wed, 26 Apr 2023 11:22:06 -0400")
Johannes Weiner <hannes@cmpxchg.org> writes:
> On Wed, Apr 26, 2023 at 09:30:23AM +0800, Huang, Ying wrote:
>> Johannes Weiner <hannes@cmpxchg.org> writes:
>>
>> > On Tue, Apr 25, 2023 at 11:12:28AM +0800, Huang, Ying wrote:
>> >> Johannes Weiner <hannes@cmpxchg.org> writes:
>> >>
>> >> > Kswapd currently bails on higher-order allocations with an open-coded
>> >> > check for whether it's reclaimed the compaction gap.
>> >> >
>> >> > compaction_suitable() is the customary interface to coordinate reclaim
>> >> > with compaction.
>> >> >
>> >> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>> >> > ---
>> >> > mm/vmscan.c | 67 ++++++++++++++++++-----------------------------------
>> >> > 1 file changed, 23 insertions(+), 44 deletions(-)
>> >> >
>> >> > diff --git a/mm/vmscan.c b/mm/vmscan.c
>> >> > index ee8c8ca2e7b5..723705b9e4d9 100644
>> >> > --- a/mm/vmscan.c
>> >> > +++ b/mm/vmscan.c
>> >> > @@ -6872,12 +6872,18 @@ static bool pgdat_balanced(pg_data_t *pgdat, int order, int highest_zoneidx)
>> >> > if (!managed_zone(zone))
>> >> > continue;
>> >> >
>> >> > + /* Allocation can succeed in any zone, done */
>> >> > if (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING)
>> >> > mark = wmark_pages(zone, WMARK_PROMO);
>> >> > else
>> >> > mark = high_wmark_pages(zone);
>> >> > if (zone_watermark_ok_safe(zone, order, mark, highest_zoneidx))
>> >> > return true;
>> >> > +
>> >> > + /* Allocation can't succeed, but enough order-0 to compact */
>> >> > + if (compaction_suitable(zone, order,
>> >> > + highest_zoneidx) == COMPACT_CONTINUE)
>> >> > + return true;
>> >>
>> >> Should we check the following first?
>> >>
>> >> order > 0 && zone_watermark_ok_safe(zone, 0, mark, highest_zoneidx)
>> >
>> > That's what compaction_suitable() does. It checks whether there are
>> > enough migration targets for compaction (COMPACT_CONTINUE) or whether
>> > reclaim needs to do some more work (COMPACT_SKIPPED).
>>
>> Yes. And I found that the watermark used in compaction_suitable() is
>> low_wmark_pages() or min_wmark_pages(), which doesn't match the
>> watermark here. Or did I miss something?
>
> Ahh, you're right, kswapd will bail prematurely. Compaction cannot
> reliably meet the high watermark with a low watermark scratch space.
>
> I'll add the order check before the suitable test, for clarity, and so
> that order-0 requests don't check the same thing twice.
>
> For the watermark, I'd make it an arg to compaction_suitable() and use
> whatever the reclaimer targets (high for kswapd, min for direct).
>
> However, there is a minor snag. compaction_suitable() currently has
> its own smarts regarding the watermark:
>
> /*
> * Watermarks for order-0 must be met for compaction to be able to
> * isolate free pages for migration targets. This means that the
> * watermark and alloc_flags have to match, or be more pessimistic than
> * the check in __isolate_free_page(). We don't use the direct
> * compactor's alloc_flags, as they are not relevant for freepage
> * isolation. We however do use the direct compactor's highest_zoneidx
> * to skip over zones where lowmem reserves would prevent allocation
> * even if compaction succeeds.
> * For costly orders, we require low watermark instead of min for
> * compaction to proceed to increase its chances.
> * ALLOC_CMA is used, as pages in CMA pageblocks are considered
> * suitable migration targets
> */
> watermark = (order > PAGE_ALLOC_COSTLY_ORDER) ?
> low_wmark_pages(zone) : min_wmark_pages(zone);
>
> Historically it has always checked low instead of min. Then Vlastimil
> changed it to min for non-costly orders here:
>
> commit 8348faf91f56371d4bada6fc5915e19580a15ffe
> Author: Vlastimil Babka <vbabka@suse.cz>
> Date: Fri Oct 7 16:58:00 2016 -0700
>
> mm, compaction: require only min watermarks for non-costly orders
>
> The __compaction_suitable() function checks the low watermark plus a
> compact_gap() gap to decide if there's enough free memory to perform
> compaction. Then __isolate_free_page uses low watermark check to decide
> if particular free page can be isolated. In the latter case, using low
> watermark is needlessly pessimistic, as the free page isolations are
> only temporary. For __compaction_suitable() the higher watermark makes
> sense for high-order allocations where more freepages increase the
> chance of success, and we can typically fail with some order-0 fallback
> when the system is struggling to reach that watermark. But for
> low-order allocation, forming the page should not be that hard. So
> using low watermark here might just prevent compaction from even trying,
> and eventually lead to OOM killer even if we are above min watermarks.
>
> So after this patch, we use min watermark for non-costly orders in
> __compaction_suitable(), and for all orders in __isolate_free_page().
>
> Lowering to min wasn't an issue for non-costly, but AFAICS there was
> no explicit testing for whether min would work for costly orders too.
>
> I'd propose trying it with min even for costly and see what happens.
>
> If it does regress, a better place to boost scratch space for costly
> orders might be compact_gap(), so I'd move it there.
>
> Does that sound reasonable?
Sounds good to me, Thanks!
Best Regards,
Huang, Ying
next prev parent reply other threads:[~2023-04-27 5:43 UTC|newest]
Thread overview: 76+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-18 19:12 [RFC PATCH 00/26] mm: reliable huge page allocator Johannes Weiner
2023-04-18 19:12 ` [RFC PATCH 01/26] block: bdev: blockdev page cache is movable Johannes Weiner
2023-04-19 4:07 ` Matthew Wilcox
2023-04-21 12:25 ` Mel Gorman
2023-04-18 19:12 ` [RFC PATCH 02/26] mm: compaction: avoid GFP_NOFS deadlocks Johannes Weiner
2023-04-21 12:27 ` Mel Gorman
2023-04-21 14:17 ` Johannes Weiner
2023-04-18 19:12 ` [RFC PATCH 03/26] mm: make pageblock_order 2M per default Johannes Weiner
2023-04-19 0:01 ` Kirill A. Shutemov
2023-04-19 2:55 ` Johannes Weiner
2023-04-19 3:44 ` Johannes Weiner
2023-04-19 11:10 ` David Hildenbrand
2023-04-19 10:36 ` Vlastimil Babka
2023-04-19 11:09 ` David Hildenbrand
2023-04-21 12:37 ` Mel Gorman
2023-04-18 19:12 ` [RFC PATCH 04/26] mm: page_isolation: write proper kerneldoc Johannes Weiner
2023-04-21 12:39 ` Mel Gorman
2023-04-18 19:12 ` [RFC PATCH 05/26] mm: page_alloc: per-migratetype pcplist for THPs Johannes Weiner
2023-04-21 12:47 ` Mel Gorman
2023-04-21 15:06 ` Johannes Weiner
2023-04-28 10:29 ` Mel Gorman
2023-04-18 19:12 ` [RFC PATCH 06/26] mm: page_alloc: consolidate free page accounting Johannes Weiner
2023-04-21 12:54 ` Mel Gorman
2023-04-21 15:08 ` Johannes Weiner
2023-04-18 19:12 ` [RFC PATCH 07/26] mm: page_alloc: move capture_control to the page allocator Johannes Weiner
2023-04-21 12:59 ` Mel Gorman
2023-04-18 19:12 ` [RFC PATCH 08/26] mm: page_alloc: claim blocks during compaction capturing Johannes Weiner
2023-04-21 13:12 ` Mel Gorman
2023-04-25 14:39 ` Johannes Weiner
2023-04-18 19:12 ` [RFC PATCH 09/26] mm: page_alloc: move expand() above compaction_capture() Johannes Weiner
2023-04-18 19:12 ` [RFC PATCH 10/26] mm: page_alloc: allow compaction capturing from larger blocks Johannes Weiner
2023-04-21 14:14 ` Mel Gorman
2023-04-25 15:40 ` Johannes Weiner
2023-04-28 10:41 ` Mel Gorman
2023-04-18 19:12 ` [RFC PATCH 11/26] mm: page_alloc: introduce MIGRATE_FREE Johannes Weiner
2023-04-21 14:25 ` Mel Gorman
2023-04-18 19:12 ` [RFC PATCH 12/26] mm: page_alloc: per-migratetype free counts Johannes Weiner
2023-04-21 14:28 ` Mel Gorman
2023-04-21 15:35 ` Johannes Weiner
2023-04-21 16:03 ` Mel Gorman
2023-04-21 16:32 ` Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 13/26] mm: compaction: remove compaction result helpers Johannes Weiner
2023-04-21 14:32 ` Mel Gorman
2023-04-18 19:13 ` [RFC PATCH 14/26] mm: compaction: simplify should_compact_retry() Johannes Weiner
2023-04-21 14:36 ` Mel Gorman
2023-04-25 2:15 ` Johannes Weiner
2023-04-25 0:56 ` Huang, Ying
2023-04-25 2:11 ` Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 15/26] mm: compaction: simplify free block check in suitable_migration_target() Johannes Weiner
2023-04-21 14:39 ` Mel Gorman
2023-04-18 19:13 ` [RFC PATCH 16/26] mm: compaction: improve compaction_suitable() accuracy Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 17/26] mm: compaction: refactor __compaction_suitable() Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 18/26] mm: compaction: remove unnecessary is_via_compact_memory() checks Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 19/26] mm: compaction: drop redundant watermark check in compaction_zonelist_suitable() Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 20/26] mm: vmscan: use compaction_suitable() check in kswapd Johannes Weiner
2023-04-25 3:12 ` Huang, Ying
2023-04-25 14:26 ` Johannes Weiner
2023-04-26 1:30 ` Huang, Ying
2023-04-26 15:22 ` Johannes Weiner
2023-04-27 5:41 ` Huang, Ying [this message]
2023-04-18 19:13 ` [RFC PATCH 21/26] mm: compaction: align compaction goals with reclaim goals Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 22/26] mm: page_alloc: manage free memory in whole pageblocks Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 23/26] mm: page_alloc: kill highatomic Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 24/26] mm: page_alloc: kill watermark boosting Johannes Weiner
2023-04-18 19:13 ` [RFC PATCH 25/26] mm: page_alloc: disallow fallbacks when 2M defrag is enabled Johannes Weiner
2023-04-21 14:56 ` Mel Gorman
2023-04-21 15:24 ` Johannes Weiner
2023-04-21 15:55 ` Mel Gorman
2023-04-18 19:13 ` [RFC PATCH 26/26] mm: page_alloc: add sanity checks for migratetypes Johannes Weiner
2023-04-18 23:54 ` [RFC PATCH 00/26] mm: reliable huge page allocator Kirill A. Shutemov
2023-04-19 2:08 ` Johannes Weiner
2023-04-19 10:56 ` Vlastimil Babka
2023-04-19 4:11 ` Matthew Wilcox
2023-04-21 16:11 ` Mel Gorman
2023-04-21 17:14 ` Matthew Wilcox
2023-05-02 15:21 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87cz3peval.fsf@yhuang6-desk2.ccr.corp.intel.com \
--to=ying.huang@intel.com \
--cc=hannes@cmpxchg.org \
--cc=kaiyang2@cs.cmu.edu \
--cc=kernel-team@fb.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@techsingularity.net \
--cc=rientjes@google.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox