From: Vlastimil Babka <vbabka@suse.cz>
To: Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>, Zi Yan <ziy@nvidia.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 5/5] mm: page_alloc: defrag_mode kswapd/kcompactd watermarks
Date: Fri, 11 Apr 2025 10:19:58 +0200 [thread overview]
Message-ID: <46f1b2ab-2903-4cde-9e68-e334a0d0df22@suse.cz> (raw)
In-Reply-To: <20250313210647.1314586-6-hannes@cmpxchg.org>
On 3/13/25 22:05, Johannes Weiner wrote:
> The previous patch added pageblock_order reclaim to kswapd/kcompactd,
> which helps, but produces only one block at a time. Allocation stalls
> and THP failure rates are still higher than they could be.
>
> To adequately reflect ALLOC_NOFRAGMENT demand for pageblocks, change
> the watermarking for kswapd & kcompactd: instead of targeting the high
> watermark in order-0 pages and checking for one suitable block, simply
> require that the high watermark is entirely met in pageblocks.
Hrm.
> ---
> include/linux/mmzone.h | 1 +
> mm/compaction.c | 37 ++++++++++++++++++++++++++++++-------
> mm/internal.h | 1 +
> mm/page_alloc.c | 29 +++++++++++++++++++++++------
> mm/vmscan.c | 15 ++++++++++++++-
> mm/vmstat.c | 1 +
> 6 files changed, 70 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index dbb0ad69e17f..37c29f3fbca8 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -138,6 +138,7 @@ enum numa_stat_item {
> enum zone_stat_item {
> /* First 128 byte cacheline (assuming 64 bit words) */
> NR_FREE_PAGES,
> + NR_FREE_PAGES_BLOCKS,
> NR_ZONE_LRU_BASE, /* Used only for compaction and reclaim retry */
> NR_ZONE_INACTIVE_ANON = NR_ZONE_LRU_BASE,
> NR_ZONE_ACTIVE_ANON,
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 036353ef1878..4a2ccb82d0b2 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -2329,6 +2329,22 @@ static enum compact_result __compact_finished(struct compact_control *cc)
> if (!pageblock_aligned(cc->migrate_pfn))
> return COMPACT_CONTINUE;
>
> + /*
> + * When defrag_mode is enabled, make kcompactd target
> + * watermarks in whole pageblocks. Because they can be stolen
> + * without polluting, no further fallback checks are needed.
> + */
> + if (defrag_mode && !cc->direct_compaction) {
> + if (__zone_watermark_ok(cc->zone, cc->order,
> + high_wmark_pages(cc->zone),
> + cc->highest_zoneidx, cc->alloc_flags,
> + zone_page_state(cc->zone,
> + NR_FREE_PAGES_BLOCKS)))
> + return COMPACT_SUCCESS;
> +
> + return COMPACT_CONTINUE;
> + }
Wonder if this ever succeds in practice. Is high_wmark_pages() even aligned
to pageblock size? If not, and it's X pageblocks and a half, we will rarely
have NR_FREE_PAGES_BLOCKS cover all of that? Also concurrent allocations can
put us below high wmark quickly and then we never satisfy this?
Doesn't then happen that with defrag_mode, in practice kcompactd basically
always runs until scanners met?
Maybe the check could instead e.g. compare NR_FREE_PAGES (aligned down to
pageblock size, or even with some extra slack?) with NR_FREE_PAGES_BLOCKS?
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -6724,11 +6724,24 @@ static bool pgdat_balanced(pg_data_t *pgdat, int order, int highest_zoneidx)
> * meet watermarks.
> */
> for_each_managed_zone_pgdat(zone, pgdat, i, highest_zoneidx) {
> + unsigned long free_pages;
> +
> if (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING)
> mark = promo_wmark_pages(zone);
> else
> mark = high_wmark_pages(zone);
> - if (zone_watermark_ok_safe(zone, order, mark, highest_zoneidx))
I think you just removed the only user of this _safe() function. Is the
cpu-drift control it does no longer necessary?
> +
> + /*
> + * In defrag_mode, watermarks must be met in whole
> + * blocks to avoid polluting allocator fallbacks.
> + */
> + if (defrag_mode)
> + free_pages = zone_page_state(zone, NR_FREE_PAGES_BLOCKS);
> + else
> + free_pages = zone_page_state(zone, NR_FREE_PAGES);
> +
> + if (__zone_watermark_ok(zone, order, mark, highest_zoneidx,
> + 0, free_pages))
> return true;
> }
>
next prev parent reply other threads:[~2025-04-11 8:20 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-13 21:05 [PATCH 0/5] mm: reliable huge page allocator Johannes Weiner
2025-03-13 21:05 ` [PATCH 1/5] mm: compaction: push watermark into compaction_suitable() callers Johannes Weiner
2025-03-14 15:08 ` Zi Yan
2025-03-16 4:28 ` Hugh Dickins
2025-03-17 18:18 ` Johannes Weiner
2025-03-21 6:21 ` kernel test robot
2025-03-21 13:55 ` Johannes Weiner
2025-04-10 15:19 ` Vlastimil Babka
2025-04-10 20:17 ` Johannes Weiner
2025-04-11 7:32 ` Vlastimil Babka
2025-03-13 21:05 ` [PATCH 2/5] mm: page_alloc: trace type pollution from compaction capturing Johannes Weiner
2025-03-14 18:36 ` Zi Yan
2025-03-13 21:05 ` [PATCH 3/5] mm: page_alloc: defrag_mode Johannes Weiner
2025-03-14 18:54 ` Zi Yan
2025-03-14 20:50 ` Johannes Weiner
2025-03-14 22:54 ` Zi Yan
2025-03-22 15:05 ` Brendan Jackman
2025-03-23 0:58 ` Johannes Weiner
2025-03-23 1:34 ` Johannes Weiner
2025-03-23 3:46 ` Johannes Weiner
2025-03-23 18:04 ` Brendan Jackman
2025-03-31 15:55 ` Johannes Weiner
2025-03-13 21:05 ` [PATCH 4/5] mm: page_alloc: defrag_mode kswapd/kcompactd assistance Johannes Weiner
2025-03-13 21:05 ` [PATCH 5/5] mm: page_alloc: defrag_mode kswapd/kcompactd watermarks Johannes Weiner
2025-03-14 21:05 ` Johannes Weiner
2025-04-11 8:19 ` Vlastimil Babka [this message]
2025-04-11 15:39 ` Johannes Weiner
2025-04-11 16:51 ` Vlastimil Babka
2025-04-11 18:21 ` Johannes Weiner
2025-04-13 2:20 ` Johannes Weiner
2025-04-15 7:31 ` Vlastimil Babka
2025-04-15 7:44 ` Vlastimil Babka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=46f1b2ab-2903-4cde-9e68-e334a0d0df22@suse.cz \
--to=vbabka@suse.cz \
--cc=akpm@linux-foundation.org \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@techsingularity.net \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox