From: Michal Hocko <mhocko@suse.com>
To: Gabriel Krisman Bertazi <krisman@suse.de>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
Mel Gorman <mgorman@suse.de>, Vlastimil Babka <vbabka@suse.cz>,
Baoquan He <bhe@redhat.com>
Subject: Re: [PATCH] Revert "mm/page_alloc.c: don't show protection in zone's ->lowmem_reserve[] for empty zone"
Date: Wed, 26 Feb 2025 07:54:35 +0100
Message-ID: <Z766q9qWtvHA_-kZ@tiehlicka>
In-Reply-To: <20250226032258.234099-1-krisman@suse.de>
On Tue 25-02-25 22:22:58, Gabriel Krisman Bertazi wrote:
> Commit 96a5c186efff ("mm/page_alloc.c: don't show protection in zone's
> ->lowmem_reserve[] for empty zone") removes the protection of lower
> zones from allocations targeting memory-less high zones. This had an
> unintended impact on the pattern of reclaims because it makes the
> high-zone-targeted allocation more likely to succeed in lower zones,
> which adds pressure to said zones. I.e, the following corresponding
> checks in zone_watermark_ok/zone_watermark_fast are less likely to
> trigger:
>
> 	if (free_pages <= min + z->lowmem_reserve[highest_zoneidx])
> 		return false;
>
> As a result, we are observing an increase in reclaim and kswapd scans,
> due to the increased pressure. This was initially observed as increased
> latency in filesystem operations when benchmarking with fio on a machine
> with some memory-less zones, but it has since been associated with
> increased contention in locks related to memory reclaim. By reverting
> this patch, the original performance was recovered on that machine.

I think it would be nice to show the memory layout on that machine (is
there a movable or device zone?).

Exact reclaim patterns are really hard to predict, and it is a little bit
surprising that the said patch has caused increased kswapd activity,
because I would expect more reclaim with the lowmem reserves in place.
But it is quite possible that the higher zone memory pressure is just
tipping over and increasing the lowmem pressure enough that it shows up.

In any case, 96a5c186efff seems incorrect because it assumes that the
protection has anything to do with how the higher zone is populated, while
the protection fundamentally protects the lower zone from allocations
targeting higher zones. Those allocations are independent of the actual
memory in that zone.
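
To make that concrete, here is a minimal userspace sketch of the reserve
setup and of the watermark check quoted above. It is only a model under
stated assumptions, not the kernel code: struct toy_zone, the toy_*
helpers, NR_ZONES and the skip_empty_upper switch are hypothetical names
made up for illustration, and the zone sizes used below are invented.

```c
#include <stdbool.h>

#define NR_ZONES 4	/* hypothetical layout: three populated zones + one empty */

/* Toy stand-in for struct zone; not the kernel's definition. */
struct toy_zone {
	unsigned long managed_pages;
	unsigned long lowmem_reserve[NR_ZONES];
};

/*
 * Models the loop in setup_per_zone_lowmem_reserve().
 * skip_empty_upper == true  approximates 96a5c186efff,
 * skip_empty_upper == false approximates the revert.
 */
static void toy_setup_lowmem_reserve(struct toy_zone *zones,
				     const int *ratio, bool skip_empty_upper)
{
	for (int i = 0; i < NR_ZONES - 1; i++) {
		bool clear = !ratio[i] || !zones[i].managed_pages;
		unsigned long managed_pages = 0;

		for (int j = i + 1; j < NR_ZONES; j++) {
			bool empty = !zones[j].managed_pages;

			/* Reserve scales with everything above zone i up to j. */
			managed_pages += zones[j].managed_pages;

			if (clear || (skip_empty_upper && empty))
				zones[i].lowmem_reserve[j] = 0;
			else
				zones[i].lowmem_reserve[j] = managed_pages / ratio[i];
		}
	}
}

/* The quoted watermark check, reduced to the lowmem_reserve term. */
static bool toy_watermark_ok(const struct toy_zone *z,
			     unsigned long free_pages,
			     unsigned long min, int highest_zoneidx)
{
	return free_pages > min + z->lowmem_reserve[highest_zoneidx];
}
```

With made-up zone sizes of 4000, 500000, 1000000 and 0 managed pages and
ratios of 256, 256, 32 and 0, zone 1 keeps a reserve of 3906 pages against
allocations targeting the empty zone 3 in the reverted variant, and 0 in
the 96a5c186efff variant. A zone-3-targeted request that falls back to
zone 1 with 3000 free pages and min = 1000 is therefore refused in the
first case and allowed in the second, which is the extra pressure on the
lower zone.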
> The original commit was introduced as a clarification of the
> /proc/zoneinfo output, so it doesn't seem there are usecases depending
> on it, making the revert a simple solution.
>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Baoquan He <bhe@redhat.com>
> Fixes: 96a5c186efff ("mm/page_alloc.c: don't show protection in zone's ->lowmem_reserve[] for empty zone")
> Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Thanks!
> ---
> mm/page_alloc.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 579789600a3c..fe986e6de7a0 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5849,11 +5849,10 @@ static void setup_per_zone_lowmem_reserve(void)
>
>  			for (j = i + 1; j < MAX_NR_ZONES; j++) {
>  				struct zone *upper_zone = &pgdat->node_zones[j];
> -				bool empty = !zone_managed_pages(upper_zone);
>
>  				managed_pages += zone_managed_pages(upper_zone);
>
> -				if (clear || empty)
> +				if (clear)
>  					zone->lowmem_reserve[j] = 0;
>  				else
>  					zone->lowmem_reserve[j] = managed_pages / ratio;
> --
> 2.47.0
--
Michal Hocko
SUSE Labs