From: Joshua Hahn <joshua.hahnjy@gmail.com>
To: Bing Jiao <bingjiao@google.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
David Hildenbrand <david@kernel.org>,
Michal Hocko <mhocko@kernel.org>,
Qi Zheng <zhengqi.arch@bytedance.com>,
Shakeel Butt <shakeel.butt@linux.dev>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Axel Rasmussen <axelrasmussen@google.com>,
Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 0/2] mm/vmscan: optimize preferred target demotion node selection
Date: Wed, 7 Jan 2026 09:46:52 -0800
Message-ID: <20260107174652.3973445-1-joshua.hahnjy@gmail.com>
In-Reply-To: <20260107072814.2324646-1-bingjiao@google.com>
On Wed, 7 Jan 2026 07:28:12 +0000 Bing Jiao <bingjiao@google.com> wrote:
Hello Bing, thank you for your patch!
I have a few questions about the motivation behind this patch.
> In tiered memory systems, the demotion aims to move cold folios to the
> far-tier nodes. To maintain system performance, the demotion target
> should ideally be the node with the shortest NUMA distance from the
> source node.
>
> However, the current implementation has two suboptimal behaviors:
>
> 1. Unbalanced Fallback: When the primary preferred demotion node is full,
> the allocator falls back to other nodes in a way that often skews
> toward zones that closer to the primary preferred node rather than
> distributing the load evenly across fallback nodes.
I definitely think this is a problem that can exist for some workloads /
machines, and I agree that there should be some mechanism to manage this
in the demotion code as well. In the context of tiered memory, it might be
the case that some far-nodes have more restricted memory bandwidth, so better
distribution of memory across those nodes definitely sounds like something
that should at least be considered (even if it might not be the sole factor).
With that said, I think adding some numbers here to motivate this change could
definitely make the argument more convincing. In particular, I don't think
I am fully convinced that doing a full random selection from the demotion
targets makes the most sense. Maybe there are a few more things to consider,
like the node's capacity, how full it is, bandwidth, etc. For instance,
weighted interleave auto-tuning makes a weighted selection based on each
node's bandwidth.
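
To make the "weighted" idea a bit more concrete, here is a rough userspace
sketch of what a capacity- and bandwidth-aware selection could look like. This
is not kernel code and not a proposal for the actual implementation: the
struct, its field names, and the weighting formula (free pages times a
bandwidth weight) are all assumptions made up for illustration. The random
value is passed in by the caller so that the behaviour is easy to reason about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node description; none of this is existing kernel API. */
struct demo_target {
	int nid;             /* node id */
	uint64_t free_pages; /* how much room the node has left */
	uint32_t bw_weight;  /* relative bandwidth weight (assumed input) */
};

/*
 * Pick a demotion target with probability proportional to
 * free_pages * bw_weight. 'rnd' is a caller-supplied random number.
 * Returns the chosen node id, or -1 if no candidate has any weight.
 * Overflow of the products is ignored to keep the sketch short.
 */
static int pick_weighted_target(const struct demo_target *t, size_t n,
				uint64_t rnd)
{
	uint64_t total = 0;
	size_t i;

	assert(t != NULL || n == 0);

	for (i = 0; i < n; i++)
		total += t[i].free_pages * t[i].bw_weight;
	if (total == 0)
		return -1;

	/* Map rnd into [0, total) and walk the cumulative weights. */
	rnd %= total;
	for (i = 0; i < n; i++) {
		uint64_t w = t[i].free_pages * t[i].bw_weight;

		if (rnd < w)
			return t[i].nid;
		rnd -= w;
	}
	return -1; /* not reached when total > 0 */
}
```

With something like this, a full node (free_pages == 0) is never picked, and a
node with twice the bandwidth weight receives roughly twice the demotions of an
equally empty peer. Whether that is the right formula is exactly the kind of
thing I think numbers would help decide.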
At least right now, it seems like we're consistent with how the demotion node
gets selected when the preferred node is full.
Do your changes lead to a "better" distribution of memory? And does this
distribution lead to increased performance? I think some numbers here could
help my understanding and convince others as well :-)
> 2. Suboptimal Target Selection: demote_folio_list() randomly select
> a preferred node from the allowed mask, potentially selecting
> a very distant node.
Following up, I think it would help to have a unified story about how
demotion nodes should be selected. In particular, I'm not sure it makes sense
to have a "try the preferred demotion target first, then select randomly
among all other nodes" policy, since it mixes two conflicting goals: "prefer
close nodes" and "distribute demotions". To put it explicitly, what makes the
first demotion target special? Should we just select randomly for *all*
demotion targets, not only when the preferred node is full?
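
As a toy illustration of the two goals I am contrasting, the sketch below
shows a "prefer close nodes" policy next to a "distribute demotions" policy.
It is plain userspace C with made-up names (pick_nearest, pick_uniform) and a
simplified array representation of distances and allowed nodes, so please
read it as a way to frame the question and not as how vmscan does or should
do it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * dist[i]    : distance from the source node to node i (assumed input)
 * allowed[i] : non-zero if node i is a permitted demotion target
 */

/* "Prefer close nodes": the allowed node with the smallest distance.
 * Ties go to the lowest index. Returns -1 if nothing is allowed. */
static int pick_nearest(const int *dist, const int *allowed, size_t n)
{
	int best = -1;
	size_t i;

	assert(dist != NULL && allowed != NULL);

	for (i = 0; i < n; i++) {
		if (!allowed[i])
			continue;
		if (best < 0 || dist[i] < dist[best])
			best = (int)i;
	}
	return best;
}

/* "Distribute demotions": a uniform choice among all allowed nodes,
 * driven by a caller-supplied random number. Returns -1 if none. */
static int pick_uniform(const int *allowed, size_t n, unsigned int rnd)
{
	size_t i, count = 0;

	assert(allowed != NULL);

	for (i = 0; i < n; i++)
		if (allowed[i])
			count++;
	if (count == 0)
		return -1;

	/* Select the k-th allowed node, counting from zero. */
	size_t k = rnd % count;
	for (i = 0; i < n; i++) {
		if (!allowed[i])
			continue;
		if (k == 0)
			return (int)i;
		k--;
	}
	return -1; /* not reached when count > 0 */
}
```

Using the first function for the initial attempt and the second one only as a
fallback is the mix I am asking about: each makes sense on its own, but it is
not obvious to me why the policy should switch between them.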
Sorry if it seems like I am asking too many questions, I just wanted to get
a better understanding of the motivation behind the patch.
Thank you, and I hope you have a great day!
Joshua