From: Bing Jiao <bingjiao@google.com>
To: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Donet Tom <donettom@linux.ibm.com>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
David Hildenbrand <david@kernel.org>,
Michal Hocko <mhocko@kernel.org>,
Qi Zheng <zhengqi.arch@bytedance.com>,
Shakeel Butt <shakeel.butt@linux.dev>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Axel Rasmussen <axelrasmussen@google.com>,
Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 1/2] mm/vmscan: balance demotion allocation in alloc_demote_folio()
Date: Mon, 12 Jan 2026 19:23:36 +0000
Message-ID: <aWVKJta4vuZEOIZV@google.com>
In-Reply-To: <20260110005229.1348817-1-joshua.hahnjy@gmail.com>

On Fri, Jan 09, 2026 at 04:52:28PM -0800, Joshua Hahn wrote:
> On Fri, 9 Jan 2026 23:45:57 +0000 Bing Jiao <bingjiao@google.com> wrote:
>
> > On Thu, Jan 08, 2026 at 06:14:02PM +0530, Donet Tom wrote:
> > >
> > > On 1/7/26 12:58 PM, Bing Jiao wrote:
> > > > +	/* Randomly select a node from fallback nodes for balanced allocation */
> > > > +	if (allowed_mask) {
> > > > +		mtc->nid = node_random(allowed_mask);
> > >
> > >
> > > This random selection can cause allocations to fall back to distant memory
> > > even when the nearer demotion target has sufficient free memory, correct?
> > > Could this also lead to increased promotion latency?
> >
> > Hi Donet,
> >
> > Thanks for your questions.
> >
> > Yes, the random selection could select a distant node and lead to
> > increased promotion latency.
> >
> > I just realized that the fallback allocation should not be weighted
> > by a single metric, such as node distance, capacity, or free space.
>
> Hello Bing, I hope you are doing well!
>
> Yes -- this is also what I believe, and I think this idea of "how should we
> select demotion / allocation targets" is something that is a difficult problem
> (and one that may not have a single solution that "just works").
>
> It's also a question that I have been thinking about, and one that was
> discussed in part at LSFMMBPF last year. At the time, I made some
> auto-tuning weights [1] for weighted interleave based on bandwidth capacity,
> since the main benefit of weighted interleave is to distribute memory
> accesses across multiple nodes to maximize how much bandwidth the system can
> use at once. A follow-up was to think about how these weights could change
> over time, and what heuristics should be used to determine how the weights
> are selected.
>
> Ultimately, we agreed that the heuristics should probably be delegated to
> userspace, since there are just so many scenarios that could change what
> metrics should take priority. (Jonathan Corbet wrote a great summary of the
> discussion in an LWN article [2])
>
> Coming back to this patchset, I think that all of the ideas above apply
> nicely here as well. What nodes should be selected for demotion and how they
> should be weighted is a difficult question, and one that is probably best
> answered by userspace and what workload they expect to use on their specific
> system.
>
> What I do believe though, is that an unweighted random selection / round-robin
> approach to selecting demotion targets might lead to some unexpected
> performance implications.
>
> > We need a thorough study before changing alloc_demote_folio().
>
> So I think this is the way to go :-)
> Although, I'm not actively exploring this at the moment ;)
>
> Please let me know what you think, I hope you have a great day!
> Joshua
>
> [1] https://lore.kernel.org/all/20250109185048.28587-1-joshua.hahnjy@gmail.com/
> [2] https://lwn.net/Articles/1016842/

Hi Joshua, hope you had a great weekend!

I appreciate you sharing that information. I really enjoyed reading these
articles and discussions.

It makes sense to assume that users understand their requirements, but I
think the kernel still needs internal heuristics for weight adjustment.
Users often lack the comprehensive, up-to-date information needed to
update their configuration in a timely manner, unless the system has an
omniscient administrator who can oversee and (pre)allocate resources for
all tasks running on it. So I think some kernel-side involvement in
weight adjustment is still necessary.

I will think more about this and explore it further, whether from
userspace, kernel space, or a hybrid approach.
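
To make the discussion a bit more concrete, here is a minimal userspace
sketch of weighted random target selection, as opposed to the uniform
choice that node_random() makes. It is not kernel code and not part of
this patch: struct demo_node, demo_rand(), and demo_pick_weighted() are
hypothetical names used only for illustration, and where the weights
would come from (distance, bandwidth, free space, or a userspace knob) is
exactly the open question. The modulo step has a slight bias, which a
real implementation would probably want to avoid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical per-node weight. A real policy might derive it from
 * distance, bandwidth, or free memory. Zero means "never pick".
 */
struct demo_node {
	int nid;
	unsigned int weight;
};

/* Small xorshift PRNG so the sketch is deterministic for a given seed. */
static uint32_t demo_rand(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/*
 * Pick a node with probability proportional to its weight.
 * Returns -1 if no node has a non-zero weight.
 */
static int demo_pick_weighted(const struct demo_node *nodes, size_t n,
			      uint32_t *rng_state)
{
	uint64_t total = 0;
	uint64_t target;
	size_t i;

	assert(*rng_state != 0);	/* xorshift must not be seeded with 0 */

	for (i = 0; i < n; i++)
		total += nodes[i].weight;
	if (total == 0)
		return -1;

	/* Walk the cumulative weights until the random target is covered. */
	target = demo_rand(rng_state) % total;
	for (i = 0; i < n; i++) {
		if (target < nodes[i].weight)
			return nodes[i].nid;
		target -= nodes[i].weight;
	}
	return -1;	/* not reached when total > 0 */
}
```

With equal weights this degenerates to the uniform behavior of the
original patch, so a scheme like this could be introduced without
changing the default.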

Thank you again for sharing!

Best,
Bing

Thread overview: 10+ messages
2026-01-07 7:28 [PATCH v1 0/2] mm/vmscan: optimize preferred target demotion node selection Bing Jiao
2026-01-07 7:28 ` [PATCH v1 1/2] mm/vmscan: balance demotion allocation in alloc_demote_folio() Bing Jiao
2026-01-08 12:44 ` Donet Tom
2026-01-09 23:45 ` Bing Jiao
2026-01-10 0:52 ` Joshua Hahn
2026-01-12 19:23 ` Bing Jiao [this message]
2026-01-07 7:28 ` [PATCH v1 2/2] mm/vmscan: select the closest perferred node in demote_folio_list() Bing Jiao
2026-01-07 17:39 ` [PATCH v1 0/2] mm/vmscan: optimize preferred target demotion node selection Andrew Morton
2026-01-07 17:46 ` Joshua Hahn
2026-01-08 6:03 ` Bing Jiao