Re: [RFC] Add per-socket weight support for multi-socket systems in weighted interleave

From: Gregory Price <gourry@gourry.net>
To: Rakie Kim <rakie.kim@sk.com>
Cc: joshua.hahnjy@gmail.com, akpm@linux-foundation.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-cxl@vger.kernel.org, dan.j.williams@intel.com,
	ying.huang@linux.alibaba.com, kernel_team@skhynix.com,
	honggyu.kim@sk.com, yunjeong.mun@sk.com
Subject: Re: [RFC] Add per-socket weight support for multi-socket systems in weighted interleave
Date: Fri, 9 May 2025 01:49:59 -0400
Message-ID: <aB2Xh4jEqpSTuvsi@gourry-fedora-PF4VCD3F>
In-Reply-To: <20250509023032.235-1-rakie.kim@sk.com>

On Fri, May 09, 2025 at 11:30:26AM +0900, Rakie Kim wrote:
> 
> Scenario 1: Adapt weighting based on the task's execution node
> A task prefers only the DRAM and locally attached CXL memory of the
> socket on which it is running, in order to avoid cross-socket access and
> optimize bandwidth.
> - A task running on CPU0 (node0) would prefer DRAM0 (w=3) and CXL0 (w=1)
> - A task running on CPU1 (node1) would prefer DRAM1 (w=3) and CXL1 (w=1)
... snip ...
> 
> However, Scenario 1 does not depend on such information. Rather, it is
> a locality-preserving optimization where we isolate memory access to
> each socket's DRAM and CXL nodes. I believe this use case is implementable
> today and worth considering independently from interconnect performance
> awareness.
> 

There's nothing to implement - all the controls exist:

1) --cpunodebind=0
2) --weighted-interleave=0,2
3) cpuset.mems
4) cpuset.cpus
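
To make the effect of those controls concrete, here is a minimal userspace
model of how a weighted interleave over nodes 0 and 2 would spread pages.
It is a sketch, not kernel code: the node numbering (node0 = DRAM0,
node2 = CXL0) is inferred from the flags above, the weights 3 and 1 come
from the quoted scenario, and the function name is made up for
illustration. As far as I know the real weights are set through
/sys/kernel/mm/mempolicy/weighted_interleave/nodeN rather than on the
command line.

```python
from collections import Counter
from itertools import islice


def weighted_interleave(weights):
    """Model of weighted round-robin page placement.

    Yields `weights[nid]` consecutive pages from each node, visiting
    nodes in ascending id order, then wraps around. This is only an
    approximation of what the kernel policy does.
    """
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("every node needs a positive weight")
    nodes = sorted(weights)
    while True:
        for nid in nodes:
            for _ in range(weights[nid]):
                yield nid


# Assumed layout: node0 = DRAM0 (w=3), node2 = CXL0 (w=1).
socket0_weights = {0: 3, 2: 1}

# Placement of the first eight pages: [0, 0, 0, 2, 0, 0, 0, 2]
first_eight = list(islice(weighted_interleave(socket0_weights), 8))

# Over 400 pages the split is 300 on node0 and 100 on node2.
share = Counter(islice(weighted_interleave(socket0_weights), 400))
```

Restricting the node set to {0, 2} is what keeps the task off the remote
socket; the weights only decide the ratio within that set.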

You might consider something like "--local-tier" (akin to
--localalloc) that sets an explicit fallback set based on the local
node.  You'd end up doing something like

current_nid = memtier_next_local_node(socket_nid, current_nid)

Where this interface returns the preferred fallback ordering but doesn't
allow cross-socket fallback.
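
A rough model of what such an interface could look like, purely as an
illustration. memtier_next_local_node() does not exist today; the
topology table, the use of NUMA_NO_NODE as both the "start" and the
"no more nodes" value, and the choice to stop rather than wrap at the
end of the list are all assumptions of this sketch.

```python
NUMA_NO_NODE = -1

# Hypothetical topology: for each socket, its local nodes ordered from
# the fastest tier to the slowest (DRAM first, then CXL).
LOCAL_TIERS = {
    0: [0, 2],  # socket 0: DRAM0, CXL0
    1: [1, 3],  # socket 1: DRAM1, CXL1
}


def memtier_next_local_node(socket_nid, current_nid):
    """Return the next fallback node local to `socket_nid`.

    Pass NUMA_NO_NODE to get the first node. Returns NUMA_NO_NODE when
    the local list is exhausted or when `current_nid` belongs to another
    socket, so the walk never crosses sockets.
    """
    order = LOCAL_TIERS[socket_nid]
    if current_nid == NUMA_NO_NODE:
        return order[0]
    if current_nid not in order:
        return NUMA_NO_NODE
    idx = order.index(current_nid)
    return order[idx + 1] if idx + 1 < len(order) else NUMA_NO_NODE


def local_fallback_chain(socket_nid):
    """Collect the full socket-local fallback order by walking the API."""
    chain = []
    nid = memtier_next_local_node(socket_nid, NUMA_NO_NODE)
    while nid != NUMA_NO_NODE:
        chain.append(nid)
        nid = memtier_next_local_node(socket_nid, nid)
    return chain
```

With this table, local_fallback_chain(0) walks node 0 then node 2 and
stops, and asking for the successor of node 1 on socket 0 yields
NUMA_NO_NODE.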

That might be useful, I suppose, in letting a user do:

--cpunodebind=0 --weighted-interleave --local-tier

without having to know anything about the local memory tier structure.

> > At the same time we were discussing this, we were also discussing how to
> > do external task-mempolicy modifications - which seemed significantly
> > more useful, but ultimately more complex and without sufficient
> > interested parties / users.
> 
> I'd like to learn more about that thread. If you happen to have a pointer
> to that discussion, it would be really helpful.
> 

https://lore.kernel.org/all/20231122211200.31620-1-gregory.price@memverge.com/
https://lore.kernel.org/all/ZV5zGROLefrsEcHJ@r13-u19.micron.com/
https://lore.kernel.org/linux-mm/ZWYsth2CtC4Ilvoz@memverge.com/
https://lore.kernel.org/linux-mm/20221010094842.4123037-1-hezhongkun.hzk@bytedance.com/
There are locking issues with these that aren't easy to fix.

I think the bytedance method uses task_work queueing to defer a
mempolicy update to the task itself the next time it makes a kernel/user
transition.  That's probably the best overall approach I've seen.
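
The deferral idea can be sketched in userspace like this. It is a toy
model of the pattern, not the code from that series: the Task class,
queue_work(), return_to_user() and set_mempolicy_remote() are names
invented here, and the policy is just a string. The point it tries to
show is that the external caller only ever appends to a queue, while
the policy field is written solely by its owning task.

```python
from collections import deque


class Task:
    """Toy stand-in for a task with a mempolicy and a pending-work queue."""

    def __init__(self, mempolicy):
        self.mempolicy = mempolicy
        self._pending = deque()

    def queue_work(self, fn):
        # Called on behalf of another task. It touches only the queue,
        # never self.mempolicy.
        self._pending.append(fn)

    def return_to_user(self):
        # Called by the task itself at its kernel/user transition:
        # drain the queue and apply each deferred update in order.
        while self._pending:
            work = self._pending.popleft()
            work(self)


def set_mempolicy_remote(target, new_policy):
    """Request a policy change on `target` without writing it directly."""

    def apply(task):
        task.mempolicy = new_policy

    target.queue_work(apply)


victim = Task("default")
set_mempolicy_remote(victim, "weighted-interleave")
policy_before = victim.mempolicy  # update is still only queued
victim.return_to_user()
policy_after = victim.mempolicy   # applied by the task itself
```

The cost of this shape is latency: the update is not visible until the
target task next passes through the transition point.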

https://lore.kernel.org/linux-mm/ZWezcQk+BYEq%2FWiI@memverge.com/
More notes gathered prior to implementing weighted interleave.

~Gregory
