From: Yiannis Nikolakopoulos <yiannis.nikolakop@gmail.com>
To: Gregory Price <gourry@gourry.net>
Cc: "David Hildenbrand (Red Hat)" <david@kernel.org>,
linux-mm@kvack.org, kernel-team@meta.com,
linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org,
nvdimm@lists.linux.dev, linux-fsdevel@vger.kernel.org,
cgroups@vger.kernel.org, dave@stgolabs.net,
jonathan.cameron@huawei.com, dave.jiang@intel.com,
alison.schofield@intel.com, vishal.l.verma@intel.com,
ira.weiny@intel.com, dan.j.williams@intel.com,
longman@redhat.com, akpm@linux-foundation.org,
lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
mhocko@suse.com, osalvador@suse.de, ziy@nvidia.com,
matthew.brost@intel.com, joshua.hahnjy@gmail.com,
rakie.kim@sk.com, byungchul@sk.com,
ying.huang@linux.alibaba.com, apopple@nvidia.com,
mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, tj@kernel.org, hannes@cmpxchg.org,
mkoutny@suse.com, kees@kernel.org, muchun.song@linux.dev,
roman.gushchin@linux.dev, shakeel.butt@linux.dev,
rientjes@google.com, jackmanb@google.com, cl@gentwo.org,
harry.yoo@oracle.com, axelrasmussen@google.com,
yuanchu@google.com, weixugc@google.com,
zhengqi.arch@bytedance.com, yosry.ahmed@linux.dev,
nphamcs@gmail.com, chengming.zhou@linux.dev,
fabio.m.de.francesco@linux.intel.com, rrichter@amd.com,
ming.li@zohomail.com, usamaarif642@gmail.com,
brauner@kernel.org, oleg@redhat.com, namcao@linutronix.de,
escape@linux.alibaba.com, dongjoo.seo1@samsung.com
Subject: Re: [RFC LPC2026 PATCH v2 00/11] Specific Purpose Memory NUMA Nodes
Date: Thu, 11 Dec 2025 00:29:15 +0100 [thread overview]
Message-ID: <CAOi6=wTCPDM4xDJyzB1SdU6ChDch27eyTUtTAmajRNFhOFUN=A@mail.gmail.com> (raw)
In-Reply-To: <aSSepu6NDqS8HHCa@gourry-fedora-PF4VCD3F>
Just managed to go through the series, and I think there are very good
ideas here. It seems to cover the isolation requirements needed for
devices with inline compression.

Since this is an RFC, I can try to build something on top of it and
test it further. I hope we find the right abstractions for this to
move forward.
On Tue, Nov 25, 2025 at 6:58 AM Gregory Price <gourry@gourry.net> wrote:
>
> On Mon, Nov 24, 2025 at 10:19:37AM +0100, David Hildenbrand (Red Hat) wrote:
> > [...]
> >
>
> Apologies in advance for the wall of text; both of your questions really
> do cut to the core of the series. The first (SPM nodes) is basically a
> plumbing problem I haven't had time to address pre-LPC, the second (GFP)
> is actually a design decision that is definitely up in the air.
>
> So consider this a dump of everything I wouldn't have had time to cover
> in the LPC session.
>
> > > 3) Addition of MHP_SPM_NODE flag to instruct memory_hotplug.c that the
> > > capacity being added should mark the node as an SPM Node.
> >
> > Sounds a bit like the wrong interface for configuring this. This smells like
> > a per-node setting that should be configured before hotplugging any memory.
> >
>
> Assuming you're specifically talking about the MHP portion of this.
>
> I agree, and I think the plumbing ultimately goes through ACPI and
> kernel configs. This was my shortest path to demonstrate a functional
> prototype by LPC.
>
> I think the most likely option is simply reserving additional NUMA nodes
> for hotpluggable regions based on a Kconfig setting.
>
> I think the real setup process should look as follows:
>
> 1. At __init time, Linux reserves additional SPM nodes based on some
> configuration (build? runtime? etc)
>
> Essentially create: nodes[N_SPM]
>
> 2. At SPM setup time, a driver registers an "Abstract Type" with
> mm/memory_tiers.c which maps SPM->Type.
>
> This gives the core some management callback infrastructure without
> polluting the core with device specific nonsense.
>
> This also gives the driver a chance to define things like SLIT
> distances for those nodes, which otherwise won't exist.
>
> 3. At hotplug time, memory_hotplug.c should only have to flip a bit
> in `mt_sysram_nodes` if NID is not in nodes[N_SPM]. That logic
> is still there to ensure the base filtering works as intended.
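
To make sure I am reading steps 1 and 3 the way you intend them, here is
a minimal sketch of the flow I have in mind. N_SPM as a node state,
CONFIG_SPM_RESERVED_NODES and the helper names below are placeholders I
made up; only mt_sysram_nodes is from your series:

#include <linux/init.h>
#include <linux/nodemask.h>

/* Step 1: at __init time, set aside unused node ids as SPM nodes. */
static int __init spm_reserve_nodes(void)
{
	int nid, reserved = 0;

	for (nid = MAX_NUMNODES - 1; nid >= 0; nid--) {
		if (reserved == CONFIG_SPM_RESERVED_NODES)
			break;
		if (node_state(nid, N_ONLINE))
			continue;		/* already backing sysram */
		node_set_state(nid, N_SPM);	/* hypothetical node state */
		reserved++;
	}
	return 0;
}
early_initcall(spm_reserve_nodes);

/* Step 3: at hotplug time, only non-SPM nodes join the default sysram mask. */
static void spm_account_hotplugged_node(int nid)
{
	if (!node_state(nid, N_SPM))
		node_set(nid, mt_sysram_nodes);
}

Is that roughly the shape you have in mind?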
>
>
> I haven't quite figured out how to plumb out nodes[N_SPM] as described
> above, but I did figure out how to demonstrate roughly the same effect
> through memory_hotplug.c - hopefully that much is clear.
>
> The problem with the above plan is whether it "makes sense" according
> to ACPI specs and friends.
>
> This operates in "Ambiguity Land", which is uncomfortable.
What you describe at a high level above makes sense. And while I agree
that ACPI seems like a good layer for this, it could take a while for
things to converge. At the same time, different vendors might do things
differently (unsurprisingly, I guess...). For example, it would not be
absurd for the "specialness" of the device (e.g. compression) to show
up as a vendor-specific capability in CXL. So it would make sense to
allow specific device drivers to set the respective node as SPM (which
is what I understood you to suggest above, right?).
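
To make that concrete, here is a rough sketch of the driver-side hook I
have in mind. example_region_probe() and the has_compression parameter
are made-up names for illustration; MHP_SPM_NODE is the flag from patch
08 of this series and add_memory_driver_managed() is the existing
hotplug entry point:

static int example_region_probe(int nid, u64 start, u64 size,
				bool has_compression)
{
	mhp_t flags = MHP_MERGE_RESOURCE;

	/*
	 * The driver, not ACPI, decides: a vendor-specific capability
	 * such as inline compression makes the node Specific Purpose,
	 * so its capacity never lands in the default sysram nodemask.
	 */
	if (has_compression)
		flags |= MHP_SPM_NODE;

	return add_memory_driver_managed(nid, start, size,
					 "System RAM (example SPM)", flags);
}

That would be essentially the same plumbing as the spm_node bit in
patches 09/10, just keyed off whatever capability bit the device
exposes rather than a static property of the region.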
Finally, going back to the isolation, I'm curious to see whether this
covers the GPU use cases Alistair brought up, or HBM in general. Maybe
there could be synergies with the HBM-related talk in the device MC?
Best,
/Yiannis
Thread overview: 33+ messages
2025-11-12 19:29 Gregory Price
2025-11-12 19:29 ` [RFC PATCH v2 01/11] mm: constify oom_control, scan_control, and alloc_context nodemask Gregory Price
2025-12-15 6:11 ` Balbir Singh
2025-11-12 19:29 ` [RFC PATCH v2 02/11] mm: change callers of __cpuset_zone_allowed to cpuset_zone_allowed Gregory Price
2025-12-15 6:14 ` Balbir Singh
2025-12-15 12:38 ` Gregory Price
2025-11-12 19:29 ` [RFC PATCH v2 03/11] gfp: Add GFP_SPM_NODE for Specific Purpose Memory (SPM) allocations Gregory Price
2025-11-12 19:29 ` [RFC PATCH v2 04/11] memory-tiers: Introduce SysRAM and Specific Purpose Memory Nodes Gregory Price
2025-11-12 19:29 ` [RFC PATCH v2 05/11] mm: restrict slub, oom, compaction, and page_alloc to sysram by default Gregory Price
2025-11-12 19:29 ` [RFC PATCH v2 06/11] mm,cpusets: rename task->mems_allowed to task->sysram_nodes Gregory Price
2025-11-12 19:29 ` [RFC PATCH v2 07/11] cpuset: introduce cpuset.mems.sysram Gregory Price
2025-11-12 19:29 ` [RFC PATCH v2 08/11] mm/memory_hotplug: add MHP_SPM_NODE flag Gregory Price
2025-11-13 14:58 ` [PATCH] memory-tiers: multi-definition fixup Gregory Price
2025-11-13 16:37 ` kernel test robot
2025-11-12 19:29 ` [RFC PATCH v2 09/11] drivers/dax: add spm_node bit to dev_dax Gregory Price
2025-11-12 19:29 ` [RFC PATCH v2 10/11] drivers/cxl: add spm_node bit to cxl region Gregory Price
2025-11-12 19:29 ` [RFC PATCH v2 11/11] [HACK] mm/zswap: compressed ram integration example Gregory Price
2025-11-18 7:02 ` [RFC LPC2026 PATCH v2 00/11] Specific Purpose Memory NUMA Nodes Alistair Popple
2025-11-18 10:36 ` Gregory Price
2025-11-21 21:07 ` Gregory Price
2025-11-23 23:09 ` Alistair Popple
2025-11-24 15:28 ` Gregory Price
2025-11-27 5:03 ` Alistair Popple
2025-11-24 9:19 ` David Hildenbrand (Red Hat)
2025-11-24 18:06 ` Gregory Price
2025-12-10 23:29 ` Yiannis Nikolakopoulos [this message]
2025-11-25 14:09 ` Kiryl Shutsemau
2025-11-25 15:05 ` Gregory Price
2025-11-27 5:12 ` Alistair Popple
2025-11-26 3:23 ` Balbir Singh
2025-11-26 8:29 ` Gregory Price
2025-12-03 4:36 ` Balbir Singh
2025-12-03 5:25 ` Gregory Price