From: Gregory Price <gourry@gourry.net>
To: Balbir Singh <balbirs@nvidia.com>
Cc: linux-mm@kvack.org, kernel-team@meta.com,
linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org,
nvdimm@lists.linux.dev, linux-fsdevel@vger.kernel.org,
cgroups@vger.kernel.org, dave@stgolabs.net,
jonathan.cameron@huawei.com, dave.jiang@intel.com,
alison.schofield@intel.com, vishal.l.verma@intel.com,
ira.weiny@intel.com, dan.j.williams@intel.com,
longman@redhat.com, akpm@linux-foundation.org, david@redhat.com,
lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
mhocko@suse.com, osalvador@suse.de, ziy@nvidia.com,
matthew.brost@intel.com, joshua.hahnjy@gmail.com,
rakie.kim@sk.com, byungchul@sk.com, ying.huang@linux.alibaba.com,
apopple@nvidia.com, mingo@redhat.com, peterz@infradead.org,
juri.lelli@redhat.com, vincent.guittot@linaro.org,
dietmar.eggemann@arm.com, rostedt@goodmis.org,
bsegall@google.com, mgorman@suse.de, vschneid@redhat.com,
tj@kernel.org, hannes@cmpxchg.org, mkoutny@suse.com,
kees@kernel.org, muchun.song@linux.dev, roman.gushchin@linux.dev,
shakeel.butt@linux.dev, rientjes@google.com, jackmanb@google.com,
cl@gentwo.org, harry.yoo@oracle.com, axelrasmussen@google.com,
yuanchu@google.com, weixugc@google.com,
zhengqi.arch@bytedance.com, yosry.ahmed@linux.dev,
nphamcs@gmail.com, chengming.zhou@linux.dev,
fabio.m.de.francesco@linux.intel.com, rrichter@amd.com,
ming.li@zohomail.com, usamaarif642@gmail.com, brauner@kernel.org,
oleg@redhat.com, namcao@linutronix.de, escape@linux.alibaba.com,
dongjoo.seo1@samsung.com
Subject: Re: [RFC LPC2026 PATCH v2 00/11] Specific Purpose Memory NUMA Nodes
Date: Wed, 3 Dec 2025 00:25:33 -0500
Message-ID: <aS_JzWHHn8hBHSCe@gourry-fedora-PF4VCD3F>
In-Reply-To: <36edd166-7e11-4d43-9839-42467d4399d1@nvidia.com>
On Wed, Dec 03, 2025 at 03:36:33PM +1100, Balbir Singh wrote:
> > - I discussed in my note to David that this is probably the right
> > way to go about doing it. I think N_MEMORY can still be set, if
> > a new global-default-node policy is created.
> >
>
> I still think N_MEMORY as a flag should mean something different from
> N_SPM_NODE_MEMORY because their characteristics are different
>
... snip ... (I agree, see later)
> > - Instead, I can see either per-component policies (reclaim->nodes)
> > or a global policy that covers all of those components (similar to
> > my sysram_nodes). Drivers would then be responsible to register
> > their hotplugged memory nodes with those components accordingly.
> >
>
> To me node zonelists provide the right abstraction of where to allocate from
> and how to fallback as needed. I'll read your patches to figure out how your
> approach is different. I wanted the isolation at allocation time
>
... snip ... (I agree, see later)
>
> Yes, we should look at the pros and cons. To be honest, I wouldn't be
> opposed to having kswapd and reclaim look different for these nodes, it
> would also mean that we'd need pagecache hooks if we want page cache on
> these nodes. Everything else, including move_pages() should just work.
>
Basically my series does (roughly) the same as yours, but adds the
cpusets controls and a GFP flag. The MHP extension should ultimately
be converted to N_SPM_NODE_MEMORY (or whatever we decide to name it).

After some more time to think, I think we want all of it:
- N_SPM_NODE_MEMORY (or whatever we call it) handles filtering out
  SPM at allocation time by default and protects all current users
  of N_MEMORY from exposure to SPM.
- cpusets controls allow userland isolation control and a default sysram
  mask (I think cpusets.sysram_nodes doesn't even need to be exposed via
  sysfs to be honest). cpusets fix is needed due to task->mems_allowed
  being used as a default nodemask on systems using cgroups/cpusets.
- GFP_SPM_NODE protects against someone doing something like:
    get_page_from_freelist(..., node_states[N_POSSIBLE])
  or
    numactl --interleave --all ./my_program
  while providing a way to punch an explicit hole in the isolation
  (GFP_SPM_NODE means "Use N_SPM_NODE_MEMORY instead of N_MEMORY").
  This could be argued against so long as we restrict mempolicy.c
  to N_MEMORY nodes (to avoid `--interleave --all` issues), but this
  limitation may not be preferable.
My concern is for breaking existing userland software that happens
to run on a system with SPM - but you can probably imagine many more
bad scenarios.
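
A rough user-space sketch of the filtering described above, not kernel
code. It assumes nodes fit in a 64-bit mask and that the SPM GFP flag
(named GFP_SPM_NODE here, after patch 03 of the series) simply selects
which node state is used as the base mask. The struct, the helper names
and the flag value are all hypothetical stand-ins for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one bit per NUMA node, at most 64 nodes. */
typedef uint64_t nodemask_model_t;

/* Stand-in value for the proposed flag; the real bit is not specified here. */
#define GFP_SPM_NODE 0x1u

/* Models the two node states discussed in the thread. */
struct node_states_model {
	nodemask_model_t n_memory;          /* ordinary SysRAM nodes */
	nodemask_model_t n_spm_node_memory; /* Specific Purpose Memory nodes */
};

/* Return the mask with only the bit for 'node' set. */
static nodemask_model_t node_bit(unsigned int node)
{
	assert(node < 64);
	return (nodemask_model_t)1 << node;
}

/*
 * Filter a caller-supplied nodemask at allocation time.
 * Without GFP_SPM_NODE the request is clamped to N_MEMORY, so even an
 * "all possible nodes" mask cannot reach SPM. With GFP_SPM_NODE the
 * base mask is N_SPM_NODE_MEMORY instead of N_MEMORY.
 */
static nodemask_model_t filter_nodes(const struct node_states_model *ns,
				     unsigned int gfp,
				     nodemask_model_t requested)
{
	nodemask_model_t base = (gfp & GFP_SPM_NODE)
		? ns->n_spm_node_memory
		: ns->n_memory;

	return requested & base;
}
```

With nodes 0-1 as SysRAM and node 2 as SPM, an all-nodes request without
the flag resolves to nodes 0-1 only, and node 2 becomes reachable only
when the caller passes the flag explicitly. An empty result would have
to be treated as an allocation failure or a fallback by the caller,
which this sketch leaves out.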
~Gregory