From: Gregory Price <gourry@gourry.net>
To: linux-mm@kvack.org, cgroups@vger.kernel.org, linux-cxl@vger.kernel.org
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, kernel-team@meta.com,
	longman@redhat.com, tj@kernel.org, hannes@cmpxchg.org,
	mkoutny@suse.com, corbet@lwn.net, gregkh@linuxfoundation.org,
	rafael@kernel.org, dakr@kernel.org, dave@stgolabs.net,
	jonathan.cameron@huawei.com, dave.jiang@intel.com,
	alison.schofield@intel.com, vishal.l.verma@intel.com,
	ira.weiny@intel.com, dan.j.williams@intel.com,
	akpm@linux-foundation.org, vbabka@suse.cz, surenb@google.com,
	mhocko@suse.com, jackmanb@google.com, ziy@nvidia.com,
	david@kernel.org, lorenzo.stoakes@oracle.com,
	Liam.Howlett@oracle.com, rppt@kernel.org,
	axelrasmussen@google.com, yuanchu@google.com, weixugc@google.com,
	yury.norov@gmail.com, linux@rasmusvillemoes.dk,
	rientjes@google.com, shakeel.butt@linux.dev, chrisl@kernel.org,
	kasong@tencent.com, shikemeng@huaweicloud.com, nphamcs@gmail.com,
	bhe@redhat.com, baohua@kernel.org, yosry.ahmed@linux.dev,
	chengming.zhou@linux.dev, roman.gushchin@linux.dev,
	muchun.song@linux.dev, osalvador@suse.de,
	matthew.brost@intel.com, joshua.hahnjy@gmail.com,
	rakie.kim@sk.com, byungchul@sk.com, gourry@gourry.net,
	ying.huang@linux.alibaba.com, apopple@nvidia.com, cl@gentwo.org,
	harry.yoo@oracle.com, zhengqi.arch@bytedance.com
Subject: [RFC PATCH v3 8/8] drivers/cxl: add zswap private_region type
Date: Thu,  8 Jan 2026 15:37:55 -0500
Message-ID: <20260108203755.1163107-9-gourry@gourry.net>
In-Reply-To: <20260108203755.1163107-1-gourry@gourry.net>

Add a sample zswap region type, which registers itself as a valid
target node with mm/zswap.  Zswap calls back into the driver on each
page allocation and free.

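In cxl_zswap_page_allocated(), a real driver would check whether the
current compression ratio leaves enough headroom for the worst case
before allowing new writes.  The sample callback always succeeds.

In cxl_zswap_page_freed(), zero the page so that the device can reclaim
the capacity it consumed.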
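
As a rough illustration of the worst-case check, the userspace sketch
below models a device whose physical backing is smaller than the
capacity it exposes.  Everything in it is hypothetical and not part of
this patch: struct cram_model, cram_page_allocated(), cram_page_freed()
and the page-granular accounting are assumptions made for illustration;
real hardware would report its own usage.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of a compressed-memory device (not a real API). */
struct cram_model {
	uint64_t phys_pages;	/* physical backing capacity, in pages */
	uint64_t phys_used;	/* physical pages charged so far */
	uint64_t reserve;	/* conservative safety margin, in pages */
};

/*
 * Worst case: the new page is incompressible and costs one full
 * physical page.  Refuse when that would eat into the reserve.
 */
int cram_page_allocated(struct cram_model *m)
{
	assert(m);
	if (m->phys_used + 1 + m->reserve > m->phys_pages)
		return -ENOSPC;
	m->phys_used += 1;	/* charge the worst case up front */
	return 0;
}

/* A freed (zeroed) page returns its worst-case charge to the device. */
void cram_page_freed(struct cram_model *m)
{
	assert(m);
	if (m->phys_used > 0)
		m->phys_used -= 1;
}
```

With phys_pages = 4 and reserve = 1, three allocations succeed, the
fourth returns -ENOSPC, and one free makes room for one more.  Charging
the worst case up front is deliberately pessimistic; a device that
reports actual compressed usage could admit more pages.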

A device driver registering a zswap private region would need to tell
this component whether to allow new allocations.  This would probably
be done via an interrupt that sets a bit indicating the compression
ratio has reached some conservative threshold.
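
One possible shape for that indicator, sketched in userspace C11 rather
than kernel code.  struct cram_dev, cram_irq_threshold() and
cram_allow_alloc() are hypothetical names that this series does not
define; an in-kernel version would more likely use set_bit()/test_bit()
on a flags word updated from the device's interrupt handler.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-device state; not defined by this patch. */
struct cram_dev {
	atomic_bool ratio_low;	/* true while the device reports low headroom */
};

/* Stand-in for the interrupt handler: record the threshold state. */
void cram_irq_threshold(struct cram_dev *dev, bool crossed)
{
	assert(dev);
	atomic_store_explicit(&dev->ratio_low, crossed, memory_order_release);
}

/* Gate that a page_allocated callback could consult before returning. */
int cram_allow_alloc(struct cram_dev *dev)
{
	assert(dev);
	if (atomic_load_explicit(&dev->ratio_low, memory_order_acquire))
		return -ENOSPC;	/* tell the caller to try another node */
	return 0;
}
```

The flag is advisory: an allocation can still race with the interrupt,
which is why the threshold would need to be conservative.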

Signed-off-by: Gregory Price <gourry@gourry.net>
---
 drivers/cxl/core/private_region/Makefile      |   3 +
 .../cxl/core/private_region/private_region.c  |  10 ++
 .../cxl/core/private_region/private_region.h  |   4 +
 drivers/cxl/core/private_region/zswap.c       | 127 ++++++++++++++++++
 drivers/cxl/cxl.h                             |   2 +
 5 files changed, 146 insertions(+)
 create mode 100644 drivers/cxl/core/private_region/zswap.c

diff --git a/drivers/cxl/core/private_region/Makefile b/drivers/cxl/core/private_region/Makefile
index d17498129ba6..ba495cd3f89f 100644
--- a/drivers/cxl/core/private_region/Makefile
+++ b/drivers/cxl/core/private_region/Makefile
@@ -7,3 +7,6 @@ ccflags-y += -I$(srctree)/drivers/cxl
 
 # Core dispatch and sysfs
 obj-$(CONFIG_CXL_REGION) += private_region.o
+
+# Type-specific implementations
+obj-$(CONFIG_CXL_REGION) += zswap.o
diff --git a/drivers/cxl/core/private_region/private_region.c b/drivers/cxl/core/private_region/private_region.c
index ead48abb9fc7..da5fb3d264e1 100644
--- a/drivers/cxl/core/private_region/private_region.c
+++ b/drivers/cxl/core/private_region/private_region.c
@@ -16,6 +16,8 @@
 static const char *private_type_to_string(enum cxl_private_region_type type)
 {
 	switch (type) {
+	case CXL_PRIVATE_ZSWAP:
+		return "zswap";
 	default:
 		return "";
 	}
@@ -23,6 +25,8 @@ static const char *private_type_to_string(enum cxl_private_region_type type)
 
 static enum cxl_private_region_type string_to_private_type(const char *str)
 {
+	if (sysfs_streq(str, "zswap"))
+		return CXL_PRIVATE_ZSWAP;
 	return CXL_PRIVATE_NONE;
 }
 
@@ -88,6 +92,9 @@ int cxl_register_private_region(struct cxl_region *cxlr)
 
 	/* Call type-specific registration which sets memtype and callbacks */
 	switch (cxlr->private_type) {
+	case CXL_PRIVATE_ZSWAP:
+		rc = cxl_register_zswap_region(cxlr);
+		break;
 	default:
 		dev_dbg(&cxlr->dev, "unsupported private_type: %d\n",
 			cxlr->private_type);
@@ -113,6 +120,9 @@ void cxl_unregister_private_region(struct cxl_region *cxlr)
 
 	/* Dispatch to type-specific cleanup */
 	switch (cxlr->private_type) {
+	case CXL_PRIVATE_ZSWAP:
+		cxl_unregister_zswap_region(cxlr);
+		break;
 	default:
 		break;
 	}
diff --git a/drivers/cxl/core/private_region/private_region.h b/drivers/cxl/core/private_region/private_region.h
index 9b34e51d8df4..84d43238dbe1 100644
--- a/drivers/cxl/core/private_region/private_region.h
+++ b/drivers/cxl/core/private_region/private_region.h
@@ -7,4 +7,8 @@ struct cxl_region;
 int cxl_register_private_region(struct cxl_region *cxlr);
 void cxl_unregister_private_region(struct cxl_region *cxlr);
 
+/* Type-specific registration functions - called from private_region.c dispatch */
+int cxl_register_zswap_region(struct cxl_region *cxlr);
+void cxl_unregister_zswap_region(struct cxl_region *cxlr);
+
 #endif /* __CXL_PRIVATE_REGION_H__ */
diff --git a/drivers/cxl/core/private_region/zswap.c b/drivers/cxl/core/private_region/zswap.c
new file mode 100644
index 000000000000..c213abe2fad7
--- /dev/null
+++ b/drivers/cxl/core/private_region/zswap.c
@@ -0,0 +1,127 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * CXL Private Region - zswap type implementation
+ *
+ * This file implements the zswap private region type for CXL devices.
+ * It handles registration/unregistration of CXL regions as zswap
+ * compressed memory targets.
+ */
+
+#include <linux/device.h>
+#include <linux/highmem.h>
+#include <linux/node.h>
+#include <linux/zswap.h>
+#include <linux/memory_hotplug.h>
+#include "../../cxl.h"
+#include "../core.h"
+#include "private_region.h"
+
+/*
+ * CXL zswap region page_allocated callback
+ *
+ * This callback is invoked by zswap when a page is allocated from a private
+ * node to validate that the page is safe to use. For a real compressed memory
+ * device, this would check the device's compression ratio and return an error
+ * if the page cannot safely store data.
+ *
+ * Currently this is a placeholder that always succeeds. A real implementation
+ * would query the device hardware to determine if sufficient compression
+ * headroom exists.
+ */
+static int cxl_zswap_page_allocated(struct page *page, void *data)
+{
+	struct cxl_region *cxlr = data;
+
+	/*
+	 * TODO: Query the CXL device to check if this page allocation is safe.
+	 *
+	 * A real compressed memory device would track its compression ratio
+	 * and report whether it has headroom to accept new data. If the
+	 * compression ratio is too low (device is near capacity), this should
+	 * return -ENOSPC to tell zswap to try another node.
+	 *
+	 * For now, always succeed since we're testing with regular memory.
+	 */
+	dev_dbg(&cxlr->dev, "page_allocated callback for nid %d\n",
+		page_to_nid(page));
+
+	return 0;
+}
+
+/*
+ * CXL zswap region page_freed callback
+ *
+ * This callback is invoked when a page from a private node is being freed.
+ * We zero the page before returning it to the allocator so that the compressed
+ * memory device can reclaim capacity - zeroed pages achieve excellent
+ * compression ratios.
+ */
+static void cxl_zswap_page_freed(struct page *page, void *data)
+{
+	struct cxl_region *cxlr = data;
+
+	/*
+	 * Zero the page to improve the device's compression ratio.
+	 * Zeroed pages compress extremely well, reclaiming device capacity.
+	 */
+	clear_highpage(page);
+
+	dev_dbg(&cxlr->dev, "page_freed callback for nid %d\n",
+		page_to_nid(page));
+}
+
+/*
+ * Unregister a zswap region from the zswap subsystem.
+ *
+ * This function removes the node from zswap direct nodes and unregisters
+ * the private node operations.
+ */
+void cxl_unregister_zswap_region(struct cxl_region *cxlr)
+{
+	int nid;
+
+	if (!cxlr->private ||
+	    cxlr->private_ops.memtype != NODE_MEM_ZSWAP)
+		return;
+
+	if (!cxlr->params.res)
+		return;
+
+	nid = phys_to_target_node(cxlr->params.res->start);
+
+	zswap_remove_direct_node(nid);
+	node_unregister_private(nid, &cxlr->private_ops);
+
+	dev_dbg(&cxlr->dev, "unregistered zswap region for nid %d\n", nid);
+}
+
+/*
+ * Register a zswap region with the zswap subsystem.
+ *
+ * This function sets up the memtype and the page_allocated/page_freed
+ * callbacks, then registers the node with zswap as a direct compression target.
+ * The caller is responsible for adding the dax region after this succeeds.
+ */
+int cxl_register_zswap_region(struct cxl_region *cxlr)
+{
+	int nid, rc;
+
+	if (!cxlr->private || !cxlr->params.res)
+		return -EINVAL;
+
+	nid = phys_to_target_node(cxlr->params.res->start);
+
+	/* Register with node subsystem as zswap memory */
+	cxlr->private_ops.memtype = NODE_MEM_ZSWAP;
+	cxlr->private_ops.page_allocated = cxl_zswap_page_allocated;
+	cxlr->private_ops.page_freed = cxl_zswap_page_freed;
+	rc = node_register_private(nid, &cxlr->private_ops);
+	if (rc)
+		return rc;
+
+	/* Register this node with zswap as a direct compression target */
+	zswap_add_direct_node(nid);
+
+	dev_dbg(&cxlr->dev, "registered zswap region for nid %d\n", nid);
+	return 0;
+}
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index b276956ff88d..89d8ae4e796c 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -534,9 +534,11 @@ enum cxl_partition_mode {
 /**
  * enum cxl_private_region_type - CXL private region types
  * @CXL_PRIVATE_NONE: No private region type set
+ * @CXL_PRIVATE_ZSWAP: Region used for zswap compressed memory
  */
 enum cxl_private_region_type {
 	CXL_PRIVATE_NONE,
+	CXL_PRIVATE_ZSWAP,
 };
 
 /**
-- 
2.52.0



Thread overview: 12+ messages
2026-01-08 20:37 [RFC PATCH v3 0/8] mm,numa: N_PRIVATE node isolation for device-managed memory Gregory Price
2026-01-08 20:37 ` [RFC PATCH v3 1/8] numa,memory_hotplug: create N_PRIVATE (Private Nodes) Gregory Price
2026-01-08 20:37 ` [RFC PATCH v3 2/8] mm: constify oom_control, scan_control, and alloc_context nodemask Gregory Price
2026-01-08 20:37 ` [RFC PATCH v3 3/8] mm: restrict slub, compaction, and page_alloc to sysram Gregory Price
2026-01-08 20:37 ` [RFC PATCH v3 4/8] cpuset: introduce cpuset.mems.sysram Gregory Price
2026-01-08 20:37 ` [RFC PATCH v3 5/8] Documentation/admin-guide/cgroups: update docs for mems_allowed Gregory Price
2026-01-08 20:37 ` [RFC PATCH v3 6/8] drivers/cxl/core/region: add private_region Gregory Price
2026-01-08 20:37 ` [RFC PATCH v3 7/8] mm/zswap: compressed ram direct integration Gregory Price
2026-01-09 16:00   ` Yosry Ahmed
2026-01-09 17:03     ` Gregory Price
2026-01-09 21:40     ` Gregory Price
2026-01-08 20:37 ` Gregory Price [this message]
