linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/3] Enhance sysfs handling for memory hotplug in weighted interleave
@ 2025-04-01  9:08 Rakie Kim
  2025-04-01  9:08 ` [PATCH v4 1/3] mm/mempolicy: Fix memory leaks in weighted interleave sysfs Rakie Kim
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Rakie Kim @ 2025-04-01  9:08 UTC (permalink / raw)
  To: gourry
  Cc: akpm, linux-mm, linux-kernel, linux-cxl, joshua.hahnjy,
	dan.j.williams, ying.huang, david, Jonathan.Cameron, kernel_team,
	honggyu.kim, yunjeong.mun, rakie.kim

The following patch series enhances the weighted interleave policy in the
memory management subsystem by improving sysfs handling, fixing memory leaks,
and introducing dynamic sysfs updates for memory hotplug support.

### Background
The weighted interleave policy distributes memory allocations across multiple
NUMA nodes based on their performance weight, thereby optimizing memory
bandwidth utilization. The weight values are configured through sysfs.

Previously, sysfs entries for weighted interleave were managed statically
at initialization. This led to several issues:
- Memory Leaks: Improper `kobject` deallocation caused memory leaks
  when initialization failed or when nodes were removed.
- Lack of Dynamic Updates: Sysfs attributes were created only during
  initialization, preventing nodes added at runtime from being recognized.
- Handling of Unusable Nodes: Sysfs entries were generated for all
  possible nodes (`N_POSSIBLE`), including memoryless or unavailable nodes,
  leading to sysfs entries for unusable nodes and potential
  misconfigurations.

### Patch Overview
1. [PATCH 1/3] Fix memory leaks in weighted interleave sysfs
   - Ensures proper cleanup of `kobject` allocations.
   - Replaces unnecessary `kfree()` calls with `kobject_put()`, preventing
     memory leaks and improving system stability.

2. [PATCH 2/3] Enable dynamic updates for weighted interleave sysfs
   - Restructures sysfs handling to allow runtime updates.
   - The sysfs attributes are now globally accessible, enabling external
     modules to manage interleave settings dynamically.

3. [PATCH 3/3] Support memory hotplug in weighted interleave
   - Modifies sysfs creation logic to restrict entries to nodes that are
     online and have memory, excluding unusable nodes.
   - Introduces a memory hotplug mechanism to dynamically add and remove
     sysfs attributes when nodes transition into or out of the `N_MEMORY` set.
   - Ensures that sysfs attributes are properly removed when nodes go offline,
     preventing stale or redundant entries from persisting.

These patches have been tested under CXL-based memory configurations,
including hotplug scenarios, to ensure proper behavior and stability.

 mm/mempolicy.c | 195 ++++++++++++++++++++++++++++++++-----------------
 1 file changed, 127 insertions(+), 68 deletions(-)


base-commit: 38fec10eb60d687e30c8c6b5420d86e8149f7557
-- 
2.34.1



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v4 1/3] mm/mempolicy: Fix memory leaks in weighted interleave sysfs
  2025-04-01  9:08 [PATCH v4 0/3] Enhance sysfs handling for memory hotplug in weighted interleave Rakie Kim
@ 2025-04-01  9:08 ` Rakie Kim
  2025-04-01 21:21   ` Dan Williams
  2025-04-01  9:08 ` [PATCH v4 2/3] mm/mempolicy: Support dynamic sysfs updates for weighted interleave Rakie Kim
  2025-04-01  9:08 ` [PATCH v4 3/3] mm/mempolicy: Support memory hotplug in " Rakie Kim
  2 siblings, 1 reply; 8+ messages in thread
From: Rakie Kim @ 2025-04-01  9:08 UTC (permalink / raw)
  To: gourry
  Cc: akpm, linux-mm, linux-kernel, linux-cxl, joshua.hahnjy,
	dan.j.williams, ying.huang, david, Jonathan.Cameron, kernel_team,
	honggyu.kim, yunjeong.mun, rakie.kim

Memory leaks occurred when removing sysfs attributes for weighted
interleave. Improper kobject deallocation led to unreleased memory
when initialization failed or when nodes were removed.

This patch resolves the issue by replacing unnecessary `kfree()`
calls with `kobject_put()`, ensuring proper cleanup and preventing
memory leaks.

By correctly using `kobject_put()`, the release function now
properly deallocates memory without causing resource leaks,
thereby improving system stability.

Fixes: dce41f5ae253 ("mm/mempolicy: implement the sysfs-based weighted_interleave interface")
Signed-off-by: Rakie Kim <rakie.kim@sk.com>
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Signed-off-by: Yunjeong Mun <yunjeong.mun@sk.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
---
 mm/mempolicy.c | 61 +++++++++++++++++++++++++-------------------------
 1 file changed, 31 insertions(+), 30 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index bbaadbeeb291..5950d5d5b85e 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -3448,7 +3448,9 @@ static void sysfs_wi_release(struct kobject *wi_kobj)
 
 	for (i = 0; i < nr_node_ids; i++)
 		sysfs_wi_node_release(node_attrs[i], wi_kobj);
-	kobject_put(wi_kobj);
+
+	kfree(node_attrs);
+	kfree(wi_kobj);
 }
 
 static const struct kobj_type wi_ktype = {
@@ -3494,15 +3496,22 @@ static int add_weighted_interleave_group(struct kobject *root_kobj)
 	struct kobject *wi_kobj;
 	int nid, err;
 
-	wi_kobj = kzalloc(sizeof(struct kobject), GFP_KERNEL);
-	if (!wi_kobj)
+	node_attrs = kcalloc(nr_node_ids, sizeof(struct iw_node_attr *),
+			     GFP_KERNEL);
+	if (!node_attrs)
 		return -ENOMEM;
 
+	wi_kobj = kzalloc(sizeof(struct kobject), GFP_KERNEL);
+	if (!wi_kobj) {
+		err = -ENOMEM;
+		goto node_out;
+	}
+
 	err = kobject_init_and_add(wi_kobj, &wi_ktype, root_kobj,
 				   "weighted_interleave");
 	if (err) {
-		kfree(wi_kobj);
-		return err;
+		kobject_put(wi_kobj);
+		goto err_out;
 	}
 
 	for_each_node_state(nid, N_POSSIBLE) {
@@ -3512,9 +3521,17 @@ static int add_weighted_interleave_group(struct kobject *root_kobj)
 			break;
 		}
 	}
-	if (err)
+	if (err) {
 		kobject_put(wi_kobj);
+		goto err_out;
+	}
+
 	return 0;
+
+node_out:
+	kfree(node_attrs);
+err_out:
+	return err;
 }
 
 static void mempolicy_kobj_release(struct kobject *kobj)
@@ -3528,7 +3545,6 @@ static void mempolicy_kobj_release(struct kobject *kobj)
 	mutex_unlock(&iw_table_lock);
 	synchronize_rcu();
 	kfree(old);
-	kfree(node_attrs);
 	kfree(kobj);
 }
 
@@ -3542,37 +3558,22 @@ static int __init mempolicy_sysfs_init(void)
 	static struct kobject *mempolicy_kobj;
 
 	mempolicy_kobj = kzalloc(sizeof(*mempolicy_kobj), GFP_KERNEL);
-	if (!mempolicy_kobj) {
-		err = -ENOMEM;
-		goto err_out;
-	}
-
-	node_attrs = kcalloc(nr_node_ids, sizeof(struct iw_node_attr *),
-			     GFP_KERNEL);
-	if (!node_attrs) {
-		err = -ENOMEM;
-		goto mempol_out;
-	}
+	if (!mempolicy_kobj)
+		return -ENOMEM;
 
 	err = kobject_init_and_add(mempolicy_kobj, &mempolicy_ktype, mm_kobj,
 				   "mempolicy");
 	if (err)
-		goto node_out;
+		goto err_out;
 
 	err = add_weighted_interleave_group(mempolicy_kobj);
-	if (err) {
-		pr_err("mempolicy sysfs structure failed to initialize\n");
-		kobject_put(mempolicy_kobj);
-		return err;
-	}
+	if (err)
+		goto err_out;
+
+	return 0;
 
-	return err;
-node_out:
-	kfree(node_attrs);
-mempol_out:
-	kfree(mempolicy_kobj);
 err_out:
-	pr_err("failed to add mempolicy kobject to the system\n");
+	kobject_put(mempolicy_kobj);
 	return err;
 }
 
-- 
2.34.1



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v4 2/3] mm/mempolicy: Support dynamic sysfs updates for weighted interleave
  2025-04-01  9:08 [PATCH v4 0/3] Enhance sysfs handling for memory hotplug in weighted interleave Rakie Kim
  2025-04-01  9:08 ` [PATCH v4 1/3] mm/mempolicy: Fix memory leaks in weighted interleave sysfs Rakie Kim
@ 2025-04-01  9:08 ` Rakie Kim
  2025-04-01  9:08 ` [PATCH v4 3/3] mm/mempolicy: Support memory hotplug in " Rakie Kim
  2 siblings, 0 replies; 8+ messages in thread
From: Rakie Kim @ 2025-04-01  9:08 UTC (permalink / raw)
  To: gourry
  Cc: akpm, linux-mm, linux-kernel, linux-cxl, joshua.hahnjy,
	dan.j.williams, ying.huang, david, Jonathan.Cameron, kernel_team,
	honggyu.kim, yunjeong.mun, rakie.kim

Previously, the weighted interleave sysfs structure was statically
managed, preventing dynamic updates when nodes were added or removed.

This patch restructures the weighted interleave sysfs to support
dynamic insertion and deletion. The sysfs that was part of
the 'weighted_interleave_group' is now globally accessible,
allowing external access to that sysfs.

With this change, sysfs management for weighted interleave is
more flexible, supporting hotplug events and runtime updates
more effectively.

Signed-off-by: Rakie Kim <rakie.kim@sk.com>
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Signed-off-by: Yunjeong Mun <yunjeong.mun@sk.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
---
 mm/mempolicy.c | 71 ++++++++++++++++++++++----------------------------
 1 file changed, 31 insertions(+), 40 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 5950d5d5b85e..3092a792bd28 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -3388,6 +3388,13 @@ struct iw_node_attr {
 	int nid;
 };
 
+struct sysfs_wi_group {
+	struct kobject wi_kobj;
+	struct iw_node_attr *nattrs[];
+};
+
+static struct sysfs_wi_group *wi_group;
+
 static ssize_t node_show(struct kobject *kobj, struct kobj_attribute *attr,
 			 char *buf)
 {
@@ -3430,27 +3437,24 @@ static ssize_t node_store(struct kobject *kobj, struct kobj_attribute *attr,
 	return count;
 }
 
-static struct iw_node_attr **node_attrs;
-
-static void sysfs_wi_node_release(struct iw_node_attr *node_attr,
-				  struct kobject *parent)
+static void sysfs_wi_node_release(int nid)
 {
-	if (!node_attr)
+	if (!wi_group->nattrs[nid])
 		return;
-	sysfs_remove_file(parent, &node_attr->kobj_attr.attr);
-	kfree(node_attr->kobj_attr.attr.name);
-	kfree(node_attr);
+
+	sysfs_remove_file(&wi_group->wi_kobj,
+			  &wi_group->nattrs[nid]->kobj_attr.attr);
+	kfree(wi_group->nattrs[nid]->kobj_attr.attr.name);
+	kfree(wi_group->nattrs[nid]);
 }
 
 static void sysfs_wi_release(struct kobject *wi_kobj)
 {
-	int i;
-
-	for (i = 0; i < nr_node_ids; i++)
-		sysfs_wi_node_release(node_attrs[i], wi_kobj);
+	int nid;
 
-	kfree(node_attrs);
-	kfree(wi_kobj);
+	for (nid = 0; nid < nr_node_ids; nid++)
+		sysfs_wi_node_release(nid);
+	kfree(wi_group);
 }
 
 static const struct kobj_type wi_ktype = {
@@ -3458,7 +3462,7 @@ static const struct kobj_type wi_ktype = {
 	.release = sysfs_wi_release,
 };
 
-static int add_weight_node(int nid, struct kobject *wi_kobj)
+static int sysfs_wi_node_add(int nid)
 {
 	struct iw_node_attr *node_attr;
 	char *name;
@@ -3480,57 +3484,44 @@ static int add_weight_node(int nid, struct kobject *wi_kobj)
 	node_attr->kobj_attr.store = node_store;
 	node_attr->nid = nid;
 
-	if (sysfs_create_file(wi_kobj, &node_attr->kobj_attr.attr)) {
+	if (sysfs_create_file(&wi_group->wi_kobj, &node_attr->kobj_attr.attr)) {
 		kfree(node_attr->kobj_attr.attr.name);
 		kfree(node_attr);
 		pr_err("failed to add attribute to weighted_interleave\n");
 		return -ENOMEM;
 	}
 
-	node_attrs[nid] = node_attr;
+	wi_group->nattrs[nid] = node_attr;
 	return 0;
 }
 
-static int add_weighted_interleave_group(struct kobject *root_kobj)
+static int add_weighted_interleave_group(struct kobject *mempolicy_kobj)
 {
-	struct kobject *wi_kobj;
 	int nid, err;
 
-	node_attrs = kcalloc(nr_node_ids, sizeof(struct iw_node_attr *),
-			     GFP_KERNEL);
-	if (!node_attrs)
+	wi_group = kzalloc(sizeof(struct sysfs_wi_group) + 		\
+		       nr_node_ids * sizeof(struct iw_node_attr *),	\
+		       GFP_KERNEL);
+	if (!wi_group)
 		return -ENOMEM;
 
-	wi_kobj = kzalloc(sizeof(struct kobject), GFP_KERNEL);
-	if (!wi_kobj) {
-		err = -ENOMEM;
-		goto node_out;
-	}
-
-	err = kobject_init_and_add(wi_kobj, &wi_ktype, root_kobj,
+	err = kobject_init_and_add(&wi_group->wi_kobj, &wi_ktype, mempolicy_kobj,
 				   "weighted_interleave");
-	if (err) {
-		kobject_put(wi_kobj);
+	if (err)
 		goto err_out;
-	}
 
 	for_each_node_state(nid, N_POSSIBLE) {
-		err = add_weight_node(nid, wi_kobj);
+		err = sysfs_wi_node_add(nid);
 		if (err) {
 			pr_err("failed to add sysfs [node%d]\n", nid);
-			break;
+			goto err_out;
 		}
 	}
-	if (err) {
-		kobject_put(wi_kobj);
-		goto err_out;
-	}
 
 	return 0;
 
-node_out:
-	kfree(node_attrs);
 err_out:
+	kobject_put(&wi_group->wi_kobj);
 	return err;
 }
 
-- 
2.34.1



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v4 3/3] mm/mempolicy: Support memory hotplug in weighted interleave
  2025-04-01  9:08 [PATCH v4 0/3] Enhance sysfs handling for memory hotplug in weighted interleave Rakie Kim
  2025-04-01  9:08 ` [PATCH v4 1/3] mm/mempolicy: Fix memory leaks in weighted interleave sysfs Rakie Kim
  2025-04-01  9:08 ` [PATCH v4 2/3] mm/mempolicy: Support dynamic sysfs updates for weighted interleave Rakie Kim
@ 2025-04-01  9:08 ` Rakie Kim
  2025-04-01 20:32   ` Gregory Price
  2 siblings, 1 reply; 8+ messages in thread
From: Rakie Kim @ 2025-04-01  9:08 UTC (permalink / raw)
  To: gourry
  Cc: akpm, linux-mm, linux-kernel, linux-cxl, joshua.hahnjy,
	dan.j.williams, ying.huang, david, Jonathan.Cameron, kernel_team,
	honggyu.kim, yunjeong.mun, rakie.kim

The weighted interleave policy distributes page allocations across multiple
NUMA nodes based on their performance weight, thereby improving memory
bandwidth utilization. The weight values for each node are configured
through sysfs.

Previously, sysfs entries for configuring weighted interleave were created
for all possible nodes (N_POSSIBLE) at initialization, including nodes that
might not have memory. However, not all nodes in N_POSSIBLE are usable at
runtime, as some may remain memoryless or offline.
This led to sysfs entries being created for unusable nodes, causing
potential misconfiguration issues.

To address this issue, this patch modifies the sysfs creation logic to:
1) Limit sysfs entries to nodes that are online and have memory, avoiding
   the creation of sysfs entries for nodes that cannot be used.
2) Support memory hotplug by dynamically adding and removing sysfs entries
   based on whether a node transitions into or out of the N_MEMORY state.

Additionally, the patch ensures that sysfs attributes are properly managed
when nodes go offline, preventing stale or redundant entries from persisting
in the system.

By making these changes, the weighted interleave policy now manages its
sysfs entries more efficiently, ensuring that only relevant nodes are
considered for interleaving, and dynamically adapting to memory hotplug
events.

Signed-off-by: Rakie Kim <rakie.kim@sk.com>
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Signed-off-by: Yunjeong Mun <yunjeong.mun@sk.com>
---
 mm/mempolicy.c | 113 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 90 insertions(+), 23 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 3092a792bd28..fa755d20780c 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -113,6 +113,7 @@
 #include <asm/tlbflush.h>
 #include <asm/tlb.h>
 #include <linux/uaccess.h>
+#include <linux/memory.h>
 
 #include "internal.h"
 
@@ -3390,6 +3391,7 @@ struct iw_node_attr {
 
 struct sysfs_wi_group {
 	struct kobject wi_kobj;
+	struct mutex kobj_lock;
 	struct iw_node_attr *nattrs[];
 };
 
@@ -3439,13 +3441,24 @@ static ssize_t node_store(struct kobject *kobj, struct kobj_attribute *attr,
 
 static void sysfs_wi_node_release(int nid)
 {
-	if (!wi_group->nattrs[nid])
+	struct iw_node_attr *attr;
+
+	if (nid < 0 || nid >= nr_node_ids)
+		return;
+
+	mutex_lock(&wi_group->kobj_lock);
+	attr = wi_group->nattrs[nid];
+	if (!attr) {
+		mutex_unlock(&wi_group->kobj_lock);
 		return;
+	}
 
-	sysfs_remove_file(&wi_group->wi_kobj,
-			  &wi_group->nattrs[nid]->kobj_attr.attr);
-	kfree(wi_group->nattrs[nid]->kobj_attr.attr.name);
-	kfree(wi_group->nattrs[nid]);
+	wi_group->nattrs[nid] = NULL;
+	mutex_unlock(&wi_group->kobj_lock);
+
+	sysfs_remove_file(&wi_group->wi_kobj, &attr->kobj_attr.attr);
+	kfree(attr->kobj_attr.attr.name);
+	kfree(attr);
 }
 
 static void sysfs_wi_release(struct kobject *wi_kobj)
@@ -3464,35 +3477,84 @@ static const struct kobj_type wi_ktype = {
 
 static int sysfs_wi_node_add(int nid)
 {
-	struct iw_node_attr *node_attr;
+	int ret = 0;
 	char *name;
+	struct iw_node_attr *new_attr = NULL;
 
-	node_attr = kzalloc(sizeof(*node_attr), GFP_KERNEL);
-	if (!node_attr)
+	if (nid < 0 || nid >= nr_node_ids) {
+		pr_err("Invalid node id: %d\n", nid);
+		return -EINVAL;
+	}
+
+	new_attr = kzalloc(sizeof(struct iw_node_attr), GFP_KERNEL);
+	if (!new_attr)
 		return -ENOMEM;
 
 	name = kasprintf(GFP_KERNEL, "node%d", nid);
 	if (!name) {
-		kfree(node_attr);
+		kfree(new_attr);
 		return -ENOMEM;
 	}
 
-	sysfs_attr_init(&node_attr->kobj_attr.attr);
-	node_attr->kobj_attr.attr.name = name;
-	node_attr->kobj_attr.attr.mode = 0644;
-	node_attr->kobj_attr.show = node_show;
-	node_attr->kobj_attr.store = node_store;
-	node_attr->nid = nid;
+	mutex_lock(&wi_group->kobj_lock);
+	if (wi_group->nattrs[nid]) {
+		mutex_unlock(&wi_group->kobj_lock);
+		pr_info("Node [%d] already exists\n", nid);
+		kfree(new_attr);
+		kfree(name);
+		return 0;
+	}
 
-	if (sysfs_create_file(&wi_group->wi_kobj, &node_attr->kobj_attr.attr)) {
-		kfree(node_attr->kobj_attr.attr.name);
-		kfree(node_attr);
-		pr_err("failed to add attribute to weighted_interleave\n");
-		return -ENOMEM;
+	wi_group->nattrs[nid] = new_attr;
+	mutex_unlock(&wi_group->kobj_lock);
+
+	sysfs_attr_init(&wi_group->nattrs[nid]->kobj_attr.attr);
+	wi_group->nattrs[nid]->kobj_attr.attr.name = name;
+	wi_group->nattrs[nid]->kobj_attr.attr.mode = 0644;
+	wi_group->nattrs[nid]->kobj_attr.show = node_show;
+	wi_group->nattrs[nid]->kobj_attr.store = node_store;
+	wi_group->nattrs[nid]->nid = nid;
+
+	ret = sysfs_create_file(&wi_group->wi_kobj,
+				&wi_group->nattrs[nid]->kobj_attr.attr);
+	if (ret) {
+		kfree(wi_group->nattrs[nid]->kobj_attr.attr.name);
+		kfree(wi_group->nattrs[nid]);
+		wi_group->nattrs[nid] = NULL;
+		pr_err("Failed to add attribute to weighted_interleave: %d\n", ret);
 	}
 
-	wi_group->nattrs[nid] = node_attr;
-	return 0;
+	return ret;
+}
+
+static int wi_node_notifier(struct notifier_block *nb,
+			       unsigned long action, void *data)
+{
+	int err;
+	struct memory_notify *arg = data;
+	int nid = arg->status_change_nid;
+
+	if (nid < 0)
+		goto notifier_end;
+
+	switch(action) {
+	case MEM_ONLINE:
+		if (node_state(nid, N_MEMORY)) {
+			err = sysfs_wi_node_add(nid);
+			if (err) {
+				pr_err("failed to add sysfs [node%d]\n", nid);
+				return NOTIFY_BAD;
+			}
+		}
+		break;
+	case MEM_OFFLINE:
+		if (!node_state(nid, N_MEMORY))
+			sysfs_wi_node_release(nid);
+		break;
+	}
+
+notifier_end:
+	return NOTIFY_OK;
 }
 
 static int add_weighted_interleave_group(struct kobject *mempolicy_kobj)
@@ -3504,13 +3566,17 @@ static int add_weighted_interleave_group(struct kobject *mempolicy_kobj)
 		       GFP_KERNEL);
 	if (!wi_group)
 		return -ENOMEM;
+	mutex_init(&wi_group->kobj_lock);
 
 	err = kobject_init_and_add(&wi_group->wi_kobj, &wi_ktype, mempolicy_kobj,
 				   "weighted_interleave");
 	if (err)
 		goto err_out;
 
-	for_each_node_state(nid, N_POSSIBLE) {
+	for_each_online_node(nid) {
+		if (!node_state(nid, N_MEMORY))
+			continue;
+
 		err = sysfs_wi_node_add(nid);
 		if (err) {
 			pr_err("failed to add sysfs [node%d]\n", nid);
@@ -3518,6 +3584,7 @@ static int add_weighted_interleave_group(struct kobject *mempolicy_kobj)
 		}
 	}
 
+	hotplug_memory_notifier(wi_node_notifier, DEFAULT_CALLBACK_PRI);
 	return 0;
 
 err_out:
-- 
2.34.1



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 3/3] mm/mempolicy: Support memory hotplug in weighted interleave
  2025-04-01  9:08 ` [PATCH v4 3/3] mm/mempolicy: Support memory hotplug in " Rakie Kim
@ 2025-04-01 20:32   ` Gregory Price
  2025-04-02  1:28     ` Rakie Kim
  0 siblings, 1 reply; 8+ messages in thread
From: Gregory Price @ 2025-04-01 20:32 UTC (permalink / raw)
  To: Rakie Kim
  Cc: akpm, linux-mm, linux-kernel, linux-cxl, joshua.hahnjy,
	dan.j.williams, ying.huang, david, Jonathan.Cameron, kernel_team,
	honggyu.kim, yunjeong.mun

On Tue, Apr 01, 2025 at 06:08:59PM +0900, Rakie Kim wrote:
>  static void sysfs_wi_release(struct kobject *wi_kobj)
> @@ -3464,35 +3477,84 @@ static const struct kobj_type wi_ktype = {
>  
>  static int sysfs_wi_node_add(int nid)
>  {
... snip ..
> +	mutex_lock(&wi_group->kobj_lock);
> +	if (wi_group->nattrs[nid]) {
> +		mutex_unlock(&wi_group->kobj_lock);
> +		pr_info("Node [%d] already exists\n", nid);
> +		kfree(new_attr);
> +		kfree(name);
> +		return 0;
> +	}
>  
> -	if (sysfs_create_file(&wi_group->wi_kobj, &node_attr->kobj_attr.attr)) {
> -		kfree(node_attr->kobj_attr.attr.name);
> -		kfree(node_attr);
> -		pr_err("failed to add attribute to weighted_interleave\n");
> -		return -ENOMEM;
> +	wi_group->nattrs[nid] = new_attr;
> +	mutex_unlock(&wi_group->kobj_lock);
> +

Shouldn't all of this
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
> +	sysfs_attr_init(&wi_group->nattrs[nid]->kobj_attr.attr);
> +	wi_group->nattrs[nid]->kobj_attr.attr.name = name;
> +	wi_group->nattrs[nid]->kobj_attr.attr.mode = 0644;
> +	wi_group->nattrs[nid]->kobj_attr.show = node_show;
> +	wi_group->nattrs[nid]->kobj_attr.store = node_store;
> +	wi_group->nattrs[nid]->nid = nid;
> +
> +	ret = sysfs_create_file(&wi_group->wi_kobj,
> +				&wi_group->nattrs[nid]->kobj_attr.attr);
> +	if (ret) {
> +		kfree(wi_group->nattrs[nid]->kobj_attr.attr.name);
> +		kfree(wi_group->nattrs[nid]);
> +		wi_group->nattrs[nid] = NULL;
> +		pr_err("Failed to add attribute to weighted_interleave: %d\n", ret);
>  	}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Be happening inside the lock as well?

> +
> +	switch(action) {
> +	case MEM_ONLINE:
> +		if (node_state(nid, N_MEMORY)) {

Hm, I see the issue here, ok, this node_state check isn't needed, as it
will always be true.  So this function needs to handle duplicates still.
                          vvv 
> +			err = sysfs_wi_node_add(nid);
> +			if (err) {
> +				pr_err("failed to add sysfs [node%d]\n", nid);
> +				return NOTIFY_BAD;
> +			}
> +		}
> +		break;
> +	case MEM_OFFLINE:
> +		if (!node_state(nid, N_MEMORY))

This check is good for the time being.

> +			sysfs_wi_node_release(nid);
> +		break;
> +	}
> +
> +notifier_end:
> +	return NOTIFY_OK;
>  }
>  
> 

But really I think we probably just want to change to build on top of this:
https://lore.kernel.org/all/20250401092716.537512-2-osalvador@suse.de/

And use register_node_notifier with NODE_BECAME_MEMORYLESS and NODE_BECAME_MEM_AWARE

~Gregory


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 1/3] mm/mempolicy: Fix memory leaks in weighted interleave sysfs
  2025-04-01  9:08 ` [PATCH v4 1/3] mm/mempolicy: Fix memory leaks in weighted interleave sysfs Rakie Kim
@ 2025-04-01 21:21   ` Dan Williams
  2025-04-02  1:29     ` Rakie Kim
  0 siblings, 1 reply; 8+ messages in thread
From: Dan Williams @ 2025-04-01 21:21 UTC (permalink / raw)
  To: Rakie Kim, gourry
  Cc: akpm, linux-mm, linux-kernel, linux-cxl, joshua.hahnjy,
	dan.j.williams, ying.huang, david, Jonathan.Cameron, kernel_team,
	honggyu.kim, yunjeong.mun, rakie.kim

Rakie Kim wrote:
> Memory leaks occurred when removing sysfs attributes for weighted
> interleave. Improper kobject deallocation led to unreleased memory
> when initialization failed or when nodes were removed.
> 
> This patch resolves the issue by replacing unnecessary `kfree()`
> calls with `kobject_put()`, ensuring proper cleanup and preventing
> memory leaks.
> 
> By correctly using `kobject_put()`, the release function now
> properly deallocates memory without causing resource leaks,
> thereby improving system stability.
> 
> Fixes: dce41f5ae253 ("mm/mempolicy: implement the sysfs-based weighted_interleave interface")
> Signed-off-by: Rakie Kim <rakie.kim@sk.com>
> Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
> Signed-off-by: Yunjeong Mun <yunjeong.mun@sk.com>
> Reviewed-by: Gregory Price <gourry@gourry.net>
> ---
>  mm/mempolicy.c | 61 +++++++++++++++++++++++++-------------------------
>  1 file changed, 31 insertions(+), 30 deletions(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index bbaadbeeb291..5950d5d5b85e 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -3448,7 +3448,9 @@ static void sysfs_wi_release(struct kobject *wi_kobj)
>  
>  	for (i = 0; i < nr_node_ids; i++)
>  		sysfs_wi_node_release(node_attrs[i], wi_kobj);
> -	kobject_put(wi_kobj);
> +
> +	kfree(node_attrs);
> +	kfree(wi_kobj);
>  }
>  
>  static const struct kobj_type wi_ktype = {
> @@ -3494,15 +3496,22 @@ static int add_weighted_interleave_group(struct kobject *root_kobj)
>  	struct kobject *wi_kobj;
>  	int nid, err;
>  
> -	wi_kobj = kzalloc(sizeof(struct kobject), GFP_KERNEL);
> -	if (!wi_kobj)
> +	node_attrs = kcalloc(nr_node_ids, sizeof(struct iw_node_attr *),
> +			     GFP_KERNEL);
> +	if (!node_attrs)
>  		return -ENOMEM;
>  
> +	wi_kobj = kzalloc(sizeof(struct kobject), GFP_KERNEL);
> +	if (!wi_kobj) {
> +		err = -ENOMEM;
> +		goto node_out;
> +	}
> +
>  	err = kobject_init_and_add(wi_kobj, &wi_ktype, root_kobj,
>  				   "weighted_interleave");

It would be nice if this could take advantage of scope-based cleanup to
avoid the new gotos. It would need a new:

DEFINE_FREE(kobject_put, struct kobject *, if (!IS_ERR_OR_NULL(_T)) kobject_put(_T))

...and a wrapper around kobject_init_and_add() to support auto cleanup:

struct kobject *kobject_init_and_add_or_errptr(struct kobject *kobj)
{
	int err = kobject_init_and_add(kobj...);

	if (err)
		return ERR_PTR(err);
	return kobj;
}

With those then you could do:

struct kobject *wi_kobj __free(kfree) = kzalloc(sizeof(struct kobject), GFP_KERNEL);
struct kobject *kobj __free(kobject_put) = kobject_init_and_add_or_errptr(no_free_ptr(wi_kobj), ...)

Otherwise, the patch does look good to me as is, but it seems like an
opportunity for further cleanups that might also help other
kobject_init_and_add() code paths.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 3/3] mm/mempolicy: Support memory hotplug in weighted interleave
  2025-04-01 20:32   ` Gregory Price
@ 2025-04-02  1:28     ` Rakie Kim
  0 siblings, 0 replies; 8+ messages in thread
From: Rakie Kim @ 2025-04-02  1:28 UTC (permalink / raw)
  To: Gregory Price
  Cc: akpm, linux-mm, linux-kernel, linux-cxl, joshua.hahnjy,
	dan.j.williams, ying.huang, david, Jonathan.Cameron, kernel_team,
	honggyu.kim, yunjeong.mun, Rakie Kim

On Tue, 1 Apr 2025 16:32:42 -0400 Gregory Price <gourry@gourry.net> wrote:
> On Tue, Apr 01, 2025 at 06:08:59PM +0900, Rakie Kim wrote:
> >  static void sysfs_wi_release(struct kobject *wi_kobj)
> > @@ -3464,35 +3477,84 @@ static const struct kobj_type wi_ktype = {
> >  
> >  static int sysfs_wi_node_add(int nid)
> >  {
> ... snip ..
> > +	mutex_lock(&wi_group->kobj_lock);
> > +	if (wi_group->nattrs[nid]) {
> > +		mutex_unlock(&wi_group->kobj_lock);
> > +		pr_info("Node [%d] already exists\n", nid);
> > +		kfree(new_attr);
> > +		kfree(name);
> > +		return 0;
> > +	}
> >  
> > -	if (sysfs_create_file(&wi_group->wi_kobj, &node_attr->kobj_attr.attr)) {
> > -		kfree(node_attr->kobj_attr.attr.name);
> > -		kfree(node_attr);
> > -		pr_err("failed to add attribute to weighted_interleave\n");
> > -		return -ENOMEM;
> > +	wi_group->nattrs[nid] = new_attr;
> > +	mutex_unlock(&wi_group->kobj_lock);
> > +
> 
> Shouldn't all of this
> vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
> > +	sysfs_attr_init(&wi_group->nattrs[nid]->kobj_attr.attr);
> > +	wi_group->nattrs[nid]->kobj_attr.attr.name = name;
> > +	wi_group->nattrs[nid]->kobj_attr.attr.mode = 0644;
> > +	wi_group->nattrs[nid]->kobj_attr.show = node_show;
> > +	wi_group->nattrs[nid]->kobj_attr.store = node_store;
> > +	wi_group->nattrs[nid]->nid = nid;
> > +
> > +	ret = sysfs_create_file(&wi_group->wi_kobj,
> > +				&wi_group->nattrs[nid]->kobj_attr.attr);
> > +	if (ret) {
> > +		kfree(wi_group->nattrs[nid]->kobj_attr.attr.name);
> > +		kfree(wi_group->nattrs[nid]);
> > +		wi_group->nattrs[nid] = NULL;
> > +		pr_err("Failed to add attribute to weighted_interleave: %d\n", ret);
> >  	}
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Be happening inside the lock as well?

I agree that applying your suggestion would make the code more robust.
I will update the patch to follow your recommendation.

> 
> > +
> > +	switch(action) {
> > +	case MEM_ONLINE:
> > +		if (node_state(nid, N_MEMORY)) {
> 
> Hm, I see the issue here, ok, this node_state check isn't needed, as it
> will always be true.  So this function needs to handle duplicates still.

Yes, you're right. The `node_state(nid, N_MEMORY)` check I added here is
redundant because it will always be true in this context. I will remove it
to avoid unnecessary duplication.

>                           vvv 
> > +			err = sysfs_wi_node_add(nid);
> > +			if (err) {
> > +				pr_err("failed to add sysfs [node%d]\n", nid);
> > +				return NOTIFY_BAD;
> > +			}
> > +		}
> > +		break;
> > +	case MEM_OFFLINE:
> > +		if (!node_state(nid, N_MEMORY))
> 
> This check is good for the time being.

This check looks appropriate for now and I'll keep it as-is.

> 
> > +			sysfs_wi_node_release(nid);
> > +		break;
> > +	}
> > +
> > +notifier_end:
> > +	return NOTIFY_OK;
> >  }
> >  
> > 
> 
> But really I think we probably just want to change to build on top of this:
> https://lore.kernel.org/all/20250401092716.537512-2-osalvador@suse.de/
> 
> And use register_node_notifier with NODE_BECAME_MEMORYLESS and NODE_BECAME_MEM_AWARE
> 
> ~Gregory

Thank you for sharing the link regarding `node_notify`. I agree that the
mechanism you pointed out would be a better fit for this patch.

By using `register_node_notifier` with `NODE_BECAME_MEMORYLESS` and
`NODE_BECAME_MEM_AWARE`, we can avoid unnecessary callbacks and implement
this functionality more efficiently.

However, I think it would be better to apply the current patch first and
then update it to use `node_notify` once that support is finalized and
available upstream.

Rakie



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 1/3] mm/mempolicy: Fix memory leaks in weighted interleave sysfs
  2025-04-01 21:21   ` Dan Williams
@ 2025-04-02  1:29     ` Rakie Kim
  0 siblings, 0 replies; 8+ messages in thread
From: Rakie Kim @ 2025-04-02  1:29 UTC (permalink / raw)
  To: Dan Williams
  Cc: akpm, linux-mm, linux-kernel, linux-cxl, joshua.hahnjy,
	ying.huang, david, Jonathan.Cameron, kernel_team, honggyu.kim,
	yunjeong.mun, rakie.kim, gourry

On Tue, 1 Apr 2025 14:21:12 -0700 Dan Williams <dan.j.williams@intel.com> wrote:
> Rakie Kim wrote:
> > Memory leaks occurred when removing sysfs attributes for weighted
> > interleave. Improper kobject deallocation led to unreleased memory
> > when initialization failed or when nodes were removed.
> > 
> > This patch resolves the issue by replacing unnecessary `kfree()`
> > calls with `kobject_put()`, ensuring proper cleanup and preventing
> > memory leaks.
> > 
> > By correctly using `kobject_put()`, the release function now
> > properly deallocates memory without causing resource leaks,
> > thereby improving system stability.
> > 
> > Fixes: dce41f5ae253 ("mm/mempolicy: implement the sysfs-based weighted_interleave interface")
> > Signed-off-by: Rakie Kim <rakie.kim@sk.com>
> > Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
> > Signed-off-by: Yunjeong Mun <yunjeong.mun@sk.com>
> > Reviewed-by: Gregory Price <gourry@gourry.net>
> > ---
> >  mm/mempolicy.c | 61 +++++++++++++++++++++++++-------------------------
> >  1 file changed, 31 insertions(+), 30 deletions(-)
> > 
> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index bbaadbeeb291..5950d5d5b85e 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -3448,7 +3448,9 @@ static void sysfs_wi_release(struct kobject *wi_kobj)
> >  
> >  	for (i = 0; i < nr_node_ids; i++)
> >  		sysfs_wi_node_release(node_attrs[i], wi_kobj);
> > -	kobject_put(wi_kobj);
> > +
> > +	kfree(node_attrs);
> > +	kfree(wi_kobj);
> >  }
> >  
> >  static const struct kobj_type wi_ktype = {
> > @@ -3494,15 +3496,22 @@ static int add_weighted_interleave_group(struct kobject *root_kobj)
> >  	struct kobject *wi_kobj;
> >  	int nid, err;
> >  
> > -	wi_kobj = kzalloc(sizeof(struct kobject), GFP_KERNEL);
> > -	if (!wi_kobj)
> > +	node_attrs = kcalloc(nr_node_ids, sizeof(struct iw_node_attr *),
> > +			     GFP_KERNEL);
> > +	if (!node_attrs)
> >  		return -ENOMEM;
> >  
> > +	wi_kobj = kzalloc(sizeof(struct kobject), GFP_KERNEL);
> > +	if (!wi_kobj) {
> > +		err = -ENOMEM;
> > +		goto node_out;
> > +	}
> > +
> >  	err = kobject_init_and_add(wi_kobj, &wi_ktype, root_kobj,
> >  				   "weighted_interleave");
> 
> It would be nice if this could take advantage of scope-based cleanup to
> avoid the new gotos. It would need a new:
> 
> DEFINE_FREE(kobject_put, struct kobject *, if (!IS_ERR_OR_NULL(_T)) kobject_put(_T))
> 
> ...and a wrapper around kobject_init_and_add() to support auto cleanup:
> 
> struct kobject *kobject_init_and_add_or_errptr(struct kobject *kobj)
> {
> 	int err = kobject_init_and_add(kobj...);
> 
> 	if (err)
> 		return ERR_PTR(err);
> 	return kobj;
> }
> 
> With those then you could do:
> 
> struct kobject *wi_kobj __free(kfree) = kzalloc(sizeof(struct kobject), GFP_KERNEL);
> struct kobject *kobj __free(kobject_put) = kobject_init_and_add_or_errptr(no_free_ptr(wi_kobj), ...)
> 
> Otherwise, the patch does look good to me as is, but it seems like an
> opportunity for further cleanups that might also help other
> kobject_init_and_add() code paths.

Thank you for your response regarding this patch.

I believe that the method you suggested using `DEFINE_FREE` is an optimal
solution for addressing the memory release issue related to `kobject`.

I also think that similar problems may arise in many other kernel modules
that use `kobject`, and therefore, it would be beneficial to address this
issue more generally in those modules as well.

However, I believe that including such functionality as part of this patch
series may not be ideal. Instead, I think it's better to introduce this
approach as a separate patch for broader application and review.

I will work on creating a follow-up patch based on your suggestion.
Once again, thank you for proposing a clean and helpful solution.

Rakie



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-04-02  1:29 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-04-01  9:08 [PATCH v4 0/3] Enhance sysfs handling for memory hotplug in weighted interleave Rakie Kim
2025-04-01  9:08 ` [PATCH v4 1/3] mm/mempolicy: Fix memory leaks in weighted interleave sysfs Rakie Kim
2025-04-01 21:21   ` Dan Williams
2025-04-02  1:29     ` Rakie Kim
2025-04-01  9:08 ` [PATCH v4 2/3] mm/mempolicy: Support dynamic sysfs updates for weighted interleave Rakie Kim
2025-04-01  9:08 ` [PATCH v4 3/3] mm/mempolicy: Support memory hotplug in " Rakie Kim
2025-04-01 20:32   ` Gregory Price
2025-04-02  1:28     ` Rakie Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox