From: Dan Williams <dan.j.williams@intel.com>
To: Rakie Kim <rakie.kim@sk.com>, <gourry@gourry.net>
Cc: <akpm@linux-foundation.org>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>, <linux-cxl@vger.kernel.org>,
<joshua.hahnjy@gmail.com>, <dan.j.williams@intel.com>,
<ying.huang@linux.alibaba.com>, <david@redhat.com>,
<Jonathan.Cameron@huawei.com>, <kernel_team@skhynix.com>,
<honggyu.kim@sk.com>, <yunjeong.mun@sk.com>, <rakie.kim@sk.com>
Subject: Re: [PATCH v3 2/3] mm/mempolicy: Support dynamic sysfs updates for weighted interleave
Date: Wed, 2 Apr 2025 09:33:51 -0700
Message-ID: <67ed66ef7c070_9dac294e0@dwillia2-xfh.jf.intel.com.notmuch>
In-Reply-To: <20250320041749.881-3-rakie.kim@sk.com>
Rakie Kim wrote:
> Previously, the weighted interleave sysfs structure was statically
> managed, preventing dynamic updates when nodes were added or removed.
>
> This patch restructures the weighted interleave sysfs to support
> dynamic insertion and deletion. The sysfs that was part of
> the 'weighted_interleave_group' is now globally accessible,
> allowing external access to that sysfs.
>
> With this change, sysfs management for weighted interleave is
> more flexible, supporting hotplug events and runtime updates
> more effectively.
I understand the urge to make a general case for a patch, but it is
better to state the explicit reason, especially since someone reading the
history later may not realize that this is part of a series.
So instead of making claims like "this is more flexible / more effective
for runtime updates", state the motivation explicitly. Something like:

"In preparation for enabling weighted-interleave sysfs attributes to
react to node-online/offline events, introduce sysfs_wi_node_add() and
sysfs_wi_node_delete() helpers to dynamically manage the
weighted-interleave attributes.

A follow-on patch registers a memory-hotplug notifier to use these
helpers. For now, just refactor the current "publish all possible nodes"
approach to use sysfs_wi_node_{add,delete}()."
>
> Signed-off-by: Rakie Kim <rakie.kim@sk.com>
> ---
> mm/mempolicy.c | 70 ++++++++++++++++++++++----------------------------
> 1 file changed, 30 insertions(+), 40 deletions(-)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 5950d5d5b85e..6c8843114afd 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -3388,6 +3388,13 @@ struct iw_node_attr {
> int nid;
> };
>
> +struct sysfs_wi_group {
> + struct kobject wi_kobj;
> + struct iw_node_attr *nattrs[];
> +};
> +
> +static struct sysfs_wi_group *sgrp;
> +
> static ssize_t node_show(struct kobject *kobj, struct kobj_attribute *attr,
> char *buf)
> {
> @@ -3430,27 +3437,23 @@ static ssize_t node_store(struct kobject *kobj, struct kobj_attribute *attr,
> return count;
> }
>
> -static struct iw_node_attr **node_attrs;
> -
> -static void sysfs_wi_node_release(struct iw_node_attr *node_attr,
> - struct kobject *parent)
> +static void sysfs_wi_node_release(int nid)
I called this sysfs_wi_node_delete() above because _release() is
typically the callback invoked on the last put of a kobject.
> {
> - if (!node_attr)
> + if (!sgrp->nattrs[nid])
> return;
> - sysfs_remove_file(parent, &node_attr->kobj_attr.attr);
> - kfree(node_attr->kobj_attr.attr.name);
> - kfree(node_attr);
> +
> + sysfs_remove_file(&sgrp->wi_kobj, &sgrp->nattrs[nid]->kobj_attr.attr);
> + kfree(sgrp->nattrs[nid]->kobj_attr.attr.name);
> + kfree(sgrp->nattrs[nid]);
> }
>
> static void sysfs_wi_release(struct kobject *wi_kobj)
> {
> - int i;
> -
> - for (i = 0; i < nr_node_ids; i++)
> - sysfs_wi_node_release(node_attrs[i], wi_kobj);
> + int nid;
>
> - kfree(node_attrs);
> - kfree(wi_kobj);
> + for (nid = 0; nid < nr_node_ids; nid++)
> + sysfs_wi_node_release(nid);
> + kfree(sgrp);
This looks broken. Are you sure that a kobject with a zero reference
count can still host child attributes?

The teardown flow I would expect is:

    sysfs_remove_file(node_attrs[i], ...)
    kobject_del(wi_kobj)
    ...that does final kobject_put()...
    kfree(container_of(wi_kobj))
However, I now think patch 1 is not actually fixing anything, because
there is never a kobject_del() of the mempolicy_kobj, just as there is
never a kobject_del() of the mm_kobj.

So patch 1 seems to potentially be addressing a bug introduced by this
dynamic work, which in turn is caused by the original code being confused
about the kobject shutdown path.

The original problems are that sysfs_wi_release() has a kobject_put(),
which, yes, is broken, but equally problematic is that there is no
kobject_del() in sight for either of these kobjects, even with the new
changes. mempolicy_kobj_release() seems to confuse the activities that I
would expect to be near a kobject_del() call with the minimal kfree() on
final put.
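
As a rough illustration of the ordering described above, here is a toy
userspace model, not kernel code. Every toy_* name is hypothetical, and
the asserts in the release callback encode this model's own rule (the
object must be unpublished before the final put), not a check the kernel
performs. It only shows the split between "unpublish" work done near the
del step and the bare free done in the release callback:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for struct kobject: a refcount plus "published" state. */
struct toy_kobj {
	int refcount;
	bool in_sysfs;		/* true until toy_kobj_del() */
	int nr_files;		/* attribute files still published */
	void (*release)(struct toy_kobj *kobj);
};

/* Toy stand-in for the group that embeds the kobject. */
struct toy_wi_group {
	int nr_nodes;
	struct toy_kobj kobj;
};

static int toy_released;	/* counts release() calls */

/* Final-put callback: everything is already unpublished, so only free. */
static void toy_wi_release(struct toy_kobj *kobj)
{
	struct toy_wi_group *grp = (struct toy_wi_group *)
		((char *)kobj - offsetof(struct toy_wi_group, kobj));

	/* Rule of this model only: release must not see published state. */
	assert(!kobj->in_sysfs && kobj->nr_files == 0);
	toy_released++;
	free(grp);
}

/* Models kobject_del(): unpublish the object, keep the memory. */
static void toy_kobj_del(struct toy_kobj *kobj)
{
	kobj->in_sysfs = false;
}

/* Models kobject_put(): run release() when the last reference drops. */
static void toy_kobj_put(struct toy_kobj *kobj)
{
	assert(kobj->refcount > 0);
	if (--kobj->refcount == 0)
		kobj->release(kobj);
}

/* Create a group with one published "file" per node. */
static struct toy_wi_group *toy_wi_create(int nr_nodes)
{
	struct toy_wi_group *grp = calloc(1, sizeof(*grp));

	if (!grp)
		return NULL;
	grp->nr_nodes = nr_nodes;
	grp->kobj.refcount = 1;
	grp->kobj.in_sysfs = true;
	grp->kobj.nr_files = nr_nodes;
	grp->kobj.release = toy_wi_release;
	return grp;
}

/* Expected ordering: remove files, del, then the final put frees. */
static void toy_wi_teardown(struct toy_wi_group *grp)
{
	while (grp->kobj.nr_files > 0)
		grp->kobj.nr_files--;	/* the sysfs_remove_file() step */
	toy_kobj_del(&grp->kobj);
	toy_kobj_put(&grp->kobj);
}
```

In this model, calling toy_kobj_put() without the preceding remove and
del steps would trip the assert in toy_wi_release(), which is the kind
of ordering confusion I am pointing at.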
> }
>
> static const struct kobj_type wi_ktype = {
> @@ -3458,7 +3461,7 @@ static const struct kobj_type wi_ktype = {
> .release = sysfs_wi_release,
> };
>
> -static int add_weight_node(int nid, struct kobject *wi_kobj)
> +static int sysfs_wi_node_add(int nid)
> {
> struct iw_node_attr *node_attr;
> char *name;
> @@ -3480,57 +3483,44 @@ static int add_weight_node(int nid, struct kobject *wi_kobj)
> node_attr->kobj_attr.store = node_store;
> node_attr->nid = nid;
>
> - if (sysfs_create_file(wi_kobj, &node_attr->kobj_attr.attr)) {
> + if (sysfs_create_file(&sgrp->wi_kobj, &node_attr->kobj_attr.attr)) {
> kfree(node_attr->kobj_attr.attr.name);
> kfree(node_attr);
> pr_err("failed to add attribute to weighted_interleave\n");
> return -ENOMEM;
> }
>
> - node_attrs[nid] = node_attr;
> + sgrp->nattrs[nid] = node_attr;
> return 0;
> }
>
> -static int add_weighted_interleave_group(struct kobject *root_kobj)
> +static int add_weighted_interleave_group(struct kobject *mempolicy_kobj)
> {
> - struct kobject *wi_kobj;
> int nid, err;
>
> - node_attrs = kcalloc(nr_node_ids, sizeof(struct iw_node_attr *),
> - GFP_KERNEL);
> - if (!node_attrs)
> + sgrp = kzalloc(sizeof(struct sysfs_wi_group) + \
> + nr_node_ids * sizeof(struct iw_node_attr *), \
> + GFP_KERNEL);
The recommended way to allocate a struct with a flexible array is to use
the struct_size() helper:

    kzalloc(struct_size(sgrp, nattrs, nr_node_ids), GFP_KERNEL)

...but overall I think the original code needs a cleanup. To be clear,
I think there is no memory-leak risk exposed to existing users, given
that the shutdown path is never invoked.
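
For anyone unfamiliar with what struct_size() buys over the open-coded
arithmetic, here is a minimal userspace sketch of the same idea. The
struct layouts are simplified stand-ins, and wi_group_size() and
wi_group_alloc() are hypothetical names, not kernel APIs. The point is
that the size computation saturates on overflow, so the allocation
fails instead of silently coming up short:

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel types in the quoted patch. */
struct iw_node_attr {
	int nid;
};

struct sysfs_wi_group {
	long wi_kobj;			/* placeholder for struct kobject */
	struct iw_node_attr *nattrs[];	/* flexible array member */
};

/*
 * Hypothetical helper: size of the header plus n trailing pointers,
 * saturating to SIZE_MAX if the arithmetic would overflow.
 */
static size_t wi_group_size(size_t n)
{
	size_t elem = sizeof(struct iw_node_attr *);
	size_t base = sizeof(struct sysfs_wi_group);

	if (n > (SIZE_MAX - base) / elem)
		return SIZE_MAX;
	return base + n * elem;
}

/* Zeroed allocation, loosely analogous to kzalloc(struct_size(...)). */
static struct sysfs_wi_group *wi_group_alloc(size_t n)
{
	size_t size = wi_group_size(n);

	if (size == SIZE_MAX)
		return NULL;	/* overflowed request: refuse to allocate */
	return calloc(1, size);
}
```

The open-coded sizeof() + n * sizeof() form in the patch has no such
overflow handling, which is the main reason the helper is preferred.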