linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Rakie Kim <rakie.kim@sk.com>
To: Gregory Price <gourry@gourry.net>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, linux-cxl@vger.kernel.org,
	joshua.hahnjy@gmail.com, dan.j.williams@intel.com,
	ying.huang@linux.alibaba.com, david@redhat.com,
	Jonathan.Cameron@huawei.com, kernel_team@skhynix.com,
	honggyu.kim@sk.com, yunjeong.mun@sk.com,
	Rakie Kim <rakie.kim@sk.com>
Subject: Re: [PATCH v3 3/3] mm/mempolicy: Support memory hotplug in weighted interleave
Date: Mon, 24 Mar 2025 17:48:39 +0900	[thread overview]
Message-ID: <20250324084920.987-1-rakie.kim@sk.com> (raw)
In-Reply-To: <Z912rrV4xTOBwEP7@gourry-fedora-PF4VCD3F>

On Fri, 21 Mar 2025 10:24:46 -0400 Gregory Price <gourry@gourry.net> wrote:
> On Thu, Mar 20, 2025 at 01:17:48PM +0900, Rakie Kim wrote:
> ... snip ...
> > +	mutex_lock(&sgrp->kobj_lock);
> > +	if (sgrp->nattrs[nid]) {
> > +		mutex_unlock(&sgrp->kobj_lock);
> > +		pr_info("Node [%d] already exists\n", nid);
> > +		kfree(new_attr);
> > +		kfree(name);
> > +		return 0;
> > +	}
> >  
> > -	if (sysfs_create_file(&sgrp->wi_kobj, &node_attr->kobj_attr.attr)) {
> > -		kfree(node_attr->kobj_attr.attr.name);
> > -		kfree(node_attr);
> > -		pr_err("failed to add attribute to weighted_interleave\n");
> > -		return -ENOMEM;
> > +	sgrp->nattrs[nid] = new_attr;
> > +	mutex_unlock(&sgrp->kobj_lock);
> > +
> > +	sysfs_attr_init(&sgrp->nattrs[nid]->kobj_attr.attr);
> > +	sgrp->nattrs[nid]->kobj_attr.attr.name = name;
> > +	sgrp->nattrs[nid]->kobj_attr.attr.mode = 0644;
> > +	sgrp->nattrs[nid]->kobj_attr.show = node_show;
> > +	sgrp->nattrs[nid]->kobj_attr.store = node_store;
> > +	sgrp->nattrs[nid]->nid = nid;
> 
> These accesses need to be inside the lock as well.  Probably we can't
> get here concurrently, but I can't so so definitively that I'm
> comfortable blind-accessing it outside the lock.

You're right, and I appreciate your point. It's not difficult to apply your
suggestion, so I plan to update the code as follows:

    sgrp->nattrs[nid] = new_attr;

    sysfs_attr_init(&sgrp->nattrs[nid]->kobj_attr.attr);
    sgrp->nattrs[nid]->kobj_attr.attr.name = name;
    sgrp->nattrs[nid]->kobj_attr.attr.mode = 0644;
    sgrp->nattrs[nid]->kobj_attr.show = node_show;
    sgrp->nattrs[nid]->kobj_attr.store = node_store;
    sgrp->nattrs[nid]->nid = nid;

    ret = sysfs_create_file(&sgrp->wi_kobj,
           &sgrp->nattrs[nid]->kobj_attr.attr);
    if (ret) {
        mutex_unlock(&sgrp->kobj_lock);
        ...
    }
    mutex_unlock(&sgrp->kobj_lock);

> 
> > +static int wi_node_notifier(struct notifier_block *nb,
> > +			       unsigned long action, void *data)
> > +{
> ... snip ...
> > +	case MEM_OFFLINE:
> > +		sysfs_wi_node_release(nid);
> 
> I'm still not convinced this is correct.  `offline_pages()` says this:
> 
> /*
>  * {on,off}lining is constrained to full memory sections (or more
>  * precisely to memory blocks from the user space POV).
>  */
> 
> And that is the function calling:
> 	memory_notify(MEM_OFFLINE, &arg);
> 
> David pointed out that this should be called when offlining each memory
> block.  This is not the same as simply doing `echo 0 > online`, you need
> to remove the dax device associated with the memory.
> 
> For example:
> 
>       node1
>     /       \
>  dax0.0    dax1.0
>    |          |
>   mb1        mb2
> 
> 
> With this code, if I `daxctl reconfigure-device devmem dax0.0` it will
> remove the first memory block, causing MEM_OFFLINE event to fire and
> removing the node - despite the fact that dax1.0 is still present.
> 
> This matters for systems with memory holes in CXL hotplug memory and
> also for systems with Dynamic Capacity Devices surfacing capacity as
> separate dax devices.
> 
> ~Gregory

If all memory blocks belonging to a node are offlined, the node will lose its
`N_MEMORY` state before the notifier callback is invoked. This should help avoid
the issue you mentioned.
Please let me know your thoughts on this approach.

Rakie



  reply	other threads:[~2025-03-24  8:49 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-20  4:17 [PATCH v3 0/3] Enhance sysfs handling for " Rakie Kim
2025-03-20  4:17 ` [PATCH v3 1/3] mm/mempolicy: Fix memory leaks in weighted interleave sysfs Rakie Kim
2025-03-20  5:40   ` Rakie Kim
2025-03-20 16:59     ` Gregory Price
2025-03-21  4:36       ` Rakie Kim
2025-03-21  4:53         ` Gregory Price
2025-03-21  5:06           ` Rakie Kim
2025-03-20 16:45   ` Joshua Hahn
2025-03-21  4:37     ` Rakie Kim
2025-03-21 14:03       ` Gregory Price
2025-03-24  8:47         ` Rakie Kim
2025-03-21 13:59   ` Gregory Price
2025-03-24 16:40   ` Markus Elfring
2025-03-25 10:27     ` Rakie Kim
2025-03-20  4:17 ` [PATCH v3 2/3] mm/mempolicy: Support dynamic sysfs updates for weighted interleave Rakie Kim
2025-03-21 14:09   ` Gregory Price
2025-03-24  8:48     ` Rakie Kim
2025-04-02 16:33   ` Dan Williams
2025-04-03  4:25     ` Rakie Kim
2025-03-20  4:17 ` [PATCH v3 3/3] mm/mempolicy: Support memory hotplug in " Rakie Kim
2025-03-21 14:24   ` Gregory Price
2025-03-24  8:48     ` Rakie Kim [this message]
2025-03-24  8:54       ` Rakie Kim
2025-03-24 13:32         ` Gregory Price
2025-03-25 10:27           ` Rakie Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250324084920.987-1-rakie.kim@sk.com \
    --to=rakie.kim@sk.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@redhat.com \
    --cc=gourry@gourry.net \
    --cc=honggyu.kim@sk.com \
    --cc=joshua.hahnjy@gmail.com \
    --cc=kernel_team@skhynix.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ying.huang@linux.alibaba.com \
    --cc=yunjeong.mun@sk.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox