linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Rakie Kim <rakie.kim@sk.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, linux-cxl@vger.kernel.org,
	joshua.hahnjy@gmail.com, dan.j.williams@intel.com,
	ying.huang@linux.alibaba.com, kernel_team@skhynix.com,
	honggyu.kim@sk.com, yunjeong.mun@sk.com,
	Gregory Price <gourry@gourry.net>
Subject: Re: [PATCH v2 2/4] mm/mempolicy: Support memory hotplug in weighted interleave
Date: Fri, 14 Mar 2025 10:17:26 +0100	[thread overview]
Message-ID: <e29293fd-b81a-4133-800a-875dcd252d67@redhat.com> (raw)
In-Reply-To: <20250314060053.743-1-rakie.kim@sk.com>

> Hi David,

Hi :)

> 
> I am currently working on adding memory hotplug-related functionality
> to the weighted interleave feature. While discussing this with Gregory,
> a question came up. If you happen to know the answer to the following,
> I would greatly appreciate your input.
> 
> I have added the following logic to call add_weight_node when a node
> transitions to the N_MEMORY state to create a sysfs entry. Conversely,
> when all memory blocks of a node go offline (!N_MEMORY),
> I call sysfs_wi_node_release to remove the corresponding sysfs entry.
> 

As a spoiler: I don't like how we squeezed the status_change_nid / 
status_change_nid_normal stuff into the memory notifier that works on a 
single memory block -> single zone. But decoupling it is not as easy, 
because we have this status_change_nid vs. status_change_nid_normal thing.

For the basic "node going offline / node going online", a separate 
notifier (or separate callbacks) would make at least your use case much 
clearer.

The whole "status_change_nid_normal" is only used by slab. I wonder if 
we could get rid of it, and simply let slab check the relevant 
zone->present pages when notified about onlining/offlining of a singe 
memory block.

Then, we would only have status_change_nid and could just convert that 
to a NODE_MEM_ONLINE / NODE_MEM_OFFLINE notifier or sth like that.

Hmmm, if I wouldn't be on PTO, I would prototype that real quick :)

> +static int wi_node_notifier(struct notifier_block *nb,
> +                              unsigned long action, void *data)
> +{
> +       int err;
> +       struct memory_notify *arg = data;
> +       int nid = arg->status_change_nid;
> +
> +       if (nid < 0)
> +               goto notifier_end;
> +
> +       switch(action) {
> +       case MEM_ONLINE:
> +               err = add_weight_node(nid, wi_kobj);
> +               if (err) {
> +                       pr_err("failed to add sysfs [node%d]\n", nid);
> +                       kobject_put(wi_kobj);
> +                       return NOTIFY_BAD;
> +               }
> +               break;
> +       case MEM_OFFLINE:
> +               sysfs_wi_node_release(node_attrs[nid], wi_kobj);
> +               break;
> +       }
> +
> +notifier_end:
> +       return NOTIFY_OK;
> +}
> 
> One question I have is whether the MEM_OFFLINE action in the code
> below will be triggered when a node that consists of multiple memory
> blocks has only one of its memory blocks transitioning to the offline state.
> 

node_states_check_changes_offline() should be making sure that that is 
the case.

Only if offlining the current block will make the node (all zones) have 
no present pages any more, then we'll set status_change_nid to >= 0.


> +       int nid = arg->status_change_nid;
> +
> +       if (nid < 0)
> +               goto notifier_end;
> 
> Based on my analysis, wi_node_notifier should function as expected
> because arg->status_change_nid only holds a non-negative value
> under the following conditions:
> 
> 1) !N_MEMORY -> N_MEMORY
>     When the first memory block of a node transitions to the online state,
>     it holds a non-negative value.
>     In all other cases, it remains -1 (NUMA_NO_NODE).
> 
> 2) N_MEMORY -> !N_MEMORY
>     When all memory blocks of a node transition to the offline state,
>     it holds a non-negative value.
>     In all other cases, it remains -1 (NUMA_NO_NODE).
> 
> I would truly appreciate it if you could confirm whether this analysis is correct.
> Below is a more detailed explanation of my findings.
> 

Yes, that's at least how it is supposed to work (-bugs, but I am not 
aware of any) :)

-- 
Cheers,

David / dhildenb



  reply	other threads:[~2025-03-14  9:17 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-12  7:56 [PATCH v2 1/4] mm/mempolicy: Fix memory leaks in mempolicy_sysfs_init() Rakie Kim
2025-03-12  7:56 ` [PATCH v2 2/4] mm/mempolicy: Support memory hotplug in weighted interleave Rakie Kim
2025-03-12 16:03   ` Gregory Price
2025-03-13  6:33     ` Rakie Kim
2025-03-13 16:23       ` Gregory Price
2025-03-13 22:36         ` David Hildenbrand
2025-03-14  6:00           ` Rakie Kim
2025-03-14  9:17             ` David Hildenbrand [this message]
2025-03-17  8:23               ` Rakie Kim
2025-03-12  7:56 ` [PATCH v2 3/4] mm/mempolicy: Enable sysfs support for " Rakie Kim
2025-03-12 16:14   ` Gregory Price
2025-03-13  6:34     ` Rakie Kim
2025-03-13 16:40       ` Gregory Price
2025-03-14  6:35         ` Rakie Kim
2025-03-12  7:56 ` [PATCH v2 4/4] mm/mempolicy: Fix duplicate node addition in sysfs for " Rakie Kim
2025-03-12 15:04   ` Joshua Hahn
2025-03-13  6:34     ` Rakie Kim
2025-03-13 16:42   ` Gregory Price
2025-03-14  6:35     ` Rakie Kim
2025-03-12 15:49 ` [PATCH v2 1/4] mm/mempolicy: Fix memory leaks in mempolicy_sysfs_init() Gregory Price
2025-03-13  6:31   ` Rakie Kim
2025-03-13 15:52     ` Gregory Price
2025-03-14  7:44       ` Rakie Kim
2025-03-14 10:55       ` Jonathan Cameron
2025-03-14 13:42         ` Gregory Price
2025-03-17  8:24           ` Rakie Kim
2025-03-17  8:24         ` Rakie Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e29293fd-b81a-4133-800a-875dcd252d67@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=gourry@gourry.net \
    --cc=honggyu.kim@sk.com \
    --cc=joshua.hahnjy@gmail.com \
    --cc=kernel_team@skhynix.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=rakie.kim@sk.com \
    --cc=ying.huang@linux.alibaba.com \
    --cc=yunjeong.mun@sk.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox