From: "Huang, Ying" <ying.huang@intel.com>
To: Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Dave Hansen <dave.hansen@linux.intel.com>,
Abhishek Goel <huntbag@linux.vnet.ibm.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm: Only re-generate demotion targets when a numa node changes its N_CPU state
Date: Mon, 14 Mar 2022 11:09:44 +0800
Message-ID: <878rtd6xhj.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <YisTwzumn3tgL9H4@localhost.localdomain> (Oscar Salvador's message of "Fri, 11 Mar 2022 10:17:55 +0100")

Oscar Salvador <osalvador@suse.de> writes:
> On Fri, Mar 11, 2022 at 01:06:26PM +0800, Huang, Ying wrote:
>> Oscar Salvador <osalvador@suse.de> writes:
>> > -static int __init migrate_on_reclaim_init(void)
>> > -{
>> > -        int ret;
>> > -
>> >          node_demotion = kmalloc_array(nr_node_ids,
>> >                                        sizeof(struct demotion_nodes),
>> >                                        GFP_KERNEL);
>> >          WARN_ON(!node_demotion);
>> >
>> > -        ret = cpuhp_setup_state_nocalls(CPUHP_MM_DEMOTION_DEAD, "mm/demotion:offline",
>> > -                                        NULL, migration_offline_cpu);
>> >          /*
>> > -         * In the unlikely case that this fails, the automatic
>> > -         * migration targets may become suboptimal for nodes
>> > -         * where N_CPU changes. With such a small impact in a
>> > -         * rare case, do not bother trying to do anything special.
>> > +         * At this point, all numa nodes with memory/CPUs have their state
>> > +         * properly set, so we can build the demotion order now.
>> >          */
>> > -        WARN_ON(ret < 0);
>> > -        ret = cpuhp_setup_state(CPUHP_AP_MM_DEMOTION_ONLINE, "mm/demotion:online",
>> > -                                migration_online_cpu, NULL);
>> > -        WARN_ON(ret < 0);
>> > -
>> > +        set_migration_target_nodes();
>>
>> If my understanding is correct, we should enclose
>> set_migration_target_nodes() here with cpus_read_lock(), and add a
>> comment before set_migration_target_nodes() explaining this. I don't
>> know whether the locking order is right.
>
> Oh, I see that cpuhp_setup_state() holds the cpu-hotplug lock while
> calling the callbacks, so yeah, we might want to hold it there as well.
>
> The thing is, not long ago we found out that we could have ACPI events
> like memory-hotplug operations at boot stage [1], so I guess it is
> safe to assume we could also have cpu-hotplug operations at that stage
> as well, and so we want to hold cpus_read_lock() just to be on the safe
> side.
>
> But, unless I am missing something, that does not apply to
> set_migration_target_nodes() being called from a callback,
> as the callback (somewhere up the chain) already holds that lock,
> e.g. _cpu_up() takes cpus_write_lock(), and the same goes for the
> down operation.
>
> So, to sum it up, we only need the cpus_read_lock() in
> migrate_on_reclaim_init().
Yes. That is what I want to say. Sorry for the confusion.
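
Something like the following in migrate_on_reclaim_init() (just a rough
sketch to show the idea, not tested):

        /*
         * set_migration_target_nodes() runs in process context here
         * and, unlike the cpuhp callbacks, is not serialized against
         * CPU hotplug by its caller, so take the lock explicitly.
         */
        cpus_read_lock();
        set_migration_target_nodes();
        cpus_read_unlock();
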
>> >          hotplug_memory_notifier(migrate_on_reclaim_callback, 100);
>>
>> And should we register the notifier before calling set_migration_target_nodes()?
>
> I cannot make up my mind here.
> The primary reason I placed the call before registering the notifier is
> because the original code called set_migration_target_nodes() before
> doing so:
>
> <--
>         ret = cpuhp_setup_state(CPUHP_AP_MM_DEMOTION_ONLINE, "mm/demotion:online",
>                                 migration_online_cpu, NULL);
>         WARN_ON(ret < 0);
>
>         hotplug_memory_notifier(migrate_on_reclaim_callback, 100);
> -->
>
> I thought about following the same line. Why do you think it should be
> called afterwards?
>
> I am not really sure whether it has a different impact depending on the
> order.
> Note that memory-hotplug ACPI events can happen at boot time, so by the
> time we register the memory-hotplug notifier, some hotplugged memory may
> already have come in, and so we call set_migration_target_nodes().
>
> But that is fine, and I cannot see a difference in shuffling their
> order.
> Do you see a problem there?
Per my understanding, the following race condition may be possible in
theory:

CPU1                                    CPU2
----                                    ----
set_migration_target_nodes()
                                        <-- // a new node is hotplugged, and missed
hotplug_memory_notifier()
During boot, this may be impossible in practice. But I still think it's
good to make the order correct in general. And it's not hard to do that.
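
That is, register the notifier first and only then build the initial
demotion order. Combined with the locking above,
migrate_on_reclaim_init() could look roughly like this (just a sketch,
not tested):

static int __init migrate_on_reclaim_init(void)
{
        node_demotion = kmalloc_array(nr_node_ids,
                                      sizeof(struct demotion_nodes),
                                      GFP_KERNEL);
        WARN_ON(!node_demotion);

        /* Register first, so a node hotplugged from now on isn't missed. */
        hotplug_memory_notifier(migrate_on_reclaim_callback, 100);

        /*
         * All numa nodes with memory/CPUs have their state properly set
         * at this point, so the demotion order can be built now, under
         * cpus_read_lock() as discussed above.
         */
        cpus_read_lock();
        set_migration_target_nodes();
        cpus_read_unlock();

        return 0;
}
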
Best Regards,
Huang, Ying
> [1] https://patchwork.kernel.org/project/linux-mm/patch/20200915094143.79181-3-ldufour@linux.ibm.com/