[PATCH RFC] mm/memory_hotplug: Don't take the cpu_hotplug

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH RFC] mm/memory_hotplug: Don't take the cpu_hotplug_lock
@ 2019-07-25  9:22 David Hildenbrand
  2019-07-26  8:19 ` Michal Hocko
  0 siblings, 1 reply; 3+ messages in thread
From: David Hildenbrand @ 2019-07-25  9:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, David Hildenbrand, Andrew Morton, Oscar Salvador,
	Michal Hocko, Pavel Tatashin, Dan Williams, Thomas Gleixner

Commit 9852a7212324 ("mm: drop hotplug lock from lru_add_drain_all()")
states that lru_add_drain_all() "Doesn't need any cpu hotplug locking
because we do rely on per-cpu kworkers being shut down before our
page_alloc_cpu_dead callback is executed on the offlined cpu."

And also "Calling this function with cpu hotplug locks held can actually
lead to obscure indirect dependencies via WQ context.".

Since commit 3f906ba23689 ("mm/memory-hotplug: switch locking to a percpu
rwsem") we do a cpus_read_lock() in mem_hotplug_begin().

I don't see how that lock is still helpful, we already hold the
device_hotplug_lock to protect try_offline_node(), which is AFAIK one
problematic part that can race with CPU hotplug. If it is still
necessary, we should document why.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index e7c3b219a305..43b8cd4b96f5 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -86,14 +86,12 @@ __setup("memhp_default_state=", setup_memhp_default_state);
 
 void mem_hotplug_begin(void)
 {
-	cpus_read_lock();
 	percpu_down_write(&mem_hotplug_lock);
 }
 
 void mem_hotplug_done(void)
 {
 	percpu_up_write(&mem_hotplug_lock);
-	cpus_read_unlock();
 }
 
 u64 max_mem_size = U64_MAX;
-- 
2.21.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH RFC] mm/memory_hotplug: Don't take the cpu_hotplug_lock
  2019-07-25  9:22 [PATCH RFC] mm/memory_hotplug: Don't take the cpu_hotplug_lock David Hildenbrand
@ 2019-07-26  8:19 ` Michal Hocko
  2019-07-26  8:22   ` David Hildenbrand
  0 siblings, 1 reply; 3+ messages in thread
From: Michal Hocko @ 2019-07-26  8:19 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams, Thomas Gleixner

On Thu 25-07-19 11:22:06, David Hildenbrand wrote:
> Commit 9852a7212324 ("mm: drop hotplug lock from lru_add_drain_all()")
> states that lru_add_drain_all() "Doesn't need any cpu hotplug locking
> because we do rely on per-cpu kworkers being shut down before our
> page_alloc_cpu_dead callback is executed on the offlined cpu."
> 
> And also "Calling this function with cpu hotplug locks held can actually
> lead to obscure indirect dependencies via WQ context.".
> 
> Since commit 3f906ba23689 ("mm/memory-hotplug: switch locking to a percpu
> rwsem") we do a cpus_read_lock() in mem_hotplug_begin().
> 
> I don't see how that lock is still helpful, we already hold the
> device_hotplug_lock to protect try_offline_node(), which is AFAIK one
> problematic part that can race with CPU hotplug. If it is still
> necessary, we should document why.

I have forgot all the juicy details. Maybe Thomas remembers. The
previous recursive home grown locking was just terrible. I do not see
stop_machine being used in the memory hotplug anymore.
 
I do support this kind of removal because binding CPU and MEM hotplug
locks is fragile and wrong. But this patch really needs more explanation
on why this is safe. In other words what does cpu_read_lock protects
from in mem hotplug paths.

> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  mm/memory_hotplug.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index e7c3b219a305..43b8cd4b96f5 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -86,14 +86,12 @@ __setup("memhp_default_state=", setup_memhp_default_state);
>  
>  void mem_hotplug_begin(void)
>  {
> -	cpus_read_lock();
>  	percpu_down_write(&mem_hotplug_lock);
>  }
>  
>  void mem_hotplug_done(void)
>  {
>  	percpu_up_write(&mem_hotplug_lock);
> -	cpus_read_unlock();
>  }
>  
>  u64 max_mem_size = U64_MAX;
> -- 
> 2.21.0

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH RFC] mm/memory_hotplug: Don't take the cpu_hotplug_lock
  2019-07-26  8:19 ` Michal Hocko
@ 2019-07-26  8:22   ` David Hildenbrand
  0 siblings, 0 replies; 3+ messages in thread
From: David Hildenbrand @ 2019-07-26  8:22 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Andrew Morton, Oscar Salvador,
	Pavel Tatashin, Dan Williams, Thomas Gleixner

On 26.07.19 10:19, Michal Hocko wrote:
> On Thu 25-07-19 11:22:06, David Hildenbrand wrote:
>> Commit 9852a7212324 ("mm: drop hotplug lock from lru_add_drain_all()")
>> states that lru_add_drain_all() "Doesn't need any cpu hotplug locking
>> because we do rely on per-cpu kworkers being shut down before our
>> page_alloc_cpu_dead callback is executed on the offlined cpu."
>>
>> And also "Calling this function with cpu hotplug locks held can actually
>> lead to obscure indirect dependencies via WQ context.".
>>
>> Since commit 3f906ba23689 ("mm/memory-hotplug: switch locking to a percpu
>> rwsem") we do a cpus_read_lock() in mem_hotplug_begin().
>>
>> I don't see how that lock is still helpful, we already hold the
>> device_hotplug_lock to protect try_offline_node(), which is AFAIK one
>> problematic part that can race with CPU hotplug. If it is still
>> necessary, we should document why.
> 
> I have forgot all the juicy details. Maybe Thomas remembers. The
> previous recursive home grown locking was just terrible. I do not see
> stop_machine being used in the memory hotplug anymore.
>  
> I do support this kind of removal because binding CPU and MEM hotplug
> locks is fragile and wrong. But this patch really needs more explanation
> on why this is safe. In other words what does cpu_read_lock protects
> from in mem hotplug paths.

And that is the purpose of marking this RFC, because I am not aware of
any :) Hopefully Thomas can clarify if we are missing something
important (undocumented) here - if so I'll document it.

-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-07-26  8:22 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-25  9:22 [PATCH RFC] mm/memory_hotplug: Don't take the cpu_hotplug_lock David Hildenbrand
2019-07-26  8:19 ` Michal Hocko
2019-07-26  8:22   ` David Hildenbrand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox