From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qt1-f200.google.com (mail-qt1-f200.google.com [209.85.160.200]) by kanga.kvack.org (Postfix) with ESMTP id 468E36B577A for ; Fri, 30 Nov 2018 04:30:26 -0500 (EST) Received: by mail-qt1-f200.google.com with SMTP id u32so4840746qte.1 for ; Fri, 30 Nov 2018 01:30:26 -0800 (PST) Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id f97si3076998qtb.199.2018.11.30.01.30.25 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 30 Nov 2018 01:30:25 -0800 (PST) Subject: Re: [PATCH v3] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection References: <20181122101241.7965-1-richard.weiyang@gmail.com> <20181130065847.13714-1-richard.weiyang@gmail.com> From: David Hildenbrand Message-ID: Date: Fri, 30 Nov 2018 10:30:22 +0100 MIME-Version: 1.0 In-Reply-To: <20181130065847.13714-1-richard.weiyang@gmail.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: owner-linux-mm@kvack.org List-ID: To: Wei Yang , mhocko@suse.com, osalvador@suse.de Cc: akpm@linux-foundation.org, linux-mm@kvack.org On 30.11.18 07:58, Wei Yang wrote: > During online_pages phase, pgdat->nr_zones will be updated in case this > zone is empty. > > Currently the online_pages phase is protected by the global lock > mem_hotplug_begin(), which ensures there is no contention during the > update of nr_zones. But this global lock introduces scalability issues. > > The patch moves init_currently_empty_zone under both zone_span_writelock > and pgdat_resize_lock because both the pgdat state is changed (nr_zones) > and the zone's start_pfn. Also this patch changes the documentation > of node_size_lock to include the protectioin of nr_zones. s/protectioin/protection/ > > Signed-off-by: Wei Yang > Acked-by: Michal Hocko > Reviewed-by: Oscar Salvador > CC: David Hildenbrand > > --- > David, I may not catch you exact comment on the code or changelog. If I > missed, just let me know. I guess I would have rewritten it to something like the following " Currently the online_pages phase is protected by two global locks (device_device_hotplug_lock and mem_hotplug_lock). Especial the latter can result in scalability issues, as it will slow down code relying on get_online_mems(). Let's prepare code for not having to rely on get_online_mems() but instead some more fine grained locks. During online_pages phase, pgdat->nr_zones will be updated in case the zone is empty. Right now mem_hotplug_lock ensures that there is no contention during the update of nr_zones. The patch moves init_currently_empty_zone under both zone_span_writelock and pgdat_resize_lock because both the pgdat state is changed (nr_zones) and the zone's start_pfn. Also this patch changes the documentation of node_size_lock to include the protection of nr_zones. " Does that make sense? > > --- > v3: > * slightly modify the last paragraph of changelog based on Michal's > comment > v2: > * commit log changes > * modify the code in move_pfn_range_to_zone() instead of in > init_currently_empty_zone() > * pgdat_resize_lock documentation change > --- > include/linux/mmzone.h | 7 ++++--- > mm/memory_hotplug.c | 5 ++--- > 2 files changed, 6 insertions(+), 6 deletions(-) > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > index 3d0c472438d2..37d9c5c3faa6 100644 > --- a/include/linux/mmzone.h > +++ b/include/linux/mmzone.h > @@ -635,9 +635,10 @@ typedef struct pglist_data { > #endif > #if defined(CONFIG_MEMORY_HOTPLUG) || defined(CONFIG_DEFERRED_STRUCT_PAGE_INIT) > /* > - * Must be held any time you expect node_start_pfn, node_present_pages > - * or node_spanned_pages stay constant. Holding this will also > - * guarantee that any pfn_valid() stays that way. > + * Must be held any time you expect node_start_pfn, > + * node_present_pages, node_spanned_pages or nr_zones stay constant. > + * Holding this will also guarantee that any pfn_valid() stays that > + * way. > * > * pgdat_resize_lock() and pgdat_resize_unlock() are provided to > * manipulate node_size_lock without checking for CONFIG_MEMORY_HOTPLUG > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index 61972da38d93..f626e7e5f57b 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -742,14 +742,13 @@ void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn, > int nid = pgdat->node_id; > unsigned long flags; > > - if (zone_is_empty(zone)) > - init_currently_empty_zone(zone, start_pfn, nr_pages); > - > clear_zone_contiguous(zone); > > /* TODO Huh pgdat is irqsave while zone is not. It used to be like that before */ > pgdat_resize_lock(pgdat, &flags); > zone_span_writelock(zone); > + if (zone_is_empty(zone)) > + init_currently_empty_zone(zone, start_pfn, nr_pages); > resize_zone_range(zone, start_pfn, nr_pages); > zone_span_writeunlock(zone); > resize_pgdat_range(pgdat, start_pfn, nr_pages); > -- Thanks, David / dhildenb