Subject: Re: [PATCH v2 2/3] mm: don't rely on system state to detect hot-plug operations
To: David Hildenbrand, akpm@linux-foundation.org, Oscar Salvador, mhocko@kernel.org, Greg Kroah-Hartman
Cc: linux-mm@kvack.org, "Rafael J . Wysocki", nathanl@linux.ibm.com, cheloha@linux.ibm.com, Tony Luck, Fenghua Yu, linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org, Michal Hocko
References: <20200914165042.96218-1-ldufour@linux.ibm.com> <20200914165042.96218-3-ldufour@linux.ibm.com>
From: Laurent Dufour
Message-ID: <6657eaa7-f27b-ce8e-523d-4447b46b7363@linux.ibm.com>
Date: Tue, 15 Sep 2020 09:06:59 +0200

On 14/09/2020 at 19:15, David Hildenbrand wrote:
>>   arch/ia64/mm/init.c  |  4 +--
>>   drivers/base/node.c  | 86 ++++++++++++++++++++++++++++----------------
>>   include/linux/node.h | 11 +++---
>>   mm/memory_hotplug.c  |  5 +--
>>   4 files changed, 68 insertions(+), 38 deletions(-)
>>
>> diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
>> index b5054b5e77c8..8e7b8c6c576e 100644
>> --- a/arch/ia64/mm/init.c
>> +++ b/arch/ia64/mm/init.c
>> @@ -538,7 +538,7 @@ virtual_memmap_init(u64 start, u64 end, void *arg)
>>  	if (map_start < map_end)
>>  		memmap_init_zone((unsigned long)(map_end - map_start),
>>  				 args->nid, args->zone, page_to_pfn(map_start),
>> -				 MEMPLUG_EARLY, NULL);
>> +				 MEMINIT_EARLY, NULL);
>
> Patch #1.

Sure, this explains why I was able to build on ia64 but that's not the right place.

>>  	return 0;
>>  }
>>
>> @@ -548,7 +548,7 @@ memmap_init (unsigned long size, int nid, unsigned long zone,
>>  {
>>  	if (!vmem_map) {
>>  		memmap_init_zone(size, nid, zone, start_pfn,
>> -				 MEMPLUG_EARLY, NULL);
>> +				 MEMINIT_EARLY, NULL);

I'll fix that too.

>>  	} else {
>>  		struct page *start;
>>  		struct memmap_init_callback_data args;
>> diff --git a/drivers/base/node.c b/drivers/base/node.c
>> index 508b80f6329b..01ee73c9d675 100644
>> --- a/drivers/base/node.c
>> +++ b/drivers/base/node.c
>> @@ -761,14 +761,36 @@ static int __ref get_nid_for_pfn(unsigned long pfn)
>>  	return pfn_to_nid(pfn);
>>  }
>>
>> +static int do_register_memory_block_under_node(int nid,
>> +					       struct memory_block *mem_blk)
>> +{
>> +	int ret;
>> +
>> +	/*
>> +	 * If this memory block spans multiple nodes, we only indicate
>> +	 * the last processed node.
>> +	 */
>> +	mem_blk->nid = nid;
>> +
>> +	ret = sysfs_create_link_nowarn(&node_devices[nid]->dev.kobj,
>> +				       &mem_blk->dev.kobj,
>> +				       kobject_name(&mem_blk->dev.kobj));
>> +	if (ret)
>> +		return ret;
>> +
>> +	return sysfs_create_link_nowarn(&mem_blk->dev.kobj,
>> +				&node_devices[nid]->dev.kobj,
>> +				kobject_name(&node_devices[nid]->dev.kobj));
>> +}
>> +
>>  /* register memory section under specified node if it spans that node */
>> -static int register_mem_sect_under_node(struct memory_block *mem_blk,
>> -					 void *arg)
>> +static int register_mem_block_under_node_early(struct memory_block *mem_blk,
>> +					       void *arg)
>>  {
>>  	unsigned long memory_block_pfns = memory_block_size_bytes() / PAGE_SIZE;
>>  	unsigned long start_pfn = section_nr_to_pfn(mem_blk->start_section_nr);
>>  	unsigned long end_pfn = start_pfn + memory_block_pfns - 1;
>> -	int ret, nid = *(int *)arg;
>> +	int nid = *(int *)arg;
>>  	unsigned long pfn;
>>
>>  	for (pfn = start_pfn; pfn <= end_pfn; pfn++) {
>> @@ -785,38 +807,34 @@ static int register_mem_sect_under_node(struct memory_block *mem_blk,
>>  		}
>>
>>  		/*
>> -		 * We need to check if page belongs to nid only for the boot
>> -		 * case, during hotplug we know that all pages in the memory
>> -		 * block belong to the same node.
>> -		 */
>> -		if (system_state == SYSTEM_BOOTING) {
>> -			page_nid = get_nid_for_pfn(pfn);
>> -			if (page_nid < 0)
>> -				continue;
>> -			if (page_nid != nid)
>> -				continue;
>> -		}
>> -
>> -		/*
>> -		 * If this memory block spans multiple nodes, we only indicate
>> -		 * the last processed node.
>> +		 * We need to check if page belongs to nid only at the boot
>> +		 * case because node's ranges can be interleaved.
>>  		 */
>> -		mem_blk->nid = nid;
>> -
>> -		ret = sysfs_create_link_nowarn(&node_devices[nid]->dev.kobj,
>> -					&mem_blk->dev.kobj,
>> -					kobject_name(&mem_blk->dev.kobj));
>> -		if (ret)
>> -			return ret;
>> +		page_nid = get_nid_for_pfn(pfn);
>> +		if (page_nid < 0)
>> +			continue;
>> +		if (page_nid != nid)
>> +			continue;
>>
>> -		return sysfs_create_link_nowarn(&mem_blk->dev.kobj,
>> -				&node_devices[nid]->dev.kobj,
>> -				kobject_name(&node_devices[nid]->dev.kobj));
>> +		/* The memory block is registered to the first matching node */
>
> That comment is misleading in that context.
>
> A memory block is registered if there is at least a page that belongs to
> the nid. It's perfectly fine to have a single memory block belong to
> multiple NUMA nodes (when the split is within a memory block). I'd just
> drop it.

I agree the comment is not accurate; I'll drop it.
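
For illustration only, here is a minimal userspace sketch of the boot-time rule discussed above: a block is registered under a node as soon as one of its pages belongs to that node, so a block split between two nodes ends up under both. This is not kernel code. PFNS_PER_BLOCK, MAX_NODES, NR_PFNS, example_pfn_nid[], block_has_page_on_node() and block_node_mask() are hypothetical names made up for this example, and a negative entry in the map models a pfn for which the node lookup returns a negative value and is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model parameters, not kernel constants. */
#define PFNS_PER_BLOCK 4
#define MAX_NODES      2
#define NR_PFNS        16

/*
 * Hypothetical pfn -> nid map with interleaved node ranges.
 * A negative value models a pfn whose node lookup fails.
 */
static const int example_pfn_nid[NR_PFNS] = {
	 0,  0,  0,  0,	/* block 0: node 0 only */
	 0,  0,  1,  1,	/* block 1: split between nodes 0 and 1 */
	 1, -1,  1,  1,	/* block 2: node 1, one pfn skipped */
	-1, -1, -1, -1,	/* block 3: no valid page at all */
};

/* True if at least one page of the block belongs to nid. */
static bool block_has_page_on_node(const int *pfn_nid, size_t nr_pfns,
				   size_t block, int nid)
{
	size_t start_pfn = block * PFNS_PER_BLOCK;
	size_t end_pfn = start_pfn + PFNS_PER_BLOCK - 1;
	size_t pfn;

	assert(nid >= 0 && nid < MAX_NODES);

	for (pfn = start_pfn; pfn <= end_pfn && pfn < nr_pfns; pfn++) {
		int page_nid = pfn_nid[pfn];

		if (page_nid < 0)
			continue;	/* lookup failed, skip this pfn */
		if (page_nid != nid)
			continue;	/* page belongs to another node */
		return true;		/* one matching page is enough */
	}
	return false;
}

/* Bitmask of the nodes under which the block would be registered. */
static unsigned int block_node_mask(const int *pfn_nid, size_t nr_pfns,
				    size_t block)
{
	unsigned int mask = 0;
	int nid;

	for (nid = 0; nid < MAX_NODES; nid++)
		if (block_has_page_on_node(pfn_nid, nr_pfns, block, nid))
			mask |= 1u << nid;
	return mask;
}
```

With the sample map, block_node_mask() gives 0x3 for block 1, meaning the block shows up under both node 0 and node 1. That is why a comment saying the block is registered to "the first matching node" would be misleading in this model.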