Date: Fri, 16 Jun 2023 07:51:21 +0000
From: "Yajun Deng"
Message-ID: <5ba9ad9bedb2fd3fb96571a778fc35b5@linux.dev>
Subject: Re: [PATCH v2] mm: pass nid to reserve_bootmem_region()
To: "Mike Rapoport"
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, "kernel test robot"
In-Reply-To: <20230616072247.GL52412@kernel.org>
References: <20230616072247.GL52412@kernel.org> <20230616023011.2952211-1-yajun.deng@linux.dev>

June 16, 2023 3:22 PM, "Mike Rapoport" wrote:

> On Fri, Jun 16, 2023 at 10:30:11AM +0800, Yajun Deng wrote:
>
>> early_pfn_to_nid() is called frequently in init_reserved_page(); it
>> returns the node id of the PFN. These PFNs are probably from the same
>> memory region, so they have the same node id. It's not necessary to call
>> early_pfn_to_nid() for each PFN.
>>
>> Pass nid to reserve_bootmem_region() and drop the call to
>> early_pfn_to_nid() in init_reserved_page().
>>
>> The function that benefits most is memmap_init_reserved_pages() when
>> CONFIG_DEFERRED_STRUCT_PAGE_INIT is defined.
>> The following data was measured on an x86 machine with 190GB RAM:
>>
>> before:
>> memmap_init_reserved_pages() 67ms
>>
>> after:
>> memmap_init_reserved_pages() 20ms
>>
>> Signed-off-by: Yajun Deng
>> Reported-by: kernel test robot
>> Closes: https://lore.kernel.org/oe-kbuild-all/202306160145.juJMr3Bi-lkp@intel.com
>> ---
>>  include/linux/mm.h |  3 ++-
>>  mm/memblock.c      |  9 ++++++---
>>  mm/mm_init.c       | 31 +++++++++++++++++++------------
>>  3 files changed, 27 insertions(+), 16 deletions(-)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 17317b1673b0..39e72ca6bf22 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -2964,7 +2964,8 @@ extern unsigned long free_reserved_area(void *start, void *end,
>>
>>  extern void adjust_managed_page_count(struct page *page, long count);
>>
>> -extern void reserve_bootmem_region(phys_addr_t start, phys_addr_t end);
>> +extern void reserve_bootmem_region(phys_addr_t start,
>> +				   phys_addr_t end, int nid);
>>
>>  /* Free the reserved page into the buddy system, so it gets managed. */
>>  static inline void free_reserved_page(struct page *page)
>> diff --git a/mm/memblock.c b/mm/memblock.c
>> index ff0da1858778..6dc51dc845e5 100644
>> --- a/mm/memblock.c
>> +++ b/mm/memblock.c
>> @@ -2091,18 +2091,21 @@ static void __init memmap_init_reserved_pages(void)
>>  {
>>  	struct memblock_region *region;
>>  	phys_addr_t start, end;
>> +	int nid;
>>  	u64 i;
>>
>>  	/* initialize struct pages for the reserved regions */
>> -	for_each_reserved_mem_range(i, &start, &end)
>> -		reserve_bootmem_region(start, end);
>> +	__for_each_mem_range(i, &memblock.reserved, NULL, NUMA_NO_NODE,
>> +			     MEMBLOCK_NONE, &start, &end, &nid)
>> +		reserve_bootmem_region(start, end, nid);
>
> I'd prefer to see for_each_reserved_mem_region() loop here
>

Okay.

>>  	/* and also treat struct pages for the NOMAP regions as PageReserved */
>>  	for_each_mem_region(region) {
>>  		if (memblock_is_nomap(region)) {
>>  			start = region->base;
>>  			end = start + region->size;
>> -			reserve_bootmem_region(start, end);
>> +			nid = memblock_get_region_node(region);
>> +			reserve_bootmem_region(start, end, nid);
>>  		}
>>  	}
>>  }
>> diff --git a/mm/mm_init.c b/mm/mm_init.c
>> index d393631599a7..1499efbebc6f 100644
>> --- a/mm/mm_init.c
>> +++ b/mm/mm_init.c
>> @@ -646,10 +646,8 @@ static inline void pgdat_set_deferred_range(pg_data_t *pgdat)
>>  }
>>
>>  /* Returns true if the struct page for the pfn is initialised */
>> -static inline bool __meminit early_page_initialised(unsigned long pfn)
>> +static inline bool __meminit early_page_initialised(unsigned long pfn, int nid)
>>  {
>> -	int nid = early_pfn_to_nid(pfn);
>> -
>>  	if (node_online(nid) && pfn >= NODE_DATA(nid)->first_deferred_pfn)
>>  		return false;
>>
>> @@ -695,15 +693,14 @@ defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
>>  	return false;
>>  }
>>
>> -static void __meminit init_reserved_page(unsigned long pfn)
>> +static void __meminit init_reserved_page(unsigned long pfn, int nid)
>>  {
>>  	pg_data_t *pgdat;
>> -	int nid, zid;
>> +	int zid;
>>
>> -	if (early_page_initialised(pfn))
>> +	if (early_page_initialised(pfn, nid))
>>  		return;
>>
>> -	nid = early_pfn_to_nid(pfn);
>>  	pgdat = NODE_DATA(nid);
>>
>>  	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
>> @@ -717,7 +714,7 @@ static void __meminit init_reserved_page(unsigned long pfn)
>>  #else
>>  static inline void pgdat_set_deferred_range(pg_data_t *pgdat) {}
>>
>> -static inline bool early_page_initialised(unsigned long pfn)
>> +static inline bool early_page_initialised(unsigned long pfn, int nid)
>>  {
>>  	return true;
>>  }
>> @@ -727,7 +724,7 @@ static inline bool defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
>>  	return false;
>>  }
>>
>> -static inline void init_reserved_page(unsigned long pfn)
>> +static inline void init_reserved_page(unsigned long pfn, int nid)
>>  {
>>  }
>>  #endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */
>> @@ -738,16 +735,20 @@ static inline void init_reserved_page(unsigned long pfn)
>>   * marks the pages PageReserved. The remaining valid pages are later
>>   * sent to the buddy page allocator.
>>   */
>> -void __meminit reserve_bootmem_region(phys_addr_t start, phys_addr_t end)
>> +void __meminit reserve_bootmem_region(phys_addr_t start,
>> +				      phys_addr_t end, int nid)
>>  {
>>  	unsigned long start_pfn = PFN_DOWN(start);
>>  	unsigned long end_pfn = PFN_UP(end);
>>
>> +	if (nid == MAX_NUMNODES)
>> +		nid = first_online_node;
>
> How can this happen?
>

Some reserved memory regions may not have a nid set. I found this while
debugging; it is visible in the memblock_debug_show() output.

>> +
>>  	for (; start_pfn < end_pfn; start_pfn++) {
>>  		if (pfn_valid(start_pfn)) {
>>  			struct page *page = pfn_to_page(start_pfn);
>>
>> -			init_reserved_page(start_pfn);
>> +			init_reserved_page(start_pfn, nid);
>>
>>  			/* Avoid false-positive PageTail() */
>>  			INIT_LIST_HEAD(&page->lru);
>> @@ -2579,7 +2580,13 @@ void __init set_dma_reserve(unsigned long new_dma_reserve)
>>  void __init memblock_free_pages(struct page *page, unsigned long pfn,
>>  				unsigned int order)
>>  {
>> -	if (!early_page_initialised(pfn))
>> +	int nid = 0;
>> +
>> +#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
>> +	nid = early_pfn_to_nid(pfn);
>> +#endif
>
> We can pass nid to memblock_free_pages, no?
>

memblock_free_pages() is called by __free_pages_memory() and memblock_free_late().
For the latter, I'm not sure whether we can pass nid.

I think we can pass nid to reserve_bootmem_region() in this patch, and pass nid to
memblock_free_pages() in a separate patch once we can confirm this.

>> +
>> +	if (!early_page_initialised(pfn, nid))
>>  		return;
>>  	if (!kmsan_memblock_free_pages(page, order)) {
>>  		/* KMSAN will take care of these pages. */
>> --
>> 2.25.1
>
> --
> Sincerely yours,
> Mike.
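
P.S. To make the idea concrete outside the kernel tree, here is a minimal
userspace sketch of the pattern discussed above: resolve the node id once per
reserved region (with a fallback when the region has no nid set) and reuse it
for every PFN in that region, rather than looking it up per PFN. This is not
the kernel code. Every sketch_* name, the constants, and the pfn_nid[] array
are hypothetical stand-ins for memblock regions, MAX_NUMNODES,
first_online_node and the struct page initialisation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
#define SKETCH_MAX_NUMNODES      4   /* plays the role of MAX_NUMNODES ("nid not set") */
#define SKETCH_FIRST_ONLINE_NODE 0   /* plays the role of first_online_node */
#define SKETCH_PAGE_SHIFT        12  /* assumes 4 KiB pages */

/* Plays the role of a reserved memblock region. */
struct sketch_region {
	unsigned long long base;
	unsigned long long size;
	int nid;
};

/* Resolve the node id once per region, falling back if it was never set. */
static int sketch_region_nid(const struct sketch_region *r)
{
	if (r->nid == SKETCH_MAX_NUMNODES)
		return SKETCH_FIRST_ONLINE_NODE;
	return r->nid;
}

/*
 * Walk every PFN of one region and tag it with the nid that the caller
 * already resolved. No per-PFN node lookup happens inside the loop.
 * Returns the number of PFNs visited.
 */
static unsigned long sketch_reserve_region(const struct sketch_region *r,
					   int nid, int *pfn_nid,
					   unsigned long max_pfn)
{
	unsigned long long page_mask = (1ULL << SKETCH_PAGE_SHIFT) - 1;
	/* Round the start down and the end up to page boundaries. */
	unsigned long start_pfn = (unsigned long)(r->base >> SKETCH_PAGE_SHIFT);
	unsigned long end_pfn = (unsigned long)
		((r->base + r->size + page_mask) >> SKETCH_PAGE_SHIFT);
	unsigned long pfn, visited = 0;

	assert(pfn_nid != NULL);

	for (pfn = start_pfn; pfn < end_pfn && pfn < max_pfn; pfn++) {
		pfn_nid[pfn] = nid;
		visited++;
	}
	return visited;
}

/* Iterate region by region, in the spirit of a per-region loop. */
static unsigned long sketch_reserve_all(const struct sketch_region *regions,
					size_t nr_regions, int *pfn_nid,
					unsigned long max_pfn)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < nr_regions; i++)
		total += sketch_reserve_region(&regions[i],
					       sketch_region_nid(&regions[i]),
					       pfn_nid, max_pfn);
	return total;
}
```

A caller would fill an array of struct sketch_region, initialise pfn_nid[] to a
sentinel, and call sketch_reserve_all(). The point of the design is only that
the node lookup cost becomes proportional to the number of regions instead of
the number of PFNs.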