From mboxrd@z Thu Jan 1 00:00:00 1970
From: Liam Ni <zhiguangni01@gmail.com>
Date: Tue, 22 Aug 2023 19:49:05 +0800
Subject: Re: [RESEND PATCH V3] NUMA:Improve the efficiency of calculating pages loss
To: Mike Rapoport
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, loongarch@lists.linux.dev,
	zhoubinbin@loongson.cn, chenfeiyang@loongson.cn, jiaxun.yang@flygoat.com,
	Andrew Morton, "H. Peter Anvin", x86@kernel.org, Borislav Petkov,
	Ingo Molnar, Thomas Gleixner, peterz@infradead.org, luto@kernel.org,
	Dave Hansen, kernel@xen0n.name, chenhuacai@kernel.org
In-Reply-To: <20230814155911.GN2607694@kernel.org>
References: <20230814155911.GN2607694@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
On Tue, 15 Aug 2023 at 00:00, Mike Rapoport wrote:
>
> On Fri, Aug 04, 2023 at 11:32:51PM +0800, Liam Ni wrote:
> > Optimize the way of calculating missing pages.
> >
> > In the previous implementation, we calculated missing pages as follows:
> > 1. Calculate numaram by traversing all the numa_meminfo entries and, for
> >    each of them, traversing all the regions in memblock.memory to prepare
> >    for counting missing pages.
> >
> > 2. Traverse all the regions in memblock.memory again to get e820ram.
> >
> > 3. The missing pages are (e820ram - numaram).
> >
> > But it's enough to count the memory in 'memblock.memory' that doesn't
> > have a node assigned.
> >
> > V2: https://lore.kernel.org/all/20230619075315.49114-1-zhiguangni01@gmail.com/
> > V1: https://lore.kernel.org/all/20230615142016.419570-1-zhiguangni01@gmail.com/
> >
> > Signed-off-by: Liam Ni
> > ---
> >  arch/loongarch/kernel/numa.c | 23 ++++++++---------------
> >  arch/x86/mm/numa.c           | 26 +++++++------------------
> >  include/linux/mm.h           |  1 +
> >  mm/mm_init.c                 | 20 ++++++++++++++++++++
> >  4 files changed, 36 insertions(+), 34 deletions(-)
> >
> > diff --git a/arch/loongarch/kernel/numa.c b/arch/loongarch/kernel/numa.c
> > index 708665895b47..0239891e4d19 100644
> > --- a/arch/loongarch/kernel/numa.c
> > +++ b/arch/loongarch/kernel/numa.c
> > @@ -262,25 +262,18 @@ static void __init node_mem_init(unsigned int node)
> >   * Sanity check to catch more bad NUMA configurations (they are amazingly
> >   * common). Make sure the nodes cover all memory.
> >   */
> > -static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
> > +static bool __init memblock_validate_numa_coverage(const u64 limit)
>
> There is no need to have an arch-specific memblock_validate_numa_coverage().
> You can add this function to memblock and call it from NUMA initialization
> instead of numa_meminfo_cover_memory().

Do you mean I should remove the implementation of the
numa_meminfo_cover_memory() function entirely?

> The memblock_validate_numa_coverage() will count all the pages without node
> ID set and compare to the threshold provided by the architectures.
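If I understand your suggestion correctly, the helper would move into
mm/memblock.c and walk memblock.memory directly. Below is a rough,
untested sketch of what I have in mind; the parameter name and the error
message are my own placeholders, not a settled API, and I also initialise
the page counter to zero, which the mm_init.c version further down
forgets to do:

        /* Rough sketch (untested): generic coverage check for mm/memblock.c */
        bool __init memblock_validate_numa_coverage(u64 threshold_bytes)
        {
                /* pages in memblock.memory with no node ID assigned */
                unsigned long nr_pages = 0;
                unsigned long start_pfn, end_pfn;
                int nid, i;

                for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
                        if (nid == NUMA_NO_NODE)
                                nr_pages += end_pfn - start_pfn;
                }

                /* compare bytes with bytes: shift the page count up */
                if (((u64)nr_pages << PAGE_SHIFT) > threshold_bytes) {
                        pr_err("NUMA: %lu pages are not covered by any node\n",
                               nr_pages);
                        return false;
                }

                return true;
        }

Then both LoongArch and x86 could simply call
memblock_validate_numa_coverage(SZ_1M) and drop their private copies.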
> >  {
> > -        int i;
> > -        u64 numaram, biosram;
> > +        u64 lo_pg;
> >
> > -        numaram = 0;
> > -        for (i = 0; i < mi->nr_blks; i++) {
> > -                u64 s = mi->blk[i].start >> PAGE_SHIFT;
> > -                u64 e = mi->blk[i].end >> PAGE_SHIFT;
> > +        lo_pg = max_pfn - calculate_without_node_pages_in_range();
> >
> > -                numaram += e - s;
> > -                numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
> > -                if ((s64)numaram < 0)
> > -                        numaram = 0;
> > +        /* We seem to lose 3 pages somewhere. Allow 1M of slack. */
> > +        if (lo_pg >= limit) {
> > +                pr_err("NUMA: We lost 1m size page.\n");
> > +                return false;
> >          }
> > -        max_pfn = max_low_pfn;
> > -        biosram = max_pfn - absent_pages_in_range(0, max_pfn);
> >
> > -        BUG_ON((s64)(biosram - numaram) >= (1 << (20 - PAGE_SHIFT)));
> >          return true;
> >  }
> >
> > @@ -428,7 +421,7 @@ int __init init_numa_memory(void)
> >                  return -EINVAL;
> >
> >          init_node_memblock();
> > -        if (numa_meminfo_cover_memory(&numa_meminfo) == false)
> > +        if (memblock_validate_numa_coverage(SZ_1M) == false)
> >                  return -EINVAL;
> >
> >          for_each_node_mask(node, node_possible_map) {
> >
> > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> > index 2aadb2019b4f..14feec144675 100644
> > --- a/arch/x86/mm/numa.c
> > +++ b/arch/x86/mm/numa.c
> > @@ -451,30 +451,18 @@ EXPORT_SYMBOL(__node_distance);
> >   * Sanity check to catch more bad NUMA configurations (they are amazingly
> >   * common). Make sure the nodes cover all memory.
> >   */
> > -static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
> > +static bool __init memblock_validate_numa_coverage(const u64 limit)
> >  {
> > -        u64 numaram, e820ram;
> > -        int i;
> > +        u64 lo_pg;
> >
> > -        numaram = 0;
> > -        for (i = 0; i < mi->nr_blks; i++) {
> > -                u64 s = mi->blk[i].start >> PAGE_SHIFT;
> > -                u64 e = mi->blk[i].end >> PAGE_SHIFT;
> > -                numaram += e - s;
> > -                numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
> > -                if ((s64)numaram < 0)
> > -                        numaram = 0;
> > -        }
> > -
> > -        e820ram = max_pfn - absent_pages_in_range(0, max_pfn);
> > +        lo_pg = max_pfn - calculate_without_node_pages_in_range();
> >
> >          /* We seem to lose 3 pages somewhere. Allow 1M of slack. */
> > -        if ((s64)(e820ram - numaram) >= (1 << (20 - PAGE_SHIFT))) {
> > -                printk(KERN_ERR "NUMA: nodes only cover %LuMB of your %LuMB e820 RAM. Not used.\n",
> > -                       (numaram << PAGE_SHIFT) >> 20,
> > -                       (e820ram << PAGE_SHIFT) >> 20);
> > +        if (lo_pg >= limit) {
> > +                pr_err("NUMA: We lost 1m size page.\n");
> >                  return false;
> >          }
> > +
> >          return true;
> >  }
> >
> > @@ -583,7 +571,7 @@ static int __init numa_register_memblks(struct numa_meminfo *mi)
> >                          return -EINVAL;
> >                  }
> >          }
> > -        if (!numa_meminfo_cover_memory(mi))
> > +        if (!memblock_validate_numa_coverage(SZ_1M))
> >                  return -EINVAL;
> >
> >          /* Finally register nodes. */
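One thing I want to double-check before reworking this: units. The old
check converted the 1MB slack into pages, i.e. 1 << (20 - PAGE_SHIFT) ==
256 pages with 4KB pages, and then compared page counts. If the limit is
now passed in as SZ_1M (bytes) while lo_pg counts pages, I assume the
generic helper needs to convert one side, something like (hypothetical):

        /* hypothetical: turn the byte limit into a page count first */
        if (lo_pg >= (limit >> PAGE_SHIFT))
                return false;

Otherwise the slack is effectively 1M pages (4GB with 4KB pages) rather
than 1MB of memory.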
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 0daef3f2f029..b32457ad1ae3 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -3043,6 +3043,7 @@ unsigned long __absent_pages_in_range(int nid, unsigned long start_pfn,
> >                                          unsigned long end_pfn);
> >  extern unsigned long absent_pages_in_range(unsigned long start_pfn,
> >                                          unsigned long end_pfn);
> > +extern unsigned long calculate_without_node_pages_in_range(void);
> >  extern void get_pfn_range_for_nid(unsigned int nid,
> >                          unsigned long *start_pfn, unsigned long *end_pfn);
> >
> > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > index 3ddd18a89b66..13a4883787e3 100644
> > --- a/mm/mm_init.c
> > +++ b/mm/mm_init.c
> > @@ -1132,6 +1132,26 @@ static void __init adjust_zone_range_for_zone_movable(int nid,
> >          }
> >  }
> >
> > +/**
> > + * @start_pfn: The start PFN to start searching for holes
> > + * @end_pfn: The end PFN to stop searching for holes
> > + *
> > + * Return: Return the number of page frames without node assigned within a range.
> > + */
> > +unsigned long __init calculate_without_node_pages_in_range(void)
> > +{
> > +        unsigned long num_pages;
> > +        unsigned long start_pfn, end_pfn;
> > +        int nid, i;
> > +
> > +        for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
> > +                if (nid == NUMA_NO_NODE)
> > +                        num_pages += end_pfn - start_pfn;
> > +        }
> > +
> > +        return num_pages;
> > +}
> > +
> >  /*
> >   * Return the number of holes in a range on a node. If nid is MAX_NUMNODES,
> >   * then all holes in the requested range will be accounted for.
> > --
> > 2.25.1
>
> --
> Sincerely yours,
> Mike.