From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yajun Deng <yajun.deng@linux.dev>
To: akpm@linux-foundation.org, rppt@kernel.org
Cc: mike.kravetz@oracle.com, muchun.song@linux.dev, willy@infradead.org,
	david@redhat.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Yajun Deng <yajun.deng@linux.dev>
Subject: [PATCH v5 2/2] mm: Init page count in reserve_bootmem_region when MEMINIT_EARLY
Date: Sat, 30 Sep 2023 01:00:26 +0800
Message-Id: <20230929170026.2520216-3-yajun.deng@linux.dev>
In-Reply-To: <20230929170026.2520216-1-yajun.deng@linux.dev>
References: <20230929170026.2520216-1-yajun.deng@linux.dev>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

memmap_init_range() initializes the page count of all pages, but
__free_pages_core() then resets the count of the free pages. These
are opposite operations, and doing both is unnecessary and
time-consuming in the MEMINIT_EARLY context.

Initialize the page count in reserve_bootmem_region() in the
MEMINIT_EARLY context, and check the page count in
__free_pages_core() before resetting it.

At the same time, the INIT_LIST_HEAD() in reserve_bootmem_region()
isn't needed, as it is already done in __init_single_page().

The following data was measured on an x86 machine with 190GB of RAM.

before:
free_low_memory_core_early()    341ms

after:
free_low_memory_core_early()    285ms

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
---
v5: add flags in memmap_init_range.
v4: same with v2.
v3: same with v2.
v2: check page count instead of check context before reset it.
v1: https://lore.kernel.org/all/20230922070923.355656-1-yajun.deng@linux.dev/
---
 mm/mm_init.c    | 20 +++++++++++++++-----
 mm/page_alloc.c | 20 ++++++++++++--------
 2 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/mm/mm_init.c b/mm/mm_init.c
index 0549e7c3d588..f84f1ede57c6 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -718,7 +718,7 @@ static void __meminit init_reserved_page(unsigned long pfn, int nid)
 		if (zone_spans_pfn(zone, pfn))
 			break;
 	}
-	__init_single_page(pfn_to_page(pfn), pfn, zid, nid, INIT_PAGE_COUNT);
+	__init_single_page(pfn_to_page(pfn), pfn, zid, nid, 0);
 }
 #else
 static inline void pgdat_set_deferred_range(pg_data_t *pgdat) {}
@@ -756,8 +756,11 @@ void __meminit reserve_bootmem_region(phys_addr_t start,
 
 			init_reserved_page(start_pfn, nid);
 
-			/* Avoid false-positive PageTail() */
-			INIT_LIST_HEAD(&page->lru);
+			/*
+			 * We didn't init page count in memmap_init_range when
+			 * MEMINIT_EARLY, so it must init page count here.
+			 */
+			init_page_count(page);
 
 			/*
 			 * no need for atomic set_bit because the struct
@@ -850,6 +853,7 @@ void __meminit memmap_init_range(unsigned long size, int nid, unsigned long zone
 		struct vmem_altmap *altmap, int migratetype)
 {
 	unsigned long pfn, end_pfn = start_pfn + size;
+	enum page_init_flags flags = 0;
 	struct page *page;
 
 	if (highest_memmap_pfn < end_pfn - 1)
@@ -888,9 +892,15 @@ void __meminit memmap_init_range(unsigned long size, int nid, unsigned long zone
 		}
 
 		page = pfn_to_page(pfn);
-		__init_single_page(page, pfn, zone, nid, INIT_PAGE_COUNT);
+
+		/* If the context is MEMINIT_EARLY, we will init page count and
+		 * mark page reserved in reserve_bootmem_region, the free region
+		 * wouldn't have page count and we will check the pages count
+		 * in __free_pages_core.
+		 */
 		if (context == MEMINIT_HOTPLUG)
-			__SetPageReserved(page);
+			flags = INIT_PAGE_COUNT | INIT_PAGE_RESERVED;
+		__init_single_page(page, pfn, zone, nid, flags);
 
 		/*
 		 * Usually, we want to mark the pageblock MIGRATE_MOVABLE,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7df77b58a961..bc68b5452d01 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1289,18 +1289,22 @@ void __free_pages_core(struct page *page, unsigned int order)
 	unsigned int loop;
 
 	/*
-	 * When initializing the memmap, __init_single_page() sets the refcount
-	 * of all pages to 1 ("allocated"/"not free"). We have to set the
-	 * refcount of all involved pages to 0.
+	 * When initializing the memmap, memmap_init_range sets the refcount
+	 * of all pages to 1 ("reserved" and "free") in hotplug context. We
+	 * have to set the refcount of all involved pages to 0. Otherwise,
+	 * we don't do it, as reserve_bootmem_region only set the refcount on
+	 * reserve region ("reserved") in early context.
 	 */
-	prefetchw(p);
-	for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
-		prefetchw(p + 1);
+	if (page_count(page)) {
+		prefetchw(p);
+		for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
+			prefetchw(p + 1);
+			__ClearPageReserved(p);
+			set_page_count(p, 0);
+		}
 		__ClearPageReserved(p);
 		set_page_count(p, 0);
 	}
-	__ClearPageReserved(p);
-	set_page_count(p, 0);
 
 	atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
 
-- 
2.25.1
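
For readers following the refcount flow, here is a minimal userspace
sketch of what the patch is intended to do. It is not kernel code:
struct toy_page, enum toy_context and the toy_*() helpers are
hypothetical stand-ins for struct page, enum meminit_context,
memmap_init_range(), reserve_bootmem_region() and __free_pages_core(),
and the model assumes that only the refcount and the reserved flag
matter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the state this patch touches. */
struct toy_page {
	int count;	/* models the page refcount */
	bool reserved;	/* models PG_reserved */
};

/* Hypothetical stand-in for enum meminit_context. */
enum toy_context { TOY_EARLY, TOY_HOTPLUG };

/*
 * Models memmap_init_range() after the patch: only the hotplug context
 * sets the refcount and the reserved flag; the early context leaves
 * both cleared.
 */
static void toy_memmap_init_range(struct toy_page *pages, size_t n,
				  enum toy_context ctx)
{
	for (size_t i = 0; i < n; i++) {
		pages[i].count = (ctx == TOY_HOTPLUG) ? 1 : 0;
		pages[i].reserved = (ctx == TOY_HOTPLUG);
	}
}

/*
 * Models reserve_bootmem_region() after the patch: pages in the
 * reserved range [start, end) get a refcount of 1 and are marked
 * reserved.
 */
static void toy_reserve_region(struct toy_page *pages, size_t start,
			       size_t end)
{
	for (size_t i = start; i < end; i++) {
		pages[i].count = 1;
		pages[i].reserved = true;
	}
}

/*
 * Models __free_pages_core() after the patch: the per-page reset is
 * skipped when the first page already has a refcount of 0. Returns the
 * number of pages that were written, as a rough proxy for the work done.
 */
static size_t toy_free_pages_core(struct toy_page *page, size_t nr_pages)
{
	size_t writes = 0;

	if (page->count) {
		for (size_t i = 0; i < nr_pages; i++) {
			page[i].reserved = false;
			page[i].count = 0;
			writes++;
		}
	}
	return writes;
}
```

In this model, initializing 8 pages with TOY_EARLY, reserving pages
[0, 2) and then freeing the remaining 6 pages makes
toy_free_pages_core() return 0, while initializing with TOY_HOTPLUG and
freeing all 8 pages makes it return 8. That corresponds to the per-page
reset being skipped only on the early boot path.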