From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: [PATCH] mm: hugetlb_vmemmap: use bulk allocator in alloc_vmemmap_page_list()
From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Muchun Song
Cc: Andrew Morton, Mike Kravetz, Linux-MM, Yuan Can
Date: Tue, 5 Sep 2023 18:12:22 +0800
Message-ID: <312e2d11-f662-4858-8e39-6f5d7f3b98be@huawei.com>
In-Reply-To: <877C4C7F-C946-4B55-A4C5-A493EA21B6AC@linux.dev>
References: <20230905071016.2818810-1-wangkefeng.wang@huawei.com> <877C4C7F-C946-4B55-A4C5-A493EA21B6AC@linux.dev>
Content-Type: text/plain; charset="UTF-8"; format=flowed

On 2023/9/5 17:23, Muchun Song wrote:
>
>
>> On Sep 5, 2023, at 15:32, Kefeng Wang wrote:
>> On 2023/9/5 15:10, Kefeng Wang wrote:
>>> 4095 pages (1G) or 7 pages (2M) need to be allocated at once in
>>> alloc_vmemmap_page_list(), so let's add a bulk allocator variant,
>>> alloc_pages_bulk_list_node(), and switch alloc_vmemmap_page_list()
>>> over to it to accelerate page allocation.
>>> A simple test on arm64 qemu with a 1G HugeTLB page shows 870,842ns vs
>>> 3,845,252ns; despite the fluctuations, it is still a nice improvement.
>>>
>>> Tested-by: Yuan Can
>>> Signed-off-by: Kefeng Wang
>>> ---
>>>  include/linux/gfp.h  | 9 +++++++++
>>>  mm/hugetlb_vmemmap.c | 7 ++++++-
>>>  2 files changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
>>> index 665f06675c83..d6e82f15b61f 100644
>>> --- a/include/linux/gfp.h
>>> +++ b/include/linux/gfp.h
>>> @@ -195,6 +195,15 @@ alloc_pages_bulk_list(gfp_t gfp, unsigned long nr_pages, struct list_head *list)
>>>  	return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, list, NULL);
>>>  }
>>>
>>> +static inline unsigned long
>>> +alloc_pages_bulk_list_node(gfp_t gfp, int nid, unsigned long nr_pages, struct list_head *list)
>>> +{
>>> +	if (nid == NUMA_NO_NODE)
>>> +		nid = numa_mem_id();
>>> +
>>> +	return __alloc_pages_bulk(gfp, nid, NULL, nr_pages, list, NULL);
>>> +}
>>> +
>>>  static inline unsigned long
>>>  alloc_pages_bulk_array(gfp_t gfp, unsigned long nr_pages, struct page **page_array)
>>>  {
>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>>> index 4b9734777f69..699c4fea6b9f 100644
>>> --- a/mm/hugetlb_vmemmap.c
>>> +++ b/mm/hugetlb_vmemmap.c
>>> @@ -384,8 +384,13 @@ static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
>>>  	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
>>>  	int nid = page_to_nid((struct page *)start);
>>>  	struct page *page, *next;
>>> +	unsigned long nr_alloced;
>>>
>>> -	while (nr_pages--) {
>>> +	nr_alloced = alloc_pages_bulk_list_node(gfp_mask, nid, nr_pages, list);
>>> +	if (!nr_alloced)
>>> +		return -ENOMEM;
>>> +
>>
>> eh, forgot to increment nr_allocated in the fallback path, will resend
>
> Do not change the judgement, "nr_pages -= nr_alloced;" is enough
> and simple.

With nr_pages = 7 and nr_alloced = 4, the new nr_pages would be 3, so
the fallback loop won't execute unless nr_alloced is also cleared;
instead, I will only increment nr_allocated when a page is actually
allocated:
	while (nr_allocated < nr_pages) {
		page = alloc_pages_node(nid, gfp_mask, 0);
		if (!page)
			goto out;

		list_add_tail(&page->lru, list);
+		nr_allocated++;
	}

>
>>> +	while (nr_alloced < nr_pages) {
>>>  		page = alloc_pages_node(nid, gfp_mask, 0);
>>>  		if (!page)
>>>  			goto out;
>
>