From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH resend] mm: hugetlb_vmemmap: use bulk allocator in alloc_vmemmap_page_list()
From: Muchun Song <muchun.song@linux.dev>
To: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox, Andrew Morton, Mike Kravetz, Linux-MM, Yuan Can
Date: Thu, 7 Sep 2023 14:35:23 +0800
Message-Id: <37EB1F18-83E9-4E40-9694-3C004D04D9C4@linux.dev>
In-Reply-To: <4d13cd1e-4d5f-4270-a720-3c8098b1f62c@huawei.com>
References: <20230905103508.2996474-1-wangkefeng.wang@huawei.com>
 <4029895B-26D8-4FA7-9E1A-2452B1C71130@linux.dev>
 <11F83276-0C5C-4526-85F7-C807D741EAFD@linux.dev>
 <4d13cd1e-4d5f-4270-a720-3c8098b1f62c@huawei.com>
> On Sep 6, 2023, at 22:58, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
> On 2023/9/6 22:32, Muchun Song wrote:
>>> On Sep 6, 2023, at 17:33, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>>
>>> On 2023/9/6 11:25, Muchun Song wrote:
>>>>> On Sep 6, 2023, at 11:13, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>>>>
>>>>> On 2023/9/6 10:47, Matthew Wilcox wrote:
>>>>>> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
>>>>>>> alloc_vmemmap_page_list() needs to allocate 4095 pages (1G) or
>>>>>>> 7 pages (2M) at once, so let's add a bulk allocator variant,
>>>>>>> alloc_pages_bulk_list_node(), and switch alloc_vmemmap_page_list()
>>>>>>> to use it to accelerate page allocation.
>>>>>> Argh, no, please don't do this.
>>>>>> Iterating a linked list is _expensive_.  It is about 10x quicker to
>>>>>> iterate an array than a linked list.  Adding the list_head option
>>>>>> to __alloc_pages_bulk() was a colossal mistake.  Don't perpetuate it.
>>>>>> These pages are going into an array anyway.  Don't put them on a list
>>>>>> first.
>>>>>
>>>>> struct vmemmap_remap_walk - walk vmemmap page table
>>>>>
>>>>> * @vmemmap_pages: the list head of the vmemmap pages that can be freed
>>>>> *                 or is mapped from.
>>>>>
>>>>> At present, struct vmemmap_remap_walk uses a list for the vmemmap page
>>>>> table walk, so do you mean we need to change vmemmap_pages from a list
>>>>> to an array first and then use the array bulk API, and even kill the
>>>>> list bulk API?
>>>> It'll be a little complex for hugetlb_vmemmap. Would it be reasonable to
>>>> use __alloc_pages_bulk() directly in hugetlb_vmemmap itself?
>>>
>>> We could use alloc_pages_bulk_array_node() here without introducing a
>>> new alloc_pages_bulk_list_node(), and only focus on accelerating page
>>> allocation for now.
>>>
>> No.
>> Using alloc_pages_bulk_array_node() will add more complexity (you need
>> to allocate an array first) for hugetlb_vmemmap, and the path you
>> optimized is only a control path; the optimization is at the millisecond
>> level. So I don't think it is of great value to do this.
> I tried it; yes, it's a little complex:
>
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index 4b9734777f69..5f502e18f950 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -377,26 +377,53 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
>  	return ret;
>  }
>
> +static int vmemmap_bulk_alloc_pages(gfp_t gfp, int nid, unsigned int nr_pages,
> +				    struct page **pages)
> +{
> +	unsigned int last, allocated = 0;
> +
> +	do {
> +		last = allocated;
> +
> +		allocated = alloc_pages_bulk_array_node(gfp, nid, nr_pages, pages);
> +		if (allocated == last)
> +			goto err;
> +	} while (allocated < nr_pages);
> +
> +	return 0;
> +err:
> +	for (allocated = 0; allocated < nr_pages; allocated++) {
> +		if (pages[allocated])
> +			__free_page(pages[allocated]);
> +	}
> +
> +	return -ENOMEM;
> +}
> +
>  static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
>  				   struct list_head *list)
>  {
>  	gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
>  	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
>  	int nid = page_to_nid((struct page *)start);
> -	struct page *page, *next;
> +	struct page **pages;
> +	int ret = -ENOMEM;
> +
> +	pages = kzalloc(array_size(nr_pages, sizeof(struct page *)), gfp_mask);
> +	if (!pages)
> +		return ret;
> +
> +	ret = vmemmap_bulk_alloc_pages(gfp_mask, nid, nr_pages, pages);
> +	if (ret)
> +		goto out;
>
>  	while (nr_pages--) {
> -		page = alloc_pages_node(nid, gfp_mask, 0);
> -		if (!page)
> -			goto out;
> -		list_add_tail(&page->lru, list);
> +		list_add_tail(&pages[nr_pages]->lru, list);
>  	}
> -
> -	return 0;
>  out:
> -	list_for_each_entry_safe(page, next, list, lru)
> -		__free_page(page);
> -	return -ENOMEM;
> +	kfree(pages);
> +	return ret;
>  }
>
> Or we could just use __alloc_pages_bulk() in it, but as Matthew said, we
> should avoid the list usage; the list API needs to be cleaned up and no
> one should use it. Or make no change, since this is not a hot path :)
>
> Thanks.

Let's leave it unchanged. Thanks.