From: Muchun Song <muchun.song@linux.dev>
To: Mike Kravetz <mike.kravetz@oracle.com>, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Muchun Song, Joao Martins, Oscar Salvador, David Hildenbrand, Miaohe Lin,
    David Rientjes, Anshuman Khandual, Naoya Horiguchi, Barry Song <21cnbao@gmail.com>,
    Michal Hocko, Matthew Wilcox, Xiongchun Duan, Andrew Morton
Subject: Re: [PATCH v4 5/8] hugetlb: batch freeing of vmemmap pages
Date: Tue, 19 Sep 2023 14:09:25 +0800
Message-ID: <9a716de0-91c3-5f29-4f88-391b9aaeb5ce@linux.dev>
In-Reply-To: <20230918230202.254631-6-mike.kravetz@oracle.com>
References: <20230918230202.254631-1-mike.kravetz@oracle.com> <20230918230202.254631-6-mike.kravetz@oracle.com>

On 2023/9/19 07:01, Mike Kravetz wrote:
> Now that batching of hugetlb vmemmap optimization processing is possible,
> batch the freeing of vmemmap pages. When freeing vmemmap pages for a
> hugetlb page, we add them to a list that is freed after the entire batch
> has been processed.
>
> This enhances the ability to return contiguous ranges of memory to the
> low level allocators.
>
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>

Reviewed-by: Muchun Song <muchun.song@linux.dev>

One nit below.

> ---
>  mm/hugetlb_vmemmap.c | 85 ++++++++++++++++++++++++++++++--------------
>  1 file changed, 59 insertions(+), 26 deletions(-)
>
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index 463a4037ec6e..147ed15bcae4 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -222,6 +222,9 @@ static void free_vmemmap_page_list(struct list_head *list)
>  {
>          struct page *page, *next;
>
> +        if (list_empty(list))
> +                return;

It seems unnecessary since the following "list_for_each_entry_safe" could
handle the empty-list case. Right?
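(For reference, this is roughly what the list_for_each_entry_safe() below expands
to; a simplified sketch based on include/linux/list.h, not the exact macro text.
With an empty list, head->next points back at the head itself, so the loop
condition fails on the very first check and free_vmemmap_page() is never called;
the early return only skips evaluating that one condition.)

    for (page = list_first_entry(list, struct page, lru),
         next = list_next_entry(page, lru);
         !list_entry_is_head(page, list, lru);    /* false immediately for an empty list */
         page = next, next = list_next_entry(next, lru))
            free_vmemmap_page(page);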
> +
>          list_for_each_entry_safe(page, next, list, lru)
>                  free_vmemmap_page(page);
>  }
> @@ -251,7 +254,7 @@ static void vmemmap_remap_pte(pte_t *pte, unsigned long addr,
>          }
>
>          entry = mk_pte(walk->reuse_page, pgprot);
> -        list_add_tail(&page->lru, walk->vmemmap_pages);
> +        list_add(&page->lru, walk->vmemmap_pages);
>          set_pte_at(&init_mm, addr, pte, entry);
>  }
>
> @@ -306,18 +309,20 @@ static void vmemmap_restore_pte(pte_t *pte, unsigned long addr,
>   * @end: end address of the vmemmap virtual address range that we want to
>   *       remap.
>   * @reuse: reuse address.
> + * @vmemmap_pages: list to deposit vmemmap pages to be freed.  It is callers
> + *                responsibility to free pages.
>   *
>   * Return: %0 on success, negative error code otherwise.
>   */
>  static int vmemmap_remap_free(unsigned long start, unsigned long end,
> -                              unsigned long reuse)
> +                              unsigned long reuse,
> +                              struct list_head *vmemmap_pages)
>  {
>          int ret;
> -        LIST_HEAD(vmemmap_pages);
>          struct vmemmap_remap_walk walk = {
>                  .remap_pte = vmemmap_remap_pte,
>                  .reuse_addr = reuse,
> -                .vmemmap_pages = &vmemmap_pages,
> +                .vmemmap_pages = vmemmap_pages,
>          };
>          int nid = page_to_nid((struct page *)reuse);
>          gfp_t gfp_mask = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN;
> @@ -334,7 +339,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
>          if (walk.reuse_page) {
>                  copy_page(page_to_virt(walk.reuse_page),
>                            (void *)walk.reuse_addr);
> -                list_add(&walk.reuse_page->lru, &vmemmap_pages);
> +                list_add(&walk.reuse_page->lru, vmemmap_pages);
>          }
>
>          /*
> @@ -365,15 +370,13 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
>                  walk = (struct vmemmap_remap_walk) {
>                          .remap_pte = vmemmap_restore_pte,
>                          .reuse_addr = reuse,
> -                        .vmemmap_pages = &vmemmap_pages,
> +                        .vmemmap_pages = vmemmap_pages,
>                  };
>
>                  vmemmap_remap_range(reuse, end, &walk);
>          }
>          mmap_read_unlock(&init_mm);
>
> -        free_vmemmap_page_list(&vmemmap_pages);
> -
>          return ret;
>  }
>
> @@ -389,7 +392,7 @@ static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
>                  page = alloc_pages_node(nid, gfp_mask, 0);
>                  if (!page)
>                          goto out;
> -                list_add_tail(&page->lru, list);
> +                list_add(&page->lru, list);
>          }
>
>          return 0;
> @@ -576,24 +579,17 @@ static bool vmemmap_should_optimize(const struct hstate *h, const struct page *h
>          return true;
>  }
>
> -/**
> - * hugetlb_vmemmap_optimize - optimize @head page's vmemmap pages.
> - * @h: struct hstate.
> - * @head: the head page whose vmemmap pages will be optimized.
> - *
> - * This function only tries to optimize @head's vmemmap pages and does not
> - * guarantee that the optimization will succeed after it returns. The caller
> - * can use HPageVmemmapOptimized(@head) to detect if @head's vmemmap pages
> - * have been optimized.
> - */
> -void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head)
> +static int __hugetlb_vmemmap_optimize(const struct hstate *h,
> +                                        struct page *head,
> +                                        struct list_head *vmemmap_pages)
>  {
> +        int ret = 0;
>          unsigned long vmemmap_start = (unsigned long)head, vmemmap_end;
>          unsigned long vmemmap_reuse;
>
>          VM_WARN_ON_ONCE(!PageHuge(head));
>          if (!vmemmap_should_optimize(h, head))
> -                return;
> +                return ret;
>
>          static_branch_inc(&hugetlb_optimize_vmemmap_key);
>
> @@ -603,21 +599,58 @@ void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head)
>
>          /*
>           * Remap the vmemmap virtual address range [@vmemmap_start, @vmemmap_end)
> -         * to the page which @vmemmap_reuse is mapped to, then free the pages
> -         * which the range [@vmemmap_start, @vmemmap_end] is mapped to.
> +         * to the page which @vmemmap_reuse is mapped to. Add pages previously
> +         * mapping the range to vmemmap_pages list so that they can be freed by
> +         * the caller.
>           */
> -        if (vmemmap_remap_free(vmemmap_start, vmemmap_end, vmemmap_reuse))
> +        ret = vmemmap_remap_free(vmemmap_start, vmemmap_end, vmemmap_reuse, vmemmap_pages);
> +        if (ret)
>                  static_branch_dec(&hugetlb_optimize_vmemmap_key);
>          else
>                  SetHPageVmemmapOptimized(head);
> +
> +        return ret;
> +}
> +
> +/**
> + * hugetlb_vmemmap_optimize - optimize @head page's vmemmap pages.
> + * @h: struct hstate.
> + * @head: the head page whose vmemmap pages will be optimized.
> + *
> + * This function only tries to optimize @head's vmemmap pages and does not
> + * guarantee that the optimization will succeed after it returns. The caller
> + * can use HPageVmemmapOptimized(@head) to detect if @head's vmemmap pages
> + * have been optimized.
> + */
> +void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head)
> +{
> +        LIST_HEAD(vmemmap_pages);
> +
> +        __hugetlb_vmemmap_optimize(h, head, &vmemmap_pages);
> +        free_vmemmap_page_list(&vmemmap_pages);
>  }
>
>  void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
>  {
>          struct folio *folio;
> +        LIST_HEAD(vmemmap_pages);
>
> -        list_for_each_entry(folio, folio_list, lru)
> -                hugetlb_vmemmap_optimize(h, &folio->page);
> +        list_for_each_entry(folio, folio_list, lru) {
> +                int ret = __hugetlb_vmemmap_optimize(h, &folio->page,
> +                                                &vmemmap_pages);
> +
> +                /*
> +                 * Pages to be freed may have been accumulated.  If we
> +                 * encounter an ENOMEM, free what we have and try again.
> +                 */
> +                if (ret == -ENOMEM && !list_empty(&vmemmap_pages)) {
> +                        free_vmemmap_page_list(&vmemmap_pages);
> +                        INIT_LIST_HEAD(&vmemmap_pages);
> +                        __hugetlb_vmemmap_optimize(h, &folio->page, &vmemmap_pages);
> +                }
> +        }
> +
> +        free_vmemmap_page_list(&vmemmap_pages);
>  }
>
>  static struct ctl_table hugetlb_vmemmap_sysctls[] = {