linux-mm.kvack.org archive mirror
From: David Rientjes <rientjes@google.com>
To: Gang Li <gang.li@linux.dev>
Cc: David Hildenbrand <david@redhat.com>,
	 Mike Kravetz <mike.kravetz@oracle.com>,
	 Muchun Song <muchun.song@linux.dev>,
	 Andrew Morton <akpm@linux-foundation.org>,
	 Tim Chen <tim.c.chen@linux.intel.com>,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org,
	ligang.bdlg@bytedance.com
Subject: Re: [PATCH v3 0/7] hugetlb: parallelize hugetlb page init on boot
Date: Tue, 2 Jan 2024 17:52:53 -0800 (PST)
Message-ID: <5c30a825-b588-e3a9-83db-f8eef4cb9012@google.com>
In-Reply-To: <20240102131249.76622-1-gang.li@linux.dev>

On Tue, 2 Jan 2024, Gang Li wrote:

> Hi all, hugetlb init parallelization has now been updated to v3.
> 
> This series is tested on next-20240102 and cannot be applied to v6.7-rc8.
> 
> Update Summary:
> - Select CONFIG_PADATA as we use padata_do_multithreaded
> - Fix a race condition in h->next_nid_to_alloc
> - Fix local variable initialization issues
> - Remove RFC tag
> 
> Thanks to the testing by David Rientjes, we now know that this patch reduces
> hugetlb 1G initialization time from 77s to 18.3s on a 12T machine[4].
> 
> # Introduction
> Hugetlb initialization during boot takes up a considerable amount of time.
> For instance, on a 2TB system, initializing 1,800 1GB huge pages takes 1-2
> seconds out of 10 seconds. Initializing 11,776 1GB pages on a 12TB Intel
> host takes more than 1 minute[1]. This is a noteworthy figure.
> 
> Inspired by [2] and [3], hugetlb initialization can also be accelerated
> through parallelization. The kernel already has infrastructure such as
> padata_do_multithreaded; this patch uses it to achieve effective results
> with minimal modifications.
> 
> [1] https://lore.kernel.org/all/783f8bac-55b8-5b95-eb6a-11a583675000@google.com/
> [2] https://lore.kernel.org/all/20200527173608.2885243-1-daniel.m.jordan@oracle.com/
> [3] https://lore.kernel.org/all/20230906112605.2286994-1-usama.arif@bytedance.com/
> [4] https://lore.kernel.org/all/76becfc1-e609-e3e8-2966-4053143170b6@google.com/
> 
> # Test result
>         test          no patch(ms)   patched(ms)   saved   
>  ------------------- -------------- ------------- -------- 
>   256c2t(4 node) 1G           4745          2024   57.34%
>   128c1t(2 node) 1G           3358          1712   49.02%
>       12t        1G          77000         18300   76.23%
> 
>   256c2t(4 node) 2M           3336          1051   68.52%
>   128c1t(2 node) 2M           1943           716   63.15%
> 

I tested 1GB hugetlb on a smaller AMD host with the following:

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3301,7 +3301,7 @@ int alloc_bootmem_huge_page(struct hstate *h, int nid)
 int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 {
        struct huge_bootmem_page *m = NULL; /* initialize for clang */
-       int nr_nodes, node;
+       int nr_nodes, node = nid;
 
        /* do node specific alloc */
        if (nid != NUMA_NO_NODE) {

After the build error is fixed, feel free to add:

	Tested-by: David Rientjes <rientjes@google.com>

to each patch.  I think Andrew will probably take a build fix up as a
delta on top of patch 4 rather than sending a whole new series unless
there is other feedback that you receive.



