Date: Tue, 2 Jan 2024 17:52:53 -0800 (PST)
From: David Rientjes
To: Gang Li
cc: David Hildenbrand, Mike Kravetz, Muchun Song, Andrew Morton, Tim Chen,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, ligang.bdlg@bytedance.com
Subject: Re: [PATCH v3 0/7] hugetlb: parallelize hugetlb page init on boot
In-Reply-To: <20240102131249.76622-1-gang.li@linux.dev>
Message-ID: <5c30a825-b588-e3a9-83db-f8eef4cb9012@google.com>
References: <20240102131249.76622-1-gang.li@linux.dev>

On Tue, 2 Jan 2024, Gang Li wrote:

> Hi all, hugetlb init parallelization has now been updated to v3.
>
> This series is tested on next-20240102 and can not be applied to v6.7-rc8.
>
> Update Summary:
> - Select CONFIG_PADATA as we use padata_do_multithreaded
> - Fix a race condition in h->next_nid_to_alloc
> - Fix local variable initialization issues
> - Remove RFC tag
>
> Thanks to the testing by David Rientjes, we now know that this patch reduce
> hugetlb 1G initialization time from 77s to 18.3s on a 12T machine[4].
>
> # Introduction
> Hugetlb initialization during boot takes up a considerable amount of time.
> For instance, on a 2TB system, initializing 1,800 1GB huge pages takes 1-2
> seconds out of 10 seconds. Initializing 11,776 1GB pages on a 12TB Intel
> host takes more than 1 minute[1]. This is a noteworthy figure.
>
> Inspired by [2] and [3], hugetlb initialization can also be accelerated
> through parallelization. Kernel already has infrastructure like
> padata_do_multithreaded, this patch uses it to achieve effective results
> by minimal modifications.
>
> [1] https://lore.kernel.org/all/783f8bac-55b8-5b95-eb6a-11a583675000@google.com/
> [2] https://lore.kernel.org/all/20200527173608.2885243-1-daniel.m.jordan@oracle.com/
> [3] https://lore.kernel.org/all/20230906112605.2286994-1-usama.arif@bytedance.com/
> [4] https://lore.kernel.org/all/76becfc1-e609-e3e8-2966-4053143170b6@google.com/
>
> # Test result
>         test          no patch(ms)   patched(ms)   saved
>  ------------------- -------------- ------------- --------
>   256c2t(4 node) 1G           4745          2024   57.34%
>   128c1t(2 node) 1G           3358          1712   49.02%
>       12t        1G          77000         18300   76.23%
>
>   256c2t(4 node) 2M           3336          1051   68.52%
>   128c1t(2 node) 2M           1943           716   63.15%
>

I tested 1GB hugetlb on a smaller AMD host with the following:

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3301,7 +3301,7 @@ int alloc_bootmem_huge_page(struct hstate *h, int nid)
 int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 {
 	struct huge_bootmem_page *m = NULL; /* initialize for clang */
-	int nr_nodes, node;
+	int nr_nodes, node = nid;
 
 	/* do node specific alloc */
 	if (nid != NUMA_NO_NODE) {

After the build error is fixed, feel free to add:

	Tested-by: David Rientjes

to each patch.

I think Andrew will probably take a build fix up as a delta on top of
patch 4 rather than sending a whole new series unless there is other
feedback that you receive.
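
A minimal userspace sketch of why the `node = nid` initializer in the diff above matters. It assumes the build error is an uninitialized use of `node` on the node-specific path; the compiler output is not quoted in this thread, so treat that as a guess. `record_boot_page`, `pages_on_node`, `next_node` and `NR_NODES` are hypothetical stand-ins for illustration, not kernel symbols.

```c
#define NUMA_NO_NODE (-1)
#define NR_NODES 4		/* hypothetical node count for the sketch */

/* Hypothetical stand-in for per-node lists of boot-time huge pages. */
int pages_on_node[NR_NODES];

/* Hypothetical stand-in for a round-robin cursor over nodes. */
int next_node;

/* Records one page and returns the node it landed on, or -1 on bad input. */
int record_boot_page(int nid)
{
	/*
	 * Initialising from nid gives the node-specific path a defined
	 * value. With a plain "int node;" the jump to "found" below would
	 * read node before anything had been assigned to it.
	 */
	int node = nid;

	if (nid != NUMA_NO_NODE) {
		if (nid < 0 || nid >= NR_NODES)
			return -1;
		goto found;	/* node-specific: the fallback never runs */
	}

	/* Fallback: round-robin over all nodes; node is assigned here. */
	node = next_node;
	next_node = (next_node + 1) % NR_NODES;

found:
	pages_on_node[node]++;
	return node;
}
```

With this shape, `record_boot_page(2)` records on node 2 directly, while `record_boot_page(NUMA_NO_NODE)` walks the nodes in order starting from 0.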