From: Thomas Prescher via B4 Relay <devnull+thomas.prescher.cyberus-technology.de@kernel.org>
To: Jonathan Corbet <corbet@lwn.net>,
Muchun Song <muchun.song@linux.dev>,
Andrew Morton <akpm@linux-foundation.org>
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org,
Thomas Prescher <thomas.prescher@cyberus-technology.de>
Subject: [PATCH v3 1/3] mm: hugetlb: improve parallel huge page allocation time
Date: Fri, 28 Feb 2025 00:02:10 +0100
Message-ID: <20250228-hugepage-parameter-v3-1-2628e9b2b5c0@cyberus-technology.de>
In-Reply-To: <20250228-hugepage-parameter-v3-0-2628e9b2b5c0@cyberus-technology.de>
From: Thomas Prescher <thomas.prescher@cyberus-technology.de>
Before this patch, the kernel used a hard-coded value of 2 threads
per NUMA node for boot-time huge page allocations.
This patch changes that policy: the kernel now uses 25% of the
online hardware threads (at least one) for these allocations.
Signed-off-by: Thomas Prescher <thomas.prescher@cyberus-technology.de>
---
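For readers who want to see the arithmetic, here is a minimal userspace
sketch of the old and new thread-count policies described above. The
function names are hypothetical stand-ins and not kernel API; the node and
CPU counts passed in are plain parameters rather than values read from
num_node_state(N_MEMORY) or num_online_cpus().

```c
/* Old policy: two allocation threads per NUMA node with memory. */
unsigned int old_max_threads(unsigned int nodes_with_memory)
{
	return nodes_with_memory * 2;
}

/* New policy: 25% of the online CPUs, but never fewer than one thread. */
unsigned int new_max_threads(unsigned int online_cpus)
{
	unsigned int n = online_cpus / 4;

	return n > 0 ? n : 1;
}

/*
 * Minimum number of huge pages handed to one thread, mirroring
 * job.min_chunk = h->max_huge_pages / num_allocation_threads.
 * Integer division, so the remainder is not covered by this value.
 */
unsigned long min_chunk(unsigned long max_huge_pages, unsigned int threads)
{
	return max_huge_pages / threads;
}
```

With these helpers, new_max_threads(192) yields 48 and new_max_threads(144)
yields 36, while a machine with 3 online CPUs still gets 1 thread. For
1 TiB of 2MiB pages (524288 pages) and 48 threads, min_chunk() comes out
to 10922 pages per thread.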
mm/hugetlb.c | 34 ++++++++++++++++++----------------
1 file changed, 18 insertions(+), 16 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 163190e89ea16450026496c020b544877db147d1..e9b1b3e2b9d467f067d54359e1401a03f9926108 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -14,9 +14,11 @@
#include <linux/pagemap.h>
#include <linux/mempolicy.h>
#include <linux/compiler.h>
+#include <linux/cpumask.h>
#include <linux/cpuset.h>
#include <linux/mutex.h>
#include <linux/memblock.h>
+#include <linux/minmax.h>
#include <linux/sysfs.h>
#include <linux/slab.h>
#include <linux/sched/mm.h>
@@ -3427,31 +3429,31 @@ static unsigned long __init hugetlb_pages_alloc_boot(struct hstate *h)
.numa_aware = true
};
+ unsigned int num_allocation_threads = max(num_online_cpus() / 4, 1);
+
job.thread_fn = hugetlb_pages_alloc_boot_node;
job.start = 0;
job.size = h->max_huge_pages;
/*
- * job.max_threads is twice the num_node_state(N_MEMORY),
+ * job.max_threads is 25% of the available cpu threads by default.
*
- * Tests below indicate that a multiplier of 2 significantly improves
- * performance, and although larger values also provide improvements,
- * the gains are marginal.
+ * On large servers with terabytes of memory, huge page allocation
+ * can consume a considerable amount of time.
*
- * Therefore, choosing 2 as the multiplier strikes a good balance between
- * enhancing parallel processing capabilities and maintaining efficient
- * resource management.
+ * Tests below show how long it takes to allocate 1 TiB of memory with
+ * 2MiB huge pages. Using more threads can significantly improve allocation time.
*
- * +------------+-------+-------+-------+-------+-------+
- * | multiplier | 1 | 2 | 3 | 4 | 5 |
- * +------------+-------+-------+-------+-------+-------+
- * | 256G 2node | 358ms | 215ms | 157ms | 134ms | 126ms |
- * | 2T 4node | 979ms | 679ms | 543ms | 489ms | 481ms |
- * | 50G 2node | 71ms | 44ms | 37ms | 30ms | 31ms |
- * +------------+-------+-------+-------+-------+-------+
+ * +-----------------------+-------+-------+-------+-------+-------+
+ * | threads | 8 | 16 | 32 | 64 | 128 |
+ * +-----------------------+-------+-------+-------+-------+-------+
+ * | skylake 144 cpus | 44s | 22s | 16s | 19s | 20s |
+ * | cascade lake 192 cpus | 39s | 20s | 11s | 10s | 9s |
+ * +-----------------------+-------+-------+-------+-------+-------+
*/
- job.max_threads = num_node_state(N_MEMORY) * 2;
- job.min_chunk = h->max_huge_pages / num_node_state(N_MEMORY) / 2;
+
+ job.max_threads = num_allocation_threads;
+ job.min_chunk = h->max_huge_pages / num_allocation_threads;
padata_do_multithreaded(&job);
return h->nr_huge_pages;
--
2.48.1
2025-02-27 23:02 [PATCH v3 0/3] Add a command line option that enables control of how many threads should be used to allocate huge pages Thomas Prescher via B4 Relay
2025-02-27 23:02 ` Thomas Prescher via B4 Relay [this message]
2025-02-27 23:02 ` [PATCH v3 2/3] mm: hugetlb: add hugetlb_alloc_threads cmdline option Thomas Prescher via B4 Relay
2025-02-27 23:02 ` [PATCH v3 3/3] mm: hugetlb: log time needed to allocate hugepages Thomas Prescher via B4 Relay