From mboxrd@z Thu Jan 1 00:00:00 1970
From: Joshua Hahn <joshua.hahnjy@gmail.com>
To: Gregory Price
Cc: Andrew Morton, Alistair Popple, Byungchul Park, David Hildenbrand,
	Matthew Brost, Rakie Kim, Ying Huang, Zi Yan,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, kernel-team@meta.com
Subject: [PATCH 2/2] mm/mempolicy: Skip extra call to __alloc_pages_bulk in weighted interleave
Date: Thu, 26 Jun 2025 13:09:34 -0700
Message-ID: <20250626200936.3974420-3-joshua.hahnjy@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250626200936.3974420-1-joshua.hahnjy@gmail.com>
References: <20250626200936.3974420-1-joshua.hahnjy@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Currently, alloc_pages_bulk_weighted_interleave can make up to
nr_node_ids+1 calls to __alloc_pages_bulk. The additional call can
happen when the previous call to this function finished its weighted
round-robin allocation partway through a node. To make up for this, the
next time this function is called, an extra allocation is made first to
finish cleanly on the node boundary before performing the weighted
round-robin cycles again.

Instead of making an additional call, we can calculate how many
additional pages should be allocated from the first node (the
carryover) and add that value to the number of pages that node should
receive as part of the current round-robin cycle.
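To make the arithmetic concrete, below is a standalone userspace sketch
with made-up numbers (two nodes weighted {3, 1}, a leftover il_weight of
2 on the first node, and a request for 10 pages). It mirrors the
rounds/delta/carryover computation from the patch, but simplifies the
kernel's next_node_in() walk to a plain array index:

#include <stdio.h>

int main(void)
{
	unsigned int weights[] = { 3, 1 };	/* hypothetical node weights */
	unsigned int nnodes = 2, weight_total = 4;
	unsigned long rem_pages = 10;	/* pages requested in this call */
	unsigned long carryover = 2;	/* leftover il_weight on first node */
	unsigned long rounds, delta;

	/* Fold the leftover weight into the first node's share instead of
	 * issuing a separate bulk call for it up front. */
	rounds = (rem_pages - carryover) / weight_total;	/* 8 / 4 = 2 */
	delta = (rem_pages - carryover) % weight_total;	/* 8 % 4 = 0 */

	for (unsigned int i = 0; i < nnodes; i++) {
		unsigned long portion = delta < weights[i] ? delta : weights[i];
		unsigned long node_pages = weights[i] * rounds + portion + carryover;

		delta -= portion;
		carryover = 0;	/* only the first node absorbs the carryover */
		printf("node %u: %lu pages, one bulk call\n", i, node_pages);
	}
	/* node 0 gets 3*2 + 2 = 8 pages, node 1 gets 1*2 = 2 pages: all 10
	 * pages in two bulk calls, where the old code needed three. */
	return 0;
}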
Running a quick benchmark by compiling the kernel shows a small increase
in performance. These experiments were run on a machine with 2 nodes,
each with 125GB of memory and 40 CPUs.

time numactl -w 0,1 make -j$(nproc)

+----------+---------+------------+---------+
| Time (s) | 6.16    | With patch | % Delta |
+----------+---------+------------+---------+
| Real     | 88.374  | 88.3356    | -0.2019 |
| User     | 3631.7  | 3636.263   | 0.0631  |
| Sys      | 366.029 | 363.792    | -0.7534 |
+----------+---------+------------+---------+

Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
---
 mm/mempolicy.c | 39 ++++++++++++++++++++-------------------
 1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 78ad74a0e249..0d693f96cf66 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2569,7 +2569,7 @@ static unsigned long alloc_pages_bulk_weighted_interleave(gfp_t gfp,
 	unsigned long node_pages, delta;
 	u8 *weights, weight;
 	unsigned int weight_total = 0;
-	unsigned long rem_pages = nr_pages;
+	unsigned long rem_pages = nr_pages, carryover = 0;
 	nodemask_t nodes;
 	int nnodes, node;
 	int resume_node = MAX_NUMNODES - 1;
@@ -2594,18 +2594,12 @@ static unsigned long alloc_pages_bulk_weighted_interleave(gfp_t gfp,
 	node = me->il_prev;
 	weight = me->il_weight;
 	if (weight && node_isset(node, nodes)) {
-		node_pages = min(rem_pages, weight);
-		nr_allocated = __alloc_pages_bulk(gfp, node, NULL, node_pages,
-						  page_array);
-		page_array += nr_allocated;
-		total_allocated += nr_allocated;
-		/* if that's all the pages, no need to interleave */
 		if (rem_pages <= weight) {
-			me->il_weight -= rem_pages;
-			return total_allocated;
+			node_pages = rem_pages;
+			me->il_weight -= node_pages;
+			goto allocate;
 		}
-		/* Otherwise we adjust remaining pages, continue from there */
-		rem_pages -= weight;
+		carryover = weight;
 	}
 	/* clear active weight in case of an allocation failure */
 	me->il_weight = 0;
@@ -2614,7 +2608,7 @@ static unsigned long alloc_pages_bulk_weighted_interleave(gfp_t gfp,
 	/* create a local copy of node weights to operate on outside rcu */
 	weights = kzalloc(nr_node_ids, GFP_KERNEL);
 	if (!weights)
-		return total_allocated;
+		return 0;
 
 	rcu_read_lock();
 	state = rcu_dereference(wi_state);
@@ -2634,16 +2628,17 @@ static unsigned long alloc_pages_bulk_weighted_interleave(gfp_t gfp,
 	/*
 	 * Calculate rounds/partial rounds to minimize __alloc_pages_bulk calls.
 	 * Track which node weighted interleave should resume from.
+	 * Account for carryover. It is always allocated from the first node.
 	 *
 	 * if (rounds > 0) and (delta == 0), resume_node will always be
 	 * the node following prev_node and its weight.
 	 */
-	rounds = rem_pages / weight_total;
-	delta = rem_pages % weight_total;
+	rounds = (rem_pages - carryover) / weight_total;
+	delta = (rem_pages - carryover) % weight_total;
 	resume_node = next_node_in(prev_node, nodes);
 	resume_weight = weights[resume_node];
+	node = carryover ? prev_node : next_node_in(prev_node, nodes);
 	for (i = 0; i < nnodes; i++) {
-		node = next_node_in(prev_node, nodes);
 		weight = weights[node];
 		/* when delta is depleted, resume from that node */
 		if (delta && delta < weight) {
@@ -2651,12 +2646,14 @@ static unsigned long alloc_pages_bulk_weighted_interleave(gfp_t gfp,
 			resume_weight = weight - delta;
 		}
 		/* Add the node's portion of the delta, if there is one */
-		node_pages = weight * rounds + min(delta, weight);
+		node_pages = weight * rounds + min(delta, weight) + carryover;
 		delta -= min(delta, weight);
+		carryover = 0;
 
 		/* node_pages can be 0 if an allocation fails and rounds == 0 */
 		if (!node_pages)
 			break;
+allocate:
 		nr_allocated = __alloc_pages_bulk(gfp, node, NULL, node_pages,
 						  page_array);
 		page_array += nr_allocated;
@@ -2664,10 +2661,14 @@ static unsigned long alloc_pages_bulk_weighted_interleave(gfp_t gfp,
 		if (total_allocated == nr_pages)
 			break;
 		prev_node = node;
+		node = next_node_in(prev_node, nodes);
+	}
+
+	if (weights) {
+		me->il_prev = resume_node;
+		me->il_weight = resume_weight;
+		kfree(weights);
 	}
-	me->il_prev = resume_node;
-	me->il_weight = resume_weight;
-	kfree(weights);
 
 	return total_allocated;
 }
-- 
2.47.1