Date: Fri, 20 Sep 2024 12:28:44 +0530
Subject: Re: [PATCH v2] mm/vmstat: Defer the refresh_zone_stat_thresholds after all CPUs bringup
From: Anshuman Khandual
To: Saurabh Sengar, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: ssengar@microsoft.com, wei.liu@kernel.org, srivatsa@csail.mit.edu
In-Reply-To: <1723443220-20623-1-git-send-email-ssengar@linux.microsoft.com>

On 8/12/24 11:43, Saurabh Sengar wrote:
> refresh_zone_stat_thresholds function has two loops which is expensive for
> higher number of CPUs and NUMA nodes.
>
> Below is the rough estimation of total iterations done by these loops
> based on number of NUMA and CPUs.
>
> Total number of iterations: nCPU * 2 * Numa * mCPU
> Where:
> nCPU = total number of CPUs
> Numa = total number of NUMA nodes
> mCPU = mean value of total CPUs (e.g., 512 for 1024 total CPUs)
>
> For the system under test with 16 NUMA nodes and 1024 CPUs, this
> results in a substantial increase in the number of loop iterations
> during boot-up when NUMA is enabled:
>
> No NUMA = 1024*2*1*512 = 1,048,576 : Here refresh_zone_stat_thresholds
> takes around 224 ms total for all the CPUs in the system under test.
> 16 NUMA = 1024*2*16*512 = 16,777,216 : Here refresh_zone_stat_thresholds
> takes around 4.5 seconds total for all the CPUs in the system under test.
>
> Calling this for each CPU is expensive when there are large number
> of CPUs along with multiple NUMAs. Fix this by deferring
> refresh_zone_stat_thresholds to be called later at once when all the
> secondary CPUs are up. Also, register the DYN hooks to keep the
> existing hotplug functionality intact.
>
> Signed-off-by: Saurabh Sengar
> ---
> [V2]
> - Move vmstat_late_init_done under CONFIG_SMP to fix
> variable 'defined but not used' warning.
>
>  mm/vmstat.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 4e2dc067a654..fa235c65c756 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1908,6 +1908,7 @@ static const struct seq_operations vmstat_op = {
>  #ifdef CONFIG_SMP
>  static DEFINE_PER_CPU(struct delayed_work, vmstat_work);
>  int sysctl_stat_interval __read_mostly = HZ;
> +static int vmstat_late_init_done;
>
>  #ifdef CONFIG_PROC_FS
>  static void refresh_vm_stats(struct work_struct *work)
> @@ -2110,7 +2111,8 @@ static void __init init_cpu_node_state(void)
>
>  static int vmstat_cpu_online(unsigned int cpu)
>  {
> -	refresh_zone_stat_thresholds();
> +	if (vmstat_late_init_done)
> +		refresh_zone_stat_thresholds();
>
>  	if (!node_state(cpu_to_node(cpu), N_CPU)) {
>  		node_set_state(cpu_to_node(cpu), N_CPU);
> @@ -2142,6 +2144,14 @@ static int vmstat_cpu_dead(unsigned int cpu)
>  	return 0;
>  }
>
> +static int __init vmstat_late_init(void)
> +{
> +	refresh_zone_stat_thresholds();
> +	vmstat_late_init_done = 1;
> +
> +	return 0;
> +}
> +late_initcall(vmstat_late_init);
>  #endif
>
>  struct workqueue_struct *mm_percpu_wq;

Is the late_initcall()-triggered vmstat_late_init() guaranteed to be
called before the last call into vmstat_cpu_online() during a normal
boot? Otherwise, refresh_zone_stat_thresholds() will never be called
unless there is a CPU online event later.