From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Saurabh Sengar <ssengar@linux.microsoft.com>,
akpm@linux-foundation.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Cc: ssengar@microsoft.com, wei.liu@kernel.org, srivatsa@csail.mit.edu
Subject: Re: [PATCH v2] mm/vmstat: Defer the refresh_zone_stat_thresholds after all CPUs bringup
Date: Fri, 20 Sep 2024 12:28:44 +0530
Message-ID: <b1dc2aa1-cd38-4f1f-89e9-6d009a619541@arm.com>
In-Reply-To: <1723443220-20623-1-git-send-email-ssengar@linux.microsoft.com>
On 8/12/24 11:43, Saurabh Sengar wrote:
> refresh_zone_stat_thresholds function has two loops which is expensive for
> higher number of CPUs and NUMA nodes.
>
> Below is the rough estimation of total iterations done by these loops
> based on number of NUMA and CPUs.
>
> Total number of iterations: nCPU * 2 * Numa * mCPU
> Where:
> nCPU = total number of CPUs
> Numa = total number of NUMA nodes
> mCPU = mean value of total CPUs (e.g., 512 for 1024 total CPUs)
>
> For the system under test with 16 NUMA nodes and 1024 CPUs, this
> results in a substantial increase in the number of loop iterations
> during boot-up when NUMA is enabled:
>
> No NUMA = 1024*2*1*512 = 1,048,576 : Here refresh_zone_stat_thresholds
> takes around 224 ms total for all the CPUs in the system under test.
> 16 NUMA = 1024*2*16*512 = 16,777,216 : Here refresh_zone_stat_thresholds
> takes around 4.5 seconds total for all the CPUs in the system under test.
>
> Calling this for each CPU is expensive when there are large number
> of CPUs along with multiple NUMAs. Fix this by deferring
> refresh_zone_stat_thresholds to be called later at once when all the
> secondary CPUs are up. Also, register the DYN hooks to keep the
> existing hotplug functionality intact.
>
> Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
> ---
> [V2]
> - Move vmstat_late_init_done under CONFIG_SMP to fix
> variable 'defined but not used' warning.
>
> mm/vmstat.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 4e2dc067a654..fa235c65c756 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1908,6 +1908,7 @@ static const struct seq_operations vmstat_op = {
> #ifdef CONFIG_SMP
> static DEFINE_PER_CPU(struct delayed_work, vmstat_work);
> int sysctl_stat_interval __read_mostly = HZ;
> +static int vmstat_late_init_done;
>
> #ifdef CONFIG_PROC_FS
> static void refresh_vm_stats(struct work_struct *work)
> @@ -2110,7 +2111,8 @@ static void __init init_cpu_node_state(void)
>
> static int vmstat_cpu_online(unsigned int cpu)
> {
> -	refresh_zone_stat_thresholds();
> +	if (vmstat_late_init_done)
> +		refresh_zone_stat_thresholds();
>
> 	if (!node_state(cpu_to_node(cpu), N_CPU)) {
> 		node_set_state(cpu_to_node(cpu), N_CPU);
> @@ -2142,6 +2144,14 @@ static int vmstat_cpu_dead(unsigned int cpu)
> 	return 0;
> }
>
> +static int __init vmstat_late_init(void)
> +{
> +	refresh_zone_stat_thresholds();
> +	vmstat_late_init_done = 1;
> +
> +	return 0;
> +}
> +late_initcall(vmstat_late_init);
> #endif
>
> struct workqueue_struct *mm_percpu_wq;

Is the late_initcall()-triggered vmstat_late_init() guaranteed to be called
before the last call into vmstat_cpu_online() during a normal boot?
Otherwise refresh_zone_stat_thresholds() will never be called unless
there is a CPU online event later.