From: Vlastimil Babka <vbabka@suse.cz>
To: Sandipan Das <sandipan@linux.ibm.com>, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, khlebnikov@yandex-team.ru, mhocko@suse.com,
kirill@shutemov.name, aneesh.kumar@linux.ibm.com,
srikar@linux.vnet.ibm.com
Subject: Re: [PATCH v2] mm: Reset numa stats for boot pagesets
Date: Mon, 11 May 2020 13:13:06 +0200
Message-ID: <e29844d7-e4bd-5482-33e2-345dd64468b7@suse.cz>
In-Reply-To: <9c9c2d1b15e37f6e6bf32f99e3100035e90c4ac9.1588868430.git.sandipan@linux.ibm.com>
On 5/7/20 6:29 PM, Sandipan Das wrote:
> Initially, the per-cpu pagesets of each zone are set to the
> boot pagesets. The real pagesets are allocated later but
> before that happens, page allocations do occur and the numa
> stats for the boot pagesets get incremented since they are
> common to all zones at that point.
>
> The real pagesets, however, are allocated for the populated
> zones only. Unpopulated zones, like those associated with
> memory-less nodes, continue using the boot pageset and end
> up skewing the numa stats of the corresponding node.
>
> E.g.
>
> $ numactl -H
> available: 2 nodes (0-1)
> node 0 cpus: 0 1 2 3
> node 0 size: 0 MB
> node 0 free: 0 MB
> node 1 cpus: 4 5 6 7
> node 1 size: 8131 MB
> node 1 free: 6980 MB
> node distances:
> node   0   1
>   0:  10  40
>   1:  40  10
>
> $ numastat
>                            node0           node1
> numa_hit                     108           56495
> numa_miss                      0               0
> numa_foreign                   0               0
> interleave_hit                 0            4537
> local_node                   108           31547
> other_node                     0           24948
>
> Hence, the boot pageset stats need to be cleared after
> the real pagesets are allocated.
>
> From this point onwards, the stats of the boot pagesets do
> not change as page allocations requested for a memory-less
> node will either fail (if __GFP_THISNODE is used) or get
> fulfilled by a preferred zone of a different node based on
> the fallback zonelist.
>
> Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
With suggestion below.
> ---
>
> The previous version and discussion around it can be found at
> https://lore.kernel.org/linux-mm/20200504070304.127361-1-sandipan@linux.ibm.com/
>
> Changes in v2:
>
> - Reset the stats of the boot pagesets instead of explicitly
> returning zero as suggested by Vlastimil.
>
> - Changed the subject to reflect the above.
>
> ---
> mm/page_alloc.c | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 69827d4fa052..1543e32f7e4e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6256,6 +6256,25 @@ void __init setup_per_cpu_pageset(void)
>  	for_each_populated_zone(zone)
>  		setup_zone_pageset(zone);
>  
> +#ifdef CONFIG_NUMA
> +	if (static_branch_likely(&vm_numa_stat_key)) {
I would just remove this test and do the reset unconditionally, as the branch
can only be disabled later in boot, by a sysctl.
> +		struct per_cpu_pageset *pcp;
> +		int cpu;
> +
> +		/*
> +		 * Unpopulated zones continue using the boot pagesets.
> +		 * The numa stats for these pagesets need to be reset.
> +		 * Otherwise, they will end up skewing the stats of
> +		 * the nodes these zones are associated with.
> +		 */
> +		for_each_possible_cpu(cpu) {
> +			pcp = &per_cpu(boot_pageset, cpu);
> +			memset(pcp->vm_numa_stat_diff, 0,
> +			       sizeof(pcp->vm_numa_stat_diff));
> +		}
> +	}
> +#endif
> +
>  	for_each_online_pgdat(pgdat)
>  		pgdat->per_cpu_nodestats =
>  			alloc_percpu(struct per_cpu_nodestat);
>  
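For illustration only, here is a minimal userspace sketch of the unconditional reset suggested above. It is not kernel code: NR_CPUS, NUMA_HIT_IDX, the plain boot_pageset[] array, and all three function names are hypothetical stand-ins for the per-cpu machinery, and the struct is reduced to the one field the patch touches. The counts are chosen to match the numa_hit value of 108 reported for node0 in the example.

```c
#include <string.h>

#define NR_CPUS			4	/* hypothetical CPU count for this model */
#define NR_VM_NUMA_STAT_ITEMS	6	/* numa_hit ... other_node */
#define NUMA_HIT_IDX		0	/* hypothetical index of the numa_hit counter */

/* Simplified stand-in for struct per_cpu_pageset (only the stats field). */
struct per_cpu_pageset {
	unsigned short vm_numa_stat_diff[NR_VM_NUMA_STAT_ITEMS];
};

/* Stand-in for the per-cpu boot_pageset variable: one entry per CPU. */
static struct per_cpu_pageset boot_pageset[NR_CPUS];

/* Model early-boot allocations bumping the shared boot pageset counters. */
void simulate_early_boot_allocations(void)
{
	boot_pageset[0].vm_numa_stat_diff[NUMA_HIT_IDX] = 100;
	boot_pageset[1].vm_numa_stat_diff[NUMA_HIT_IDX] = 8;
}

/* Sum one counter across all CPUs, roughly as a stats reader would. */
unsigned long sum_boot_stat(int item)
{
	unsigned long sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += boot_pageset[cpu].vm_numa_stat_diff[item];

	return sum;
}

/* The reset itself, done unconditionally (no static-branch test). */
void reset_boot_pageset_numa_stats(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		struct per_cpu_pageset *pcp = &boot_pageset[cpu];

		memset(pcp->vm_numa_stat_diff, 0,
		       sizeof(pcp->vm_numa_stat_diff));
	}
}
```

In this model, summing NUMA_HIT_IDX after simulate_early_boot_allocations() gives the stale count that an unpopulated zone would keep reporting, and the same sum is zero after reset_boot_pageset_numa_stats().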