From: Luiz Capitulino <luizcap@redhat.com>
To: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>, linux-mm@kvack.org
Cc: Donet Tom <donettom@linux.ibm.com>, Gang Li <gang.li@linux.dev>,
Daniel Jordan <daniel.m.jordan@oracle.com>,
Muchun Song <muchun.song@linux.dev>,
David Rientjes <rientjes@google.com>,
Sourabh Jain <sourabhjain@linux.ibm.com>,
Pavithra Prakash <pavrampu@linux.ibm.com>
Subject: Re: [PATCH v2] mm/hugetlb: Fix hugepage allocation for interleaved memory nodes
Date: Mon, 20 Jan 2025 16:39:23 -0500
Message-ID: <99f2ab0c-1e1f-43ab-82ab-20438dab34a6@redhat.com>
In-Reply-To: <f8d8dad3a5471d284f54185f65d575a6aaab692b.1736592534.git.ritesh.list@gmail.com>
On 2025-01-11 06:06, Ritesh Harjani (IBM) wrote:
> The gather_bootmem_prealloc() function assumes a start nid of 0 and a size
> of num_node_state(N_MEMORY). That means if the memory-attached NUMA nodes
> are interleaved, gather_bootmem_prealloc_parallel() will fail to scan some
> of these nodes.
>
> Since memory-attached NUMA nodes can be interleaved in any fashion, make
> sure the code checks all NUMA node ids (.size = nr_node_ids). Keep
> max_threads as num_node_state(N_MEMORY), so that the nr_node_ids worth of
> work is still distributed across that many threads.
With this change we guarantee that huge_boot_pages[] is checked for all
possible nodes (vs. only the first num_node_state(N_MEMORY) of them), which
looks correct to me, so:
Reviewed-by: Luiz Capitulino <luizcap@redhat.com>
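Just to make the interleaving arithmetic concrete (hypothetical numbers
matching the qemu layout quoted below, where only nodeid=1 has memory, so
nr_node_ids == 2 while num_node_state(N_MEMORY) == 1), this is how I read
the job setup after the patch:

    struct padata_mt_job job = {
            .thread_fn   = gather_bootmem_prealloc_parallel,
            .fn_arg      = NULL,
            .start       = 0,
            .size        = nr_node_ids,               /* 2: cover nid 0 and nid 1 */
            .align       = 1,
            .min_chunk   = 1,
            /* 1: still at most one worker thread per memory node */
            .max_threads = num_node_state(N_MEMORY),
    };

With the old .size = num_node_state(N_MEMORY), the job stopped after nid 0
and never looked at huge_boot_pages[1].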
Now, although not really related to this patch, there's one detail: in
gather_bootmem_prealloc_node() we call prep_and_add_bootmem_folios()
even when folio_list is empty. This can lead to a few flush_tlb_all()
calls further down the code path when CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP=y,
even though huge_boot_pages[] is empty for that node...
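Something along these lines (untested, just to sketch the idea) would avoid
it:

    /*
     * In gather_bootmem_prealloc_node(): bail out if nothing was gathered
     * for this node, so we never reach prep_and_add_bootmem_folios() (and
     * the flush_tlb_all() it can trigger with
     * CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP=y) with an empty folio_list.
     */
    if (list_empty(&folio_list))
            return;

    prep_and_add_bootmem_folios(h, &folio_list);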
>
> e.g. qemu cmdline
> ========================
> numa_cmd="-numa node,nodeid=1,memdev=mem1,cpus=2-3 -numa node,nodeid=0,cpus=0-1 -numa dist,src=0,dst=1,val=20"
> mem_cmd="-object memory-backend-ram,id=mem1,size=16G"
>
> w/o this patch for cmdline (default_hugepagesz=1GB hugepagesz=1GB hugepages=2):
> ==========================
> ~ # cat /proc/meminfo |grep -i huge
> AnonHugePages: 0 kB
> ShmemHugePages: 0 kB
> FileHugePages: 0 kB
> HugePages_Total: 0
> HugePages_Free: 0
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 1048576 kB
> Hugetlb: 0 kB
>
> with this patch for cmdline (default_hugepagesz=1GB hugepagesz=1GB hugepages=2):
> ===========================
> ~ # cat /proc/meminfo |grep -i huge
> AnonHugePages: 0 kB
> ShmemHugePages: 0 kB
> FileHugePages: 0 kB
> HugePages_Total: 2
> HugePages_Free: 2
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 1048576 kB
> Hugetlb: 2097152 kB
>
> Fixes: b78b27d02930 ("hugetlb: parallelize 1G hugetlb initialization")
> Cc: Donet Tom <donettom@linux.ibm.com>
> Cc: Gang Li <gang.li@linux.dev>
> Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
> Cc: Muchun Song <muchun.song@linux.dev>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
> Cc: linux-mm@kvack.org
> Suggested-by: Muchun Song <muchun.song@linux.dev>
> Reported-by: Pavithra Prakash <pavrampu@linux.ibm.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> ---
> v1 -> v2:
> 1. Made .size = nr_node_ids instead of only online nodes as suggested by Muchun.
>
> [v1]: https://lore.kernel.org/linux-mm/7e0ca1e8acd7dd5c1fe7cbb252de4eb55a8e851b.1727984881.git.ritesh.list@gmail.com
>
> mm/hugetlb.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index c498874a7170..4e2a1e907ec5 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3289,7 +3289,7 @@ static void __init gather_bootmem_prealloc(void)
> .thread_fn = gather_bootmem_prealloc_parallel,
> .fn_arg = NULL,
> .start = 0,
> - .size = num_node_state(N_MEMORY),
> + .size = nr_node_ids,
> .align = 1,
> .min_chunk = 1,
> .max_threads = num_node_state(N_MEMORY),
> --
> 2.39.5
>
>