linux-mm.kvack.org archive mirror
From: Luiz Capitulino <luizcap@redhat.com>
To: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>, linux-mm@kvack.org
Cc: Donet Tom <donettom@linux.ibm.com>, Gang Li <gang.li@linux.dev>,
	Daniel Jordan <daniel.m.jordan@oracle.com>,
	Muchun Song <muchun.song@linux.dev>,
	David Rientjes <rientjes@google.com>,
	Sourabh Jain <sourabhjain@linux.ibm.com>,
	Pavithra Prakash <pavrampu@linux.ibm.com>
Subject: Re: [PATCH v2] mm/hugetlb: Fix hugepage allocation for interleaved memory nodes
Date: Mon, 20 Jan 2025 16:39:23 -0500	[thread overview]
Message-ID: <99f2ab0c-1e1f-43ab-82ab-20438dab34a6@redhat.com> (raw)
In-Reply-To: <f8d8dad3a5471d284f54185f65d575a6aaab692b.1736592534.git.ritesh.list@gmail.com>

On 2025-01-11 06:06, Ritesh Harjani (IBM) wrote:
> gather_bootmem_prealloc() assumes the start nid is 0 and the size is
> num_node_state(N_MEMORY). That means if the memory-attached NUMA nodes
> are interleaved, gather_bootmem_prealloc_parallel() will fail to scan
> some of these nodes.
> 
> Since memory-attached NUMA nodes can be interleaved in any fashion,
> ensure that the code checks all NUMA node ids (.size = nr_node_ids).
> Still keep max_threads as num_node_state(N_MEMORY), so that all
> nr_node_ids entries are distributed among that many threads.

With this change we guarantee that we check huge_boot_pages[] for all
possible nodes (vs. only the first num_node_state(N_MEMORY) node ids),
which looks correct to me, so:

   Reviewed-by: Luiz Capitulino <luizcap@redhat.com>

Now, although not really related to this patch, there's one detail: in
gather_bootmem_prealloc_node() we call prep_and_add_bootmem_folios()
even when folio_list is empty. With CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP=y
this can trigger a few calls to flush_tlb_all() down that code path,
even though huge_boot_pages[] is empty...
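
A sketch of the early return that could avoid that (illustrative only --
the function names match mainline mm/hugetlb.c, but the body is
abbreviated and the elided parts are marked with "..."):

```c
static void __init gather_bootmem_prealloc_node(unsigned long nid)
{
	LIST_HEAD(folio_list);
	...
	/* Nothing was gathered on this node: skip
	 * prep_and_add_bootmem_folios() and the flush_tlb_all()
	 * it can trigger with CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP=y. */
	if (list_empty(&folio_list))
		return;

	prep_and_add_bootmem_folios(h, &folio_list);
}
```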

> 
> e.g. qemu cmdline
> ========================
> numa_cmd="-numa node,nodeid=1,memdev=mem1,cpus=2-3 -numa node,nodeid=0,cpus=0-1 -numa dist,src=0,dst=1,val=20"
> mem_cmd="-object memory-backend-ram,id=mem1,size=16G"
> 
> w/o this patch for cmdline (default_hugepagesz=1GB hugepagesz=1GB hugepages=2):
> ==========================
> ~ # cat /proc/meminfo  |grep -i huge
> AnonHugePages:         0 kB
> ShmemHugePages:        0 kB
> FileHugePages:         0 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:    1048576 kB
> Hugetlb:               0 kB
> 
> with this patch for cmdline (default_hugepagesz=1GB hugepagesz=1GB hugepages=2):
> ===========================
> ~ # cat /proc/meminfo |grep -i huge
> AnonHugePages:         0 kB
> ShmemHugePages:        0 kB
> FileHugePages:         0 kB
> HugePages_Total:       2
> HugePages_Free:        2
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:    1048576 kB
> Hugetlb:         2097152 kB
> 
> Fixes: b78b27d02930 ("hugetlb: parallelize 1G hugetlb initialization")
> Cc: Donet Tom <donettom@linux.ibm.com>
> Cc: Gang Li <gang.li@linux.dev>
> Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
> Cc: Muchun Song <muchun.song@linux.dev>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
> Cc: linux-mm@kvack.org
> Suggested-by: Muchun Song <muchun.song@linux.dev>
> Reported-by: Pavithra Prakash <pavrampu@linux.ibm.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> ---
> v1 -> v2:
> 1. Made .size = nr_node_ids instead of only online nodes as suggested by Muchun.
> 
> [v1]: https://lore.kernel.org/linux-mm/7e0ca1e8acd7dd5c1fe7cbb252de4eb55a8e851b.1727984881.git.ritesh.list@gmail.com
> 
>   mm/hugetlb.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index c498874a7170..4e2a1e907ec5 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3289,7 +3289,7 @@ static void __init gather_bootmem_prealloc(void)
>   		.thread_fn	= gather_bootmem_prealloc_parallel,
>   		.fn_arg		= NULL,
>   		.start		= 0,
> -		.size		= num_node_state(N_MEMORY),
> +		.size		= nr_node_ids,
>   		.align		= 1,
>   		.min_chunk	= 1,
>   		.max_threads	= num_node_state(N_MEMORY),
> --
> 2.39.5
> 
> 



Thread overview: 3+ messages
2025-01-11 11:06 Ritesh Harjani (IBM)
2025-01-13  4:47 ` Sourabh Jain
2025-01-20 21:39 ` Luiz Capitulino [this message]
