linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mike Rapoport <rppt@kernel.org>
To: Donet Tom <donettom@linux.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-kernel@vger.kernel.org,
	David Hildenbrand <david@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Ritesh Harjani <ritesh.list@gmail.com>,
	rafael@kernel.org, Danilo Krummrich <dakr@kernel.org>
Subject: Re: [PATCH 2/2] base/node: Use curr_node_memblock_intersect_memory_block to Get Memory Block NID if CONFIG_DEFERRED_STRUCT_PAGE_INIT is Set
Date: Fri, 11 Apr 2025 13:59:38 +0300	[thread overview]
Message-ID: <Z_j2Gv9n4DOj6LSs@kernel.org> (raw)
In-Reply-To: <d982df07-e7d1-4d4f-a2c3-857901ccc0d0@linux.ibm.com>

On Fri, Apr 11, 2025 at 12:27:28AM +0530, Donet Tom wrote:
> 
> On 4/10/25 1:37 PM, Mike Rapoport wrote:
> > On Wed, Apr 09, 2025 at 10:57:57AM +0530, Donet Tom wrote:
> > > In the current implementation, when CONFIG_DEFERRED_STRUCT_PAGE_INIT is
> > > set, we iterate over all PFNs in the memory block and use
> > > early_pfn_to_nid to find the NID until a match is found.
> > > 
> > > This patch we are using curr_node_memblock_intersect_memory_block() to
> > > check if the current node's memblock intersects with the memory block
> > > passed when CONFIG_DEFERRED_STRUCT_PAGE_INIT is set. If an intersection
> > > is found, the memory block is added to the current node.
> > > 
> > > If CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set, the existing mechanism
> > > for finding the NID will continue to be used.
> > I don't think we really need different mechanisms for different settings of
> > CONFIG_DEFERRED_STRUCT_PAGE_INIT.
> > 
> > node_dev_init() runs after all struct pages are already initialized and can
> > always use pfn_to_nid().
> 
> 
> In the current implementation, if CONFIG_DEFERRED_STRUCT_PAGE_INIT
> is enabled, we perform a binary search in the memblock region to
> determine the pfn's nid. Otherwise, we use pfn_to_nid() to obtain
> the pfn's nid.
> 
> Your point is that we could unify this logic and always use
> pfn_to_nid() to determine the pfn's nid, regardless of whether
> CONFIG_DEFERRED_STRUCT_PAGE_INIT is set. Is that
> correct?

Yes, struct pages should be ready by the time node_dev_init() is called
even when CONFIG_DEFERRED_STRUCT_PAGE_INIT is set.
 
> > 
> > kernel_init_freeable() ->
> > 	page_alloc_init_late(); /* completes initialization of deferred pages */
> > 	...
> > 	do_basic_setup() ->
> > 		driver_init() ->
> > 			node_dev_init();
> > 
> > The next step could be refactoring register_mem_block_under_node_early() to
> > loop over memblock regions rather than over pfns.
> So it the current implementation
> 
> node_dev_init()
>     register_one_node
>         register_memory_blocks_under_node
>             walk_memory_blocks()
>                 register_mem_block_under_node_early
>                     get_nid_for_pfn
> 
> We get each node's start and end PFN from the pg_data. Using these
> values, we determine the memory block's start and end within the
> current node. To identify the node to which these memory block
> belongs,we iterate over each PFN in the range.
> 
> The problem I am facing is,
> 
> In my system node4 has a memory block ranging from memory30351
> to memory38524, and memory128433. The memory blocks between
> memory38524 and memory128433 do not belong to this node.
> 
> In  walk_memory_blocks() we iterate over all memory blocks starting
> from memory38524 to memory128433.
> In register_mem_block_under_node_early(), up to memory38524, the
> first pfn correctly returns the corresponding nid and the function
> returns from there. But after memory38524 and until memory128433,
> the loop iterates through each pfn and checks the nid. Since the nid
> does not match the required nid, the loop continues. This causes
> the soft lockups.
> 
> This issue occurs only when CONFIG_DEFERRED_STRUCT_PAGE_INIT
> is enabled, as a binary search is used to determine the PFN's nid. When
> this configuration is disabled, pfn_to_nid is faster, and the issue does
> not seen.( Faster because nid is getting from page)
> 
> To speed up the code when CONFIG_DEFERRED_STRUCT_PAGE_INIT
> is enabled, I added this function that iterates over all memblock regions
> for each memory block to determine its nid.
> 
> "Loop over memblock regions instead of iterating over PFNs" -
> My question is - in register_one_node, do you mean that we should iterate
> over all memblock regions, identify the regions belonging to the current
> node, and then retrieve the corresponding memory blocks to register them
> under that node?

I looked more closely at register_mem_block_under_node_early() and
iteration over memblock regions won't make sense there. 

It might make sense to use for_each_mem_range() as top level loop in
node_dev_init(), but that's a separate topic.
 
> Thanks
> Donet
> 

-- 
Sincerely yours,
Mike.


  reply	other threads:[~2025-04-11 10:59 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-09  5:27 [PATCH 1/2] mm/memblock: Added a New Memblock Function to Check if the Current Node's Memblock Region Intersects with a Memory Block Donet Tom
2025-04-09  5:27 ` [PATCH 2/2] base/node: Use curr_node_memblock_intersect_memory_block to Get Memory Block NID if CONFIG_DEFERRED_STRUCT_PAGE_INIT is Set Donet Tom
2025-04-10  8:07   ` Mike Rapoport
2025-04-10 18:57     ` Donet Tom
2025-04-11 10:59       ` Mike Rapoport [this message]
2025-04-11 11:36         ` Donet Tom
2025-04-15  9:46           ` Mike Rapoport
2025-04-15 10:08             ` Donet Tom
2025-04-10  2:20 ` [PATCH 1/2] mm/memblock: Added a New Memblock Function to Check if the Current Node's Memblock Region Intersects with a Memory Block Andrew Morton
2025-04-10  4:35   ` Donet Tom
2025-04-10  7:03 ` kernel test robot
2025-04-10  7:25 ` kernel test robot
2025-04-10  7:49 ` Mike Rapoport
2025-04-10 19:06   ` Donet Tom

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z_j2Gv9n4DOj6LSs@kernel.org \
    --to=rppt@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=dakr@kernel.org \
    --cc=david@redhat.com \
    --cc=donettom@linux.ibm.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=rafael@kernel.org \
    --cc=ritesh.list@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox