linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Donet Tom <donettom@linux.ibm.com>
To: Mike Rapoport <rppt@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-kernel@vger.kernel.org,
	David Hildenbrand <david@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Ritesh Harjani <ritesh.list@gmail.com>,
	rafael@kernel.org, Danilo Krummrich <dakr@kernel.org>
Subject: Re: [PATCH 2/2] base/node: Use curr_node_memblock_intersect_memory_block to Get Memory Block NID if CONFIG_DEFERRED_STRUCT_PAGE_INIT is Set
Date: Fri, 11 Apr 2025 17:06:55 +0530	[thread overview]
Message-ID: <736ca451-8adc-4c5c-b721-6b78eaeb4699@linux.ibm.com> (raw)
In-Reply-To: <Z_j2Gv9n4DOj6LSs@kernel.org>


On 4/11/25 4:29 PM, Mike Rapoport wrote:
> On Fri, Apr 11, 2025 at 12:27:28AM +0530, Donet Tom wrote:
>> On 4/10/25 1:37 PM, Mike Rapoport wrote:
>>> On Wed, Apr 09, 2025 at 10:57:57AM +0530, Donet Tom wrote:
>>>> In the current implementation, when CONFIG_DEFERRED_STRUCT_PAGE_INIT is
>>>> set, we iterate over all PFNs in the memory block and use
>>>> early_pfn_to_nid to find the NID until a match is found.
>>>>
>>>> This patch we are using curr_node_memblock_intersect_memory_block() to
>>>> check if the current node's memblock intersects with the memory block
>>>> passed when CONFIG_DEFERRED_STRUCT_PAGE_INIT is set. If an intersection
>>>> is found, the memory block is added to the current node.
>>>>
>>>> If CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set, the existing mechanism
>>>> for finding the NID will continue to be used.
>>> I don't think we really need different mechanisms for different settings of
>>> CONFIG_DEFERRED_STRUCT_PAGE_INIT.
>>>
>>> node_dev_init() runs after all struct pages are already initialized and can
>>> always use pfn_to_nid().
>>
>> In the current implementation, if CONFIG_DEFERRED_STRUCT_PAGE_INIT
>> is enabled, we perform a binary search in the memblock region to
>> determine the pfn's nid. Otherwise, we use pfn_to_nid() to obtain
>> the pfn's nid.
>>
>> Your point is that we could unify this logic and always use
>> pfn_to_nid() to determine the pfn's nid, regardless of whether
>> CONFIG_DEFERRED_STRUCT_PAGE_INIT is set. Is that
>> correct?
> Yes, struct pages should be ready by the time node_dev_init() is called
> even when CONFIG_DEFERRED_STRUCT_PAGE_INIT is set.

ok.

Thanks Mike.

>   
>>> kernel_init_freeable() ->
>>> 	page_alloc_init_late(); /* completes initialization of deferred pages */
>>> 	...
>>> 	do_basic_setup() ->
>>> 		driver_init() ->
>>> 			node_dev_init();
>>>
>>> The next step could be refactoring register_mem_block_under_node_early() to
>>> loop over memblock regions rather than over pfns.
>> So it the current implementation
>>
>> node_dev_init()
>>      register_one_node
>>          register_memory_blocks_under_node
>>              walk_memory_blocks()
>>                  register_mem_block_under_node_early
>>                      get_nid_for_pfn
>>
>> We get each node's start and end PFN from the pg_data. Using these
>> values, we determine the memory block's start and end within the
>> current node. To identify the node to which these memory block
>> belongs,we iterate over each PFN in the range.
>>
>> The problem I am facing is,
>>
>> In my system node4 has a memory block ranging from memory30351
>> to memory38524, and memory128433. The memory blocks between
>> memory38524 and memory128433 do not belong to this node.
>>
>> In  walk_memory_blocks() we iterate over all memory blocks starting
>> from memory38524 to memory128433.
>> In register_mem_block_under_node_early(), up to memory38524, the
>> first pfn correctly returns the corresponding nid and the function
>> returns from there. But after memory38524 and until memory128433,
>> the loop iterates through each pfn and checks the nid. Since the nid
>> does not match the required nid, the loop continues. This causes
>> the soft lockups.
>>
>> This issue occurs only when CONFIG_DEFERRED_STRUCT_PAGE_INIT
>> is enabled, as a binary search is used to determine the PFN's nid. When
>> this configuration is disabled, pfn_to_nid is faster, and the issue does
>> not seen.( Faster because nid is getting from page)
>>
>> To speed up the code when CONFIG_DEFERRED_STRUCT_PAGE_INIT
>> is enabled, I added this function that iterates over all memblock regions
>> for each memory block to determine its nid.
>>
>> "Loop over memblock regions instead of iterating over PFNs" -
>> My question is - in register_one_node, do you mean that we should iterate
>> over all memblock regions, identify the regions belonging to the current
>> node, and then retrieve the corresponding memory blocks to register them
>> under that node?
> I looked more closely at register_mem_block_under_node_early() and
> iteration over memblock regions won't make sense there.
>
> It might make sense to use for_each_mem_range() as top level loop in
> node_dev_init(), but that's a separate topic.

Yes, this makes sense to me as well. So in your opinion, instead of adding
a new memblock search function like I added , it's better to use
|for_each_mem_range()| in|node_dev_init()|, which would work for all
cases—regardless of whether|CONFIG_DEFERRED_STRUCT_PAGE_INIT| is set or
not. Right?


>   
>> Thanks
>> Donet
>>


  reply	other threads:[~2025-04-11 11:37 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-09  5:27 [PATCH 1/2] mm/memblock: Added a New Memblock Function to Check if the Current Node's Memblock Region Intersects with a Memory Block Donet Tom
2025-04-09  5:27 ` [PATCH 2/2] base/node: Use curr_node_memblock_intersect_memory_block to Get Memory Block NID if CONFIG_DEFERRED_STRUCT_PAGE_INIT is Set Donet Tom
2025-04-10  8:07   ` Mike Rapoport
2025-04-10 18:57     ` Donet Tom
2025-04-11 10:59       ` Mike Rapoport
2025-04-11 11:36         ` Donet Tom [this message]
2025-04-15  9:46           ` Mike Rapoport
2025-04-15 10:08             ` Donet Tom
2025-04-10  2:20 ` [PATCH 1/2] mm/memblock: Added a New Memblock Function to Check if the Current Node's Memblock Region Intersects with a Memory Block Andrew Morton
2025-04-10  4:35   ` Donet Tom
2025-04-10  7:03 ` kernel test robot
2025-04-10  7:25 ` kernel test robot
2025-04-10  7:49 ` Mike Rapoport
2025-04-10 19:06   ` Donet Tom

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=736ca451-8adc-4c5c-b721-6b78eaeb4699@linux.ibm.com \
    --to=donettom@linux.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=dakr@kernel.org \
    --cc=david@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=rafael@kernel.org \
    --cc=ritesh.list@gmail.com \
    --cc=rppt@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox