linux-mm.kvack.org archive mirror
* [PATCH] drivers/base/node: Handle error properly in register_one_node()
@ 2025-07-02 11:28 Donet Tom
  2025-07-02 12:46 ` Oscar Salvador
  0 siblings, 1 reply; 5+ messages in thread
From: Donet Tom @ 2025-07-02 11:28 UTC (permalink / raw)
  To: David Hildenbrand, Andrew Morton, Oscar Salvador, Zi Yan,
	Greg Kroah-Hartman
  Cc: Ritesh Harjani, linux-mm, linux-kernel, Rafael J . Wysocki,
	Danilo Krummrich, Jonathan Cameron, Alison Schofield, Yury Norov,
	Dave Jiang, KAMEZAWA Hiroyuki, Donet Tom

If register_node() returns an error, it is not handled correctly.
The function will proceed further and try to register CPUs under the
node, which is not correct.

So, in this patch, if register_node() returns an error, we return
immediately from the function.

Signed-off-by: Donet Tom <donettom@linux.ibm.com>
---

This patch is based on the mm-unstable branch.

Fixes: 76b67ed9dce6 ("[PATCH] node hotplug: register cpu: remove node struct")

The issue has been present since the above commit, which is
quite old. Should I add a Fixes: tag and backport it to all
kernels that have this commit?
---
 drivers/base/node.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index bef84f01712f..aec991b4c0b2 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -885,6 +885,8 @@ int register_one_node(int nid)
 	node_devices[nid] = node;
 
 	error = register_node(node_devices[nid], nid);
+	if (error)
+		return error;
 
 	/* link cpu under this node */
 	for_each_present_cpu(cpu) {
-- 
2.47.1




* Re: [PATCH] drivers/base/node: Handle error properly in register_one_node()
  2025-07-02 11:28 [PATCH] drivers/base/node: Handle error properly in register_one_node() Donet Tom
@ 2025-07-02 12:46 ` Oscar Salvador
  2025-07-02 12:59   ` Donet Tom
  0 siblings, 1 reply; 5+ messages in thread
From: Oscar Salvador @ 2025-07-02 12:46 UTC (permalink / raw)
  To: Donet Tom
  Cc: David Hildenbrand, Andrew Morton, Zi Yan, Greg Kroah-Hartman,
	Ritesh Harjani, linux-mm, linux-kernel, Rafael J . Wysocki,
	Danilo Krummrich, Jonathan Cameron, Alison Schofield, Yury Norov,
	Dave Jiang, KAMEZAWA Hiroyuki

On Wed, Jul 02, 2025 at 06:28:56AM -0500, Donet Tom wrote:
> If register_node() returns an error, it is not handled correctly.
> The function will proceed further and try to register CPUs under the
> node, which is not correct.
> 
> So, in this patch, if register_node() returns an error, we return
> immediately from the function.
> 
> Signed-off-by: Donet Tom <donettom@linux.ibm.com>
> ---
> 
... 
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index bef84f01712f..aec991b4c0b2 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -885,6 +885,8 @@ int register_one_node(int nid)
>  	node_devices[nid] = node;
>  
>  	error = register_node(node_devices[nid], nid);
> +	if (error)
> +		return error;

Ok, all current callers (based on mm-unstable) panic or BUG() if this fails,
except powerpc's init_phb_dynamic(), which keeps going
(unless it panics somewhere down the road as well).

So I think we need to: 

 node_devices[nid] = NULL;
 kfree(node);

 ?

Also, once Hannes' fix lands, we might need that as well.

Anyway, I'd suggest you hold off until Hannes' fix lands, so we can later
rebase all your mem-hotplug work on top of it [1].

[1] https://lore.kernel.org/linux-mm/86f89a65-f0f6-4462-9eea-ac691de2f3b6@suse.de/T/#mbf392eb390b8053f96be50da3b40dfd9b62dd389


-- 
Oscar Salvador
SUSE Labs
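
For readers following the thread, the cleanup suggested above can be sketched in userspace C. This is only an illustration under stated assumptions, not the patch that was eventually applied: `register_node_stub()`, `MAX_NODES`, and `cpus_linked` are hypothetical stand-ins, and `calloc()`/`free()` stand in for the kernel allocator. Whether the real code may free the node directly depends on whether register_node() already releases it on failure. The thread does not settle that, and the sketch assumes it does not.

```c
#include <assert.h>	/* used only by the usage checks, not by the sketch */
#include <errno.h>
#include <stdlib.h>

#define MAX_NODES 4	/* hypothetical bound, for illustration only */

struct node {
	int id;
};

static struct node *node_devices[MAX_NODES];

/* Counts how often the "link cpu under this node" step was reached. */
static int cpus_linked;

/* Hypothetical stand-in for register_node(): fails for odd node ids. */
static int register_node_stub(struct node *node, int nid)
{
	node->id = nid;
	return (nid % 2) ? -ENOMEM : 0;
}

static int register_one_node_sketch(int nid)
{
	struct node *node;
	int error;

	if (nid < 0 || nid >= MAX_NODES)
		return -EINVAL;

	node = calloc(1, sizeof(*node));
	if (!node)
		return -ENOMEM;

	node_devices[nid] = node;

	error = register_node_stub(node_devices[nid], nid);
	if (error) {
		/*
		 * Undo the bookkeeping so no stale pointer is left behind.
		 * Assumption: the failed registration did not free the node.
		 */
		node_devices[nid] = NULL;
		free(node);
		return error;
	}

	/* Stand-in for linking CPUs; reached only on success. */
	cpus_linked++;
	return 0;
}
```

Calling `register_one_node_sketch(1)` returns `-ENOMEM`, leaves `node_devices[1]` as NULL, and never reaches the CPU-linking step. Node 0 registers normally.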



* Re: [PATCH] drivers/base/node: Handle error properly in register_one_node()
  2025-07-02 12:46 ` Oscar Salvador
@ 2025-07-02 12:59   ` Donet Tom
  2025-07-04 12:29     ` David Hildenbrand
  0 siblings, 1 reply; 5+ messages in thread
From: Donet Tom @ 2025-07-02 12:59 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: David Hildenbrand, Andrew Morton, Zi Yan, Greg Kroah-Hartman,
	Ritesh Harjani, linux-mm, linux-kernel, Rafael J . Wysocki,
	Danilo Krummrich, Jonathan Cameron, Alison Schofield, Yury Norov,
	Dave Jiang, KAMEZAWA Hiroyuki


On 7/2/25 6:16 PM, Oscar Salvador wrote:
> On Wed, Jul 02, 2025 at 06:28:56AM -0500, Donet Tom wrote:
>> If register_node() returns an error, it is not handled correctly.
>> The function will proceed further and try to register CPUs under the
>> node, which is not correct.
>>
>> So, in this patch, if register_node() returns an error, we return
>> immediately from the function.
>>
>> Signed-off-by: Donet Tom <donettom@linux.ibm.com>
>> ---
>>
> ...
>> diff --git a/drivers/base/node.c b/drivers/base/node.c
>> index bef84f01712f..aec991b4c0b2 100644
>> --- a/drivers/base/node.c
>> +++ b/drivers/base/node.c
>> @@ -885,6 +885,8 @@ int register_one_node(int nid)
>>   	node_devices[nid] = node;
>>   
>>   	error = register_node(node_devices[nid], nid);
>> +	if (error)
>> +		return error;
> Ok, all current callers (based on mm-unstable) panic or BUG() if this fails,
> but powerpc, in init_phb_dynamic(), which keeps on going.
> Unless it panics somewhere down the road as well.
>
> So I think we need to:
>
>   node_devices[nid] = NULL
>   kfree(node)
>
>   ?


Yes, I will add this too.

But one question: if register_node() fails, is it okay to continue, or 
should we panic?

What is the correct way to handle this?


> Also, once Hannes fix lands, we might need that as well.
>
> Anyway, I'd suggest you hold on until Hannes fix lands, so we can later
> rebase all your mem-hotplug on top of that [1].

Sure


>
> [1] https://lore.kernel.org/linux-mm/86f89a65-f0f6-4462-9eea-ac691de2f3b6@suse.de/T/#mbf392eb390b8053f96be50da3b40dfd9b62dd389
>
>



* Re: [PATCH] drivers/base/node: Handle error properly in register_one_node()
  2025-07-02 12:59   ` Donet Tom
@ 2025-07-04 12:29     ` David Hildenbrand
  2025-07-07  4:01       ` Donet Tom
  0 siblings, 1 reply; 5+ messages in thread
From: David Hildenbrand @ 2025-07-04 12:29 UTC (permalink / raw)
  To: Donet Tom, Oscar Salvador
  Cc: Andrew Morton, Zi Yan, Greg Kroah-Hartman, Ritesh Harjani,
	linux-mm, linux-kernel, Rafael J . Wysocki, Danilo Krummrich,
	Jonathan Cameron, Alison Schofield, Yury Norov, Dave Jiang,
	KAMEZAWA Hiroyuki

On 02.07.25 14:59, Donet Tom wrote:
> 
> On 7/2/25 6:16 PM, Oscar Salvador wrote:
>> On Wed, Jul 02, 2025 at 06:28:56AM -0500, Donet Tom wrote:
>>> If register_node() returns an error, it is not handled correctly.
>>> The function will proceed further and try to register CPUs under the
>>> node, which is not correct.
>>>
>>> So, in this patch, if register_node() returns an error, we return
>>> immediately from the function.
>>>
>>> Signed-off-by: Donet Tom <donettom@linux.ibm.com>
>>> ---
>>>
>> ...
>>> diff --git a/drivers/base/node.c b/drivers/base/node.c
>>> index bef84f01712f..aec991b4c0b2 100644
>>> --- a/drivers/base/node.c
>>> +++ b/drivers/base/node.c
>>> @@ -885,6 +885,8 @@ int register_one_node(int nid)
>>>    	node_devices[nid] = node;
>>>    
>>>    	error = register_node(node_devices[nid], nid);
>>> +	if (error)
>>> +		return error;
>> Ok, all current callers (based on mm-unstable) panic or BUG() if this fails,
>> but powerpc, in init_phb_dynamic(), which keeps on going.
>> Unless it panics somewhere down the road as well.
>>
>> So I think we need to:
>>
>>    node_devices[nid] = NULL
>>    kfree(node)
>>
>>    ?
> 
> 
> Yes, I will add this too.
> 
> But one question: if register_node() fails, is it okay to continue, or
> should we panic?
> 
> What is the correct way to handle this?

panic() or BUG() is not the answer :)

Try to recover ...

-- 
Cheers,

David / dhildenb




* Re: [PATCH] drivers/base/node: Handle error properly in register_one_node()
  2025-07-04 12:29     ` David Hildenbrand
@ 2025-07-07  4:01       ` Donet Tom
  0 siblings, 0 replies; 5+ messages in thread
From: Donet Tom @ 2025-07-07  4:01 UTC (permalink / raw)
  To: David Hildenbrand, Oscar Salvador
  Cc: Andrew Morton, Zi Yan, Greg Kroah-Hartman, Ritesh Harjani,
	linux-mm, linux-kernel, Rafael J . Wysocki, Danilo Krummrich,
	Jonathan Cameron, Alison Schofield, Yury Norov, Dave Jiang,
	KAMEZAWA Hiroyuki


On 7/4/25 5:59 PM, David Hildenbrand wrote:
> On 02.07.25 14:59, Donet Tom wrote:
>>
>> On 7/2/25 6:16 PM, Oscar Salvador wrote:
>>> On Wed, Jul 02, 2025 at 06:28:56AM -0500, Donet Tom wrote:
>>>> If register_node() returns an error, it is not handled correctly.
>>>> The function will proceed further and try to register CPUs under the
>>>> node, which is not correct.
>>>>
>>>> So, in this patch, if register_node() returns an error, we return
>>>> immediately from the function.
>>>>
>>>> Signed-off-by: Donet Tom <donettom@linux.ibm.com>
>>>> ---
>>>>
>>> ...
>>>> diff --git a/drivers/base/node.c b/drivers/base/node.c
>>>> index bef84f01712f..aec991b4c0b2 100644
>>>> --- a/drivers/base/node.c
>>>> +++ b/drivers/base/node.c
>>>> @@ -885,6 +885,8 @@ int register_one_node(int nid)
>>>>  	node_devices[nid] = node;
>>>>  
>>>>  	error = register_node(node_devices[nid], nid);
>>>> +	if (error)
>>>> +		return error;
>>> Ok, all current callers (based on mm-unstable) panic or BUG() if this fails,
>>> but powerpc, in init_phb_dynamic(), which keeps on going.
>>> Unless it panics somewhere down the road as well.
>>>
>>> So I think we need to:
>>>
>>>    node_devices[nid] = NULL
>>>    kfree(node)
>>>
>>>    ?
>>
>>
>> Yes, I will add this too.
>>
>> But one question: if register_node() fails, is it okay to continue, or
>> should we panic?
>>
>> What is the correct way to handle this?
>
> panic() or BUG() is not the answer :)
>
> Try to recover ...

Got it, thank you very much, David.
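
As a rough illustration of "try to recover" on the caller side, the sketch below shows a caller that logs a failed registration and carries on with the remaining nodes instead of calling panic() or BUG(). Everything here is hypothetical: `register_one_node_stub()` and `bring_up_nodes()` are invented for the example and do not correspond to any caller in the kernel tree.

```c
#include <assert.h>	/* used only by the usage checks, not by the sketch */
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for register_one_node(): node 2 fails. */
static int register_one_node_stub(int nid)
{
	return nid == 2 ? -ENOMEM : 0;
}

/*
 * Hypothetical caller: try to register every node in nids[], skip the
 * ones that fail, and report how many succeeded and how many failed.
 */
static int bring_up_nodes(const int *nids, int count, int *failed)
{
	int online = 0;

	*failed = 0;
	for (int i = 0; i < count; i++) {
		int error = register_one_node_stub(nids[i]);

		if (error) {
			/* Recover: report the failure and keep going. */
			fprintf(stderr, "node %d: registration failed (%d), skipping\n",
				nids[i], error);
			(*failed)++;
			continue;
		}
		online++;
	}
	return online;
}
```

With nodes 0 through 3, this brings up three nodes and reports one failure; the error surfaces in the return values rather than taking the whole system down.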




