From: Jinjiang Tu <tujinjiang@huawei.com>
To: Oscar Salvador <osalvador@suse.de>
Cc: <akpm@linux-foundation.org>, <muchun.song@linux.dev>,
	<david@redhat.com>, <linux-mm@kvack.org>,
	<wangkefeng.wang@huawei.com>, <sunnanyong@huawei.com>
Subject: Re: [PATCH v2] mm/hugetlb: fix surplus pages in dissolve_free_huge_page()
Date: Wed, 5 Mar 2025 11:46:15 +0800	[thread overview]
Message-ID: <932dde6f-06a0-121d-584c-256e74ebadb9@huawei.com> (raw)
In-Reply-To: <Z8cDkiEUmF30i4bl@localhost.localdomain>


On 2025/3/4 21:43, Oscar Salvador wrote:
> On Tue, Mar 04, 2025 at 09:21:06PM +0800, Jinjiang Tu wrote:
>> In dissolve_free_huge_page(), free huge pages are dissolved without
>> adjusting the surplus count. However, free huge pages may be accounted
>> as surplus pages, and dissolving them leads to a wrong surplus count.
>>
>> I reproduced this issue on qemu. The steps are:
>> 1) Node1 is memory-less at first. Hot-add memory to node1 by executing
>> the two commands in qemu monitor:
>>    object_add memory-backend-ram,id=mem1,size=1G
>>    device_add pc-dimm,id=dimm1,memdev=mem1,node=1
>> 2) online one memory block of Node1 with:
>>    echo online_movable > /sys/devices/system/node/node1/memoryX/state
>> 3) create 64 huge pages for node1
>> 4) run a program to reserve (but not consume) all the huge pages
>> 5) echo 0 > nr_hugepages for node1. After this step, the free huge
>> pages in node1 are surplus.
>> 6) create 80 huge pages for node0
>> 7) offline memory of node1. The memory range to offline contains the
>> free surplus huge pages created in steps 3) ~ 5):
>>    echo offline > /sys/devices/system/node/node1/memoryX/state
>> 8) kill the program in step 4)
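>>
>> For reference, steps 3) ~ 8) roughly correspond to the following shell
>> sketch (2MB huge pages assumed; "memoryX" and the reserving program are
>> placeholders):
>>
>>    # step 3: create 64 huge pages on node1
>>    echo 64 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
>>    # step 4: the program mmap()s all huge pages but never touches them
>>    ./reserve-hugepages &
>>    # step 5: the still-free but reserved node1 pages become surplus
>>    echo 0 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
>>    # step 6: create 80 huge pages on node0
>>    echo 80 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
>>    # step 7: offline the memory block holding the surplus pages
>>    echo offline > /sys/devices/system/node/node1/memoryX/state
>>    # step 8: kill the reserving program
>>    kill %1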
>>
>> The result:
>>             Node0     Node1
>> total       80        0
>> free        80        0
>> surplus     0         61
>>
>> To fix it, adjust the surplus count in dissolve_free_hugetlb_folio()
>> when destroying a huge page if the node has surplus pages.
>>
>> The result with this patch:
>>             Node0     Node1
>> total       80        0
>> free        80        0
>> surplus     0         0
>>
>> Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
>> Acked-by: David Hildenbrand <david@redhat.com>
>> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
> Acked-by: Oscar Salvador <osalvador@suse.de>
>
>> @@ -2157,7 +2159,9 @@ int dissolve_free_hugetlb_folio(struct folio *folio)
>>   			goto retry;
>>   		}
>>   
>> -		remove_hugetlb_folio(h, folio, false);
>> +		if (h->surplus_huge_pages_node[folio_nid(folio)])
>> +			adjust_surplus = true;
>> +		remove_hugetlb_folio(h, folio, adjust_surplus);
>>   		h->max_huge_pages--;
>>   		spin_unlock_irq(&hugetlb_lock);
>>   
>> @@ -2177,7 +2181,7 @@ int dissolve_free_hugetlb_folio(struct folio *folio)
>>   			rc = hugetlb_vmemmap_restore_folio(h, folio);
>>   			if (rc) {
>>   				spin_lock_irq(&hugetlb_lock);
>> -				add_hugetlb_folio(h, folio, false);
>> +				add_hugetlb_folio(h, folio, adjust_surplus);
> I was about to point this out, but checking v1 I saw that David had
> already done so.
> My alternative would have been to just get rid of the adjust_surplus
> boolean and do the checking right within the lock cycle, e.g.:
>
>   if (h->surplus_huge_pages_node[folio_nid(folio)])
>          add_hugetlb_folio(h, folio, true);
>   else
>          add_hugetlb_folio(h, folio, false);

It seems wrong to fix it like this. h->surplus_huge_pages_node[folio_nid(folio)] != 0
at that point means there are surplus pages left after removing the folio (the
variable named folio); it doesn't tell us whether the folio itself needs to be
treated as surplus too.
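
To illustrate with a minimal sketch of the fixed flow in
dissolve_free_hugetlb_folio() (retry logic and the success path are
omitted): the surplus decision is made once under the lock before
removal, and the error path must replay exactly that decision, because
removal itself may already have dropped the per-node counter to zero:

	bool adjust_surplus = false;

	spin_lock_irq(&hugetlb_lock);
	if (h->surplus_huge_pages_node[folio_nid(folio)])
		adjust_surplus = true;	/* decide once, here */
	remove_hugetlb_folio(h, folio, adjust_surplus);
	h->max_huge_pages--;
	spin_unlock_irq(&hugetlb_lock);

	if (hugetlb_vmemmap_restore_folio(h, folio)) {
		spin_lock_irq(&hugetlb_lock);
		/*
		 * Undo with the value decided above; re-checking
		 * h->surplus_huge_pages_node here would see the
		 * post-removal state, not whether this folio was
		 * surplus.
		 */
		add_hugetlb_folio(h, folio, adjust_surplus);
		h->max_huge_pages++;
		spin_unlock_irq(&hugetlb_lock);
	}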

> But I guess that's fine, as you already explained.


