From: Jinjiang Tu <tujinjiang@huawei.com>
To: Oscar Salvador <osalvador@suse.de>
Cc: <akpm@linux-foundation.org>, <muchun.song@linux.dev>,
<david@redhat.com>, <linux-mm@kvack.org>,
<wangkefeng.wang@huawei.com>, <sunnanyong@huawei.com>
Subject: Re: [PATCH v2] mm/hugetlb: fix surplus pages in dissolve_free_huge_page()
Date: Wed, 5 Mar 2025 11:46:15 +0800
Message-ID: <932dde6f-06a0-121d-584c-256e74ebadb9@huawei.com>
In-Reply-To: <Z8cDkiEUmF30i4bl@localhost.localdomain>
On 2025/3/4 21:43, Oscar Salvador wrote:
> On Tue, Mar 04, 2025 at 09:21:06PM +0800, Jinjiang Tu wrote:
>> In dissolve_free_huge_page(), free huge pages are dissolved without
>> adjusting the surplus count. However, free huge pages may be accounted
>> as surplus pages, which leads to a wrong surplus count.
>>
>> I reproduced this issue on qemu. The steps are:
>> 1) Node1 is memory-less at first. Hot-add memory to node1 by executing
>>    the two commands in qemu monitor:
>>      object_add memory-backend-ram,id=mem1,size=1G
>>      device_add pc-dimm,id=dimm1,memdev=mem1,node=1
>> 2) Online one memory block of node1 with:
>>      echo online_movable > /sys/devices/system/node/node1/memoryX/state
>> 3) Create 64 huge pages for node1.
>> 4) Run a program to reserve (don't consume) all the huge pages.
>> 5) echo 0 > nr_huge_pages for node1. After this step, free huge pages in
>>    node1 are surplus.
>> 6) Create 80 huge pages for node0.
>> 7) Offline memory of node1. The memory range to offline contains the free
>>    surplus huge pages created in step 3) ~ step 5):
>>      echo offline > /sys/devices/system/node/node1/memoryX/state
>> 8) Kill the program in step 4).
>>
>> The result:
>>            Node0    Node1
>> total         80        0
>> free          80        0
>> surplus        0       61
>>
>> To fix it, adjust surplus when destroying huge pages if the node has
>> surplus pages in dissolve_free_hugetlb_folio().
>>
>> The result with this patch:
>>            Node0    Node1
>> total         80        0
>> free          80        0
>> surplus        0        0
>>
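To illustrate the accounting problem described in the commit message, here
is a minimal userspace sketch in C. It is only a toy model under stated
assumptions: struct toy_node, toy_remove_free_page() and toy_dissolve_all()
are hypothetical names, not kernel code, and the page counts are small
made-up numbers rather than the 64 pages from the reproducer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified per-node counters; not the kernel's struct hstate. */
struct toy_node {
	unsigned long nr_huge_pages;      /* total huge pages on the node */
	unsigned long free_huge_pages;    /* free huge pages on the node */
	unsigned long surplus_huge_pages; /* pages accounted as surplus */
};

/* Toy stand-in for removing one free huge page from the pool. */
static void toy_remove_free_page(struct toy_node *n, bool adjust_surplus)
{
	n->nr_huge_pages--;
	n->free_huge_pages--;
	if (adjust_surplus)
		n->surplus_huge_pages--;
}

/*
 * Dissolve every free page on the node and return the surplus count left
 * behind. 'fixed' selects the patched behaviour: adjust the surplus count
 * whenever the node still has surplus pages.
 */
static unsigned long toy_dissolve_all(struct toy_node *n, bool fixed)
{
	while (n->free_huge_pages) {
		bool adjust_surplus = fixed && n->surplus_huge_pages;

		toy_remove_free_page(n, adjust_surplus);
	}
	return n->surplus_huge_pages;
}
```

Starting from a node with three free pages that are all surplus,
toy_dissolve_all(&node, false) leaves a stale surplus count of 3 with no
pages left on the node, while toy_dissolve_all(&node, true) brings the
surplus count back to 0.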
>> Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
>> Acked-by: David Hildenbrand <david@redhat.com>
>> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
> Acked-by: Oscar Salvador <osalvador@suse.de>
>
>> @@ -2157,7 +2159,9 @@ int dissolve_free_hugetlb_folio(struct folio *folio)
>>  			goto retry;
>>  		}
>>
>> -		remove_hugetlb_folio(h, folio, false);
>> +		if (h->surplus_huge_pages_node[folio_nid(folio)])
>> +			adjust_surplus = true;
>> +		remove_hugetlb_folio(h, folio, adjust_surplus);
>>  		h->max_huge_pages--;
>>  		spin_unlock_irq(&hugetlb_lock);
>>
>> @@ -2177,7 +2181,7 @@ int dissolve_free_hugetlb_folio(struct folio *folio)
>>  		rc = hugetlb_vmemmap_restore_folio(h, folio);
>>  		if (rc) {
>>  			spin_lock_irq(&hugetlb_lock);
>> -			add_hugetlb_folio(h, folio, false);
>> +			add_hugetlb_folio(h, folio, adjust_surplus);
> I was about to point this out, but checking v1 I saw that David already
> did that.
> My alternative would have been to just get rid of the adjust_surplus
> boolean and do the checking right within the lock cycle, e.g.:
>
> 	if (h->surplus_huge_pages_node[folio_nid(folio)])
> 		add_hugetlb_folio(h, folio, true);
> 	else
> 		add_hugetlb_folio(h, folio, false);
Fixing it this way seems wrong. At that point,
h->surplus_huge_pages_node[folio_nid(folio)] != 0 only means that the node
still has surplus pages after the folio (the variable named folio) has been
removed. It does not tell us whether this folio itself needs to be treated
as surplus when it is added back.
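
A toy userspace sketch of why this matters on the rollback path. All names
here (struct toy_node, toy_remove(), toy_add(), toy_rollback_saved(),
toy_rollback_recheck()) are hypothetical and only model the counters, not
the real hugetlb code. The interesting case is assumed to be a node whose
only surplus page is the folio being dissolved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified per-node counters; not the kernel's struct hstate. */
struct toy_node {
	unsigned long nr_huge_pages;
	unsigned long surplus_huge_pages;
};

/* Toy stand-in for taking a folio out of the pool. */
static void toy_remove(struct toy_node *n, bool adjust_surplus)
{
	n->nr_huge_pages--;
	if (adjust_surplus)
		n->surplus_huge_pages--;
}

/* Toy stand-in for putting a folio back into the pool. */
static void toy_add(struct toy_node *n, bool adjust_surplus)
{
	n->nr_huge_pages++;
	if (adjust_surplus)
		n->surplus_huge_pages++;
}

/* Decide once before the removal and reuse that decision on rollback. */
static unsigned long toy_rollback_saved(struct toy_node *n)
{
	bool adjust_surplus = n->surplus_huge_pages != 0;

	toy_remove(n, adjust_surplus);
	toy_add(n, adjust_surplus);	/* simulated failure path */
	return n->surplus_huge_pages;
}

/* Re-read the counter at rollback time, after the removal changed it. */
static unsigned long toy_rollback_recheck(struct toy_node *n)
{
	toy_remove(n, n->surplus_huge_pages != 0);
	toy_add(n, n->surplus_huge_pages != 0);	/* simulated failure path */
	return n->surplus_huge_pages;
}
```

With one surplus page on the node, toy_rollback_saved() restores the
surplus count to 1, while toy_rollback_recheck() leaves it at 0, because
the counter had already dropped to zero when the folio was removed. With
two surplus pages both variants happen to agree, which is why the recheck
looks harmless at first glance.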
> But I guess that's fine as you already explained.
>
>
>