* [PATCH] mm/hugetlb: prevent reuse of isolated free hugepages
@ 2025-01-10 2:56 yangge1116
2025-01-10 8:14 ` David Hildenbrand
0 siblings, 1 reply; 5+ messages in thread
From: yangge1116 @ 2025-01-10 2:56 UTC
To: akpm
Cc: linux-mm, linux-kernel, 21cnbao, david, baolin.wang, liuzixing, yangge
From: yangge <yangge1116@126.com>
When there are free hugetlb folios in the hugetlb pool, during the
migration of in-use hugetlb folios, new folios are allocated from
the free hugetlb pool. After the migration is completed, the old
folios are released back to the free hugetlb pool. However, after
the old folios are released to the free hugetlb pool, they may be
reallocated. When replace_free_hugepage_folios() is executed later,
it cannot release these old folios back to the buddy system.
As discussed with David in [1], when alloc_contig_range() is used
to migrate multiple in-use hugetlb pages, it can lead to the issue
described above. For example:
[huge 0] [huge 1]
To migrate huge 0, we obtain huge x from the pool. After the migration
is completed, we return the now-freed huge 0 back to the pool. When
it's time to migrate huge 1, we can simply reuse the now-freed huge 0
from the pool. As a result, when replace_free_hugepage_folios() is
executed, it cannot release huge 0 back to the buddy system.
To slove the proble above, we should prevent reuse of isolated free
hugepages.
Link: https://lore.kernel.org/lkml/1734503588-16254-1-git-send-email-yangge1116@126.com/
Fixes: 08d312ee4c0a ("mm: replace free hugepage folios after migration")
Signed-off-by: yangge <yangge1116@126.com>
---
mm/hugetlb.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9a55960..e5f9999 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -48,6 +48,7 @@
 #include <linux/page_owner.h>
 #include "internal.h"
 #include "hugetlb_vmemmap.h"
+#include <linux/page-isolation.h>

 int hugetlb_max_hstate __read_mostly;
 unsigned int default_hstate_idx;
@@ -1273,6 +1274,9 @@ static struct folio *dequeue_hugetlb_folio_node_exact(struct hstate *h,
 		if (folio_test_hwpoison(folio))
 			continue;

+		if (is_migrate_isolate_page(&folio->page))
+			continue;
+
 		list_move(&folio->lru, &h->hugepage_activelist);
 		folio_ref_unfreeze(folio, 1);
 		folio_clear_hugetlb_freed(folio);
--
2.7.4
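
The reuse problem described in the changelog can be reproduced in a small user-space model. The sketch below is not kernel code: `struct toy_hugepage`, `toy_dequeue()` and `toy_migrate_range()` are hypothetical names invented for illustration, and the pool is a plain array scanned in index order rather than the kernel's per-node free lists. It only shows why skipping isolated pages at dequeue time keeps both pages of the range [huge 0] [huge 1] free after migration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a hugetlb folio; NOT the kernel's struct folio. */
struct toy_hugepage {
	int id;
	bool free;      /* currently sitting in the free pool */
	bool isolated;  /* lies inside the range being isolated */
};

/*
 * Take the first free page from the pool. With skip_isolated set, pages
 * inside the isolated range are passed over, which models the check the
 * patch adds to the dequeue path.
 */
static struct toy_hugepage *toy_dequeue(struct toy_hugepage *pool, size_t n,
					bool skip_isolated)
{
	for (size_t i = 0; i < n; i++) {
		if (!pool[i].free)
			continue;
		if (skip_isolated && pool[i].isolated)
			continue;
		pool[i].free = false;
		return &pool[i];
	}
	return NULL;
}

/*
 * Migrate both in-use pages of the isolated range and return how many
 * pages of that range are free afterwards. Only a result of 2 means the
 * whole range could be handed back to the buddy allocator.
 */
static int toy_migrate_range(bool skip_isolated)
{
	/* ids 0 and 1 form the target range; ids 2 and 3 are spare pages */
	struct toy_hugepage pool[] = {
		{ .id = 0, .free = false, .isolated = true },
		{ .id = 1, .free = false, .isolated = true },
		{ .id = 2, .free = true,  .isolated = false },
		{ .id = 3, .free = true,  .isolated = false },
	};
	size_t n = sizeof(pool) / sizeof(pool[0]);
	int free_in_range = 0;

	for (size_t src = 0; src < 2; src++) {
		struct toy_hugepage *dst = toy_dequeue(pool, n, skip_isolated);

		assert(dst != NULL);   /* the toy pool always has a spare page */
		pool[src].free = true; /* the old page returns to the pool */
	}

	for (size_t i = 0; i < 2; i++)
		free_in_range += pool[i].free;
	return free_in_range;
}
```

In this model, `toy_migrate_range(false)` leaves only one page of the range free, because huge 0 is handed out again as the destination for huge 1, while `toy_migrate_range(true)` leaves both free.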
* Re: [PATCH] mm/hugetlb: prevent reuse of isolated free hugepages
2025-01-10 2:56 [PATCH] mm/hugetlb: prevent reuse of isolated free hugepages yangge1116
@ 2025-01-10 8:14 ` David Hildenbrand
2025-01-11 1:45 ` Andrew Morton
2025-01-11 8:00 ` Ge Yang
0 siblings, 2 replies; 5+ messages in thread
From: David Hildenbrand @ 2025-01-10 8:14 UTC
To: yangge1116, akpm; +Cc: linux-mm, linux-kernel, 21cnbao, baolin.wang, liuzixing
On 10.01.25 03:56, yangge1116@126.com wrote:
> From: yangge <yangge1116@126.com>
>
> When there are free hugetlb folios in the hugetlb pool, during the
> migration of in-use hugetlb folios, new folios is allocated from
> the free hugetlb pool. After the migration is completed, the old
> folios are released back to the free hugetlb pool. However, after
> the old folios are released to the free hugetlb pool, they may be
> reallocated. When replace_free_hugepage_folios() is executed later,
> it cannot release these old folios back to the buddy system.
>
> As discussed with David in [1], when alloc_contig_range() is used
> to migrate multiple in-use hugetlb pages, it can lead to the issue
> described above. For example:
>
> [huge 0] [huge 1]
>
> To migrate huge 0, we obtain huge x from the pool. After the migration
> is completed, we return the now-freed huge 0 back to the pool. When
> it's time to migrate huge 1, we can simply reuse the now-freed huge 0
> from the pool. As a result, when replace_free_hugepage_folios() is
> executed, it cannot release huge 0 back to the buddy system.
>
> To slove the proble above, we should prevent reuse of isolated free
> hugepages.
s/slove/solve/
s/proble/problem/
>
> Link: https://lore.kernel.org/lkml/1734503588-16254-1-git-send-email-yangge1116@126.com/
> Fixes: 08d312ee4c0a ("mm: replace free hugepage folios after migration")
The commit id is not stable yet.
$ git tag --contains 08d312ee4c0a
mm-everything-2025-01-09-06-44
next-20250110
We should squash this into the original fix. Can you resend the whole
thing and merge the patch descriptions?
> Signed-off-by: yangge <yangge1116@126.com>
> ---
> mm/hugetlb.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 9a55960..e5f9999 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -48,6 +48,7 @@
>  #include <linux/page_owner.h>
>  #include "internal.h"
>  #include "hugetlb_vmemmap.h"
> +#include <linux/page-isolation.h>
>
>  int hugetlb_max_hstate __read_mostly;
>  unsigned int default_hstate_idx;
> @@ -1273,6 +1274,9 @@ static struct folio *dequeue_hugetlb_folio_node_exact(struct hstate *h,
>  		if (folio_test_hwpoison(folio))
>  			continue;
>
> +		if (is_migrate_isolate_page(&folio->page))
> +			continue;
> +
>  		list_move(&folio->lru, &h->hugepage_activelist);
>  		folio_ref_unfreeze(folio, 1);
>  		folio_clear_hugetlb_freed(folio);
Sorry for not getting back to your previous mail, this week was a bit crazy.
This will work reliably if the huge page does not span more than a
single page block.
Assuming it would span multiple ones, we might have only isolated the
last etc. pageblock. For the common cases it might do, but not for all
cases unfortunately (especially not gigantic pages, but I recall we skip
them during alloc_contig_pages(); I recall some oddities on ppc even
without gigantic pages involved).
One option would be to stare at all involved pageblocks, although a bit
nasty ... let me think about this.
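
To make the concern above concrete, here is a rough user-space sketch of the difference between looking only at the pageblock of the first page and looking at every pageblock a huge page overlaps. Everything in it is an assumption made for illustration: `TOY_PAGEBLOCK_NR_PAGES`, `toy_head_block_isolated()` and `toy_any_block_isolated()` are hypothetical names, the isolation state is a plain bool array indexed by pageblock number, and nothing here claims to be what the kernel ended up doing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed pageblock size in base pages; the real value is arch/config dependent. */
#define TOY_PAGEBLOCK_NR_PAGES 512UL

/* Look only at the pageblock that holds the first page of the huge page. */
static bool toy_head_block_isolated(const bool *block_isolated,
				    unsigned long start_pfn)
{
	return block_isolated[start_pfn / TOY_PAGEBLOCK_NR_PAGES];
}

/* Look at every pageblock the huge page overlaps. */
static bool toy_any_block_isolated(const bool *block_isolated,
				   unsigned long start_pfn,
				   unsigned long nr_pages)
{
	unsigned long first, last;

	assert(nr_pages > 0);
	first = start_pfn / TOY_PAGEBLOCK_NR_PAGES;
	last = (start_pfn + nr_pages - 1) / TOY_PAGEBLOCK_NR_PAGES;

	for (unsigned long b = first; b <= last; b++)
		if (block_isolated[b])
			return true;
	return false;
}
```

With a huge page of 1536 base pages starting at pfn 0 and only the third pageblock isolated, the head-only check reports "not isolated" while the full scan reports "isolated", which is the case where a single-pageblock check would miss the page.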
--
Cheers,
David / dhildenb
* Re: [PATCH] mm/hugetlb: prevent reuse of isolated free hugepages
2025-01-10 8:14 ` David Hildenbrand
@ 2025-01-11 1:45 ` Andrew Morton
2025-01-11 8:00 ` Ge Yang
2025-01-11 8:00 ` Ge Yang
1 sibling, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2025-01-11 1:45 UTC
To: David Hildenbrand
Cc: yangge1116, linux-mm, linux-kernel, 21cnbao, baolin.wang, liuzixing
On Fri, 10 Jan 2025 09:14:47 +0100 David Hildenbrand <david@redhat.com> wrote:
> > To migrate huge 0, we obtain huge x from the pool. After the migration
> > is completed, we return the now-freed huge 0 back to the pool. When
> > it's time to migrate huge 1, we can simply reuse the now-freed huge 0
> > from the pool. As a result, when replace_free_hugepage_folios() is
> > executed, it cannot release huge 0 back to the buddy system.
> >
> > To slove the proble above, we should prevent reuse of isolated free
> > hugepages.
>
> s/slove/solve/
> s/proble/problem/
>
> >
> > Link: https://lore.kernel.org/lkml/1734503588-16254-1-git-send-email-yangge1116@126.com/
> > Fixes: 08d312ee4c0a ("mm: replace free hugepage folios after migration")
>
> The commit id is not stable yet.
>
> $ git tag --contains 08d312ee4c0a
> mm-everything-2025-01-09-06-44
> next-20250110
>
>
> We should squash this into the original fix. Can you resend the whole
> thing and merge the patch descriptions?
I queued this as
replace-free-hugepage-folios-after-migration-fix-3.patch. But yes, a
clean v4 with a redone changelog would be a nice thing to have.
* Re: [PATCH] mm/hugetlb: prevent reuse of isolated free hugepages
2025-01-10 8:14 ` David Hildenbrand
2025-01-11 1:45 ` Andrew Morton
@ 2025-01-11 8:00 ` Ge Yang
1 sibling, 0 replies; 5+ messages in thread
From: Ge Yang @ 2025-01-11 8:00 UTC
To: David Hildenbrand, akpm
Cc: linux-mm, linux-kernel, 21cnbao, baolin.wang, liuzixing
On 2025/1/10 16:14, David Hildenbrand wrote:
> On 10.01.25 03:56, yangge1116@126.com wrote:
>> From: yangge <yangge1116@126.com>
>>
>> When there are free hugetlb folios in the hugetlb pool, during the
>> migration of in-use hugetlb folios, new folios is allocated from
>> the free hugetlb pool. After the migration is completed, the old
>> folios are released back to the free hugetlb pool. However, after
>> the old folios are released to the free hugetlb pool, they may be
>> reallocated. When replace_free_hugepage_folios() is executed later,
>> it cannot release these old folios back to the buddy system.
>>
>> As discussed with David in [1], when alloc_contig_range() is used
>> to migrate multiple in-use hugetlb pages, it can lead to the issue
>> described above. For example:
>>
>> [huge 0] [huge 1]
>>
>> To migrate huge 0, we obtain huge x from the pool. After the migration
>> is completed, we return the now-freed huge 0 back to the pool. When
>> it's time to migrate huge 1, we can simply reuse the now-freed huge 0
>> from the pool. As a result, when replace_free_hugepage_folios() is
>> executed, it cannot release huge 0 back to the buddy system.
>>
>> To slove the proble above, we should prevent reuse of isolated free
>> hugepages.
>
> s/slove/solve/
> s/proble/problem/
>
>>
>> Link: https://lore.kernel.org/lkml/1734503588-16254-1-git-send-email-yangge1116@126.com/
>> Fixes: 08d312ee4c0a ("mm: replace free hugepage folios after migration")
>
> The commit id is not stable yet.
>
> $ git tag --contains 08d312ee4c0a
> mm-everything-2025-01-09-06-44
> next-20250110
>
>
> We should squash this into the original fix. Can you resend the whole
> thing and merge the patch descriptions?
>
ok, thanks.
>> Signed-off-by: yangge <yangge1116@126.com>
>> ---
>> mm/hugetlb.c | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 9a55960..e5f9999 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -48,6 +48,7 @@
>>  #include <linux/page_owner.h>
>>  #include "internal.h"
>>  #include "hugetlb_vmemmap.h"
>> +#include <linux/page-isolation.h>
>>
>>  int hugetlb_max_hstate __read_mostly;
>>  unsigned int default_hstate_idx;
>> @@ -1273,6 +1274,9 @@ static struct folio *dequeue_hugetlb_folio_node_exact(struct hstate *h,
>>  		if (folio_test_hwpoison(folio))
>>  			continue;
>>
>> +		if (is_migrate_isolate_page(&folio->page))
>> +			continue;
>> +
>>  		list_move(&folio->lru, &h->hugepage_activelist);
>>  		folio_ref_unfreeze(folio, 1);
>>  		folio_clear_hugetlb_freed(folio);
>
> Sorry for not getting back to your previous mail, this week was a bit
> crazy.
>
> This will work reliably if the huge page does not span more than a
> single page block.
>
> Assuming it would span multiple ones, we might have only isolated the
> last etc. pageblock. For the common cases it might do, but not for all
> cases unfortunately (especially not gigantic pages, but I recall we skip
> them during alloc_contig_pages(); I recall some oddities on ppc even
> without gigantic pages involved).
>
> One option would be to stare at all involved pageblocks, although a bit
> nasty ... let me think about this.
>
* Re: [PATCH] mm/hugetlb: prevent reuse of isolated free hugepages
2025-01-11 1:45 ` Andrew Morton
@ 2025-01-11 8:00 ` Ge Yang
0 siblings, 0 replies; 5+ messages in thread
From: Ge Yang @ 2025-01-11 8:00 UTC
To: Andrew Morton, David Hildenbrand
Cc: linux-mm, linux-kernel, 21cnbao, baolin.wang, liuzixing
On 2025/1/11 9:45, Andrew Morton wrote:
> On Fri, 10 Jan 2025 09:14:47 +0100 David Hildenbrand <david@redhat.com> wrote:
>
>>> To migrate huge 0, we obtain huge x from the pool. After the migration
>>> is completed, we return the now-freed huge 0 back to the pool. When
>>> it's time to migrate huge 1, we can simply reuse the now-freed huge 0
>>> from the pool. As a result, when replace_free_hugepage_folios() is
>>> executed, it cannot release huge 0 back to the buddy system.
>>>
>>> To slove the proble above, we should prevent reuse of isolated free
>>> hugepages.
>>
>> s/slove/solve/
>> s/proble/problem/
>>
>>>
>>> Link: https://lore.kernel.org/lkml/1734503588-16254-1-git-send-email-yangge1116@126.com/
>>> Fixes: 08d312ee4c0a ("mm: replace free hugepage folios after migration")
>>
>> The commit id is not stable yet.
>>
>> $ git tag --contains 08d312ee4c0a
>> mm-everything-2025-01-09-06-44
>> next-20250110
>>
>>
>> We should squash this into the original fix. Can you resend the whole
>> thing and merge the patch descriptions?
>
> I queued this as
> replace-free-hugepage-folios-after-migration-fix-3.patch. But yes, a
> clean v4 with a redone changelog would be a nice thing to have.
ok, thanks.