* [PATCH] mm/khugepaged: Do not fail collapse_pte_mapped_thp() on SCAN_PMD_NULL
@ 2025-09-12 16:58 Kiryl Shutsemau
2025-09-12 17:13 ` David Hildenbrand
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Kiryl Shutsemau @ 2025-09-12 16:58 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes
Cc: Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts,
Dev Jain, Barry Song, linux-mm, linux-kernel
From: Kiryl Shutsemau <kas@kernel.org>
MADV_COLLAPSE on a file mapping behaves inconsistently depending on if
PMD page table is installed or not.
Consider following example:
p = mmap(NULL, 2UL << 20, PROT_READ | PROT_WRITE,
MAP_SHARED, fd, 0);
err = madvise(p, 2UL << 20, MADV_COLLAPSE);
fd is a populated tmpfs file.
The result depends on the address that the kernel returns on mmap().
If it is located in an existing PMD table, the madvise() will succeed.
However, if the table does not exist, it will fail with -EINVAL.
This occurs because find_pmd_or_thp_or_none() returns SCAN_PMD_NULL when
a page table is missing, which causes collapse_pte_mapped_thp() to fail.
SCAN_PMD_NULL and SCAN_PMD_NONE should be treated the same in
collapse_pte_mapped_thp(): install the PMD leaf entry and allocate page
tables as needed.
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
---
mm/khugepaged.c | 25 ++++++++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b486c1d19b2d..9e76a4f46df9 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1488,6 +1488,28 @@ static int set_huge_pmd(struct vm_area_struct *vma, unsigned long addr,
return SCAN_SUCCEED;
}
+static int install_huge_pmd(struct vm_area_struct *vma, unsigned long haddr,
+ pmd_t *pmd, struct folio *folio)
+{
+ struct mm_struct *mm = vma->vm_mm;
+ pgd_t *pgd;
+ p4d_t *p4d;
+ pud_t *pud;
+
+ pgd = pgd_offset(mm, haddr);
+ p4d = p4d_alloc(mm, pgd, haddr);
+ if (!p4d)
+ return SCAN_FAIL;
+ pud = pud_alloc(mm, p4d, haddr);
+ if (!pud)
+ return SCAN_FAIL;
+ pmd = pmd_alloc(mm, pud, haddr);
+ if (!pmd)
+ return SCAN_FAIL;
+
+ return set_huge_pmd(vma, haddr, pmd, folio, &folio->page);
+}
+
/**
* collapse_pte_mapped_thp - Try to collapse a pte-mapped THP for mm at
* address haddr.
@@ -1556,6 +1578,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
switch (result) {
case SCAN_SUCCEED:
break;
+ case SCAN_PMD_NULL:
case SCAN_PMD_NONE:
/*
* All pte entries have been removed and pmd cleared.
@@ -1700,7 +1723,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
maybe_install_pmd:
/* step 5: install pmd entry */
result = install_pmd
- ? set_huge_pmd(vma, haddr, pmd, folio, &folio->page)
+ ? install_huge_pmd(vma, haddr, pmd, folio)
: SCAN_SUCCEED;
goto drop_folio;
abort:
--
2.50.1
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mm/khugepaged: Do not fail collapse_pte_mapped_thp() on SCAN_PMD_NULL
2025-09-12 16:58 [PATCH] mm/khugepaged: Do not fail collapse_pte_mapped_thp() on SCAN_PMD_NULL Kiryl Shutsemau
@ 2025-09-12 17:13 ` David Hildenbrand
2025-09-12 17:44 ` Kiryl Shutsemau
2025-09-12 18:32 ` Lorenzo Stoakes
` (2 subsequent siblings)
3 siblings, 1 reply; 7+ messages in thread
From: David Hildenbrand @ 2025-09-12 17:13 UTC (permalink / raw)
To: Kiryl Shutsemau, Andrew Morton, Lorenzo Stoakes
Cc: Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts,
Dev Jain, Barry Song, linux-mm, linux-kernel
On 12.09.25 18:58, Kiryl Shutsemau wrote:
> From: Kiryl Shutsemau <kas@kernel.org>
>
> MADV_COLLAPSE on a file mapping behaves inconsistently depending on if
> PMD page table is installed or not.
>
> Consider following example:
>
> p = mmap(NULL, 2UL << 20, PROT_READ | PROT_WRITE,
> MAP_SHARED, fd, 0);
> err = madvise(p, 2UL << 20, MADV_COLLAPSE);
>
> fd is a populated tmpfs file.
>
> The result depends on the address that the kernel returns on mmap().
> If it is located in an existing PMD table, the madvise() will succeed.
> However, if the table does not exist, it will fail with -EINVAL.
>
> This occurs because find_pmd_or_thp_or_none() returns SCAN_PMD_NULL when
> a page table is missing, which causes collapse_pte_mapped_thp() to fail.
>
> SCAN_PMD_NULL and SCAN_PMD_NONE should be treated the same in
> collapse_pte_mapped_thp(): install the PMD leaf entry and allocate page
> tables as needed.
>
> Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> ---
Makes sense to me.
Is this something we want a Fixes: or even Cc: stable for?
I assume it doesn't really happen frequently, but could also happen
after MADV_DONTNEED'ing the full range with pt reclaim I think after
having faulted in some PTEs, for example.
Acked-by: David Hildenbrand <david@redhat.com>
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mm/khugepaged: Do not fail collapse_pte_mapped_thp() on SCAN_PMD_NULL
2025-09-12 17:13 ` David Hildenbrand
@ 2025-09-12 17:44 ` Kiryl Shutsemau
0 siblings, 0 replies; 7+ messages in thread
From: Kiryl Shutsemau @ 2025-09-12 17:44 UTC (permalink / raw)
To: David Hildenbrand
Cc: Andrew Morton, Lorenzo Stoakes, Zi Yan, Baolin Wang,
Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song,
linux-mm, linux-kernel
On Fri, Sep 12, 2025 at 07:13:48PM +0200, David Hildenbrand wrote:
> On 12.09.25 18:58, Kiryl Shutsemau wrote:
> > From: Kiryl Shutsemau <kas@kernel.org>
> >
> > MADV_COLLAPSE on a file mapping behaves inconsistently depending on if
> > PMD page table is installed or not.
> >
> > Consider following example:
> >
> > p = mmap(NULL, 2UL << 20, PROT_READ | PROT_WRITE,
> > MAP_SHARED, fd, 0);
> > err = madvise(p, 2UL << 20, MADV_COLLAPSE);
> >
> > fd is a populated tmpfs file.
> >
> > The result depends on the address that the kernel returns on mmap().
> > If it is located in an existing PMD table, the madvise() will succeed.
> > However, if the table does not exist, it will fail with -EINVAL.
> >
> > This occurs because find_pmd_or_thp_or_none() returns SCAN_PMD_NULL when
> > a page table is missing, which causes collapse_pte_mapped_thp() to fail.
> >
> > SCAN_PMD_NULL and SCAN_PMD_NONE should be treated the same in
> > collapse_pte_mapped_thp(): install the PMD leaf entry and allocate page
> > tables as needed.
> >
> > Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> > ---
>
> Makes sense to me.
>
> Is this something we want a Fixes: or even Cc: stable for?
I am not sure if it is stable@ matter.
I believe it is there from the start:
Fixes: 7d8faaf15545 ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse")
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mm/khugepaged: Do not fail collapse_pte_mapped_thp() on SCAN_PMD_NULL
2025-09-12 16:58 [PATCH] mm/khugepaged: Do not fail collapse_pte_mapped_thp() on SCAN_PMD_NULL Kiryl Shutsemau
2025-09-12 17:13 ` David Hildenbrand
@ 2025-09-12 18:32 ` Lorenzo Stoakes
2025-09-14 7:26 ` Dev Jain
2025-09-15 3:36 ` Baolin Wang
3 siblings, 0 replies; 7+ messages in thread
From: Lorenzo Stoakes @ 2025-09-12 18:32 UTC (permalink / raw)
To: Kiryl Shutsemau
Cc: Andrew Morton, David Hildenbrand, Zi Yan, Baolin Wang,
Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song,
linux-mm, linux-kernel
On Fri, Sep 12, 2025 at 05:58:11PM +0100, Kiryl Shutsemau wrote:
> From: Kiryl Shutsemau <kas@kernel.org>
>
> MADV_COLLAPSE on a file mapping behaves inconsistently depending on if
> PMD page table is installed or not.
>
> Consider following example:
>
> p = mmap(NULL, 2UL << 20, PROT_READ | PROT_WRITE,
> MAP_SHARED, fd, 0);
> err = madvise(p, 2UL << 20, MADV_COLLAPSE);
>
> fd is a populated tmpfs file.
>
> The result depends on the address that the kernel returns on mmap().
> If it is located in an existing PMD table, the madvise() will succeed.
> However, if the table does not exist, it will fail with -EINVAL.
>
> This occurs because find_pmd_or_thp_or_none() returns SCAN_PMD_NULL when
> a page table is missing, which causes collapse_pte_mapped_thp() to fail.
>
> SCAN_PMD_NULL and SCAN_PMD_NONE should be treated the same in
> collapse_pte_mapped_thp(): install the PMD leaf entry and allocate page
> tables as needed.
>
> Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
LGTM, nice!
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
> mm/khugepaged.c | 25 ++++++++++++++++++++++++-
> 1 file changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index b486c1d19b2d..9e76a4f46df9 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1488,6 +1488,28 @@ static int set_huge_pmd(struct vm_area_struct *vma, unsigned long addr,
> return SCAN_SUCCEED;
> }
>
> +static int install_huge_pmd(struct vm_area_struct *vma, unsigned long haddr,
> + pmd_t *pmd, struct folio *folio)
> +{
> + struct mm_struct *mm = vma->vm_mm;
> + pgd_t *pgd;
> + p4d_t *p4d;
> + pud_t *pud;
> +
> + pgd = pgd_offset(mm, haddr);
> + p4d = p4d_alloc(mm, pgd, haddr);
> + if (!p4d)
> + return SCAN_FAIL;
> + pud = pud_alloc(mm, p4d, haddr);
> + if (!pud)
> + return SCAN_FAIL;
> + pmd = pmd_alloc(mm, pud, haddr);
> + if (!pmd)
> + return SCAN_FAIL;
> +
> + return set_huge_pmd(vma, haddr, pmd, folio, &folio->page);
> +}
> +
> /**
> * collapse_pte_mapped_thp - Try to collapse a pte-mapped THP for mm at
> * address haddr.
> @@ -1556,6 +1578,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
> switch (result) {
> case SCAN_SUCCEED:
> break;
> + case SCAN_PMD_NULL:
> case SCAN_PMD_NONE:
> /*
> * All pte entries have been removed and pmd cleared.
> @@ -1700,7 +1723,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
> maybe_install_pmd:
> /* step 5: install pmd entry */
> result = install_pmd
> - ? set_huge_pmd(vma, haddr, pmd, folio, &folio->page)
> + ? install_huge_pmd(vma, haddr, pmd, folio)
> : SCAN_SUCCEED;
> goto drop_folio;
> abort:
> --
> 2.50.1
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mm/khugepaged: Do not fail collapse_pte_mapped_thp() on SCAN_PMD_NULL
2025-09-12 16:58 [PATCH] mm/khugepaged: Do not fail collapse_pte_mapped_thp() on SCAN_PMD_NULL Kiryl Shutsemau
2025-09-12 17:13 ` David Hildenbrand
2025-09-12 18:32 ` Lorenzo Stoakes
@ 2025-09-14 7:26 ` Dev Jain
2025-09-15 11:15 ` Kiryl Shutsemau
2025-09-15 3:36 ` Baolin Wang
3 siblings, 1 reply; 7+ messages in thread
From: Dev Jain @ 2025-09-14 7:26 UTC (permalink / raw)
To: Kiryl Shutsemau, Andrew Morton, David Hildenbrand, Lorenzo Stoakes
Cc: Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts,
Barry Song, linux-mm, linux-kernel
On 12/09/25 10:28 pm, Kiryl Shutsemau wrote:
> From: Kiryl Shutsemau <kas@kernel.org>
>
> MADV_COLLAPSE on a file mapping behaves inconsistently depending on if
> PMD page table is installed or not.
>
> Consider following example:
>
> p = mmap(NULL, 2UL << 20, PROT_READ | PROT_WRITE,
> MAP_SHARED, fd, 0);
> err = madvise(p, 2UL << 20, MADV_COLLAPSE);
>
> fd is a populated tmpfs file.
>
> The result depends on the address that the kernel returns on mmap().
> If it is located in an existing PMD table, the madvise() will succeed.
> However, if the table does not exist, it will fail with -EINVAL.
>
> This occurs because find_pmd_or_thp_or_none() returns SCAN_PMD_NULL when
> a page table is missing, which causes collapse_pte_mapped_thp() to fail.
>
> SCAN_PMD_NULL and SCAN_PMD_NONE should be treated the same in
> collapse_pte_mapped_thp(): install the PMD leaf entry and allocate page
> tables as needed.
Thanks.
Besides this patch, the label maybe_install_pmd is misleading -
SCAN_PMD_NONE means that the pmd table exists, just that the pmd
entry is none, so the pmd is already installed. Along with this,
the argument bool install_pmd should likewise be install_huge_pmd.
>
> Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> ---
> mm/khugepaged.c | 25 ++++++++++++++++++++++++-
> 1 file changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index b486c1d19b2d..9e76a4f46df9 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1488,6 +1488,28 @@ static int set_huge_pmd(struct vm_area_struct *vma, unsigned long addr,
> return SCAN_SUCCEED;
> }
>
> +static int install_huge_pmd(struct vm_area_struct *vma, unsigned long haddr,
> + pmd_t *pmd, struct folio *folio)
> +{
> + struct mm_struct *mm = vma->vm_mm;
> + pgd_t *pgd;
> + p4d_t *p4d;
> + pud_t *pud;
> +
> + pgd = pgd_offset(mm, haddr);
> + p4d = p4d_alloc(mm, pgd, haddr);
> + if (!p4d)
> + return SCAN_FAIL;
> + pud = pud_alloc(mm, p4d, haddr);
> + if (!pud)
> + return SCAN_FAIL;
> + pmd = pmd_alloc(mm, pud, haddr);
> + if (!pmd)
> + return SCAN_FAIL;
> +
> + return set_huge_pmd(vma, haddr, pmd, folio, &folio->page);
> +}
> +
For the SCAN_PMD_NONE case, we are unconditionally traversing the pagetables
now which is not needed. How about, in set_huge_pmd(), we pass a boolean install_pmd,
and at the start of the function, call install_pmd() which will do the traversal
and the pmd_alloc()? That will also make it crystal clear that in the SCAN_PMD_NULL
case, we are first installing the PMD table and then setting it to huge. Right now
the distinction between the two cases is not clear.
> /**
> * collapse_pte_mapped_thp - Try to collapse a pte-mapped THP for mm at
> * address haddr.
> @@ -1556,6 +1578,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
> switch (result) {
> case SCAN_SUCCEED:
> break;
> + case SCAN_PMD_NULL:
> case SCAN_PMD_NONE:
> /*
> * All pte entries have been removed and pmd cleared.
> @@ -1700,7 +1723,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
> maybe_install_pmd:
> /* step 5: install pmd entry */
> result = install_pmd
> - ? set_huge_pmd(vma, haddr, pmd, folio, &folio->page)
> + ? install_huge_pmd(vma, haddr, pmd, folio)
> : SCAN_SUCCEED;
> goto drop_folio;
> abort:
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mm/khugepaged: Do not fail collapse_pte_mapped_thp() on SCAN_PMD_NULL
2025-09-12 16:58 [PATCH] mm/khugepaged: Do not fail collapse_pte_mapped_thp() on SCAN_PMD_NULL Kiryl Shutsemau
` (2 preceding siblings ...)
2025-09-14 7:26 ` Dev Jain
@ 2025-09-15 3:36 ` Baolin Wang
3 siblings, 0 replies; 7+ messages in thread
From: Baolin Wang @ 2025-09-15 3:36 UTC (permalink / raw)
To: Kiryl Shutsemau, Andrew Morton, David Hildenbrand, Lorenzo Stoakes
Cc: Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
Barry Song, linux-mm, linux-kernel
On 2025/9/13 00:58, Kiryl Shutsemau wrote:
> From: Kiryl Shutsemau <kas@kernel.org>
>
> MADV_COLLAPSE on a file mapping behaves inconsistently depending on if
> PMD page table is installed or not.
>
> Consider following example:
>
> p = mmap(NULL, 2UL << 20, PROT_READ | PROT_WRITE,
> MAP_SHARED, fd, 0);
> err = madvise(p, 2UL << 20, MADV_COLLAPSE);
>
> fd is a populated tmpfs file.
>
> The result depends on the address that the kernel returns on mmap().
> If it is located in an existing PMD table, the madvise() will succeed.
> However, if the table does not exist, it will fail with -EINVAL.
>
> This occurs because find_pmd_or_thp_or_none() returns SCAN_PMD_NULL when
> a page table is missing, which causes collapse_pte_mapped_thp() to fail.
>
> SCAN_PMD_NULL and SCAN_PMD_NONE should be treated the same in
> collapse_pte_mapped_thp(): install the PMD leaf entry and allocate page
> tables as needed.
>
> Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> ---
Make sense to me. Thanks.
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] mm/khugepaged: Do not fail collapse_pte_mapped_thp() on SCAN_PMD_NULL
2025-09-14 7:26 ` Dev Jain
@ 2025-09-15 11:15 ` Kiryl Shutsemau
0 siblings, 0 replies; 7+ messages in thread
From: Kiryl Shutsemau @ 2025-09-15 11:15 UTC (permalink / raw)
To: Dev Jain
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts,
Barry Song, linux-mm, linux-kernel
On Sun, Sep 14, 2025 at 12:56:13PM +0530, Dev Jain wrote:
>
> On 12/09/25 10:28 pm, Kiryl Shutsemau wrote:
> > From: Kiryl Shutsemau <kas@kernel.org>
> >
> > MADV_COLLAPSE on a file mapping behaves inconsistently depending on if
> > PMD page table is installed or not.
> >
> > Consider following example:
> >
> > p = mmap(NULL, 2UL << 20, PROT_READ | PROT_WRITE,
> > MAP_SHARED, fd, 0);
> > err = madvise(p, 2UL << 20, MADV_COLLAPSE);
> >
> > fd is a populated tmpfs file.
> >
> > The result depends on the address that the kernel returns on mmap().
> > If it is located in an existing PMD table, the madvise() will succeed.
> > However, if the table does not exist, it will fail with -EINVAL.
> >
> > This occurs because find_pmd_or_thp_or_none() returns SCAN_PMD_NULL when
> > a page table is missing, which causes collapse_pte_mapped_thp() to fail.
> >
> > SCAN_PMD_NULL and SCAN_PMD_NONE should be treated the same in
> > collapse_pte_mapped_thp(): install the PMD leaf entry and allocate page
> > tables as needed.
>
> Thanks.
>
> Besides this patch, the label maybe_install_pmd is misleading -
> SCAN_PMD_NONE means that the pmd table exists, just that the pmd
> entry is none, so the pmd is already installed.
That's never ending confusion between PTE/PMD/P?D entry and table.
Addressing it is out of scope of the patch :P
> Along with this,
> the argument bool install_pmd should likewise be install_huge_pmd.
Well, if you rename install_pmd to install_huge_pmd it will overshadow
the install_huge_pmd() function. And the label name is not a problem in
my view.
>
> >
> > Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> > ---
> > mm/khugepaged.c | 25 ++++++++++++++++++++++++-
> > 1 file changed, 24 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index b486c1d19b2d..9e76a4f46df9 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -1488,6 +1488,28 @@ static int set_huge_pmd(struct vm_area_struct *vma, unsigned long addr,
> > return SCAN_SUCCEED;
> > }
> > +static int install_huge_pmd(struct vm_area_struct *vma, unsigned long haddr,
> > + pmd_t *pmd, struct folio *folio)
> > +{
> > + struct mm_struct *mm = vma->vm_mm;
> > + pgd_t *pgd;
> > + p4d_t *p4d;
> > + pud_t *pud;
> > +
> > + pgd = pgd_offset(mm, haddr);
> > + p4d = p4d_alloc(mm, pgd, haddr);
> > + if (!p4d)
> > + return SCAN_FAIL;
> > + pud = pud_alloc(mm, p4d, haddr);
> > + if (!pud)
> > + return SCAN_FAIL;
> > + pmd = pmd_alloc(mm, pud, haddr);
> > + if (!pmd)
> > + return SCAN_FAIL;
> > +
> > + return set_huge_pmd(vma, haddr, pmd, folio, &folio->page);
> > +}
> > +
>
> For the SCAN_PMD_NONE case, we are unconditionally traversing the pagetables
> now which is not needed. How about, in set_huge_pmd(), we pass a boolean install_pmd,
> and at the start of the function, call install_pmd() which will do the traversal
> and the pmd_alloc()? That will also make it crystal clear that in the SCAN_PMD_NULL
> case, we are first installing the PMD table and then setting it to huge. Right now
> the distinction between the two cases is not clear.
I just realized that my install_huge_pmd() doesn't use pmd that is pass
in. And looking at code again, I think it is better to integrate the
page table allocation directly into set_huge_pmd().
See the patch below. I will submit it as v2, if there's no objections.
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b486c1d19b2d..986718599355 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1472,15 +1472,32 @@ static void collect_mm_slot(struct khugepaged_mm_slot *mm_slot)
static int set_huge_pmd(struct vm_area_struct *vma, unsigned long addr,
pmd_t *pmdp, struct folio *folio, struct page *page)
{
+ struct mm_struct *mm = vma->vm_mm;
struct vm_fault vmf = {
.vma = vma,
.address = addr,
.flags = 0,
- .pmd = pmdp,
};
+ pgd_t *pgdp;
+ p4d_t *p4dp;
+ pud_t *pudp;
mmap_assert_locked(vma->vm_mm);
+ if (!pmdp) {
+ pgdp = pgd_offset(mm, addr);
+ p4dp = p4d_alloc(mm, pgdp, addr);
+ if (!p4dp)
+ return SCAN_FAIL;
+ pudp = pud_alloc(mm, p4dp, addr);
+ if (!pudp)
+ return SCAN_FAIL;
+ pmdp = pmd_alloc(mm, pudp, addr);
+ if (!pmdp)
+ return SCAN_FAIL;
+ }
+
+ vmf.pmd = pmdp;
if (do_set_pmd(&vmf, folio, page))
return SCAN_FAIL;
@@ -1556,6 +1573,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
switch (result) {
case SCAN_SUCCEED:
break;
+ case SCAN_PMD_NULL:
case SCAN_PMD_NONE:
/*
* All pte entries have been removed and pmd cleared.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-09-15 11:15 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-09-12 16:58 [PATCH] mm/khugepaged: Do not fail collapse_pte_mapped_thp() on SCAN_PMD_NULL Kiryl Shutsemau
2025-09-12 17:13 ` David Hildenbrand
2025-09-12 17:44 ` Kiryl Shutsemau
2025-09-12 18:32 ` Lorenzo Stoakes
2025-09-14 7:26 ` Dev Jain
2025-09-15 11:15 ` Kiryl Shutsemau
2025-09-15 3:36 ` Baolin Wang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox