* [PATCH mm-new 1/1] mm/khugepaged: abort collapse scan on non-swap entries
@ 2025-09-24 10:02 Lance Yang
2025-09-24 10:10 ` David Hildenbrand
2025-09-24 10:10 ` Kiryl Shutsemau
0 siblings, 2 replies; 9+ messages in thread
From: Lance Yang @ 2025-09-24 10:02 UTC (permalink / raw)
To: akpm
Cc: david, lorenzo.stoakes, Liam.Howlett, baohua, baolin.wang,
dev.jain, hughd, ioworker0, kirill, linux-kernel, linux-mm,
mpenttil, npache, ryan.roberts, ziy, richard.weiyang, Lance Yang
From: Lance Yang <lance.yang@linux.dev>
The existing check in hpage_collapse_scan_pmd() is specific to uffd-wp
markers. Other special markers (e.g., GUARD, POISONED) would not be caught
early, leading to failures deeper in the swap-in logic.
hpage_collapse_scan_pmd()
`- collapse_huge_page()
`- __collapse_huge_page_swapin() -> fails!
As David suggested[1], this patch skips any such non-swap entries early.
If a special marker is found, the scan is aborted immediately with the
SCAN_PTE_NON_PRESENT result, as Lorenzo suggested[2], avoiding wasted
work.
[1] https://lore.kernel.org/linux-mm/7840f68e-7580-42cb-a7c8-1ba64fd6df69@redhat.com
[2] https://lore.kernel.org/linux-mm/7df49fe7-c6b7-426a-8680-dcd55219c8bd@lucifer.local
Suggested-by: David Hildenbrand <david@redhat.com>
Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Lance Yang <lance.yang@linux.dev>
---
mm/khugepaged.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 7ab2d1a42df3..e9778e7734b5 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1285,16 +1285,19 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
_pte++, addr += PAGE_SIZE) {
pte_t pteval = ptep_get(_pte);
if (is_swap_pte(pteval)) {
+ swp_entry_t swp = pte_to_swp_entry(pteval);
++unmapped;
if (!cc->is_khugepaged ||
unmapped <= khugepaged_max_ptes_swap) {
/*
- * Always be strict with uffd-wp
- * enabled swap entries. Please see
- * comment below for pte_uffd_wp().
+ * Always be strict with PTE markers, which are
+ * special non-swap entries (e.g., for UFFD_WP,
+ * POISONED, GUARD). We cannot collapse over
+ * them, so just abort the scan here.
*/
- if (pte_swp_uffd_wp_any(pteval)) {
- result = SCAN_PTE_UFFD_WP;
+ if (is_pte_marker_entry(swp) &&
+ pte_marker_get(swp)) {
+ result = SCAN_PTE_NON_PRESENT;
goto out_unmap;
}
continue;
--
2.49.0
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH mm-new 1/1] mm/khugepaged: abort collapse scan on non-swap entries 2025-09-24 10:02 [PATCH mm-new 1/1] mm/khugepaged: abort collapse scan on non-swap entries Lance Yang @ 2025-09-24 10:10 ` David Hildenbrand 2025-09-24 10:17 ` Kiryl Shutsemau 2025-09-24 11:47 ` Lance Yang 2025-09-24 10:10 ` Kiryl Shutsemau 1 sibling, 2 replies; 9+ messages in thread From: David Hildenbrand @ 2025-09-24 10:10 UTC (permalink / raw) To: Lance Yang, akpm Cc: lorenzo.stoakes, Liam.Howlett, baohua, baolin.wang, dev.jain, hughd, ioworker0, kirill, linux-kernel, linux-mm, mpenttil, npache, ryan.roberts, ziy, richard.weiyang On 24.09.25 12:02, Lance Yang wrote: > From: Lance Yang <lance.yang@linux.dev> > > The existing check in hpage_collapse_scan_pmd() is specific to uffd-wp > markers. Other special markers (e.g., GUARD, POISONED) would not be caught > early, leading to failures deeper in the swap-in logic. > > hpage_collapse_scan_pmd() > `- collapse_huge_page() > `- __collapse_huge_page_swapin() -> fails! > > As David suggested[1], this patch skips any such non-swap entries early. > If a special marker is found, the scan is aborted immediately with the > SCAN_PTE_NON_PRESENT result, as Lorenzo suggested[2], avoiding wasted > work. Note that I suggested to skip all non-present entries except swap entries, which includes migration entries, hwpoisoned entries etc. -- Cheers David / dhildenb ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH mm-new 1/1] mm/khugepaged: abort collapse scan on non-swap entries 2025-09-24 10:10 ` David Hildenbrand @ 2025-09-24 10:17 ` Kiryl Shutsemau 2025-09-24 10:40 ` David Hildenbrand 2025-09-24 11:47 ` Lance Yang 1 sibling, 1 reply; 9+ messages in thread From: Kiryl Shutsemau @ 2025-09-24 10:17 UTC (permalink / raw) To: David Hildenbrand Cc: Lance Yang, akpm, lorenzo.stoakes, Liam.Howlett, baohua, baolin.wang, dev.jain, hughd, ioworker0, linux-kernel, linux-mm, mpenttil, npache, ryan.roberts, ziy, richard.weiyang On Wed, Sep 24, 2025 at 12:10:47PM +0200, David Hildenbrand wrote: > On 24.09.25 12:02, Lance Yang wrote: > > From: Lance Yang <lance.yang@linux.dev> > > > > The existing check in hpage_collapse_scan_pmd() is specific to uffd-wp > > markers. Other special markers (e.g., GUARD, POISONED) would not be caught > > early, leading to failures deeper in the swap-in logic. > > > > hpage_collapse_scan_pmd() > > `- collapse_huge_page() > > `- __collapse_huge_page_swapin() -> fails! > > > > As David suggested[1], this patch skips any such non-swap entries early. > > If a special marker is found, the scan is aborted immediately with the > > SCAN_PTE_NON_PRESENT result, as Lorenzo suggested[2], avoiding wasted > > work. > > Note that I suggested to skip all non-present entries except swap entries, > which includes migration entries, hwpoisoned entries etc. Hm. So swap in is fine, but wait for migration to complete is not? Seems odd. -- Kiryl Shutsemau / Kirill A. Shutemov ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH mm-new 1/1] mm/khugepaged: abort collapse scan on non-swap entries 2025-09-24 10:17 ` Kiryl Shutsemau @ 2025-09-24 10:40 ` David Hildenbrand 0 siblings, 0 replies; 9+ messages in thread From: David Hildenbrand @ 2025-09-24 10:40 UTC (permalink / raw) To: Kiryl Shutsemau Cc: Lance Yang, akpm, lorenzo.stoakes, Liam.Howlett, baohua, baolin.wang, dev.jain, hughd, ioworker0, linux-kernel, linux-mm, mpenttil, npache, ryan.roberts, ziy, richard.weiyang On 24.09.25 12:17, Kiryl Shutsemau wrote: > On Wed, Sep 24, 2025 at 12:10:47PM +0200, David Hildenbrand wrote: >> On 24.09.25 12:02, Lance Yang wrote: >>> From: Lance Yang <lance.yang@linux.dev> >>> >>> The existing check in hpage_collapse_scan_pmd() is specific to uffd-wp >>> markers. Other special markers (e.g., GUARD, POISONED) would not be caught >>> early, leading to failures deeper in the swap-in logic. >>> >>> hpage_collapse_scan_pmd() >>> `- collapse_huge_page() >>> `- __collapse_huge_page_swapin() -> fails! >>> >>> As David suggested[1], this patch skips any such non-swap entries early. >>> If a special marker is found, the scan is aborted immediately with the >>> SCAN_PTE_NON_PRESENT result, as Lorenzo suggested[2], avoiding wasted >>> work. >> >> Note that I suggested to skip all non-present entries except swap entries, >> which includes migration entries, hwpoisoned entries etc. > > Hm. So swap in is fine, but wait for migration to complete is not? If so we'd have to add the logic to actually wait for migration entries, and not count them towards max swap entries. But that's a different discussion and could be added on top of cleanly handling all non-swap entries. -- Cheers David / dhildenb ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH mm-new 1/1] mm/khugepaged: abort collapse scan on non-swap entries 2025-09-24 10:10 ` David Hildenbrand 2025-09-24 10:17 ` Kiryl Shutsemau @ 2025-09-24 11:47 ` Lance Yang 2025-09-29 10:10 ` Lance Yang 2025-09-29 10:29 ` David Hildenbrand 1 sibling, 2 replies; 9+ messages in thread From: Lance Yang @ 2025-09-24 11:47 UTC (permalink / raw) To: David Hildenbrand Cc: lorenzo.stoakes, Liam.Howlett, baohua, baolin.wang, dev.jain, hughd, ioworker0, kirill, linux-kernel, linux-mm, mpenttil, npache, ryan.roberts, ziy, richard.weiyang, akpm On 2025/9/24 18:10, David Hildenbrand wrote: > On 24.09.25 12:02, Lance Yang wrote: >> From: Lance Yang <lance.yang@linux.dev> >> >> The existing check in hpage_collapse_scan_pmd() is specific to uffd-wp >> markers. Other special markers (e.g., GUARD, POISONED) would not be >> caught >> early, leading to failures deeper in the swap-in logic. >> >> hpage_collapse_scan_pmd() >> `- collapse_huge_page() >> `- __collapse_huge_page_swapin() -> fails! >> >> As David suggested[1], this patch skips any such non-swap entries early. >> If a special marker is found, the scan is aborted immediately with the >> SCAN_PTE_NON_PRESENT result, as Lorenzo suggested[2], avoiding wasted >> work. > > Note that I suggested to skip all non-present entries except swap > entries, which includes migration entries, hwpoisoned entries etc. Oops, I completely misunderstood your suggestion :( It should be to handle all special non-present entries (migration, hwpoison, markers), not just a specific type of marker ... How about this version, which handles all non-swap entries as you suggested? diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 7ab2d1a42df3..27f432e7f07c 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1284,7 +1284,23 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, for (addr = start_addr, _pte = pte; _pte < pte + HPAGE_PMD_NR; _pte++, addr += PAGE_SIZE) { pte_t pteval = ptep_get(_pte); - if (is_swap_pte(pteval)) { + if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { + ++none_or_zero; + if (!userfaultfd_armed(vma) && + (!cc->is_khugepaged || + none_or_zero <= khugepaged_max_ptes_none)) { + continue; + } else { + result = SCAN_EXCEED_NONE_PTE; + count_vm_event(THP_SCAN_EXCEED_NONE_PTE); + goto out_unmap; + } + } else if (!pte_present(pteval)) { + if (non_swap_entry(pte_to_swp_entry(pteval))) { + result = SCAN_PTE_NON_PRESENT; + goto out_unmap; + } + ++unmapped; if (!cc->is_khugepaged || unmapped <= khugepaged_max_ptes_swap) { @@ -1293,7 +1309,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, * enabled swap entries. Please see * comment below for pte_uffd_wp(). */ - if (pte_swp_uffd_wp_any(pteval)) { + if (pte_swp_uffd_wp(pteval)) { result = SCAN_PTE_UFFD_WP; goto out_unmap; } @@ -1304,18 +1320,6 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, goto out_unmap; } } - if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { - ++none_or_zero; - if (!userfaultfd_armed(vma) && - (!cc->is_khugepaged || - none_or_zero <= khugepaged_max_ptes_none)) { - continue; - } else { - result = SCAN_EXCEED_NONE_PTE; - count_vm_event(THP_SCAN_EXCEED_NONE_PTE); - goto out_unmap; - } - } if (pte_uffd_wp(pteval)) { /* * Don't collapse the page if any of the small --- Thanks, Lance ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH mm-new 1/1] mm/khugepaged: abort collapse scan on non-swap entries 2025-09-24 11:47 ` Lance Yang @ 2025-09-29 10:10 ` Lance Yang 2025-09-29 10:29 ` David Hildenbrand 1 sibling, 0 replies; 9+ messages in thread From: Lance Yang @ 2025-09-29 10:10 UTC (permalink / raw) To: David Hildenbrand Cc: lorenzo.stoakes, Liam.Howlett, baohua, baolin.wang, dev.jain, hughd, ioworker0, kirill, linux-kernel, linux-mm, mpenttil, npache, ryan.roberts, ziy, richard.weiyang, akpm On 2025/9/24 19:47, Lance Yang wrote: > > > On 2025/9/24 18:10, David Hildenbrand wrote: >> On 24.09.25 12:02, Lance Yang wrote: >>> From: Lance Yang <lance.yang@linux.dev> >>> >>> The existing check in hpage_collapse_scan_pmd() is specific to uffd-wp >>> markers. Other special markers (e.g., GUARD, POISONED) would not be >>> caught >>> early, leading to failures deeper in the swap-in logic. >>> >>> hpage_collapse_scan_pmd() >>> `- collapse_huge_page() >>> `- __collapse_huge_page_swapin() -> fails! >>> >>> As David suggested[1], this patch skips any such non-swap entries early. >>> If a special marker is found, the scan is aborted immediately with the >>> SCAN_PTE_NON_PRESENT result, as Lorenzo suggested[2], avoiding wasted >>> work. >> >> Note that I suggested to skip all non-present entries except swap >> entries, which includes migration entries, hwpoisoned entries etc. > > Oops, I completely misunderstood your suggestion :( > > It should be to handle all special non-present entries (migration, > hwpoison, markers), not just a specific type of marker ... > > How about this version, which handles all non-swap entries as you > suggested? @David Gently ping ... > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > index 7ab2d1a42df3..27f432e7f07c 100644 > --- a/mm/khugepaged.c > +++ b/mm/khugepaged.c > @@ -1284,7 +1284,23 @@ static int hpage_collapse_scan_pmd(struct > mm_struct *mm, > for (addr = start_addr, _pte = pte; _pte < pte + HPAGE_PMD_NR; > _pte++, addr += PAGE_SIZE) { > pte_t pteval = ptep_get(_pte); > - if (is_swap_pte(pteval)) { > + if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { > + ++none_or_zero; > + if (!userfaultfd_armed(vma) && > + (!cc->is_khugepaged || > + none_or_zero <= khugepaged_max_ptes_none)) { > + continue; > + } else { > + result = SCAN_EXCEED_NONE_PTE; > + count_vm_event(THP_SCAN_EXCEED_NONE_PTE); > + goto out_unmap; > + } > + } else if (!pte_present(pteval)) { > + if (non_swap_entry(pte_to_swp_entry(pteval))) { > + result = SCAN_PTE_NON_PRESENT; > + goto out_unmap; > + } > + > ++unmapped; > if (!cc->is_khugepaged || > unmapped <= khugepaged_max_ptes_swap) { > @@ -1293,7 +1309,7 @@ static int hpage_collapse_scan_pmd(struct > mm_struct *mm, > * enabled swap entries. Please see > * comment below for pte_uffd_wp(). > */ > - if (pte_swp_uffd_wp_any(pteval)) { > + if (pte_swp_uffd_wp(pteval)) { > result = SCAN_PTE_UFFD_WP; > goto out_unmap; > } > @@ -1304,18 +1320,6 @@ static int hpage_collapse_scan_pmd(struct > mm_struct *mm, > goto out_unmap; > } > } > - if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { > - ++none_or_zero; > - if (!userfaultfd_armed(vma) && > - (!cc->is_khugepaged || > - none_or_zero <= khugepaged_max_ptes_none)) { > - continue; > - } else { > - result = SCAN_EXCEED_NONE_PTE; > - count_vm_event(THP_SCAN_EXCEED_NONE_PTE); > - goto out_unmap; > - } > - } > if (pte_uffd_wp(pteval)) { > /* > * Don't collapse the page if any of the small > --- > > Thanks, > Lance ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH mm-new 1/1] mm/khugepaged: abort collapse scan on non-swap entries 2025-09-24 11:47 ` Lance Yang 2025-09-29 10:10 ` Lance Yang @ 2025-09-29 10:29 ` David Hildenbrand 2025-09-29 10:39 ` Lance Yang 1 sibling, 1 reply; 9+ messages in thread From: David Hildenbrand @ 2025-09-29 10:29 UTC (permalink / raw) To: Lance Yang Cc: lorenzo.stoakes, Liam.Howlett, baohua, baolin.wang, dev.jain, hughd, ioworker0, kirill, linux-kernel, linux-mm, mpenttil, npache, ryan.roberts, ziy, richard.weiyang, akpm On 24.09.25 13:47, Lance Yang wrote: > > > On 2025/9/24 18:10, David Hildenbrand wrote: >> On 24.09.25 12:02, Lance Yang wrote: >>> From: Lance Yang <lance.yang@linux.dev> >>> >>> The existing check in hpage_collapse_scan_pmd() is specific to uffd-wp >>> markers. Other special markers (e.g., GUARD, POISONED) would not be >>> caught >>> early, leading to failures deeper in the swap-in logic. >>> >>> hpage_collapse_scan_pmd() >>> `- collapse_huge_page() >>> `- __collapse_huge_page_swapin() -> fails! >>> >>> As David suggested[1], this patch skips any such non-swap entries early. >>> If a special marker is found, the scan is aborted immediately with the >>> SCAN_PTE_NON_PRESENT result, as Lorenzo suggested[2], avoiding wasted >>> work. >> >> Note that I suggested to skip all non-present entries except swap >> entries, which includes migration entries, hwpoisoned entries etc. > > Oops, I completely misunderstood your suggestion :( > > It should be to handle all special non-present entries (migration, > hwpoison, markers), not just a specific type of marker ... > > How about this version, which handles all non-swap entries as you > suggested? > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > index 7ab2d1a42df3..27f432e7f07c 100644 > --- a/mm/khugepaged.c > +++ b/mm/khugepaged.c > @@ -1284,7 +1284,23 @@ static int hpage_collapse_scan_pmd(struct > mm_struct *mm, > for (addr = start_addr, _pte = pte; _pte < pte + HPAGE_PMD_NR; > _pte++, addr += PAGE_SIZE) { > pte_t pteval = ptep_get(_pte); > - if (is_swap_pte(pteval)) { > + if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { > + ++none_or_zero; > + if (!userfaultfd_armed(vma) && > + (!cc->is_khugepaged || > + none_or_zero <= khugepaged_max_ptes_none)) { > + continue; > + } else { > + result = SCAN_EXCEED_NONE_PTE; > + count_vm_event(THP_SCAN_EXCEED_NONE_PTE); > + goto out_unmap; > + } > + } else if (!pte_present(pteval)) { > + if (non_swap_entry(pte_to_swp_entry(pteval))) { > + result = SCAN_PTE_NON_PRESENT; > + goto out_unmap; > + } > + > ++unmapped; > if (!cc->is_khugepaged || > unmapped <= khugepaged_max_ptes_swap) { > @@ -1293,7 +1309,7 @@ static int hpage_collapse_scan_pmd(struct > mm_struct *mm, > * enabled swap entries. Please see > * comment below for pte_uffd_wp(). > */ > - if (pte_swp_uffd_wp_any(pteval)) { > + if (pte_swp_uffd_wp(pteval)) { > result = SCAN_PTE_UFFD_WP; > goto out_unmap; > } > @@ -1304,18 +1320,6 @@ static int hpage_collapse_scan_pmd(struct > mm_struct *mm, > goto out_unmap; > } > } > - if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { > - ++none_or_zero; > - if (!userfaultfd_armed(vma) && > - (!cc->is_khugepaged || > - none_or_zero <= khugepaged_max_ptes_none)) { > - continue; > - } else { > - result = SCAN_EXCEED_NONE_PTE; > - count_vm_event(THP_SCAN_EXCEED_NONE_PTE); > - goto out_unmap; > - } > - } > if (pte_uffd_wp(pteval)) { From a quick glimpse, this should work. And as raised, we might be able to unify later the scanning with the almost-duplicated code when we do the second scan. -- Cheers David / dhildenb ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH mm-new 1/1] mm/khugepaged: abort collapse scan on non-swap entries 2025-09-29 10:29 ` David Hildenbrand @ 2025-09-29 10:39 ` Lance Yang 0 siblings, 0 replies; 9+ messages in thread From: Lance Yang @ 2025-09-29 10:39 UTC (permalink / raw) To: David Hildenbrand Cc: lorenzo.stoakes, Liam.Howlett, baohua, baolin.wang, dev.jain, hughd, ioworker0, kirill, linux-kernel, linux-mm, mpenttil, npache, ryan.roberts, ziy, richard.weiyang, akpm On 2025/9/29 18:29, David Hildenbrand wrote: > On 24.09.25 13:47, Lance Yang wrote: >> >> >> On 2025/9/24 18:10, David Hildenbrand wrote: >>> On 24.09.25 12:02, Lance Yang wrote: >>>> From: Lance Yang <lance.yang@linux.dev> >>>> >>>> The existing check in hpage_collapse_scan_pmd() is specific to uffd-wp >>>> markers. Other special markers (e.g., GUARD, POISONED) would not be >>>> caught >>>> early, leading to failures deeper in the swap-in logic. >>>> >>>> hpage_collapse_scan_pmd() >>>> `- collapse_huge_page() >>>> `- __collapse_huge_page_swapin() -> fails! >>>> >>>> As David suggested[1], this patch skips any such non-swap entries >>>> early. >>>> If a special marker is found, the scan is aborted immediately with the >>>> SCAN_PTE_NON_PRESENT result, as Lorenzo suggested[2], avoiding wasted >>>> work. >>> >>> Note that I suggested to skip all non-present entries except swap >>> entries, which includes migration entries, hwpoisoned entries etc. >> >> Oops, I completely misunderstood your suggestion :( >> >> It should be to handle all special non-present entries (migration, >> hwpoison, markers), not just a specific type of marker ... >> >> How about this version, which handles all non-swap entries as you >> suggested? >> >> diff --git a/mm/khugepaged.c b/mm/khugepaged.c >> index 7ab2d1a42df3..27f432e7f07c 100644 >> --- a/mm/khugepaged.c >> +++ b/mm/khugepaged.c >> @@ -1284,7 +1284,23 @@ static int hpage_collapse_scan_pmd(struct >> mm_struct *mm, >> for (addr = start_addr, _pte = pte; _pte < pte + HPAGE_PMD_NR; >> _pte++, addr += PAGE_SIZE) { >> pte_t pteval = ptep_get(_pte); >> - if (is_swap_pte(pteval)) { >> + if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { >> + ++none_or_zero; >> + if (!userfaultfd_armed(vma) && >> + (!cc->is_khugepaged || >> + none_or_zero <= khugepaged_max_ptes_none)) { >> + continue; >> + } else { >> + result = SCAN_EXCEED_NONE_PTE; >> + count_vm_event(THP_SCAN_EXCEED_NONE_PTE); >> + goto out_unmap; >> + } >> + } else if (!pte_present(pteval)) { >> + if (non_swap_entry(pte_to_swp_entry(pteval))) { >> + result = SCAN_PTE_NON_PRESENT; >> + goto out_unmap; >> + } >> + >> ++unmapped; >> if (!cc->is_khugepaged || >> unmapped <= khugepaged_max_ptes_swap) { >> @@ -1293,7 +1309,7 @@ static int hpage_collapse_scan_pmd(struct >> mm_struct *mm, >> * enabled swap entries. Please see >> * comment below for pte_uffd_wp(). >> */ >> - if (pte_swp_uffd_wp_any(pteval)) { >> + if (pte_swp_uffd_wp(pteval)) { >> result = SCAN_PTE_UFFD_WP; >> goto out_unmap; >> } >> @@ -1304,18 +1320,6 @@ static int hpage_collapse_scan_pmd(struct >> mm_struct *mm, >> goto out_unmap; >> } >> } >> - if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { >> - ++none_or_zero; >> - if (!userfaultfd_armed(vma) && >> - (!cc->is_khugepaged || >> - none_or_zero <= khugepaged_max_ptes_none)) { >> - continue; >> - } else { >> - result = SCAN_EXCEED_NONE_PTE; >> - count_vm_event(THP_SCAN_EXCEED_NONE_PTE); >> - goto out_unmap; >> - } >> - } >> if (pte_uffd_wp(pteval)) { > > From a quick glimpse, this should work. And as raised, we might be able > to unify later the scanning with the almost-duplicated code when we do > the second scan. Sounds good! Let's get this one merged first, and I'll send a follow-up patch to unify the duplicated code as you suggested ;) ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH mm-new 1/1] mm/khugepaged: abort collapse scan on non-swap entries 2025-09-24 10:02 [PATCH mm-new 1/1] mm/khugepaged: abort collapse scan on non-swap entries Lance Yang 2025-09-24 10:10 ` David Hildenbrand @ 2025-09-24 10:10 ` Kiryl Shutsemau 1 sibling, 0 replies; 9+ messages in thread From: Kiryl Shutsemau @ 2025-09-24 10:10 UTC (permalink / raw) To: Lance Yang Cc: akpm, david, lorenzo.stoakes, Liam.Howlett, baohua, baolin.wang, dev.jain, hughd, ioworker0, linux-kernel, linux-mm, mpenttil, npache, ryan.roberts, ziy, richard.weiyang On Wed, Sep 24, 2025 at 06:02:07PM +0800, Lance Yang wrote: > From: Lance Yang <lance.yang@linux.dev> > > The existing check in hpage_collapse_scan_pmd() is specific to uffd-wp > markers. Other special markers (e.g., GUARD, POISONED) would not be caught > early, leading to failures deeper in the swap-in logic. > > hpage_collapse_scan_pmd() > `- collapse_huge_page() > `- __collapse_huge_page_swapin() -> fails! > > As David suggested[1], this patch skips any such non-swap entries early. > If a special marker is found, the scan is aborted immediately with the > SCAN_PTE_NON_PRESENT result, as Lorenzo suggested[2], avoiding wasted > work. > > [1] https://lore.kernel.org/linux-mm/7840f68e-7580-42cb-a7c8-1ba64fd6df69@redhat.com > [2] https://lore.kernel.org/linux-mm/7df49fe7-c6b7-426a-8680-dcd55219c8bd@lucifer.local > > Suggested-by: David Hildenbrand <david@redhat.com> > Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > Signed-off-by: Lance Yang <lance.yang@linux.dev> > --- > mm/khugepaged.c | 13 ++++++++----- > 1 file changed, 8 insertions(+), 5 deletions(-) > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > index 7ab2d1a42df3..e9778e7734b5 100644 > --- a/mm/khugepaged.c > +++ b/mm/khugepaged.c > @@ -1285,16 +1285,19 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm, > _pte++, addr += PAGE_SIZE) { > pte_t pteval = ptep_get(_pte); > if (is_swap_pte(pteval)) { > + swp_entry_t swp = pte_to_swp_entry(pteval); > ++unmapped; > if (!cc->is_khugepaged || > unmapped <= khugepaged_max_ptes_swap) { > /* > - * Always be strict with uffd-wp > - * enabled swap entries. Please see > - * comment below for pte_uffd_wp(). > + * Always be strict with PTE markers, which are > + * special non-swap entries (e.g., for UFFD_WP, > + * POISONED, GUARD). We cannot collapse over > + * them, so just abort the scan here. > */ > - if (pte_swp_uffd_wp_any(pteval)) { > - result = SCAN_PTE_UFFD_WP; > + if (is_pte_marker_entry(swp) && > + pte_marker_get(swp)) { Hm. Can we have a marker that have zero pte_marker_get()? > + result = SCAN_PTE_NON_PRESENT; > goto out_unmap; > } > continue; > -- > 2.49.0 > -- Kiryl Shutsemau / Kirill A. Shutemov ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-09-29 10:39 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2025-09-24 10:02 [PATCH mm-new 1/1] mm/khugepaged: abort collapse scan on non-swap entries Lance Yang 2025-09-24 10:10 ` David Hildenbrand 2025-09-24 10:17 ` Kiryl Shutsemau 2025-09-24 10:40 ` David Hildenbrand 2025-09-24 11:47 ` Lance Yang 2025-09-29 10:10 ` Lance Yang 2025-09-29 10:29 ` David Hildenbrand 2025-09-29 10:39 ` Lance Yang 2025-09-24 10:10 ` Kiryl Shutsemau
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox