* [PATCH v2 0/2] Expand scope of khugepaged anonymous collapse
From: Dev Jain @ 2025-09-08 7:50 UTC
To: akpm, david, kas, willy, hughd
Cc: ziy, baolin.wang, lorenzo.stoakes, Liam.Howlett, npache,
    ryan.roberts, baohua, richard.weiyang, linux-mm, linux-kernel,
    Dev Jain

Currently khugepaged does not collapse an anonymous region which does not
have a single writable pte. This is wasteful since a region mapped with
non-writable ptes, for example, non-writable VMAs mapped by the
application, won't benefit from THP collapse.

An additional consequence of this constraint is that MADV_COLLAPSE does not
perform a collapse on a non-writable VMA, and this restriction is nowhere
to be found on the manpage - the restriction itself sounds wrong to me
since the user knows the protection of the memory it has mapped, so
collapsing read-only memory via madvise() should be a choice of the
user which shouldn't be overridden by the kernel.

Therefore, remove this constraint.

On an arm64 bare metal machine, comparing with vanilla 6.17-rc2, an
average of 5% improvement is seen on some mmtests benchmarks, particularly
hackbench, with a maximum improvement of 12%. In the following table, (I)
denotes statistically significant improvement, (R) denotes statistically
significant regression.

+-------------------------+--------------------------------+---------------+
| mmtests/hackbench       | process-pipes-1 (seconds)      |        -0.06% |
|                         | process-pipes-4 (seconds)      |        -0.27% |
|                         | process-pipes-7 (seconds)      |   (I) -12.13% |
|                         | process-pipes-12 (seconds)     |    (I) -5.32% |
|                         | process-pipes-21 (seconds)     |    (I) -2.87% |
|                         | process-pipes-30 (seconds)     |    (I) -3.39% |
|                         | process-pipes-48 (seconds)     |    (I) -5.65% |
|                         | process-pipes-79 (seconds)     |    (I) -6.74% |
|                         | process-pipes-110 (seconds)    |    (I) -6.26% |
|                         | process-pipes-141 (seconds)    |    (I) -4.99% |
|                         | process-pipes-172 (seconds)    |    (I) -4.45% |
|                         | process-pipes-203 (seconds)    |    (I) -3.65% |
|                         | process-pipes-234 (seconds)    |    (I) -3.45% |
|                         | process-pipes-256 (seconds)    |    (I) -3.47% |
|                         | process-sockets-1 (seconds)    |         2.13% |
|                         | process-sockets-4 (seconds)    |         1.02% |
|                         | process-sockets-7 (seconds)    |        -0.26% |
|                         | process-sockets-12 (seconds)   |        -1.24% |
|                         | process-sockets-21 (seconds)   |         0.01% |
|                         | process-sockets-30 (seconds)   |        -0.15% |
|                         | process-sockets-48 (seconds)   |         0.15% |
|                         | process-sockets-79 (seconds)   |         1.45% |
|                         | process-sockets-110 (seconds)  |        -1.64% |
|                         | process-sockets-141 (seconds)  |    (I) -4.27% |
|                         | process-sockets-172 (seconds)  |         0.30% |
|                         | process-sockets-203 (seconds)  |        -1.71% |
|                         | process-sockets-234 (seconds)  |        -1.94% |
|                         | process-sockets-256 (seconds)  |        -0.71% |
|                         | thread-pipes-1 (seconds)       |         0.66% |
|                         | thread-pipes-4 (seconds)       |         1.66% |
|                         | thread-pipes-7 (seconds)       |        -0.17% |
|                         | thread-pipes-12 (seconds)      |    (I) -4.12% |
|                         | thread-pipes-21 (seconds)      |    (I) -2.13% |
|                         | thread-pipes-30 (seconds)      |    (I) -3.78% |
|                         | thread-pipes-48 (seconds)      |    (I) -5.77% |
|                         | thread-pipes-79 (seconds)      |    (I) -5.31% |
|                         | thread-pipes-110 (seconds)     |    (I) -6.12% |
|                         | thread-pipes-141 (seconds)     |    (I) -4.00% |
|                         | thread-pipes-172 (seconds)     |    (I) -3.01% |
|                         | thread-pipes-203 (seconds)     |    (I) -2.62% |
|                         | thread-pipes-234 (seconds)     |    (I) -2.00% |
|                         | thread-pipes-256 (seconds)     |    (I) -2.30% |
|                         | thread-sockets-1 (seconds)     |     (R) 2.39% |
+-------------------------+--------------------------------+---------------+

+-------------------------+--------------------------------+---------------+
| mmtests/sysbench-mutex  | sysbenchmutex-1 (usec)         |        -0.02% |
|                         | sysbenchmutex-4 (usec)         |        -0.02% |
|                         | sysbenchmutex-7 (usec)         |         0.00% |
|                         | sysbenchmutex-12 (usec)        |         0.12% |
|                         | sysbenchmutex-21 (usec)        |        -0.40% |
|                         | sysbenchmutex-30 (usec)        |         0.08% |
|                         | sysbenchmutex-48 (usec)        |         2.59% |
|                         | sysbenchmutex-79 (usec)        |        -0.80% |
|                         | sysbenchmutex-110 (usec)       |        -3.87% |
|                         | sysbenchmutex-128 (usec)       |    (I) -4.46% |
+-------------------------+--------------------------------+---------------+

---
Based on today's mm-new.

v1->v2:
 - Replace non-writable VMAs with non-writable PTEs to be more specific
 - Add cover letter

RFC->v1:
 - Drop writable references from tracepoints

RFC:
 - https://lore.kernel.org/all/20250901074817.73012-1-dev.jain@arm.com/

Dev Jain (2):
  mm: Enable khugepaged anonymous collapse on non-writable regions
  mm: Drop all references of writable and SCAN_PAGE_RO

 include/trace/events/huge_memory.h | 19 ++++++-------------
 mm/khugepaged.c                    | 23 +++++------------------
 2 files changed, 11 insertions(+), 31 deletions(-)

-- 
2.30.2

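The MADV_COLLAPSE case described in the cover letter can be exercised from userspace roughly as follows. This is a minimal sketch, not part of the series: the helper name collapse_readonly_anon() is hypothetical, the 2 MiB PMD size is an assumption (it differs with other base page sizes), and the fallback value 25 for MADV_COLLAPSE is the asm-generic uapi value, assumed here only for libc headers that predate it. Whether the final madvise() succeeds depends on the running kernel and its THP configuration, so the helper only reports the outcome.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25	/* assumption: asm-generic uapi value */
#endif

#define PMD_REGION_SIZE (2UL * 1024 * 1024)	/* assumption: 2 MiB PMD */

/*
 * Hypothetical helper: populate an anonymous region with base pages, make
 * it read-only, then request a collapse. Returns 0 if madvise() succeeded,
 * otherwise the positive errno of the call that failed.
 */
int collapse_readonly_anon(void)
{
	size_t len = 2 * PMD_REGION_SIZE;	/* over-allocate to allow alignment */
	char *map, *region;
	int err = 0;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return errno;

	/* Round up to a PMD-aligned address inside the mapping. */
	region = (char *)(((uintptr_t)map + PMD_REGION_SIZE - 1) &
			  ~(uintptr_t)(PMD_REGION_SIZE - 1));
	assert(region + PMD_REGION_SIZE <= map + len);

	/* Best effort: fault in base pages rather than a THP. */
	(void)madvise(region, PMD_REGION_SIZE, MADV_NOHUGEPAGE);
	memset(region, 1, PMD_REGION_SIZE);
	/* Best effort: clear the no-hugepage hint again before collapsing. */
	(void)madvise(region, PMD_REGION_SIZE, MADV_HUGEPAGE);

	/* From here on every pte in the region is non-writable. */
	if (mprotect(region, PMD_REGION_SIZE, PROT_READ))
		err = errno;
	else if (madvise(region, PMD_REGION_SIZE, MADV_COLLAPSE))
		err = errno;

	munmap(map, len);
	return err;
}
```

On a kernel without this series I would expect the call to report a failure for such a region, and on a kernel with it a return value of 0 becomes possible; I have not verified the exact errno, so the sketch does not assume one.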
* [PATCH v2 1/2] mm: Enable khugepaged anonymous collapse on non-writable regions
From: Dev Jain @ 2025-09-08 7:50 UTC
To: akpm, david, kas, willy, hughd
Cc: ziy, baolin.wang, lorenzo.stoakes, Liam.Howlett, npache,
    ryan.roberts, baohua, richard.weiyang, linux-mm, linux-kernel,
    Dev Jain

Currently khugepaged does not collapse an anonymous region which does not
have a single writable pte. This is wasteful since a region mapped with
non-writable ptes, for example, non-writable VMAs mapped by the
application, won't benefit from THP collapse.

An additional consequence of this constraint is that MADV_COLLAPSE does not
perform a collapse on a non-writable VMA, and this restriction is nowhere
to be found on the manpage - the restriction itself sounds wrong to me
since the user knows the protection of the memory it has mapped, so
collapsing read-only memory via madvise() should be a choice of the
user which shouldn't be overridden by the kernel.

Therefore, remove this restriction by not honouring SCAN_PAGE_RO.

Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Kiryl Shutsemau <kas@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 mm/khugepaged.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 4ec324a4c1fe..a0f1df2a7ae6 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -676,9 +676,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 			writable = true;
 	}
 
-	if (unlikely(!writable)) {
-		result = SCAN_PAGE_RO;
-	} else if (unlikely(cc->is_khugepaged && !referenced)) {
+	if (unlikely(cc->is_khugepaged && !referenced)) {
 		result = SCAN_LACK_REFERENCED_PAGE;
 	} else {
 		result = SCAN_SUCCEED;
@@ -1421,9 +1419,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 		     mmu_notifier_test_young(vma->vm_mm, _address)))
 			referenced++;
 	}
-	if (!writable) {
-		result = SCAN_PAGE_RO;
-	} else if (cc->is_khugepaged &&
+	if (cc->is_khugepaged &&
 		   (!referenced ||
 		    (unmapped && referenced < HPAGE_PMD_NR / 2))) {
 		result = SCAN_LACK_REFERENCED_PAGE;
@@ -2830,7 +2826,6 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
 		case SCAN_PMD_NULL:
 		case SCAN_PTE_NON_PRESENT:
 		case SCAN_PTE_UFFD_WP:
-		case SCAN_PAGE_RO:
 		case SCAN_LACK_REFERENCED_PAGE:
 		case SCAN_PAGE_NULL:
 		case SCAN_PAGE_COUNT:
-- 
2.30.2

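The behavioural change in hpage_collapse_scan_pmd() can be modelled outside the kernel as a pure function over the counters gathered by the pte loop. This is only a sketch of the final check shown in the diff above, assuming every earlier per-pte check has already passed. The type and function names (scan_summary, verdict_before(), verdict_after()) are hypothetical, HPAGE_PMD_NR is assumed to be 512 (4 KiB base pages under a 2 MiB PMD), and the fall-through to success is my reading of the surrounding code rather than something visible in the hunk.

```c
#include <assert.h>
#include <stdbool.h>

#define HPAGE_PMD_NR 512	/* assumption: 4 KiB base pages, 2 MiB PMD */

/* Hypothetical stand-ins for the scan_result values involved. */
enum verdict {
	VERDICT_SUCCEED,
	VERDICT_PAGE_RO,		/* models SCAN_PAGE_RO */
	VERDICT_LACK_REFERENCED_PAGE,	/* models SCAN_LACK_REFERENCED_PAGE */
};

/* What the pte loop has learned about one PMD-sized range. */
struct scan_summary {
	bool is_khugepaged;	/* false when the caller is MADV_COLLAPSE */
	bool writable;		/* at least one writable pte was seen */
	int referenced;		/* number of young/referenced ptes */
	int unmapped;		/* number of swapped-out ptes */
};

/* The reference heuristic, which only applies to khugepaged. */
static bool lacks_references(const struct scan_summary *s)
{
	return s->is_khugepaged &&
	       (!s->referenced ||
		(s->unmapped && s->referenced < HPAGE_PMD_NR / 2));
}

/* Final check before the patch: a fully read-only range is rejected. */
enum verdict verdict_before(const struct scan_summary *s)
{
	assert(s->referenced >= 0 && s->referenced <= HPAGE_PMD_NR);
	if (!s->writable)
		return VERDICT_PAGE_RO;
	return lacks_references(s) ? VERDICT_LACK_REFERENCED_PAGE
				   : VERDICT_SUCCEED;
}

/* Final check after the patch: writability no longer matters. */
enum verdict verdict_after(const struct scan_summary *s)
{
	assert(s->referenced >= 0 && s->referenced <= HPAGE_PMD_NR);
	return lacks_references(s) ? VERDICT_LACK_REFERENCED_PAGE
				   : VERDICT_SUCCEED;
}
```

For a MADV_COLLAPSE request on a range with no writable pte, verdict_before() yields VERDICT_PAGE_RO while verdict_after() yields VERDICT_SUCCEED; the khugepaged reference heuristic is unchanged in both.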
* Re: [PATCH v2 1/2] mm: Enable khugepaged anonymous collapse on non-writable regions
From: Zach O'Keefe @ 2025-09-09 18:49 UTC
To: Dev Jain
Cc: akpm, david, kas, willy, hughd, ziy, baolin.wang, lorenzo.stoakes,
    Liam.Howlett, npache, ryan.roberts, baohua, richard.weiyang,
    linux-mm, linux-kernel

On Mon, Sep 8, 2025 at 12:51 AM Dev Jain <dev.jain@arm.com> wrote:
>
> Currently khugepaged does not collapse an anonymous region which does not
> have a single writable pte. This is wasteful since a region mapped with
> non-writable ptes, for example, non-writable VMAs mapped by the
> application, won't benefit from THP collapse.
>
> An additional consequence of this constraint is that MADV_COLLAPSE does not
> perform a collapse on a non-writable VMA, and this restriction is nowhere
> to be found on the manpage - the restriction itself sounds wrong to me
> since the user knows the protection of the memory it has mapped, so
> collapsing read-only memory via madvise() should be a choice of the
> user which shouldn't be overridden by the kernel.

Sorry, late to the party. Certainly agree wrt MADV_COLLAPSE. Ditto for
khugepaged as well. The check was added when support for non-writable
pages was added to khugepaged, while retaining the heuristic that at least
one pte should be writable: 10359213d05a ("mm: incorporate read-only pages
into transparent huge pages"), which predates max_ptes_swap.

> Therefore, remove this restriction by not honouring SCAN_PAGE_RO.
>
> Acked-by: David Hildenbrand <david@redhat.com>
> Acked-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
> Reviewed-by: Kiryl Shutsemau <kas@kernel.org>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

Reviewed-by: Zach O'Keefe <zokeefe@google.com>

> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---
>  mm/khugepaged.c | 9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)

* Re: [PATCH v2 1/2] mm: Enable khugepaged anonymous collapse on non-writable regions
From: Anshuman Khandual @ 2025-09-10 4:03 UTC
To: Dev Jain, akpm, david, kas, willy, hughd
Cc: ziy, baolin.wang, lorenzo.stoakes, Liam.Howlett, npache,
    ryan.roberts, baohua, richard.weiyang, linux-mm, linux-kernel

On 08/09/25 1:20 PM, Dev Jain wrote:
> Currently khugepaged does not collapse an anonymous region which does not
> have a single writable pte. This is wasteful since a region mapped with
> non-writable ptes, for example, non-writable VMAs mapped by the
> application, won't benefit from THP collapse.
>
> An additional consequence of this constraint is that MADV_COLLAPSE does not
> perform a collapse on a non-writable VMA, and this restriction is nowhere
> to be found on the manpage - the restriction itself sounds wrong to me
> since the user knows the protection of the memory it has mapped, so
> collapsing read-only memory via madvise() should be a choice of the
> user which shouldn't be overridden by the kernel.

Agreed. Dropping this constraint makes sense for both MADV_COLLAPSE and
khugepaged-based collapse.

>
> Therefore, remove this restriction by not honouring SCAN_PAGE_RO.
>
> Acked-by: David Hildenbrand <david@redhat.com>
> Acked-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
> Reviewed-by: Kiryl Shutsemau <kas@kernel.org>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

>  mm/khugepaged.c | 9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)

* [PATCH v2 2/2] mm: Drop all references of writable and SCAN_PAGE_RO
From: Dev Jain @ 2025-09-08 7:50 UTC
To: akpm, david, kas, willy, hughd
Cc: ziy, baolin.wang, lorenzo.stoakes, Liam.Howlett, npache,
    ryan.roberts, baohua, richard.weiyang, linux-mm, linux-kernel,
    Dev Jain

Now that all actionable outcomes from checking pte_write() are gone,
drop the related references.

Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Kiryl Shutsemau <kas@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 include/trace/events/huge_memory.h | 19 ++++++-------------
 mm/khugepaged.c                    | 14 +++-----------
 2 files changed, 9 insertions(+), 24 deletions(-)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index 2305df6cb485..dd94d14a2427 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -19,7 +19,6 @@
 	EM( SCAN_PTE_NON_PRESENT,	"pte_non_present")		\
 	EM( SCAN_PTE_UFFD_WP,		"pte_uffd_wp")			\
 	EM( SCAN_PTE_MAPPED_HUGEPAGE,	"pte_mapped_hugepage")		\
-	EM( SCAN_PAGE_RO,		"no_writable_page")		\
 	EM( SCAN_LACK_REFERENCED_PAGE,	"lack_referenced_page")		\
 	EM( SCAN_PAGE_NULL,		"page_null")			\
 	EM( SCAN_SCAN_ABORT,		"scan_aborted")			\
@@ -55,15 +54,14 @@ SCAN_STATUS
 
 TRACE_EVENT(mm_khugepaged_scan_pmd,
 
-	TP_PROTO(struct mm_struct *mm, struct folio *folio, bool writable,
+	TP_PROTO(struct mm_struct *mm, struct folio *folio,
 		 int referenced, int none_or_zero, int status, int unmapped),
 
-	TP_ARGS(mm, folio, writable, referenced, none_or_zero, status, unmapped),
+	TP_ARGS(mm, folio, referenced, none_or_zero, status, unmapped),
 
 	TP_STRUCT__entry(
 		__field(struct mm_struct *, mm)
 		__field(unsigned long, pfn)
-		__field(bool, writable)
 		__field(int, referenced)
 		__field(int, none_or_zero)
 		__field(int, status)
@@ -73,17 +71,15 @@ TRACE_EVENT(mm_khugepaged_scan_pmd,
 	TP_fast_assign(
 		__entry->mm = mm;
 		__entry->pfn = folio ? folio_pfn(folio) : -1;
-		__entry->writable = writable;
 		__entry->referenced = referenced;
 		__entry->none_or_zero = none_or_zero;
 		__entry->status = status;
 		__entry->unmapped = unmapped;
 	),
 
-	TP_printk("mm=%p, scan_pfn=0x%lx, writable=%d, referenced=%d, none_or_zero=%d, status=%s, unmapped=%d",
+	TP_printk("mm=%p, scan_pfn=0x%lx, referenced=%d, none_or_zero=%d, status=%s, unmapped=%d",
 		__entry->mm,
 		__entry->pfn,
-		__entry->writable,
 		__entry->referenced,
 		__entry->none_or_zero,
 		__print_symbolic(__entry->status, SCAN_STATUS),
@@ -117,15 +113,14 @@ TRACE_EVENT(mm_collapse_huge_page,
 TRACE_EVENT(mm_collapse_huge_page_isolate,
 
 	TP_PROTO(struct folio *folio, int none_or_zero,
-		 int referenced, bool writable, int status),
+		 int referenced, int status),
 
-	TP_ARGS(folio, none_or_zero, referenced, writable, status),
+	TP_ARGS(folio, none_or_zero, referenced, status),
 
 	TP_STRUCT__entry(
 		__field(unsigned long, pfn)
 		__field(int, none_or_zero)
 		__field(int, referenced)
-		__field(bool, writable)
 		__field(int, status)
 	),
 
@@ -133,15 +128,13 @@ TRACE_EVENT(mm_collapse_huge_page_isolate,
 		__entry->pfn = folio ? folio_pfn(folio) : -1;
 		__entry->none_or_zero = none_or_zero;
 		__entry->referenced = referenced;
-		__entry->writable = writable;
 		__entry->status = status;
 	),
 
-	TP_printk("scan_pfn=0x%lx, none_or_zero=%d, referenced=%d, writable=%d, status=%s",
+	TP_printk("scan_pfn=0x%lx, none_or_zero=%d, referenced=%d, status=%s",
 		__entry->pfn,
 		__entry->none_or_zero,
 		__entry->referenced,
-		__entry->writable,
 		__print_symbolic(__entry->status, SCAN_STATUS))
 );
 
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a0f1df2a7ae6..af5f5c80fe4e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -39,7 +39,6 @@ enum scan_result {
 	SCAN_PTE_NON_PRESENT,
 	SCAN_PTE_UFFD_WP,
 	SCAN_PTE_MAPPED_HUGEPAGE,
-	SCAN_PAGE_RO,
 	SCAN_LACK_REFERENCED_PAGE,
 	SCAN_PAGE_NULL,
 	SCAN_SCAN_ABORT,
@@ -557,7 +556,6 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 	struct folio *folio = NULL;
 	pte_t *_pte;
 	int none_or_zero = 0, shared = 0, result = SCAN_FAIL, referenced = 0;
-	bool writable = false;
 
 	for (_pte = pte; _pte < pte + HPAGE_PMD_NR;
 	     _pte++, address += PAGE_SIZE) {
@@ -671,9 +669,6 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 		     folio_test_referenced(folio) || mmu_notifier_test_young(vma->vm_mm,
 								     address)))
 			referenced++;
-
-		if (pte_write(pteval))
-			writable = true;
 	}
 
 	if (unlikely(cc->is_khugepaged && !referenced)) {
@@ -681,13 +676,13 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 	} else {
 		result = SCAN_SUCCEED;
 		trace_mm_collapse_huge_page_isolate(folio, none_or_zero,
-						    referenced, writable, result);
+						    referenced, result);
 		return result;
 	}
 out:
 	release_pte_pages(pte, _pte, compound_pagelist);
 	trace_mm_collapse_huge_page_isolate(folio, none_or_zero,
-					    referenced, writable, result);
+					    referenced, result);
 	return result;
 }
 
@@ -1280,7 +1275,6 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 	unsigned long _address;
 	spinlock_t *ptl;
 	int node = NUMA_NO_NODE, unmapped = 0;
-	bool writable = false;
 
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 
@@ -1344,8 +1338,6 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 			result = SCAN_PTE_UFFD_WP;
 			goto out_unmap;
 		}
-		if (pte_write(pteval))
-			writable = true;
 
 		page = vm_normal_page(vma, _address, pteval);
 		if (unlikely(!page) || unlikely(is_zone_device_page(page))) {
@@ -1435,7 +1427,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 		*mmap_locked = false;
 	}
 out:
-	trace_mm_khugepaged_scan_pmd(mm, folio, writable, referenced,
+	trace_mm_khugepaged_scan_pmd(mm, folio, referenced,
 				     none_or_zero, result, unmapped);
 	return result;
 }
-- 
2.30.2

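Since this patch removes the writable= field from the mm_khugepaged_scan_pmd TP_printk() string, tooling that parses the text output may want to accept both layouts. The following is a rough sketch of such a parser, not something proposed in the series: the struct and function names are hypothetical, and it assumes the input contains the TP_printk() payload exactly as formatted in the diff above (whatever prefix the tracer prepends is skipped by searching for "scan_pfn=").

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for one parsed mm_khugepaged_scan_pmd line. */
struct scan_pmd_event {
	unsigned long pfn;
	int has_writable;	/* 1 if the line still carried writable= */
	int writable;		/* only meaningful when has_writable is 1 */
	int referenced;
	int none_or_zero;
	char status[32];
	int unmapped;
};

/* Returns 0 on success, -1 if the line matches neither layout. */
int parse_scan_pmd_event(const char *line, struct scan_pmd_event *ev)
{
	const char *p;

	assert(line && ev);
	memset(ev, 0, sizeof(*ev));

	p = strstr(line, "scan_pfn=");
	if (!p)
		return -1;

	/* Layout before this patch: writable= follows scan_pfn=. */
	if (sscanf(p, "scan_pfn=0x%lx, writable=%d, referenced=%d, "
		      "none_or_zero=%d, status=%31[^,], unmapped=%d",
		   &ev->pfn, &ev->writable, &ev->referenced,
		   &ev->none_or_zero, ev->status, &ev->unmapped) == 6) {
		ev->has_writable = 1;
		return 0;
	}

	/* Layout after this patch: the writable= field is gone. */
	ev->writable = 0;
	if (sscanf(p, "scan_pfn=0x%lx, referenced=%d, "
		      "none_or_zero=%d, status=%31[^,], unmapped=%d",
		   &ev->pfn, &ev->referenced,
		   &ev->none_or_zero, ev->status, &ev->unmapped) == 5)
		return 0;

	return -1;
}
```

Trying the older layout first keeps the parser usable on traces captured from kernels on either side of the change; has_writable tells the caller which layout was seen.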
* Re: [PATCH v2 2/2] mm: Drop all references of writable and SCAN_PAGE_RO
From: Zach O'Keefe @ 2025-09-09 18:51 UTC
To: Dev Jain
Cc: akpm, david, kas, willy, hughd, ziy, baolin.wang, lorenzo.stoakes,
    Liam.Howlett, npache, ryan.roberts, baohua, richard.weiyang,
    linux-mm, linux-kernel

Thanks, Dev.

On Mon, Sep 8, 2025 at 12:51 AM Dev Jain <dev.jain@arm.com> wrote:
>
> Now that all actionable outcomes from checking pte_write() are gone,
> drop the related references.
>
> Acked-by: David Hildenbrand <david@redhat.com>
> Acked-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: Kiryl Shutsemau <kas@kernel.org>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

Reviewed-by: Zach O'Keefe <zokeefe@google.com>

> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---
>  include/trace/events/huge_memory.h | 19 ++++++-------------
>  mm/khugepaged.c                    | 14 +++-----------
>  2 files changed, 9 insertions(+), 24 deletions(-)

* Re: [PATCH v2 2/2] mm: Drop all references of writable and SCAN_PAGE_RO
From: Anshuman Khandual @ 2025-09-10 4:06 UTC
To: Dev Jain, akpm, david, kas, willy, hughd
Cc: ziy, baolin.wang, lorenzo.stoakes, Liam.Howlett, npache,
    ryan.roberts, baohua, richard.weiyang, linux-mm, linux-kernel

On 08/09/25 1:20 PM, Dev Jain wrote:
> Now that all actionable outcomes from checking pte_write() are gone,
> drop the related references.
>
> Acked-by: David Hildenbrand <david@redhat.com>
> Acked-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: Kiryl Shutsemau <kas@kernel.org>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> Signed-off-by: Dev Jain <dev.jain@arm.com>

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

> ---
>  include/trace/events/huge_memory.h | 19 ++++++-------------
>  mm/khugepaged.c                    | 14 +++-----------
>  2 files changed, 9 insertions(+), 24 deletions(-)
