From: Dev Jain <dev.jain@arm.com>
To: Wei Yang <richard.weiyang@gmail.com>
Cc: akpm@linux-foundation.org, david@redhat.com, kas@kernel.org,
willy@infradead.org, hughd@google.com, ziy@nvidia.com,
baolin.wang@linux.alibaba.com, lorenzo.stoakes@oracle.com,
Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
baohua@kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] mm: Enable khugepaged to operate on non-writable VMAs
Date: Wed, 3 Sep 2025 14:36:11 +0530
Message-ID: <052c867a-963c-4a5e-88f8-0b2d87d40f14@arm.com>
In-Reply-To: <20250903080839.wuivg2u7smyuxo5e@master>

On 03/09/25 1:38 pm, Wei Yang wrote:
> On Wed, Sep 03, 2025 at 11:16:34AM +0530, Dev Jain wrote:
>> Currently khugepaged does not collapse a region which does not have a
>> single writable page. This is wasteful since non-writable VMAs mapped by
>> the application won't benefit from THP collapse. Therefore, remove this
>> restriction and allow khugepaged to collapse a VMA with arbitrary
>> protections.
>>
>> Along with this, currently MADV_COLLAPSE does not perform a collapse on a
>> non-writable VMA, and this restriction is nowhere to be found on the
>> manpage - the restriction itself sounds wrong to me since the user knows
>> the protection of the memory it has mapped, so collapsing read-only
>> memory via madvise() should be a choice of the user which shouldn't
>> be overridden by the kernel.
>>
>> On an arm64 machine, an average of 5% improvement is seen on some mmtests
>> benchmarks, particularly hackbench, with a maximum improvement of 12%.
>>
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
> [...]
>> mm/khugepaged.c | 9 ++-------
>> 1 file changed, 2 insertions(+), 7 deletions(-)
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 4ec324a4c1fe..a0f1df2a7ae6 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -676,9 +676,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>> writable = true;
>> }
>>
>> - if (unlikely(!writable)) {
>> - result = SCAN_PAGE_RO;
>> - } else if (unlikely(cc->is_khugepaged && !referenced)) {
> Would this cause more memory usage in system?
>
> For example, an application might fork itself many times. Its executable area
> is read-only, so all of the processes share one copy in memory.
>
> Now we may collapse the range and create one copy for each process.

I forgot to say "anonymous VMAs" in the patch description. In the case you
describe, the VMA will be a shmem or file VMA, which this patch does not
affect.

Andrew, could you please change the first line of the patch description from
"Currently khugepaged does not collapse a region" to
"Currently khugepaged does not collapse an anonymous region"?

Thanks.

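
To make the anonymous case concrete, here is a minimal userspace sketch of
the MADV_COLLAPSE path the patch description talks about: map anonymous
memory, fault it in, make it read-only, then ask for a collapse. The helper
names and ASSUMED_PMD_SIZE (2 MiB, i.e. 4 KiB base pages) are assumptions for
illustration only, and the errno reported for a read-only range depends on
the running kernel, so the sketch just passes it back.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25	/* value from the Linux uapi headers (6.1+) */
#endif

/* Assumption: 2 MiB PMD size (4 KiB base pages); other configs differ. */
#define ASSUMED_PMD_SIZE (2UL << 20)

/* Round addr up to a multiple of align (align must be a power of two). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Map anonymous memory, populate one PMD-sized range, drop write
 * permission, then request a collapse of that range.
 * Returns 0 on success or a negative errno from the failing call.
 */
static int try_collapse_readonly_anon(void)
{
	size_t len = 2 * ASSUMED_PMD_SIZE;	/* slack so we can align */
	int ret = 0;
	char *map, *aligned;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -errno;

	aligned = (char *)align_up((uintptr_t)map, ASSUMED_PMD_SIZE);
	memset(aligned, 1, ASSUMED_PMD_SIZE);	/* fault the pages in */

	if (mprotect(aligned, ASSUMED_PMD_SIZE, PROT_READ) != 0 ||
	    madvise(aligned, ASSUMED_PMD_SIZE, MADV_COLLAPSE) != 0)
		ret = -errno;

	munmap(map, len);
	return ret;
}
```

Calling try_collapse_readonly_anon() on kernels with and without the patch
is one way to compare the behaviour; kernels that predate MADV_COLLAPSE
reject the advice value altogether.
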
>
> OK, we have max_ptes_shared, but if some PTEs are none, could the collapse
> still happen?
>
> Maybe this is not realistic, just curious.
>
>> + if (unlikely(cc->is_khugepaged && !referenced)) {
>> result = SCAN_LACK_REFERENCED_PAGE;
>> } else {
>> result = SCAN_SUCCEED;
>> @@ -1421,9 +1419,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
>> mmu_notifier_test_young(vma->vm_mm, _address)))
>> referenced++;
>> }
>> - if (!writable) {
>> - result = SCAN_PAGE_RO;
>> - } else if (cc->is_khugepaged &&
>> + if (cc->is_khugepaged &&
>> (!referenced ||
>> (unmapped && referenced < HPAGE_PMD_NR / 2))) {
>> result = SCAN_LACK_REFERENCED_PAGE;
>> @@ -2830,7 +2826,6 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
>> case SCAN_PMD_NULL:
>> case SCAN_PTE_NON_PRESENT:
>> case SCAN_PTE_UFFD_WP:
>> - case SCAN_PAGE_RO:
>> case SCAN_LACK_REFERENCED_PAGE:
>> case SCAN_PAGE_NULL:
>> case SCAN_PAGE_COUNT:
>> --
>> 2.30.2
>>
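
For readers following the hpage_collapse_scan_pmd() hunk quoted above, the
change to the final result selection can be modelled in plain C. This is only
a sketch of the quoted lines, not the kernel function: the enum, the function
names and MODEL_HPAGE_PMD_NR (512, assuming 4 KiB pages and a 2 MiB PMD) are
stand-ins invented for illustration, and the real function has earlier exits
that are not modelled here.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's scan results; not the real enum. */
enum scan_model_result {
	MODEL_SUCCEED,
	MODEL_PAGE_RO,
	MODEL_LACK_REFERENCED_PAGE,
};

/* Assumption: 512 PTEs per PMD (4 KiB base pages, 2 MiB PMD). */
#define MODEL_HPAGE_PMD_NR 512

/* The reference check that both versions of the code share. */
static bool lacks_references(bool is_khugepaged, int referenced, int unmapped)
{
	return is_khugepaged &&
	       (!referenced ||
		(unmapped && referenced < MODEL_HPAGE_PMD_NR / 2));
}

/* Before the patch: a range with no writable PTE is rejected first. */
static enum scan_model_result
scan_result_before(bool writable, bool is_khugepaged, int referenced,
		   int unmapped)
{
	if (!writable)
		return MODEL_PAGE_RO;
	if (lacks_references(is_khugepaged, referenced, unmapped))
		return MODEL_LACK_REFERENCED_PAGE;
	return MODEL_SUCCEED;
}

/* After the patch: writability no longer takes part in the decision. */
static enum scan_model_result
scan_result_after(bool is_khugepaged, int referenced, int unmapped)
{
	if (lacks_references(is_khugepaged, referenced, unmapped))
		return MODEL_LACK_REFERENCED_PAGE;
	return MODEL_SUCCEED;
}
```

In this model a fully referenced read-only range yields MODEL_PAGE_RO from
scan_result_before() and MODEL_SUCCEED from scan_result_after(), while the
referenced-page check for khugepaged behaves the same in both.
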