From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Dev Jain <dev.jain@arm.com>
Cc: akpm@linux-foundation.org, ryan.roberts@arm.com,
	david@redhat.com, willy@infradead.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, catalin.marinas@arm.com,
	will@kernel.org, Liam.Howlett@oracle.com, vbabka@suse.cz,
	jannh@google.com, anshuman.khandual@arm.com, peterx@redhat.com,
	joey.gouly@arm.com, ioworker0@gmail.com, baohua@kernel.org,
	kevin.brodsky@arm.com, quic_zhenhuah@quicinc.com,
	christophe.leroy@csgroup.eu, yangyicong@hisilicon.com,
	linux-arm-kernel@lists.infradead.org, hughd@google.com,
	yang@os.amperecomputing.com, ziy@nvidia.com
Subject: Re: [PATCH v5 0/7] Optimize mprotect() for large folios
Date: Fri, 18 Jul 2025 19:53:11 +0100
Message-ID: <7d21fff7-bf2b-4362-b2cf-0cd92fe0cf7c@lucifer.local>
In-Reply-To: <fdd6203c-dd9b-4c33-98d7-255f97973ad2@arm.com>

On Fri, Jul 18, 2025 at 03:20:16PM +0530, Dev Jain wrote:
>
> On 18/07/25 2:32 pm, Dev Jain wrote:
> > Use folio_pte_batch() to optimize change_pte_range(). On arm64, if the ptes
> > are painted with the contig bit, then ptep_get() will iterate through all
> > 16 entries to collect a/d bits. Hence this optimization will result in
> > a 16x reduction in the number of ptep_get() calls. Next,
> > ptep_modify_prot_start() will eventually call contpte_try_unfold() on
> > every contig block, thus flushing the TLB for the complete large folio
> > range. Instead, use get_and_clear_full_ptes() so as to elide TLBIs on
> > each contig block, and only do them on the starting and ending
> > contig block.
> >
> > For split folios, there will be no pte batching; the batch size returned
> > by folio_pte_batch() will be 1. For pagetable split folios, the ptes will
> > still point to the same large folio; for arm64, this results in the
> > optimization described above, and for other arches, a minor improvement
> > is expected due to a reduction in the number of function calls.
> >
> > mm-selftests pass on arm64. I have some failing tests on my x86 VM already;
> > no new tests fail as a result of this patchset.
> >
> > We use the following test cases to measure performance, mprotect()'ing
> > the mapped memory to read-only then read-write 40 times:
> >
> > Test case 1: Mapping 1G of memory, touching it to get PMD-THPs, then
> > pte-mapping those THPs
> > Test case 2: Mapping 1G of memory with 64K mTHPs
> > Test case 3: Mapping 1G of memory with 4K pages
> >
> > Average execution time on arm64, Apple M3:
> > Before the patchset:
> > T1: 2.1 seconds   T2: 2 seconds   T3: 1 second
> >
> > After the patchset:
> > T1: 0.65 seconds   T2: 0.7 seconds   T3: 1.1 seconds
> >
>
> For the record: the numbers differ from those in the previous versions.
> I must have run the test for more iterations back then and later pasted
> the test program here with 40 iterations, hence the mismatch.
>

Thanks for this clarification!
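
To make the quoted cover letter's reasoning concrete, here is a minimal
counting model in C. It is not kernel code: the helper names and the
CONT_PTES constant are hypothetical, and the counts simply restate what
the cover letter claims (16 entries per contig block, one ptep_get()
call per block instead of per PTE, and unfold-related TLB flushes only on
the first and last contig block of a batch).

```c
#include <stddef.h>

/* Assumption taken from the cover letter: 16 PTEs per contig block. */
#define CONT_PTES 16

/* Hypothetical model: the unbatched path issues one ptep_get() per PTE. */
static size_t ptep_get_calls_per_pte(size_t nr_ptes)
{
	return nr_ptes;
}

/* Hypothetical model: the batched path issues one call per contig block. */
static size_t ptep_get_calls_batched(size_t nr_ptes)
{
	return (nr_ptes + CONT_PTES - 1) / CONT_PTES;
}

/* Hypothetical model: the unbatched path unfolds every contig block. */
static size_t unfolds_per_block(size_t nr_blocks)
{
	return nr_blocks;
}

/*
 * Hypothetical model: the batched path touches at most the starting and
 * ending contig block, however many blocks lie in between.
 */
static size_t unfolds_batched(size_t nr_blocks)
{
	return nr_blocks < 2 ? nr_blocks : 2;
}
```

Under these assumptions, a PTE-mapped range of 512 PTEs (32 contig
blocks) goes from 512 modelled ptep_get() calls to 32, and from 32
modelled unfolds to at most 2.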

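The measurement described in the quoted cover letter (map memory, touch
it, then mprotect() it read-only and back to read-write 40 times) can be
sketched roughly as below. This is not the author's test program, which
is not included in this message; time_mprotect_toggles() is a
hypothetical helper, and whether the mapping ends up backed by PMD-THPs,
64K mTHPs or 4K pages depends on the system's transparent hugepage
settings, which this sketch does not configure.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

/*
 * Hypothetical helper: time `iters` read-only/read-write round trips over
 * an anonymous private mapping of `len` bytes.
 * Returns elapsed seconds, or -1.0 if mmap() or mprotect() fails.
 */
static double time_mprotect_toggles(size_t len, int iters)
{
	struct timespec start, end;
	char *buf;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1.0;

	/* Touch the whole range so that every page is faulted in. */
	memset(buf, 1, len);

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (int i = 0; i < iters; i++) {
		/* One round trip: drop write permission, then restore it. */
		if (mprotect(buf, len, PROT_READ) != 0 ||
		    mprotect(buf, len, PROT_READ | PROT_WRITE) != 0) {
			munmap(buf, len);
			return -1.0;
		}
	}
	clock_gettime(CLOCK_MONOTONIC, &end);

	munmap(buf, len);
	return (double)(end.tv_sec - start.tv_sec) +
	       (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling time_mprotect_toggles(1UL << 30, 40) from main() and printing the
result would correspond to the 1G, 40-iteration setup quoted above; the
absolute numbers will vary with hardware and kernel configuration.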


Thread overview: 54+ messages
2025-07-18  9:02 Dev Jain
2025-07-18  9:02 ` [PATCH v5 1/7] mm: Refactor MM_CP_PROT_NUMA skipping case into new function Dev Jain
2025-07-18 16:19   ` Lorenzo Stoakes
2025-07-20 23:44   ` Barry Song
2025-07-21  3:44     ` Dev Jain
2025-07-22 11:05       ` Dev Jain
2025-07-22 11:25   ` Ryan Roberts
2025-07-23 13:57   ` Zi Yan
2025-07-18  9:02 ` [PATCH v5 2/7] mm: Optimize mprotect() for MM_CP_PROT_NUMA by batch-skipping PTEs Dev Jain
2025-07-18 16:40   ` Lorenzo Stoakes
2025-07-22 11:26   ` Ryan Roberts
2025-07-23 14:25   ` Zi Yan
2025-07-18  9:02 ` [PATCH v5 3/7] mm: Add batched versions of ptep_modify_prot_start/commit Dev Jain
2025-07-18 17:05   ` Lorenzo Stoakes
2025-07-20 23:59   ` Barry Song
2025-07-22 11:35   ` Ryan Roberts
2025-07-23 15:09   ` Zi Yan
2025-07-18  9:02 ` [PATCH v5 4/7] mm: Introduce FPB_RESPECT_WRITE for PTE batching infrastructure Dev Jain
2025-07-18 17:12   ` Lorenzo Stoakes
2025-07-22 11:37   ` Ryan Roberts
2025-07-23 15:28   ` Zi Yan
2025-07-23 15:32     ` Dev Jain
2025-07-18  9:02 ` [PATCH v5 5/7] mm: Split can_change_pte_writable() into private and shared parts Dev Jain
2025-07-18 17:27   ` Lorenzo Stoakes
2025-07-23 15:40   ` Zi Yan
2025-07-18  9:02 ` [PATCH v5 6/7] mm: Optimize mprotect() by PTE batching Dev Jain
2025-07-18 18:49   ` Lorenzo Stoakes
2025-07-19 13:46     ` Dev Jain
2025-07-20 11:20       ` Lorenzo Stoakes
2025-07-20 14:39         ` Dev Jain
2025-07-24 19:55   ` Zi Yan
2025-08-06  8:08   ` David Hildenbrand
2025-08-06  8:12     ` David Hildenbrand
2025-08-06  8:15     ` Will Deacon
2025-08-06  8:19       ` David Hildenbrand
2025-08-06  8:53     ` Dev Jain
2025-08-06  8:56       ` David Hildenbrand
2025-08-06  9:12     ` Lorenzo Stoakes
2025-08-06  9:21       ` David Hildenbrand
2025-08-06  9:37         ` Dev Jain
2025-08-06  9:50           ` Lorenzo Stoakes
2025-08-06 10:04             ` Dev Jain
2025-08-06 10:07               ` Dev Jain
2025-08-06 10:12               ` David Hildenbrand
2025-08-06 10:11             ` David Hildenbrand
2025-08-06 10:20               ` Dev Jain
2025-08-06 10:28                 ` David Hildenbrand
2025-08-06 10:45                 ` Lorenzo Stoakes
2025-08-06 10:45               ` Lorenzo Stoakes
2025-07-18  9:02 ` [PATCH v5 7/7] arm64: Add batched versions of ptep_modify_prot_start/commit Dev Jain
2025-07-18 18:50   ` Lorenzo Stoakes
2025-07-21 15:57   ` Catalin Marinas
2025-07-18  9:50 ` [PATCH v5 0/7] Optimize mprotect() for large folios Dev Jain
2025-07-18 18:53   ` Lorenzo Stoakes [this message]
