From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Dev Jain <dev.jain@arm.com>
Cc: akpm@linux-foundation.org, ryan.roberts@arm.com,
	david@redhat.com, willy@infradead.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, catalin.marinas@arm.com,
	will@kernel.org, Liam.Howlett@oracle.com, vbabka@suse.cz,
	jannh@google.com, anshuman.khandual@arm.com, peterx@redhat.com,
	joey.gouly@arm.com, ioworker0@gmail.com, baohua@kernel.org,
	kevin.brodsky@arm.com, quic_zhenhuah@quicinc.com,
	christophe.leroy@csgroup.eu, yangyicong@hisilicon.com,
	linux-arm-kernel@lists.infradead.org, hughd@google.com,
	yang@os.amperecomputing.com, ziy@nvidia.com
Subject: Re: [PATCH v4 0/4] Optimize mprotect() for large folios
Date: Mon, 30 Jun 2025 12:17:19 +0100
Message-ID: <11939364-5488-4067-885b-aabd76fee46e@lucifer.local>
In-Reply-To: <20250628113435.46678-1-dev.jain@arm.com>

On Sat, Jun 28, 2025 at 05:04:31PM +0530, Dev Jain wrote:
> This patchset optimizes the mprotect() system call for large folios
> by PTE-batching. No issues were observed with mm-selftests; the series
> is build tested on x86_64.

This should also be runtime tested on x86-64, not only build tested :)

You are still not really giving details here, so the same comment applies as
for your mremap() series: please explain why you're doing this, what it's for,
what benefits you expect to achieve, where, etc.

E.g. 'this is designed to optimise mTHP cases on arm64, we expect to see
benefits on amd64 also, and for intel there should be no impact'.

It's probably also worth actually going and checking to make sure that this is
the case re: other arches. See below on that...
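
A quick way to get a rough idea is a git grep for ptep_modify_prot_start
under arch/, to see which architectures carry their own implementations of
these helpers.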

>
> We use the following test cases to measure performance, mprotect()'ing
> the mapped memory to read-only then read-write 40 times:
>
> Test case 1: Mapping 1G of memory, touching it to get PMD-THPs, then
> pte-mapping those THPs
> Test case 2: Mapping 1G of memory with 64K mTHPs
> Test case 3: Mapping 1G of memory with 4K pages
>
> Average execution time on arm64, Apple M3:
> Before the patchset:
> T1: 7.9 seconds   T2: 7.9 seconds   T3: 4.2 seconds
>
> After the patchset:
> T1: 2.1 seconds   T2: 2.2 seconds   T3: 4.3 seconds
>
> Comparing T1/T2 with T3 before the patchset, we also remove the regression
> introduced by ptep_get() on a contpte block. And, for large folios we get
> an almost 74% performance improvement, albeit with the trade-off of a
> slight degradation in the small folio case.
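
FWIW, for T1 that works out to (7.9 - 2.1) / 7.9, i.e. roughly 73%, so the
'almost 74%' figure checks out.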

This is nice, though order-0 is probably going to be your bread and butter, no?

Having said that, mprotect() is not a hot path, this delta is small enough
that it could well just be noise, and personally I'm not all that bothered.

But please also run this same test on x86-64 and get some before/after
numbers, just to confirm there is no major impact.
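
If it helps, below is a rough sketch of timing the existing loop in-program
with clock_gettime(), so the before/after numbers are directly comparable.
This is purely illustrative and not part of your original test, and the
function name is made up:

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <sys/mman.h>

/* Seconds taken by 40 read-only/read-write mprotect() round trips. */
static double time_test_loop(char *p, size_t size)
{
	struct timespec start, end;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (int loops = 0; loops < 40; loops++) {
		if (mprotect(p, size, PROT_READ))
			perror("mprotect"), exit(1);
		if (mprotect(p, size, PROT_READ | PROT_WRITE))
			perror("mprotect"), exit(1);
		explicit_bzero(p, size);
	}
	clock_gettime(CLOCK_MONOTONIC, &end);

	return (end.tv_sec - start.tv_sec) +
	       (end.tv_nsec - start.tv_nsec) / 1e9;
}

Dropping something like printf("%.2f s\n", time_test_loop(p, SIZE)) into
main() below, in place of the existing loop, would give numbers that can be
compared like-for-like on arm64 and x86-64.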

Thanks for including code.

>
> Here is the test program:
>
>  #define _GNU_SOURCE
>  #include <sys/mman.h>
>  #include <stdlib.h>
>  #include <string.h>
>  #include <stdio.h>
>  #include <unistd.h>
>
>  #define SIZE (1024*1024*1024)
>
> unsigned long pmdsize = (1UL << 21);
> unsigned long pagesize = (1UL << 12);
>
> static void pte_map_thps(char *mem, size_t size)
> {
> 	size_t offs;
> 	int ret = 0;
>
> 	/* PTE-map each THP by temporarily splitting the VMAs. */
> 	for (offs = 0; offs < size; offs += pmdsize) {
> 		ret |= madvise(mem + offs, pagesize, MADV_DONTFORK);
> 		ret |= madvise(mem + offs, pagesize, MADV_DOFORK);
> 	}
>
> 	if (ret) {
> 		fprintf(stderr, "ERROR: madvise() failed\n");
> 		exit(1);
> 	}
> }
>
> int main(int argc, char *argv[])
> {
> 	char *p;
> 	p = mmap((void *)(1UL << 30), SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> 	if (p != (void *)(1UL << 30)) {
> 		perror("mmap");
> 		return 1;
> 	}
>
> 	/* Touch the memory so it gets populated (with PMD-THPs if THP is enabled). */
> 	memset(p, 0, SIZE);
> 	/* Disallow THP allocation/collapse for this range from here on. */
> 	if (madvise(p, SIZE, MADV_NOHUGEPAGE))
> 		perror("madvise");
> 	explicit_bzero(p, SIZE);
> 	pte_map_thps(p, SIZE);
>
> 	for (int loops = 0; loops < 40; loops++) {
> 		if (mprotect(p, SIZE, PROT_READ))
> 			perror("mprotect"), exit(1);
> 		if (mprotect(p, SIZE, PROT_READ|PROT_WRITE))
> 			perror("mprotect"), exit(1);
> 		explicit_bzero(p, SIZE);
> 	}
> }
>
> ---
> The patchset is rebased onto Saturday's mm-new.
>
> v3->v4:
>  - Refactor skipping logic into a new function, edit patch 1 subject
>    to highlight it is only for MM_CP_PROT_NUMA case (David H)
>  - Refactor the optimization logic, add more documentation to the generic
>    batched functions, do not add clear_flush_ptes, squash patch 4
>    and 5 (Ryan)
>
> v2->v3:
>  - Add comments for the new APIs (Ryan, Lorenzo)
>  - Instead of refactoring, use a "skip_batch" label
>  - Move arm64 patches at the end (Ryan)
>  - In can_change_pte_writable(), check AnonExclusive page-by-page (David H)
>  - Resolve implicit declaration; tested build on x86 (Lance Yang)
>
> v1->v2:
>  - Rebase onto mm-unstable (6ebffe676fcf: util_macros.h: make the header more resilient)
>  - Abridge the anon-exclusive condition (Lance Yang)
>
> Dev Jain (4):
>   mm: Optimize mprotect() for MM_CP_PROT_NUMA by batch-skipping PTEs
>   mm: Add batched versions of ptep_modify_prot_start/commit
>   mm: Optimize mprotect() by PTE-batching
>   arm64: Add batched versions of ptep_modify_prot_start/commit
>
>  arch/arm64/include/asm/pgtable.h |  10 ++
>  arch/arm64/mm/mmu.c              |  28 +++-
>  include/linux/pgtable.h          |  83 +++++++++-
>  mm/mprotect.c                    | 269 +++++++++++++++++++++++--------
>  4 files changed, 315 insertions(+), 75 deletions(-)
>
> --
> 2.30.2
>


Thread overview: 61+ messages
2025-06-28 11:34 Dev Jain
2025-06-28 11:34 ` [PATCH v4 1/4] mm: Optimize mprotect() for MM_CP_PROT_NUMA by batch-skipping PTEs Dev Jain
2025-06-30  9:42   ` Ryan Roberts
2025-06-30  9:49     ` Dev Jain
2025-06-30  9:55       ` Ryan Roberts
2025-06-30 10:05         ` Dev Jain
2025-06-30 11:25   ` Lorenzo Stoakes
2025-06-30 11:39     ` Ryan Roberts
2025-06-30 11:53       ` Lorenzo Stoakes
2025-06-30 11:40     ` Dev Jain
2025-06-30 11:51       ` Lorenzo Stoakes
2025-06-30 11:56         ` Dev Jain
2025-07-02  9:37   ` Lorenzo Stoakes
2025-07-02 15:01     ` Dev Jain
2025-07-02 15:37       ` Lorenzo Stoakes
2025-06-28 11:34 ` [PATCH v4 2/4] mm: Add batched versions of ptep_modify_prot_start/commit Dev Jain
2025-06-30 10:10   ` Ryan Roberts
2025-06-30 10:17     ` Dev Jain
2025-06-30 10:35       ` Ryan Roberts
2025-06-30 10:42         ` Dev Jain
2025-06-30 12:57   ` Lorenzo Stoakes
2025-07-01  4:44     ` Dev Jain
2025-07-01  7:33       ` Ryan Roberts
2025-07-01  8:06         ` Lorenzo Stoakes
2025-07-01  8:23           ` Ryan Roberts
2025-07-01  8:34             ` Lorenzo Stoakes
2025-06-28 11:34 ` [PATCH v4 3/4] mm: Optimize mprotect() by PTE-batching Dev Jain
2025-06-28 12:39   ` Dev Jain
2025-06-30 10:31   ` Ryan Roberts
2025-06-30 11:21     ` Dev Jain
2025-06-30 11:47       ` Dev Jain
2025-06-30 11:50       ` Ryan Roberts
2025-06-30 11:53         ` Dev Jain
2025-07-01  5:47     ` Dev Jain
2025-07-01  7:39       ` Ryan Roberts
2025-06-30 12:52   ` Lorenzo Stoakes
2025-07-01  5:30     ` Dev Jain
     [not found]     ` <ec2c3f60-43e9-47d9-9058-49d608845200@arm.com>
2025-07-01  8:06       ` Dev Jain
2025-07-01  8:24         ` Ryan Roberts
2025-07-01  8:15       ` Lorenzo Stoakes
2025-07-01  8:30         ` Ryan Roberts
2025-07-01  8:51           ` Lorenzo Stoakes
2025-07-01  9:53             ` Ryan Roberts
2025-07-01 10:21               ` Lorenzo Stoakes
2025-07-01 11:31                 ` Ryan Roberts
2025-07-01 13:40                   ` Lorenzo Stoakes
2025-07-02 10:32                     ` Lorenzo Stoakes
2025-07-02 15:03                       ` Dev Jain
2025-07-02 15:22                         ` Lorenzo Stoakes
2025-07-03 12:59                           ` David Hildenbrand
2025-06-28 11:34 ` [PATCH v4 4/4] arm64: Add batched versions of ptep_modify_prot_start/commit Dev Jain
2025-06-30 10:43   ` Ryan Roberts
2025-06-29 23:05 ` [PATCH v4 0/4] Optimize mprotect() for large folios Andrew Morton
2025-06-30  3:33   ` Dev Jain
2025-06-30 10:45     ` Ryan Roberts
2025-06-30 11:22       ` Dev Jain
2025-06-30 11:17 ` Lorenzo Stoakes [this message]
2025-06-30 11:25   ` Dev Jain
2025-06-30 11:27 ` Lorenzo Stoakes
2025-06-30 11:43   ` Dev Jain
2025-07-01  0:08     ` Andrew Morton
