linux-mm.kvack.org archive mirror
From: Pedro Falcato <pfalcato@suse.de>
To: Dev Jain <dev.jain@arm.com>
Cc: Luke Yang <luyang@redhat.com>,
	david@kernel.org, surenb@google.com,  jhladky@redhat.com,
	akpm@linux-foundation.org, Liam.Howlett@oracle.com,
	 willy@infradead.org, vbabka@suse.cz, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [REGRESSION] mm/mprotect: 2x+ slowdown for >=400KiB regions since PTE batching (cac1db8c3aad)
Date: Wed, 18 Feb 2026 10:06:56 +0000	[thread overview]
Message-ID: <5dso4ctke4baz7hky62zyfdzyg27tcikdbg5ecnrqmnluvmxzo@sciiqgatpqqv>
In-Reply-To: <8315cbde-389c-40c5-ac72-92074625489a@arm.com>

On Wed, Feb 18, 2026 at 10:31:19AM +0530, Dev Jain wrote:
> 
> On 17/02/26 11:38 pm, Pedro Falcato wrote:
> > On Tue, Feb 17, 2026 at 12:43:38PM -0500, Luke Yang wrote:
> >> On Mon, Feb 16, 2026 at 03:42:08PM +0530, Dev Jain wrote:
> >>> On 13/02/26 10:56 pm, David Hildenbrand (Arm) wrote:
> >>>> On 2/13/26 18:16, Suren Baghdasaryan wrote:
> >>>>> On Fri, Feb 13, 2026 at 4:24 PM Pedro Falcato <pfalcato@suse.de> wrote:
> >>>>>> On Fri, Feb 13, 2026 at 04:47:29PM +0100, David Hildenbrand (Arm) wrote:
> >>>>>>> Hi!
> >>>>>>>
> >>>>>>>
> >>>>>>> Micro-benchmark results are nice. But what is the real world impact?
> >>>>>>> IOW, why should we care?
> >>>>>> Well, mprotect is widely used in thread spawning, code JITting,
> >>>>>> and even process startup. And we don't want to pay for a feature we can't
> >>>>>> even use (on x86).
> >>>>> I agree. When I straced Android's zygote a while ago, mprotect() came
> >>>>> up #30 in the list of most frequently used syscalls and one of the
> >>>>> most used mm-related syscalls due to its use during process creation.
> >>>>> However, I don't know how often it's used on VMAs of size >=400KiB.
> >>>> See my point? :) If this is apparently so widespread then finding a real
> >>>> reproducer is likely not a problem. Otherwise it's just speculation.
> >>>>
> >>>> It would also be interesting to know whether the reproducer ran with any
> >>>> sort of mTHP enabled or not. 
> >>> Yes. Luke, can you experiment with the following microbenchmark:
> >>>
> >>> https://pastebin.com/3hNtYirT
> >>>
> >>> and see if there is an optimization for pte-mapped 2M folios, before and
> >>> after the commit?
> >>>
> >>> (set transparent_hugepages/enabled=always, hugepages-2048Kb/enabled=always)
> > Since you're testing stuff, could you please test the changes in:
> > https://github.com/heatd/linux/tree/mprotect-opt ?
> >
> > Not posting them yet since merge window, etc. Plus I think there's some
> > further optimization work we can pull off.
> >
> > With the benchmark in https://gist.github.com/heatd/25eb2edb601719d22bfb514bcf06a132
> > (compiled with g++ -O2 file.cpp -lbenchmark, needs google/benchmark) I've measured
> > about an 18% speedup between original vs with patches.
> 
> Thanks for working on this. Some comments -
> 
> 1. Rejecting batching with pte_batch_hint() means that we also don't batch 16K and 32K large
> folios on arm64, since the cont bit only kicks in at 64K. Not sure how important this is.

I don't understand what you mean. Is arm64 doing the large folio optimization
even when there's no special MMU support for it (the aforementioned 16K and
32K cases)? If so, perhaps it's time for an ARCH_SUPPORTS_PTE_BATCHING flag.
Though if you could provide numbers for that case, it would be much appreciated.

> 2. Did you measure if there is an optimization due to just the first commit ("prefetch the next pte")?

Yes, I could measure a sizeable improvement (perhaps some 5%). I tested on
zen5 (which is a pretty beefy uarch) and the loop is so full of ~~crap~~
features that the prefetcher seems to be doing a poor job, at least per my
results.
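
For the archives, the "prefetch the next pte" pattern under discussion is
roughly the following. This is a hedged user-space sketch, not the actual
kernel patch: fake_pte_t and process_pte_run() are made-up stand-ins for
pte_t and the real mprotect loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Made-up stand-in for the kernel's pte_t. */
typedef uint64_t fake_pte_t;

/*
 * Walk a run of "PTEs", issuing a prefetch for the next entry while
 * the current one is being examined, so the load is already in flight
 * by the time the next loop iteration needs it.
 */
static uint64_t process_pte_run(const fake_pte_t *ptep, size_t nr)
{
	uint64_t changed = 0;

	for (size_t i = 0; i < nr; i++) {
		if (i + 1 < nr)
			__builtin_prefetch(&ptep[i + 1], 0 /* read */, 3);
		/*
		 * Stand-in for the real per-PTE work (vm_normal_folio(),
		 * permission modification, etc.): count entries with the
		 * low bit set.
		 */
		changed += ptep[i] & 1;
	}
	return changed;
}
```

Whether this wins depends heavily on the uarch; on cores whose hardware
prefetcher already tracks the stride, the explicit prefetch is close to free
but also close to useless.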

> I actually had prefetch in mind - is it possible to do some kind of prefetch(pfn_to_page(pte_pfn(pte)))
> to optimize the call to vm_normal_folio()?

Certainly possible, but I suspect it doesn't make too much sense. You want to
avoid bringing in the cacheline if possible. In the pte's case, I know we're
probably going to look at it and modify it, and if I'm wrong it's just one
cacheline we misprefetched (though I had some parallel convos and it might
be that we need a branch there to avoid prefetching out of the PTE table).
For the folio, we would like to avoid bringing in its cacheline at all,
rather than merely hiding the stall through some fancy prefetching or sheer
CPU magic.
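
The branch mentioned above could look something like this. Again a hedged
user-space sketch with made-up names (fake_pte_t, PTRS_PER_FAKE_PTE,
scan_table()); the real check would be against the last entry of the PTE
page, not an array bound.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t fake_pte_t;

/* PTEs per page table; 512 on x86-64 and on arm64 with 4K pages. */
#define PTRS_PER_FAKE_PTE 512

/*
 * Prefetch the next PTE only when it still lies within the same PTE
 * table, so we never speculatively touch a line past the table's last
 * entry.
 */
static inline void prefetch_next_pte(const fake_pte_t *table, size_t idx)
{
	if (idx + 1 < PTRS_PER_FAKE_PTE)
		__builtin_prefetch(&table[idx + 1], 0 /* read */, 3);
}

static uint64_t scan_table(const fake_pte_t *table)
{
	uint64_t present = 0;

	for (size_t i = 0; i < PTRS_PER_FAKE_PTE; i++) {
		prefetch_next_pte(table, i);
		present += table[i] & 1;	/* stand-in "present" bit */
	}
	return present;
}
```

The guard costs one predictable branch per iteration, which should be far
cheaper than a prefetch that walks off the end of the table.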

-- 
Pedro


