From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
To: Pedro Falcato <pfalcato@suse.de>
Cc: "David Hildenbrand (Arm)" <david@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@kernel.org>,
Jann Horn <jannh@google.com>, Dev Jain <dev.jain@arm.com>,
Luke Yang <luyang@redhat.com>,
jhladky@redhat.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/4] mm/mprotect: un-inline folio_pte_batch_flags()
Date: Fri, 20 Mar 2026 11:27:32 +0000 [thread overview]
Message-ID: <a9d6947e-f1f1-4053-adde-189b5546dbb1@lucifer.local> (raw)
In-Reply-To: <tv6pcikk456ynfx7lz3lrql3yafg4njuy3bzi4v6yi5hpe27gi@hakvdy2jmewu>
On Fri, Mar 20, 2026 at 10:59:55AM +0000, Pedro Falcato wrote:
> On Fri, Mar 20, 2026 at 10:36:59AM +0000, Lorenzo Stoakes (Oracle) wrote:
> > On Thu, Mar 19, 2026 at 10:41:54PM +0100, David Hildenbrand (Arm) wrote:
> > >
> > > >
> > > > This is all seems VERY delicate, and subject to somebody else coming along and
> > > > breaking it/causing some of these noinline/__always_inline invocations to make
> > > > things far worse.
> > > >
> > > > I also reserve the right to seriously rework this pile of crap software.
> > > >
> > > > I'd rather we try to find less fragile ways to optimise!
> > > >
> > > > Maybe there's some steps that are bigger wins than others?
> > >
> > > What we can do is, collect similar folio_pte_batch_*() variants and
> > > centralize them in mm/utils.c.
> >
> > I'm not sure that addresses any of the comments above?
> >
> > Putting logic specific to components of mm away from where those components
> > are and into mm/util.c seems like a complete regression in terms of
> > fragility and code separation.
> >
> > And for what reason would you want to do that? To force a noinline of an
> > inline and people 'just have to know' that's why you randomly separated the
> > two?
> >
> > Doesn't sound appealing overall.
> >
> > I'd rather we find a way to implement the batching such that it doesn't
> > exhibit bad inlining decisions in the first place.
>
> Yes, you make a good point. At the end of the day (taking change_protection()
> as the example at hand here), after these changes:
> change_protection()
> loop over p4ds
> loop over puds
> loop over pmds
> loop over ptes
> nr_ptes = loop over ptes and find out how many we have
> if (making write) {
> loop over nr_ptes
> loop over ptes and find out how many are anonexclusive or not, in a row
> loop over ptes and set them
> } else {
> loop over ptes and set them
> }
>
> which the compiler FWIW tries to inline it all into one function, but then
> does a poor job at figuring things out. And the CPU gets confused too. It
> was frankly shocking how much performance I could squeeze out of a
>
> if (nr_ptes == 1) {
> /* don't bother with the loops and the funny logic */
> }
Let's maybe focus on generalising this then to start?
I think David + I are in agreement that these are obvious (TM) improvements. But
let's see if we can generalise them?
>
> I would not be surprised if the other syscalls have similar problems.
>
> >
> > I mean mprotect_folio_batch() having !folio, !folio_test_large() checks
> > only there is already silly, we should have a _general_ function that does
> > optimisations like that.
> >
> > Isn't the issue rather than folio_pte_batch_flags() shouldn't be an inline
> > function in internal.h but rather a function in mm/util.c?
> >
> > >
> > > For
> > >
> > > nr_ptes = mprotect_folio_pte_batch(folio, pte, oldpte,
> > > max_nr_ptes, /* flags = */ 0)
> > >
> > > We might just be able to use folio_pte_batch()?
> >
> > Yup.
> >
> > >
> > > For the other variant (soft-dirt+write) we'd have to create a helper like
> > >
> > > folio_pte_batch_sd_w() [better name suggestion welcome]
> > >
> > > That will reduce the code footprint overall I guess.
> >
> > I mean yeah that's a terrible name so obviously it'd have to be something
> > better.
> >
> > But again, this seems pretty stupid, now we're writing a bunch of duplicate
> > per-case code to force noinline because we made the central function inline
> > no?
>
> Yeah.
I guess we can discuss on other part of thread :)
>
> >
> > That's also super fragile, because an engineer might later decide that
> > pattern is horrible and fix it, and we regress this.
> >
> > But I mean overall, is the perf here really all that important? Are people
> > really that dependent on mprotect() et al. performing brilliantly fast?
>
> Obviously no one truly depends on mprotect, but I believe in fast primitives :)
>
> > Couldn't we do this with any mm interface and end up making efforts that
> > degrade code quality, increase fragility for dubious benefit?
>
> Yes, which is why I don't want to degrade code quality :) It would be ideal to
> find something that works both ways. Per my description of the existing code
> above, you can tell that it's neither fast, nor beautiful :p
I mean the code is terrible in most of these places (mremap() is much better
because I put a lot of time into de-insane-it-ising it), but it's less degrading
quality but rather:
static noinline void blahdy_blah(a trillion parameters)
{
... lots of code ...
}
Developer comes along, modifies code/params/etc. or uses function in another
place and degrades perf somehow and suddenly what made sense at X point of time
no longer makes sense at Y point of time, but develoeprs don't understand it so
are scared to remove it and etc.
Which is why it's better to try to abstract as much as possible and have some
way of then putting the sensitive stuff in a specific place like:
/*
* Lorenzo-style overly long comment with lots of exposition, a beginning,
* middle and end, ASCII diagrams and all sorts...
*
* ...
*/
static noinline void __super_special_perf_bit_of_whatever_dont_touch_please(...)
{
}
That's wrapped up in some saner interface that people can use at will with the
perf-y component safely locked away in an insane asylum^W^W another compilation
unit or header.
>
> --
> Pedro
Cheers, Lorenzo
next prev parent reply other threads:[~2026-03-20 11:27 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-19 18:31 [PATCH 0/4] mm/mprotect: micro-optimization work Pedro Falcato
2026-03-19 18:31 ` [PATCH 1/4] mm/mprotect: encourage inlining with __always_inline Pedro Falcato
2026-03-19 18:59 ` Lorenzo Stoakes (Oracle)
2026-03-19 19:00 ` Lorenzo Stoakes (Oracle)
2026-03-19 21:28 ` David Hildenbrand (Arm)
2026-03-20 9:59 ` Pedro Falcato
2026-03-20 10:08 ` David Hildenbrand (Arm)
2026-03-19 18:31 ` [PATCH 2/4] mm/mprotect: move softleaf code out of the main function Pedro Falcato
2026-03-19 19:06 ` Lorenzo Stoakes (Oracle)
2026-03-19 21:33 ` David Hildenbrand (Arm)
2026-03-20 10:04 ` Pedro Falcato
2026-03-20 10:07 ` David Hildenbrand (Arm)
2026-03-20 10:54 ` Lorenzo Stoakes (Oracle)
2026-03-19 18:31 ` [PATCH 3/4] mm/mprotect: un-inline folio_pte_batch_flags() Pedro Falcato
2026-03-19 19:14 ` Lorenzo Stoakes (Oracle)
2026-03-19 21:41 ` David Hildenbrand (Arm)
2026-03-20 10:36 ` Lorenzo Stoakes (Oracle)
2026-03-20 10:59 ` Pedro Falcato
2026-03-20 11:02 ` David Hildenbrand (Arm)
2026-03-20 11:27 ` Lorenzo Stoakes (Oracle) [this message]
2026-03-20 11:01 ` David Hildenbrand (Arm)
2026-03-20 11:45 ` Lorenzo Stoakes (Oracle)
2026-03-23 12:56 ` David Hildenbrand (Arm)
2026-03-20 10:34 ` Pedro Falcato
2026-03-20 10:51 ` Lorenzo Stoakes (Oracle)
2026-03-19 18:31 ` [PATCH 4/4] mm/mprotect: special-case small folios when applying write permissions Pedro Falcato
2026-03-19 19:17 ` Lorenzo Stoakes (Oracle)
2026-03-20 10:36 ` Pedro Falcato
2026-03-20 10:42 ` Lorenzo Stoakes (Oracle)
2026-03-19 21:43 ` David Hildenbrand (Arm)
2026-03-20 10:37 ` Pedro Falcato
2026-03-20 2:42 ` [PATCH 0/4] mm/mprotect: micro-optimization work Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a9d6947e-f1f1-4053-adde-189b5546dbb1@lucifer.local \
--to=ljs@kernel.org \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=david@kernel.org \
--cc=dev.jain@arm.com \
--cc=jannh@google.com \
--cc=jhladky@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=luyang@redhat.com \
--cc=pfalcato@suse.de \
--cc=vbabka@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox