From: Pedro Falcato <pfalcato@suse.de>
To: Dev Jain <dev.jain@arm.com>
Cc: Luke Yang <luyang@redhat.com>, david@kernel.org, surenb@google.com,
	jhladky@redhat.com, akpm@linux-foundation.org, Liam.Howlett@oracle.com,
	willy@infradead.org, vbabka@suse.cz, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [REGRESSION] mm/mprotect: 2x+ slowdown for >=400KiB regions since PTE batching (cac1db8c3aad)
Date: Wed, 18 Feb 2026 11:52:20 +0000
Message-ID: <dabkyhc5ugwekgcajifjomxpkyh3wie42se5udblji67olsygr@oi3jf3bwfrjq>
In-Reply-To: <eaa6be47-f1fc-4b88-b267-5aa38e3ba2a9@arm.com>

On Wed, Feb 18, 2026 at 04:08:11PM +0530, Dev Jain wrote:
> 
> There are two things at play here:
> 
> 1. All arches are expected to benefit from PTE batching on large folios, because
> similar operations are done together in one shot. For code paths other than mprotect
> and mremap, that benefit is far clearer due to:
> 
> a) batching across atomic operations etc. For example, see copy_present_ptes -> folio_ref_add
>    (sketched just below). Instead of bumping the refcount by 1, nr times, we bump it by nr in one shot.
> 
> b) vm_normal_folio was already being invoked. So, all in all, the only new overhead
>    we introduce is that of folio_pte_batch(_flags). In fact, since we already have the
>    folio, I recall that we even special-case large folios separately from
>    small folios. Thus 4K folio processing will have no overhead.
> 
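To make (a) concrete, a minimal sketch of what the batching buys
(illustrative only, not the actual copy_present_ptes() code; folio
lookup and error handling elided):

	/* Before batching: one atomic RMW per PTE in the run. */
	for (i = 0; i < nr; i++)
		folio_ref_add(folio, 1);

	/* After batching: a single atomic RMW covers the whole run. */
	folio_ref_add(folio, nr);
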
> 2. Due to the requirements of contpte, ptep_get() on arm64 needs to fetch the a/d bits
> across a whole cont block (sketched below). Thus, each ptep_get() does 16 PTE accesses.
> To avoid this, batching becomes critical on arm64.
>

Understood.
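
For anyone following along, here is a rough sketch of why (2) is so
costly (paraphrased from arch/arm64/mm/contpte.c; helper names and
details are approximate, not the exact kernel code):

	static pte_t contpte_get_sketch(pte_t *ptep)
	{
		/* Align down to the first entry of the cont block. */
		pte_t *first = PTR_ALIGN_DOWN(ptep, CONT_PTES * sizeof(pte_t));
		pte_t pte = __ptep_get(ptep);
		int i;

		/*
		 * The a/d bits are only meaningful across the whole
		 * contiguous block, so OR them in from all CONT_PTES
		 * (16 with 4K pages) entries: 16 PTE reads for one
		 * logical ptep_get().
		 */
		for (i = 0; i < CONT_PTES; i++) {
			pte_t entry = __ptep_get(first + i);

			if (pte_young(entry))
				pte = pte_mkyoung(pte);
			if (pte_dirty(entry))
				pte = pte_mkdirty(pte);
		}
		return pte;
	}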
 
> 
> >
> >> 2. Did you measure if there is an optimization due to just the first commit ("prefetch the next pte")?
> > Yes, I could measure a sizeable improvement (perhaps some 5%). I tested on
> > zen5 (which is a pretty beefy uarch) and the loop is so full of ~~crap~~
> > features that the prefetcher seems to be doing a poor job, at least per my
> > results.
> 
> Nice.
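
For reference, the "prefetch the next pte" idea amounts to something
like this (hypothetical sketch, not the actual patch; prefetch() is
from <linux/prefetch.h>):

	for (; addr != end; ptep++, addr += PAGE_SIZE) {
		pte_t pte = ptep_get(ptep);

		/*
		 * Warm the next entry's cacheline while we work on
		 * the current one; the bounds check avoids touching
		 * memory past the end of the PTE table.
		 */
		if (addr + PAGE_SIZE != end)
			prefetch(ptep + 1);

		/* ... apply the new protection to 'pte' ... */
	}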
> 
> >
> >> I actually had prefetch in mind - is it possible to do some kind of prefetch(pfn_to_page(pte_pfn(pte)))
> >> to optimize the call to vm_normal_folio()?
> > Certainly possible, but I suspect it doesn't make too much sense. You want to
> > avoid bringing in the cacheline if possible. In the pte's case, I know we're
> > probably going to look at it and modify it, and if I'm wrong it's just one
> > cacheline we misprefetched (though I had some parallel convos, and it might
> > be that we need a branch there to avoid prefetching out of the PTE table).
> > We would like to avoid bringing in the folio cacheline at all, rather than
> > merely hiding the stall through some fancy prefetching or sheer CPU magic.
> 
> I dunno, need other opinions.
> 
> The question then becomes: should we prefer performance on 4K folios or
> large folios? As Luke reports in the other email, the benefit on PTE-mapped THP
> was staggering.

We want order-0 folios to be as performant as we can make them, since they are
the bulk of all folios on an mTHP-less system (especially anon folios; I know
the page cache is a little more complex these days).
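
For concreteness, the folio prefetch being discussed would look
something like this (hypothetical, with a pte_present() guard so we
do not chase swap or none entries):

	pte_t pte = ptep_get(ptep);

	if (pte_present(pte))
		prefetch(pfn_to_page(pte_pfn(pte)));

	/* ... work the CPU can overlap with the prefetch ... */

	folio = vm_normal_folio(vma, addr, pte);

The trade-off is the one described above: the prefetch can hide the
miss on the folio's cacheline, but it still pulls that line into the
cache whether or not we end up needing it.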

> 
> I believe that if the sysadmin enables CONFIG_TRANSPARENT_HUGEPAGE, they know
> that the kernel will contain code written around the fact that it will see
> large folios. So, is it reasonable to penalize the folio order-0 case in preference
> to folio order > 0? If yes, we can simply stop batching if !IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE).

No, the sysadmin does not enable CONFIG_TRANSPARENT_HUGEPAGE. We're lucky if
the distribution knows what CONFIG_THP does. It is not reasonable, IMO, to
penalize anything.
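
For the record, the quoted proposal would amount to roughly this
(hypothetical sketch; folio_pte_batch()'s argument list is
approximate):

	unsigned int nr = 1;

	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
	    folio_test_large(folio))
		nr = folio_pte_batch(folio, ptep, pte, max_nr);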

-- 
Pedro


Thread overview: 23+ messages
2026-02-13 15:08 Luke Yang
2026-02-13 15:47 ` David Hildenbrand (Arm)
2026-02-13 16:24   ` Pedro Falcato
2026-02-13 17:16     ` Suren Baghdasaryan
2026-02-13 17:26       ` David Hildenbrand (Arm)
2026-02-16 10:12         ` Dev Jain
2026-02-16 14:56           ` Pedro Falcato
2026-02-17 17:43           ` Luke Yang
2026-02-17 18:08             ` Pedro Falcato
2026-02-18  5:01               ` Dev Jain
2026-02-18 10:06                 ` Pedro Falcato
2026-02-18 10:38                   ` Dev Jain
2026-02-18 10:46                     ` David Hildenbrand (Arm)
2026-02-18 11:58                       ` Pedro Falcato
2026-02-18 12:24                         ` David Hildenbrand (Arm)
2026-02-19 12:15                           ` Pedro Falcato
2026-02-19 13:02                             ` David Hildenbrand (Arm)
2026-02-19 15:00                               ` Pedro Falcato
2026-02-19 15:29                                 ` David Hildenbrand (Arm)
2026-02-20  4:12                                 ` Dev Jain
2026-02-18 11:52                     ` Pedro Falcato [this message]
2026-02-18  4:50             ` Dev Jain
2026-02-18 13:29 ` David Hildenbrand (Arm)
