linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: "David Hildenbrand (Red Hat)" <david@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>,
	Samuel Holland <samuel.holland@sifive.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Paul Walmsley <pjw@kernel.org>,
	linux-riscv@lists.infradead.org,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, devicetree@vger.kernel.org,
	Suren Baghdasaryan <surenb@google.com>,
	linux-kernel@vger.kernel.org, Mike Rapoport <rppt@kernel.org>,
	Michal Hocko <mhocko@suse.com>, Conor Dooley <conor@kernel.org>,
	Krzysztof Kozlowski <krzk+dt@kernel.org>,
	Alexandre Ghiti <alex@ghiti.fr>,
	Emil Renner Berthing <kernel@esmil.dk>,
	Rob Herring <robh+dt@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Julia Lawall <Julia.Lawall@inria.fr>,
	Nicolas Palix <nicolas.palix@imag.fr>,
	Anshuman Khandual <anshuman.khandual@arm.com>
Subject: Re: [PATCH v3 06/22] mm: Always use page table accessor functions
Date: Wed, 26 Nov 2025 12:27:56 +0000	[thread overview]
Message-ID: <6bdf2b89-7768-4b90-b5e7-ff174196ea7b@lucifer.local> (raw)
In-Reply-To: <df7d10ba-bb42-4ea1-8c5b-5db88a18eccb@kernel.org>

On Wed, Nov 26, 2025 at 01:19:00PM +0100, David Hildenbrand (Red Hat) wrote:
> On 11/26/25 13:16, David Hildenbrand (Red Hat) wrote:
> > On 11/26/25 12:09, Ryan Roberts wrote:
> > > On 13/11/2025 01:45, Samuel Holland wrote:
> > > > Some platforms need to fix up the values when reading or writing page
> > > > tables. Because of this, the accessors must always be used; it is not
> > > > valid to simply dereference a pXX_t pointer.
> > > >
> > > > Fix all of the instances of this pattern in generic code, mostly by
> > > > applying the below coccinelle semantic patch, repeated for each page
> > > > table level. Some additional fixes were applied manually, mostly to
> > > > macros where type information is unavailable.
> > > >
> > > > In a few places, a `pte_t *` or `pmd_t *` is actually a pointer to a PTE
> > > > or PMDE value stored on the stack, not a pointer to a page table. In
> > > > those cases, it is not appropriate to use the accessors, because the
> > > > value is not globally visible, and any transformation from pXXp_get()
> > > > has already been applied. Those places are marked by naming the pointer
> > > > `ptentp` or `pmdvalp`, as opposed to `ptep` or `pmdp`.
> > > >
> > > > @@
> > > > pte_t *P;
> > > > expression E;
> > > > expression I;
> > > > @@
> > > > - P[I] = E
> > > > + set_pte(P + I, E)
> > > >
> > > > @@
> > > > pte_t *P;
> > > > expression E;
> > > > @@
> > > > (
> > > > - WRITE_ONCE(*P, E)
> > > > + set_pte(P, E)
> > > > |
> > > > - *P = E
> > > > + set_pte(P, E)
> > > > )
> > >
> > > There should absolutely never be any instances of core code directly setting an
> > > entry at any level. This *must* always go via the arch code helpers. Did you
> > > find any instances of this? If so, I would consider these bugs and suggest
> > > sending as a separate bugfix patch. Bad things could happen on arm64 because we
> > > may need to break a contiguous mapping, which would not happen if the value is
> > > set directly.
> > >
> > > >
> > > > @@
> > > > pte_t *P;
> > > > expression I;
> > > > @@
> > > > (
> > > >     &P[I]
> > > > |
> > > > - READ_ONCE(P[I])
> > > > + ptep_get(P + I)
> > > > |
> > > > - P[I]
> > > > + ptep_get(P + I)
> > > > )
> > > >
> > > > @@
> > > > pte_t *P;
> > > > @@
> > > > (
> > > > - READ_ONCE(*P)
> > > > + ptep_get(P)
> > > > |
> > > > - *P
> > > > + ptep_get(P)
> > > > )
> > >
> > > For reading the *PTE*, conversion over to ptep_get() should have already been
> > > done (I did this a few years back when implementing support for arm64 contiguous
> > > mappings). If you find any cases where direct dereference or READ_ONCE() is
> > > being done in generic code for PTE, then that's a bug and should also be sent as
> > > a separate patch.
> > >
> > > FYI, my experience was that Coccinelle didn't find everything when I was
> > > converting to ptep_get() - although it could have been that my Cochinelle skills
> > > were not up to scratch! I ended up using an additional method where I did a
> > > find/replace to convert "pte_t *" to "ptep_handle_t" and declared pte_handle_t
> > > as a void* which causes a compiler error on dereference. Then in a few key
> > > places I did a manual case from pte_handle_t to (pte_t *) and compiled allyesconfig.
> > >
> > > I'm assuming the above Cocchinelle template was also used for pmd_t, pud_t,
> > > p4d_t and pgd_t?
> > >
> > > >
> > > > Additionally, the following semantic patch was used to convert PMD and
> > > > PUD references inside struct vm_fault:
> > > >
> > > > @@
> > > > struct vm_fault vmf;
> > > > @@
> > > > (
> > > > - *vmf.pmd
> > > > + pmdp_get(vmf.pmd)
> > > > |
> > > > - *vmf.pud
> > > > + pudp_get(vmf.pud)
> > > > )
> > > >
> > > > @@
> > > > struct vm_fault *vmf;
> > > > @@
> > > > (
> > > > - *vmf->pmd
> > > > + pmdp_get(vmf->pmd)
> > > > |
> > > > - *vmf->pud
> > > > + pudp_get(vmf->pud)
> > > > )
> > > >
> > > >
> > > > Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
> > > > ---
> > > > This commit covers some of the same changes as an existing series from
> > > > Anshuman Khandual[1]. Unlike that series, this commit is a purely
> > > > mechanical conversion to demonstrate the RISC-V changes, so it does not
> > > > insert local variables to avoid redundant calls to the accessors. A
> > > > manual conversion like in that series could improve performance.
> > > >
> > > > [1]: https://lore.kernel.org/linux-mm/20240917073117.1531207-1-anshuman.khandual@arm.com/
> > >
> > > Hi,
> > >
> > > I've just come across this patch and wanted to mention that we could also
> > > benefit from this improved absraction for some features we are looking at for
> > > arm64. As you mention, Anshuman had a go but hit some roadblocks.
> > >
> > > The main issue is that the compiler was unable to optimize away the READ_ONCE()s
> > > for the case where certain levels of the pgtable are folded. But it can optimize
> > > the plain C dereferences. There were complaints the the generated code for arm
> > > (32) and powerpc was significantly impacted due to having many more (redundant)
> > > loads.
> > >
> >
> > We do have mm_pmd_folded()/p4d_folded() etc, could that help to sort
> > this out internally?
> >
>
> Just stumbled over the reply from Christope:
>
> https://lkml.kernel.org/r/0019d675-ce3d-4a5c-89ed-f126c45145c9@kernel.org
>
> And wonder if we could handle that somehow directly in the pgdp_get() etc.

I find that kind of gross to be honest. Isn't the whole point of folding that we
don't have to think about it...

And we're now modifying how we do things for ppc32 specifically? Or are there
arches with fewer cobwebs on them that are actually impacted here?

Is it really necessary to do this?

And obv. that doesn't address the READ_ONCE() vs. not READ_ONCE() stuff, plus
then we have to do it for every arch no?

>
> --
> Cheers
>
> David

Cheers, Lorenzo


  reply	other threads:[~2025-11-26 12:28 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-13  1:45 [PATCH v3 00/22] riscv: Memory type control for platforms with physical memory aliases Samuel Holland
2025-11-13  1:45 ` [PATCH v3 01/22] mm/ptdump: replace READ_ONCE() with standard page table accessors Samuel Holland
2025-11-13  1:45 ` [PATCH v3 02/22] mm: " Samuel Holland
2025-11-13  4:05   ` Dev Jain
2025-11-13  1:45 ` [PATCH v3 03/22] mm/dirty: replace READ_ONCE() with pudp_get() Samuel Holland
2025-11-13  1:45 ` [PATCH v3 04/22] perf/events: replace READ_ONCE() with standard page table accessors Samuel Holland
2025-11-13 19:10   ` David Hildenbrand (Red Hat)
2025-11-13  1:45 ` [PATCH v3 05/22] mm: Move the fallback definitions of pXXp_get() Samuel Holland
2025-11-13 19:11   ` David Hildenbrand (Red Hat)
2025-11-13  1:45 ` [PATCH v3 06/22] mm: Always use page table accessor functions Samuel Holland
2025-11-13  4:53   ` kernel test robot
2025-11-13  5:46   ` kernel test robot
2025-11-26 11:08   ` Christophe Leroy (CS GROUP)
2025-11-26 11:09   ` Ryan Roberts
2025-11-26 12:16     ` David Hildenbrand (Red Hat)
2025-11-26 12:19       ` David Hildenbrand (Red Hat)
2025-11-26 12:27         ` Lorenzo Stoakes [this message]
2025-11-26 12:35           ` David Hildenbrand (Red Hat)
2025-11-26 13:03             ` Ryan Roberts
2025-11-26 13:47               ` Wei Yang
2025-11-26 14:22                 ` Ryan Roberts
2025-11-26 14:37                   ` Lorenzo Stoakes
2025-11-26 14:53                     ` David Hildenbrand (Red Hat)
2025-11-26 14:46                   ` David Hildenbrand (Red Hat)
2025-11-26 14:52                     ` Lorenzo Stoakes
2025-11-26 14:56                       ` David Hildenbrand (Red Hat)
2025-11-26 15:08                         ` Lorenzo Stoakes
2025-11-26 15:12                           ` David Hildenbrand (Red Hat)
2025-11-26 16:07                             ` Ryan Roberts
2025-11-26 16:34                               ` Ryan Roberts
2025-11-26 20:31                                 ` David Hildenbrand (Red Hat)
2025-11-27  7:14                                   ` David Hildenbrand (Red Hat)
2025-11-27  7:31                                     ` David Hildenbrand (Red Hat)
2025-11-27 15:32                                       ` Ryan Roberts
2025-11-27 19:39                                 ` Christophe Leroy (CS GROUP)
2025-11-27 19:44                                 ` Christophe Leroy (CS GROUP)
2025-11-27  8:26                   ` Christophe Leroy (CS GROUP)
2025-11-27  8:35                     ` David Hildenbrand (Red Hat)
2025-11-13  1:45 ` [PATCH v3 07/22] checkpatch: Warn on page table access without accessors Samuel Holland
2025-11-13  2:21   ` Joe Perches
2025-11-13  2:36     ` Samuel Holland
2025-11-13 19:17       ` David Hildenbrand (Red Hat)
2025-11-13  1:45 ` [PATCH v3 08/22] mm: Allow page table accessors to be non-idempotent Samuel Holland
2025-11-13  7:19   ` kernel test robot
2025-11-27 16:57   ` Ryan Roberts
2025-11-27 17:47     ` David Hildenbrand (Red Hat)
2025-11-13  1:45 ` [PATCH v3 09/22] riscv: hibernate: Replace open-coded pXXp_get() Samuel Holland
2025-11-13  1:45 ` [PATCH v3 10/22] riscv: mm: Always use page table accessor functions Samuel Holland
2025-11-13  1:45 ` [PATCH v3 11/22] riscv: mm: Simplify set_p4d() and set_pgd() Samuel Holland
2025-11-13  1:45 ` [PATCH v3 12/22] riscv: mm: Deduplicate _PAGE_CHG_MASK definition Samuel Holland
2025-11-13  1:45 ` [PATCH v3 13/22] riscv: ptdump: Only show N and MT bits when enabled in the kernel Samuel Holland
2025-11-13  1:45 ` [PATCH v3 14/22] riscv: mm: Fix up memory types when writing page tables Samuel Holland
2025-11-13  1:45 ` [PATCH v3 15/22] riscv: mm: Expose all page table bits to assembly code Samuel Holland
2025-11-13  1:45 ` [PATCH v3 16/22] riscv: alternative: Add an ALTERNATIVE_3 macro Samuel Holland
2025-11-13  1:45 ` [PATCH v3 17/22] riscv: alternative: Allow calls with alternate link registers Samuel Holland
2025-11-13  1:45 ` [PATCH v3 18/22] riscv: Fix logic for selecting DMA_DIRECT_REMAP Samuel Holland
2025-11-13  1:45 ` [PATCH v3 19/22] dt-bindings: riscv: Describe physical memory regions Samuel Holland
2025-12-04 15:12   ` Rob Herring
2025-11-13  1:45 ` [PATCH v3 20/22] riscv: mm: Use physical memory aliases to apply PMAs Samuel Holland
2025-11-13  1:45 ` [PATCH v3 21/22] riscv: dts: starfive: jh7100: Use physical memory ranges for DMA Samuel Holland
2025-11-13  1:45 ` [PATCH v3 22/22] riscv: dts: eswin: eic7700: " Samuel Holland
2025-11-13 19:13 ` [PATCH v3 00/22] riscv: Memory type control for platforms with physical memory aliases David Hildenbrand (Red Hat)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6bdf2b89-7768-4b90-b5e7-ff174196ea7b@lucifer.local \
    --to=lorenzo.stoakes@oracle.com \
    --cc=Julia.Lawall@inria.fr \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex@ghiti.fr \
    --cc=anshuman.khandual@arm.com \
    --cc=conor@kernel.org \
    --cc=david@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=kernel@esmil.dk \
    --cc=krzk+dt@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=mhocko@suse.com \
    --cc=nicolas.palix@imag.fr \
    --cc=palmer@dabbelt.com \
    --cc=pjw@kernel.org \
    --cc=robh+dt@kernel.org \
    --cc=rppt@kernel.org \
    --cc=ryan.roberts@arm.com \
    --cc=samuel.holland@sifive.com \
    --cc=surenb@google.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox