linux-mm.kvack.org archive mirror
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Barry Song <baohua@kernel.org>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	Muchun Song <muchun.song@linux.dev>,
	Oscar Salvador <osalvador@suse.de>,
	Huacai Chen <chenhuacai@kernel.org>,
	WANG Xuerui <kernel@xen0n.name>, Jonas Bonn <jonas@southpole.se>,
	Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>,
	Stafford Horne <shorne@gmail.com>,
	Paul Walmsley <paul.walmsley@sifive.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Albert Ou <aou@eecs.berkeley.edu>,
	Alexandre Ghiti <alex@ghiti.fr>, Jann Horn <jannh@google.com>,
	loongarch@lists.linux.dev, linux-kernel@vger.kernel.org,
	linux-openrisc@vger.kernel.org, linux-riscv@lists.infradead.org,
	linux-mm@kvack.org
Subject: Re: [PATCH RESEND] mm/pagewalk: split walk_page_range_novma() into kernel/user parts
Date: Wed, 4 Jun 2025 10:20:03 +0100
Message-ID: <db00bb89-a7f4-4749-be2f-7a98c7636070@lucifer.local>
In-Reply-To: <51ec4269-b132-4163-9cb5-766042a3769d@redhat.com>

On Wed, Jun 04, 2025 at 09:39:30AM +0200, David Hildenbrand wrote:
> On 03.06.25 21:22, Lorenzo Stoakes wrote:
> > The walk_page_range_novma() function is rather confusing - it supports two
> > modes, one used often, the other used only for debugging.
> >
> > The first mode is the common case of traversal of kernel page tables, which
> > is what nearly all callers use this for.
>
> ... and what people should be using it for 🙂

:)

Yeah the whole intent of this patch is to detach the 'crazy debug' bit from
the 'used by arches all over the place' stuff.

Being super clear as to what you're doing matters.

>
> >
> > Secondly it provides an unusual debugging interface that allows for the
> > traversal of page tables in a userland range of memory even for that memory
> > which is not described by a VMA.
> >
> > This is highly unusual and it is far from certain that such page tables
> > should even exist, but perhaps this is precisely why it is useful as a
> > debugging mechanism.
> >
> > As a result, this is utilised by ptdump only. Historically, things were
> > reversed - ptdump was the only user, and other parts of the kernel evolved
> > to use the kernel page table walking here.
> >
> > Since we have some complicated and confusing locking rules for the novma
> > case, it makes sense to separate the two usages into their own functions.
> >
> > Doing this also provides self-documentation as to the intent of the caller -
> > are they doing something rather unusual or are they simply doing a standard
> > kernel page table walk?
> >
> > We therefore maintain walk_page_range_novma() for this single usage, and
> > document the function as such.
>
> If we have to keep this dangerous interface, it should probably be
>
> walk_page_range_debug() or walk_page_range_dump()

Ugh it's too early, I thought Mike suggested this :P but he suggested the
mm/internal.h bit.

But anyway I agree with both, will fix in v2.

>
> >
> > Note that ptdump uses the precise same function for kernel walking as a
> > convenience, so we permit this but make it very explicit by having
> > walk_page_range_novma() invoke walk_page_range_kernel() in this case.
> >
> > We introduce walk_page_range_kernel() for the far more common case of
> > kernel page table traversal.
>
> I wonder if we should give it a completely different name scheme to
> highlight that this is something completely different.
>
> walk_kernel_page_table_range()

Yeah, I think this might be a good idea actually. This is doing something
'unusual' unlike all the other walk_page_xxx() handlers, so the name should
highlight that even more clearly.

Will fixup in v2.
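
To make the intended split concrete, here is a rough userspace sketch of the
shape being discussed, using the names proposed above. It is not the actual
patch: struct mm_sketch, init_mm_sketch and enum walk_kind are hypothetical
stand-ins for the kernel's real types, and the bodies only report which path
was taken rather than walking any page tables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mm_struct; illustration only. */
struct mm_sketch {
	const char *name;
};

/* Hypothetical stand-in for the kernel's init_mm. */
static struct mm_sketch init_mm_sketch = { .name = "init_mm" };

/* Which path a call took; exists only so the sketch is observable. */
enum walk_kind {
	WALK_KERNEL,     /* ordinary kernel page table walk */
	WALK_DEBUG_USER, /* unusual walk of a user range, VMA or not */
};

/*
 * Common case: walk a range of kernel page tables. A real walker
 * would traverse the kernel page tables here.
 */
static enum walk_kind walk_kernel_page_table_range(unsigned long start,
						   unsigned long end)
{
	assert(start < end);
	return WALK_KERNEL;
}

/*
 * Debug-only case (ptdump): may walk user ranges not described by a
 * VMA. For the kernel mm it explicitly hands off to the kernel walker,
 * so the two usages stay visibly separate.
 */
static enum walk_kind walk_page_range_debug(struct mm_sketch *mm,
					    unsigned long start,
					    unsigned long end)
{
	assert(mm != NULL);
	assert(start < end);

	if (mm == &init_mm_sketch)
		return walk_kernel_page_table_range(start, end);

	return WALK_DEBUG_USER;
}
```

The point of the two names is that a caller's intent is visible at the call
site: arch code would only ever call the kernel variant, and only ptdump would
call the debug one.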

>
> etc.
>
>
> --
> Cheers,
>
> David / dhildenb



Thread overview: 8+ messages
2025-06-03 19:22 Lorenzo Stoakes
2025-06-04  7:39 ` David Hildenbrand
2025-06-04  8:07   ` Mike Rapoport
2025-06-04  8:12     ` David Hildenbrand
2025-06-04  9:09     ` Lorenzo Stoakes
2025-06-04 12:26       ` Oscar Salvador
2025-06-04 12:31         ` Lorenzo Stoakes
2025-06-04  9:20   ` Lorenzo Stoakes [this message]
