Re: [LSF/MM/BPF TOPIC] Per-process page size

linux-mm.kvack.org archive mirror

From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Pedro Falcato <pfalcato@suse.de>, Dev Jain <dev.jain@arm.com>
Cc: lsf-pc@lists.linux-foundation.org, ryan.roberts@arm.com,
	catalin.marinas@arm.com, will@kernel.org, ardb@kernel.org,
	willy@infradead.org, hughd@google.com,
	baolin.wang@linux.alibaba.com, akpm@linux-foundation.org,
	lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
	vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
	mhocko@suse.com, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [LSF/MM/BPF TOPIC] Per-process page size
Date: Mon, 23 Feb 2026 14:01:53 +0100
Message-ID: <c338ac80-c680-4a40-a1f5-f5808c816090@kernel.org>
In-Reply-To: <fqsd4x5oqavouhwawmsmpanaszvr6xsboc2oeqzs3fafrtovpk@gfnqznoqlabk>

On 2/23/26 13:49, Pedro Falcato wrote:
> On Mon, Feb 23, 2026 at 10:37:55AM +0530, Dev Jain wrote:
>>> I don't understand. What exactly are you trying to do here? Maintain 2
>>> different paging structures, one for core mm and the other for the arch? As
>>> done in architectures with no radix tree paging structures?
>>
>> The mm->pgd will be the software pagetable. So suppose that do_anonymous_page is
>> doing set_ptes on the PTE table belonging to the software pagetable. We will
>> hook a "native_set_ptes" into set_ptes, which will set the ptes on a different
>> pagetable maintained by arm64 code (probably mm_context_t->native_pgd).
> 
> Traditionally, you do this kind of funky manipulation in update_mmu_cache.
> 
> But this is still an extremely complex and invasive change (that I assume most
> people would not like to see) with dubious benefit.
> 
>>
>>>
>>> If so, that's wildly inefficient, unless you're willing to go into reclaimable
>>> page tables on the arm64 side. And that brings extra problems and extra fun :)
>>
>> I didn't understand the reclaimable reference, but yes we need to make this efficient.
> 
> I'm not talking about CPU runtime efficiency, but memory efficiency. Doing
> this makes you essentially duplicate page tables - not exactly ideal. This is
> a Known Problem in classic UNIX systems which do something similar
> (but not the same): anonymous memory pointers are stored in some intermediary
> structure (SunOS and UVM call it "amap"), and paging structures are entirely
> redundant there. They can freely tear down a page table because they can freely
> put it together from the amap and file mappings (what they call vm_object and
> we call address_space).
> 
> Anyway, I'm boring you with these funny historical details so you can understand
> the similarities: the Linux page table format generally matches hardware, and
> we store anonymous memory "state" there, so you can't ever tear down a pgtable
> without losing the state of whatever was mapped there before. However, if you go
> down the "arm64 now has a separate pgtable structure" route, the roles switch:
> arm64's internal page table format makes for the real page tables, and Linux's
> pgtable structure is nothing more than an "amap". So you could (and perhaps
> should) freely reclaim arm64 MMU page tables once memory pressure hits, because
> they are freely discardable.
> 
> Does this make sense?

I've been thinking about building the 64k page tables similar to how
HMM/KVM handle it: building them on demand and invalidating them
through MMU notifiers etc.

That is, treat the 64k MMU of a process just like a special device that
builds its own page tables.

This way, they could get reclaimed more easily, and most of the core +
arm64 page table manipulation code could be kept as is.

However, I don't know how large the performance impact of that approach
would be.
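
To make the idea a bit more concrete, here is a toy userspace model of
the scheme above, not kernel code. Everything in it is hypothetical:
the toy_* names, the fixed-size arrays, and in particular the rule that
a 64k entry can only be built when all 16 underlying 4k entries are
present, physically contiguous and 64k-aligned. The 4k table stands in
for the core-mm page tables (the source of truth); the 64k table is a
derived cache that is filled on fault and dropped on invalidation.

```c
/* Toy userspace model; every name here is hypothetical, not a kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SMALL_PER_LARGE 16u	/* 64k / 4k */
#define NR_LARGE        8u
#define NR_SMALL        (NR_LARGE * SMALL_PER_LARGE)
#define PTE_NONE        0u

struct toy_mm {
	uint64_t sw_pte[NR_SMALL];	/* 4k entries: pfn + 1, or PTE_NONE */
	uint64_t native_pte[NR_LARGE];	/* 64k entries: first pfn + 1, or PTE_NONE */
};

static void toy_mm_init(struct toy_mm *mm)
{
	memset(mm, 0, sizeof(*mm));
}

/* Notifier-style invalidation: drop every 64k entry overlapping [start, end). */
static void toy_invalidate_range(struct toy_mm *mm, unsigned start, unsigned end)
{
	assert(start < end && end <= NR_SMALL);
	for (unsigned l = start / SMALL_PER_LARGE;
	     l <= (end - 1) / SMALL_PER_LARGE; l++)
		mm->native_pte[l] = PTE_NONE;
}

/* Core-mm side: invalidate first, then change the authoritative 4k entry. */
static void toy_set_sw_pte(struct toy_mm *mm, unsigned idx, uint64_t pfn)
{
	toy_invalidate_range(mm, idx, idx + 1);
	mm->sw_pte[idx] = pfn + 1;
}

/*
 * "Device" side: on a fault, try to build one 64k entry from the 4k table.
 * Fails unless all 16 entries are present, contiguous and 64k-aligned.
 */
static bool toy_native_fault(struct toy_mm *mm, unsigned l)
{
	assert(l < NR_LARGE);
	const uint64_t *sw = &mm->sw_pte[l * SMALL_PER_LARGE];

	if (sw[0] == PTE_NONE || (sw[0] - 1) % SMALL_PER_LARGE != 0)
		return false;
	for (unsigned i = 1; i < SMALL_PER_LARGE; i++)
		if (sw[i] != sw[0] + i)
			return false;
	mm->native_pte[l] = sw[0];
	return true;
}

/* Memory pressure: the 64k table holds no state of its own, so drop it all. */
static void toy_reclaim_native(struct toy_mm *mm)
{
	memset(mm->native_pte, 0, sizeof(mm->native_pte));
}
```

In this model, toy_reclaim_native() can run at any time and the next
toy_native_fault() simply rebuilds the entry from the 4k table, which is
the "reclaimed more easily" property. The cost shows up as extra faults
after every invalidation, which is where the performance question comes
from.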

-- 
Cheers,

David
