* [LSF/MM/BPF TOPIC] A pagetable library for the kernel?
@ 2026-02-19 17:51 Brendan Jackman
From: Brendan Jackman @ 2026-02-19 17:51 UTC (permalink / raw)
  To: lsf-pc; +Cc: linux-mm, rppt

As work on Address Space Isolation [0] trudges slowly along (next series coming
soon™, I promise; some details of the plan are in [0]), I've been running
into a common issue whenever I try to do new stuff with the kernel address
space: we have too many sets of pagetable manipulation routines, and yet
none of them suits ASI's needs.

Similarly, I'm currently working on support for efficiently unmapping
guest_memfd pages from the physmap (an extension to [1]), and there I've run
into much the same issues as with ASI.

Here are some areas of the kernel that manipulate pagetables:

1. The collection of APIs that are specific to userspace pagetables: mmu_gather,
   mm/pagewalk.c, some vm_fault logic, all that good stuff.

2. The set_memory_* and set_direct_map_* APIs (which are implemented per-arch).

3. Some non-userspace-specific APIs in mm/memory.c, such as
   apply_to_page_range().

4. mm/vmalloc.c

5. Highmem logic such as kmap_local_*

6. Boot and memory-hotplug support code (your architecture's version of
   arch/x86/mm/init_64.c).

7. x86's KPTI

8. x86's LDT logic

(At LPC I started enumerating these off the top of my head and multiple people
spoke out with more examples I hadn't thought of - please join in if you can see
more!)

By and large, these components are designed completely independently of one
another. This is made possible by the smart design of the low-level helper API
(pte_present() and friends), and it does lead to a nice, explicit coding style.

Here are some "new" things I've wanted to do with pagetables, which are not
currently supported by any library:

- Have a second kernel pagetable (for ASI's "nonsensitive address space")

- Modify pagetables safely from a context where allocation is not possible

- Modify the kernel's pagetables while accounting pagetable allocations to the
  current process
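
To make the three items above a bit more concrete, here is a minimal userspace
toy in C. It is only a sketch under loud assumptions: the names (struct
pt_alloc, pool_get, pt_map, pt_lookup) are hypothetical and invented for
illustration, the two-level table is not any real architecture's format, and
none of this is kernel code or the design of any existing prototype. The idea it
tries to show is that if the root and the allocation hook are both parameters,
the same mapping routine can serve a second pagetable, run without allocating
(tables come from a pool reserved up front), and count tables so they could be
charged to someone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ENTRIES 16u                 /* toy fan-out per level */
#define PRESENT ((uintptr_t)1)      /* low bit marks a valid entry */

/* One table of a toy two-level pagetable (hypothetical layout). */
struct table { uintptr_t e[ENTRIES]; };

/* Hypothetical allocation hook: the caller decides where tables come from. */
struct pt_alloc {
	struct table *(*get)(struct pt_alloc *a);
	struct table *pool;     /* tables reserved ahead of time */
	size_t avail;           /* tables left in the pool */
	size_t charged;         /* tables handed out, for accounting */
};

/* Hand out a zeroed table from the preallocated pool; never calls malloc(). */
static struct table *pool_get(struct pt_alloc *a)
{
	if (a->avail == 0)
		return NULL;
	a->charged++;
	struct table *t = &a->pool[--a->avail];
	memset(t, 0, sizeof(*t));
	return t;
}

/* Map virtual page vpn to frame pfn under root. Returns 0, or -1 if the
 * allocation hook could not supply a table. */
static int pt_map(struct table *root, struct pt_alloc *a,
		  unsigned vpn, uintptr_t pfn)
{
	assert(vpn < ENTRIES * ENTRIES);
	unsigned hi = vpn / ENTRIES, lo = vpn % ENTRIES;
	struct table *leaf;

	if (root->e[hi] & PRESENT) {
		leaf = (struct table *)(root->e[hi] & ~PRESENT);
	} else {
		leaf = a->get(a);
		if (!leaf)
			return -1;
		root->e[hi] = (uintptr_t)leaf | PRESENT;
	}
	leaf->e[lo] = (pfn << 1) | PRESENT;
	return 0;
}

/* Return 1 and store the frame in *pfn if vpn is mapped under root, else 0. */
static int pt_lookup(const struct table *root, unsigned vpn, uintptr_t *pfn)
{
	unsigned hi = vpn / ENTRIES, lo = vpn % ENTRIES;
	const struct table *leaf;

	if (vpn >= ENTRIES * ENTRIES || !(root->e[hi] & PRESENT))
		return 0;
	leaf = (const struct table *)(root->e[hi] & ~PRESENT);
	if (!(leaf->e[lo] & PRESENT))
		return 0;
	*pfn = leaf->e[lo] >> 1;
	return 1;
}

/* Two roots share one routine; returns the number of tables charged. */
static int demo(void)
{
	struct table pool[4];
	struct table full = {{0}}, restricted = {{0}};
	struct pt_alloc a = { .get = pool_get, .pool = pool, .avail = 4 };
	uintptr_t pfn = 0;

	if (pt_map(&full, &a, 5, 100) || pt_map(&full, &a, 20, 200) ||
	    pt_map(&restricted, &a, 5, 100))
		return -1;
	if (!pt_lookup(&full, 20, &pfn) || pfn != 200)
		return -1;
	if (pt_lookup(&restricted, 20, &pfn))   /* only in the full root */
		return -1;
	return (int)a.charged;
}
```

In this toy, running out of pool simply fails the mapping, which pushes the
"reserve enough tables first" problem onto the caller. Whether a real library
should work that way is exactly the kind of question I'd like to discuss.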

I think it's time to discuss whether there's a way to scope out a "library" that:

a) Reduces the overall amount of code in the kernel, while

b) Serving the needs of the incoming guest_memfd and ASI features as well.

In this session I'd first like to do a quick survey of the pagetable
manipulation systems already in the kernel (that I know about), what purposes
they serve and what capabilities they have. Then I'd like to discuss some ideas
for the scope of a new "library" and which of these components it might replace.

Mike Rapoport has shared a prototype that he wrote for a generic higher-level
PGD abstraction, so I will be using that as inspiration.

This is mostly about looking for feedback and input from maintainers and
experts: what opportunities for refactoring might I be missing? What challenges
might I be forgetting about for sharing code?

[0] https://lpc.events/event/19/contributions/2029/
[1] https://lore.kernel.org/all/20260126164445.11867-1-kalyazin@amazon.com/

