From: Brendan Jackman <jackmanb@google.com>
To: lsf-pc@lists.linux-foundation.org
Cc: linux-mm@kvack.org, rppt@kernel.org
Subject: [LSF/MM/BPF TOPIC] A pagetable library for the kernel?
Date: Thu, 19 Feb 2026 17:51:09 +0000
Message-ID: <20260219175113.618562-1-jackmanb@google.com>
As work on Address Space Isolation [0] trudges slowly along (next series coming
soon™... I promise... some details of the plan are in [0]) I've been running
into a common issue whenever I try to do new stuff with the kernel address
space: We have too many sets of pagetable manipulation routines, and yet we
don't have one that suits ASI's needs.
Similarly, I'm currently working on support for efficiently unmapping
guest_memfd pages from the physmap (an extension to [1]) - in this case I've run
into very much the same issues as with ASI.
Here are some areas of the kernel that manipulate pagetables:
1. The collection of APIs that are specific to userspace pagetables: mmu_gather,
   mm/pagewalk.c, some vm_fault logic, all that good stuff.
2. The set_memory_* and set_direct_map_* APIs (which are implemented per-arch).
3. Some non-userspace-specific APIs in mm/memory.c, such as
   apply_to_page_range().
4. mm/vmalloc.c
5. Highmem logic such as kmap_local_*
6. Boot and memory-hotplug support code (your architecture's version of
   arch/x86/mm/init_64.c).
7. x86's KPTI
8. x86's LDT logic
(At LPC I started enumerating these off the top of my head and multiple people
spoke out with more examples I hadn't thought of - please join in if you can see
more!)
By and large, these components are designed completely independently from one
another. This is made possible by the smart design of the low-level helper API
(pte_present() and friends), and it does lead to a nice, explicit coding style.
Here are some "new" things I've wanted to do with pagetables, which are not
currently supported by any library:
- Have a second kernel pagetable (for ASI's "nonsensitive address space")
- Modify pagetables safely from a context where allocation is not possible
- Modify the kernel's pagetables while accounting pagetable allocations to the
  current process
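To make the three items above concrete, here is a toy userspace model. It is
not kernel code and not a proposed API: every name in it (toy_pgd, toy_alloc,
toy_map and so on) is hypothetical, the two-level, 16-entry geometry is
invented, and calloc() stands in for a real page allocator. The only idea it
tries to show is that one mapping routine can serve several roots if the
allocation policy is passed in by the caller rather than hard-coded.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy geometry, not any real architecture: 2 levels, 16 entries each. */
#define TOY_PAGE_SHIFT	12
#define TOY_BITS	4
#define TOY_ENTRIES	(1u << TOY_BITS)
#define TOY_PRESENT	1ULL
#define TOY_RESERVE_MAX	8

struct toy_table { uint64_t e[TOY_ENTRIES]; };

/* Hypothetical allocation policy, chosen by the caller of the library. */
struct toy_alloc {
	struct toy_table *(*get)(struct toy_alloc *a);	/* zeroed table or NULL */
	struct toy_table *reserve[TOY_RESERVE_MAX];	/* filled ahead of time */
	int nr_reserve;
	long charged;	/* tables handed out; stands in for accounting */
};

/* A root plus its policy, so several address spaces can coexist. */
struct toy_pgd {
	struct toy_table *root;
	struct toy_alloc *alloc;
};

/* Policy 1: allocate on demand (calloc stands in for a page allocator). */
static struct toy_table *toy_get_heap(struct toy_alloc *a)
{
	struct toy_table *t = calloc(1, sizeof(*t));

	if (t)
		a->charged++;
	return t;
}

/* Policy 2: never allocate, only consume tables reserved earlier. */
static struct toy_table *toy_get_reserve(struct toy_alloc *a)
{
	if (a->nr_reserve == 0)
		return NULL;
	a->charged++;
	return a->reserve[--a->nr_reserve];
}

/* Fill the reserve while allocation is still allowed. */
static int toy_reserve_fill(struct toy_alloc *a, int n)
{
	a->get = toy_get_reserve;
	a->nr_reserve = 0;
	a->charged = 0;
	while (a->nr_reserve < n && a->nr_reserve < TOY_RESERVE_MAX) {
		struct toy_table *t = calloc(1, sizeof(*t));

		if (!t)
			return -1;
		a->reserve[a->nr_reserve++] = t;
	}
	return a->nr_reserve == n ? 0 : -1;
}

static int toy_pgd_init(struct toy_pgd *pgd, struct toy_alloc *a)
{
	pgd->alloc = a;
	pgd->root = a->get(a);
	return pgd->root ? 0 : -1;
}

/* Map one page; fails with -1 if the policy cannot supply a leaf table. */
static int toy_map(struct toy_pgd *pgd, uint64_t va, uint64_t pa)
{
	uint64_t *slot;
	struct toy_table *leaf;

	assert(pgd->root);
	slot = &pgd->root->e[(va >> (TOY_PAGE_SHIFT + TOY_BITS)) % TOY_ENTRIES];
	if (!(*slot & TOY_PRESENT)) {
		leaf = pgd->alloc->get(pgd->alloc);
		if (!leaf)
			return -1;
		*slot = (uint64_t)(uintptr_t)leaf | TOY_PRESENT;
	}
	leaf = (struct toy_table *)(uintptr_t)(*slot & ~TOY_PRESENT);
	leaf->e[(va >> TOY_PAGE_SHIFT) % TOY_ENTRIES] =
		(pa & ~((1ULL << TOY_PAGE_SHIFT) - 1)) | TOY_PRESENT;
	return 0;
}

/* Return the leaf entry for va, or 0 if nothing is mapped there. */
static uint64_t toy_lookup(const struct toy_pgd *pgd, uint64_t va)
{
	uint64_t top;
	const struct toy_table *leaf;

	top = pgd->root->e[(va >> (TOY_PAGE_SHIFT + TOY_BITS)) % TOY_ENTRIES];
	if (!(top & TOY_PRESENT))
		return 0;
	leaf = (const struct toy_table *)(uintptr_t)(top & ~TOY_PRESENT);
	return leaf->e[(va >> TOY_PAGE_SHIFT) % TOY_ENTRIES];
}
```

In this model a "second pagetable" is just a second struct toy_pgd, a
no-allocation context is a toy_pgd whose policy was set up with
toy_reserve_fill() beforehand, and accounting is whatever the get() callback
decides to record. Tables are never freed and there is no locking or TLB
handling, all of which a real design would have to deal with.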
I think it's time to discuss if there's a way to scope out a "library" that:
a) Reduces the overall amount of code in the kernel, while
b) Serving the needs of the incoming guest_memfd and ASI features.
In this session I'd first like to do a quick survey of the pagetable
manipulation systems already in the kernel (that I know about), what purposes
they serve and what capabilities they have. Then I'd like to discuss some ideas
for the scope of a new "library" and which of these components it might replace.
Mike Rapoport has shared a prototype that he wrote for a generic higher-level
PGD abstraction, so I will be using that as inspiration.
This is mostly about looking for feedback and input from maintainers and
experts: what opportunities for refactoring might I be missing? What challenges
for sharing code might I be forgetting about?
[0] https://lpc.events/event/19/contributions/2029/
[1] https://lore.kernel.org/all/20260126164445.11867-1-kalyazin@amazon.com/