From: Kevin Brodsky <kevin.brodsky@arm.com>
To: linux-hardening@vger.kernel.org
Cc: linux-kernel@vger.kernel.org,
	Kevin Brodsky <kevin.brodsky@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andy Lutomirski <luto@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	David Hildenbrand <david@redhat.com>,
	Ira Weiny <ira.weiny@intel.com>, Jann Horn <jannh@google.com>,
	Jeff Xu <jeffxu@chromium.org>, Joey Gouly <joey.gouly@arm.com>,
	Kees Cook <kees@kernel.org>,
	Linus Walleij <linus.walleij@linaro.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Marc Zyngier <maz@kernel.org>, Mark Brown <broonie@kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Maxwell Bland <mbland@motorola.com>,
	"Mike Rapoport (IBM)" <rppt@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Pierre Langlois <pierre.langlois@arm.com>,
	Quentin Perret <qperret@google.com>,
	Rick Edgecombe <rick.p.edgecombe@intel.com>,
	Ryan Roberts <ryan.roberts@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vlastimil Babka <vbabka@suse.cz>, Will Deacon <will@kernel.org>,
	Yang Shi <yang@os.amperecomputing.com>,
	Yeoreum Yun <yeoreum.yun@arm.com>,
	linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
	x86@kernel.org
Subject: [PATCH v6 16/30] mm: kpkeys: Defer early call to set_memory_pkey()
Date: Fri, 27 Feb 2026 17:55:04 +0000	[thread overview]
Message-ID: <20260227175518.3728055-17-kevin.brodsky@arm.com> (raw)
In-Reply-To: <20260227175518.3728055-1-kevin.brodsky@arm.com>

The kpkeys_hardened_pgtables feature requires all page table pages
to be mapped with a non-default pkey. When the linear map
uses large block mappings, setting the pkey for an arbitrary range
may require splitting an existing block.

The kpkeys page table allocator attempts to reduce such splitting,
but it cannot avoid it altogether. This is problematic during early
boot on some systems (arm64 with BBML2-noabort): the linear map may
not be split until feature detection has completed on all CPUs. This
happens after the buddy allocator becomes available, by which point
pagetable_alloc() has already been called multiple times.

To address this, defer the first call to set_memory_pkey()
(triggered by the refill in pba_init()) until a point where it is
safe to do so. A late initialisation function is introduced to that
effect.

Only one such early region may be registered; further refills in
that early window will trigger a warning and leave the memory
unprotected. The underlying assumption is that there are relatively
few calls to pagetable_alloc() before
kpkeys_hardened_pgtables_init_late() is called. This seems to be the
case at least on arm64; the main user is vmalloc() while allocating
per-CPU IRQ stacks, and even with the largest possible NR_CPUS this
would not require allocating more than 16 PTE pages.

Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---

This patch is rather unpleasant (especially the arbitrary limit of pages
that can be deferred), but it seems difficult to avoid on arm64 as we
must wait to know whether all CPUs support BBML2-noabort before relying
on it to split blocks.

The case where the boot CPU supports BBML2-noabort but another CPU
does not is not explicitly supported. In that case, the linear map will
end up being PTE-mapped, but we will still use the block allocator for
page tables. This may be suboptimal, but it remains functionally
correct.

---
 include/linux/kpkeys.h        |  8 +++++
 mm/kpkeys_hardened_pgtables.c | 58 +++++++++++++++++++++++++++++++++--
 2 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/include/linux/kpkeys.h b/include/linux/kpkeys.h
index 983f55655dde..8cfeb6e5af56 100644
--- a/include/linux/kpkeys.h
+++ b/include/linux/kpkeys.h
@@ -133,6 +133,12 @@ bool kpkeys_ready_for_direct_map_split(void);
  */
 void kpkeys_hardened_pgtables_init(void);
 
+/*
+ * Should be called by architecture code as soon as it is safe to modify the
+ * pkey of arbitrary linear map ranges.
+ */
+void kpkeys_hardened_pgtables_init_late(void);
+
 #else /* CONFIG_KPKEYS_HARDENED_PGTABLES */
 
 static inline bool kpkeys_hardened_pgtables_enabled(void)
@@ -159,6 +165,8 @@ static inline void kpkeys_pgtable_free(struct page *page) {}
 
 static inline void kpkeys_hardened_pgtables_init(void) {}
 
+static inline void kpkeys_hardened_pgtables_init_late(void) {}
+
 #endif /* CONFIG_KPKEYS_HARDENED_PGTABLES */
 
 #endif /* _LINUX_KPKEYS_H */
diff --git a/mm/kpkeys_hardened_pgtables.c b/mm/kpkeys_hardened_pgtables.c
index 5b1231e1422a..223a0bb02df0 100644
--- a/mm/kpkeys_hardened_pgtables.c
+++ b/mm/kpkeys_hardened_pgtables.c
@@ -39,6 +39,7 @@ static void pba_pgtable_free(struct page *page);
 static int pba_prepare_direct_map_split(void);
 static bool pba_ready_for_direct_map_split(void);
 static void pba_init(void);
+static void pba_init_late(void);
 
 /* Trivial allocator in case the linear map is PTE-mapped (no block mapping) */
 static struct page *noblock_pgtable_alloc(gfp_t gfp)
@@ -107,6 +108,15 @@ void __init kpkeys_hardened_pgtables_init(void)
 	static_branch_enable(&kpkeys_hardened_pgtables_key);
 }
 
+void __init kpkeys_hardened_pgtables_init_late(void)
+{
+	if (!arch_kpkeys_enabled())
+		return;
+
+	if (pba_enabled())
+		pba_init_late();
+}
+
 /*
  * pkeys block allocator (PBA): dedicated page table allocator for block-mapped
  * linear map. Block splitting is minimised by prioritising the allocation and
@@ -174,7 +184,13 @@ static struct pkeys_block_allocator pkeys_block_allocator = {
 	.alloc_mutex = __MUTEX_INITIALIZER(pkeys_block_allocator.alloc_mutex)
 };
 
+static struct {
+	struct page *head_page;
+	unsigned int order;
+} pba_early_region __initdata;
+
 static __ro_after_init DEFINE_STATIC_KEY_FALSE(pba_enabled_key);
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(pba_can_set_pkey);
 
 static bool pba_enabled(void)
 {
@@ -188,6 +204,28 @@ static bool alloc_mutex_locked(void)
 	return mutex_get_owner(&pba->alloc_mutex) == (unsigned long)current;
 }
 
+/*
+ * __ref is used as this is called from __refill_pages() which is not __init.
+ * The call to pba_init_late() guarantees this is not called after boot has
+ * completed.
+ */
+static void __ref register_early_region(struct page *head_page,
+					unsigned int order)
+{
+	/*
+	 * Only one region is expected to be registered. Any further region
+	 * is left untracked (i.e. unprotected).
+	 */
+	if (WARN_ON(pba_early_region.head_page))
+		return;
+
+	pr_debug("%s: order=%d, pfn=%lx\n", __func__, order,
+		 page_to_pfn(head_page));
+
+	pba_early_region.head_page = head_page;
+	pba_early_region.order = order;
+}
+
 static void cached_list_add_pages(struct page *page, unsigned int nr_pages)
 {
 	struct pkeys_block_allocator *pba = &pkeys_block_allocator;
@@ -227,7 +265,7 @@ static struct page *__refill_pages(bool alloc_one)
 	struct pkeys_block_allocator *pba = &pkeys_block_allocator;
 	struct page *page;
 	unsigned int order;
-	int ret;
+	int ret = 0;
 
 	for (int i = 0; i < ARRAY_SIZE(refill_orders); ++i) {
 		order = refill_orders[i];
@@ -243,7 +281,10 @@ static struct page *__refill_pages(bool alloc_one)
 
 	guard(mutex)(&pba->alloc_mutex);
 
-	ret = set_pkey_pgtable(page, 1 << order);
+	if (static_branch_likely(&pba_can_set_pkey))
+		ret = set_pkey_pgtable(page, 1 << order);
+	else
+		register_early_region(page, order);
 
 	if (ret) {
 		__free_pages(page, order);
@@ -406,7 +447,20 @@ static void __init pba_init(void)
 	/*
 	 * Refill the cache so that the reserve pages are available for
 	 * splitting next time we need to refill.
+	 *
+	 * We cannot split the linear map at this stage, so the allocated
+	 * region will be registered as early region (pba_early_region) and
+	 * its pkey set later.
 	 */
 	ret = refill_pages();
 	WARN_ON(ret);
 }
+
+static void __init pba_init_late(void)
+{
+	static_branch_enable(&pba_can_set_pkey);
+
+	if (pba_early_region.head_page)
+		set_pkey_pgtable(pba_early_region.head_page,
+				 1 << pba_early_region.order);
+}
-- 
2.51.2



