From: Brendan Jackman <jackmanb@google.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	 Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org,  Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	 Vlastimil Babka <vbabka@suse.cz>,
	David Hildenbrand <david@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	 Mike Rapoport <rppt@kernel.org>,
	Junaid Shahid <junaids@google.com>,
	Reiji Watanabe <reijiw@google.com>,
	 Patrick Bellasi <derkling@google.com>,
	Brendan Jackman <jackmanb@google.com>,
	 Yosry Ahmed <yosry.ahmed@linux.dev>
Subject: [PATCH RFC 11/11] mm/page_alloc: Add support for ASI-unmapping pages
Date: Thu, 13 Mar 2025 18:11:30 +0000
Message-ID: <20250313-asi-page-alloc-v1-11-04972e046cea@google.com>
In-Reply-To: <20250313-asi-page-alloc-v1-0-04972e046cea@google.com>

While calling asi_map() is pretty easy, to unmap pages we need to ensure
a TLB shootdown is complete before we allow them to be allocated.

Therefore, treat this as a special case of buddy allocation. Allocate an
entire block, release the zone lock and enable interrupts, then do the
unmap and TLB shootdown. Once that's complete, return any unwanted pages
within the block to the freelists.
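
As an illustration (not part of the patch), here is a sketch of the kind of
call site that can end up on this path, using the __GFP_SENSITIVE flag
introduced earlier in this series:

  /*
   * Sketch only: a blocking, sensitive allocation may be satisfied by
   * grabbing a whole nonsensitive pageblock, asi_unmap()ing it, and
   * returning the unused remainder of the block to the freelists.
   */
  struct page *page = alloc_pages(GFP_KERNEL | __GFP_SENSITIVE, 0);

  /*
   * A non-blocking allocation never takes this path, since the TLB
   * shootdown required by asi_unmap() needs interrupts enabled.
   */
  struct page *atomic_page = alloc_pages(GFP_ATOMIC | __GFP_SENSITIVE, 0);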

Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 mm/internal.h   |  5 ++++
 mm/page_alloc.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 86 insertions(+), 7 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index b82ab345fb994b7c4971e550556e24bb68f315f6..7904be86fa2c7fded62c100d84fe572c15407ccf 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1189,6 +1189,11 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone,
 #endif
 #define ALLOC_HIGHATOMIC	0x200 /* Allows access to MIGRATE_HIGHATOMIC */
 #define ALLOC_KSWAPD		0x800 /* allow waking of kswapd, __GFP_KSWAPD_RECLAIM set */
+#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION
+#define ALLOC_ASI_UNMAP	       0x1000 /* allow asi_unmap(), requiring TLB shootdown. */
+#else
+#define ALLOC_ASI_UNMAP		0x0
+#endif
 
 /* Flags that allow allocations below the min watermark. */
 #define ALLOC_RESERVES (ALLOC_NON_BLOCK|ALLOC_MIN_RESERVE|ALLOC_HIGHATOMIC|ALLOC_OOM)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0d8bbad8675c99282f308c4a4122d5d9c4b14dae..9ac883d7a71387d291bc04bad675e2545dd7ba0f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1627,6 +1627,7 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
  * The other migratetypes do not have fallbacks.
  *
  * Note there are no fallbacks from sensitive to nonsensitive migratetypes.
+ * That's instead handled as a totally special case in __rmqueue_asi_unmap().
  */
 #ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION
 #define TERMINATE_FALLBACK -1
@@ -2790,7 +2791,77 @@ static inline void zone_statistics(struct zone *preferred_zone, struct zone *z,
 #endif
 }
 
-static __always_inline
+#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION
+/*
+ * Allocate a page by converting nonsensitive memory to sensitive with an ASI
+ * unmap. This can't be done via __rmqueue_fallback() because the unmap requires
+ * a TLB flush, which can only be done with interrupts enabled.
+ */
+static inline
+struct page *__rmqueue_asi_unmap(struct zone *zone, unsigned int request_order,
+				 unsigned int alloc_flags, int migratetype)
+{
+	struct page *page;
+	int alloc_order;
+	int i;
+
+	lockdep_assert_irqs_enabled();
+
+	if (!(alloc_flags & ALLOC_ASI_UNMAP))
+		return NULL;
+
+	VM_WARN_ON_ONCE(migratetype == MIGRATE_UNMOVABLE_NONSENSITIVE);
+
+	/*
+	 * Need to unmap a whole pageblock (otherwise it might require
+	 * allocating pagetables). Can't do that with the zone lock held, but we
+	 * also can't flip the block's migratetype until the flush is complete,
+	 * otherwise any concurrent sensitive allocations could momentarily leak
+	 * data into the restricted address space. As a simple workaround,
+	 * "allocate" at least the whole block, unmap it (with IRQs enabled),
+	 * then free any remainder of the block again.
+	 *
+	 * An alternative to this could be to synchronize an intermediate state
+	 * on the pageblock (since this code can't be called in an IRQ,
+	 * this shouldn't be too bad - it's likely OK to just busy-wait until
+	 * any concurrent TLB flush completes).
+	 */
+
+	alloc_order = max(request_order, pageblock_order);
+	spin_lock_irq(&zone->lock);
+	page = __rmqueue_smallest(zone, alloc_order, MIGRATE_UNMOVABLE_NONSENSITIVE);
+	spin_unlock_irq(&zone->lock);
+	if (!page)
+		return NULL;
+
+	asi_unmap(page, 1 << alloc_order);
+
+	change_pageblock_range(page, alloc_order, migratetype);
+
+	if (request_order >= alloc_order)
+		return page;
+
+	spin_lock_irq(&zone->lock);
+	for (i = request_order; i < alloc_order; i++) {
+		struct page *page_to_free = page + (1 << i);
+
+		__free_one_page(page_to_free, page_to_pfn(page_to_free), zone, i,
+			migratetype, FPI_SKIP_REPORT_NOTIFY);
+	}
+	spin_unlock_irq(&zone->lock);
+
+	return page;
+}
+#else
+static inline
+struct page *__rmqueue_asi_unmap(struct zone *zone, unsigned int order,
+				unsigned int alloc_flags, int migratetype)
+{
+	return NULL;
+}
+#endif
+
+static noinline
 struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
 			   unsigned int order, unsigned int alloc_flags,
 			   int migratetype)
@@ -2814,13 +2885,14 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
 			 */
 			if (!page && (alloc_flags & (ALLOC_OOM|ALLOC_NON_BLOCK)))
 				page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC);
-
-			if (!page) {
-				spin_unlock_irqrestore(&zone->lock, flags);
-				return NULL;
-			}
 		}
 		spin_unlock_irqrestore(&zone->lock, flags);
+
+		if (!page)
+			page = __rmqueue_asi_unmap(zone, order, alloc_flags, migratetype);
+
+		if (!page)
+			return NULL;
 	} while (check_new_pages(page, order));
 
 	__count_zid_vm_events(PGALLOC, page_zonenum(page), 1 << order);
@@ -3356,6 +3428,8 @@ static inline unsigned int gfp_to_alloc_flags_cma(gfp_t gfp_mask,
 	if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
 		alloc_flags |= ALLOC_CMA;
 #endif
+	if ((gfp_mask & __GFP_SENSITIVE) && gfpflags_allow_blocking(gfp_mask))
+		alloc_flags |= ALLOC_ASI_UNMAP;
 	return alloc_flags;
 }
 
@@ -4382,7 +4456,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	reserve_flags = __gfp_pfmemalloc_flags(gfp_mask);
 	if (reserve_flags)
 		alloc_flags = gfp_to_alloc_flags_cma(gfp_mask, reserve_flags) |
-					  (alloc_flags & ALLOC_KSWAPD);
+					  (alloc_flags & (ALLOC_KSWAPD | ALLOC_ASI_UNMAP));
 
 	/*
 	 * Reset the nodemask and zonelist iterators if memory policies can be

-- 
2.49.0.rc1.451.g8f38331e32-goog




Thread overview: 18+ messages
2025-03-13 18:11 [PATCH RFC 00/11] mm: ASI integration for the page allocator Brendan Jackman
2025-03-13 18:11 ` [PATCH RFC 01/11] x86/mm: Bare minimum ASI API for page_alloc integration Brendan Jackman
2025-03-13 18:11 ` [PATCH RFC 02/11] x86/mm: Factor out phys_pgd_init() Brendan Jackman
2025-03-13 22:14   ` Yosry Ahmed
2025-03-17 16:24     ` Brendan Jackman
2025-03-13 18:11 ` [PATCH RFC 03/11] x86/mm: Add lookup_pgtable_in_pgd() Brendan Jackman
2025-03-13 22:09   ` Yosry Ahmed
2025-03-14  9:12     ` Brendan Jackman
2025-03-14 17:56       ` Yosry Ahmed
2025-03-13 18:11 ` [PATCH RFC 04/11] x86/mm/asi: Sync physmap into ASI_GLOBAL_NONSENSITIVE Brendan Jackman
2025-03-13 18:11 ` [PATCH RFC HACKS 05/11] Add asi_map() and asi_unmap() Brendan Jackman
2025-03-13 18:11 ` [PATCH RFC 06/11] mm/page_alloc: Add __GFP_SENSITIVE and always set it Brendan Jackman
2025-03-13 18:11 ` [PATCH RFC HACKS 07/11] mm/slub: Set __GFP_SENSITIVE for reclaimable slabs Brendan Jackman
2025-03-13 18:11 ` [PATCH RFC HACKS 08/11] mm/page_alloc: Simplify gfp_migratetype() Brendan Jackman
2025-03-13 18:11 ` [PATCH RFC 09/11] mm/page_alloc: Split MIGRATE_UNMOVABLE by sensitivity Brendan Jackman
2025-03-13 18:11 ` [PATCH RFC 10/11] mm/page_alloc: Add support for nonsensitive allocations Brendan Jackman
2025-03-13 18:11 ` Brendan Jackman [this message]
2025-06-10 17:04 ` [PATCH RFC 00/11] mm: ASI integration for the page allocator Brendan Jackman
