From: Brendan Jackman <jackmanb@google.com>
To: Brendan Jackman <jackmanb@google.com>,
Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
Vlastimil Babka <vbabka@kernel.org>, Wei Xu <weixugc@google.com>,
Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
Lorenzo Stoakes <ljs@kernel.org>
Cc: <linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>,
<x86@kernel.org>, <rppt@kernel.org>,
Sumit Garg <sumit.garg@oss.qualcomm.com>, <derkling@google.com>,
<reijiw@google.com>, Will Deacon <will@kernel.org>,
<rientjes@google.com>,
"Kalyazin, Nikita" <kalyazin@amazon.co.uk>,
<patrick.roy@linux.dev>,
"Itazuri, Takahiro" <itazur@amazon.co.uk>,
Andy Lutomirski <luto@kernel.org>,
David Kaplan <david.kaplan@amd.com>,
Thomas Gleixner <tglx@kernel.org>, Yosry Ahmed <yosry@kernel.org>
Subject: Re: [PATCH v2 22/22] mm/secretmem: Use __GFP_UNMAPPED when available
Date: Tue, 31 Mar 2026 14:40:51 +0000 [thread overview]
Message-ID: <DHH1NTVNTA8W.2313NYMA29J42@google.com> (raw)
In-Reply-To: <20260320-page_alloc-unmapped-v2-22-28bf1bd54f41@google.com>
I skipped on posting most of the intervening AI reviews coz there's just
so much stuff and most of it is pretty boring, but this one is
interesting.
https://sashiko.dev/#/patchset/20260320-page_alloc-unmapped-v2-0-28bf1bd54f41%40google.com
On Fri Mar 20, 2026 at 6:23 PM UTC, Brendan Jackman wrote:
> This is the simplest possible way to adopt __GFP_UNMAPPED. Use it to
> allocate pages when it's available, meaning the
> set_direct_map_invalid_noflush() call is no longer needed.
>
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
> ---
> mm/secretmem.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++++---------
> 1 file changed, 74 insertions(+), 13 deletions(-)
>
> diff --git a/mm/secretmem.c b/mm/secretmem.c
> index 5f57ac4720d32..9fef91237358a 100644
> --- a/mm/secretmem.c
> +++ b/mm/secretmem.c
> @@ -6,6 +6,7 @@
> */
>
> #include <linux/mm.h>
> +#include <linux/mermap.h>
> #include <linux/fs.h>
> #include <linux/swap.h>
> #include <linux/mount.h>
> @@ -47,13 +48,78 @@ bool secretmem_active(void)
> return !!atomic_read(&secretmem_users);
> }
>
> +/*
> + * If it's supported, allocate using __GFP_UNMAPPED. This lets the page
> + * allocator amortize TLB flushes and avoids direct map fragmentation.
> + */
> +#ifdef CONFIG_PAGE_ALLOC_UNMAPPED
> +static inline struct folio *secretmem_folio_alloc(gfp_t gfp, unsigned int order)
> +{
> + int err;
> +
> + /* Required for __GFP_UNMAPPED|__GFP_ZERO. */
> + err = mermap_mm_prepare(current->mm);
> + if (err)
> + return ERR_PTR(err);
Sashiko:
> In remote access paths such as process_vm_readv or io_uring worker threads,
> current->mm might be NULL or point to a different address space than the
> faulted VMA. Should this use vmf->vma->vm_mm instead?
This can't happen, right? I think the assumption is good; doing a
secretmem fault in a kthread or any process that doesn't have the file
mmap()'d would be a bug? But this definitely feels like something I
could be wrong about.
AI slop to empirically check the two examples mentioned by the AI fail
early (slop for slop, it's slop all the way down...):
- process_vm_readv(): https://paste.debian.net/hidden/5625ef2e
- io_uring: https://paste.debian.net/hidden/e7763ad2
[...]
> +static inline void secretmem_folio_restore(struct folio *folio)
> +{
> + set_direct_map_default_noflush(folio_page(folio, 0));
> +}
I defined a lovely helper here but neglected to actually call it.
Sashiko says "this isn't a bug" but that's wrong, calling
set_direct_map_default_noflush() before freeing a __GFP_UNMAPPED page is
not OK.
(Should it be? I think no. We _could_ define "default" so that it checks
the pageblock flags and does the right thing for you. But then we'd be
baking in the assumption that the page allocator can efficiently look
that up. This would be rather tricky if we decided we need to mix mapped
an unmapped pages in the same block).
And this was hiding the more important bug which was that I forgot to do
the mermap dance for zeroing the page.
> +static inline void secretmem_folio_flush(struct folio *folio)
> +{
> + unsigned long addr = (unsigned long)folio_address(folio);
> +
> + flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +}
> +#endif
> +
> static vm_fault_t secretmem_fault(struct vm_fault *vmf)
> {
> struct address_space *mapping = vmf->vma->vm_file->f_mapping;
> struct inode *inode = file_inode(vmf->vma->vm_file);
> pgoff_t offset = vmf->pgoff;
> gfp_t gfp = vmf->gfp_mask;
> - unsigned long addr;
> struct folio *folio;
> vm_fault_t ret;
> int err;
> @@ -66,16 +132,9 @@ static vm_fault_t secretmem_fault(struct vm_fault *vmf)
> retry:
> folio = filemap_lock_folio(mapping, offset);
> if (IS_ERR(folio)) {
> - folio = folio_alloc(gfp | __GFP_ZERO, 0);
> - if (!folio) {
> - ret = VM_FAULT_OOM;
> - goto out;
> - }
> -
> - err = set_direct_map_invalid_noflush(folio_page(folio, 0));
> - if (err) {
> - folio_put(folio);
> - ret = vmf_error(err);
> + folio = secretmem_folio_alloc(gfp | __GFP_ZERO, 0);
> + if (IS_ERR_OR_NULL(folio)) {
> + ret = folio ? vmf_error(PTR_ERR(folio)) : VM_FAULT_OOM;
> goto out;
> }
>
> err = filemap_add_folio(mapping, folio, offset, gfp);
> if (unlikely(err)) {
> /*
> * If a split of large page was required, it
> * already happened when we marked the page invalid
> * which guarantees that this call won't fail
> */
> set_direct_map_default_noflush(folio_page(folio, 0));
> folio_put(folio);
> if (err == -EEXIST)
> goto retry;
This failure path leaks mermap TLB entries on ther CPUs. So if an
attacker can trigger this path, and cause another CPU to populate its
TLB for the mermap region, and then cause a victim to allocate the page
they just freed, they can use those entries for a sidechannel attack.
I did call out in the cover letter that there's some jank with the "if you use
__GFP_UNMAPPED|__GFP_ZERO then you need to think about the TLB" thing.
But, this shows it's worse than I thought. I need to think about how to
mitigate this.
prev parent reply other threads:[~2026-03-31 14:40 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-20 18:23 [PATCH v2 00/22] mm: Add __GFP_UNMAPPED Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 01/22] x86/mm: split out preallocate_sub_pgd() Brendan Jackman
2026-03-20 19:42 ` Dave Hansen
2026-03-23 11:01 ` Brendan Jackman
2026-03-24 15:27 ` Borislav Petkov
2026-03-25 13:28 ` Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 02/22] x86/mm: Generalize LDT remap into "mm-local region" Brendan Jackman
2026-03-20 19:47 ` Dave Hansen
2026-03-23 12:01 ` Brendan Jackman
2026-03-23 12:57 ` Brendan Jackman
2026-03-25 14:23 ` Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 03/22] x86/tlb: Expose some flush function declarations to modules Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 04/22] mm: Create flags arg for __apply_to_page_range() Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 05/22] mm: Add more flags " Brendan Jackman
2026-03-26 16:14 ` Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 06/22] x86/mm: introduce the mermap Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 07/22] mm: KUnit tests for " Brendan Jackman
2026-03-24 8:00 ` kernel test robot
2026-03-20 18:23 ` [PATCH v2 08/22] mm: introduce for_each_free_list() Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 09/22] mm/page_alloc: don't overload migratetype in find_suitable_fallback() Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 10/22] mm: introduce freetype_t Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 11/22] mm: move migratetype definitions to freetype.h Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 12/22] mm: add definitions for allocating unmapped pages Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 13/22] mm: rejig pageblock mask definitions Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 14/22] mm: encode freetype flags in pageblock flags Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 15/22] mm/page_alloc: remove ifdefs from pindex helpers Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 16/22] mm/page_alloc: separate pcplists by freetype flags Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 17/22] mm/page_alloc: rename ALLOC_NON_BLOCK back to _HARDER Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 18/22] mm/page_alloc: introduce ALLOC_NOBLOCK Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 19/22] mm/page_alloc: implement __GFP_UNMAPPED allocations Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 20/22] mm/page_alloc: implement __GFP_UNMAPPED|__GFP_ZERO allocations Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 21/22] mm: Minimal KUnit tests for some new page_alloc logic Brendan Jackman
2026-03-20 18:23 ` [PATCH v2 22/22] mm/secretmem: Use __GFP_UNMAPPED when available Brendan Jackman
2026-03-31 14:40 ` Brendan Jackman [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DHH1NTVNTA8W.2313NYMA29J42@google.com \
--to=jackmanb@google.com \
--cc=akpm@linux-foundation.org \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=david.kaplan@amd.com \
--cc=david@kernel.org \
--cc=derkling@google.com \
--cc=hannes@cmpxchg.org \
--cc=itazur@amazon.co.uk \
--cc=kalyazin@amazon.co.uk \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ljs@kernel.org \
--cc=luto@kernel.org \
--cc=patrick.roy@linux.dev \
--cc=peterz@infradead.org \
--cc=reijiw@google.com \
--cc=rientjes@google.com \
--cc=rppt@kernel.org \
--cc=sumit.garg@oss.qualcomm.com \
--cc=tglx@kernel.org \
--cc=vbabka@kernel.org \
--cc=weixugc@google.com \
--cc=will@kernel.org \
--cc=x86@kernel.org \
--cc=yosry@kernel.org \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox