* [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse()
From: Andrii Nakryiko @ 2024-10-14 23:56 UTC
To: bpf, ast, daniel, martin.lau
Cc: linux-mm, linux-perf-users, linux-fsdevel, Andrii Nakryiko,
Yi Lai, Shakeel Butt
From memfd_secret(2) manpage:
The memory areas backing the file created with memfd_secret(2) are
visible only to the processes that have access to the file descriptor.
The memory region is removed from the kernel page tables and only the
page tables of the processes holding the file descriptor map the
corresponding physical memory. (Thus, the pages in the region can't be
accessed by the kernel itself, so that, for example, pointers to the
region can't be passed to system calls.)
We need to handle this special case gracefully in build ID fetching
code. Return -EACCES whenever a secretmem file is passed to the
build_id_parse() family of APIs. Original report and repro can be found in [0].
[0] https://lore.kernel.org/bpf/ZwyG8Uro%2FSyTXAni@ly-workstation/
Reported-by: Yi Lai <yi1.lai@intel.com>
Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
Fixes: de3ec364c3c3 ("lib/buildid: add single folio-based file reader abstraction")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
lib/buildid.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/lib/buildid.c b/lib/buildid.c
index 290641d92ac1..f0e6facf61c5 100644
--- a/lib/buildid.c
+++ b/lib/buildid.c
@@ -5,6 +5,7 @@
 #include <linux/elf.h>
 #include <linux/kernel.h>
 #include <linux/pagemap.h>
+#include <linux/secretmem.h>
 
 #define BUILD_ID 3
 
@@ -64,6 +65,10 @@ static int freader_get_folio(struct freader *r, loff_t file_off)
 
 	freader_put_folio(r);
 
+	/* reject secretmem folios created with memfd_secret() */
+	if (secretmem_mapping(r->file->f_mapping))
+		return -EACCES;
+
 	r->folio = filemap_get_folio(r->file->f_mapping, file_off >> PAGE_SHIFT);
 
 	/* if sleeping is allowed, wait for the page, if necessary */
--
2.43.5
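
The ordering in the hunk above matters: the mapping is rejected before any
page-cache lookup, so no secretmem folio is ever fetched. A minimal userspace
sketch of that ordering follows. All mock_* names are hypothetical stand-ins
for the kernel structures and helpers, not kernel API, and the lookup counter
exists only to make the ordering observable.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct address_space (assumption, not kernel API). */
struct mock_mapping {
	bool is_secretmem;	/* models the result of secretmem_mapping() */
	int lookups;		/* counts simulated page-cache lookups */
};

/* Hypothetical stand-in for struct freader. */
struct mock_reader {
	struct mock_mapping *mapping;
	const void *folio;	/* non-NULL while a simulated folio is held */
};

/* Dummy page contents used as the simulated folio. */
static const char mock_page[] = "\177ELF";

/* Models the patched freader_get_folio(): reject first, look up second. */
static int mock_get_folio(struct mock_reader *r)
{
	r->folio = NULL;		/* models freader_put_folio() */

	if (r->mapping->is_secretmem)
		return -EACCES;		/* bail out before touching the page cache */

	r->mapping->lookups++;		/* models filemap_get_folio() */
	r->folio = mock_page;
	return 0;
}
```

In this sketch a reader over a secretmem-like mapping gets -EACCES with the
lookup counter still at zero, while a regular mapping performs exactly one
lookup and ends up holding a folio.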

* Re: [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse()
From: Shakeel Butt @ 2024-10-15 0:00 UTC
To: Andrii Nakryiko
Cc: bpf, ast, daniel, martin.lau, linux-mm, linux-perf-users,
linux-fsdevel, Yi Lai

On Mon, Oct 14, 2024 at 04:56:31PM GMT, Andrii Nakryiko wrote:
> [...]
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Acked-by: Shakeel Butt <shakeel.butt@linux.dev>

* Re: [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse()
From: Shakeel Butt @ 2024-10-16 18:39 UTC
To: Andrii Nakryiko
Cc: bpf, ast, daniel, martin.lau, linux-mm, linux-perf-users,
linux-fsdevel, Yi Lai, pbonzini, seanjc, tabba, david, jackmanb,
yosryahmed, jannh, rppt

Ccing a couple more folks who are doing similar work (ASI, guest_memfd).

Folks, what is the generic way to check if a given mapping has folios
unmapped from the kernel address space?

On Mon, Oct 14, 2024 at 04:56:31PM GMT, Andrii Nakryiko wrote:
> [...]

* Re: [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse()
From: Yosry Ahmed @ 2024-10-16 19:59 UTC
To: Shakeel Butt
Cc: Andrii Nakryiko, bpf, ast, daniel, martin.lau, linux-mm,
linux-perf-users, linux-fsdevel, Yi Lai, pbonzini, seanjc, tabba,
david, jackmanb, jannh, rppt

On Wed, Oct 16, 2024 at 11:39 AM Shakeel Butt <shakeel.butt@linux.dev> wrote:
>
> Ccing a couple more folks who are doing similar work (ASI, guest_memfd).
>
> Folks, what is the generic way to check if a given mapping has folios
> unmapped from the kernel address space?

I suppose you mean specifically if a folio is not mapped in the direct
map, because a folio can also be mapped in other regions of the kernel
address space (e.g. vmalloc).

From my perspective of working on ASI on the x86 side, I think
lookup_address() is the right API to use. It returns a PTE and you can
check if it is present.

Based on that, I would say that the generic way is perhaps
kernel_page_present(), which does the above on x86; I am not sure about
other architectures. It seems like kernel_page_present() always
returns true with !CONFIG_ARCH_HAS_SET_DIRECT_MAP, which assumes that
unmapping folios from the direct map uses set_direct_map_*().

For secretmem, it seems like set_direct_map_*() is indeed the method
used to unmap folios. I am not sure if the same stands for
guest_memfd, but I don't see why not.

ASI does not use set_direct_map_*(), but it doesn't matter in this
context; read below if you care about the reasoning.

ASI does not unmap folios from the direct map in the kernel address
space, but it creates a new "restricted" address space that has the
folios unmapped from the direct map by default. However, I don't think
this is relevant here. IIUC, the purpose of this patch is to check if
the folio is accessible by the kernel, which should be true even in
the ASI restricted address space, because ASI will just transparently
switch to the unrestricted kernel address space where the folio is
mapped if needed.

I hope this helps.

> On Mon, Oct 14, 2024 at 04:56:31PM GMT, Andrii Nakryiko wrote:
> > [...]

* Re: [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse()
From: Shakeel Butt @ 2024-10-16 21:45 UTC
To: Yosry Ahmed
Cc: Andrii Nakryiko, bpf, ast, daniel, martin.lau, linux-mm,
linux-perf-users, linux-fsdevel, Yi Lai, pbonzini, seanjc, tabba,
david, jackmanb, jannh, rppt

On Wed, Oct 16, 2024 at 12:59:13PM GMT, Yosry Ahmed wrote:
> [...]
> I hope this helps.

Thanks a lot. This is really helpful.

* Re: [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse()
From: David Hildenbrand @ 2024-10-17 9:17 UTC
To: Shakeel Butt, Andrii Nakryiko
Cc: bpf, ast, daniel, martin.lau, linux-mm, linux-perf-users,
linux-fsdevel, Yi Lai, pbonzini, seanjc, tabba, jackmanb,
yosryahmed, jannh, rppt

On 16.10.24 20:39, Shakeel Butt wrote:
> Ccing a couple more folks who are doing similar work (ASI, guest_memfd).
>
> Folks, what is the generic way to check if a given mapping has folios
> unmapped from the kernel address space?

Can't we just look up the mapping and refuse these folios that really
shouldn't be looked at?

See gup_fast_folio_allowed() where we refuse secretmem_mapping().

--
Cheers,

David / dhildenb

* Re: [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse()
From: Shakeel Butt @ 2024-10-17 16:22 UTC
To: David Hildenbrand
Cc: Andrii Nakryiko, bpf, ast, daniel, martin.lau, linux-mm,
linux-perf-users, linux-fsdevel, Yi Lai, pbonzini, seanjc, tabba,
jackmanb, yosryahmed, jannh, rppt

On Thu, Oct 17, 2024 at 11:17:19AM GMT, David Hildenbrand wrote:
> On 16.10.24 20:39, Shakeel Butt wrote:
> > Ccing a couple more folks who are doing similar work (ASI, guest_memfd).
> >
> > Folks, what is the generic way to check if a given mapping has folios
> > unmapped from the kernel address space?
>
> Can't we just look up the mapping and refuse these folios that really
> shouldn't be looked at?
>
> See gup_fast_folio_allowed() where we refuse secretmem_mapping().

That is exactly what this patch is doing. See [1]. The reason I asked
this question is that I see parallel efforts related to guest_memfd
and ASI that are going to unmap folios from the direct map. (Yosry
already explained that ASI is a bit different.) We want a more robust
and future-proof solution.

[1] https://lore.kernel.org/all/20241014235631.1229438-1-andrii@kernel.org/

* Re: [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse()
From: David Hildenbrand @ 2024-10-17 17:41 UTC
To: Shakeel Butt
Cc: Andrii Nakryiko, bpf, ast, daniel, martin.lau, linux-mm,
linux-perf-users, linux-fsdevel, Yi Lai, pbonzini, seanjc, tabba,
jackmanb, yosryahmed, jannh, rppt

On 17.10.24 18:22, Shakeel Butt wrote:
> On Thu, Oct 17, 2024 at 11:17:19AM GMT, David Hildenbrand wrote:
>> On 16.10.24 20:39, Shakeel Butt wrote:
>>> Ccing a couple more folks who are doing similar work (ASI, guest_memfd).
>>>
>>> Folks, what is the generic way to check if a given mapping has folios
>>> unmapped from the kernel address space?
>>
>> Can't we just look up the mapping and refuse these folios that really
>> shouldn't be looked at?
>>
>> See gup_fast_folio_allowed() where we refuse secretmem_mapping().
>
> That is exactly what this patch is doing. See [1].

Hah! I should have looked at the full patch, not just the discussion
where I was CCed :)

> The reason I asked
> this question is that I see parallel efforts related to guest_memfd
> and ASI that are going to unmap folios from the direct map. (Yosry
> already explained that ASI is a bit different.) We want a more robust
> and future-proof solution.

There was a discussion a while ago about having the abstraction of
inaccessible mappings. See

https://lore.kernel.org/all/c87a4ba0-b9c4-4044-b0c3-c1112601494f@redhat.com/

It would be a more future-proof replacement of the secretmem checks.

--
Cheers,

David / dhildenb
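
To illustrate the idea of an "inaccessible mapping" abstraction raised at
the end of the thread, here is a rough userspace sketch in which a reader
consults a generic per-mapping flag instead of naming secretmem directly.
The demo_* types, the DEMO_AS_INACCESSIBLE bit, and both helpers are
hypothetical names chosen for this example; they are not taken from the
linked discussion or from the kernel sources.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-mapping flag bits (assumption, not the kernel's layout). */
enum demo_mapping_flags {
	DEMO_AS_INACCESSIBLE = 1u << 0,	/* kernel must not read the contents */
};

/* Hypothetical stand-in for struct address_space. */
struct demo_mapping {
	unsigned int flags;
};

/* Generic predicate: true for any mapping marked inaccessible. */
static bool demo_mapping_inaccessible(const struct demo_mapping *m)
{
	return (m->flags & DEMO_AS_INACCESSIBLE) != 0;
}

/* A reader-side check that does not need to know which subsystem set the flag. */
static int demo_reader_check(const struct demo_mapping *m)
{
	return demo_mapping_inaccessible(m) ? -EACCES : 0;
}
```

With a check of this shape, any subsystem that removes its folios from the
direct map would only need to set the flag on its mappings, and callers such
as the build ID reader would not need a new special case per subsystem.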