linux-mm.kvack.org archive mirror
* [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse()
@ 2024-10-14 23:56 Andrii Nakryiko
  2024-10-15  0:00 ` Shakeel Butt
  2024-10-16 18:39 ` Shakeel Butt
  0 siblings, 2 replies; 8+ messages in thread
From: Andrii Nakryiko @ 2024-10-14 23:56 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau
  Cc: linux-mm, linux-perf-users, linux-fsdevel, Andrii Nakryiko,
	Yi Lai, Shakeel Butt

From memfd_secret(2) manpage:

  The memory areas backing the file created with memfd_secret(2) are
  visible only to the processes that have access to the file descriptor.
  The memory region is removed from the kernel page tables and only the
  page tables of the processes holding the file descriptor map the
  corresponding physical memory. (Thus, the pages in the region can't be
  accessed by the kernel itself, so that, for example, pointers to the
  region can't be passed to system calls.)

We need to handle this special case gracefully in the build ID fetching
code. Return -EACCES whenever a secretmem file is passed to the
build_id_parse() family of APIs. The original report and repro can be
found in [0].

  [0] https://lore.kernel.org/bpf/ZwyG8Uro%2FSyTXAni@ly-workstation/

Reported-by: Yi Lai <yi1.lai@intel.com>
Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
Fixes: de3ec364c3c3 ("lib/buildid: add single folio-based file reader abstraction")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 lib/buildid.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/lib/buildid.c b/lib/buildid.c
index 290641d92ac1..f0e6facf61c5 100644
--- a/lib/buildid.c
+++ b/lib/buildid.c
@@ -5,6 +5,7 @@
 #include <linux/elf.h>
 #include <linux/kernel.h>
 #include <linux/pagemap.h>
+#include <linux/secretmem.h>
 
 #define BUILD_ID 3
 
@@ -64,6 +65,10 @@ static int freader_get_folio(struct freader *r, loff_t file_off)
 
 	freader_put_folio(r);
 
+	/* reject secretmem folios created with memfd_secret() */
+	if (secretmem_mapping(r->file->f_mapping))
+		return -EACCES;
+
 	r->folio = filemap_get_folio(r->file->f_mapping, file_off >> PAGE_SHIFT);
 
 	/* if sleeping is allowed, wait for the page, if necessary */
-- 
2.43.5
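
For context, a minimal userspace sketch of how a secretmem-backed mapping
comes to exist is shown below. It is not the repro from [0]. The helper
name secretmem_smoke_test() is hypothetical, and the fallback syscall
number is an assumption for toolchains whose headers do not define
SYS_memfd_secret.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SYS_memfd_secret
#define SYS_memfd_secret 447 /* assumption: x86-64 / asm-generic number */
#endif

/*
 * Create a secretmem fd, size it to one page, map it and touch it.
 * Returns 0 on success or a negative errno, e.g. -ENOSYS when secretmem
 * is disabled or the running kernel does not provide it.
 */
static int secretmem_smoke_test(void)
{
	long page = sysconf(_SC_PAGESIZE);
	int fd, err = 0;
	char *p;

	assert(page > 0);

	fd = (int)syscall(SYS_memfd_secret, 0);
	if (fd < 0)
		return -errno;

	if (ftruncate(fd, page) < 0) {
		err = -errno;
		goto out;
	}

	/* secretmem only supports shared mappings */
	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		err = -errno;
		goto out;
	}

	/* Userspace can write here; the kernel direct map has no such page. */
	memcpy(p, "\177ELF", 4);
	munmap(p, (size_t)page);
out:
	close(fd);
	return err;
}
```

A VMA created this way is file-backed, so a stack trace that walks it can
hand its file to the build ID code, which is where the check added above
returns -EACCES instead of touching the folio.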




* Re: [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse()
  2024-10-14 23:56 [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse() Andrii Nakryiko
@ 2024-10-15  0:00 ` Shakeel Butt
  2024-10-16 18:39 ` Shakeel Butt
  1 sibling, 0 replies; 8+ messages in thread
From: Shakeel Butt @ 2024-10-15  0:00 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, daniel, martin.lau, linux-mm, linux-perf-users,
	linux-fsdevel, Yi Lai

On Mon, Oct 14, 2024 at 04:56:31PM GMT, Andrii Nakryiko wrote:
> From memfd_secret(2) manpage:
> 
>   The memory areas backing the file created with memfd_secret(2) are
>   visible only to the processes that have access to the file descriptor.
>   The memory region is removed from the kernel page tables and only the
>   page tables of the processes holding the file descriptor map the
>   corresponding physical memory. (Thus, the pages in the region can't be
>   accessed by the kernel itself, so that, for example, pointers to the
>   region can't be passed to system calls.)
> 
> We need to handle this special case gracefully in the build ID fetching
> code. Return -EACCES whenever a secretmem file is passed to the
> build_id_parse() family of APIs. The original report and repro can be
> found in [0].
> 
>   [0] https://lore.kernel.org/bpf/ZwyG8Uro%2FSyTXAni@ly-workstation/
> 
> Reported-by: Yi Lai <yi1.lai@intel.com>
> Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
> Fixes: de3ec364c3c3 ("lib/buildid: add single folio-based file reader abstraction")
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Acked-by: Shakeel Butt <shakeel.butt@linux.dev>



* Re: [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse()
  2024-10-14 23:56 [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse() Andrii Nakryiko
  2024-10-15  0:00 ` Shakeel Butt
@ 2024-10-16 18:39 ` Shakeel Butt
  2024-10-16 19:59   ` Yosry Ahmed
  2024-10-17  9:17   ` David Hildenbrand
  1 sibling, 2 replies; 8+ messages in thread
From: Shakeel Butt @ 2024-10-16 18:39 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, daniel, martin.lau, linux-mm, linux-perf-users,
	linux-fsdevel, Yi Lai, pbonzini, seanjc, tabba, david, jackmanb,
	yosryahmed, jannh, rppt

CCing a couple more folks who are doing similar work (ASI, guest_memfd).

Folks, what is the generic way to check if a given mapping has folios
unmapped from the kernel address space?




* Re: [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse()
  2024-10-16 18:39 ` Shakeel Butt
@ 2024-10-16 19:59   ` Yosry Ahmed
  2024-10-16 21:45     ` Shakeel Butt
  2024-10-17  9:17   ` David Hildenbrand
  1 sibling, 1 reply; 8+ messages in thread
From: Yosry Ahmed @ 2024-10-16 19:59 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Andrii Nakryiko, bpf, ast, daniel, martin.lau, linux-mm,
	linux-perf-users, linux-fsdevel, Yi Lai, pbonzini, seanjc, tabba,
	david, jackmanb, jannh, rppt

On Wed, Oct 16, 2024 at 11:39 AM Shakeel Butt <shakeel.butt@linux.dev> wrote:
>
> Ccing couple more folks who are doing similar work (ASI, guest_memfd)
>
> Folks, what is the generic way to check if a given mapping has folios
> unmapped from kernel address space?

I suppose you mean specifically if a folio is not mapped in the direct
map, because a folio can also be mapped in other regions of the kernel
address space (e.g. vmalloc).

From my perspective of working on ASI on the x86 side, I think
lookup_address() is the right API to use. It returns a PTE and you can
check if it is present.

Based on that, I would say that the generic way is perhaps
kernel_page_present(), which does the above on x86; I am not sure about
other architectures. It seems like kernel_page_present() always
returns true with !CONFIG_ARCH_HAS_SET_DIRECT_MAP, which assumes that
unmapping folios from the direct map uses set_direct_map_*().

For secretmem, it seems like set_direct_map_*() is indeed the method
used to unmap folios. I am not sure if the same stands for
guest_memfd, but I don't see why not.
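
To make the idea concrete, here is a rough userspace model of that check,
not kernel code. Every name prefixed with model_ is hypothetical, the
direct map is reduced to a bool array, and MODEL_HAS_SET_DIRECT_MAP only
stands in for CONFIG_ARCH_HAS_SET_DIRECT_MAP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_PAGES 16
#define MODEL_HAS_SET_DIRECT_MAP 1 /* stand-in for the Kconfig option */

/* true means the page has a present PTE in the modeled direct map */
static bool model_direct_map[MODEL_NR_PAGES];

static void model_direct_map_init(void)
{
	for (size_t i = 0; i < MODEL_NR_PAGES; i++)
		model_direct_map[i] = true;
}

/* Models unmapping one page from the direct map. */
static void model_set_direct_map_invalid(size_t pfn)
{
	assert(pfn < MODEL_NR_PAGES);
	model_direct_map[pfn] = false;
}

/* Models the presence check; without the option it always says "present". */
static bool model_kernel_page_present(size_t pfn)
{
#if MODEL_HAS_SET_DIRECT_MAP
	assert(pfn < MODEL_NR_PAGES);
	return model_direct_map[pfn];
#else
	(void)pfn;
	return true;
#endif
}

/* A folio is only safe to read if every one of its pages is present. */
static bool model_folio_in_direct_map(size_t first_pfn, size_t nr_pages)
{
	for (size_t i = 0; i < nr_pages; i++)
		if (!model_kernel_page_present(first_pfn + i))
			return false;
	return true;
}
```

The fallback branch shows why such a check is only as reliable as the
assumption that every unmapping goes through set_direct_map_*().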

ASI does not use set_direct_map_*(), but it doesn't matter in this
context, read below if you care about the reasoning.

ASI does not unmap folios from the direct map in the kernel address
space, but it creates a new "restricted" address space that has the
folios unmapped from the direct map by default. However, I don't think
this is relevant here. IIUC, the purpose of this patch is to check if
the folio is accessible by the kernel, which should be true even in
the ASI restricted address space, because ASI will just transparently
switch to the unrestricted kernel address space where the folio is
mapped if needed.

I hope this helps.





* Re: [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse()
  2024-10-16 19:59   ` Yosry Ahmed
@ 2024-10-16 21:45     ` Shakeel Butt
  0 siblings, 0 replies; 8+ messages in thread
From: Shakeel Butt @ 2024-10-16 21:45 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Andrii Nakryiko, bpf, ast, daniel, martin.lau, linux-mm,
	linux-perf-users, linux-fsdevel, Yi Lai, pbonzini, seanjc, tabba,
	david, jackmanb, jannh, rppt

On Wed, Oct 16, 2024 at 12:59:13PM GMT, Yosry Ahmed wrote:
> On Wed, Oct 16, 2024 at 11:39 AM Shakeel Butt <shakeel.butt@linux.dev> wrote:
> >
> > Ccing couple more folks who are doing similar work (ASI, guest_memfd)
> >
> > Folks, what is the generic way to check if a given mapping has folios
> > unmapped from kernel address space?
> 
> I suppose you mean specifically if a folio is not mapped in the direct
> map, because a folio can also be mapped in other regions of the kernel
> address space (e.g. vmalloc).
> 
> From my perspective of working on ASI on the x86 side, I think
> lookup_address() is the right API to use. It returns a PTE and you can
> check if it is present.
> 
> Based on that, I would say that the generic way is perhaps
> kernel_page_present(), which does the above on x86, not sure about
> other architectures. It seems like kernel_page_present() always
> returns true with !CONFIG_ARCH_HAS_SET_DIRECT_MAP, which assumes that
> unmapping folios from the direct map uses set_direct_map_*().
> 
> For secretmem, it seems like set_direct_map_*() is indeed the method
> used to unmap folios. I am not sure if the same stands for
> guest_memfd, but I don't see why not.
> 
> ASI does not use set_direct_map_*(), but it doesn't matter in this
> context, read below if you care about the reasoning.
> 
> ASI does not unmap folios from the direct map in the kernel address
> space, but it creates a new "restricted" address space that has the
> folios unmapped from the direct map by default. However, I don't think
> this is relevant here. IIUC, the purpose of this patch is to check if
> the folio is accessible by the kernel, which should be true even in
> the ASI restricted address space, because ASI will just transparently
> switch to the unrestricted kernel address space where the folio is
> mapped if needed.
> 
> I hope this helps.
> 

Thanks a lot. This is really helpful.




* Re: [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse()
  2024-10-16 18:39 ` Shakeel Butt
  2024-10-16 19:59   ` Yosry Ahmed
@ 2024-10-17  9:17   ` David Hildenbrand
  2024-10-17 16:22     ` Shakeel Butt
  1 sibling, 1 reply; 8+ messages in thread
From: David Hildenbrand @ 2024-10-17  9:17 UTC (permalink / raw)
  To: Shakeel Butt, Andrii Nakryiko
  Cc: bpf, ast, daniel, martin.lau, linux-mm, linux-perf-users,
	linux-fsdevel, Yi Lai, pbonzini, seanjc, tabba, jackmanb,
	yosryahmed, jannh, rppt

On 16.10.24 20:39, Shakeel Butt wrote:
> Ccing couple more folks who are doing similar work (ASI, guest_memfd)
> 
> Folks, what is the generic way to check if a given mapping has folios
> unmapped from kernel address space?


Can't we just look up the mapping and refuse these folios that really
shouldn't be looked at?

See gup_fast_folio_allowed() where we refuse secretmem_mapping().


-- 
Cheers,

David / dhildenb




* Re: [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse()
  2024-10-17  9:17   ` David Hildenbrand
@ 2024-10-17 16:22     ` Shakeel Butt
  2024-10-17 17:41       ` David Hildenbrand
  0 siblings, 1 reply; 8+ messages in thread
From: Shakeel Butt @ 2024-10-17 16:22 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Andrii Nakryiko, bpf, ast, daniel, martin.lau, linux-mm,
	linux-perf-users, linux-fsdevel, Yi Lai, pbonzini, seanjc, tabba,
	jackmanb, yosryahmed, jannh, rppt

On Thu, Oct 17, 2024 at 11:17:19AM GMT, David Hildenbrand wrote:
> On 16.10.24 20:39, Shakeel Butt wrote:
> > Ccing couple more folks who are doing similar work (ASI, guest_memfd)
> > 
> > Folks, what is the generic way to check if a given mapping has folios
> > unmapped from kernel address space?
> 
> 
> Can't we just lookup the mapping and refuse these folios that really
> shouldn't be looked at?
> 
> See gup_fast_folio_allowed() where we refuse secretmem_mapping().

That is exactly what this patch is doing. See [1]. I asked the question
because parallel efforts around guest_memfd and ASI are also going to
unmap folios from the direct map (Yosry already explained that ASI is a
bit different). We want a more robust and future-proof solution.


[1] https://lore.kernel.org/all/20241014235631.1229438-1-andrii@kernel.org/




* Re: [PATCH bpf] lib/buildid: handle memfd_secret() files in build_id_parse()
  2024-10-17 16:22     ` Shakeel Butt
@ 2024-10-17 17:41       ` David Hildenbrand
  0 siblings, 0 replies; 8+ messages in thread
From: David Hildenbrand @ 2024-10-17 17:41 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Andrii Nakryiko, bpf, ast, daniel, martin.lau, linux-mm,
	linux-perf-users, linux-fsdevel, Yi Lai, pbonzini, seanjc, tabba,
	jackmanb, yosryahmed, jannh, rppt

On 17.10.24 18:22, Shakeel Butt wrote:
> On Thu, Oct 17, 2024 at 11:17:19AM GMT, David Hildenbrand wrote:
>> On 16.10.24 20:39, Shakeel Butt wrote:
>>> Ccing couple more folks who are doing similar work (ASI, guest_memfd)
>>>
>>> Folks, what is the generic way to check if a given mapping has folios
>>> unmapped from kernel address space?
>>
>>
>> Can't we just lookup the mapping and refuse these folios that really
>> shouldn't be looked at?
>>
>> See gup_fast_folio_allowed() where we refuse secretmem_mapping().
> 
> That is exactly what this patch is doing. See [1].

Hah! I should have looked at the full patch not just the discussion 
where I was CCed :)

> The reason I asked
> this question was because I see parallel efforts related to guest_memfd
> and ASI are going to unmap folios from direct map. (Yosry already
> explained ASI is a bit different). We want a more robust and future
> proof solution.

There was a discussion a while ago about having the abstraction of 
inaccessible mappings.

See 
https://lore.kernel.org/all/c87a4ba0-b9c4-4044-b0c3-c1112601494f@redhat.com/

It would be a more future-proof replacement of the secretmem checks.
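
As a rough sketch of what such an abstraction could look like from the
reader's side, here is a userspace model, not kernel code. The flag
MODEL_AS_INACCESSIBLE, the struct model_mapping type and both helpers are
hypothetical names invented for illustration; only the -EACCES result
mirrors the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_PAGE_SHIFT 12

/* Hypothetical per-mapping flag: contents must not be read by the kernel. */
enum model_mapping_flags {
	MODEL_AS_INACCESSIBLE = 1 << 0,
};

struct model_mapping {
	unsigned int flags;
};

/* One predicate covers any mapping type that sets the flag. */
static bool model_mapping_inaccessible(const struct model_mapping *mapping)
{
	return (mapping->flags & MODEL_AS_INACCESSIBLE) != 0;
}

/*
 * Models the guard at the top of a folio lookup: refuse inaccessible
 * mappings before computing the page index for the file offset.
 */
static int model_reader_get_folio(const struct model_mapping *mapping,
				  unsigned long long file_off,
				  unsigned long *index_out)
{
	assert(mapping && index_out);

	if (model_mapping_inaccessible(mapping))
		return -EACCES;

	*index_out = (unsigned long)(file_off >> MODEL_PAGE_SHIFT);
	return 0;
}
```

With a predicate of this shape the caller no longer needs to know whether
the mapping belongs to secretmem or to some other user that removes its
folios from the direct map.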

-- 
Cheers,

David / dhildenb



