linux-mm.kvack.org archive mirror
* [BUG?] mm/secretmem: memory address mapped to memfd_secret can be used in write syscall.
@ 2023-11-08 11:47 David Wang
  2023-11-13  9:15 ` David Hildenbrand
  0 siblings, 1 reply; 5+ messages in thread
From: David Wang @ 2023-11-08 11:47 UTC (permalink / raw)
  To: akpm, linux-mm, linux-kernel


Hi,
According to https://lwn.net/Articles/865256/,
an address obtained via memfd_secret/ftruncate/mmap should not be usable in syscalls, since the memory is not accessible even to the kernel.

But my test shows that the "secret" memory can be passed to the write syscall. Is this expected behavior?
This is my test code:

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

int main() {
	// create the secret memory file descriptor
	int fd = syscall(__NR_memfd_secret, 0);
	if (fd < 0) {
		perror("Fail to create secret");
		return -1;
	}
	if (ftruncate(fd, 1024) < 0) {
		perror("Fail to size the secret");
		return -1;
	}
	char *key = mmap(NULL, 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	if (key == MAP_FAILED) {
		perror("Fail to mmap");
		return -1;
	}
	// should be some secure channel
	strcpy(key, "ThisIsAKey");
	// printf("[%d]key(%s) ready: %p\n", getpid(), key, key);
	// getchar();
	// make syscall, should err
	if (write(STDOUT_FILENO, key, strlen(key)) < 0)  //<-- Here the key shows up on stdout.
		perror("write");

	return 0;
}

Thanks
David




* Re: [BUG?] mm/secretmem: memory address mapped to memfd_secret can be used in write syscall.
  2023-11-08 11:47 [BUG?] mm/secretmem: memory address mapped to memfd_secret can be used in write syscall David Wang
@ 2023-11-13  9:15 ` David Hildenbrand
  2023-11-13 13:26   ` Theodore Ts'o
  0 siblings, 1 reply; 5+ messages in thread
From: David Hildenbrand @ 2023-11-13  9:15 UTC (permalink / raw)
  To: David Wang, akpm, linux-mm, linux-kernel; +Cc: Mike Rapoport

On 08.11.23 12:47, David Wang wrote:
> 
> Hi,
> According to https://lwn.net/Articles/865256/,
> an address obtained via memfd_secret/ftruncate/mmap should not be usable in syscalls, since the memory is not accessible even to the kernel.
> 
> But my test shows that the "secret" memory can be passed to the write syscall. Is this expected behavior?
> This is my test code:

CCing Mike.

According to the man page:

"The memory areas backing the file created with memfd_secret(2) are
visible only to the processes that have access to the file descriptor.
The memory region is removed from the kernel page tables and only the
page tables of the processes holding the file descriptor map the
corresponding physical memory. (Thus, the pages in the region can't be
accessed by the kernel itself, so that, for example, pointers to the
region can't be passed to system calls.)"

I'm not sure if the last part is actually true, if the syscalls end up 
walking user page tables to copy data in/out.

> 
> int main() {
> 	int fd = syscall(__NR_memfd_secret, 0);
> 	if (fd < 0) {
> 		perror("Fail to create secret");
> 		return -1;
> 	}
> 	if (ftruncate(fd, 1024) < 0) {
> 		perror("Fail to size the secret");
> 		return -1;
> 	}
> 	char *key = mmap(NULL, 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
> 	if (key == MAP_FAILED) {
> 		perror("Fail to mmap");
> 		return -1;
> 	}
> 	// should be some secure channel
> 	strcpy(key, "ThisIsAKey");
> 	// printf("[%d]key(%s) ready: %p\n", getpid(), key, key);
> 	// getchar();
> 	// make syscall, should err
> 	write(STDOUT_FILENO, key, strlen(key));  //<-- Here the key shows up on stdout.


What probably happens here is that the kernel reads the data via the 
user page tables, and can, therefore, access that memory just fine.

Looking at the selftest (tools/testing/selftests/mm/memfd_secret.c) we 
test that we cannot read from the memfd and cannot write to the memfd. 
We don't test whether other syscalls can access a user-provided buffer
that is backed by such a memfd.
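
For illustration, here is a minimal sketch of the kind of check described above: read(2) and write(2) issued directly on the memfd_secret descriptor are expected to fail. This is not the selftest code. The helper name secret_fd_io_denied() is hypothetical, and the fallback syscall number 447 is an assumption that holds for x86-64 and arm64 when the installed headers lack __NR_memfd_secret.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_memfd_secret
#define __NR_memfd_secret 447	/* assumption: x86-64 / arm64 value */
#endif

/*
 * Hypothetical helper, not taken from the selftest.
 * Returns  1 if read(2) and write(2) on the secretmem fd both fail,
 *          0 if either of them unexpectedly succeeds,
 *         -1 if memfd_secret(2) is unavailable or the fd cannot be sized.
 */
int secret_fd_io_denied(void)
{
	char buf[16] = "0123456789";
	ssize_t w, r;
	int fd = syscall(__NR_memfd_secret, 0);

	if (fd < 0)
		return -1;	/* e.g. ENOSYS, or secretmem disabled */
	if (ftruncate(fd, 4096) < 0) {
		close(fd);
		return -1;
	}

	/* File I/O directly on the descriptor is expected to be rejected. */
	w = write(fd, buf, sizeof(buf));
	r = read(fd, buf, sizeof(buf));

	close(fd);
	return (w < 0 && r < 0) ? 1 : 0;
}
```

Note that this only exercises I/O on the descriptor itself. It says nothing about passing a pointer into the mapped region to some other syscall, which is the case the original report is about.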

-- 
Cheers,

David / dhildenb




* Re: [BUG?] mm/secretmem: memory address mapped to memfd_secret can be used in write syscall.
  2023-11-13  9:15 ` David Hildenbrand
@ 2023-11-13 13:26   ` Theodore Ts'o
  2023-11-13 14:42     ` David Hildenbrand
  2023-11-13 15:42     ` David Wang
  0 siblings, 2 replies; 5+ messages in thread
From: Theodore Ts'o @ 2023-11-13 13:26 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: David Wang, akpm, linux-mm, linux-kernel, Mike Rapoport

On Mon, Nov 13, 2023 at 10:15:05AM +0100, David Hildenbrand wrote:
> 
> According to the man page:
> 
> "The memory areas backing the file created with memfd_secret(2) are visible
> only to the processes that have access to the file descriptor. The memory
> region is removed from the kernel page tables and only the page tables of
> the processes holding the file descriptor map the corresponding physical
> memory. (Thus, the pages in the region can't be accessed by the kernel
> itself, so that, for example, pointers to the region can't be passed to
> system calls.)"
> 
> I'm not sure if the last part is actually true, if the syscalls end up
> walking user page tables to copy data in/out.

The idea behind removing it from the kernel page tables is so that
kernel code running in some other process context won't be able to
reference the memory via the kernel address space.  (So if there is
some kind of kernel zero-day which allows arbitrary code execution,
the injected attack code would have to play games with page tables
before being able to reference the memory --- this is not
*impossible*, just more annoying.)

But if you are doing a buffered write, the copy from the user-supplied
buffer to the page cache is happening in the process's context.  So
"foreground kernel code" can dereference the user-supplied pointer
just fine.
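
A rough sketch of that second case, under a few assumptions: the helper name secret_roundtrip_via_pipe() is hypothetical, a pipe stands in for the buffered file write, and 447 is used as a fallback syscall number (x86-64 and arm64) if the headers do not define __NR_memfd_secret. The pointer handed to write(2) refers to secret memory, and the kernel copies from it while running in the calling process's context.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_memfd_secret
#define __NR_memfd_secret 447	/* assumption: x86-64 / arm64 value */
#endif

/*
 * Hypothetical helper for illustration only.
 * Returns  1 if data written from the secret mapping comes back intact,
 *          0 if the write or read fails or the data differs,
 *         -1 if the secret mapping or the pipe cannot be set up.
 */
int secret_roundtrip_via_pipe(void)
{
	static const char msg[] = "ThisIsAKey";
	char out[sizeof(msg)] = { 0 };
	int pfd[2];
	int ok = 0;
	char *key;
	int fd = syscall(__NR_memfd_secret, 0);

	if (fd < 0)
		return -1;
	if (ftruncate(fd, 4096) < 0) {
		close(fd);
		return -1;
	}
	key = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (key == MAP_FAILED) {
		close(fd);
		return -1;
	}
	if (pipe(pfd) < 0) {
		munmap(key, 4096);
		close(fd);
		return -1;
	}

	memcpy(key, msg, sizeof(msg));

	/* The source buffer of this write(2) lives in secret memory. */
	if (write(pfd[1], key, sizeof(msg)) == (ssize_t)sizeof(msg) &&
	    read(pfd[0], out, sizeof(out)) == (ssize_t)sizeof(out))
		ok = memcmp(out, msg, sizeof(msg)) == 0;

	close(pfd[0]);
	close(pfd[1]);
	munmap(key, 4096);
	close(fd);
	return ok;
}
```

If the explanation above holds, the helper returns 1 wherever memfd_secret(2) is available, matching what the original test program showed on stdout.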

Cheers,

						- Ted



* Re: [BUG?] mm/secretmem: memory address mapped to memfd_secret can be used in write syscall.
  2023-11-13 13:26   ` Theodore Ts'o
@ 2023-11-13 14:42     ` David Hildenbrand
  2023-11-13 15:42     ` David Wang
  1 sibling, 0 replies; 5+ messages in thread
From: David Hildenbrand @ 2023-11-13 14:42 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: David Wang, akpm, linux-mm, linux-kernel, Mike Rapoport

On 13.11.23 14:26, Theodore Ts'o wrote:
> On Mon, Nov 13, 2023 at 10:15:05AM +0100, David Hildenbrand wrote:
>>
>> According to the man page:
>>
>> "The memory areas backing the file created with memfd_secret(2) are visible
>> only to the processes that have access to the file descriptor. The memory
>> region is removed from the kernel page tables and only the page tables of
>> the processes holding the file descriptor map the corresponding physical
>> memory. (Thus, the pages in the region can't be accessed by the kernel
>> itself, so that, for example, pointers to the region can't be passed to
>> system calls.)"
>>
>> I'm not sure if the last part is actually true, if the syscalls end up
>> walking user page tables to copy data in/out.
> 
> The idea behind removing it from the kernel page tables is so that
> kernel code running in some other process context won't be able to
> reference the memory via the kernel address space.  (So if there is
> some kind of kernel zero-day which allows arbitrary code execution,
> the injected attack code would have to play games with page tables
> before being able to reference the memory --- this is not
> *impossible*, just more annoying.)
> 
> But if you are doing a buffered write, the copy from the user-supplied
> buffer to the page cache is happening in the process's context.  So
> "foreground kernel code" can dereference the user-supplied pointer
> just fine.

Right, so the statement in the man page is imprecise.

-- 
Cheers,

David / dhildenb




* Re: [BUG?] mm/secretmem: memory address mapped to memfd_secret can be used in write syscall.
  2023-11-13 13:26   ` Theodore Ts'o
  2023-11-13 14:42     ` David Hildenbrand
@ 2023-11-13 15:42     ` David Wang
  1 sibling, 0 replies; 5+ messages in thread
From: David Wang @ 2023-11-13 15:42 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: David Hildenbrand, akpm, linux-mm, linux-kernel, Mike Rapoport



At 2023-11-13 21:26:21, "Theodore Ts'o" <tytso@mit.edu> wrote:
>On Mon, Nov 13, 2023 at 10:15:05AM +0100, David Hildenbrand wrote:
>> 
>> According to the man page:
>> 
>> "The memory areas backing the file created with memfd_secret(2) are visible
>> only to the processes that have access to the file descriptor. The memory
>> region is removed from the kernel page tables and only the page tables of
>> the processes holding the file descriptor map the corresponding physical
>> memory. (Thus, the pages in the region can't be accessed by the kernel
>> itself, so that, for example, pointers to the region can't be passed to
>> system calls.)"
>> 
>> I'm not sure if the last part is actually true, if the syscalls end up
>> walking user page tables to copy data in/out.
>
>The idea behind removing it from the kernel page tables is so that
>kernel code running in some other process context won't be able to
>reference the memory via the kernel address space.  (So if there is
>some kind of kernel zero-day which allows arbitrary code execution,
>the injected attack code would have to play games with page tables
>before being able to reference the memory --- this is not
>*impossible*, just more annoying.)
>
>But if you are doing a buffered write, the copy from the user-supplied
>buffer to the page cache is happening in the process's context.  So
>"foreground kernel code" can dereference the user-supplied pointer
>just fine.
>

But the inconsistent treatment in the kernel (read/write on the memfd itself is denied, while passing the mmapped address to a syscall is allowed) is kind of confusing...
I thought the two should be treated the same way.

Thanks
David Wang

