* [BUG?] mm/secretmem: memory address mapped to memfd_secret can be used in write syscall.
From: David Wang @ 2023-11-08 11:47 UTC
To: akpm, linux-mm, linux-kernel
Hi,
According to https://lwn.net/Articles/865256/,
a memory address obtained via memfd_secret/ftruncate/mmap should not be usable in syscalls, since the memory is not accessible even to the kernel.
But my test shows that the "secret" memory can be passed to the write syscall. Is this expected behavior?
This is my test code:
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void) {
    // create the secretmem file descriptor
    int fd = syscall(__NR_memfd_secret, 0);
    if (fd < 0) {
        perror("Fail to create secret");
        return -1;
    }
    if (ftruncate(fd, 1024) < 0) {
        perror("Fail to size the secret");
        return -1;
    }
    char *key = mmap(NULL, 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    if (key == MAP_FAILED) {
        perror("Fail to mmap");
        return -1;
    }
    // should be some secure channel
    strcpy(key, "ThisIsAKey");
    // printf("[%d]key(%s) ready: %p\n", getpid(), key, key);
    // getchar();
    // make syscall, should err
    write(STDOUT_FILENO, key, strlen(key)); //<-- Here the key shows up on stdout.
    return 0;
}
Thanks
David
* Re: [BUG?] mm/secretmem: memory address mapped to memfd_secret can be used in write syscall.
From: David Hildenbrand @ 2023-11-13 9:15 UTC
To: David Wang, akpm, linux-mm, linux-kernel; +Cc: Mike Rapoport
On 08.11.23 12:47, David Wang wrote:
>
> Hi,
> According to https://lwn.net/Articles/865256/,
> the memory address got from memfd_secret/ftruncate/mmap should not be used by syscalls, since it is not accessible even by kernel.
>
> But my test result shows that the "secret" memory could be used in syscall write, is this expected behavior?
> This is my test code:
CCing Mike.
According to the man page:
"The memory areas backing the file created with memfd_secret(2) are
visible only to the processes that have access to the file descriptor.
The memory region is removed from the kernel page tables and only the
page tables of the processes holding the file descriptor map the
corresponding physical memory. (Thus, the pages in the region can't be
accessed by the kernel itself, so that, for example, pointers to the
region can't be passed to system calls.)"
I'm not sure if the last part is actually true, if the syscalls end up
walking user page tables to copy data in/out.
>
> int main() {
> int fd = syscall(__NR_memfd_secret, 0);
> if (fd < 0) {
> perror("Fail to create secret");
> return -1;
> }
> if (ftruncate(fd, 1024) < 0) {
> perror("Fail to size the secret");
> return -1;
> }
> char *key = mmap(NULL, 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
> if (key == MAP_FAILED) {
> perror("Fail to mmap");
> return -1;
> }
> // should be some secure channel
> strcpy(key, "ThisIsAKey");
> // printf("[%d]key(%s) ready: %p\n", getpid(), key, key);
> // getchar();
> // make syscall, should err
> write(STDOUT_FILENO, key, strlen(key)); //<-- Here the key shows up on stdout.
What probably happens here is that the kernel reads the data via the
user page tables, and can, therefore, access that memory just fine.
Looking at the selftest (tools/testing/selftests/mm/memfd_secret.c) we
test that we cannot read from the memfd and cannot write to the memfd.
We don't test if other syscalls can access that user-provided buffer
that is backed by a memfd.
--
Cheers,
David / dhildenb
* Re: [BUG?] mm/secretmem: memory address mapped to memfd_secret can be used in write syscall.
From: Theodore Ts'o @ 2023-11-13 13:26 UTC
To: David Hildenbrand; +Cc: David Wang, akpm, linux-mm, linux-kernel, Mike Rapoport
On Mon, Nov 13, 2023 at 10:15:05AM +0100, David Hildenbrand wrote:
>
> According to the man page:
>
> "The memory areas backing the file created with memfd_secret(2) are visible
> only to the processes that have access to the file descriptor. The memory
> region is removed from the kernel page tables and only the page tables of
> the processes holding the file descriptor map the corresponding physical
> memory. (Thus, the pages in the region can't be accessed by the kernel
> itself, so that, for example, pointers to the region can't be passed to
> system calls.)
>
> I'm not sure if the last part is actually true, if the syscalls end up
> walking user page tables to copy data in/out.
The idea behind removing it from the kernel page tables is so that
kernel code running in some other process context won't be able to
reference the memory via the kernel address space. (So if there is
some kind of kernel zero-day which allows arbitrary code execution,
the injected attack code would have to play games with page tables
before being able to reference the memory --- this is not
*impossible*, just more annoying.)
But if you are doing a buffered write, the copy from the user-supplied
buffer to the page cache is happening in the process's context. So
"foreground kernel code" can dereference the user-supplied pointer
just fine.
Cheers,
- Ted
* Re: [BUG?] mm/secretmem: memory address mapped to memfd_secret can be used in write syscall.
From: David Hildenbrand @ 2023-11-13 14:42 UTC
To: Theodore Ts'o; +Cc: David Wang, akpm, linux-mm, linux-kernel, Mike Rapoport
On 13.11.23 14:26, Theodore Ts'o wrote:
> On Mon, Nov 13, 2023 at 10:15:05AM +0100, David Hildenbrand wrote:
>>
>> According to the man page:
>>
>> "The memory areas backing the file created with memfd_secret(2) are visible
>> only to the processes that have access to the file descriptor. The memory
>> region is removed from the kernel page tables and only the page tables of
>> the processes holding the file descriptor map the corresponding physical
>> memory. (Thus, the pages in the region can't be accessed by the kernel
>> itself, so that, for example, pointers to the region can't be passed to
>> system calls.)
>>
>> I'm not sure if the last part is actually true, if the syscalls end up
>> walking user page tables to copy data in/out.
>
> The idea behind removing it from the kernel page tables is so that
> kernel code running in some other process context won't be able to
> reference the memory via the kernel address space. (So if there is
> some kind of kernel zero-day which allows arbitrary code execution,
> the injected attack code would have to play games with page tables
> before being able to reference the memory --- this is not
> *impossible*, just more annoying.)
>
> But if you are doing a buffered write, the copy from the user-supplied
> buffer to the page cache is happening in the process's context. So
> "foreground kernel code" can dereference the user-supplied pointer
> just fine.
Right, so the statement in the man page is imprecise.
--
Cheers,
David / dhildenb
* Re: [BUG?] mm/secretmem: memory address mapped to memfd_secret can be used in write syscall.
From: David Wang @ 2023-11-13 15:42 UTC
To: Theodore Ts'o
Cc: David Hildenbrand, akpm, linux-mm, linux-kernel, Mike Rapoport
At 2023-11-13 21:26:21, "Theodore Ts'o" <tytso@mit.edu> wrote:
>On Mon, Nov 13, 2023 at 10:15:05AM +0100, David Hildenbrand wrote:
>>
>> According to the man page:
>>
>> "The memory areas backing the file created with memfd_secret(2) are visible
>> only to the processes that have access to the file descriptor. The memory
>> region is removed from the kernel page tables and only the page tables of
>> the processes holding the file descriptor map the corresponding physical
>> memory. (Thus, the pages in the region can't be accessed by the kernel
>> itself, so that, for example, pointers to the region can't be passed to
>> system calls.)
>>
>> I'm not sure if the last part is actually true, if the syscalls end up
>> walking user page tables to copy data in/out.
>
>The idea behind removing it from the kernel page tables is so that
>kernel code running in some other process context won't be able to
>reference the memory via the kernel address space. (So if there is
>some kind of kernel zero-day which allows arbitrary code execution,
>the injected attack code would have to play games with page tables
>before being able to reference the memory --- this is not
>*impossible*, just more annoying.)
>
>But if you are doing a buffered write, the copy from the user-supplied
>buffer to the page cache is happening in the process's context. So
>"foreground kernel code" can dereference the user-supplied pointer
>just fine.
>
But the inconsistent treatment in the kernel, where I/O on the memfd itself is denied while the mmap'ed address is allowed, is kind of confusing...
I thought the two should be treated the same way.
Thanks
David Wang