linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* Warning on mremapped uffd-wp memory
@ 2024-08-06 15:15 Ryan Roberts
  2024-08-06 16:37 ` David Hildenbrand
  0 siblings, 1 reply; 7+ messages in thread
From: Ryan Roberts @ 2024-08-06 15:15 UTC (permalink / raw)
  To: Peter Xu, David Hildenbrand; +Cc: Mark Rutland, Linux-MM

Hi Peter, David,

syzkaller has found an issue (at least on arm64, but I suspect it will be 
visible on x86_64 too) that triggers the following warning:

[ 2291.836518] ------------[ cut here ]------------
[ 2291.836528] WARNING: CPU: 3 PID: 9056 at mm/page_table_check.c:207 __page_table_check_ptes_set+0x22c/0x248
[ 2291.836541] Modules linked in:
[ 2291.836549] CPU: 3 UID: 1000 PID: 9056 Comm: bug Tainted: G        W          6.11.0-rc2-dirty #2
[ 2291.836554] Tainted: [W]=WARN
[ 2291.836557] Hardware name: linux,dummy-virt (DT)
[ 2291.836559] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 2291.836564] pc : __page_table_check_ptes_set+0x22c/0x248
[ 2291.836568] lr : ptep_modify_prot_commit+0x24c/0x2b0
[ 2291.836573] sp : ffff80008ca6ba20
[ 2291.836575] x29: ffff80008ca6ba20 x28: ffff186392d1eb00 x27: 0000000020ffd000
[ 2291.836598] x26: 0010000000000001 x25: 0000000000000001 x24: 0000000000000000
[ 2291.836605] x23: 04e800018c738f43 x22: 0000000000000001 x21: ffff1863824163c0
[ 2291.836612] x20: 04e800018c738f43 x19: 04e800018c738f43 x18: 0000fffff7f87fff
[ 2291.836619] x17: 0000000000000000 x16: 1fffe30c748d22a1 x15: 0060000000000fc3
[ 2291.836625] x14: 0000000000000000 x13: 0000000020ffd000 x12: 0000fffff7f87fff
[ 2291.836631] x11: 0000000020ffd000 x10: 0000000000000000 x9 : ffffbcab99e3ab84
[ 2291.836638] x8 : ffff186382b8f000 x7 : 0000000020ffe000 x6 : 0000000020ffd000
[ 2291.836644] x5 : ffff186392d1eb00 x4 : 04e800018c738f43 x3 : 0000000000000001
[ 2291.836650] x2 : 04e800018c738f43 x1 : ffff18639fe01fe8 x0 : ffffbcab9ce56780
[ 2291.836657] Call trace:
[ 2291.836659]  __page_table_check_ptes_set+0x22c/0x248
[ 2291.836664]  ptep_modify_prot_commit+0x24c/0x2b0
[ 2291.836667]  change_protection+0x8a0/0x1100
[ 2291.836672]  mprotect_fixup+0x124/0x2d0
[ 2291.836675]  do_mprotect_pkey.constprop.0+0x29c/0x460
[ 2291.836679]  __arm64_sys_mprotect+0x24/0xf8
[ 2291.836682]  invoke_syscall+0x50/0x120
[ 2291.836690]  el0_svc_common.constprop.0+0x48/0xf0
[ 2291.836694]  do_el0_svc+0x24/0x38
[ 2291.836699]  el0_svc+0x34/0xe0
[ 2291.836705]  el0t_64_sync_handler+0x100/0x130
[ 2291.836709]  el0t_64_sync+0x190/0x198
[ 2291.836713] ---[ end trace 0000000000000000 ]---

The generated program (see below) mmaps a 16M region (RWX). It then mlocks all 
current and future memory.

Next, it registers 12K (3 pages) for use with UFFD-WP, and marks 4 pages
UFFD-WP'ed. This returns ENOENT because we only registered 3 pages, but those 3
pages are still UFFD-WP'ed in their PTE, so this error is not relavent to the
bug. At this point, there is a single VMA covering the 12K, with VM_UFFD_WP set,
amongst other flags:

  20ffb000-20ffe000 rwxp 00000000 00:00 0 
  Size:                 12 kB
  KernelPageSize:        4 kB
  MMUPageSize:           4 kB
  Rss:                  12 kB
  Pss:                  12 kB
  Pss_Dirty:            12 kB
  Shared_Clean:          0 kB
  Shared_Dirty:          0 kB
  Private_Clean:         0 kB
  Private_Dirty:        12 kB
  Referenced:           12 kB
  Anonymous:            12 kB
  KSM:                   0 kB
  LazyFree:              0 kB
  AnonHugePages:         0 kB
  ShmemPmdMapped:        0 kB
  FilePmdMapped:         0 kB
  Shared_Hugetlb:        0 kB
  Private_Hugetlb:       0 kB
  Swap:                  0 kB
  SwapPss:               0 kB
  Locked:               12 kB
  THPeligible:           0
  VmFlags: rd wr ex mr mw me uw lo ac 

Next we mremap the first page to the address where the last page was previously
mapped, with MREMAP_DONTUNMAP. This leads to 2 VMAs, but the new one doesn't
have VM_UFFD_WP set (Note also that the original VMA no longer has VM_LOCKED
which seems wrong to me, but I'll ignore that for now):

  20ffb000-20ffd000 rwxp 00000000 00:00 0 
  Size:                  8 kB
  KernelPageSize:        4 kB
  MMUPageSize:           4 kB
  Rss:                   4 kB
  Pss:                   4 kB
  Pss_Dirty:             4 kB
  Shared_Clean:          0 kB
  Shared_Dirty:          0 kB
  Private_Clean:         0 kB
  Private_Dirty:         4 kB
  Referenced:            4 kB
  Anonymous:             4 kB
  KSM:                   0 kB
  LazyFree:              0 kB
  AnonHugePages:         0 kB
  ShmemPmdMapped:        0 kB
  FilePmdMapped:         0 kB
  Shared_Hugetlb:        0 kB
  Private_Hugetlb:       0 kB
  Swap:                  0 kB
  SwapPss:               0 kB
  Locked:                0 kB
  THPeligible:           0
  VmFlags: rd wr ex mr mw me uw ac 
  20ffd000-20ffe000 rwxp 00000000 00:00 0 
  Size:                  4 kB
  KernelPageSize:        4 kB
  MMUPageSize:           4 kB
  Rss:                   4 kB
  Pss:                   4 kB
  Pss_Dirty:             4 kB
  Shared_Clean:          0 kB
  Shared_Dirty:          0 kB
  Private_Clean:         0 kB
  Private_Dirty:         4 kB
  Referenced:            4 kB
  Anonymous:             4 kB
  KSM:                   0 kB
  LazyFree:              0 kB
  AnonHugePages:         0 kB
  ShmemPmdMapped:        0 kB
  FilePmdMapped:         0 kB
  Shared_Hugetlb:        0 kB
  Private_Hugetlb:       0 kB
  Swap:                  0 kB
  SwapPss:               0 kB
  Locked:                4 kB
  THPeligible:           0
  VmFlags: rd wr ex mr mw me lo ac

Finally we try to mprotect that last 4K region to remove X, and we get the
warning saying the PTE has both the UFFD-WP and WRITE bits set. 

I'm guessing this is because the VM_UFFD_WP flag got spuriously dropped when
creating the final 4K VMA and so mprotect's can_change_pte_writable() check
incorrectly allowed the pte to be marked writable. But the mremap man page is
not very clear on the semantics when interacting with uffd regions; perhaps
uffd-wp bit should have been cleared when mremapping the ptes?

I'm hoping you can advice on the expected semantics and we can figure out how to
solve this?


The reproducer is as follows (with a few annotations added by me):

"""
// autogenerated by syzkaller (https://github.com/google/syzkaller)

#define _GNU_SOURCE

#include <endian.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef __NR_ioctl
#define __NR_ioctl 29
#endif
#ifndef __NR_mlockall
#define __NR_mlockall 230
#endif
#ifndef __NR_mmap
#define __NR_mmap 222
#endif
#ifndef __NR_mprotect
#define __NR_mprotect 226
#endif
#ifndef __NR_mremap
#define __NR_mremap 216
#endif
#ifndef __NR_userfaultfd
#define __NR_userfaultfd 282
#endif

uint64_t r[1] = {0xffffffffffffffff};

int main(void)
{
	intptr_t res = 0;

	syscall(__NR_mmap, /*addr=*/0x1ffff000ul, /*len=*/0x1000ul, /*prot=*/0ul, /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
	syscall(__NR_mmap, /*addr=*/0x20000000ul, /*len=*/0x1000000ul, /*prot=PROT_WRITE|PROT_READ|PROT_EXEC*/7ul, /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
	syscall(__NR_mmap, /*addr=*/0x21000000ul, /*len=*/0x1000ul, /*prot=*/0ul, /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/-1, /*offset=*/0ul);

	write(1, "executing program\n", sizeof("executing program\n") - 1);

	// userfaultfd(UFFD_USER_MODE_ONLY)        = 3
	res = syscall(__NR_userfaultfd, /*flags=UFFD_USER_MODE_ONLY*/1ul);
	if (res != -1)
		r[0] = res;

	// ioctl(3, UFFDIO_API, {api=0xaa, features=0 => features=UFFD_FEATURE_PAGEFAULT_FLAG_WP|UFFD_FEATURE_EVENT_FORK|UFFD_FEATURE_EVENT_REMAP|UFFD_FEATURE_EVENT_REMOVE|UFFD_FEATURE_MISSING_HUGETLBFS|UFFD_FEATURE_MISSING_SHMEM|UFFD_FEATURE_EVENT_UNMAP|UFFD_FEATURE_SIGBUS|UFFD_FEATURE_THREAD_ID|UFFD_FEATURE_MINOR_HUGETLBFS|UFFD_FEATURE_MINOR_SHMEM|0x1f800, ioctls=1<<_UFFDIO_REGISTER|1<<_UFFDIO_UNREGISTER|1<<_UFFDIO_API}) = 0
	*(uint64_t*)0x20000000 = 0xaa;
	*(uint64_t*)0x20000008 = 0;
	*(uint64_t*)0x20000010 = 0;
	syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc018aa3f, /*arg=*/0x20000000ul);

	syscall(__NR_mlockall, /*flags=MCL_FUTURE|MCL_CURRENT*/3ul);

	// ioctl(3, UFFDIO_REGISTER, {range={start=0x20ffb000, len=0x3000}, mode=UFFDIO_REGISTER_MODE_WP, ioctls=1<<_UFFDIO_WAKE|1<<_UFFDIO_COPY|1<<_UFFDIO_ZEROPAGE|1<<_UFFDIO_WRITEPROTECT|0x120}) = 0
	*(uint64_t*)0x20000180 = 0x20ffb000;
	*(uint64_t*)0x20000188 = 0x3000;
	*(uint64_t*)0x20000190 = 2;
	*(uint64_t*)0x20000198 = 0;
	syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc020aa00, /*arg=*/0x20000180ul);

	// ioctl(3, UFFDIO_WRITEPROTECT, 0x20000080) = -1 ENOENT (No such file or directory)
	*(uint64_t*)0x20000080 = 0x20ffb000;
	*(uint64_t*)0x20000088 = 0x4000;
	*(uint64_t*)0x20000090 = 1;
	syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc018aa06, /*arg=*/0x20000080ul);

	syscall(__NR_mremap, /*addr=*/0x20ffb000ul, /*len=*/0x1000ul, /*newlen=*/0x1000ul, /*flags=MREMAP_DONTUNMAP|MREMAP_FIXED|MREMAP_MAYMOVE*/7ul, /*newaddr=*/0x20ffd000ul);
	syscall(__NR_mprotect, /*addr=*/0x20ffd000ul, /*len=*/0x1000ul, /*prot=PROT_WRITE|PROT_READ*/3ul);

	return 0;
}
"""

I'd appreciate any thoughts you may have!

Thanks,
Ryan



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Warning on mremapped uffd-wp memory
  2024-08-06 15:15 Warning on mremapped uffd-wp memory Ryan Roberts
@ 2024-08-06 16:37 ` David Hildenbrand
  2024-08-06 16:58   ` Ryan Roberts
       [not found]   ` <ZrKHtOJ7wxk0V9Pl@x1n>
  0 siblings, 2 replies; 7+ messages in thread
From: David Hildenbrand @ 2024-08-06 16:37 UTC (permalink / raw)
  To: Ryan Roberts, Peter Xu; +Cc: Mark Rutland, Linux-MM

On 06.08.24 17:15, Ryan Roberts wrote:
> Hi Peter, David,
> 
> syzkaller has found an issue (at least on arm64, but I suspect it will be
> visible on x86_64 too) that triggers the following warning:
> 
> [ 2291.836518] ------------[ cut here ]------------
> [ 2291.836528] WARNING: CPU: 3 PID: 9056 at mm/page_table_check.c:207 __page_table_check_ptes_set+0x22c/0x248
> [ 2291.836541] Modules linked in:
> [ 2291.836549] CPU: 3 UID: 1000 PID: 9056 Comm: bug Tainted: G        W          6.11.0-rc2-dirty #2
> [ 2291.836554] Tainted: [W]=WARN
> [ 2291.836557] Hardware name: linux,dummy-virt (DT)
> [ 2291.836559] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 2291.836564] pc : __page_table_check_ptes_set+0x22c/0x248
> [ 2291.836568] lr : ptep_modify_prot_commit+0x24c/0x2b0
> [ 2291.836573] sp : ffff80008ca6ba20
> [ 2291.836575] x29: ffff80008ca6ba20 x28: ffff186392d1eb00 x27: 0000000020ffd000
> [ 2291.836598] x26: 0010000000000001 x25: 0000000000000001 x24: 0000000000000000
> [ 2291.836605] x23: 04e800018c738f43 x22: 0000000000000001 x21: ffff1863824163c0
> [ 2291.836612] x20: 04e800018c738f43 x19: 04e800018c738f43 x18: 0000fffff7f87fff
> [ 2291.836619] x17: 0000000000000000 x16: 1fffe30c748d22a1 x15: 0060000000000fc3
> [ 2291.836625] x14: 0000000000000000 x13: 0000000020ffd000 x12: 0000fffff7f87fff
> [ 2291.836631] x11: 0000000020ffd000 x10: 0000000000000000 x9 : ffffbcab99e3ab84
> [ 2291.836638] x8 : ffff186382b8f000 x7 : 0000000020ffe000 x6 : 0000000020ffd000
> [ 2291.836644] x5 : ffff186392d1eb00 x4 : 04e800018c738f43 x3 : 0000000000000001
> [ 2291.836650] x2 : 04e800018c738f43 x1 : ffff18639fe01fe8 x0 : ffffbcab9ce56780
> [ 2291.836657] Call trace:
> [ 2291.836659]  __page_table_check_ptes_set+0x22c/0x248
> [ 2291.836664]  ptep_modify_prot_commit+0x24c/0x2b0
> [ 2291.836667]  change_protection+0x8a0/0x1100
> [ 2291.836672]  mprotect_fixup+0x124/0x2d0
> [ 2291.836675]  do_mprotect_pkey.constprop.0+0x29c/0x460
> [ 2291.836679]  __arm64_sys_mprotect+0x24/0xf8
> [ 2291.836682]  invoke_syscall+0x50/0x120
> [ 2291.836690]  el0_svc_common.constprop.0+0x48/0xf0
> [ 2291.836694]  do_el0_svc+0x24/0x38
> [ 2291.836699]  el0_svc+0x34/0xe0
> [ 2291.836705]  el0t_64_sync_handler+0x100/0x130
> [ 2291.836709]  el0t_64_sync+0x190/0x198
> [ 2291.836713] ---[ end trace 0000000000000000 ]---
> 
> The generated program (see below) mmaps a 16M region (RWX). It then mlocks all
> current and future memory.
> 
> Next, it registers 12K (3 pages) for use with UFFD-WP, and marks 4 pages
> UFFD-WP'ed. This returns ENOENT because we only registered 3 pages, but those 3
> pages are still UFFD-WP'ed in their PTE, so this error is not relavent to the
> bug. At this point, there is a single VMA covering the 12K, with VM_UFFD_WP set,
> amongst other flags:
> 
>    20ffb000-20ffe000 rwxp 00000000 00:00 0
>    Size:                 12 kB
>    KernelPageSize:        4 kB
>    MMUPageSize:           4 kB
>    Rss:                  12 kB
>    Pss:                  12 kB
>    Pss_Dirty:            12 kB
>    Shared_Clean:          0 kB
>    Shared_Dirty:          0 kB
>    Private_Clean:         0 kB
>    Private_Dirty:        12 kB
>    Referenced:           12 kB
>    Anonymous:            12 kB
>    KSM:                   0 kB
>    LazyFree:              0 kB
>    AnonHugePages:         0 kB
>    ShmemPmdMapped:        0 kB
>    FilePmdMapped:         0 kB
>    Shared_Hugetlb:        0 kB
>    Private_Hugetlb:       0 kB
>    Swap:                  0 kB
>    SwapPss:               0 kB
>    Locked:               12 kB
>    THPeligible:           0
>    VmFlags: rd wr ex mr mw me uw lo ac
> 
> Next we mremap the first page to the address where the last page was previously
> mapped, with MREMAP_DONTUNMAP. This leads to 2 VMAs, but the new one doesn't
> have VM_UFFD_WP set (Note also that the original VMA no longer has VM_LOCKED
> which seems wrong to me, but I'll ignore that for now):
> 
>    20ffb000-20ffd000 rwxp 00000000 00:00 0
>    Size:                  8 kB
>    KernelPageSize:        4 kB
>    MMUPageSize:           4 kB
>    Rss:                   4 kB
>    Pss:                   4 kB
>    Pss_Dirty:             4 kB
>    Shared_Clean:          0 kB
>    Shared_Dirty:          0 kB
>    Private_Clean:         0 kB
>    Private_Dirty:         4 kB
>    Referenced:            4 kB
>    Anonymous:             4 kB
>    KSM:                   0 kB
>    LazyFree:              0 kB
>    AnonHugePages:         0 kB
>    ShmemPmdMapped:        0 kB
>    FilePmdMapped:         0 kB
>    Shared_Hugetlb:        0 kB
>    Private_Hugetlb:       0 kB
>    Swap:                  0 kB
>    SwapPss:               0 kB
>    Locked:                0 kB
>    THPeligible:           0
>    VmFlags: rd wr ex mr mw me uw ac
>    20ffd000-20ffe000 rwxp 00000000 00:00 0
>    Size:                  4 kB
>    KernelPageSize:        4 kB
>    MMUPageSize:           4 kB
>    Rss:                   4 kB
>    Pss:                   4 kB
>    Pss_Dirty:             4 kB
>    Shared_Clean:          0 kB
>    Shared_Dirty:          0 kB
>    Private_Clean:         0 kB
>    Private_Dirty:         4 kB
>    Referenced:            4 kB
>    Anonymous:             4 kB
>    KSM:                   0 kB
>    LazyFree:              0 kB
>    AnonHugePages:         0 kB
>    ShmemPmdMapped:        0 kB
>    FilePmdMapped:         0 kB
>    Shared_Hugetlb:        0 kB
>    Private_Hugetlb:       0 kB
>    Swap:                  0 kB
>    SwapPss:               0 kB
>    Locked:                4 kB
>    THPeligible:           0
>    VmFlags: rd wr ex mr mw me lo ac
> 
> Finally we try to mprotect that last 4K region to remove X, and we get the
> warning saying the PTE has both the UFFD-WP and WRITE bits set.
> 
> I'm guessing this is because the VM_UFFD_WP flag got spuriously dropped when
> creating the final 4K VMA and so mprotect's can_change_pte_writable() check
> incorrectly allowed the pte to be marked writable. But the mremap man page is
> not very clear on the semantics when interacting with uffd regions; perhaps
> uffd-wp bit should have been cleared when mremapping the ptes?
> 
> I'm hoping you can advice on the expected semantics and we can figure out how to
> solve this?
> 
> 
> The reproducer is as follows (with a few annotations added by me):
> 
> """
> // autogenerated by syzkaller (https://github.com/google/syzkaller)
> 
> #define _GNU_SOURCE
> 
> #include <endian.h>
> #include <stdint.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <sys/syscall.h>
> #include <sys/types.h>
> #include <unistd.h>
> 
> #ifndef __NR_ioctl
> #define __NR_ioctl 29
> #endif
> #ifndef __NR_mlockall
> #define __NR_mlockall 230
> #endif
> #ifndef __NR_mmap
> #define __NR_mmap 222
> #endif
> #ifndef __NR_mprotect
> #define __NR_mprotect 226
> #endif
> #ifndef __NR_mremap
> #define __NR_mremap 216
> #endif
> #ifndef __NR_userfaultfd
> #define __NR_userfaultfd 282
> #endif
> 
> uint64_t r[1] = {0xffffffffffffffff};
> 
> int main(void)
> {
> 	intptr_t res = 0;
> 
> 	syscall(__NR_mmap, /*addr=*/0x1ffff000ul, /*len=*/0x1000ul, /*prot=*/0ul, /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
> 	syscall(__NR_mmap, /*addr=*/0x20000000ul, /*len=*/0x1000000ul, /*prot=PROT_WRITE|PROT_READ|PROT_EXEC*/7ul, /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
> 	syscall(__NR_mmap, /*addr=*/0x21000000ul, /*len=*/0x1000ul, /*prot=*/0ul, /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
> 
> 	write(1, "executing program\n", sizeof("executing program\n") - 1);
> 
> 	// userfaultfd(UFFD_USER_MODE_ONLY)        = 3
> 	res = syscall(__NR_userfaultfd, /*flags=UFFD_USER_MODE_ONLY*/1ul);
> 	if (res != -1)
> 		r[0] = res;
> 
> 	// ioctl(3, UFFDIO_API, {api=0xaa, features=0 => features=UFFD_FEATURE_PAGEFAULT_FLAG_WP|UFFD_FEATURE_EVENT_FORK|UFFD_FEATURE_EVENT_REMAP|UFFD_FEATURE_EVENT_REMOVE|UFFD_FEATURE_MISSING_HUGETLBFS|UFFD_FEATURE_MISSING_SHMEM|UFFD_FEATURE_EVENT_UNMAP|UFFD_FEATURE_SIGBUS|UFFD_FEATURE_THREAD_ID|UFFD_FEATURE_MINOR_HUGETLBFS|UFFD_FEATURE_MINOR_SHMEM|0x1f800, ioctls=1<<_UFFDIO_REGISTER|1<<_UFFDIO_UNREGISTER|1<<_UFFDIO_API}) = 0
> 	*(uint64_t*)0x20000000 = 0xaa;
> 	*(uint64_t*)0x20000008 = 0;
> 	*(uint64_t*)0x20000010 = 0;
> 	syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc018aa3f, /*arg=*/0x20000000ul);
> 
> 	syscall(__NR_mlockall, /*flags=MCL_FUTURE|MCL_CURRENT*/3ul);
> 
> 	// ioctl(3, UFFDIO_REGISTER, {range={start=0x20ffb000, len=0x3000}, mode=UFFDIO_REGISTER_MODE_WP, ioctls=1<<_UFFDIO_WAKE|1<<_UFFDIO_COPY|1<<_UFFDIO_ZEROPAGE|1<<_UFFDIO_WRITEPROTECT|0x120}) = 0
> 	*(uint64_t*)0x20000180 = 0x20ffb000;
> 	*(uint64_t*)0x20000188 = 0x3000;
> 	*(uint64_t*)0x20000190 = 2;
> 	*(uint64_t*)0x20000198 = 0;
> 	syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc020aa00, /*arg=*/0x20000180ul);
> 
> 	// ioctl(3, UFFDIO_WRITEPROTECT, 0x20000080) = -1 ENOENT (No such file or directory)
> 	*(uint64_t*)0x20000080 = 0x20ffb000;
> 	*(uint64_t*)0x20000088 = 0x4000;
> 	*(uint64_t*)0x20000090 = 1;
> 	syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc018aa06, /*arg=*/0x20000080ul);
> 
> 	syscall(__NR_mremap, /*addr=*/0x20ffb000ul, /*len=*/0x1000ul, /*newlen=*/0x1000ul, /*flags=MREMAP_DONTUNMAP|MREMAP_FIXED|MREMAP_MAYMOVE*/7ul, /*newaddr=*/0x20ffd000ul);
> 	syscall(__NR_mprotect, /*addr=*/0x20ffd000ul, /*len=*/0x1000ul, /*prot=PROT_WRITE|PROT_READ*/3ul);
> 
> 	return 0;
> }
> """
> 
> I'd appreciate any thoughts you may have!

Interesting. Either the vma flag shouldn't get dropped or we should 
un-mark the PTEs.

Is the vma flag maybe getting dropped because of some weird interaction 
with UFFD_EVENT_REMAP?

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Warning on mremapped uffd-wp memory
  2024-08-06 16:37 ` David Hildenbrand
@ 2024-08-06 16:58   ` Ryan Roberts
       [not found]   ` <ZrKHtOJ7wxk0V9Pl@x1n>
  1 sibling, 0 replies; 7+ messages in thread
From: Ryan Roberts @ 2024-08-06 16:58 UTC (permalink / raw)
  To: David Hildenbrand, Peter Xu; +Cc: Mark Rutland, Linux-MM

On 06/08/2024 17:37, David Hildenbrand wrote:
> On 06.08.24 17:15, Ryan Roberts wrote:
>> Hi Peter, David,
>>
>> syzkaller has found an issue (at least on arm64, but I suspect it will be
>> visible on x86_64 too) that triggers the following warning:
>>
>> [ 2291.836518] ------------[ cut here ]------------
>> [ 2291.836528] WARNING: CPU: 3 PID: 9056 at mm/page_table_check.c:207
>> __page_table_check_ptes_set+0x22c/0x248
>> [ 2291.836541] Modules linked in:
>> [ 2291.836549] CPU: 3 UID: 1000 PID: 9056 Comm: bug Tainted: G       
>> W          6.11.0-rc2-dirty #2
>> [ 2291.836554] Tainted: [W]=WARN
>> [ 2291.836557] Hardware name: linux,dummy-virt (DT)
>> [ 2291.836559] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>> [ 2291.836564] pc : __page_table_check_ptes_set+0x22c/0x248
>> [ 2291.836568] lr : ptep_modify_prot_commit+0x24c/0x2b0
>> [ 2291.836573] sp : ffff80008ca6ba20
>> [ 2291.836575] x29: ffff80008ca6ba20 x28: ffff186392d1eb00 x27: 0000000020ffd000
>> [ 2291.836598] x26: 0010000000000001 x25: 0000000000000001 x24: 0000000000000000
>> [ 2291.836605] x23: 04e800018c738f43 x22: 0000000000000001 x21: ffff1863824163c0
>> [ 2291.836612] x20: 04e800018c738f43 x19: 04e800018c738f43 x18: 0000fffff7f87fff
>> [ 2291.836619] x17: 0000000000000000 x16: 1fffe30c748d22a1 x15: 0060000000000fc3
>> [ 2291.836625] x14: 0000000000000000 x13: 0000000020ffd000 x12: 0000fffff7f87fff
>> [ 2291.836631] x11: 0000000020ffd000 x10: 0000000000000000 x9 : ffffbcab99e3ab84
>> [ 2291.836638] x8 : ffff186382b8f000 x7 : 0000000020ffe000 x6 : 0000000020ffd000
>> [ 2291.836644] x5 : ffff186392d1eb00 x4 : 04e800018c738f43 x3 : 0000000000000001
>> [ 2291.836650] x2 : 04e800018c738f43 x1 : ffff18639fe01fe8 x0 : ffffbcab9ce56780
>> [ 2291.836657] Call trace:
>> [ 2291.836659]  __page_table_check_ptes_set+0x22c/0x248
>> [ 2291.836664]  ptep_modify_prot_commit+0x24c/0x2b0
>> [ 2291.836667]  change_protection+0x8a0/0x1100
>> [ 2291.836672]  mprotect_fixup+0x124/0x2d0
>> [ 2291.836675]  do_mprotect_pkey.constprop.0+0x29c/0x460
>> [ 2291.836679]  __arm64_sys_mprotect+0x24/0xf8
>> [ 2291.836682]  invoke_syscall+0x50/0x120
>> [ 2291.836690]  el0_svc_common.constprop.0+0x48/0xf0
>> [ 2291.836694]  do_el0_svc+0x24/0x38
>> [ 2291.836699]  el0_svc+0x34/0xe0
>> [ 2291.836705]  el0t_64_sync_handler+0x100/0x130
>> [ 2291.836709]  el0t_64_sync+0x190/0x198
>> [ 2291.836713] ---[ end trace 0000000000000000 ]---
>>
>> The generated program (see below) mmaps a 16M region (RWX). It then mlocks all
>> current and future memory.
>>
>> Next, it registers 12K (3 pages) for use with UFFD-WP, and marks 4 pages
>> UFFD-WP'ed. This returns ENOENT because we only registered 3 pages, but those 3
>> pages are still UFFD-WP'ed in their PTE, so this error is not relavent to the
>> bug. At this point, there is a single VMA covering the 12K, with VM_UFFD_WP set,
>> amongst other flags:
>>
>>    20ffb000-20ffe000 rwxp 00000000 00:00 0
>>    Size:                 12 kB
>>    KernelPageSize:        4 kB
>>    MMUPageSize:           4 kB
>>    Rss:                  12 kB
>>    Pss:                  12 kB
>>    Pss_Dirty:            12 kB
>>    Shared_Clean:          0 kB
>>    Shared_Dirty:          0 kB
>>    Private_Clean:         0 kB
>>    Private_Dirty:        12 kB
>>    Referenced:           12 kB
>>    Anonymous:            12 kB
>>    KSM:                   0 kB
>>    LazyFree:              0 kB
>>    AnonHugePages:         0 kB
>>    ShmemPmdMapped:        0 kB
>>    FilePmdMapped:         0 kB
>>    Shared_Hugetlb:        0 kB
>>    Private_Hugetlb:       0 kB
>>    Swap:                  0 kB
>>    SwapPss:               0 kB
>>    Locked:               12 kB
>>    THPeligible:           0
>>    VmFlags: rd wr ex mr mw me uw lo ac
>>
>> Next we mremap the first page to the address where the last page was previously
>> mapped, with MREMAP_DONTUNMAP. This leads to 2 VMAs, but the new one doesn't
>> have VM_UFFD_WP set (Note also that the original VMA no longer has VM_LOCKED
>> which seems wrong to me, but I'll ignore that for now):
>>
>>    20ffb000-20ffd000 rwxp 00000000 00:00 0
>>    Size:                  8 kB
>>    KernelPageSize:        4 kB
>>    MMUPageSize:           4 kB
>>    Rss:                   4 kB
>>    Pss:                   4 kB
>>    Pss_Dirty:             4 kB
>>    Shared_Clean:          0 kB
>>    Shared_Dirty:          0 kB
>>    Private_Clean:         0 kB
>>    Private_Dirty:         4 kB
>>    Referenced:            4 kB
>>    Anonymous:             4 kB
>>    KSM:                   0 kB
>>    LazyFree:              0 kB
>>    AnonHugePages:         0 kB
>>    ShmemPmdMapped:        0 kB
>>    FilePmdMapped:         0 kB
>>    Shared_Hugetlb:        0 kB
>>    Private_Hugetlb:       0 kB
>>    Swap:                  0 kB
>>    SwapPss:               0 kB
>>    Locked:                0 kB
>>    THPeligible:           0
>>    VmFlags: rd wr ex mr mw me uw ac
>>    20ffd000-20ffe000 rwxp 00000000 00:00 0
>>    Size:                  4 kB
>>    KernelPageSize:        4 kB
>>    MMUPageSize:           4 kB
>>    Rss:                   4 kB
>>    Pss:                   4 kB
>>    Pss_Dirty:             4 kB
>>    Shared_Clean:          0 kB
>>    Shared_Dirty:          0 kB
>>    Private_Clean:         0 kB
>>    Private_Dirty:         4 kB
>>    Referenced:            4 kB
>>    Anonymous:             4 kB
>>    KSM:                   0 kB
>>    LazyFree:              0 kB
>>    AnonHugePages:         0 kB
>>    ShmemPmdMapped:        0 kB
>>    FilePmdMapped:         0 kB
>>    Shared_Hugetlb:        0 kB
>>    Private_Hugetlb:       0 kB
>>    Swap:                  0 kB
>>    SwapPss:               0 kB
>>    Locked:                4 kB
>>    THPeligible:           0
>>    VmFlags: rd wr ex mr mw me lo ac
>>
>> Finally we try to mprotect that last 4K region to remove X, and we get the
>> warning saying the PTE has both the UFFD-WP and WRITE bits set.
>>
>> I'm guessing this is because the VM_UFFD_WP flag got spuriously dropped when
>> creating the final 4K VMA and so mprotect's can_change_pte_writable() check
>> incorrectly allowed the pte to be marked writable. But the mremap man page is
>> not very clear on the semantics when interacting with uffd regions; perhaps
>> uffd-wp bit should have been cleared when mremapping the ptes?
>>
>> I'm hoping you can advice on the expected semantics and we can figure out how to
>> solve this?
>>
>>
>> The reproducer is as follows (with a few annotations added by me):
>>
>> """
>> // autogenerated by syzkaller (https://github.com/google/syzkaller)
>>
>> #define _GNU_SOURCE
>>
>> #include <endian.h>
>> #include <stdint.h>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>> #include <sys/syscall.h>
>> #include <sys/types.h>
>> #include <unistd.h>
>>
>> #ifndef __NR_ioctl
>> #define __NR_ioctl 29
>> #endif
>> #ifndef __NR_mlockall
>> #define __NR_mlockall 230
>> #endif
>> #ifndef __NR_mmap
>> #define __NR_mmap 222
>> #endif
>> #ifndef __NR_mprotect
>> #define __NR_mprotect 226
>> #endif
>> #ifndef __NR_mremap
>> #define __NR_mremap 216
>> #endif
>> #ifndef __NR_userfaultfd
>> #define __NR_userfaultfd 282
>> #endif
>>
>> uint64_t r[1] = {0xffffffffffffffff};
>>
>> int main(void)
>> {
>>     intptr_t res = 0;
>>
>>     syscall(__NR_mmap, /*addr=*/0x1ffff000ul, /*len=*/0x1000ul, /*prot=*/0ul,
>> /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
>>     syscall(__NR_mmap, /*addr=*/0x20000000ul, /*len=*/0x1000000ul,
>> /*prot=PROT_WRITE|PROT_READ|PROT_EXEC*/7ul,
>> /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
>>     syscall(__NR_mmap, /*addr=*/0x21000000ul, /*len=*/0x1000ul, /*prot=*/0ul,
>> /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
>>
>>     write(1, "executing program\n", sizeof("executing program\n") - 1);
>>
>>     // userfaultfd(UFFD_USER_MODE_ONLY)        = 3
>>     res = syscall(__NR_userfaultfd, /*flags=UFFD_USER_MODE_ONLY*/1ul);
>>     if (res != -1)
>>         r[0] = res;
>>
>>     // ioctl(3, UFFDIO_API, {api=0xaa, features=0 =>
>> features=UFFD_FEATURE_PAGEFAULT_FLAG_WP|UFFD_FEATURE_EVENT_FORK|UFFD_FEATURE_EVENT_REMAP|UFFD_FEATURE_EVENT_REMOVE|UFFD_FEATURE_MISSING_HUGETLBFS|UFFD_FEATURE_MISSING_SHMEM|UFFD_FEATURE_EVENT_UNMAP|UFFD_FEATURE_SIGBUS|UFFD_FEATURE_THREAD_ID|UFFD_FEATURE_MINOR_HUGETLBFS|UFFD_FEATURE_MINOR_SHMEM|0x1f800, ioctls=1<<_UFFDIO_REGISTER|1<<_UFFDIO_UNREGISTER|1<<_UFFDIO_API}) = 0
>>     *(uint64_t*)0x20000000 = 0xaa;
>>     *(uint64_t*)0x20000008 = 0;
>>     *(uint64_t*)0x20000010 = 0;
>>     syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc018aa3f, /*arg=*/0x20000000ul);
>>
>>     syscall(__NR_mlockall, /*flags=MCL_FUTURE|MCL_CURRENT*/3ul);
>>
>>     // ioctl(3, UFFDIO_REGISTER, {range={start=0x20ffb000, len=0x3000},
>> mode=UFFDIO_REGISTER_MODE_WP,
>> ioctls=1<<_UFFDIO_WAKE|1<<_UFFDIO_COPY|1<<_UFFDIO_ZEROPAGE|1<<_UFFDIO_WRITEPROTECT|0x120}) = 0
>>     *(uint64_t*)0x20000180 = 0x20ffb000;
>>     *(uint64_t*)0x20000188 = 0x3000;
>>     *(uint64_t*)0x20000190 = 2;
>>     *(uint64_t*)0x20000198 = 0;
>>     syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc020aa00, /*arg=*/0x20000180ul);
>>
>>     // ioctl(3, UFFDIO_WRITEPROTECT, 0x20000080) = -1 ENOENT (No such file or
>> directory)
>>     *(uint64_t*)0x20000080 = 0x20ffb000;
>>     *(uint64_t*)0x20000088 = 0x4000;
>>     *(uint64_t*)0x20000090 = 1;
>>     syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc018aa06, /*arg=*/0x20000080ul);
>>
>>     syscall(__NR_mremap, /*addr=*/0x20ffb000ul, /*len=*/0x1000ul,
>> /*newlen=*/0x1000ul,
>> /*flags=MREMAP_DONTUNMAP|MREMAP_FIXED|MREMAP_MAYMOVE*/7ul,
>> /*newaddr=*/0x20ffd000ul);
>>     syscall(__NR_mprotect, /*addr=*/0x20ffd000ul, /*len=*/0x1000ul,
>> /*prot=PROT_WRITE|PROT_READ*/3ul);
>>
>>     return 0;
>> }
>> """
>>
>> I'd appreciate any thoughts you may have!
> 
> Interesting. Either the vma flag shouldn't get dropped or we should un-mark the
> PTEs.

Yes, agreed. But which? I guess Peter is the expert here?

> 
> Is the vma flag maybe getting dropped because of some weird interaction with
> UFFD_EVENT_REMAP?
> 



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Warning on mremapped uffd-wp memory
       [not found]           ` <29404449-fcbe-4d54-85ce-44da0b202243@arm.com>
@ 2024-10-01 14:27             ` Ryan Roberts
  2024-10-01 15:10               ` Peter Xu
  0 siblings, 1 reply; 7+ messages in thread
From: Ryan Roberts @ 2024-10-01 14:27 UTC (permalink / raw)
  To: Peter Xu, David Hildenbrand; +Cc: Mark Rutland, Linux-MM, Mike Rapoport

Hi Peter,

On 08/08/2024 12:25, Ryan Roberts wrote:
> On 07/08/2024 19:59, Peter Xu wrote:
>> On Wed, Aug 07, 2024 at 12:18:18PM +0200, David Hildenbrand wrote:
>>> On 07.08.24 10:58, David Hildenbrand wrote:
>>>> On 06.08.24 22:29, Peter Xu wrote:
>>>>> On Tue, Aug 06, 2024 at 06:37:55PM +0200, David Hildenbrand wrote:
>>>>>> On 06.08.24 17:15, Ryan Roberts wrote:
>>>>>>> Hi Peter, David,
>>>>>
>>>>> Hi, Ryan,
>>>>>
>>>>>>>
>>>>>>> syzkaller has found an issue (at least on arm64, but I suspect it will be
>>>>>>> visible on x86_64 too) that triggers the following warning:
>>>>>
>>>>> This is true.  I can easily reproduce..
>>>>>

[...]

>> When I'm looking at this specific issue again, it's more than ptes that
>> should need to remove the uffd-wp bit.  We have:
>>
>>   - pmd/pud/hugetlb in other paths that will need similar care..
>>
>>   - move_page_tables() smartness on HAVE_MOVE_PUD.. where we may need to
>>     walk the pmd page removing the bits when necessary..
>>
>>   - more importantly, mremap_userfaultfd_prep() might be too late if it's
>>     after moving pgtables..
>>
>>   - [not yet started looking] the mlock issue Ryan mentioned..
>>
>> Looks like we'll need more things to fix and test..
>>
>> I wished if I can simply disable UFFD_WP + EVENT_REMAP, but I think even
>> with that, by default when mremap() we should still logically tear down all
>> those uffd-wp bits which is the same as !EVENT_REMAP now..
>>
>> Let me know if anyone would like to beat me to it on fixing the whole
>> thing, I'd be more than happy..  
> 
> Afraid I won't be able to sign up to doing that work.
> 
> Otherwise, I'll probably need to postpone
>> the fix of this issue for 1-2 weeks but finish some other things first..

I'm not sure if there was any progress on this? We are still seeing the problem
on v6.12-rc1.

Thanks,
Ryan



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Warning on mremapped uffd-wp memory
  2024-10-01 14:27             ` Ryan Roberts
@ 2024-10-01 15:10               ` Peter Xu
  2024-10-01 15:31                 ` Ryan Roberts
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Xu @ 2024-10-01 15:10 UTC (permalink / raw)
  To: Ryan Roberts; +Cc: David Hildenbrand, Mark Rutland, Linux-MM, Mike Rapoport

On Tue, Oct 01, 2024 at 03:27:48PM +0100, Ryan Roberts wrote:
> Hi Peter,
> 
> On 08/08/2024 12:25, Ryan Roberts wrote:
> > On 07/08/2024 19:59, Peter Xu wrote:
> >> On Wed, Aug 07, 2024 at 12:18:18PM +0200, David Hildenbrand wrote:
> >>> On 07.08.24 10:58, David Hildenbrand wrote:
> >>>> On 06.08.24 22:29, Peter Xu wrote:
> >>>>> On Tue, Aug 06, 2024 at 06:37:55PM +0200, David Hildenbrand wrote:
> >>>>>> On 06.08.24 17:15, Ryan Roberts wrote:
> >>>>>>> Hi Peter, David,
> >>>>>
> >>>>> Hi, Ryan,
> >>>>>
> >>>>>>>
> >>>>>>> syzkaller has found an issue (at least on arm64, but I suspect it will be
> >>>>>>> visible on x86_64 too) that triggers the following warning:
> >>>>>
> >>>>> This is true.  I can easily reproduce..
> >>>>>
> 
> [...]
> 
> >> When I'm looking at this specific issue again, it's more than ptes that
> >> should need to remove the uffd-wp bit.  We have:
> >>
> >>   - pmd/pud/hugetlb in other paths that will need similar care..
> >>
> >>   - move_page_tables() smartness on HAVE_MOVE_PUD.. where we may need to
> >>     walk the pmd page removing the bits when necessary..
> >>
> >>   - more importantly, mremap_userfaultfd_prep() might be too late if it's
> >>     after moving pgtables..
> >>
> >>   - [not yet started looking] the mlock issue Ryan mentioned..
> >>
> >> Looks like we'll need more things to fix and test..
> >>
> >> I wished if I can simply disable UFFD_WP + EVENT_REMAP, but I think even
> >> with that, by default when mremap() we should still logically tear down all
> >> those uffd-wp bits which is the same as !EVENT_REMAP now..
> >>
> >> Let me know if anyone would like to beat me to it on fixing the whole
> >> thing, I'd be more than happy..  
> > 
> > Afraid I won't be able to sign up to doing that work.
> > 
> > Otherwise, I'll probably need to postpone
> >> the fix of this issue for 1-2 weeks but finish some other things first..
> 
> I'm not sure if there was any progress on this? We are still seeing the problem
> on v6.12-rc1.

Hi, Ryan,

I haven't yet got free time to look at this, sorry.  I confess I didn't
prioritize this as high, as I doubt anyone would make real use of it, or
hit this issue in real workloads, and it'll even slow down generic
workloads even if slightly.

Do you want to have a look?  It'll be great if so.  Or I can try to find
some time this month.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Warning on mremapped uffd-wp memory
  2024-10-01 15:10               ` Peter Xu
@ 2024-10-01 15:31                 ` Ryan Roberts
  2024-10-01 15:42                   ` Peter Xu
  0 siblings, 1 reply; 7+ messages in thread
From: Ryan Roberts @ 2024-10-01 15:31 UTC (permalink / raw)
  To: Peter Xu; +Cc: David Hildenbrand, Mark Rutland, Linux-MM, Mike Rapoport

On 01/10/2024 16:10, Peter Xu wrote:
> On Tue, Oct 01, 2024 at 03:27:48PM +0100, Ryan Roberts wrote:
>> Hi Peter,
>>
>> On 08/08/2024 12:25, Ryan Roberts wrote:
>>> On 07/08/2024 19:59, Peter Xu wrote:
>>>> On Wed, Aug 07, 2024 at 12:18:18PM +0200, David Hildenbrand wrote:
>>>>> On 07.08.24 10:58, David Hildenbrand wrote:
>>>>>> On 06.08.24 22:29, Peter Xu wrote:
>>>>>>> On Tue, Aug 06, 2024 at 06:37:55PM +0200, David Hildenbrand wrote:
>>>>>>>> On 06.08.24 17:15, Ryan Roberts wrote:
>>>>>>>>> Hi Peter, David,
>>>>>>>
>>>>>>> Hi, Ryan,
>>>>>>>
>>>>>>>>>
>>>>>>>>> syzkaller has found an issue (at least on arm64, but I suspect it will be
>>>>>>>>> visible on x86_64 too) that triggers the following warning:
>>>>>>>
>>>>>>> This is true.  I can easily reproduce..
>>>>>>>
>>
>> [...]
>>
>>>> When I'm looking at this specific issue again, it's more than ptes that
>>>> should need to remove the uffd-wp bit.  We have:
>>>>
>>>>   - pmd/pud/hugetlb in other paths that will need similar care..
>>>>
>>>>   - move_page_tables() smartness on HAVE_MOVE_PUD.. where we may need to
>>>>     walk the pmd page removing the bits when necessary..
>>>>
>>>>   - more importantly, mremap_userfaultfd_prep() might be too late if it's
>>>>     after moving pgtables..
>>>>
>>>>   - [not yet started looking] the mlock issue Ryan mentioned..
>>>>
>>>> Looks like we'll need more things to fix and test..
>>>>
>>>> I wished if I can simply disable UFFD_WP + EVENT_REMAP, but I think even
>>>> with that, by default when mremap() we should still logically tear down all
>>>> those uffd-wp bits which is the same as !EVENT_REMAP now..
>>>>
>>>> Let me know if anyone would like to beat me to it on fixing the whole
>>>> thing, I'd be more than happy..  
>>>
>>> Afraid I won't be able to sign up to doing that work.
>>>
>>> Otherwise, I'll probably need to postpone
>>>> the fix of this issue for 1-2 weeks but finish some other things first..
>>
>> I'm not sure if there was any progress on this? We are still seeing the problem
>> on v6.12-rc1.
> 
> Hi, Ryan,
> 
> I haven't yet got free time to look at this, sorry.  I confess I didn't
> prioritize this as high, as I doubt anyone would make real use of it, or
> hit this issue in real workloads, and it'll even slow down generic
> workloads even if slightly.

No problem, I'm acting as the middle man really, given -rc1 is out, Mark has
been running his usual fuzzing and noted that the issue still exists. So I
thought I'd just enquire to see if you were able to make any progress. I agree
its not high priority. Although for a panic_on_warn=1 kernel (which I understand
some use in deployment), this means that user space can panic the system, so I
guess it needs to be addressed eventually.

> 
> Do you want to have a look?  It'll be great if so.  Or I can try to find
> some time this month.

I won't personally get time to look at this, since I'm busy with some other
commitments. But I might be able to find someone to look into it. Leave it with
me for now.

Thanks,
Ryan



> 
> Thanks,
> 



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Warning on mremapped uffd-wp memory
  2024-10-01 15:31                 ` Ryan Roberts
@ 2024-10-01 15:42                   ` Peter Xu
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Xu @ 2024-10-01 15:42 UTC (permalink / raw)
  To: Ryan Roberts; +Cc: David Hildenbrand, Mark Rutland, Linux-MM, Mike Rapoport

On Tue, Oct 01, 2024 at 04:31:31PM +0100, Ryan Roberts wrote:
> On 01/10/2024 16:10, Peter Xu wrote:
> > On Tue, Oct 01, 2024 at 03:27:48PM +0100, Ryan Roberts wrote:
> >> Hi Peter,
> >>
> >> On 08/08/2024 12:25, Ryan Roberts wrote:
> >>> On 07/08/2024 19:59, Peter Xu wrote:
> >>>> On Wed, Aug 07, 2024 at 12:18:18PM +0200, David Hildenbrand wrote:
> >>>>> On 07.08.24 10:58, David Hildenbrand wrote:
> >>>>>> On 06.08.24 22:29, Peter Xu wrote:
> >>>>>>> On Tue, Aug 06, 2024 at 06:37:55PM +0200, David Hildenbrand wrote:
> >>>>>>>> On 06.08.24 17:15, Ryan Roberts wrote:
> >>>>>>>>> Hi Peter, David,
> >>>>>>>
> >>>>>>> Hi, Ryan,
> >>>>>>>
> >>>>>>>>>
> >>>>>>>>> syzkaller has found an issue (at least on arm64, but I suspect it will be
> >>>>>>>>> visible on x86_64 too) that triggers the following warning:
> >>>>>>>
> >>>>>>> This is true.  I can easily reproduce..
> >>>>>>>
> >>
> >> [...]
> >>
> >>>> When I'm looking at this specific issue again, it's more than ptes that
> >>>> should need to remove the uffd-wp bit.  We have:
> >>>>
> >>>>   - pmd/pud/hugetlb in other paths that will need similar care..
> >>>>
> >>>>   - move_page_tables() smartness on HAVE_MOVE_PUD.. where we may need to
> >>>>     walk the pmd page removing the bits when necessary..
> >>>>
> >>>>   - more importantly, mremap_userfaultfd_prep() might be too late if it's
> >>>>     after moving pgtables..
> >>>>
> >>>>   - [not yet started looking] the mlock issue Ryan mentioned..
> >>>>
> >>>> Looks like we'll need more things to fix and test..
> >>>>
> >>>> I wished if I can simply disable UFFD_WP + EVENT_REMAP, but I think even
> >>>> with that, by default when mremap() we should still logically tear down all
> >>>> those uffd-wp bits which is the same as !EVENT_REMAP now..
> >>>>
> >>>> Let me know if anyone would like to beat me to it on fixing the whole
> >>>> thing, I'd be more than happy..  
> >>>
> >>> Afraid I won't be able to sign up to doing that work.
> >>>
> >>> Otherwise, I'll probably need to postpone
> >>>> the fix of this issue for 1-2 weeks but finish some other things first..
> >>
> >> I'm not sure if there was any progress on this? We are still seeing the problem
> >> on v6.12-rc1.
> > 
> > Hi, Ryan,
> > 
> > I haven't yet got free time to look at this, sorry.  I confess I didn't
> > prioritize this as high, as I doubt anyone would make real use of it, or
> > hit this issue in real workloads, and it'll even slow down generic
> > workloads even if slightly.
> 
> No problem, I'm acting as the middle man really, given -rc1 is out, Mark has
> been running his usual fuzzing and noted that the issue still exists. So I
> thought I'd just enquire to see if you were able to make any progress. I agree
> its not high priority. Although for a panic_on_warn=1 kernel (which I understand
> some use in deployment), this means that user space can panic the system, so I
> guess it needs to be addressed eventually.
> 
> > 
> > Do you want to have a look?  It'll be great if so.  Or I can try to find
> > some time this month.
> 
> I won't personally get time to look at this, since I'm busy with some other
> commitments. But I might be able to find someone to look into it. Leave it with
> me for now.

Thank you!

If there's patches I can definitely try to review them.

Or if this won't get addressed before someone else pokes again, I'll do it.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-10-01 15:42 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-08-06 15:15 Warning on mremapped uffd-wp memory Ryan Roberts
2024-08-06 16:37 ` David Hildenbrand
2024-08-06 16:58   ` Ryan Roberts
     [not found]   ` <ZrKHtOJ7wxk0V9Pl@x1n>
     [not found]     ` <7e52ca0b-39df-4979-8b16-9880e5a7149c@redhat.com>
     [not found]       ` <97c7b531-daeb-468f-af2a-31980f6f6a84@redhat.com>
     [not found]         ` <ZrPELKDCMl6MXupy@x1n>
     [not found]           ` <29404449-fcbe-4d54-85ce-44da0b202243@arm.com>
2024-10-01 14:27             ` Ryan Roberts
2024-10-01 15:10               ` Peter Xu
2024-10-01 15:31                 ` Ryan Roberts
2024-10-01 15:42                   ` Peter Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox