Message-ID: <810b44a8-d2ae-4107-b665-5a42eae2d948@arm.com>
Date: Tue, 6 Aug 2024 16:15:49 +0100
From: Ryan Roberts
To: Peter Xu, David Hildenbrand
Cc: Mark Rutland, Linux-MM
Subject: Warning on mremapped uffd-wp memory

Hi Peter, David,

syzkaller has found an issue (at least on arm64, but I suspect it will be
visible on x86_64 too) that triggers the following warning:

[ 2291.836518] ------------[ cut here ]------------
[ 2291.836528] WARNING: CPU: 3 PID: 9056 at mm/page_table_check.c:207 __page_table_check_ptes_set+0x22c/0x248
[ 2291.836541] Modules linked in:
[ 2291.836549] CPU: 3 UID: 1000 PID: 9056 Comm: bug Tainted: G W 6.11.0-rc2-dirty #2
[ 2291.836554] Tainted: [W]=WARN
[ 2291.836557] Hardware name: linux,dummy-virt (DT)
[ 2291.836559] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 2291.836564] pc : __page_table_check_ptes_set+0x22c/0x248
[ 2291.836568] lr : ptep_modify_prot_commit+0x24c/0x2b0
[ 2291.836573] sp : ffff80008ca6ba20
[ 2291.836575] x29: ffff80008ca6ba20 x28: ffff186392d1eb00 x27: 0000000020ffd000
[ 2291.836598] x26: 0010000000000001 x25: 0000000000000001 x24: 0000000000000000
[ 2291.836605] x23: 04e800018c738f43 x22: 0000000000000001 x21: ffff1863824163c0
[ 2291.836612] x20: 04e800018c738f43 x19: 04e800018c738f43 x18: 0000fffff7f87fff
[ 2291.836619] x17: 0000000000000000 x16: 1fffe30c748d22a1 x15: 0060000000000fc3
[ 2291.836625] x14: 0000000000000000 x13: 0000000020ffd000 x12: 0000fffff7f87fff
[ 2291.836631] x11: 0000000020ffd000 x10: 0000000000000000 x9 : ffffbcab99e3ab84
[ 2291.836638] x8 : ffff186382b8f000 x7 : 0000000020ffe000 x6 : 0000000020ffd000
[ 2291.836644] x5 : ffff186392d1eb00 x4 : 04e800018c738f43 x3 : 0000000000000001
[ 2291.836650] x2 : 04e800018c738f43 x1 : ffff18639fe01fe8 x0 : ffffbcab9ce56780
[ 2291.836657] Call trace:
[ 2291.836659] __page_table_check_ptes_set+0x22c/0x248
[ 2291.836664] ptep_modify_prot_commit+0x24c/0x2b0
[ 2291.836667] change_protection+0x8a0/0x1100
[ 2291.836672] mprotect_fixup+0x124/0x2d0
[ 2291.836675] do_mprotect_pkey.constprop.0+0x29c/0x460
[ 2291.836679] __arm64_sys_mprotect+0x24/0xf8
[ 2291.836682] invoke_syscall+0x50/0x120
[ 2291.836690] el0_svc_common.constprop.0+0x48/0xf0
[ 2291.836694] do_el0_svc+0x24/0x38
[ 2291.836699] el0_svc+0x34/0xe0
[ 2291.836705] el0t_64_sync_handler+0x100/0x130
[ 2291.836709] el0t_64_sync+0x190/0x198
[ 2291.836713] ---[ end trace 0000000000000000 ]---

The generated program (see below) mmaps a 16M region (RWX). It then mlocks all
current and future memory. Next, it registers 12K (3 pages) for use with
UFFD-WP, and marks 4 pages UFFD-WP'ed. This returns ENOENT because we only
registered 3 pages, but those 3 pages are still UFFD-WP'ed in their PTEs, so
this error is not relevant to the bug.
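
To make the ENOENT case concrete, here is a minimal sketch of the range
arithmetic, assuming 4K pages. The struct and helper names are illustrative
only and do not correspond to any kernel or userfaultfd API; the addresses and
lengths are the ones passed to UFFDIO_REGISTER and UFFDIO_WRITEPROTECT in the
reproducer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4K pages, matching "12K (3 pages)" in the description. */
#define PAGE_SZ 0x1000ull

/* Hypothetical helper type: a half-open byte range [start, start + len). */
struct range {
    uint64_t start;
    uint64_t len;
};

static uint64_t range_end(struct range r)
{
    return r.start + r.len;
}

/* True if `inner` lies entirely within `outer`. */
static bool range_contains(struct range outer, struct range inner)
{
    return inner.start >= outer.start && range_end(inner) <= range_end(outer);
}

/* Number of bytes by which `req` runs past the end of `reg` (0 if none). */
static uint64_t range_overhang(struct range reg, struct range req)
{
    return range_end(req) > range_end(reg) ? range_end(req) - range_end(reg) : 0;
}

/* Values taken from the reproducer's two ioctl argument blocks. */
static const struct range registered = { 0x20ffb000ull, 0x3000ull }; /* UFFDIO_REGISTER */
static const struct range wp_request = { 0x20ffb000ull, 0x4000ull }; /* UFFDIO_WRITEPROTECT */
```

With these values, range_contains(registered, wp_request) is false and
range_overhang(registered, wp_request) is 0x1000, i.e. the write-protect
request runs exactly one page past the registered range.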

At this point, there is a single VMA covering the 12K, with VM_UFFD_WP set,
amongst other flags:

20ffb000-20ffe000 rwxp 00000000 00:00 0
Size: 12 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 12 kB
Pss: 12 kB
Pss_Dirty: 12 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 12 kB
Referenced: 12 kB
Anonymous: 12 kB
KSM: 0 kB
LazyFree: 0 kB
AnonHugePages: 0 kB
ShmemPmdMapped: 0 kB
FilePmdMapped: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
Locked: 12 kB
THPeligible: 0
VmFlags: rd wr ex mr mw me uw lo ac

Next we mremap the first page to the address where the last page was
previously mapped, with MREMAP_DONTUNMAP. This leads to 2 VMAs, but the new one
doesn't have VM_UFFD_WP set (note also that the original VMA no longer has
VM_LOCKED, which seems wrong to me, but I'll ignore that for now):

20ffb000-20ffd000 rwxp 00000000 00:00 0
Size: 8 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 4 kB
Pss: 4 kB
Pss_Dirty: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
Anonymous: 4 kB
KSM: 0 kB
LazyFree: 0 kB
AnonHugePages: 0 kB
ShmemPmdMapped: 0 kB
FilePmdMapped: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
Locked: 0 kB
THPeligible: 0
VmFlags: rd wr ex mr mw me uw ac

20ffd000-20ffe000 rwxp 00000000 00:00 0
Size: 4 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 4 kB
Pss: 4 kB
Pss_Dirty: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
Anonymous: 4 kB
KSM: 0 kB
LazyFree: 0 kB
AnonHugePages: 0 kB
ShmemPmdMapped: 0 kB
FilePmdMapped: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
Locked: 4 kB
THPeligible: 0
VmFlags: rd wr ex mr mw me lo ac

Finally we try to mprotect that last 4K region to remove X, and we get the
warning saying the PTE has both the UFFD-WP and WRITE bits set.
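
For anyone wanting to check the same thing on their own run, the flag changes
can be read straight off the VmFlags lines: "uw" is the smaps code for
userfaultfd write-protect tracking and "lo" is the code for locked pages. The
following is a small user-space sketch, not kernel code; has_vmflag() is a
hypothetical helper name, and the three strings are copied from the smaps
entries above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if the two-letter `code` (e.g. "uw") appears as a whole
 * space-separated token in the value of a VmFlags line.
 */
static bool has_vmflag(const char *vmflags, const char *code)
{
    size_t n = strlen(code);
    const char *p = vmflags;

    while (*p) {
        while (*p == ' ')
            p++;                        /* skip separators */
        const char *tok = p;
        while (*p && *p != ' ')
            p++;                        /* advance to end of token */
        if ((size_t)(p - tok) == n && strncmp(tok, code, n) == 0)
            return true;
    }
    return false;
}

/* VmFlags values from the smaps output in this mail. */
static const char *const before_mremap = "rd wr ex mr mw me uw lo ac"; /* 20ffb000-20ffe000 */
static const char *const old_vma_after = "rd wr ex mr mw me uw ac";    /* 20ffb000-20ffd000 */
static const char *const new_vma_after = "rd wr ex mr mw me lo ac";    /* 20ffd000-20ffe000 */
```

Checking for "uw" and "lo" on each string shows the two changes described
above: the remaining part of the original VMA keeps "uw" but loses "lo", while
the new 4K VMA keeps "lo" but has no "uw".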

I'm guessing this is because the VM_UFFD_WP flag got spuriously dropped when
creating the final 4K VMA, and so mprotect's can_change_pte_writable() check
incorrectly allowed the pte to be marked writable. But the mremap man page is
not very clear on the semantics when interacting with uffd regions; perhaps the
uffd-wp bit should have been cleared when mremapping the ptes?

I'm hoping you can advise on the expected semantics so that we can figure out
how to solve this.

The reproducer is as follows (with a few annotations added by me):

"""
// autogenerated by syzkaller (https://github.com/google/syzkaller)

#define _GNU_SOURCE

#include <endian.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef __NR_ioctl
#define __NR_ioctl 29
#endif
#ifndef __NR_mlockall
#define __NR_mlockall 230
#endif
#ifndef __NR_mmap
#define __NR_mmap 222
#endif
#ifndef __NR_mprotect
#define __NR_mprotect 226
#endif
#ifndef __NR_mremap
#define __NR_mremap 216
#endif
#ifndef __NR_userfaultfd
#define __NR_userfaultfd 282
#endif

uint64_t r[1] = {0xffffffffffffffff};

int main(void)
{
  intptr_t res = 0;

  syscall(__NR_mmap, /*addr=*/0x1ffff000ul, /*len=*/0x1000ul, /*prot=*/0ul,
          /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/-1,
          /*offset=*/0ul);
  syscall(__NR_mmap, /*addr=*/0x20000000ul, /*len=*/0x1000000ul,
          /*prot=PROT_WRITE|PROT_READ|PROT_EXEC*/7ul,
          /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/-1,
          /*offset=*/0ul);
  syscall(__NR_mmap, /*addr=*/0x21000000ul, /*len=*/0x1000ul, /*prot=*/0ul,
          /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/-1,
          /*offset=*/0ul);
  write(1, "executing program\n", sizeof("executing program\n") - 1);

  // userfaultfd(UFFD_USER_MODE_ONLY) = 3
  res = syscall(__NR_userfaultfd, /*flags=UFFD_USER_MODE_ONLY*/1ul);
  if (res != -1)
    r[0] = res;

  // ioctl(3, UFFDIO_API, {api=0xaa, features=0 =>
  //   features=UFFD_FEATURE_PAGEFAULT_FLAG_WP|UFFD_FEATURE_EVENT_FORK|
  //   UFFD_FEATURE_EVENT_REMAP|UFFD_FEATURE_EVENT_REMOVE|
  //   UFFD_FEATURE_MISSING_HUGETLBFS|UFFD_FEATURE_MISSING_SHMEM|
  //   UFFD_FEATURE_EVENT_UNMAP|UFFD_FEATURE_SIGBUS|UFFD_FEATURE_THREAD_ID|
  //   UFFD_FEATURE_MINOR_HUGETLBFS|UFFD_FEATURE_MINOR_SHMEM|0x1f800,
  //   ioctls=1<<_UFFDIO_REGISTER|1<<_UFFDIO_UNREGISTER|1<<_UFFDIO_API}) = 0
  *(uint64_t*)0x20000000 = 0xaa;
  *(uint64_t*)0x20000008 = 0;
  *(uint64_t*)0x20000010 = 0;
  syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc018aa3f, /*arg=*/0x20000000ul);

  syscall(__NR_mlockall, /*flags=MCL_FUTURE|MCL_CURRENT*/3ul);

  // ioctl(3, UFFDIO_REGISTER, {range={start=0x20ffb000, len=0x3000},
  //   mode=UFFDIO_REGISTER_MODE_WP,
  //   ioctls=1<<_UFFDIO_WAKE|1<<_UFFDIO_COPY|1<<_UFFDIO_ZEROPAGE|
  //   1<<_UFFDIO_WRITEPROTECT|0x120}) = 0
  *(uint64_t*)0x20000180 = 0x20ffb000;
  *(uint64_t*)0x20000188 = 0x3000;
  *(uint64_t*)0x20000190 = 2;
  *(uint64_t*)0x20000198 = 0;
  syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc020aa00, /*arg=*/0x20000180ul);

  // ioctl(3, UFFDIO_WRITEPROTECT, 0x20000080) = -1 ENOENT (No such file or directory)
  *(uint64_t*)0x20000080 = 0x20ffb000;
  *(uint64_t*)0x20000088 = 0x4000;
  *(uint64_t*)0x20000090 = 1;
  syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc018aa06, /*arg=*/0x20000080ul);

  syscall(__NR_mremap, /*addr=*/0x20ffb000ul, /*len=*/0x1000ul,
          /*newlen=*/0x1000ul,
          /*flags=MREMAP_DONTUNMAP|MREMAP_FIXED|MREMAP_MAYMOVE*/7ul,
          /*newaddr=*/0x20ffd000ul);
  syscall(__NR_mprotect, /*addr=*/0x20ffd000ul, /*len=*/0x1000ul,
          /*prot=PROT_WRITE|PROT_READ*/3ul);
  return 0;
}
"""

I'd appreciate any thoughts you may have!

Thanks,
Ryan