linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Pengfei Xu <pengfei.xu@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Suren Baghdasaryan <surenb@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	sidhartha.kumar@oracle.com, Bert Karwatzki <spasswolf@web.de>,
	Jiri Olsa <olsajiri@gmail.com>, Kees Cook <kees@kernel.org>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Jeff Xu <jeffxu@chromium.org>,
	syzkaller-bugs@googlegroups.com
Subject: Re: [PATCH v8 15/21] mm: Change failure of MAP_FIXED to restoring the gap on failure
Date: Tue, 3 Sep 2024 13:27:42 +0100	[thread overview]
Message-ID: <52ee7eb3-955c-4ade-b5f0-28fed8ba3d0b@lucifer.local> (raw)
In-Reply-To: <a753f480-8dc6-42f7-bc6b-32f37d47d829@lucifer.local>

Hi Andrew - TL;DR of this is - please apply the fix patch attached below to
fix a problem in this series, thanks! :)

On Tue, Sep 03, 2024 at 12:00:04PM GMT, Lorenzo Stoakes wrote:
> On Tue, Sep 03, 2024 at 11:07:38AM GMT, Pengfei Xu wrote:
> > Hi Liam R. Howlett,
> >
> > Greetings!
> >
> > There is WARNING in __split_vma in next-20240902 in syzkaller fuzzing test.
> > Bisected and found first bad commit:
> > "
> > 3483c95414f9 mm: change failure of MAP_FIXED to restoring the gap on failure
> > "
> > It's same as below patch.
> > After reverted the above commit on top of next-20240902, this issue was gone.
> >
> > All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/240903_092137___split_vma
> > Syzkaller repro code: https://github.com/xupengfe/syzkaller_logs/blob/main/240903_092137___split_vma/repro.c
> > Syzkaller repro syscall steps: https://github.com/xupengfe/syzkaller_logs/blob/main/240903_092137___split_vma/repro.prog
> > Syzkaller report: https://github.com/xupengfe/syzkaller_logs/blob/main/240903_092137___split_vma/repro.report
> > Kconfig(make olddefconfig): https://github.com/xupengfe/syzkaller_logs/blob/main/240903_092137___split_vma/kconfig_origin
> > Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/240903_092137___split_vma/bisect_info.log
> > bzImage: https://github.com/xupengfe/syzkaller_logs/raw/main/240903_092137___split_vma/bzImage_ecc768a84f0b8e631986f9ade3118fa37852fef0.tar.gz
> > Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/240903_092137___split_vma/ecc768a84f0b8e631986f9ade3118fa37852fef0_dmesg.log
> >
> > And "KASAN: slab-use-after-free Read in acct_collect" also pointed to the
> > same commit, all detailed info:
> > https://github.com/xupengfe/syzkaller_logs/tree/main/240903_090000_acct_collect
> >
> > "
>
> Thanks for the report! Looking into it.
>
> > [   19.953726] cgroup: Unknown subsys name 'net'
> > [   20.045121] cgroup: Unknown subsys name 'rlimit'
> > [   20.138332] ------------[ cut here ]------------
> > [   20.138634] WARNING: CPU: 1 PID: 732 at include/linux/maple_tree.h:733 __split_vma+0x4d7/0x1020
> > [   20.139075] Modules linked in:
> > [   20.139245] CPU: 1 UID: 0 PID: 732 Comm: repro Not tainted 6.11.0-rc6-next-20240902-ecc768a84f0b #1
> > [   20.139779] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > [   20.140337] RIP: 0010:__split_vma+0x4d7/0x1020
> > [   20.140572] Code: 89 ee 48 8b 40 10 48 89 c7 48 89 85 00 ff ff ff e8 8e 61 a7 ff 48 8b 85 00 ff ff ff 4c 39 e8 0f 83 ea fd ff ff e8 b9 5e a7 ff <0f> 0b e9 de fd ff ff 48 8b 85 30 ff ff ff 48 83 c0 10 48 89 85 18
> > [   20.141476] RSP: 0018:ffff8880217379a0 EFLAGS: 00010293
> > [   20.141749] RAX: 0000000000000000 RBX: ffff8880132351e0 RCX: ffffffff81bf6117
> > [   20.142106] RDX: ffff888012c30000 RSI: ffffffff81bf6187 RDI: 0000000000000006
> > [   20.142457] RBP: ffff888021737aa0 R08: 0000000000000001 R09: ffffed100263d3cd
> > [   20.142814] R10: 0000000020ff9000 R11: 0000000000000001 R12: ffff888021737e40
> > [   20.143173] R13: 0000000020ff9000 R14: 0000000020ffc000 R15: ffff888013235d20
> > [   20.143529] FS:  00007eff937f9740(0000) GS:ffff88806c500000(0000) knlGS:0000000000000000
> > [   20.144308] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   20.144600] CR2: 0000000020000040 CR3: 000000001f464003 CR4: 0000000000770ef0
> > [   20.144958] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [   20.145313] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
> > [   20.145665] PKRU: 55555554
> > [   20.145809] Call Trace:
> > [   20.145940]  <TASK>
> > [   20.146056]  ? show_regs+0x6d/0x80
> > [   20.146247]  ? __warn+0xf3/0x380
> > [   20.146431]  ? report_bug+0x25e/0x4b0
> > [   20.146650]  ? __split_vma+0x4d7/0x1020
>
> Have repro'd locally. This is, unsurprisingly, on this line (even if trace above
> doesn't decode to it unfortunately):
>
> 	vma_iter_config(vmi, new->vm_start, new->vm_end);
>
> The VMA in question spans 0x20ff9000, 0x21000000 so is 7 pages in size.
>
> At the point of invoking vma_iter_config(), the vma iterator points at
> 0x20ff9001, but we try to position it to 0x20ff9000.
>
> It seems the issue is that in do_vmi_munmap(), after vma_find() is called, we
> find a VMA at 0x20ff9000, but the VMI is positioned to 0x20ff9001...!
>
> Perhaps maple tree corruption in a previous call somehow?
>
>
> I can interestingly only repro this if I clear the qemu image each time, I'm
> guessing this is somehow tied to the instantiation of the cgroup setup or such?
>
> Am continuing the investigation.
>

[snip]

OK I turned on CONFIG_DEBUG_VM_MAPLE_TREE and am hitting
VM_WARN_ON_ONCE_MM(vma->vm_start != vmi_start, mm) after gather_failed is hit in
mmap_region() as a result of call_mmap() returning an error.

This is invoking kernfs_fop_mmap(), which returns -ENODEV because the
KERNFS_HAS_MMAP flag has not been set for the cgroup file being mapped.

This results in mmap_region() jumping to unmap_and_free_vma, which unmaps the
page tables in the region and goes on to abort the unmap operation.

The validate_mm() that fails is called by vms_complete_munmap_vmas() which was
invoked by vms_abort_munmap_vmas().

The tree is then corrupted:

vma ffff888013414d20 start 0000000020ff9000 end 0000000021000000 mm ffff88800d06ae40
prot 25 anon_vma ffff8880132cc660 vm_ops 0000000000000000
pgoff 20ff9 file 0000000000000000 private_data 0000000000000000
flags: 0x8100077(read|write|exec|mayread|maywrite|mayexec|account|softdirty)
tree range: ffff888013414d20 start 20ff9001 end 20ffffff

Incorrectly starting at off-by-one 0x20ff9001 rather than 0x20ff9000. This
is a very telling off-by... the programmer's favourite off-by-1 :) Which
then made me think of how mas operations have an _inclusive_ end and VMA
ones have an _exclusive_ one.

And so I tracked down the cause of this to vms_abort_munmap_vmas() which
was invoking mas_set_range() using vms->end (exclusive) as if it were
inclusive, which thus resulted in 0x20ff9000 being wrongly cleared.

Thes solution is simply to subtract this by one as done in the attached
fix-patch.

I confirmed this fixed the issue as I was able to set up a reliable repro
locally.

Thanks for the report! Great find.

----8<----
From 3e7decc5390b0edc462afa74794a8208e25e50f2 Mon Sep 17 00:00:00 2001
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date: Tue, 3 Sep 2024 13:20:34 +0100
Subject: [PATCH] mm: fix off-by-one error in vms_abort_munmap_vmas()

Maple tree ranges have an inclusive end, VMAs do not, so we must subtract
one from the VMA-specific end value when using a mas_...() function.

We failed to do so in vms_abort_munmap_vmas() which resulted in a store
overlapping the intended range by one byte, and thus corrupting the maple
tree.

Fix this by subtracting one from vms->end() passed into mas_set_range().

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 mm/vma.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vma.h b/mm/vma.h
index 370d3246f147..819f994cf727 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -240,7 +240,7 @@ static inline void vms_abort_munmap_vmas(struct vma_munmap_struct *vms,
 	 * not symmetrical and state data has been lost.  Resort to the old
 	 * failure method of leaving a gap where the MAP_FIXED mapping failed.
 	 */
-	mas_set_range(mas, vms->start, vms->end);
+	mas_set_range(mas, vms->start, vms->end - 1);
 	if (unlikely(mas_store_gfp(mas, NULL, GFP_KERNEL))) {
 		pr_warn_once("%s: (%d) Unable to abort munmap() operation\n",
 			     current->comm, current->pid);
--
2.46.0


  reply	other threads:[~2024-09-03 12:28 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-30  4:00 [PATCH v8 00/21] Avoid MAP_FIXED gap exposure Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 01/21] mm/vma: Correctly position vma_iterator in __split_vma() Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 02/21] mm/vma: Introduce abort_munmap_vmas() Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 03/21] mm/vma: Introduce vmi_complete_munmap_vmas() Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 04/21] mm/vma: Extract the gathering of vmas from do_vmi_align_munmap() Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 05/21] mm/vma: Introduce vma_munmap_struct for use in munmap operations Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 06/21] mm/vma: Change munmap to use vma_munmap_struct() for accounting and surrounding vmas Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 07/21] mm/vma: Extract validate_mm() from vma_complete() Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 08/21] mm/vma: Inline munmap operation in mmap_region() Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 09/21] mm/vma: Expand mmap_region() munmap call Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 10/21] mm/vma: Support vma == NULL in init_vma_munmap() Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 11/21] mm/mmap: Reposition vma iterator in mmap_region() Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 12/21] mm/vma: Track start and end for munmap in vma_munmap_struct Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 13/21] mm: Clean up unmap_region() argument list Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 14/21] mm/mmap: Avoid zeroing vma tree in mmap_region() Liam R. Howlett
2024-10-07 19:05   ` [BUG] page table UAF, " Jann Horn
2024-10-07 20:31     ` Liam R. Howlett
2024-10-07 21:31       ` Jann Horn
2024-10-08  1:50         ` Liam R. Howlett
2024-10-08 17:15           ` Jann Horn
2024-10-08 17:51             ` Suren Baghdasaryan
2024-10-08 18:06               ` Jann Horn
2024-10-11 14:26             ` Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 15/21] mm: Change failure of MAP_FIXED to restoring the gap on failure Liam R. Howlett
2024-09-03  3:07   ` Pengfei Xu
2024-09-03 11:00     ` Lorenzo Stoakes
2024-09-03 12:27       ` Lorenzo Stoakes [this message]
2024-09-03 16:03         ` Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 16/21] mm/mmap: Use PHYS_PFN in mmap_region() Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 17/21] mm/mmap: Use vms accounted pages " Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 18/21] ipc/shm, mm: Drop do_vma_munmap() Liam R. Howlett
2024-08-30  4:00 ` [PATCH v8 19/21] mm: Move may_expand_vm() check in mmap_region() Liam R. Howlett
2024-08-30  4:01 ` [PATCH v8 20/21] mm/vma: Drop incorrect comment from vms_gather_munmap_vmas() Liam R. Howlett
2024-08-30  4:01 ` [PATCH v8 21/21] mm/vma.h: Optimise vma_munmap_struct Liam R. Howlett
2024-08-30 16:05 ` [PATCH v8 00/21] Avoid MAP_FIXED gap exposure Jeff Xu
2024-08-30 17:07   ` Liam R. Howlett

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52ee7eb3-955c-4ade-b5f0-28fed8ba3d0b@lucifer.local \
    --to=lorenzo.stoakes@oracle.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=jeffxu@chromium.org \
    --cc=kees@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=olsajiri@gmail.com \
    --cc=paulmck@kernel.org \
    --cc=pengfei.xu@intel.com \
    --cc=sidhartha.kumar@oracle.com \
    --cc=spasswolf@web.de \
    --cc=surenb@google.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox