From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Harry Yoo <harry.yoo@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>, Jann Horn <jannh@google.com>,
Pedro Falcato <pfalcato@suse.de>,
Yeoreum Yun <yeoreum.yun@arm.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
David Hildenbrand <david@kernel.org>,
Jeongjun Park <aha310510@gmail.com>,
Rik van Riel <riel@surriel.com>
Subject: Re: [PATCH] mm/vma: fix anon_vma UAF on mremap() faulted, unfaulted merge
Date: Mon, 5 Jan 2026 09:12:37 +0000 [thread overview]
Message-ID: <af0e6e5f-68ad-466f-b009-e45a2c553744@lucifer.local> (raw)
In-Reply-To: <aVtH-9rUDNAhjc6V@hyeyoo>
Andrew - drop the original will rework a v2, thanks.
On Mon, Jan 05, 2026 at 02:11:23PM +0900, Harry Yoo wrote:
> On Fri, Jan 02, 2026 at 08:55:20PM +0000, Lorenzo Stoakes wrote:
> > Commit 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA
> > merges") introduced the ability to merge previously unavailable VMA merge
> > scenarios.
> >
> > The key piece of logic introduced was the ability to merge a faulted VMA
> > immediately next to an unfaulted VMA, which relies upon dup_anon_vma() to
> > correctly handle anon_vma state.
> >
> > In the case of the merge of an existing VMA (that is changing properties of
> > a VMA and then merging if those properties are shared by adjacent VMAs),
> > dup_anon_vma() is invoked correctly.
> >
> > However in the case of the merge of a new VMA, a corner case peculiar to
> > mremap() was missed.
> >
> > The issue is that vma_expand() only performs dup_anon_vma() if the target
> > (the VMA that will ultimately become the merged VMA): is not the next VMA,
> > i.e. the one that appears after the range in which the new VMA is to be
> > established.
> >
> > A key insight here is that in all other cases other than mremap(), a new
> > VMA merge either expands an existing VMA, meaning that the target VMA will
> > be that VMA, or would have anon_vma be NULL.
> >
> > Specifically:
> >
> > * __mmap_region() - no anon_vma in place, initial mapping.
> > * do_brk_flags() - expanding an existing VMA.
> > * vma_merge_extend() - expanding an existing VMA.
> > * relocate_vma_down() - no anon_vma in place, initial mapping.
> >
> > In addition, we are in the unique situation of needing to duplicate
> > anon_vma state from a VMA that is neither the previous or next VMA being
> > merged with.
> >
> > To account for this, introduce a new field in struct vma_merge_struct
> > specifically for the mremap() case, and update vma_expand() to explicitly
> > check for this case and invoke dup_anon_vma() to ensure anon_vma state is
> > correctly propagated.
> >
> > This issue can be observed most directly by invoked mremap() to move around
> > a VMA and cause this kind of merge with the MREMAP_DONTUNMAP flag
> > specified.
> >
> > This will result in unlink_anon_vmas() being called after failing to
> > duplicate anon_vma state to the target VMA, which results in the anon_vma
> > itself being freed with folios still possessing dangling pointers to the
> > anon_vma and thus a use-after-free bug.
> >
> > This bug was discovered via a syzbot report, which this patch resolves.
> >
> > The following program reproduces the issue (and is fixed by this patch):
>
> [...]
>
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > Fixes: 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA merges")
> > Reported-by: syzbot+b165fc2e11771c66d8ba@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/all/694a2745.050a0220.19928e.0017.GAE@google.com/
> > Cc: stable@kernel.org
> > ---
>
> Hi Lorenzo, I really appreciate that you've done very through analysis of
> the bug so quickly and precisely, and wrote a fix. Also having a simpler
> repro (that works on my machine!) is hugely helpful.
>
> My comment inlined below.
>
> > mm/vma.c | 58 ++++++++++++++++++++++++++++++++++++++++++--------------
> > mm/vma.h | 3 +++
> > 2 files changed, 47 insertions(+), 14 deletions(-)
> >
> > diff --git a/mm/vma.c b/mm/vma.c
> > index 6377aa290a27..2268f518a89b 100644
> > --- a/mm/vma.c
> > +++ b/mm/vma.c
> > @@ -1130,26 +1130,50 @@ int vma_expand(struct vma_merge_struct *vmg)
> > mmap_assert_write_locked(vmg->mm);
> >
> > vma_start_write(target);
> > - if (next && (target != next) && (vmg->end == next->vm_end)) {
> > + if (next && vmg->end == next->vm_end) {
> > + struct vm_area_struct *copied_from = vmg->copied_from;
> > int ret;
> >
> > - sticky_flags |= next->vm_flags & VM_STICKY;
> > - remove_next = true;
> > - /* This should already have been checked by this point. */
> > - VM_WARN_ON_VMG(!can_merge_remove_vma(next), vmg);
> > - vma_start_write(next);
> > - /*
> > - * In this case we don't report OOM, so vmg->give_up_on_mm is
> > - * safe.
> > - */
> > - ret = dup_anon_vma(target, next, &anon_dup);
> > - if (ret)
> > - return ret;
> > + if (target != next) {
> > + sticky_flags |= next->vm_flags & VM_STICKY;
> > + remove_next = true;
> > + /* This should already have been checked by this point. */
> > + VM_WARN_ON_VMG(!can_merge_remove_vma(next), vmg);
> > + vma_start_write(next);
> > + /*
> > + * In this case we don't report OOM, so vmg->give_up_on_mm is
> > + * safe.
> > + */
> > + ret = dup_anon_vma(target, next, &anon_dup);
> > + if (ret)
> > + return ret;
>
> While this fix works when we're expanding the next VMA to cover the new
> range, I don't think it's covering the case where we're expanding the
> prev VMA to cover the new range and next VMA.
Right yeah because we're also not propagating the mremap'd VMA's anon_vma state,
damn.
I missed the wood for the trees, this just needs to be a general pattern for
propagating mremap(). This is probably why we disallowed these merges in the
past :) but it's good to address this, it also shows what a mess anon mremap()
makes, and further underlines the need for anon rmap rework.
Let me rework this, and come back with a general solution.
I will also add some self tests whose failure case will be 'breaks
CONFIG_DEBUG_VM kernel', and just assert basic stuff on them. We already have a
merge.c test suite for this.
>
> Previously I argued [1] that when mremap()'ing into a gap between two unfaulted
Right sorry missed this as was so focused on fixing the reported issue (and was
on leave at the time so wanted this done asap :)
> VMAs that are compatible, calling `dup_anon_vma(target, next, &anon_dup);`
> is incorrect:
> mremap()
> |-----------------------------------|
> | |
> v |
> [ VMA C, unfaulted ][ gap ][ VMA B, unfaulted ][ gap ][ VMA A, faulted ]
>
>
> I suspected this patch doesn't cover the case, so I slightly modified your
> repro to test my theory (added to the end of the email).
>
> The test confirmed my theory. It doesn't cover the case above because
> target is not next but prev ((target != next) returns true), and neither
> target nor next have anon_vma, but the VMA that is copied from does.
>
> With the modified repro, I'm still seeing the warning that Jann added,
> on top of mm-hotfixes-unstable (HEAD: 871cf622a8ba) which already has
> your fix (65769f3b9877).
>
> [1] https://lore.kernel.org/linux-mm/aVd-UZQGW4ltH6hY@hyeyoo
>
> > + } else if (copied_from) {
> > + vma_start_write(next);
> > +
> > + /*
> > + * We are copying from a VMA (i.e. mremap()'ing) to
> > + * next, and thus must ensure that either anon_vma's are
> > + * already compatible (in which case this call is a nop)
> > + * or all anon_vma state is propagated to next
> > + */
> > + ret = dup_anon_vma(next, copied_from, &anon_dup);
> > + if (ret)
> > + return ret;
>
> So we need to fix this to work even when (target != next) returns true.
>
> Modified repro:
>
> #define _GNU_SOURCE
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <sys/mman.h>
>
> #define RESERVED_PGS (100)
> #define VMA_A_PGS (10)
> #define VMA_B_PGS (10)
> #define VMA_C_PGS (10)
> #define NUM_ITERS (1000)
>
> static void trigger_bug(void)
> {
> unsigned long page_size = sysconf(_SC_PAGE_SIZE);
> char *reserved, *ptr_a, *ptr_b, *ptr_c;
>
> /*
> * The goal here is to achieve:
> * mremap()
> * |-----------------------------------|
> * | |
> * v |
> * [ VMA C, unfaulted ][ gap ][ VMA B, unfaulted ][ gap ][ VMA A, faulted ]
> *
> * Merge VMA C, B, A by expanding VMA C.
> */
>
> /* Reserve a region of memory to operate in. */
> reserved = mmap(NULL, RESERVED_PGS * page_size, PROT_NONE,
> MAP_PRIVATE | MAP_ANON, -1, 0);
> if (reserved == MAP_FAILED) {
> perror("mmap reserved");
> exit(EXIT_FAILURE);
> }
>
> /* Map VMA A into place. */
> ptr_a = mmap(&reserved[page_size + VMA_C_PGS * page_size], VMA_A_PGS * page_size,
> PROT_READ | PROT_WRITE,
> MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
> if (ptr_a == MAP_FAILED) {
> perror("mmap VMA A");
> exit(EXIT_FAILURE);
> }
> /* Fault it in. */
> ptr_a[0] = 'x';
>
> /*
> * Now move it out of the way so we can place VMA B in position,
> * unfaulted.
> */
> ptr_a = mremap(ptr_a, VMA_A_PGS * page_size, VMA_A_PGS * page_size,
> MREMAP_FIXED | MREMAP_MAYMOVE, &reserved[50 * page_size]);
> if (ptr_a == MAP_FAILED) {
> perror("mremap VMA A out of the way");
> exit(EXIT_FAILURE);
> }
>
> /* Map VMA C into place. */
> ptr_c = mmap(&reserved[page_size], VMA_C_PGS * page_size,
> PROT_READ | PROT_WRITE,
> MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
> if (ptr_c == MAP_FAILED) {
> perror("mmap VMA C");
> exit(EXIT_FAILURE);
> }
>
> /* Map VMA B into place. */
> ptr_b = mmap(&reserved[page_size + VMA_C_PGS * page_size + VMA_A_PGS * page_size],
> VMA_B_PGS * page_size, PROT_READ | PROT_WRITE,
> MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
> if (ptr_b == MAP_FAILED) {
> perror("mmap VMA B");
> exit(EXIT_FAILURE);
> }
>
> /* Now move VMA A into position w/MREMAP_DONTUNMAP + free anon_vma. */
> ptr_a = mremap(ptr_a, VMA_A_PGS * page_size, VMA_A_PGS * page_size,
> MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP,
> &reserved[page_size + VMA_C_PGS * page_size]);
> if (ptr_a == MAP_FAILED) {
> perror("mremap VMA A with MREMAP_DONTUNMAP");
> exit(EXIT_FAILURE);
> }
>
> /* Finally, unmap VMA A which should trigger the bug. */
> munmap(ptr_a, VMA_A_PGS * page_size);
>
> /* Cleanup in case bug didn't trigger sufficiently visibly... */
> munmap(reserved, RESERVED_PGS * page_size);
> }
>
> int main(void)
> {
> int i;
>
> for (i = 0; i < NUM_ITERS; i++)
> trigger_bug();
>
> return EXIT_SUCCESS;
> }
Thanks for this! Will go through all this and come up with a v2.
>
> --
> Cheers,
> Harry / Hyeonggon
next prev parent reply other threads:[~2026-01-05 9:12 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-02 20:55 Lorenzo Stoakes
2026-01-02 21:00 ` Lorenzo Stoakes
2026-01-04 19:25 ` David Hildenbrand (Red Hat)
2026-01-05 12:53 ` Lorenzo Stoakes
2026-01-05 5:11 ` Harry Yoo
2026-01-05 9:12 ` Lorenzo Stoakes [this message]
2026-01-05 15:24 ` Liam R. Howlett
2026-01-05 15:32 ` Lorenzo Stoakes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=af0e6e5f-68ad-466f-b009-e45a2c553744@lucifer.local \
--to=lorenzo.stoakes@oracle.com \
--cc=Liam.Howlett@oracle.com \
--cc=aha310510@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=david@kernel.org \
--cc=harry.yoo@oracle.com \
--cc=jannh@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=pfalcato@suse.de \
--cc=riel@surriel.com \
--cc=vbabka@suse.cz \
--cc=yeoreum.yun@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox