From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: jannh@google.com, Liam.Howlett@oracle.com,
lorenzo.stoakes@oracle.com, vbabka@suse.cz, pfalcato@suse.de,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
stable@vger.kernel.org
Subject: Re: [PATCH 1/1] mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped
Date: Mon, 28 Jul 2025 10:16:59 -0700
Message-ID: <CAJuCfpGDFcz0rTdnJx3UkSCDt3Ri-R55swZcqOoiLBbg8iF2zw@mail.gmail.com>
In-Reply-To: <20250728170950.2216966-1-surenb@google.com>
On Mon, Jul 28, 2025 at 10:09 AM Suren Baghdasaryan <surenb@google.com> wrote:
>
> By inducing delays in the right places, Jann Horn created a reproducer
> for a hard-to-hit UAF issue that became possible after VMAs were allowed
> to be recycled by adding SLAB_TYPESAFE_BY_RCU to their cache.
>
> Race description is borrowed from Jann's discovery report:
> lock_vma_under_rcu() looks up a VMA locklessly with mas_walk() under
> rcu_read_lock(). At that point, the VMA may be concurrently freed, and
> it can be recycled by another process. vma_start_read() then
> increments the vma->vm_refcnt (if it is in an acceptable range), and
> if this succeeds, vma_start_read() can return a recycled VMA.
>
> In this scenario where the VMA has been recycled, lock_vma_under_rcu()
> will then detect the mismatching ->vm_mm pointer and drop the VMA
> through vma_end_read(), which calls vma_refcount_put().
> vma_refcount_put() drops the refcount and then calls rcuwait_wake_up()
> using a copy of vma->vm_mm. This is wrong: It implicitly assumes that
> the caller is keeping the VMA's mm alive, but in this scenario the caller
> has no relation to the VMA's mm, so the rcuwait_wake_up() can cause UAF.
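
For readers tracing the flow: the problematic pattern is that vma_refcount_put()
samples vma->vm_mm before dropping the refcount and then dereferences that copy
after the drop. Below is a simplified sketch of the helper, paraphrased from the
description above; the is_vma_writer_only() / mm->vma_writer_wait names are
recalled from the current mmap_lock.h rather than quoted from this patch:

    static inline void vma_refcount_put(struct vm_area_struct *vma)
    {
            /* Copy of vm_mm taken up front; the caller may hold no reference on this mm. */
            struct mm_struct *mm = vma->vm_mm;
            int oldcnt;

            rwsem_release(&vma->vmlock_dep_map, _RET_IP_);
            if (!__refcount_dec_and_test(&vma->vm_refcnt, &oldcnt)) {
                    if (is_vma_writer_only(oldcnt - 1))
                            /*
                             * Once vm_refcnt is dropped, a writer waiting on this mm can
                             * proceed and the mm can be freed, so this dereference of the
                             * saved mm pointer is the use-after-free in the race below.
                             */
                            rcuwait_wake_up(&mm->vma_writer_wait);
            }
    }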
>
> The diagram depicting the race:
> T1                          T2                    T3
> ==                          ==                    ==
> lock_vma_under_rcu
>   mas_walk
>                             <VMA gets removed from mm>
>                                                   mmap
>                                                   <the same VMA is reallocated>
>   vma_start_read
>     __refcount_inc_not_zero_limited_acquire
>                                                   munmap
>                                                     __vma_enter_locked
>                                                       refcount_add_not_zero
>   vma_end_read
>     vma_refcount_put
>       __refcount_dec_and_test
>                                                     rcuwait_wait_event
>                                                       <finish operation>
>       rcuwait_wake_up [UAF]
>
> Note that rcuwait_wait_event() in T3 does not block because the refcount
> was already dropped by T1. At this point T3 can exit and free the mm,
> causing a UAF in T1.
>
> To avoid this, move the vma->vm_mm verification into vma_start_read() and
> grab vma->vm_mm to stabilize it before the vma_refcount_put() operation.
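
In other words, for a vma that turns out to belong to a different mm, the
unlock path becomes (ordering sketch only; the actual hunk is in the diff
below):

    rcu_read_unlock();          /* __mmdrop() is heavy, no need to do it under RCU */
    mmgrab(vma->vm_mm);         /* pin the foreign mm before dropping vm_refcnt */
    vma_refcount_put(vma);      /* may wake a writer waiting on that mm */
    mmdrop(vma->vm_mm);         /* release the pin only after the wakeup */
    rcu_read_lock();
    return NULL;

The ordering is the point: the mm reference has to be taken before vm_refcnt is
dropped, because the drop is what lets the waiter in the other process finish
and free the mm.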
>
> Fixes: 3104138517fc ("mm: make vma cache SLAB_TYPESAFE_BY_RCU")
> Reported-by: Jann Horn <jannh@google.com>
> Closes: https://lore.kernel.org/all/CAG48ez0-deFbVH=E3jbkWx=X3uVbd8nWeo6kbJPQ0KoUD+m2tA@mail.gmail.com/
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> Cc: <stable@vger.kernel.org>
> ---
> - Applies cleanly over mm-unstable.
> - Should be applied to 6.15 and 6.16, but those branches do not have
>   the lock_next_vma() function, so the change in lock_next_vma() should be
>   skipped when applying to them.
Andrew, if you would like me to post a separate patch for 6.15 and
6.16, please let me know. The merge conflict in those branches should
be trivial: just skip the change in lock_next_vma(), which does not
exist in those branches.
>
> include/linux/mmap_lock.h | 21 +++++++++++++++++++++
> mm/mmap_lock.c | 10 +++-------
> 2 files changed, 24 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
> index 1f4f44951abe..4ee4ab835c41 100644
> --- a/include/linux/mmap_lock.h
> +++ b/include/linux/mmap_lock.h
> @@ -12,6 +12,7 @@ extern int rcuwait_wake_up(struct rcuwait *w);
> #include <linux/tracepoint-defs.h>
> #include <linux/types.h>
> #include <linux/cleanup.h>
> +#include <linux/sched/mm.h>
>
> #define MMAP_LOCK_INITIALIZER(name) \
> .mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock),
> @@ -183,6 +184,26 @@ static inline struct vm_area_struct *vma_start_read(struct mm_struct *mm,
> }
>
> rwsem_acquire_read(&vma->vmlock_dep_map, 0, 1, _RET_IP_);
> +
> + /*
> + * If vma got attached to another mm from under us, that mm is not
> + * stable and can be freed in the narrow window after vma->vm_refcnt
> + * is dropped and before rcuwait_wake_up(mm) is called. Grab it before
> + * releasing vma->vm_refcnt.
> + */
> + if (unlikely(vma->vm_mm != mm)) {
> + /*
> + * __mmdrop() is a heavy operation and we don't need RCU
> + * protection here. Release RCU lock during these operations.
> + */
> + rcu_read_unlock();
> + mmgrab(vma->vm_mm);
> + vma_refcount_put(vma);
> + mmdrop(vma->vm_mm);
> + rcu_read_lock();
> + return NULL;
> + }
> +
> /*
> * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result.
> * False unlocked result is impossible because we modify and check
> diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c
> index 729fb7d0dd59..aa3bc42ecde0 100644
> --- a/mm/mmap_lock.c
> +++ b/mm/mmap_lock.c
> @@ -164,8 +164,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
> */
>
> /* Check if the vma we locked is the right one. */
> - if (unlikely(vma->vm_mm != mm ||
> - address < vma->vm_start || address >= vma->vm_end))
> + if (unlikely(address < vma->vm_start || address >= vma->vm_end))
> goto inval_end_read;
>
> rcu_read_unlock();
> @@ -236,11 +235,8 @@ struct vm_area_struct *lock_next_vma(struct mm_struct *mm,
> goto fallback;
> }
>
> - /*
> - * Verify the vma we locked belongs to the same address space and it's
> - * not behind of the last search position.
> - */
> - if (unlikely(vma->vm_mm != mm || from_addr >= vma->vm_end))
> + /* Verify the vma is not behind the last search position. */
> + if (unlikely(from_addr >= vma->vm_end))
> goto fallback_unlock;
>
> /*
>
> base-commit: c617a4dd7102e691fa0fb2bc4f6b369e37d7f509
> --
> 2.50.1.487.gc89ff58d15-goog
>