* [PATCH 1/1] mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped
From: Suren Baghdasaryan @ 2025-07-28 17:09 UTC
To: akpm
Cc: jannh, Liam.Howlett, lorenzo.stoakes, vbabka, pfalcato, linux-mm,
linux-kernel, surenb, stable
By inducing delays in the right places, Jann Horn created a reproducer
for a hard-to-hit UAF issue that became possible after VMAs were allowed
to be recycled by adding SLAB_TYPESAFE_BY_RCU to their cache.
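
For background: with SLAB_TYPESAFE_BY_RCU, an object observed under
rcu_read_lock() cannot have its memory returned to the system, but it can
be freed and immediately reallocated as a new object of the same type. A
lockless reader therefore has to follow roughly the pattern below; this is
an illustrative sketch, not kernel code, and lookup(), still_matches() and
put_obj() are placeholder names:

	rcu_read_lock();
	obj = lookup(key);		/* may race with free + reuse */
	if (obj && refcount_inc_not_zero(&obj->refcnt)) {
		/* pinned now, but possibly recycled: re-check identity */
		if (!still_matches(obj, key)) {
			/*
			 * After the final put, nothing reachable through
			 * obj may be touched anymore.
			 */
			put_obj(obj);
			obj = NULL;
		}
	}
	rcu_read_unlock();

The bug below sits in that last step: the put path itself still uses
state reached through the object.
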
The race description is borrowed from Jann's discovery report:
lock_vma_under_rcu() looks up a VMA locklessly with mas_walk() under
rcu_read_lock(). At that point, the VMA may be concurrently freed, and
it can be recycled by another process. vma_start_read() then
increments the vma->vm_refcnt (if it is in an acceptable range), and
if this succeeds, vma_start_read() can return a recycled VMA.
In this scenario where the VMA has been recycled, lock_vma_under_rcu()
will then detect the mismatching ->vm_mm pointer and drop the VMA
through vma_end_read(), which calls vma_refcount_put().
vma_refcount_put() drops the refcount and then calls rcuwait_wake_up()
using a copy of vma->vm_mm. This is wrong: it implicitly assumes that
the caller is keeping the VMA's mm alive, but in this scenario the caller
has no relation to the VMA's mm, so the rcuwait_wake_up() can cause UAF.
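
In simplified form, vma_refcount_put() does roughly the following (a
sketch of the current code, with lockdep annotations trimmed):

	static inline void vma_refcount_put(struct vm_area_struct *vma)
	{
		/* copy, because the final put may free the vma */
		struct mm_struct *mm = vma->vm_mm;
		int oldcnt;

		if (!__refcount_dec_and_test(&vma->vm_refcnt, &oldcnt)) {
			/* a writer may be waiting on this mm ... */
			if (is_vma_writer_only(oldcnt - 1))
				/* ... but nothing keeps mm alive here */
				rcuwait_wake_up(&mm->vma_writer_wait);
		}
	}
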
The diagram depicting the race:
T1         T2         T3
==         ==         ==
lock_vma_under_rcu
  mas_walk
           <VMA gets removed from mm>
           mmap
           <the same VMA is reallocated>
vma_start_read
  __refcount_inc_not_zero_limited_acquire
                      munmap
                        __vma_enter_locked
                          refcount_add_not_zero
vma_end_read
  vma_refcount_put
    __refcount_dec_and_test
                      rcuwait_wait_event
                        <finish operation>
    rcuwait_wake_up [UAF]
Note that rcuwait_wait_event() in T3 does not block because the refcount
was already dropped by T1. At this point T3 can exit and free the mm,
causing a UAF in T1.
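
For reference, the writer side in __vma_enter_locked() waits roughly like
this, so once T1 has dropped its reference the condition is already true
and the wait returns without sleeping:

	rcuwait_wait_event(&vma->vm_mm->vma_writer_wait,
			   refcount_read(&vma->vm_refcnt) == tgt_refcnt,
			   TASK_UNINTERRUPTIBLE);
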
To avoid this, we move the vma->vm_mm verification into vma_start_read() and
grab vma->vm_mm to stabilize it before the vma_refcount_put() operation.
Fixes: 3104138517fc ("mm: make vma cache SLAB_TYPESAFE_BY_RCU")
Reported-by: Jann Horn <jannh@google.com>
Closes: https://lore.kernel.org/all/CAG48ez0-deFbVH=E3jbkWx=X3uVbd8nWeo6kbJPQ0KoUD+m2tA@mail.gmail.com/
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
---
- Applies cleanly over mm-unstable.
- Should be applied to 6.15 and 6.16, but those branches do not have
the lock_next_vma() function, so the change in lock_next_vma() should be
skipped when applying to them.
include/linux/mmap_lock.h | 21 +++++++++++++++++++++
mm/mmap_lock.c | 10 +++-------
2 files changed, 24 insertions(+), 7 deletions(-)
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index 1f4f44951abe..4ee4ab835c41 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -12,6 +12,7 @@ extern int rcuwait_wake_up(struct rcuwait *w);
#include <linux/tracepoint-defs.h>
#include <linux/types.h>
#include <linux/cleanup.h>
+#include <linux/sched/mm.h>
#define MMAP_LOCK_INITIALIZER(name) \
.mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock),
@@ -183,6 +184,26 @@ static inline struct vm_area_struct *vma_start_read(struct mm_struct *mm,
}
rwsem_acquire_read(&vma->vmlock_dep_map, 0, 1, _RET_IP_);
+
+ /*
+ * If vma got attached to another mm from under us, that mm is not
+ * stable and can be freed in the narrow window after vma->vm_refcnt
+ * is dropped and before rcuwait_wake_up(mm) is called. Grab it before
+ * releasing vma->vm_refcnt.
+ */
+ if (unlikely(vma->vm_mm != mm)) {
+ /*
+ * __mmdrop() is a heavy operation and we don't need RCU
+ * protection here. Release RCU lock during these operations.
+ */
+ rcu_read_unlock();
+ mmgrab(vma->vm_mm);
+ vma_refcount_put(vma);
+ mmdrop(vma->vm_mm);
+ rcu_read_lock();
+ return NULL;
+ }
+
/*
* Overflow of vm_lock_seq/mm_lock_seq might produce false locked result.
* False unlocked result is impossible because we modify and check
diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c
index 729fb7d0dd59..aa3bc42ecde0 100644
--- a/mm/mmap_lock.c
+++ b/mm/mmap_lock.c
@@ -164,8 +164,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
*/
/* Check if the vma we locked is the right one. */
- if (unlikely(vma->vm_mm != mm ||
- address < vma->vm_start || address >= vma->vm_end))
+ if (unlikely(address < vma->vm_start || address >= vma->vm_end))
goto inval_end_read;
rcu_read_unlock();
@@ -236,11 +235,8 @@ struct vm_area_struct *lock_next_vma(struct mm_struct *mm,
goto fallback;
}
- /*
- * Verify the vma we locked belongs to the same address space and it's
- * not behind of the last search position.
- */
- if (unlikely(vma->vm_mm != mm || from_addr >= vma->vm_end))
+ /* Verify the vma is not behind the last search position. */
+ if (unlikely(from_addr >= vma->vm_end))
goto fallback_unlock;
/*
base-commit: c617a4dd7102e691fa0fb2bc4f6b369e37d7f509
--
2.50.1.487.gc89ff58d15-goog
* Re: [PATCH 1/1] mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped
From: Suren Baghdasaryan @ 2025-07-28 17:16 UTC
To: akpm
Cc: jannh, Liam.Howlett, lorenzo.stoakes, vbabka, pfalcato, linux-mm,
linux-kernel, stable
On Mon, Jul 28, 2025 at 10:09 AM Suren Baghdasaryan <surenb@google.com> wrote:
>
> By inducing delays in the right places, Jann Horn created a reproducer
> for a hard-to-hit UAF issue that became possible after VMAs were allowed
> to be recycled by adding SLAB_TYPESAFE_BY_RCU to their cache.
>
> The race description is borrowed from Jann's discovery report:
> lock_vma_under_rcu() looks up a VMA locklessly with mas_walk() under
> rcu_read_lock(). At that point, the VMA may be concurrently freed, and
> it can be recycled by another process. vma_start_read() then
> increments the vma->vm_refcnt (if it is in an acceptable range), and
> if this succeeds, vma_start_read() can return a recycled VMA.
>
> In this scenario where the VMA has been recycled, lock_vma_under_rcu()
> will then detect the mismatching ->vm_mm pointer and drop the VMA
> through vma_end_read(), which calls vma_refcount_put().
> vma_refcount_put() drops the refcount and then calls rcuwait_wake_up()
> using a copy of vma->vm_mm. This is wrong: it implicitly assumes that
> the caller is keeping the VMA's mm alive, but in this scenario the caller
> has no relation to the VMA's mm, so the rcuwait_wake_up() can cause UAF.
>
> The diagram depicting the race:
> T1         T2         T3
> ==         ==         ==
> lock_vma_under_rcu
>   mas_walk
>            <VMA gets removed from mm>
>            mmap
>            <the same VMA is reallocated>
> vma_start_read
>   __refcount_inc_not_zero_limited_acquire
>                       munmap
>                         __vma_enter_locked
>                           refcount_add_not_zero
> vma_end_read
>   vma_refcount_put
>     __refcount_dec_and_test
>                       rcuwait_wait_event
>                         <finish operation>
>     rcuwait_wake_up [UAF]
>
> Note that rcuwait_wait_event() in T3 does not block because the refcount
> was already dropped by T1. At this point T3 can exit and free the mm,
> causing a UAF in T1.
>
> To avoid this, we move the vma->vm_mm verification into vma_start_read() and
> grab vma->vm_mm to stabilize it before the vma_refcount_put() operation.
>
> Fixes: 3104138517fc ("mm: make vma cache SLAB_TYPESAFE_BY_RCU")
> Reported-by: Jann Horn <jannh@google.com>
> Closes: https://lore.kernel.org/all/CAG48ez0-deFbVH=E3jbkWx=X3uVbd8nWeo6kbJPQ0KoUD+m2tA@mail.gmail.com/
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> Cc: <stable@vger.kernel.org>
> ---
> - Applies cleanly over mm-unstable.
> - Should be applied to 6.15 and 6.16, but those branches do not have
> the lock_next_vma() function, so the change in lock_next_vma() should be
> skipped when applying to them.
Andrew, if you would like me to post a separate patch for 6.15 and
6.16, please let me know. The merge conflict in those branches should
be trivial: just skip the change in lock_next_vma(), which does not
exist in those branches.
>
> include/linux/mmap_lock.h | 21 +++++++++++++++++++++
> mm/mmap_lock.c | 10 +++-------
> 2 files changed, 24 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
> index 1f4f44951abe..4ee4ab835c41 100644
> --- a/include/linux/mmap_lock.h
> +++ b/include/linux/mmap_lock.h
> @@ -12,6 +12,7 @@ extern int rcuwait_wake_up(struct rcuwait *w);
> #include <linux/tracepoint-defs.h>
> #include <linux/types.h>
> #include <linux/cleanup.h>
> +#include <linux/sched/mm.h>
>
> #define MMAP_LOCK_INITIALIZER(name) \
> .mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock),
> @@ -183,6 +184,26 @@ static inline struct vm_area_struct *vma_start_read(struct mm_struct *mm,
> }
>
> rwsem_acquire_read(&vma->vmlock_dep_map, 0, 1, _RET_IP_);
> +
> + /*
> + * If vma got attached to another mm from under us, that mm is not
> + * stable and can be freed in the narrow window after vma->vm_refcnt
> + * is dropped and before rcuwait_wake_up(mm) is called. Grab it before
> + * releasing vma->vm_refcnt.
> + */
> + if (unlikely(vma->vm_mm != mm)) {
> + /*
> + * __mmdrop() is a heavy operation and we don't need RCU
> + * protection here. Release RCU lock during these operations.
> + */
> + rcu_read_unlock();
> + mmgrab(vma->vm_mm);
> + vma_refcount_put(vma);
> + mmdrop(vma->vm_mm);
> + rcu_read_lock();
> + return NULL;
> + }
> +
> /*
> * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result.
> * False unlocked result is impossible because we modify and check
> diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c
> index 729fb7d0dd59..aa3bc42ecde0 100644
> --- a/mm/mmap_lock.c
> +++ b/mm/mmap_lock.c
> @@ -164,8 +164,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
> */
>
> /* Check if the vma we locked is the right one. */
> - if (unlikely(vma->vm_mm != mm ||
> - address < vma->vm_start || address >= vma->vm_end))
> + if (unlikely(address < vma->vm_start || address >= vma->vm_end))
> goto inval_end_read;
>
> rcu_read_unlock();
> @@ -236,11 +235,8 @@ struct vm_area_struct *lock_next_vma(struct mm_struct *mm,
> goto fallback;
> }
>
> - /*
> - * Verify the vma we locked belongs to the same address space and it's
> - * not behind of the last search position.
> - */
> - if (unlikely(vma->vm_mm != mm || from_addr >= vma->vm_end))
> + /* Verify the vma is not behind the last search position. */
> + if (unlikely(from_addr >= vma->vm_end))
> goto fallback_unlock;
>
> /*
>
> base-commit: c617a4dd7102e691fa0fb2bc4f6b369e37d7f509
> --
> 2.50.1.487.gc89ff58d15-goog
>
* Re: [PATCH 1/1] mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped
From: Vlastimil Babka @ 2025-07-28 17:19 UTC
To: Suren Baghdasaryan, akpm
Cc: jannh, Liam.Howlett, lorenzo.stoakes, pfalcato, linux-mm,
linux-kernel, stable
On 7/28/25 19:09, Suren Baghdasaryan wrote:
> By inducing delays in the right places, Jann Horn created a reproducer
> for a hard-to-hit UAF issue that became possible after VMAs were allowed
> to be recycled by adding SLAB_TYPESAFE_BY_RCU to their cache.
>
> The race description is borrowed from Jann's discovery report:
> lock_vma_under_rcu() looks up a VMA locklessly with mas_walk() under
> rcu_read_lock(). At that point, the VMA may be concurrently freed, and
> it can be recycled by another process. vma_start_read() then
> increments the vma->vm_refcnt (if it is in an acceptable range), and
> if this succeeds, vma_start_read() can return a recycled VMA.
>
> In this scenario where the VMA has been recycled, lock_vma_under_rcu()
> will then detect the mismatching ->vm_mm pointer and drop the VMA
> through vma_end_read(), which calls vma_refcount_put().
> vma_refcount_put() drops the refcount and then calls rcuwait_wake_up()
> using a copy of vma->vm_mm. This is wrong: it implicitly assumes that
> the caller is keeping the VMA's mm alive, but in this scenario the caller
> has no relation to the VMA's mm, so the rcuwait_wake_up() can cause UAF.
>
> The diagram depicting the race:
> T1         T2         T3
> ==         ==         ==
> lock_vma_under_rcu
>   mas_walk
>            <VMA gets removed from mm>
>            mmap
>            <the same VMA is reallocated>
> vma_start_read
>   __refcount_inc_not_zero_limited_acquire
>                       munmap
>                         __vma_enter_locked
>                           refcount_add_not_zero
> vma_end_read
>   vma_refcount_put
>     __refcount_dec_and_test
>                       rcuwait_wait_event
>                         <finish operation>
>     rcuwait_wake_up [UAF]
>
> Note that rcuwait_wait_event() in T3 does not block because the refcount
> was already dropped by T1. At this point T3 can exit and free the mm,
> causing a UAF in T1.
>
> To avoid this, we move the vma->vm_mm verification into vma_start_read() and
> grab vma->vm_mm to stabilize it before the vma_refcount_put() operation.
>
> Fixes: 3104138517fc ("mm: make vma cache SLAB_TYPESAFE_BY_RCU")
> Reported-by: Jann Horn <jannh@google.com>
> Closes: https://lore.kernel.org/all/CAG48ez0-deFbVH=E3jbkWx=X3uVbd8nWeo6kbJPQ0KoUD+m2tA@mail.gmail.com/
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> Cc: <stable@vger.kernel.org>
> ---
> - Applies cleanly over mm-unstable.
> - Should be applied to 6.15 and 6.16, but those branches do not have
> the lock_next_vma() function, so the change in lock_next_vma() should be
> skipped when applying to them.
>
> include/linux/mmap_lock.h | 21 +++++++++++++++++++++
> mm/mmap_lock.c | 10 +++-------
> 2 files changed, 24 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
> index 1f4f44951abe..4ee4ab835c41 100644
> --- a/include/linux/mmap_lock.h
> +++ b/include/linux/mmap_lock.h
> @@ -12,6 +12,7 @@ extern int rcuwait_wake_up(struct rcuwait *w);
> #include <linux/tracepoint-defs.h>
> #include <linux/types.h>
> #include <linux/cleanup.h>
> +#include <linux/sched/mm.h>
>
> #define MMAP_LOCK_INITIALIZER(name) \
> .mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock),
> @@ -183,6 +184,26 @@ static inline struct vm_area_struct *vma_start_read(struct mm_struct *mm,
> }
>
> rwsem_acquire_read(&vma->vmlock_dep_map, 0, 1, _RET_IP_);
> +
> + /*
> + * If vma got attached to another mm from under us, that mm is not
> + * stable and can be freed in the narrow window after vma->vm_refcnt
> + * is dropped and before rcuwait_wake_up(mm) is called. Grab it before
> + * releasing vma->vm_refcnt.
> + */
> + if (unlikely(vma->vm_mm != mm)) {
> + /*
> + * __mmdrop() is a heavy operation and we don't need RCU
> + * protection here. Release RCU lock during these operations.
> + */
> + rcu_read_unlock();
> + mmgrab(vma->vm_mm);
> + vma_refcount_put(vma);
The vma can go away here.
> + mmdrop(vma->vm_mm);
So we need to copy the vma->vm_mm first?
> + rcu_read_lock();
> + return NULL;
> + }
> +
> /*
> * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result.
> * False unlocked result is impossible because we modify and check
> diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c
> index 729fb7d0dd59..aa3bc42ecde0 100644
> --- a/mm/mmap_lock.c
> +++ b/mm/mmap_lock.c
> @@ -164,8 +164,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
> */
>
> /* Check if the vma we locked is the right one. */
> - if (unlikely(vma->vm_mm != mm ||
> - address < vma->vm_start || address >= vma->vm_end))
> + if (unlikely(address < vma->vm_start || address >= vma->vm_end))
> goto inval_end_read;
>
> rcu_read_unlock();
> @@ -236,11 +235,8 @@ struct vm_area_struct *lock_next_vma(struct mm_struct *mm,
> goto fallback;
> }
>
> - /*
> - * Verify the vma we locked belongs to the same address space and it's
> - * not behind of the last search position.
> - */
> - if (unlikely(vma->vm_mm != mm || from_addr >= vma->vm_end))
> + /* Verify the vma is not behind the last search position. */
> + if (unlikely(from_addr >= vma->vm_end))
> goto fallback_unlock;
>
> /*
>
> base-commit: c617a4dd7102e691fa0fb2bc4f6b369e37d7f509
* Re: [PATCH 1/1] mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped
From: Suren Baghdasaryan @ 2025-07-28 17:37 UTC
To: Vlastimil Babka
Cc: akpm, jannh, Liam.Howlett, lorenzo.stoakes, pfalcato, linux-mm,
linux-kernel, stable
On Mon, Jul 28, 2025 at 10:19 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 7/28/25 19:09, Suren Baghdasaryan wrote:
> > By inducing delays in the right places, Jann Horn created a reproducer
> > for a hard-to-hit UAF issue that became possible after VMAs were allowed
> > to be recycled by adding SLAB_TYPESAFE_BY_RCU to their cache.
> >
> > The race description is borrowed from Jann's discovery report:
> > lock_vma_under_rcu() looks up a VMA locklessly with mas_walk() under
> > rcu_read_lock(). At that point, the VMA may be concurrently freed, and
> > it can be recycled by another process. vma_start_read() then
> > increments the vma->vm_refcnt (if it is in an acceptable range), and
> > if this succeeds, vma_start_read() can return a recycled VMA.
> >
> > In this scenario where the VMA has been recycled, lock_vma_under_rcu()
> > will then detect the mismatching ->vm_mm pointer and drop the VMA
> > through vma_end_read(), which calls vma_refcount_put().
> > vma_refcount_put() drops the refcount and then calls rcuwait_wake_up()
> > using a copy of vma->vm_mm. This is wrong: it implicitly assumes that
> > the caller is keeping the VMA's mm alive, but in this scenario the caller
> > has no relation to the VMA's mm, so the rcuwait_wake_up() can cause UAF.
> >
> > The diagram depicting the race:
> > T1         T2         T3
> > ==         ==         ==
> > lock_vma_under_rcu
> >   mas_walk
> >            <VMA gets removed from mm>
> >            mmap
> >            <the same VMA is reallocated>
> > vma_start_read
> >   __refcount_inc_not_zero_limited_acquire
> >                       munmap
> >                         __vma_enter_locked
> >                           refcount_add_not_zero
> > vma_end_read
> >   vma_refcount_put
> >     __refcount_dec_and_test
> >                       rcuwait_wait_event
> >                         <finish operation>
> >     rcuwait_wake_up [UAF]
> >
> > Note that rcuwait_wait_event() in T3 does not block because the refcount
> > was already dropped by T1. At this point T3 can exit and free the mm,
> > causing a UAF in T1.
> >
> > To avoid this, we move the vma->vm_mm verification into vma_start_read() and
> > grab vma->vm_mm to stabilize it before the vma_refcount_put() operation.
> >
> > Fixes: 3104138517fc ("mm: make vma cache SLAB_TYPESAFE_BY_RCU")
> > Reported-by: Jann Horn <jannh@google.com>
> > Closes: https://lore.kernel.org/all/CAG48ez0-deFbVH=E3jbkWx=X3uVbd8nWeo6kbJPQ0KoUD+m2tA@mail.gmail.com/
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > Cc: <stable@vger.kernel.org>
> > ---
> > - Applies cleanly over mm-unstable.
> > - Should be applied to 6.15 and 6.16, but those branches do not have
> > the lock_next_vma() function, so the change in lock_next_vma() should be
> > skipped when applying to them.
> >
> > include/linux/mmap_lock.h | 21 +++++++++++++++++++++
> > mm/mmap_lock.c | 10 +++-------
> > 2 files changed, 24 insertions(+), 7 deletions(-)
> >
> > diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
> > index 1f4f44951abe..4ee4ab835c41 100644
> > --- a/include/linux/mmap_lock.h
> > +++ b/include/linux/mmap_lock.h
> > @@ -12,6 +12,7 @@ extern int rcuwait_wake_up(struct rcuwait *w);
> > #include <linux/tracepoint-defs.h>
> > #include <linux/types.h>
> > #include <linux/cleanup.h>
> > +#include <linux/sched/mm.h>
> >
> > #define MMAP_LOCK_INITIALIZER(name) \
> > .mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock),
> > @@ -183,6 +184,26 @@ static inline struct vm_area_struct *vma_start_read(struct mm_struct *mm,
> > }
> >
> > rwsem_acquire_read(&vma->vmlock_dep_map, 0, 1, _RET_IP_);
> > +
> > + /*
> > + * If vma got attached to another mm from under us, that mm is not
> > + * stable and can be freed in the narrow window after vma->vm_refcnt
> > + * is dropped and before rcuwait_wake_up(mm) is called. Grab it before
> > + * releasing vma->vm_refcnt.
> > + */
> > + if (unlikely(vma->vm_mm != mm)) {
> > + /*
> > + * __mmdrop() is a heavy operation and we don't need RCU
> > + * protection here. Release RCU lock during these operations.
> > + */
> > + rcu_read_unlock();
> > + mmgrab(vma->vm_mm);
> > + vma_refcount_put(vma);
>
> The vma can go away here.
No, the vma can't go away here because we are holding vm_refcnt. So,
the vma and its mm are stable up until vma_refcount_put() drops
vm_refcnt.
>
> > + mmdrop(vma->vm_mm);
>
> So we need to copy the vma->vm_mm first?
>
> > + rcu_read_lock();
> > + return NULL;
> > + }
> > +
> > /*
> > * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result.
> > * False unlocked result is impossible because we modify and check
> > diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c
> > index 729fb7d0dd59..aa3bc42ecde0 100644
> > --- a/mm/mmap_lock.c
> > +++ b/mm/mmap_lock.c
> > @@ -164,8 +164,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
> > */
> >
> > /* Check if the vma we locked is the right one. */
> > - if (unlikely(vma->vm_mm != mm ||
> > - address < vma->vm_start || address >= vma->vm_end))
> > + if (unlikely(address < vma->vm_start || address >= vma->vm_end))
> > goto inval_end_read;
> >
> > rcu_read_unlock();
> > @@ -236,11 +235,8 @@ struct vm_area_struct *lock_next_vma(struct mm_struct *mm,
> > goto fallback;
> > }
> >
> > - /*
> > - * Verify the vma we locked belongs to the same address space and it's
> > - * not behind of the last search position.
> > - */
> > - if (unlikely(vma->vm_mm != mm || from_addr >= vma->vm_end))
> > + /* Verify the vma is not behind the last search position. */
> > + if (unlikely(from_addr >= vma->vm_end))
> > goto fallback_unlock;
> >
> > /*
> >
> > base-commit: c617a4dd7102e691fa0fb2bc4f6b369e37d7f509
>
* Re: [PATCH 1/1] mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped
From: Vlastimil Babka @ 2025-07-28 17:39 UTC
To: Suren Baghdasaryan
Cc: akpm, jannh, Liam.Howlett, lorenzo.stoakes, pfalcato, linux-mm,
linux-kernel, stable
On 7/28/25 19:37, Suren Baghdasaryan wrote:
> On Mon, Jul 28, 2025 at 10:19 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>> > + */
>> > + if (unlikely(vma->vm_mm != mm)) {
>> > + /*
>> > + * __mmdrop() is a heavy operation and we don't need RCU
>> > + * protection here. Release RCU lock during these operations.
>> > + */
>> > + rcu_read_unlock();
>> > + mmgrab(vma->vm_mm);
>> > + vma_refcount_put(vma);
>>
>> The vma can go away here.
>
> No, the vma can't go away here because we are holding vm_refcnt. So,
> the vma and its mm are stable up until vma_refcount_put() drops
> vm_refcnt.
But that's exactly what we're doing here?
>>
>> > + mmdrop(vma->vm_mm);
And here we reference the vma again?
>> So we need to copy the vma->vm_mm first?
>>
>> > + rcu_read_lock();
>> > + return NULL;
>> > + }
>> > +
>> > /*
>> > * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result.
>> > * False unlocked result is impossible because we modify and check
* Re: [PATCH 1/1] mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped
From: Suren Baghdasaryan @ 2025-07-28 17:43 UTC
To: Vlastimil Babka
Cc: akpm, jannh, Liam.Howlett, lorenzo.stoakes, pfalcato, linux-mm,
linux-kernel, stable
On Mon, Jul 28, 2025 at 10:39 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 7/28/25 19:37, Suren Baghdasaryan wrote:
> > On Mon, Jul 28, 2025 at 10:19 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> >> > + */
> >> > + if (unlikely(vma->vm_mm != mm)) {
> >> > + /*
> >> > + * __mmdrop() is a heavy operation and we don't need RCU
> >> > + * protection here. Release RCU lock during these operations.
> >> > + */
> >> > + rcu_read_unlock();
> >> > + mmgrab(vma->vm_mm);
> >> > + vma_refcount_put(vma);
> >>
> >> The vma can go away here.
> >
> > No, the vma can't go away here because we are holding vm_refcnt. So,
> > the vma and its mm are stable up until vma_refcount_put() drops
> > vm_refcnt.
>
> But that's exactly what we're doing here?
Ah, you are right. At the time of the mmdrop() call the vma is already
unstable. Let me fix it by copying the mm pointer first, like we do in
vma_refcount_put().
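
Something like this (untested sketch, the local variable name is mine):

	if (unlikely(vma->vm_mm != mm)) {
		/* vma is unstable once vm_refcnt is dropped, copy vm_mm first */
		struct mm_struct *other_mm = vma->vm_mm;

		rcu_read_unlock();
		mmgrab(other_mm);
		vma_refcount_put(vma);
		mmdrop(other_mm);
		rcu_read_lock();
		return NULL;
	}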
>
> >>
> >> > + mmdrop(vma->vm_mm);
>
> And here we reference the vma again?
>
> >> So we need to copy the vma->vm_mm first?
> >>
> >> > + rcu_read_lock();
> >> > + return NULL;
> >> > + }
> >> > +
> >> > /*
> >> > * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result.
> >> > * False unlocked result is impossible because we modify and check
* Re: [PATCH 1/1] mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped
From: Suren Baghdasaryan @ 2025-07-28 17:55 UTC
To: Vlastimil Babka
Cc: akpm, jannh, Liam.Howlett, lorenzo.stoakes, pfalcato, linux-mm,
linux-kernel, stable
On Mon, Jul 28, 2025 at 10:43 AM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Mon, Jul 28, 2025 at 10:39 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> >
> > On 7/28/25 19:37, Suren Baghdasaryan wrote:
> > > On Mon, Jul 28, 2025 at 10:19 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> > >> > + */
> > >> > + if (unlikely(vma->vm_mm != mm)) {
> > >> > + /*
> > >> > + * __mmdrop() is a heavy operation and we don't need RCU
> > >> > + * protection here. Release RCU lock during these operations.
> > >> > + */
> > >> > + rcu_read_unlock();
> > >> > + mmgrab(vma->vm_mm);
> > >> > + vma_refcount_put(vma);
> > >>
> > >> The vma can go away here.
> > >
> > > No, the vma can't go away here because we are holding vm_refcnt. So,
> > > the vma and its mm are stable up until vma_refcount_put() drops
> > > vm_refcnt.
> >
> > But that's exactly what we're doing here?
>
> Ah, you are right. At the time of the mmdrop() call the vma is already
> unstable. Let me fix it by copying the mm pointer first, like we do in
> vma_refcount_put().
Fixed in v2: https://lore.kernel.org/all/20250728175355.2282375-1-surenb@google.com/
Thanks!
>
> >
> > >>
> > >> > + mmdrop(vma->vm_mm);
> >
> > And here we reference the vma again?
> >
> > >> So we need to copy the vma->vm_mm first?
> > >>
> > >> > + rcu_read_lock();
> > >> > + return NULL;
> > >> > + }
> > >> > +
> > >> > /*
> > >> > * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result.
> > >> > * False unlocked result is impossible because we modify and check