From: Vlastimil Babka <vbabka@suse.cz>
To: Jann Horn <jannh@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@redhat.com>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Rik van Riel <riel@surriel.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Harry Yoo <harry.yoo@oracle.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm/rmap: Add anon_vma lifetime debug check
Date: Fri, 25 Jul 2025 17:38:03 +0200
Message-ID: <1a43ebfa-0c4f-4029-ad81-125aba68b764@suse.cz>
In-Reply-To: <CAG48ez23CPO-m6kPaEs8kLUfRVCN+QMbsEn7BocfaJuq=gRwaA@mail.gmail.com>

On 7/25/25 16:44, Jann Horn wrote:
> On Fri, Jul 25, 2025 at 4:11 PM Vlastimil Babka <vbabka@suse.cz> wrote:
>> On 7/25/25 14:16, Jann Horn wrote:
>> > If an anon folio is mapped into userspace, its anon_vma must be alive,
>> > otherwise rmap walks can hit UAF.
>> >
>> > There have been syzkaller reports a few months ago[1][2] of UAF in rmap
>> > walks that seems to indicate that there can be pages with elevated mapcount
>> > whose anon_vma has already been freed, but I think we never figured out
>> > what the cause is; and syzkaller only hit these UAFs when memory pressure
>> > randomly caused reclaim to rmap-walk the affected pages, so it of course
>> > didn't manage to create a reproducer.
>> >
>> > Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios to
>> > hopefully catch such issues more reliably.
>> >
>> > [1] https://lore.kernel.org/r/67abaeaf.050a0220.110943.0041.GAE@google.com
>> > [2] https://lore.kernel.org/r/67a76f33.050a0220.3d72c.0028.GAE@google.com
>> >
>> > Acked-by: David Hildenbrand <david@redhat.com>
>> > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>> > Signed-off-by: Jann Horn <jannh@google.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

>> > ---
>> > Changes in v2:
>> > - applied akpm's fixup (use FOLIO_MAPPING_ANON, ...)
>> > - remove CONFIG_DEBUG_VM check and use folio_test_* helpers (David)
>> > - more verbose comment (Lorenzo)
>> > - replaced "page" mentions with "folio" in commit message
>> > - Link to v1: https://lore.kernel.org/r/20250724-anonvma-uaf-debug-v1-1-29989ddc4e2a@google.com
>> > ---
>> >  include/linux/rmap.h | 22 ++++++++++++++++++++++
>> >  1 file changed, 22 insertions(+)
>> >
>> > diff --git a/include/linux/rmap.h b/include/linux/rmap.h
>> > index 20803fcb49a7..6cd020eea37a 100644
>> > --- a/include/linux/rmap.h
>> > +++ b/include/linux/rmap.h
>> > @@ -449,6 +449,28 @@ static inline void __folio_rmap_sanity_checks(const struct folio *folio,
>> >       default:
>> >               VM_WARN_ON_ONCE(true);
>> >       }
>> > +
>> > +     /*
>> > +      * Anon folios must have an associated live anon_vma as long as they're
>> > +      * mapped into userspace.
>> > +      * Note that the atomic_read() mainly does two things:
>> > +      *
>> > +      * 1. In KASAN builds with CONFIG_SLUB_RCU_DEBUG, it causes KASAN to
>> > +      *    check that the associated anon_vma has not yet been freed (subject
>>
>> I think more precisely it checks that the slab folio hosting the anon_vma
>> could not have been yet freed, IIUC? If the anon_vma itself has been freed
>> then this will not trigger.
> 
> The point of CONFIG_SLUB_RCU_DEBUG, which I'm talking about here, is
> that it allows KASAN to catch UAF once the anon_vma has been freed and
> an RCU grace period has passed; it is not necessary that the slab
> folio has been freed.
> 
> You can see that working in the linked syzkaller reports - KASAN
> tracked the object as freed after slab_free_after_rcu_debug(), which
> is an RCU callback scheduled from kmem_cache_free().
> 
>> > +      *    to KASAN's usual limitations). This check will pass if the
>> > +      *    anon_vma's refcount has already dropped to 0 but an RCU grace
>> > +      *    period hasn't passed since then.
>>
>> AFAIU this says it more accurately and matches my interpretation above?
>>
>> > +      * 2. If the anon_vma has not yet been freed, it checks that the
>> > +      *    anon_vma still has a nonzero refcount (as opposed to being in the
>> > +      *    middle of an RCU delay for getting freed).
>>
>> Again the RCU delay would apply to the slab page, unless you talk about the
>> CONFIG_SLUB_RCU_DEBUG specific path (IIRC).
> 
> Yes, right, the "RCU delay" in the second bullet point refers to
> CONFIG_SLUB_RCU_DEBUG.

OK, I misunderstood. Although bullet 1 notes that the check only happens with
CONFIG_SLUB_RCU_DEBUG, I assumed the description was still meant semantically
from the point of view of anon_vma users (particularly what "freed" means:
the moment of kfree() vs. KASAN quarantine). Once considered from the point
of view of what happens to the object under CONFIG_SLUB_RCU_DEBUG, it all
makes sense.

> Here I'm saying "If the anon_vma has not yet been freed" because
> that's the only case in which I can reliably say what will happen, and
> this is the main case that isn't already covered by the first bullet
> point in a CONFIG_SLUB_RCU_DEBUG build.
> 
>> That said, I wonder if here in __folio_rmap_sanity_checks() we are even in a
>> situation where we rely on SLAB_TYPESAFE_BY_RCU in order to not touch
>> something that's not anon_vma anymore? I think we expect it to exist?
> 
> Yes, we expect it to exist. That's why I'm not just asserting that the
> anon_vma is still considered live by KASAN, but also that its refcount
> is non-zero.
> 
>> Can we
>> thus invent a CONFIG_SLUB_RCU_DEBUG-specific assert that assert the anon_vma
>> itself has not been freed yet (i.e. even if within a grace period?).
> 
> That is essentially what I'm doing - checking that the count is
> nonzero verifies that it's not within a grace period, and the implicit
> KASAN check verifies it can't be in a KASAN quarantine after the grace
> period is over.

OK, I guess that's sufficient. We'd be unlikely to find a bug scenario where
the anon_vma was kfree()'d, so that KASAN with CONFIG_SLUB_RCU_DEBUG is still
waiting for the grace period, yet its refcount is nonzero.
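
To make the cases discussed above concrete, here is a toy userspace model,
not kernel code. It only sketches which of the two detectors (the implicit
KASAN check on the read, or the refcount comparison) would fire in each
lifetime state under CONFIG_SLUB_RCU_DEBUG. All names in it (model_anon_vma,
obj_state, model_sanity_check, check_case) are hypothetical and exist only
for illustration.

```c
#include <assert.h>

/* Hypothetical lifetime states of an object under CONFIG_SLUB_RCU_DEBUG. */
enum obj_state {
	OBJ_LIVE,            /* allocated, not passed to the free path */
	OBJ_IN_GRACE_PERIOD, /* free requested, RCU callback still pending */
	OBJ_KASAN_FREED,     /* RCU callback ran, KASAN tracks it as freed */
};

/* Hypothetical stand-in for the anon_vma; only the fields we model. */
struct model_anon_vma {
	enum obj_state state;
	int refcount;
};

enum check_result {
	CHECK_PASS,          /* nothing fires */
	CHECK_REFCOUNT_WARN, /* models the refcount == 0 warning */
	CHECK_KASAN_REPORT,  /* models KASAN flagging the read itself */
};

/*
 * Models reading the refcount and comparing it against zero.
 * The read is what KASAN would instrument, so it is "checked" first.
 */
static enum check_result model_sanity_check(const struct model_anon_vma *av)
{
	if (av->state == OBJ_KASAN_FREED)
		return CHECK_KASAN_REPORT;
	if (av->refcount == 0)
		return CHECK_REFCOUNT_WARN;
	return CHECK_PASS;
}

/* Convenience wrapper: build a model object and run the check on it. */
static enum check_result check_case(enum obj_state state, int refcount)
{
	struct model_anon_vma av = { .state = state, .refcount = refcount };

	return model_sanity_check(&av);
}
```

In this model, check_case(OBJ_IN_GRACE_PERIOD, 0) yields CHECK_REFCOUNT_WARN
and check_case(OBJ_KASAN_FREED, 0) yields CHECK_KASAN_REPORT. The one gap is
check_case(OBJ_IN_GRACE_PERIOD, 1), which yields CHECK_PASS: an object that
was freed while its refcount was still nonzero goes unnoticed until the grace
period ends. That is the unlikely scenario mentioned above.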


