From: Andrew Morton <akpm@linux-foundation.org>
To: Jann Horn <jannh@google.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Matthew Wilcox <willy@infradead.org>,
"Kirill A . Shutemov" <kirill@shutemov.name>,
John Hubbard <jhubbard@nvidia.com>, Jan Kara <jack@suse.cz>,
stable@vger.kernel.org
Subject: Re: [PATCH resend] mm/gup: fix try_grab_compound_head() race with split_huge_page()
Date: Fri, 11 Jun 2021 15:36:24 -0700
Message-ID: <20210611153624.65badf761078f86f76365ab9@linux-foundation.org>
In-Reply-To: <20210611161545.998858-1-jannh@google.com>

On Fri, 11 Jun 2021 18:15:45 +0200 Jann Horn <jannh@google.com> wrote:
> try_grab_compound_head() is used to grab a reference to a page from
> get_user_pages_fast(), which is only protected against concurrent
> freeing of page tables (via local_irq_save()), but not against
> concurrent TLB flushes, freeing of data pages, or splitting of compound
> pages.
>
> Because no reference is held to the page when try_grab_compound_head()
> is called, the page may have been freed and reallocated by the time its
> refcount has been elevated; therefore, once we're holding a stable
> reference to the page, the caller re-checks whether the PTE still points
> to the same page (with the same access rights).
>
> The problem is that try_grab_compound_head() has to grab a reference on
> the head page; but between the time we look up what the head page is and
> the time we actually grab a reference on the head page, the compound
> page may have been split up (either explicitly through split_huge_page()
> or by freeing the compound page to the buddy allocator and then
> allocating its individual order-0 pages).
> If that happens, get_user_pages_fast() may end up returning the right
> page but lifting the refcount on a now-unrelated page, leading to
> use-after-free of pages.
>
> To fix it:
> Re-check whether the pages still belong together after lifting the
> refcount on the head page.
> Move anything else that checks compound_head(page) below the refcount
> increment.
>
> This can't actually happen on bare-metal x86 (because there, disabling
> IRQs locks out remote TLB flushes), but it can happen on virtualized x86
> (e.g. under KVM) and probably also on arm64. The race window is pretty
> narrow, and constantly allocating and shattering hugepages isn't exactly
> fast; for now I've only managed to reproduce this in an x86 KVM guest with
> an artificially widened timing window (by adding a loop that repeatedly
> calls `inl(0x3f8 + 5)` in `try_get_compound_head()` to force VM exits,
> so that PV TLB flushes are used instead of IPIs).
>
> ...
>
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -43,8 +43,21 @@ static void hpage_pincount_sub(struct page *page, int refs)
>
>  	atomic_sub(refs, compound_pincount_ptr(page));
> }
>
> +/* Equivalent to calling put_page() @refs times. */
> +static void put_page_refs(struct page *page, int refs)
> +{
> +	VM_BUG_ON_PAGE(page_ref_count(page) < refs, page);

I don't think there's a need to nuke the whole kernel in this case.
Can we warn and then simply leak the page? That way we have a much
better chance of getting a good bug report.

> +	/*
> +	 * Calling put_page() for each ref is unnecessarily slow. Only the last
> +	 * ref needs a put_page().
> +	 */
> +	if (refs > 1)
> +		page_ref_sub(page, refs - 1);
> +	put_page(page);
> +}
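
For illustration only, here is a rough userspace sketch of what the
warn-and-leak idea above could look like for put_page_refs(). It is not
kernel code: struct page, page_ref_count() and put_page() are simplified
stand-ins, and warn_once() is a hypothetical helper invented for this
sketch. In the kernel this would presumably be written with one of the
VM_WARN_ON-style macros instead.

```c
/*
 * Userspace model only. struct page, page_ref_count() and put_page() are
 * simplified stand-ins for the kernel versions; warn_once() is a
 * hypothetical helper that exists only in this sketch.
 */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct page {
	atomic_int refcount;	/* stand-in for the page reference count */
	bool freed;		/* set when the last reference is dropped */
};

static int page_ref_count(struct page *page)
{
	return atomic_load(&page->refcount);
}

/* Drop one reference; mark the page freed when the count reaches zero. */
static void put_page(struct page *page)
{
	if (atomic_fetch_sub(&page->refcount, 1) == 1)
		page->freed = true;
}

/* Hypothetical: print a warning the first time cond is true, return cond. */
static bool warn_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Equivalent to calling put_page() @refs times, except that an underflow
 * is reported and the page is leaked rather than stopping the system.
 */
static void put_page_refs(struct page *page, int refs)
{
	assert(refs > 0);	/* precondition of this model */

	if (warn_once(page_ref_count(page) < refs, "page refcount underflow"))
		return;		/* leak: leave the refcount untouched */

	/* Only the last reference needs the full put_page() path. */
	if (refs > 1)
		atomic_fetch_sub(&page->refcount, refs - 1);
	put_page(page);
}
```

With this shape, a caller that tries to drop more references than the
page holds gets a warning and the page stays allocated, so the machine
survives long enough to produce a report.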