linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: James Houghton <jthoughton@google.com>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	 linux-mm@kvack.org, Adrian Hunter <adrian.hunter@intel.com>,
	 Al Viro <viro@zeniv.linux.org.uk>,
	Alex Williamson <alex.williamson@redhat.com>,
	 Alexander Potapenko <glider@google.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	 Andrey Konovalov <andreyknvl@gmail.com>,
	Andrey Ryabinin <ryabinin.a.a@gmail.com>,
	 Christian Brauner <brauner@kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	Daniel Vetter <daniel@ffwll.ch>,  Dave Airlie <airlied@gmail.com>,
	Dimitri Sivanich <dimitri.sivanich@hpe.com>,
	 Dmitry Vyukov <dvyukov@google.com>,
	Ian Rogers <irogers@google.com>, Jason Gunthorpe <jgg@ziepe.ca>,
	 Jiri Olsa <jolsa@kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	 "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Lorenzo Stoakes <lstoakes@gmail.com>,
	 Mark Rutland <mark.rutland@arm.com>,
	Matthew Wilcox <willy@infradead.org>,
	 Miaohe Lin <linmiaohe@huawei.com>,
	Michal Hocko <mhocko@kernel.org>,
	 Mike Kravetz <mike.kravetz@oracle.com>,
	Mike Rapoport <rppt@kernel.org>,
	 Muchun Song <muchun.song@linux.dev>,
	Namhyung Kim <namhyung@kernel.org>,
	 Naoya Horiguchi <naoya.horiguchi@nec.com>,
	 Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	 Roman Gushchin <roman.gushchin@linux.dev>,
	Ryan Roberts <ryan.roberts@arm.com>,
	 SeongJae Park <sj@kernel.org>,
	Shakeel Butt <shakeelb@google.com>,
	Uladzislau Rezki <urezki@gmail.com>,
	 Vincenzo Frascino <vincenzo.frascino@arm.com>,
	Yu Zhao <yuzhao@google.com>
Subject: Re: [PATCH] mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte comparison
Date: Thu, 29 Jun 2023 21:50:55 -0400	[thread overview]
Message-ID: <CADrL8HUwxLU-UvTLbzp-JM5EqQ2u-91UU4VfAhRrPiu7i3Jhkg@mail.gmail.com> (raw)
In-Reply-To: <20230630013203.1955064-1-jhubbard@nvidia.com>

On Thu, Jun 29, 2023 at 9:32 PM John Hubbard <jhubbard@nvidia.com> wrote:
>
> The following crash happens for me when running the -mm selftests
> (below). Specifically, it happens while running the uffd-stress
> subtests:
>
> kernel BUG at mm/hugetlb.c:7249!
> invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 0 PID: 3238 Comm: uffd-stress Not tainted 6.4.0-hubbard-github+ #109
> Hardware name: ASUS X299-A/PRIME X299-A, BIOS 1503 08/03/2018
> RIP: 0010:huge_pte_alloc+0x12c/0x1a0
> ...
> Call Trace:
>  <TASK>
>  ? __die_body+0x63/0xb0
>  ? die+0x9f/0xc0
>  ? do_trap+0xab/0x180
>  ? huge_pte_alloc+0x12c/0x1a0
>  ? do_error_trap+0xc6/0x110
>  ? huge_pte_alloc+0x12c/0x1a0
>  ? handle_invalid_op+0x2c/0x40
>  ? huge_pte_alloc+0x12c/0x1a0
>  ? exc_invalid_op+0x33/0x50
>  ? asm_exc_invalid_op+0x16/0x20
>  ? __pfx_put_prev_task_idle+0x10/0x10
>  ? huge_pte_alloc+0x12c/0x1a0
>  hugetlb_fault+0x1a3/0x1120
>  ? finish_task_switch+0xb3/0x2a0
>  ? lock_is_held_type+0xdb/0x150
>  handle_mm_fault+0xb8a/0xd40
>  ? find_vma+0x5d/0xa0
>  do_user_addr_fault+0x257/0x5d0
>  exc_page_fault+0x7b/0x1f0
>  asm_exc_page_fault+0x22/0x30
>
> That happens because a BUG() statement in huge_pte_alloc() attempts to
> check that a pte, if present, is a hugetlb pte, but it does so in a
> non-lockless-safe manner that leads to a false BUG() report.
>
> We got here due to a couple of bugs, each of which by itself was not
> quite enough to cause a problem:
>
> First of all, before commit c33c794828f2("mm: ptep_get() conversion"),
> the BUG() statement in huge_pte_alloc() was itself fragile: it relied
> upon compiler behavior to only read the pte once, despite using it twice
> in the same conditional.
>
> Next, commit c33c794828f2 ("mm: ptep_get() conversion") broke that
> delicate situation, by causing all direct pte reads to be done via
> READ_ONCE(). And so READ_ONCE() got called twice within the same BUG()
> conditional, leading to comparing (potentially, occasionally) different
> versions of the pte, and thus to false BUG() reports.
>
> Fix this by taking a single snapshot of the pte before using it in the
> BUG conditional.
>
> Now, that commit is only partially to blame here but, people doing
> bisections will invariably land there, so this will help them find a fix
> for a real crash. And also, the previous behavior was unlikely to ever
> expose this bug--it was fragile, yet not actually broken.
>
> So that's why I chose this commit for the Fixes tag, rather than the
> commit that created the original BUG() statement.
>
> Fixes: c33c794828f2 ("mm: ptep_get() conversion")

Hi John,

Good catch, and thanks for the detailed explanation. It looks like
riscv and powerpc have equivalent problems in their huge_pte_alloc
implementations, perhaps it's worth taking a look at those. (riscv
looks like it has precisely the same problem except it's a WARN, but
powerpc looks more interesting.)

Either way,

Acked-by: James Houghton <jthoughton@google.com>


  reply	other threads:[~2023-06-30  1:51 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-30  1:32 John Hubbard
2023-06-30  1:50 ` James Houghton [this message]
2023-06-30  2:06   ` John Hubbard
2023-06-30  2:13 ` Muchun Song
2023-06-30 10:07 ` Ryan Roberts
2023-07-01  0:21   ` John Hubbard
2023-07-03 21:39 ` Mike Kravetz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CADrL8HUwxLU-UvTLbzp-JM5EqQ2u-91UU4VfAhRrPiu7i3Jhkg@mail.gmail.com \
    --to=jthoughton@google.com \
    --cc=adrian.hunter@intel.com \
    --cc=airlied@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.williamson@redhat.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=andreyknvl@gmail.com \
    --cc=brauner@kernel.org \
    --cc=daniel@ffwll.ch \
    --cc=dimitri.sivanich@hpe.com \
    --cc=dvyukov@google.com \
    --cc=glider@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=hch@infradead.org \
    --cc=irogers@google.com \
    --cc=jgg@ziepe.ca \
    --cc=jhubbard@nvidia.com \
    --cc=jolsa@kernel.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linmiaohe@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lstoakes@gmail.com \
    --cc=mark.rutland@arm.com \
    --cc=mhocko@kernel.org \
    --cc=mike.kravetz@oracle.com \
    --cc=muchun.song@linux.dev \
    --cc=namhyung@kernel.org \
    --cc=naoya.horiguchi@nec.com \
    --cc=oleksandr_tyshchenko@epam.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=roman.gushchin@linux.dev \
    --cc=rppt@kernel.org \
    --cc=ryabinin.a.a@gmail.com \
    --cc=ryan.roberts@arm.com \
    --cc=shakeelb@google.com \
    --cc=sj@kernel.org \
    --cc=urezki@gmail.com \
    --cc=vincenzo.frascino@arm.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    --cc=yuzhao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox