From: Jann Horn <jannh@google.com>
To: Linux-MM <linux-mm@kvack.org>,
Peter Zijlstra <peterz@infradead.org>,
Christoph Hellwig <hch@lst.de>
Cc: kernel list <linux-kernel@vger.kernel.org>,
David Hildenbrand <david@redhat.com>,
Jason Gunthorpe <jgg@mellanox.com>
Subject: ptep_get_lockless() on 32-bit x86/mips/sh looks wrong
Date: Thu, 6 Oct 2022 17:23:59 +0200
Message-ID: <CAG48ez3h-mnp9ZFC10v+-BW_8NQvxbwBsMYJFP8JX31o0B17Pg@mail.gmail.com>
ptep_get_lockless() does the following under CONFIG_GUP_GET_PTE_LOW_HIGH:

	pte_t pte;
	do {
		pte.pte_low = ptep->pte_low;
		smp_rmb();
		pte.pte_high = ptep->pte_high;
		smp_rmb();
	} while (unlikely(pte.pte_low != ptep->pte_low));

It has a comment above it that argues that this is correct because:
1. A present PTE can't become non-present and then become a present
   PTE pointing to another page without a TLB flush in between.
2. TLB flushes involve IPIs.

As far as I can tell, in particular on x86, _both_ of those
assumptions are false; perhaps on mips and sh only one of them is?

Number 2 is straightforward: X86 can run under hypervisors, and when
it runs under hypervisors, the MMU paravirtualization code (including
the KVM version) can implement remote TLB flushes without IPIs.

Number 1 is gnarlier, because breaking that assumption implies that
there can be a situation where different threads see different memory
at the same virtual address because their TLBs are incoherent. But as
far as I know, it can happen when MADV_DONTNEED races with an
anonymous page fault, because zap_pte_range() does not always flush
stale TLB entries before dropping the page table lock. I think that's
probably fine, since it's a "garbage in, garbage out" kind of
situation - but if a concurrent GUP-fast can then theoretically end up
returning a completely unrelated page, that's bad.
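To make the concern concrete, here is a minimal single-threaded userspace sketch that replays one possible interleaving of the low/high/low reader against a writer that zaps a PTE and then installs a new one. Everything in it is an assumption for illustration: `struct split_pte`, `sim_pte_clear()`, `sim_set_pte()` and `sim_torn_read()` are hypothetical names, not kernel code, and the write order (low half cleared first, high half installed first) is only my understanding of how the split-PTE helpers behave. Memory barriers are omitted because the steps are executed sequentially. Whether this interleaving is reachable in practice depends on exactly the two assumptions questioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a split 64-bit PTE; not the kernel's pte_t. */
struct split_pte {
	uint32_t pte_low;
	uint32_t pte_high;
};

/* Assumed writer order when zapping: low half first, then high half. */
static void sim_pte_clear(struct split_pte *ptep)
{
	ptep->pte_low = 0;
	ptep->pte_high = 0;
}

/* Assumed writer order when installing: high half first, then low half. */
static void sim_set_pte(struct split_pte *ptep, struct split_pte val)
{
	ptep->pte_high = val.pte_high;
	ptep->pte_low = val.pte_low;
}

/*
 * Replay one interleaving: the reader loads the low half, the old entry
 * is zapped, the reader loads the high half, a new entry is installed,
 * and the reader rechecks the low half. Returns what the reader saw and
 * reports whether its recheck would have accepted that value.
 */
static struct split_pte sim_torn_read(struct split_pte old_pte,
				      struct split_pte new_pte,
				      bool *recheck_passed)
{
	struct split_pte slot = old_pte;
	struct split_pte seen;

	seen.pte_low = slot.pte_low;	/* reader: first load of low half */
	sim_pte_clear(&slot);		/* writer: zap the old entry */
	seen.pte_high = slot.pte_high;	/* reader: load of high half */
	sim_set_pte(&slot, new_pte);	/* writer: install the new entry */
	*recheck_passed = (seen.pte_low == slot.pte_low);
	return seen;
}
```

If the old and new entries happen to share the same low half but differ in the high half, the recheck passes and the reader ends up with the shared low half combined with a zero high half, a value that was never installed as a PTE.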
Sadly, mips and sh don't define arch_cmpxchg_double(), so we can't
just change ptep_get_lockless() to use arch_cmpxchg_double() and be
done with it...
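For readers unfamiliar with the idea, one way a double-width compare-and-exchange can serve as an atomic read is to attempt an exchange that never changes the stored value. The sketch below is a userspace analogy using C11 atomics on a single 64-bit word, not kernel code; `read_pte64_atomic()` is a hypothetical name, and the real arch_cmpxchg_double() operates on two adjacent words with a different calling convention. On some 32-bit targets this may need libatomic or a suitable -march setting to build.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: read a 64-bit slot as one unit by attempting a
 * compare-and-exchange that cannot change the stored value.
 */
static uint64_t read_pte64_atomic(_Atomic uint64_t *slot)
{
	uint64_t expected = 0;

	/*
	 * If the slot holds 0, this stores 0 over 0, which changes nothing.
	 * Otherwise the exchange fails and the current contents are copied
	 * into expected. Either way, expected ends up holding a value that
	 * was in the slot at a single point in time.
	 */
	atomic_compare_exchange_strong(slot, &expected, (uint64_t)0);
	return expected;
}
```

Unlike the low/high/low loop, this cannot combine halves from two different values, but it requires write access to the slot and an architecture that provides the double-width primitive, which is the obstacle on mips and sh.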