From: Jann Horn
Date: Thu, 6 Oct 2022 17:23:59 +0200
Subject: ptep_get_lockless() on 32-bit x86/mips/sh looks wrong
To: Linux-MM, Peter Zijlstra, Christoph Hellwig
Cc: kernel list, David Hildenbrand, Jason Gunthorpe

ptep_get_lockless() does the following under CONFIG_GUP_GET_PTE_LOW_HIGH:

    pte_t pte;
    do {
        pte.pte_low = ptep->pte_low;
        smp_rmb();
        pte.pte_high = ptep->pte_high;
        smp_rmb();
    } while (unlikely(pte.pte_low != ptep->pte_low));

It has a comment above it that argues that this is correct because:

1. A present PTE can't become non-present and then become a present
   PTE pointing to another page without a TLB flush in between.
2. TLB flushes involve IPIs.

As far as I can tell, in particular on x86, _both_ of those assumptions
are false; perhaps on mips and sh only one of them is?

Number 2 is straightforward: x86 can run under hypervisors, and when it
runs under hypervisors, the MMU paravirtualization code (including the
KVM version) can implement remote TLB flushes without IPIs.
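
To make the concern concrete, here is a minimal userspace sketch of the interleaving that the two assumptions are meant to rule out. It is not kernel code: `struct split_pte`, `scripted_writer()`, `read_low_high()`, `torn_read_demo()` and the PTE values are all made up for illustration, and the "other CPU" is simulated by a scripted function called at the points where the real loop has smp_rmb().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit PTE stored as two 32-bit halves. */
struct split_pte {
	uint32_t pte_low;
	uint32_t pte_high;
};

/* Two made-up PTE values that differ in both halves. */
static const struct split_pte PTE_A = { 0x00001067u, 0x00000001u };
static const struct split_pte PTE_B = { 0x00005067u, 0x00000002u };

static struct split_pte slot;	/* the entry the reader looks at */
static int writer_step;

/* Scripted "other CPU": runs where the real loop has smp_rmb(). */
static void scripted_writer(void)
{
	assert(writer_step < 2);	/* the script only has two steps */

	switch (writer_step++) {
	case 0:
		slot = PTE_B;	/* A was zapped and replaced by B */
		break;
	case 1:
		slot = PTE_A;	/* B was zapped, A's page is mapped again */
		break;
	}
}

/* Same control flow as the quoted loop, with the writer interleaved. */
static struct split_pte read_low_high(void)
{
	struct split_pte pte;

	do {
		pte.pte_low = slot.pte_low;
		scripted_writer();
		pte.pte_high = slot.pte_high;
		scripted_writer();
	} while (pte.pte_low != slot.pte_low);

	return pte;
}

/* Returns 1 if the reader got A's low half combined with B's high half. */
int torn_read_demo(void)
{
	struct split_pte got;

	slot = PTE_A;
	writer_step = 0;
	got = read_low_high();

	return got.pte_low == PTE_A.pte_low && got.pte_high == PTE_B.pte_high;
}
```

In this script the entry changes twice while the reader is inside the loop, and the final pte_low comparison still passes because the low half has returned to its original value. The reader leaves with a value that was never installed in the slot.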

Number 1 is gnarlier, because breaking that assumption implies that
there can be a situation where different threads see different memory at
the same virtual address because their TLBs are incoherent. But as far as
I know, it can happen when MADV_DONTNEED races with an anonymous page
fault, because zap_pte_range() does not always flush stale TLB entries
before dropping the page table lock. I think that's probably fine, since
it's a "garbage in, garbage out" kind of situation - but if a concurrent
GUP-fast can then theoretically end up returning a completely unrelated
page, that's bad.

Sadly, mips and sh don't define arch_cmpxchg_double(), so we can't just
change ptep_get_lockless() to use arch_cmpxchg_double() and be done with
it...
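
For illustration only, this is roughly what "read the whole entry with one double-width compare-and-exchange" looks like in userspace C11. It is a sketch under assumptions, not the kernel helper: `atomic_pte_t` and `pte_read_atomic()` are hypothetical names, and C11 `atomic_compare_exchange_strong()` stands in for an arch-specific primitive. On 32-bit targets a 64-bit `_Atomic` object may not be lock-free, and the build may need libatomic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in: the whole 64-bit PTE as one atomic object. */
typedef _Atomic uint64_t atomic_pte_t;

/*
 * Read a 64-bit value in a single atomic operation by comparing it with
 * a guess and "exchanging" it for that same guess.
 */
uint64_t pte_read_atomic(atomic_pte_t *ptep)
{
	uint64_t val = 0;

	assert(ptep);

	/*
	 * If *ptep == 0, the CAS succeeds and stores 0 over 0, so nothing
	 * visibly changes. Otherwise it fails and copies the current
	 * contents into val. Either way, val is a consistent snapshot of
	 * both halves.
	 */
	atomic_compare_exchange_strong(ptep, &val, val);
	return val;
}
```

Unlike the low/high loop, this never combines halves from two different entries, but it relies on the architecture providing a 64-bit atomic operation, which is exactly what is missing on mips and sh.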