Date: Tue, 15 Jun 2021 10:46:39 +0100
From: Will Deacon
To: Jason Gunthorpe
Cc: Hugh Dickins, "Kirill A. Shutemov", Andrew Morton, "Kirill A. Shutemov",
    Yang Shi, Wang Yugui, Matthew Wilcox, Alistair Popple, Ralph Campbell,
    Zi Yan, Peter Xu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 03/11] mm: page_vma_mapped_walk(): use pmd_read_atomic()
Message-ID: <20210615094639.GC19878@willie-the-truck>
In-Reply-To: <20210611194249.GS1096940@ziepe.ca>

On Fri, Jun 11, 2021 at 04:42:49PM -0300, Jason Gunthorpe wrote:
> On Fri, Jun 11, 2021 at 12:05:42PM -0700, Hugh Dickins wrote:
> > > diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h
> > > index e896ebef8c24cb..0bf1fdec928e71 100644
> > > +++ b/arch/x86/include/asm/pgtable-3level.h
> > > @@ -75,7 +75,7 @@ static inline void native_set_pte(pte_t *ptep, pte_t pte)
> > >  static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
> > >  {
> > >  	pmdval_t ret;
> > > -	u32 *tmp = (u32 *)pmdp;
> > > +	u32 *tmp = READ_ONCE((u32 *)pmdp);
> > >
> > >  	ret = (pmdval_t) (*tmp);
> > >  	if (ret) {
> > > @@ -84,7 +84,7 @@ static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
> > >  		 * or we can end up with a partial pmd.
> > >  		 */
> > >  		smp_rmb();
> > > -		ret |= ((pmdval_t)*(tmp + 1)) << 32;
> > > +		ret |= READ_ONCE((pmdval_t)*(tmp + 1)) << 32;
> > >  	}
> >
> > Maybe that. Or maybe now (since Will's changes) it can just do
> > one READ_ONCE() of the whole, then adjust its local copy.
>
> I think the smp_rmb() is critical here to ensure a PTE table pointer
> is coherent, READ_ONCE is not a substitute, unless I am
> misunderstanding what Will's changes are???

Yes, I agree that the barrier is needed here for x86 PAE. I would really
have liked to enforce native-sized access in READ_ONCE(), but unfortunately
there is plenty of code out there which is resilient to a 64-bit access
being split into two separate 32-bit accesses and so I wasn't able to go
that far.

That being said, pmd_read_atomic() probably _should_ be using READ_ONCE()
because using it inconsistently can give rise to broken codegen, e.g. if
you do:

	pmdval_t x, y, z;

	x = *pmdp;			// Invalid
	y = READ_ONCE(*pmdp);		// Valid
	if (pmd_valid(y))
		z = *pmdp;		// Invalid again!

Then the compiler can allocate the same register for x and z, but will
issue an additional load for y. If a concurrent update takes place to the
pmd which transitions from Invalid -> Valid, then it will look as though
things went back in time, because z will be stale. We actually hit this
on arm64 in practice [1].

Will

[1] https://lore.kernel.org/lkml/20171003114244.430374928@linuxfoundation.org/
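
For anyone following along, here is a rough user-space sketch of the split
read discussed in this thread, with both halves loaded through a
READ_ONCE()-style access so the compiler cannot reuse a stale register. It is
not the kernel implementation. READ_ONCE_SKETCH, struct pmd_sketch and
pmd_read_split() are hypothetical names, and the acquire fence is only an
assumed stand-in for smp_rmb().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's READ_ONCE(): the volatile access
 * makes the compiler emit a real load here instead of reusing a register. */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Assumed layout: a 64-bit entry kept as two 32-bit halves, as on x86 PAE. */
struct pmd_sketch {
	uint32_t low;	/* low word, zero when the entry is not populated */
	uint32_t high;
};

static_assert(sizeof(struct pmd_sketch) == 8, "entry must be 64 bits wide");

static inline uint64_t pmd_read_split(struct pmd_sketch *pmdp)
{
	uint64_t ret = READ_ONCE_SKETCH(pmdp->low);

	if (ret) {
		/* Stand-in for smp_rmb(): order the low-half load before the
		 * high-half load so we do not assemble a partial entry. */
		__atomic_thread_fence(__ATOMIC_ACQUIRE);
		ret |= (uint64_t)READ_ONCE_SKETCH(pmdp->high) << 32;
	}

	return ret;
}
```

The barrier and the marked accesses do different jobs here: the fence orders
the two 32-bit loads, while the READ_ONCE()-style accesses stop the compiler
from merging them with earlier plain loads of the same location.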