Message-ID: <6dfa35b0-7e1d-5b0d-1ee8-d7f5d58b4ed0@arm.com>
Date: Thu, 25 May 2023 10:08:57 +0100
Subject: Re: [PATCH v2 4/5] mm: Add new ptep_deref() helper to fully encapsulate pte_t
From: Ryan Roberts
To: Yu Zhao
Cc: Andrew Morton, SeongJae Park, Christoph Hellwig, "Matthew Wilcox (Oracle)", "Kirill A. Shutemov", Lorenzo Stoakes, Uladzislau Rezki, Zi Yan, linux-kernel@vger.kernel.org, linux-mm@kvack.org, damon@lists.linux.dev
References: <20230518110727.2106156-1-ryan.roberts@arm.com> <20230518110727.2106156-5-ryan.roberts@arm.com> <692e9e7e-ee00-368b-6a31-60a895f7011c@arm.com>
In-Reply-To: <692e9e7e-ee00-368b-6a31-60a895f7011c@arm.com>

On 19/05/2023 10:12, Ryan Roberts wrote:
> On 18/05/2023 20:28, Yu Zhao wrote:
>> On Thu, May 18, 2023 at 5:07 AM Ryan Roberts wrote:
>>>
>>> There are many call sites that directly dereference a pte_t pointer.
>>> This makes it very difficult to properly encapsulate a page table in the
>>> arch code without having to allocate shadow page tables. ptep_deref()
>>> aims to solve this by replacing all direct dereferences with a call to
>>> this function.
>>>
>>> The default implementation continues to just dereference the pointer
>>> (*ptep), so generated code should be exactly the same. However, it is
>>> possible for the architecture to override the default with their own
>>> implementation, that can (e.g.) hide certain bits from the core code, or
>>> determine young/dirty status by mixing in state from another source.
>>>
>>> While ptep_get() and ptep_get_lockless() already exist, these are
>>> implemented as atomic accesses (e.g. READ_ONCE() in the default case).
>>> So rather than using ptep_get() and risking performance regressions,
>>> introduce a new variant.
>>
>> We should reuse ptep_get():
>> 1. I don't think READ_ONCE() can cause measurable regressions in this case.
>> 2. It's technically wrong without it.
>
> Can you clarify what you mean by technically wrong? Are you saying that the
> current code that does direct dereferencing is buggy?
>
> I previously convinced myself that the potential for the compiler generating
> multiple loads was safe because the code in question is under the PTL so there
> are no concurrent stores. And we shouldn't see any tearing for the same reason.
>
> That said, if there is consensus that we can just use ptep_get() (==
> READ_ONCE()) everywhere, then I agree that would be cleaner. Does anyone object?

Hi all,

A polite bump: It would be great to hear opinions on this before I go ahead and
make the change.

Thanks,
Ryan
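
For anyone following the thread, here is a minimal userspace sketch of the two
access styles being compared. It is illustrative only: the pte_t layout and the
names ptep_deref_sketch() and ptep_get_sketch() are assumptions made for this
example, not the kernel definitions, and the volatile cast is a simplified
stand-in for READ_ONCE().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's pte_t; the real layout is per-arch. */
typedef struct {
	uint64_t pte;
} pte_t;

/*
 * Plain dereference, as in the proposed default ptep_deref().
 * The compiler is free to reload or cache *ptep, which is only
 * harmless if nothing can store to the entry concurrently
 * (e.g. the caller holds the PTL).
 */
static inline pte_t ptep_deref_sketch(pte_t *ptep)
{
	return *ptep;
}

/*
 * Volatile load, a simplified stand-in for READ_ONCE() as used by
 * the default ptep_get(). The compiler must perform the access once
 * per call and may not elide or repeat it.
 */
static inline pte_t ptep_get_sketch(pte_t *ptep)
{
	pte_t pte;

	pte.pte = *(const volatile uint64_t *)&ptep->pte;
	return pte;
}
```

With no concurrent writer, both helpers return the same value; they differ only
in what the compiler is allowed to do with the load, which is the point raised
in the review above.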