From: Ryan Roberts
Date: Tue, 20 Feb 2024 20:58:49 +0100
Subject: Re: [PATCH v6 12/18] arm64/mm: Wire up PTE_CONT for user mappings
Message-ID: <9cb2b8c6-aac8-4130-8558-6646817689e0@arm.com>
To: Catalin Marinas
Cc: Will Deacon, Ard Biesheuvel, Marc Zyngier, James Morse,
 Andrey Ryabinin, Andrew Morton, Matthew Wilcox, Mark Rutland,
 David Hildenbrand, Kefeng Wang, John Hubbard, Zi Yan,
 Barry Song <21cnbao@gmail.com>, Alistair Popple, Yang Shi,
 Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 "H. Peter Anvin", linux-arm-kernel@lists.infradead.org, x86@kernel.org,
 linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
References: <20240215103205.2607016-1-ryan.roberts@arm.com>
 <20240215103205.2607016-13-ryan.roberts@arm.com>
 <892caa6a-e4fe-4009-aa33-0570526961c5@arm.com>

On 19/02/2024 15:18, Catalin Marinas wrote:
> On Fri, Feb 16, 2024 at 12:53:43PM +0000, Ryan Roberts wrote:
>> On 16/02/2024 12:25, Catalin Marinas wrote:
>>> On Thu, Feb 15, 2024 at 10:31:59AM +0000, Ryan Roberts wrote:
>>>> +pte_t contpte_ptep_get_lockless(pte_t *orig_ptep)
>>>> +{
>>>> +	/*
>>>> +	 * Gather access/dirty bits, which may be populated in any of the ptes
>>>> +	 * of the contig range. We may not be holding the PTL, so any contiguous
>>>> +	 * range may be unfolded/modified/refolded under our feet. Therefore we
>>>> +	 * ensure we read a _consistent_ contpte range by checking that all ptes
>>>> +	 * in the range are valid and have CONT_PTE set, that all pfns are
>>>> +	 * contiguous and that all pgprots are the same (ignoring access/dirty).
>>>> +	 * If we find a pte that is not consistent, then we must be racing with
>>>> +	 * an update so start again. If the target pte does not have CONT_PTE
>>>> +	 * set then that is considered consistent on its own because it is not
>>>> +	 * part of a contpte range.
>>>> +	 */
> [...]
>>> After writing the comments above, I think I figured out that the whole
>>> point of this loop is to check that the ptes in the contig range are
>>> still consistent and the only variation allowed is the dirty/young
>>> state to be passed to the orig_pte returned. The original pte may have
>>> been updated by the time this loop finishes but I don't think it
>>> matters, it wouldn't be any different than reading a single pte and
>>> returning it while it is being updated.
>>
>> Correct. The pte can be updated at any time, before after or during the reads.
>> That was always the case. But now we have to cope with a whole contpte block
>> being repainted while we are reading it.
>> So we are just checking to make sure that all the ptes that we read from the
>> contpte block are consistent with each other and therefore we can trust that
>> the access/dirty bits we gathered are consistent.
>
> I've been thinking a bit more about this - do any of the callers of
> ptep_get_lockless() check the dirty/access bits? The only one that seems
> to care is ptdump but in that case I'd rather see the raw bits for
> debugging rather than propagating the dirty/access bits to the rest in
> the contig range.
>
> So with some clearer documentation on the requirements, I think we don't
> need an arm64-specific ptep_get_lockless() (unless I missed something).

We've discussed something similar at [1]. And I've posted an RFC series to
convert all ptep_get_lockless() to ptep_get_lockless_norecency() at [2]. The
current spec for ptep_get_lockless() is that it includes the access and dirty
bits. So we can't just read the single pte - if there is a TLB eviction followed
by re-population for the block, the access/dirty bits could move to a different
pte in the block, and that would break the pte_same() comparisons which are used
in places.

So the previous conclusion was that we are ok to put this arm64-specific
ptep_get_lockless() in for now, but look to simplify by migrating to
ptep_get_lockless_norecency() in future. Are you ok with that approach?

[1] https://lore.kernel.org/linux-mm/a91cfe1c-289e-4828-8cfc-be34eb69a71b@redhat.com/
[2] https://lore.kernel.org/linux-mm/20240215121756.2734131-1-ryan.roberts@arm.com/

Thanks,
Ryan
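
For anyone following the thread without the patch to hand, the consistency
check described in the quoted comment can be modelled in user space roughly as
below. This is only an illustrative sketch, not the code from the patch: the
model_* names, the M_* bit layout and the block size of 16 are all assumptions
made up for the example, and plain C11 relaxed atomic loads stand in for the
kernel's pte accessors.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout for this model only; not the arm64 descriptor format. */
#define M_VALID     (UINT64_C(1) << 0)
#define M_CONT      (UINT64_C(1) << 1)
#define M_YOUNG     (UINT64_C(1) << 2)    /* access flag */
#define M_DIRTY     (UINT64_C(1) << 3)
#define M_PROT_MASK (UINT64_C(0xff) << 4) /* remaining permission bits */
#define M_PFN_SHIFT 12
#define M_CONT_PTES 16                    /* entries per block (assumed) */

typedef _Atomic uint64_t model_pte_t;

/* Build a model entry from a pfn, permission bits and flag bits. */
static uint64_t model_mk(uint64_t pfn, uint64_t prot, uint64_t flags)
{
    assert((flags & ~(M_VALID | M_CONT | M_YOUNG | M_DIRTY)) == 0);
    return (pfn << M_PFN_SHIFT) | ((prot << 4) & M_PROT_MASK) | flags;
}

static int model_valid_cont(uint64_t pte)
{
    return (pte & (M_VALID | M_CONT)) == (M_VALID | M_CONT);
}

/*
 * Read entry idx without a lock. If it belongs to a contiguous block, scan
 * the whole block, gather young/dirty from every entry, and start again if
 * any entry disagrees on valid+cont, pfn sequence or permission bits.
 * The table is assumed to start on a block boundary.
 */
static uint64_t model_get_lockless(model_pte_t *table, size_t idx)
{
    for (;;) {
        uint64_t orig = atomic_load_explicit(&table[idx], memory_order_relaxed);

        /* Not part of a contiguous block: consistent on its own. */
        if (!model_valid_cont(orig))
            return orig;

        size_t start = idx & ~(size_t)(M_CONT_PTES - 1);
        uint64_t base_pfn = (orig >> M_PFN_SHIFT) - (idx - start);
        uint64_t prot = orig & M_PROT_MASK;
        uint64_t result = orig;
        int consistent = 1;

        for (size_t i = 0; i < M_CONT_PTES; i++) {
            uint64_t pte = atomic_load_explicit(&table[start + i],
                                                memory_order_relaxed);

            if (!model_valid_cont(pte) ||
                (pte >> M_PFN_SHIFT) != base_pfn + i ||
                (pte & M_PROT_MASK) != prot) {
                consistent = 0; /* racing with an update: retry */
                break;
            }
            result |= pte & (M_YOUNG | M_DIRTY);
        }

        if (consistent)
            return result;
    }
}
```

In this model, if the dirty bit happens to be set only on entry 5 of a block, a
raw read of entry 2 does not show it while model_get_lockless(table, 2) does,
which is the kind of difference that would upset a pte_same()-style comparison
between two reads.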