From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 24771CF3959 for ; Thu, 19 Sep 2024 15:49:12 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 350E06B008C; Thu, 19 Sep 2024 11:49:12 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 302DC6B0092; Thu, 19 Sep 2024 11:49:12 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1C8AF6B0093; Thu, 19 Sep 2024 11:49:12 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id F27AD6B008C for ; Thu, 19 Sep 2024 11:49:11 -0400 (EDT) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 3BA9A140E53 for ; Thu, 19 Sep 2024 15:49:11 +0000 (UTC) X-FDA: 82581921702.24.5A12BAD Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf02.hostedemail.com (Postfix) with ESMTP id F054C8001F for ; Thu, 19 Sep 2024 15:49:08 +0000 (UTC) Authentication-Results: imf02.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf02.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1726760823; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kg/TDUOyxQVvoJj+SaV4+ITNnLyD7pDUx+3Sukv02tg=; b=ujMFiYS8uBETJE8iPxgGFe8yM/RgHalhngxB19RVHCo4Of7aDMcpyY8hdsVcsGJM1lu86/ yLxHbmpueawEGV2lRl2J3jXdVuRtLtl5ak4sIerybMXnBpN1cp+IBIod5Jcgv8zpzc1zG5 v9K5lGo+88oaEfVU7uHYycka5zrl+EE= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1726760823; a=rsa-sha256; cv=none; b=yRvNohwAMsnc0cpRKopAhGIwCdzpb8Y+Kc2MLRg8SlBahBwQVfxda1XLi0aW9de7FprJ4/ eqHT9Kd3c9MMDKYORvsdH6nq+I7dlfApM6DUBpEiYc9q7GaVqNUFtMHo8D2xExp70ozKrb HN0O8I4qVCAqOXdO+7XfHjVGysdkYT8= ARC-Authentication-Results: i=1; imf02.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf02.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 41B6AFEC; Thu, 19 Sep 2024 08:49:37 -0700 (PDT) Received: from [10.57.82.79] (unknown [10.57.82.79]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id E30043F71A; Thu, 19 Sep 2024 08:49:00 -0700 (PDT) Message-ID: <82fa108e-5b15-435a-8b61-6253766c7d88@arm.com> Date: Thu, 19 Sep 2024 17:48:58 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries Content-Language: en-GB To: "Russell King (Oracle)" , Anshuman Khandual Cc: kernel test robot , linux-mm@kvack.org, llvm@lists.linux.dev, oe-kbuild-all@lists.linux.dev, Andrew Morton , David Hildenbrand , "Mike Rapoport (IBM)" , Arnd Bergmann , x86@kernel.org, linux-m68k@lists.linux-m68k.org, linux-fsdevel@vger.kernel.org, kasan-dev@googlegroups.com, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Dimitri Sivanich , Alexander Viro , Muchun Song , Andrey Ryabinin , Miaohe Lin , Dennis Zhou , Tejun Heo , Christoph Lameter , Uladzislau Rezki , Christoph Hellwig References: <20240917073117.1531207-8-anshuman.khandual@arm.com> <202409190310.ViHBRe12-lkp@intel.com> <8f43251a-5418-4c54-a9b0-29a6e9edd879@arm.com> From: Ryan Roberts In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: F054C8001F X-Stat-Signature: 1bt4sdabxqg156f83wm6kj3tzby9mpim X-Rspam-User: X-HE-Tag: 1726760948-234884 X-HE-Meta: U2FsdGVkX19cQohUIk0DEQ/6KG7M79suohYh7yh7ABpjGUsmQDzEkLKonfGv+SyrbQmn6Klaop99uWeKZLeP9o1WewoZRxk65xFieKIWPT/UvaUTOwY/jAB908OJUfnKu0iGTf97FcIYQt142SX0tccuysZ4ie6bbsYMutdpB3MbG6NFunaZ91dEcF3gyenpozD6+9KbKSY7SzVMPiL98dFlp1Drs4sV/ssoURRH5dnat93C/qn5Nrq1ocSP6+6LbsYlIpUroAjcA1NoRg367vKay6eUiX84Cpr+z/UVZG+lC+qDrBzGX6dfw7atotCAphdRKYnrNA7ME+7Z2mjlX0CjuBQ3z5Xk5Yz6BcvQ3ojAF2J4Kmvf+2L+IIcvuXg8zSbncohtT3PmEy3zuyLnMpBtQ0EMLJnf+W0nTztsUj2IIN896/tdzGcC7ygEX7iFGDP5xgwZZ4nB++f6qEkgHpcmDbwWpbcb4qL+/U71GgX8nl50kvPfxn8ZOCclwoaJiSkKup8gmdlKFybuEMCqTJrEkQWWaINwmQplPdAtu943c5VIDDrEI0jGj+JeWraZiPA87DUFKHEZYbHmLV9T5jG+r6PZ46EWJZg4Zqof1nyT1V5yQaNe2lItKtbjVA/1IjyVdBYWJ4iiug+2JJQ81/gCCDHq2dw+pFMy/ApItoKTNYqWctT7rEA2BJDLX4ENAT5vyi8Iut90Dj8xCLurGavG0H08OT0npf+157U3GipQ+1TQZ+sKP/lDzVjwXx+kvejKAaHm0W3RDp91kr/89EFpjbCMWbo8BdiC/LQACp0VeBbgpicGo3pBXAZn+rncnheJt1FC04ql+aPmRPDCB7mRWdjpwoWn1OtW7VfcKhcTd2Q6nZ0YPgRTI/oAjIZRyBrX4W1HYXIV7bEDvPJWfkmMxwFDQgRoQBxZ0xiArZNQHqwWCiFNiMBO2sZ6uQpP1Yf1vHP7eAfG2Bnt/aR ZNVIAdHW /FPmpX6YG+kO7BC26rvZaIa46NlepActS7AHxgljclvyc0avlRwrKUWTvTHb0GD9P3tW+T2n9xesNPAmZ66ZFTZ8QuApCF66bNPqegF2To5fnY4YvhFMBtWryjJR5gfvn1jfD6IYIydsK1P8pgFhWhnPUvTqOxw4FbBJe+6gsKKhKlZgiX5rTAp65/D8K1KjhlNZZoHDBsKTritLz5qlqeGpLyq05zkLLlCr7vDDIm6DuJz8svcjzZ3e5kblsGIHgxfgpyQlLzJDhHTU= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 19/09/2024 10:11, Russell King (Oracle) wrote: > On Thu, Sep 19, 2024 at 01:25:08PM +0530, Anshuman Khandual wrote: >> arm (32) platform currently overrides pgdp_get() helper in the platform but >> defines that like the exact same version as the generic one, albeit with a >> typo which can be fixed with something like this. > > pgdp_get() was added to arm in eba2591d99d1 ("mm: Introduce > pudp/p4dp/pgdp_get() functions") with the typo you've spotted. It seems > it was added with no users, otherwise the error would have been spotted > earlier. I'm not a fan of adding dead code to the kernel for this > reason. > >> Regardless there is another problem here. On arm platform there are multiple >> pgd_t definitions available depending on various configs but some are arrays >> instead of a single data element, although platform pgdp_get() helper remains >> the same for all. >> >> arch/arm/include/asm/page-nommu.h:typedef unsigned long pgd_t[2]; >> arch/arm/include/asm/pgtable-2level-types.h:typedef struct { pmdval_t pgd[2]; } pgd_t; >> arch/arm/include/asm/pgtable-2level-types.h:typedef pmdval_t pgd_t[2]; >> arch/arm/include/asm/pgtable-3level-types.h:typedef struct { pgdval_t pgd; } pgd_t; >> arch/arm/include/asm/pgtable-3level-types.h:typedef pgdval_t pgd_t; >> >> I guess it might need different pgdp_get() variants depending applicable pgd_t >> definition. Will continue looking into this further but meanwhile copied Russel >> King in case he might be able to give some direction. > > That's Russel*L*, thanks. > > 32-bit arm uses, in some circumstances, an array because each level 1 > page table entry is actually two descriptors. It needs to be this way > because each level 2 table pointed to by each level 1 entry has 256 > entries, meaning it only occupies 1024 bytes in a 4096 byte page. > > In order to cut down on the wastage, treat the level 1 page table as > groups of two entries, which point to two consecutive 1024 byte tables > in the level 2 page. > > The level 2 entry isn't suitable for the kernel's use cases (there are > no bits to represent accessed/dirty and other important stuff that the > Linux MM wants) so we maintain the hardware page tables and a separate > set that Linux uses in the same page. Again, the software tables are > consecutive, so from Linux's perspective, the level 2 page tables > have 512 entries in them and occupy one full page. > > This is documented in arch/arm/include/asm/pgtable-2level.h > > However, what this means is that from the software perspective, the > level 1 page table descriptors are an array of two entries, both of > which need to be setup when creating a level 2 page table, but only > the first one should ever be dereferenced when walking the tables, > otherwise the code that walks the second level of page table entries > will walk off the end of the software table into the actual hardware > descriptors. > > I've no idea what the idea is behind introducing pgd_get() and what > it's semantics are, so I can't comment further. The helper is intended to read the value of the entry pointed to by the passed in pointer. And it shoiuld be read in a "single copy atomic" manner, meaning no tearing. Further, the PTL is expected to be held when calling the getter. If the HW can write to the entry such that its racing with the lock holder (i.e. HW update of access/dirty) then READ_ONCE() should be suitable for most architectures. If there is no possibility of racing (because HW doesn't write to the entry), then a simple dereference would be sufficient, I think (which is what the core code was already doing in most cases). There is additional benefit that the architecture can hook this function if it has exotic use cases (see contpte feature on arm64 as an example, which hooks ptep_get()). It sounds to me like the arm (32) implementation of pgdp_get() could just continue to do a direct dereference and this should be safe? I don't think it supports HW update of access/dirty? Thanks, Ryan