From: Michael Ellerman
To: David Hildenbrand, Peter Xu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, James Houghton, stable@vger.kernel.org, Oscar Salvador, Muchun Song, Baolin Wang, Christophe Leroy, Nicholas Piggin
Subject: Re: [PATCH v3] mm/hugetlb: fix hugetlb vs. core-mm PT locking
In-Reply-To: <2b0131cf-d066-44ba-96d9-a611448cbaf9@redhat.com>
References: <20240731122103.382509-1-david@redhat.com> <2b0131cf-d066-44ba-96d9-a611448cbaf9@redhat.com>
Date: Thu, 01 Aug 2024 12:03:30 +1000
Message-ID: <871q39ov7x.fsf@mail.lhotse>

David Hildenbrand writes:
> On 31.07.24 16:54, Peter Xu wrote:
...
>>
>> The other nitpick is, I didn't yet find any arch that use non-zero order
>> page for pte pgtables. I would give it a shot with dropping the mask thing
>> then see what explodes (which I don't expect any, per my read..), but yeah
>> I understand we saw some already due to other things, so I think it's fine
>> in this hugetlb path (that we're removing) we do a few more math if you
>> think that's easier for you.
>
> I threw
>	BUILD_BUG_ON(PTRS_PER_PTE * sizeof(pte_t) > PAGE_SIZE);
> into pte_lockptr() and did a bunch of cross-compiles.
>
> And for some reason it blows up for powernv (powernv_defconfig) and
> pseries (pseries_defconfig).
>
>
> In function 'pte_lockptr',
>     inlined from 'pte_offset_map_nolock' at mm/pgtable-generic.c:316:11:
> ././include/linux/compiler_types.h:510:45: error: call to '__compiletime_assert_291' declared with attribute error: BUILD_BUG_ON failed: PTRS_PER_PTE * sizeof(pte_t) > PAGE_SIZE
>   510 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
>       |                                             ^
> ././include/linux/compiler_types.h:491:25: note: in definition of macro '__compiletime_assert'
>   491 |                         prefix ## suffix();                             \
>       |                         ^~~~~~
> ././include/linux/compiler_types.h:510:9: note: in expansion of macro '_compiletime_assert'
>   510 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
>       |         ^~~~~~~~~~~~~~~~~~~
> ./include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
>    39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
>       |                                     ^~~~~~~~~~~~~~~~~~
> ./include/linux/build_bug.h:50:9: note: in expansion of macro 'BUILD_BUG_ON_MSG'
>    50 |         BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
>       |         ^~~~~~~~~~~~~~~~
> ./include/linux/mm.h:2926:9: note: in expansion of macro 'BUILD_BUG_ON'
>  2926 |         BUILD_BUG_ON(PTRS_PER_PTE * sizeof(pte_t) > PAGE_SIZE);
>       |         ^~~~~~~~~~~~
...
>
> pte_alloc_one() ends up calling pte_fragment_alloc(mm, 0). But there we always
> end up calling pagetable_alloc(, 0).
>
> And fragments are supposed to be <= a single page.
>
> Now I'm confused what's wrong here ... am I missing something obvious?
>
> CCing some powerpc folks. Is this some pte_t oddity?

It will be because PTRS_PER_PTE is not a compile time constant :(

  $ git grep "define PTRS_PER_PTE" arch/powerpc/include/asm/book3s/64
  arch/powerpc/include/asm/book3s/64/pgtable.h:#define PTRS_PER_PTE (1 << PTE_INDEX_SIZE)

  $ git grep "define PTE_INDEX_SIZE" arch/powerpc/include/asm/book3s/64
  arch/powerpc/include/asm/book3s/64/pgtable.h:#define PTE_INDEX_SIZE __pte_index_size

  $ git grep __pte_index_size arch/powerpc/mm/pgtable_64.c
  arch/powerpc/mm/pgtable_64.c:unsigned long __pte_index_size;

Which is because the pseries/powernv (book3s64) kernel supports either
the HPT or Radix MMU at runtime, and they have different page table
geometry.

If you change it to use MAX_PTRS_PER_PTE it should work (that's defined
for all arches).

cheers


diff --git a/include/linux/mm.h b/include/linux/mm.h
index 381750f41767..1fd9c296c0b6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2924,6 +2924,8 @@ static inline spinlock_t *ptlock_ptr(struct ptdesc *ptdesc)
 static inline spinlock_t *pte_lockptr(struct mm_struct *mm, pte_t *pte)
 {
 	/* PTE page tables don't currently exceed a single page. */
+	BUILD_BUG_ON(MAX_PTRS_PER_PTE * sizeof(pte_t) > PAGE_SIZE);
+
 	return ptlock_ptr(virt_to_ptdesc(pte));
 }
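
For anyone wanting to see the difference between the two checks outside the
kernel, here is a rough userspace sketch. It is not the kernel code: the
PAGE_SIZE, the two index sizes and the helpers select_mmu() and
pte_table_bytes() are made up for illustration, and C11 _Static_assert stands
in for BUILD_BUG_ON. The point is only that a bound built from constants can
be checked at build time, while a value derived from a runtime variable
cannot.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only, not taken from any kernel config. */
#define PAGE_SIZE            65536UL
#define HASH_PTE_INDEX_SIZE  8   /* hypothetical HPT geometry */
#define RADIX_PTE_INDEX_SIZE 5   /* hypothetical Radix geometry */

typedef struct { uint64_t pte; } pte_t;

/* Chosen at runtime, in the spirit of __pte_index_size on book3s64. */
static unsigned long pte_index_size;

/* Depends on a variable, so it is not an integer constant expression. */
#define PTRS_PER_PTE (1UL << pte_index_size)

/* Upper bound across both geometries; built purely from constants. */
#define MAX_PTE_INDEX_SIZE \
	(HASH_PTE_INDEX_SIZE > RADIX_PTE_INDEX_SIZE ? \
	 HASH_PTE_INDEX_SIZE : RADIX_PTE_INDEX_SIZE)
#define MAX_PTRS_PER_PTE (1UL << MAX_PTE_INDEX_SIZE)

/* Checking the bound at build time works. */
_Static_assert(MAX_PTRS_PER_PTE * sizeof(pte_t) <= PAGE_SIZE,
	       "PTE table would exceed a single page");

/*
 * The same assertion written with PTRS_PER_PTE is rejected by the
 * compiler, because its operand is only known at runtime:
 *
 * _Static_assert(PTRS_PER_PTE * sizeof(pte_t) <= PAGE_SIZE, "...");
 */

/* Hypothetical helper: pick the geometry the way boot code might. */
static void select_mmu(int radix)
{
	pte_index_size = radix ? RADIX_PTE_INDEX_SIZE : HASH_PTE_INDEX_SIZE;
}

/* Size in bytes of one PTE table under the currently selected geometry. */
static unsigned long pte_table_bytes(void)
{
	return PTRS_PER_PTE * sizeof(pte_t);
}
```

Calling select_mmu() and then pte_table_bytes() gives a different size for
each geometry, and both stay under the bound that the static assertion
verified once at build time.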