Date: Tue, 25 Jun 2024 14:49:45 +1000
Subject: Re: [PATCH v6 21/23] powerpc/64s: Use contiguous PMD/PUD instead of HUGEPD
From: "Nicholas Piggin"
To: "Christophe Leroy", "Andrew Morton", "Jason Gunthorpe", "Peter Xu", "Oscar Salvador", "Michael Ellerman"
References: <23f3fe9e8fe37cb164a369850d4569dddf359fdf.1719240269.git.christophe.leroy@csgroup.eu>
In-Reply-To: <23f3fe9e8fe37cb164a369850d4569dddf359fdf.1719240269.git.christophe.leroy@csgroup.eu>

On Tue Jun 25, 2024 at 12:45 AM AEST, Christophe Leroy wrote:
> On book3s/64, the only user of hugepd is hash in 4k mode.
>
> All other setups (hash-64, radix-4, radix-64) use leaf PMD/PUD.
>
> Rework hash-4k to use contiguous PMD and PUD instead.
>
> In that setup there are only two huge page sizes: 16M and 16G.
>
> 16M sits at PMD level and 16G at PUD level.
>
> pte_update doesn't know page size, lets use the same trick as
> hpte_need_flush() to get page size from segment properties. That's
> not the most efficient way but let's do that until callers of
> pte_update() provide page size instead of just a huge flag.
>
> Signed-off-by: Christophe Leroy

[snip]

> +static inline unsigned long hash__pte_update(struct mm_struct *mm,
> +					 unsigned long addr,
> +					 pte_t *ptep, unsigned long clr,
> +					 unsigned long set,
> +					 int huge)
> +{
> +	unsigned long old;
> +
> +	old = hash__pte_update_one(ptep, clr, set);
> +
> +	if (IS_ENABLED(CONFIG_PPC_4K_PAGES) && huge) {
> +		unsigned int psize = get_slice_psize(mm, addr);
> +		int nb, i;
> +
> +		if (psize == MMU_PAGE_16M)
> +			nb = SZ_16M / PMD_SIZE;
> +		else if (psize == MMU_PAGE_16G)
> +			nb = SZ_16G / PUD_SIZE;
> +		else
> +			nb = 1;
> +
> +		WARN_ON_ONCE(nb == 1);	/* Should never happen */
> +
> +		for (i = 1; i < nb; i++)
> +			hash__pte_update_one(ptep + i, clr, set);
> +	}
>  	/* huge pages use the old page table lock */
>  	if (!huge)
>  		assert_pte_locked(mm, addr);
>  
> -	old = be64_to_cpu(old_be);
>  	if (old & H_PAGE_HASHPTE)
>  		hpte_need_flush(mm, addr, ptep, old, huge);
>  

We definitely need more comments and changelog text about the atomicity
issues here. I think the plan should be that all hash-side accesses
operate only on PTE[0], which should avoid that whole race. There could
be some cases that don't follow that rule, so adding some warnings to
catch them could be good too.

I'd been meaning to do more on this sooner, sorry. I've started
tinkering with adding a bit of debug code. I'll see if I can help with
adding some comments.

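
To make the entry counts in the quoted hunk concrete, here is a minimal
userspace sketch, not kernel code. It assumes the hash-4k geometry where
one PMD entry maps 2M (shift 21) and one PUD entry maps 256M (shift 28),
so a 16M page spans 8 PMD entries and a 16G page spans 64 PUD entries.
Every sketch_* name is hypothetical, the update helper is deliberately
non-atomic (unlike the real hash__pte_update_one()), and
sketch_is_first_entry() is only one possible shape for a "PTE[0] rule"
debug check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed hash-4k geometry (illustrative, not taken from the patch). */
#define SKETCH_PMD_SHIFT 21            /* one PMD entry maps 2M   */
#define SKETCH_PUD_SHIFT 28            /* one PUD entry maps 256M */
#define SKETCH_SZ_16M    (1ULL << 24)
#define SKETCH_SZ_16G    (1ULL << 34)

enum sketch_psize { SKETCH_PAGE_16M, SKETCH_PAGE_16G, SKETCH_PAGE_OTHER };

/* Number of contiguous entries backing one huge page; 1 means "unexpected". */
static unsigned int sketch_nr_entries(enum sketch_psize psize)
{
	switch (psize) {
	case SKETCH_PAGE_16M:
		return (unsigned int)(SKETCH_SZ_16M >> SKETCH_PMD_SHIFT);
	case SKETCH_PAGE_16G:
		return (unsigned int)(SKETCH_SZ_16G >> SKETCH_PUD_SHIFT);
	default:
		return 1;
	}
}

/* Non-atomic stand-in for a single-entry update: returns the old value. */
static uint64_t sketch_update_one(uint64_t *ptep, uint64_t clr, uint64_t set)
{
	uint64_t old = *ptep;

	*ptep = (old & ~clr) | set;
	return old;
}

/*
 * Same shape as the quoted loop: update entry 0, then replicate the change
 * to the remaining nb - 1 entries. Only the old value of entry 0 is returned.
 */
static uint64_t sketch_update_contig(uint64_t *ptep, unsigned int nb,
				     uint64_t clr, uint64_t set)
{
	uint64_t old = sketch_update_one(ptep, clr, set);

	for (unsigned int i = 1; i < nb; i++)
		sketch_update_one(ptep + i, clr, set);
	return old;
}

/* Hypothetical debug check: does ptep point at the first entry of its group? */
static int sketch_is_first_entry(const uint64_t *table, const uint64_t *ptep,
				 unsigned int nb)
{
	return ((size_t)(ptep - table) % nb) == 0;
}
```

Between the first and the last store in sketch_update_contig() the
entries of one huge page disagree with each other, which is the window
the atomicity concern is about. A reader that only ever looks at entry 0
never observes that mixed state.
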
[snip]

> diff --git a/arch/powerpc/mm/book3s64/hugetlbpage.c b/arch/powerpc/mm/book3s64/hugetlbpage.c
> index 5a2e512e96db..83c3361b358b 100644
> --- a/arch/powerpc/mm/book3s64/hugetlbpage.c
> +++ b/arch/powerpc/mm/book3s64/hugetlbpage.c
> @@ -53,6 +53,16 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
>  		/* If PTE permissions don't match, take page fault */
>  		if (unlikely(!check_pte_access(access, old_pte)))
>  			return 1;
> +		/*
> +		 * If hash-4k, hugepages use seeral contiguous PxD entries
> +		 * so bail out and let mm make the page young or dirty
> +		 */
> +		if (IS_ENABLED(CONFIG_PPC_4K_PAGES)) {
> +			if (!(old_pte & _PAGE_ACCESSED))
> +				return 1;
> +			if ((access & _PAGE_WRITE) && !(old_pte & _PAGE_DIRTY))
> +				return 1;
> +		}
>  
>  		/*
>  		 * Try to lock the PTE, add ACCESSED and DIRTY if it was

I'm hoping we won't have to do this if we follow the PTE[0] rule.

I think this is minor enough that it should not prevent testing in -mm.

Thanks,
Nick