Date: Tue, 20 Jun 2023 20:49:56 +1000
From: "Nicholas Piggin"
To: "Yu Zhao"
Cc: "Andrew Morton", "Paolo Bonzini", "Alistair Popple", "Anup Patel",
 "Ben Gardon", "Borislav Petkov", "Catalin Marinas", "Chao Peng",
 "Christophe Leroy", "Dave Hansen", "Fabiano Rosas", "Gaosheng Cui",
 "Gavin Shan", "H. Peter Anvin", "Ingo Molnar", "James Morse",
 "Jason A. Donenfeld", "Jason Gunthorpe", "Jonathan Corbet",
 "Marc Zyngier", "Masami Hiramatsu", "Michael Ellerman",
 "Michael Larabel", "Mike Rapoport", "Oliver Upton", "Paul Mackerras",
 "Peter Xu", "Sean Christopherson", "Steven Rostedt",
 "Suzuki K Poulose", "Thomas Gleixner", "Thomas Huth", "Will Deacon",
 "Zenghui Yu"
Subject: Re: [PATCH mm-unstable v2 06/10] kvm/powerpc: make radix page tables RCU safe
References: <20230526234435.662652-1-yuzhao@google.com>
 <20230526234435.662652-7-yuzhao@google.com>

On Tue Jun 20, 2023 at 6:00 PM AEST, Yu Zhao wrote:
> On Tue, Jun 20, 2023 at 12:33 AM Nicholas Piggin wrote:
> >
> > On Sat May 27, 2023 at 9:44 AM AEST, Yu Zhao wrote:
> > > KVM page tables are currently not RCU safe against remapping, i.e.,
> > > kvmppc_unmap_free_pmd_entry_table() et al. The previous
> >
> > Minor nit but the "page table" is not RCU-safe against something. It
> > is RCU-freed, and therefore some algorithm that accesses it can have
> > the existence guarantee provided by RCU (usually there still needs
> > to be more to it).
> >
> > > mmu_notifier_ops members rely on kvm->mmu_lock to synchronize with
> > > that operation.
> > >
> > > However, the new mmu_notifier_ops member test_clear_young() provides
> > > a fast path that does not take kvm->mmu_lock. To implement
> > > kvm_arch_test_clear_young() for that path, orphan page tables need to
> > > be freed by RCU.
> >
> > Short version: clear the referenced bit using RCU instead of MMU lock
> > to protect against page table freeing, and there is no problem with
> > clearing the bit in a table that has been freed.
> >
> > Seems reasonable.
>
> Thanks. All above points taken.
>
> > > Unmapping, specifically kvm_unmap_radix(), does not free page tables,
> > > hence not a concern.
> >
> > Not sure if you really need to make the distinction about why the page
> > table is freed, we might free them via unmapping. The point is just
> > anything that frees them while there can be concurrent access, right?
>
> Correct.
>
> > > Signed-off-by: Yu Zhao
> > > ---
> > >  arch/powerpc/kvm/book3s_64_mmu_radix.c | 6 ++++--
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > > index 461307b89c3a..3b65b3b11041 100644
> > > --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > > +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > > @@ -1469,13 +1469,15 @@ int kvmppc_radix_init(void)
> > >  {
> > >          unsigned long size = sizeof(void *) << RADIX_PTE_INDEX_SIZE;
> > >
> > > -        kvm_pte_cache = kmem_cache_create("kvm-pte", size, size, 0, pte_ctor);
> > > +        kvm_pte_cache = kmem_cache_create("kvm-pte", size, size,
> > > +                                          SLAB_TYPESAFE_BY_RCU, pte_ctor);
> > >          if (!kvm_pte_cache)
> > >                  return -ENOMEM;
> > >
> > >          size = sizeof(void *) << RADIX_PMD_INDEX_SIZE;
> > >
> > > -        kvm_pmd_cache = kmem_cache_create("kvm-pmd", size, size, 0, pmd_ctor);
> > > +        kvm_pmd_cache = kmem_cache_create("kvm-pmd", size, size,
> > > +                                          SLAB_TYPESAFE_BY_RCU, pmd_ctor);
> > >          if (!kvm_pmd_cache) {
> > >                  kmem_cache_destroy(kvm_pte_cache);
> > >                  return -ENOMEM;
> >
> > KVM PPC HV radix PUD level page tables use the arch/powerpc allocators
> > (for some reason), which are not RCU freed. I think you need them too?
>
> We don't. The use of the arch/powerpc allocator for PUD tables seems
> appropriate to me because, unlike PMD/PTE tables, we never free PUD
> tables during the lifetime of a VM:

Ah you're right, the pud_free only comes from the double alloc case
so it's never visible to concurrent threads.

> * We don't free PUD/PMD/PTE tables when they become empty, i.e., not
>   mapping any pages but still attached. (We could in theory, as
>   x86/aarch64 do.)

We may try to do that at some point, but that's not related to your
patch for now so no worries.

Thanks,
Nick
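
For reference, the "short version" discussed in this thread (clearing the
referenced bit without kvm->mmu_lock, where a clear that lands in an
already-detached table is harmless) can be modelled in userspace C11. This
is only a sketch under stated assumptions, not the kernel implementation:
the demo_* names and the DEMO_PTE_ACCESSED bit position are hypothetical,
and RCU itself is not modelled. It shows only that the test-and-clear is a
single atomic operation that touches one bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout for illustration; not the powerpc radix PTE format. */
#define DEMO_PTE_ACCESSED (UINT64_C(1) << 8)

typedef _Atomic uint64_t demo_pte_t;

/*
 * Atomically clear the accessed bit and report whether it was set.
 * Only that one bit changes, so running this on an entry in a table
 * that has already been detached cannot corrupt the other bits.
 */
static bool demo_pte_test_clear_young(demo_pte_t *ptep)
{
	assert(ptep != NULL);
	uint64_t old = atomic_fetch_and(ptep, ~DEMO_PTE_ACCESSED);
	return (old & DEMO_PTE_ACCESSED) != 0;
}

/* Clear the accessed bit in nr entries and count how many had it set. */
static size_t demo_table_test_clear_young(demo_pte_t *table, size_t nr)
{
	size_t young = 0;

	for (size_t i = 0; i < nr; i++)
		young += demo_pte_test_clear_young(&table[i]);
	return young;
}
```

Calling demo_table_test_clear_young() twice on the same table reports the
young entries the first time and zero the second time, leaving every other
bit of each entry unchanged. In the kernel, the walk would additionally
have to sit inside an RCU read-side critical section so that the table
memory stays a page table for the duration of the access.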