From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 934AAC3DA49 for ; Fri, 26 Jul 2024 00:29:35 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id EF9086B009D; Thu, 25 Jul 2024 20:29:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id EA87B6B009E; Thu, 25 Jul 2024 20:29:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D4ADD6B009F; Thu, 25 Jul 2024 20:29:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id B65DF6B009D for ; Thu, 25 Jul 2024 20:29:34 -0400 (EDT) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 1FCEE141508 for ; Fri, 26 Jul 2024 00:29:34 +0000 (UTC) X-FDA: 82380020268.22.0F3AEBD Received: from mail-qt1-f181.google.com (mail-qt1-f181.google.com [209.85.160.181]) by imf24.hostedemail.com (Postfix) with ESMTP id 52938180036 for ; Fri, 26 Jul 2024 00:29:32 +0000 (UTC) Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=DpYa9VAC; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf24.hostedemail.com: domain of jthoughton@google.com designates 209.85.160.181 as permitted sender) smtp.mailfrom=jthoughton@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1721953717; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=DbyJs6ojjDUY/aYBbei0H0GJsGGC+U+B84+bho9NthY=; b=VHlLbI4AeYU+tqHmr+VTvveWPRED+/lrxUiEV2spZ0XS3QpzhOk1gbGuw3WtajAVQEJiXd gnGUr0bPikZRBXZZfab3VMgFNACH3XL9qq34GX8PUvNf8M2ywLCEENyEfGCKZltSrEhZsA SZ6XIvgPMHVSjx7UHoNX98W+px1IpRA= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1721953717; a=rsa-sha256; cv=none; b=tAdnpt1+r9w9oYpFumOeie2qINrzfsK5RmA6ydO5unniQCcFUZrSSQPgffSP66rsrPZTp/ kV8vG4CqsxD8Zea1USCt5Gbt2PUVrI0ma/xq/hGPGR/U5XhPLtjAa/dKETOFmArj90Blmf BnruhGG/MY9bcEq3vQGiPkOseV28m8I= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=DpYa9VAC; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf24.hostedemail.com: domain of jthoughton@google.com designates 209.85.160.181 as permitted sender) smtp.mailfrom=jthoughton@google.com Received: by mail-qt1-f181.google.com with SMTP id d75a77b69052e-44e534a1fbeso42161cf.1 for ; Thu, 25 Jul 2024 17:29:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1721953771; x=1722558571; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=DbyJs6ojjDUY/aYBbei0H0GJsGGC+U+B84+bho9NthY=; b=DpYa9VAC//CG0nvkY8X9dcYGNs9/+T7rW/s7K+/ZsDusuxDj7xSkal/jPclrnDypLk B1pOa09YWp5+pqHyTfdCIklC1iDmw3S20Ez4PKb6XN0pOwu3JTs6XzVhYiUyykf5YmPx gl7jOBpTG5600+0QFXHy3RL0T61BgmqqnbnGreS1sg21MMkYRDWrmEuns5VYq2PtAjiE zBXxNS3pDgwhscmvnxIksYS5DOGI55Ts2hilnZVkls+moTYBxn7svElGNSaZNenZREV9 D5Ptwse+53M9TjWrL3IHaWSOhVoRJ/XeCUhrNisNu29879icFeCK+FvjrOL5KHJLleGG l2sw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1721953771; x=1722558571; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=DbyJs6ojjDUY/aYBbei0H0GJsGGC+U+B84+bho9NthY=; b=F6YrN4qXM3maNCXxTXHATGwxk6TnE+LVg7ooYoy9szNlLk146oAan+luZ2kqETPc+x GT8VJCNANt+14o8ehVFy43c65Gx1M4q+qiQ2VHtH0vVWQI32T1Qi8gWLOutMWUa1kouy DIGcQRAGI1LC8DIXJQG+5gn936/RQiJXS2VID6NRVpkrvlYggceyPyNUlaWkeUsbSdOP kItuLV8fPWKaRsJAvj2WZjE8f6JC9qwfV1NhTmTh6Is7g7QQnsgF9XKChOhRdDiSxO0/ wq+VZzQ6n4LWGhR93hckWm4lZnjFgZLcxDtrijNmmZztLUW420zfbKNO09k3YiCxFfrZ pyTg== X-Forwarded-Encrypted: i=1; AJvYcCVnYcijS/yaaJrBOO3+oZeMopOuz+t9Ign0UZHrRZW1rCMmmJsco9034WVIbc1whfNeiqT1i/Sf0o6cKZkTsnGyfiY= X-Gm-Message-State: AOJu0Yxqya7yqqQS2cFxTsuzWUNR+lq6PR/bn03dT9o6+lbMJdeJoje8 HER9uB0uSLiueVpthIavMxSZiOG1YOzhvYaRaF3wXnN983IxrG1+QfJ3EQXS6/kKHG0Hn2xhBUF jJiUt2i3JMwpPLRwgfU/K2/bpptVNV6PgGhba X-Google-Smtp-Source: AGHT+IEAKXr1ckGzIAW0Zij5lROUgzcsWDVkCYVl2RWE/gIa1eP8NSl4F+ePaiwMP4UG1WnczScB6df5XE4x/GMY03s= X-Received: by 2002:a05:622a:1ba6:b0:447:f5de:bd18 with SMTP id d75a77b69052e-44ff3e5d32amr1488861cf.9.1721953771125; Thu, 25 Jul 2024 17:29:31 -0700 (PDT) MIME-Version: 1.0 References: <20240724011037.3671523-1-jthoughton@google.com> <20240724011037.3671523-2-jthoughton@google.com> In-Reply-To: From: James Houghton Date: Thu, 25 Jul 2024 17:28:55 -0700 Message-ID: Subject: Re: [PATCH v6 01/11] KVM: Add lockless memslot walk to KVM To: David Matlack Cc: Andrew Morton , Paolo Bonzini , Ankit Agrawal , Axel Rasmussen , Catalin Marinas , David Rientjes , James Morse , Jason Gunthorpe , Jonathan Corbet , Marc Zyngier , Oliver Upton , Raghavendra Rao Ananta , Ryan Roberts , Sean Christopherson , Shaoqin Huang , Suzuki K Poulose , Wei Xu , Will Deacon , Yu Zhao , Zenghui Yu , kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 52938180036 X-Stat-Signature: 71ztpqq6diuhcna1d6eirpw1ghuqkquq X-Rspam-User: X-HE-Tag: 1721953772-278538 X-HE-Meta: U2FsdGVkX1+snXeL5P+gUHXMfoRZreAYIw4zyNLFO8moam4g4AN0hUDwPVwBFIoyG28pvuPKsVIkWHf9IR9zkQ5gqpLQKnfaD8QVPX4QfETGEesGhjh4s0SzQ970RB8eRYJUE+N0KNptqV92h5euQXC8GJ3temcJDKmyISIvFFBD4vIcQ/txW8HLPr6OXoUVH4yhV3+QjYK854sp2NiMS2p6E1hmXJ18/7kCgLON/Ftjgx15hyrf8N6Svz9rTloMC2WWGjO6iScidnL4CRX2Fvjrbc96I4sFyAsae2maabymGqbwae9XaOPROMudDK1+cPWC24c2IcwSaI0UkrwoE3NCWkL1qfYSKuzrnWSTg98lJIZMrS9Epwbzvti/wR0q9et5iXp6oW4FSMAvJNRlLMrGZ8L1LutLhKVY5c0ovWKiGEYJEPinQFShw6JJHAB4P/4L361einsybPAns+7JKt9NFTr4QRyRaPqmB9WFXZ0CRB9PGqpvOB4AogTFDDc2EvaYo9vcf4BZb5ovMG2HBPRpZhH5T/IVp8iQqzMEBwPyNbWOqVoOYKX6gtsvAFXkOYRBrWecnjrKhUXSfaR/PvsqUCc4jsFpDcYtQWSaza6SGblnQajlCTMruOiwPk9AqbTU5BLc8PAZfTBASNWpuJWdbEwGQ5icXMTL3Hsl7XYJ3TuQxEHcpTVxHtKVZ6bmrdvoqGKiLPlU4eBM5nwCDh71SX35fv0e8kAjcC9820BLGgw89dJlV9xvkKA+/CBP1h2/mkxFddzK7s1AmUYvy/5CxZzYyiK7vD8f09GG/fxOWcx0HHu+xHdvIAUNbu+rdToDq3O6NltADEBe147qZmFTqtMOgyagq4fx9o+tj62hzNViG6GcorfGfQtDwyBvtnzkNc6Dxgro9th7hRejVpMzCRF4XUdNL5YkZt/uQ65g3kox//OGlo1ranU/iurBDmwexicjTUT4lleYBdQ kfcKUMIz 1UBsTxbMEjFTmBz7og4UrhnAT/V2N3M1Th1JjskJRWiSkbzoQ5+xisP/XmNmkPO2oj+pkLe9sA6rFmP7DOmCXqe8hu/U+dmlQtiHaAjABKcDnQX1cEG+M0fDhuTwKsP/4yhT+R1cgMS7+NOKjaeWWua7IAum08UaxvfgMzh4Salvy0vwAevS6HIIMX4fwqKojwReEoeLtlEmnh4XoN37880fk8NknAvLjEjLIrMkFBEkSIc8= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Jul 25, 2024 at 9:39=E2=80=AFAM David Matlack = wrote: > > On 2024-07-24 01:10 AM, James Houghton wrote: > > Provide flexibility to the architecture to synchronize as optimally as > > they can instead of always taking the MMU lock for writing. > > > > Architectures that do their own locking must select > > CONFIG_KVM_MMU_NOTIFIER_YOUNG_LOCKLESS. > > > > The immediate application is to allow architectures to implement the > > test/clear_young MMU notifiers more cheaply. > > > > Suggested-by: Yu Zhao > > Signed-off-by: James Houghton > > Aside from the cleanup suggestion (which should be in separate patches > anyway): > > Reviewed-by: David Matlack Thanks David! > > > --- > > include/linux/kvm_host.h | 1 + > > virt/kvm/Kconfig | 3 +++ > > virt/kvm/kvm_main.c | 26 +++++++++++++++++++------- > > 3 files changed, 23 insertions(+), 7 deletions(-) > > > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > > index 689e8be873a7..8cd80f969cff 100644 > > --- a/include/linux/kvm_host.h > > +++ b/include/linux/kvm_host.h > > @@ -266,6 +266,7 @@ struct kvm_gfn_range { > > gfn_t end; > > union kvm_mmu_notifier_arg arg; > > bool may_block; > > + bool lockless; > > }; > > bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)= ; > > bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range); > > diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig > > index b14e14cdbfb9..632334861001 100644 > > --- a/virt/kvm/Kconfig > > +++ b/virt/kvm/Kconfig > > @@ -100,6 +100,9 @@ config KVM_GENERIC_MMU_NOTIFIER > > select MMU_NOTIFIER > > bool > > > > +config KVM_MMU_NOTIFIER_YOUNG_LOCKLESS > > + bool > > + > > config KVM_GENERIC_MEMORY_ATTRIBUTES > > depends on KVM_GENERIC_MMU_NOTIFIER > > bool > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > > index d0788d0a72cc..33f8997a5c29 100644 > > --- a/virt/kvm/kvm_main.c > > +++ b/virt/kvm/kvm_main.c > > @@ -555,6 +555,7 @@ struct kvm_mmu_notifier_range { > > on_lock_fn_t on_lock; > > bool flush_on_ret; > > bool may_block; > > + bool lockless; > > }; > > > > /* > > @@ -609,6 +610,10 @@ static __always_inline kvm_mn_ret_t __kvm_handle_h= va_range(struct kvm *kvm, > > IS_KVM_NULL_FN(range->handler))) > > return r; > > > > + /* on_lock will never be called for lockless walks */ > > + if (WARN_ON_ONCE(range->lockless && !IS_KVM_NULL_FN(range->on_loc= k))) > > + return r; > > + > > idx =3D srcu_read_lock(&kvm->srcu); > > > > for (i =3D 0; i < kvm_arch_nr_memslot_as_ids(kvm); i++) { > > @@ -640,15 +645,18 @@ static __always_inline kvm_mn_ret_t __kvm_handle_= hva_range(struct kvm *kvm, > > gfn_range.start =3D hva_to_gfn_memslot(hva_start,= slot); > > gfn_range.end =3D hva_to_gfn_memslot(hva_end + PA= GE_SIZE - 1, slot); > > gfn_range.slot =3D slot; > > + gfn_range.lockless =3D range->lockless; > > > > if (!r.found_memslot) { > > r.found_memslot =3D true; > > - KVM_MMU_LOCK(kvm); > > - if (!IS_KVM_NULL_FN(range->on_lock)) > > - range->on_lock(kvm); > > - > > - if (IS_KVM_NULL_FN(range->handler)) > > - goto mmu_unlock; > > + if (!range->lockless) { > > + KVM_MMU_LOCK(kvm); > > + if (!IS_KVM_NULL_FN(range->on_loc= k)) > > + range->on_lock(kvm); > > + > > + if (IS_KVM_NULL_FN(range->handler= )) > > + goto mmu_unlock; > > + } > > } > > r.ret |=3D range->handler(kvm, &gfn_range); > > } > > @@ -658,7 +666,7 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hv= a_range(struct kvm *kvm, > > kvm_flush_remote_tlbs(kvm); > > > > mmu_unlock: > > - if (r.found_memslot) > > + if (r.found_memslot && !range->lockless) > > KVM_MMU_UNLOCK(kvm); > > > > srcu_read_unlock(&kvm->srcu, idx); > > @@ -679,6 +687,8 @@ static __always_inline int kvm_handle_hva_range(str= uct mmu_notifier *mn, > > .on_lock =3D (void *)kvm_null_fn, > > .flush_on_ret =3D true, > > .may_block =3D false, > > + .lockless =3D > > + IS_ENABLED(CONFIG_KVM_MMU_NOTIFIER_YOUNG_LOCKLESS= ), > > }; > > > > return __kvm_handle_hva_range(kvm, &range).ret; > > @@ -697,6 +707,8 @@ static __always_inline int kvm_handle_hva_range_no_= flush(struct mmu_notifier *mn > > .on_lock =3D (void *)kvm_null_fn, > > .flush_on_ret =3D false, > > .may_block =3D false, > > + .lockless =3D > > + IS_ENABLED(CONFIG_KVM_MMU_NOTIFIER_YOUNG_LOCKLESS= ), > > kvm_handle_hva_range{,_no_flush}() have very generic names but > they're intimately tied to the "young" notifiers. Whereas > __kvm_handle_hva_range() is the truly generic handler function. > > This is arguably a pre-existing issue, but adding > CONFIG_KVM_MMU_NOTIFIER_YOUNG_LOCKLESS makes these functions even more > intamtely tied to the "young" notifiers. > > We could rename kvm_handle_hva_range{,_no_flush}() but I think the > cleanest thing to do might be to just drop them entirely and move their > contents into their callers (there are only 2 callers of these 3 > functions). That will create a little duplication but IMO will make the > code easier to read. > > And then we can also rename __kvm_handle_hva_range() to > kvm_handle_hva_range(). Thanks for the suggestion, I think this is a good idea. I'm curious how others feel, as this indeed does duplicate the code some. Perhaps it is better just to rename kvm_handle_hva_range() to kvm_handle_hva_range_age() or something like that, and something similar for _no_flush(). :/ But yeah I think it's fine to just do the manipulation you're suggesting. I'll include it in v7 unless others say not to.