From: James Houghton <jthoughton@google.com>
Date: Thu, 29 Aug 2024 17:35:01 -0700
Subject: Re: [PATCH v6 02/11] KVM: x86: Relax locking for kvm_test_age_gfn and kvm_age_gfn
To: Sean Christopherson
Cc: Andrew Morton, Paolo Bonzini, Ankit Agrawal, Axel Rasmussen,
	Catalin Marinas, David Matlack, David Rientjes, James Morse,
	Jason Gunthorpe, Jonathan Corbet, Marc Zyngier, Oliver Upton,
	Raghavendra Rao Ananta, Ryan Roberts, Shaoqin Huang,
	Suzuki K Poulose, Wei Xu, Will Deacon, Yu Zhao, Zenghui Yu,
	kvmarm@lists.linux.dev, kvm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20240724011037.3671523-1-jthoughton@google.com>
	<20240724011037.3671523-3-jthoughton@google.com>

On Fri, Aug 16, 2024 at 6:05 PM Sean Christopherson wrote:
>
> On Wed, Jul 24, 2024, James Houghton wrote:
> > Walk the TDP MMU in an RCU read-side critical section.
>
> ...without holding mmu_lock, while doing xxx. There are a lot of TDP MMU walks,
> and they all need RCU protection.

Added "without holding mmu_lock when harvesting and potentially
updating age information on sptes".

> > This requires a way to do RCU-safe walking of the tdp_mmu_roots; do this
> > with a new macro. The PTE modifications are now done atomically, and
> > kvm_tdp_mmu_spte_need_atomic_write() has been updated to account for the
> > fact that kvm_age_gfn can now locklessly update the accessed bit and the
> > R/X bits.
> >
> > If the cmpxchg for marking the spte for access tracking fails, we simply
> > retry if the spte is still a leaf PTE. If it isn't, we return false
> > to continue the walk.
>
> Please avoid pronouns. E.g. s/we/KVM (and adjust grammar as needed), so that
> it's clear what actor in particular is doing the retry.

Fixed. Though, I have also changed this to reflect the change in the
retry logic I've made, given your other comment.

> > Harvesting age information from the shadow MMU is still done while
> > holding the MMU write lock.
> >
> > Suggested-by: Yu Zhao
> > Signed-off-by: James Houghton
> > ---
> >  arch/x86/include/asm/kvm_host.h |  1 +
> >  arch/x86/kvm/Kconfig            |  1 +
> >  arch/x86/kvm/mmu/mmu.c          | 10 ++++-
> >  arch/x86/kvm/mmu/tdp_iter.h     | 27 +++++++------
> >  arch/x86/kvm/mmu/tdp_mmu.c      | 67 +++++++++++++++++++++++++--------
> >  5 files changed, 77 insertions(+), 29 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 950a03e0181e..096988262005 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1456,6 +1456,7 @@ struct kvm_arch {
> >        * tdp_mmu_page set.
> >        *
> >        * For reads, this list is protected by:
> > +      *      RCU alone or
> >        *      the MMU lock in read mode + RCU or
> >        *      the MMU lock in write mode
> >        *
> > diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> > index 4287a8071a3a..6ac43074c5e9 100644
> > --- a/arch/x86/kvm/Kconfig
> > +++ b/arch/x86/kvm/Kconfig
> > @@ -23,6 +23,7 @@ config KVM
> >       depends on X86_LOCAL_APIC
> >       select KVM_COMMON
> >       select KVM_GENERIC_MMU_NOTIFIER
> > +     select KVM_MMU_NOTIFIER_YOUNG_LOCKLESS
> >       select HAVE_KVM_IRQCHIP
> >       select HAVE_KVM_PFNCACHE
> >       select HAVE_KVM_DIRTY_RING_TSO
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 901be9e420a4..7b93ce8f0680 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -1633,8 +1633,11 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> >  {
> >       bool young = false;
> >
> > -     if (kvm_memslots_have_rmaps(kvm))
> > +     if (kvm_memslots_have_rmaps(kvm)) {
> > +             write_lock(&kvm->mmu_lock);
> >               young = kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
> > +             write_unlock(&kvm->mmu_lock);
> > +     }
> >
> >       if (tdp_mmu_enabled)
> >               young |= kvm_tdp_mmu_age_gfn_range(kvm, range);
> > @@ -1646,8 +1649,11 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> >  {
> >       bool young = false;
> >
> > -     if (kvm_memslots_have_rmaps(kvm))
> > +     if (kvm_memslots_have_rmaps(kvm)) {
> > +             write_lock(&kvm->mmu_lock);
> >               young = kvm_handle_gfn_range(kvm, range, kvm_test_age_rmap);
> > +             write_unlock(&kvm->mmu_lock);
> > +     }
> >
> >       if (tdp_mmu_enabled)
> >               young |= kvm_tdp_mmu_test_age_gfn(kvm, range);
> > diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
> > index 2880fd392e0c..510936a8455a 100644
> > --- a/arch/x86/kvm/mmu/tdp_iter.h
> > +++ b/arch/x86/kvm/mmu/tdp_iter.h
> > @@ -25,6 +25,13 @@ static inline u64 kvm_tdp_mmu_write_spte_atomic(tdp_ptep_t sptep, u64 new_spte)
> >       return xchg(rcu_dereference(sptep), new_spte);
> >  }
> >
> > +static inline u64 tdp_mmu_clear_spte_bits_atomic(tdp_ptep_t sptep, u64 mask)
> > +{
> > +     atomic64_t *sptep_atomic = (atomic64_t *)rcu_dereference(sptep);
> > +
> > +     return (u64)atomic64_fetch_and(~mask, sptep_atomic);
> > +}
> > +
> >  static inline void __kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 new_spte)
> >  {
> >       KVM_MMU_WARN_ON(is_ept_ve_possible(new_spte));
> > @@ -32,10 +39,11 @@ static inline void __kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 new_spte)
> >  }
> >
> >  /*
> > - * SPTEs must be modified atomically if they are shadow-present, leaf
> > - * SPTEs, and have volatile bits, i.e. has bits that can be set outside
> > - * of mmu_lock. The Writable bit can be set by KVM's fast page fault
> > - * handler, and Accessed and Dirty bits can be set by the CPU.
> > + * SPTEs must be modified atomically if they have bits that can be set outside
> > + * of the mmu_lock. This can happen for any shadow-present leaf SPTEs, as the
> > + * Writable bit can be set by KVM's fast page fault handler, the Accessed and
> > + * Dirty bits can be set by the CPU, and the Accessed and R/X bits can be
> > + * cleared by age_gfn_range.
> >   *
> >   * Note, non-leaf SPTEs do have Accessed bits and those bits are
> >   * technically volatile, but KVM doesn't consume the Accessed bit of
> > @@ -46,8 +54,7 @@ static inline void __kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 new_spte)
> >  static inline bool kvm_tdp_mmu_spte_need_atomic_write(u64 old_spte, int level)
> >  {
> >       return is_shadow_present_pte(old_spte) &&
> > -            is_last_spte(old_spte, level) &&
> > -            spte_has_volatile_bits(old_spte);
> > +            is_last_spte(old_spte, level);
> >  }
> >
> >  static inline u64 kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 old_spte,
> > @@ -63,12 +70,8 @@ static inline u64 kvm_tdp_mmu_write_spte(tdp_ptep_t sptep, u64 old_spte,
> >  static inline u64 tdp_mmu_clear_spte_bits(tdp_ptep_t sptep, u64 old_spte,
> >                                            u64 mask, int level)
> >  {
> > -     atomic64_t *sptep_atomic;
> > -
> > -     if (kvm_tdp_mmu_spte_need_atomic_write(old_spte, level)) {
> > -             sptep_atomic = (atomic64_t *)rcu_dereference(sptep);
> > -             return (u64)atomic64_fetch_and(~mask, sptep_atomic);
> > -     }
> > +     if (kvm_tdp_mmu_spte_need_atomic_write(old_spte, level))
> > +             return tdp_mmu_clear_spte_bits_atomic(sptep, mask);
> >
> >       __kvm_tdp_mmu_write_spte(sptep, old_spte & ~mask);
> >       return old_spte;
> > diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> > index c7dc49ee7388..3f13b2db53de 100644
> > --- a/arch/x86/kvm/mmu/tdp_mmu.c
> > +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> > @@ -29,6 +29,11 @@ static __always_inline bool kvm_lockdep_assert_mmu_lock_held(struct kvm *kvm,
> >
> >       return true;
> >  }
> > +static __always_inline bool kvm_lockdep_assert_rcu_read_lock_held(void)
> > +{
> > +     WARN_ON_ONCE(!rcu_read_lock_held());
> > +     return true;
> > +}
>
> I doubt KVM needs a manual WARN, the RCU dereference stuff should yell loudly
> if something is missing an rcu_read_lock().

You're right -- removed.

> >  void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
> >  {
> > @@ -178,6 +183,15 @@ static struct kvm_mmu_page *tdp_mmu_next_root(struct kvm *kvm,
> >                ((_only_valid) && (_root)->role.invalid))) {            \
> >       } else
> >
> > +/*
> > + * Iterate over all TDP MMU roots in an RCU read-side critical section.
> > + */
> > +#define for_each_tdp_mmu_root_rcu(_kvm, _root, _as_id)                 \
> > +     list_for_each_entry_rcu(_root, &_kvm->arch.tdp_mmu_roots, link)   \
>
> This should just process valid roots:
>
> https://lore.kernel.org/all/20240801183453.57199-7-seanjc@google.com

Thanks! I've added `|| (_root)->role.invalid)` to the below conditional
expression, and I've renamed the macro to
for_each_valid_tdp_mmu_root_rcu.

> > +             if (kvm_lockdep_assert_rcu_read_lock_held() &&            \
> > +                 (_as_id >= 0 && kvm_mmu_page_as_id(_root) != _as_id)) { \
> > +             } else
> > +
> >  #define for_each_tdp_mmu_root(_kvm, _root, _as_id)                    \
> >       __for_each_tdp_mmu_root(_kvm, _root, _as_id, false)
> >
> > @@ -1224,6 +1238,27 @@ static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm,
> >       return ret;
> >  }
> >
> > +static __always_inline bool kvm_tdp_mmu_handle_gfn_lockless(
> > +             struct kvm *kvm,
> > +             struct kvm_gfn_range *range,
> > +             tdp_handler_t handler)
>
> Please burn all the Google3 from your brain, and code ;-)

I indented this way to avoid going past the 80 character limit. I've
adjusted it to be more like the other functions in this file. Perhaps
I should put `static __always_inline bool` on its own line?
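E.g., something like this (purely illustrative of the declaration
formatting I have in mind, not the final code):

static __always_inline bool
kvm_tdp_mmu_handle_gfn_lockless(struct kvm *kvm,
				struct kvm_gfn_range *range,
				tdp_handler_t handler)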
> > +{
> > +     struct kvm_mmu_page *root;
> > +     struct tdp_iter iter;
> > +     bool ret = false;
> > +
> > +     rcu_read_lock();
> > +
> > +     for_each_tdp_mmu_root_rcu(kvm, root, range->slot->as_id) {
> > +             tdp_root_for_each_leaf_pte(iter, root, range->start, range->end)
> > +                     ret |= handler(kvm, &iter, range);
> > +     }
> > +
> > +     rcu_read_unlock();
> > +
> > +     return ret;
> > +}
> > +
> >  /*
> >   * Mark the SPTEs range of GFNs [start, end) unaccessed and return non-zero
> >   * if any of the GFNs in the range have been accessed.
> > @@ -1237,28 +1272,30 @@ static bool age_gfn_range(struct kvm *kvm, struct tdp_iter *iter,
> >  {
> >       u64 new_spte;
> >
> > +retry:
> >       /* If we have a non-accessed entry we don't need to change the pte. */
> >       if (!is_accessed_spte(iter->old_spte))
> >               return false;
> >
> >       if (spte_ad_enabled(iter->old_spte)) {
> > -             iter->old_spte = tdp_mmu_clear_spte_bits(iter->sptep,
> > -                                                      iter->old_spte,
> > -                                                      shadow_accessed_mask,
> > -                                                      iter->level);
> > +             iter->old_spte = tdp_mmu_clear_spte_bits_atomic(iter->sptep,
> > +                                             shadow_accessed_mask);
> >               new_spte = iter->old_spte & ~shadow_accessed_mask;
> >       } else {
> > -             /*
> > -              * Capture the dirty status of the page, so that it doesn't get
> > -              * lost when the SPTE is marked for access tracking.
> > -              */
> > +             new_spte = mark_spte_for_access_track(iter->old_spte);
> > +             if (__tdp_mmu_set_spte_atomic(iter, new_spte)) {
> > +                     /*
> > +                      * The cmpxchg failed. If the spte is still a
> > +                      * last-level spte, we can safely retry.
> > +                      */
> > +                     if (is_shadow_present_pte(iter->old_spte) &&
> > +                         is_last_spte(iter->old_spte, iter->level))
> > +                             goto retry;
>
> Do we have a feel for how often conflicts actually happen?  I.e. is it worth
> retrying and having to worry about infinite loops, however improbable they
> may be?

I'm not sure how common this is. I think it's probably better not to
retry, actually. If the cmpxchg fails, this spte is probably young
anyway, so I can just `return true` instead of potentially retrying.
This is all best-effort anyway.
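Concretely, I'm thinking something like this for the else branch (just
a sketch, assuming __tdp_mmu_set_spte_atomic() still returns non-zero
when the cmpxchg fails):

	} else {
		new_spte = mark_spte_for_access_track(iter->old_spte);
		/*
		 * If the cmpxchg fails, some other CPU changed the SPTE
		 * under us, which almost certainly means the page was
		 * just accessed; report it as young instead of retrying.
		 */
		if (__tdp_mmu_set_spte_atomic(iter, new_spte))
			return true;
	}

That also keeps the lockless path free of any possible infinite loop.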