linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: James Houghton <jthoughton@google.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Paolo Bonzini <pbonzini@redhat.com>
Cc: Yu Zhao <yuzhao@google.com>, David Matlack <dmatlack@google.com>,
	 Marc Zyngier <maz@kernel.org>,
	Oliver Upton <oliver.upton@linux.dev>,
	 Sean Christopherson <seanjc@google.com>,
	Jonathan Corbet <corbet@lwn.net>,
	James Morse <james.morse@arm.com>,
	 Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	 Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	 Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	 Dave Hansen <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	 Steven Rostedt <rostedt@goodmis.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Shaoqin Huang <shahuang@redhat.com>,
	 Gavin Shan <gshan@redhat.com>,
	Ricardo Koller <ricarkol@google.com>,
	 Raghavendra Rao Ananta <rananta@google.com>,
	Ryan Roberts <ryan.roberts@arm.com>,
	 David Rientjes <rientjes@google.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	 linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,  kvmarm@lists.linux.dev,
	kvm@vger.kernel.org, linux-mm@kvack.org,
	 linux-trace-kernel@vger.kernel.org,
	James Houghton <jthoughton@google.com>
Subject: [PATCH v3 6/7] KVM: arm64: Participate in bitmap-based PTE aging
Date: Mon,  1 Apr 2024 23:29:45 +0000	[thread overview]
Message-ID: <20240401232946.1837665-7-jthoughton@google.com> (raw)
In-Reply-To: <20240401232946.1837665-1-jthoughton@google.com>

Participate in bitmap-based aging while grabbing the KVM MMU lock for
reading. Ideally we wouldn't need to grab this lock at all, but that
would require a more intrustive and risky change. Also pass
KVM_PGTABLE_WALK_SHARED, as this software walker is safe to run in
parallel with other walkers.

It is safe only to grab the KVM MMU lock for reading as the kvm_pgtable
is destroyed while holding the lock for writing, and freeing of the page
table pages is either done while holding the MMU lock for writing or
after an RCU grace period.

When mkold == false, record the young pages in the passed-in bitmap.

When mkold == true, only age the pages that need aging according to the
passed-in bitmap.

Suggested-by: Yu Zhao <yuzhao@google.com>
Signed-off-by: James Houghton <jthoughton@google.com>
---
 arch/arm64/include/asm/kvm_host.h    |  5 +++++
 arch/arm64/include/asm/kvm_pgtable.h |  4 +++-
 arch/arm64/kvm/hyp/pgtable.c         | 21 ++++++++++++++-------
 arch/arm64/kvm/mmu.c                 | 23 +++++++++++++++++++++--
 4 files changed, 43 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 9e8a496fb284..e503553cb356 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1331,4 +1331,9 @@ bool kvm_arm_vcpu_stopped(struct kvm_vcpu *vcpu);
 	(get_idreg_field((kvm), id, fld) >= expand_field_sign(id, fld, min) && \
 	 get_idreg_field((kvm), id, fld) <= expand_field_sign(id, fld, max))
 
+#define kvm_arch_prepare_bitmap_age kvm_arch_prepare_bitmap_age
+bool kvm_arch_prepare_bitmap_age(struct mmu_notifier *mn);
+#define kvm_arch_finish_bitmap_age kvm_arch_finish_bitmap_age
+void kvm_arch_finish_bitmap_age(struct mmu_notifier *mn);
+
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 19278dfe7978..1976b4e26188 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -644,6 +644,7 @@ kvm_pte_t kvm_pgtable_stage2_mkyoung(struct kvm_pgtable *pgt, u64 addr);
  * @addr:	Intermediate physical address to identify the page-table entry.
  * @size:	Size of the address range to visit.
  * @mkold:	True if the access flag should be cleared.
+ * @range:	The kvm_gfn_range that is being used for the memslot walker.
  *
  * The offset of @addr within a page is ignored.
  *
@@ -657,7 +658,8 @@ kvm_pte_t kvm_pgtable_stage2_mkyoung(struct kvm_pgtable *pgt, u64 addr);
  * Return: True if any of the visited PTEs had the access flag set.
  */
 bool kvm_pgtable_stage2_test_clear_young(struct kvm_pgtable *pgt, u64 addr,
-					 u64 size, bool mkold);
+					 u64 size, bool mkold,
+					 struct kvm_gfn_range *range);
 
 /**
  * kvm_pgtable_stage2_relax_perms() - Relax the permissions enforced by a
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 3fae5830f8d2..e881d3595aca 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -1281,6 +1281,7 @@ kvm_pte_t kvm_pgtable_stage2_mkyoung(struct kvm_pgtable *pgt, u64 addr)
 }
 
 struct stage2_age_data {
+	struct kvm_gfn_range *range;
 	bool	mkold;
 	bool	young;
 };
@@ -1290,20 +1291,24 @@ static int stage2_age_walker(const struct kvm_pgtable_visit_ctx *ctx,
 {
 	kvm_pte_t new = ctx->old & ~KVM_PTE_LEAF_ATTR_LO_S2_AF;
 	struct stage2_age_data *data = ctx->arg;
+	gfn_t gfn = ctx->addr / PAGE_SIZE;
 
 	if (!kvm_pte_valid(ctx->old) || new == ctx->old)
 		return 0;
 
 	data->young = true;
 
+
 	/*
-	 * stage2_age_walker() is always called while holding the MMU lock for
-	 * write, so this will always succeed. Nonetheless, this deliberately
-	 * follows the race detection pattern of the other stage-2 walkers in
-	 * case the locking mechanics of the MMU notifiers is ever changed.
+	 * stage2_age_walker() may not be holding the MMU lock for write, so
+	 * follow the race detection pattern of the other stage-2 walkers.
 	 */
-	if (data->mkold && !stage2_try_set_pte(ctx, new))
-		return -EAGAIN;
+	if (data->mkold) {
+		if (kvm_gfn_should_age(data->range, gfn) &&
+				!stage2_try_set_pte(ctx, new))
+			return -EAGAIN;
+	} else
+		kvm_gfn_record_young(data->range, gfn);
 
 	/*
 	 * "But where's the TLBI?!", you scream.
@@ -1315,10 +1320,12 @@ static int stage2_age_walker(const struct kvm_pgtable_visit_ctx *ctx,
 }
 
 bool kvm_pgtable_stage2_test_clear_young(struct kvm_pgtable *pgt, u64 addr,
-					 u64 size, bool mkold)
+					 u64 size, bool mkold,
+					 struct kvm_gfn_range *range)
 {
 	struct stage2_age_data data = {
 		.mkold		= mkold,
+		.range		= range,
 	};
 	struct kvm_pgtable_walker walker = {
 		.cb		= stage2_age_walker,
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 18680771cdb0..104cc23e9bb3 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1802,6 +1802,25 @@ bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 	return false;
 }
 
+bool kvm_arch_prepare_bitmap_age(struct mmu_notifier *mn)
+{
+	struct kvm *kvm = mmu_notifier_to_kvm(mn);
+
+	/*
+	 * We need to hold the MMU lock for reading to prevent page tables
+	 * from being freed underneath us.
+	 */
+	read_lock(&kvm->mmu_lock);
+	return true;
+}
+
+void kvm_arch_finish_bitmap_age(struct mmu_notifier *mn)
+{
+	struct kvm *kvm = mmu_notifier_to_kvm(mn);
+
+	read_unlock(&kvm->mmu_lock);
+}
+
 bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 {
 	u64 size = (range->end - range->start) << PAGE_SHIFT;
@@ -1811,7 +1830,7 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 
 	return kvm_pgtable_stage2_test_clear_young(kvm->arch.mmu.pgt,
 						   range->start << PAGE_SHIFT,
-						   size, true);
+						   size, true, range);
 }
 
 bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
@@ -1823,7 +1842,7 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 
 	return kvm_pgtable_stage2_test_clear_young(kvm->arch.mmu.pgt,
 						   range->start << PAGE_SHIFT,
-						   size, false);
+						   size, false, range);
 }
 
 phys_addr_t kvm_mmu_get_httbr(void)
-- 
2.44.0.478.gd926399ef9-goog



  parent reply	other threads:[~2024-04-01 23:30 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-01 23:29 [PATCH v3 0/7] mm/kvm: Improve parallelism for access bit harvesting James Houghton
2024-04-01 23:29 ` [PATCH v3 1/7] mm: Add a bitmap into mmu_notifier_{clear,test}_young James Houghton
2024-04-04 18:52   ` David Hildenbrand
2024-04-09 18:31     ` James Houghton
2024-04-09 19:35       ` David Hildenbrand
2024-04-11  0:35         ` James Houghton
2024-04-12 18:45   ` David Matlack
2024-04-19 20:34     ` James Houghton
2024-04-01 23:29 ` [PATCH v3 2/7] KVM: Move MMU notifier function declarations James Houghton
2024-04-01 23:29 ` [PATCH v3 3/7] KVM: Add basic bitmap support into kvm_mmu_notifier_test/clear_young James Houghton
2024-04-12 20:28   ` David Matlack
2024-04-19 20:41     ` James Houghton
2024-04-01 23:29 ` [PATCH v3 4/7] KVM: x86: Move tdp_mmu_enabled and shadow_accessed_mask James Houghton
2024-04-01 23:29 ` [PATCH v3 5/7] KVM: x86: Participate in bitmap-based PTE aging James Houghton
2024-04-11 17:08   ` David Matlack
2024-04-11 17:28     ` David Matlack
2024-04-11 18:00       ` David Matlack
2024-04-11 18:07         ` David Matlack
2024-04-19 20:47       ` James Houghton
2024-04-19 21:06         ` David Matlack
2024-04-19 21:48           ` James Houghton
2024-04-21  0:19             ` Yu Zhao
2024-04-12 20:44   ` David Matlack
2024-04-19 20:54     ` James Houghton
2024-04-01 23:29 ` James Houghton [this message]
2024-04-02  4:06   ` [PATCH v3 6/7] KVM: arm64: " Yu Zhao
2024-04-02  7:00     ` Oliver Upton
2024-04-02  7:33     ` Marc Zyngier
2024-04-01 23:29 ` [PATCH v3 7/7] mm: multi-gen LRU: use mmu_notifier_test_clear_young() James Houghton
2024-04-12 18:41 ` [PATCH v3 0/7] mm/kvm: Improve parallelism for access bit harvesting David Matlack
2024-04-19 20:57   ` James Houghton
2024-04-19 22:23     ` Oliver Upton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240401232946.1837665-7-jthoughton@google.com \
    --to=jthoughton@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=axelrasmussen@google.com \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=dmatlack@google.com \
    --cc=gshan@redhat.com \
    --cc=hpa@zytor.com \
    --cc=james.morse@arm.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=maz@kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=mingo@redhat.com \
    --cc=oliver.upton@linux.dev \
    --cc=pbonzini@redhat.com \
    --cc=rananta@google.com \
    --cc=ricarkol@google.com \
    --cc=rientjes@google.com \
    --cc=rostedt@goodmis.org \
    --cc=ryan.roberts@arm.com \
    --cc=seanjc@google.com \
    --cc=shahuang@redhat.com \
    --cc=suzuki.poulose@arm.com \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    --cc=yuzenghui@huawei.com \
    --cc=yuzhao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox