From: James Houghton
Date: Fri, 19 Apr 2024 14:48:00 -0700
Subject: Re: [PATCH v3 5/7] KVM: x86: Participate in bitmap-based PTE aging
To: David Matlack
Cc: Andrew Morton, Paolo Bonzini, Yu Zhao, Marc Zyngier, Oliver Upton,
 Sean Christopherson, Jonathan Corbet, James Morse, Suzuki K Poulose,
 Zenghui Yu, Catalin Marinas, Will Deacon, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Dave Hansen, "H. Peter Anvin", Steven Rostedt,
 Masami Hiramatsu, Mathieu Desnoyers, Shaoqin Huang, Gavin Shan,
 Ricardo Koller, Raghavendra Rao Ananta, Ryan Roberts, David Rientjes,
 Axel Rasmussen, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, linux-mm@kvack.org,
 linux-trace-kernel@vger.kernel.org
References: <20240401232946.1837665-1-jthoughton@google.com>
 <20240401232946.1837665-6-jthoughton@google.com>

On Fri, Apr 19, 2024 at 2:07 PM David Matlack wrote:
>
> On 2024-04-19 01:47 PM, James Houghton wrote:
> > On Thu, Apr 11, 2024 at 10:28 AM David Matlack wrote:
> > > On 2024-04-11 10:08 AM, David Matlack wrote:
> > > bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> > > {
> > >         bool young = false;
> > >
> > >         if (!range->arg.metadata->bitmap && kvm_memslots_have_rmaps(kvm))
> > >                 young = kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
> > >
> > >         if (tdp_mmu_enabled)
> > >                 young |= kvm_tdp_mmu_age_gfn_range(kvm, range);
> > >
> > >         return young;
> > > }
> > >
> > > bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> > > {
> > >         bool young = false;
> > >
> > >         if (!range->arg.metadata->bitmap && kvm_memslots_have_rmaps(kvm))
> > >                 young = kvm_handle_gfn_range(kvm, range, kvm_test_age_rmap);
> > >
> > >         if (tdp_mmu_enabled)
> > >                 young |= kvm_tdp_mmu_test_age_gfn(kvm, range);
> > >
> > >         return young;
> > >
> >
> > Yeah I think this is the right thing to do. Given your other
> > suggestions (on patch 3), I think this will look something like this
> > -- let me know if I've misunderstood something:
> >
> > bool check_rmap = !bitmap && kvm_memslots_have_rmaps(kvm);
> >
> > if (check_rmap)
> >         KVM_MMU_LOCK(kvm);
> >
> > rcu_read_lock(); // perhaps only do this when we don't take the MMU lock?
> >
> > if (check_rmap)
> >         kvm_handle_gfn_range(/* ... */ kvm_test_age_rmap)
> >
> > if (tdp_mmu_enabled)
> >         kvm_tdp_mmu_test_age_gfn() // modified to be RCU-safe
> >
> > rcu_read_unlock();
> > if (check_rmap)
> >         KVM_MMU_UNLOCK(kvm);
>
> I was thinking a little differently.
> If you follow my suggestion to first
> make the TDP MMU aging lockless, you'll end up with something like this
> prior to adding bitmap support (note: the comments are just for
> demonstrative purposes):
>
> bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> {
>         bool young = false;
>
>         /* Shadow MMU aging holds write-lock. */
>         if (kvm_memslots_have_rmaps(kvm)) {
>                 write_lock(&kvm->mmu_lock);
>                 young = kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
>                 write_unlock(&kvm->mmu_lock);
>         }
>
>         /* TDP MMU aging is lockless. */
>         if (tdp_mmu_enabled)
>                 young |= kvm_tdp_mmu_age_gfn_range(kvm, range);
>
>         return young;
> }
>
> Then when you add bitmap support it would look something like this:
>
> bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> {
>         unsigned long *bitmap = range->arg.metadata->bitmap;
>         bool young = false;
>
>         /* Shadow MMU aging holds write-lock and does not support bitmap. */
>         if (kvm_memslots_have_rmaps(kvm) && !bitmap) {
>                 write_lock(&kvm->mmu_lock);
>                 young = kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
>                 write_unlock(&kvm->mmu_lock);
>         }
>
>         /* TDP MMU aging is lockless and supports bitmap. */
>         if (tdp_mmu_enabled)
>                 young |= kvm_tdp_mmu_age_gfn_range(kvm, range);
>
>         return young;
> }
>
> rcu_read_lock/unlock() would be called in kvm_tdp_mmu_age_gfn_range().

Oh yes this is a lot better. I hope I would have seen this when it
came time to actually update this patch. Thanks.

>
> That brings up a question I've been wondering about. If KVM only
> advertises support for the bitmap lookaround when shadow roots are not
> allocated, does that mean MGLRU will be blind to accesses made by L2
> when nested virtualization is enabled? And does that mean the Linux MM
> will think all L2 memory is cold (i.e. good candidate for swapping)
> because it isn't seeing accesses made by L2?

Yes, I think so (for both questions).
That's better than KVM not participating in MGLRU aging at all, which is
the case today (IIUC -- also ignoring the case where KVM accesses guest
memory directly). We could have MGLRU always invoke the mmu notifiers,
but frequently taking the MMU lock for writing might be worse than
evicting when we shouldn't. Maybe Yu tried this at some point, but I
can't find any results for this.