From: Yu Zhao
Date: Sat, 20 Apr 2024 18:19:28 -0600
Subject: Re: [PATCH v3 5/7] KVM: x86: Participate in bitmap-based PTE aging
To: James Houghton
Cc: David Matlack, Andrew Morton, Paolo Bonzini, Marc Zyngier,
 Oliver Upton, Sean Christopherson, Jonathan Corbet, James Morse,
 Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
 Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 "H. Peter Anvin", Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
 Shaoqin Huang, Gavin Shan, Ricardo Koller, Raghavendra Rao Ananta,
 Ryan Roberts, David Rientjes, Axel Rasmussen,
 linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-mm@kvack.org,
 linux-trace-kernel@vger.kernel.org
References: <20240401232946.1837665-1-jthoughton@google.com>
 <20240401232946.1837665-6-jthoughton@google.com>

On Fri, Apr 19, 2024 at 3:48 PM James Houghton wrote:
>
> On Fri, Apr 19, 2024 at 2:07 PM David Matlack wrote:
> >
> > On 2024-04-19 01:47 PM, James Houghton wrote:
> > > On Thu, Apr 11, 2024 at 10:28 AM David Matlack wrote:
> > > > On 2024-04-11 10:08 AM, David Matlack wrote:
> > > > bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> > > > {
> > > >         bool young = false;
> > > >
> > > >         if (!range->arg.metadata->bitmap && kvm_memslots_have_rmaps(kvm))
> > > >                 young = kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
> > > >
> > > >         if (tdp_mmu_enabled)
> > > >                 young |= kvm_tdp_mmu_age_gfn_range(kvm, range);
> > > >
> > > >         return young;
> > > > }
> > > >
> > > > bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> > > > {
> > > >         bool young = false;
> > > >
> > > >         if (!range->arg.metadata->bitmap && kvm_memslots_have_rmaps(kvm))
> > > >                 young = kvm_handle_gfn_range(kvm, range, kvm_test_age_rmap);
> > > >
> > > >         if (tdp_mmu_enabled)
> > > >                 young |= kvm_tdp_mmu_test_age_gfn(kvm, range);
> > > >
> > > >         return young;
> > > >
> > >
> > >
> > > Yeah I think this is the right thing to do. Given your other
> > > suggestions (on patch 3), I think this will look something like this
> > > -- let me know if I've misunderstood something:
> > >
> > > bool check_rmap = !bitmap && kvm_memslots_have_rmaps(kvm);
> > >
> > > if (check_rmap)
> > >         KVM_MMU_LOCK(kvm);
> > >
> > > rcu_read_lock(); // perhaps only do this when we don't take the MMU lock?
> > >
> > > if (check_rmap)
> > >         kvm_handle_gfn_range(/* ... */ kvm_test_age_rmap)
> > >
> > > if (tdp_mmu_enabled)
> > >         kvm_tdp_mmu_test_age_gfn() // modified to be RCU-safe
> > >
> > > rcu_read_unlock();
> > > if (check_rmap)
> > >         KVM_MMU_UNLOCK(kvm);
> >
> > I was thinking a little different. If you follow my suggestion to first
> > make the TDP MMU aging lockless, you'll end up with something like this
> > prior to adding bitmap support (note: the comments are just for
> > demonstrative purposes):
> >
> > bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> > {
> >         bool young = false;
> >
> >         /* Shadow MMU aging holds write-lock. */
> >         if (kvm_memslots_have_rmaps(kvm)) {
> >                 write_lock(&kvm->mmu_lock);
> >                 young = kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
> >                 write_unlock(&kvm->mmu_lock);
> >         }
> >
> >         /* TDP MMU aging is lockless. */
> >         if (tdp_mmu_enabled)
> >                 young |= kvm_tdp_mmu_age_gfn_range(kvm, range);
> >
> >         return young;
> > }
> >
> > Then when you add bitmap support it would look something like this:
> >
> > bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> > {
> >         unsigned long *bitmap = range->arg.metadata->bitmap;
> >         bool young = false;
> >
> >         /* Shadow MMU aging holds write-lock and does not support bitmap. */
> >         if (kvm_memslots_have_rmaps(kvm) && !bitmap) {
> >                 write_lock(&kvm->mmu_lock);
> >                 young = kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
> >                 write_unlock(&kvm->mmu_lock);
> >         }
> >
> >         /* TDP MMU aging is lockless and supports bitmap. */
> >         if (tdp_mmu_enabled)
> >                 young |= kvm_tdp_mmu_age_gfn_range(kvm, range);
> >
> >         return young;
> > }
> >
> > rcu_read_lock/unlock() would be called in kvm_tdp_mmu_age_gfn_range().
>
> Oh yes this is a lot better. I hope I would have seen this when it
> came time to actually update this patch. Thanks.
>
> >
> > That brings up a question I've been wondering about. If KVM only
> > advertises support for the bitmap lookaround when shadow roots are not
> > allocated, does that mean MGLRU will be blind to accesses made by L2
> > when nested virtualization is enabled? And does that mean the Linux MM
> > will think all L2 memory is cold (i.e. good candidate for swapping)
> > because it isn't seeing accesses made by L2?
>
> Yes, I think so (for both questions). That's better than KVM not
> participating in MGLRU aging at all, which is the case today (IIUC --
> also ignoring the case where KVM accesses guest memory directly). We
> could have MGLRU always invoke the mmu notifiers, but frequently
> taking the MMU lock for writing might be worse than evicting when we
> shouldn't. Maybe Yu tried this at some point, but I can't find any
> results for this.

No, in this case only the fast path (page table scanning) is disabled.
MGLRU still sees the A-bit from L2 using the rmap, i.e., the slow path
calling folio_check_references().
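
To make the fast-path/slow-path distinction concrete, here is a minimal
userspace sketch in C. It is a toy model under stated assumptions, not
kernel code: struct toy_vm, toy_fast_path_supported(), toy_age_gfn() and
toy_mglru_age() are hypothetical names, and the booleans merely stand in
for kvm_memslots_have_rmaps(), tdp_mmu_enabled and the accessed bits. It
assumes the caller falls back to a non-bitmap (rmap) request whenever the
bitmap fast path is not advertised.

```c
/* Toy userspace model; every name below is hypothetical, not a KVM API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_vm {
	bool has_shadow_roots;  /* stands in for kvm_memslots_have_rmaps() */
	bool tdp_mmu_enabled;   /* stands in for tdp_mmu_enabled */
	bool shadow_accessed;   /* A-bit reachable only via rmaps (e.g. L2) */
	bool tdp_accessed;      /* A-bit in the TDP MMU page tables */
	int write_lock_count;   /* times the modeled MMU write lock was taken */
};

/* Assumption: the bitmap fast path is offered only without shadow roots. */
static bool toy_fast_path_supported(const struct toy_vm *vm)
{
	return vm->tdp_mmu_enabled && !vm->has_shadow_roots;
}

/* Same shape as the kvm_age_gfn() proposal quoted above: test and clear. */
static bool toy_age_gfn(struct toy_vm *vm, bool have_bitmap)
{
	bool young = false;

	assert(vm != NULL);

	/* Shadow MMU aging takes the write lock and ignores bitmap requests. */
	if (vm->has_shadow_roots && !have_bitmap) {
		vm->write_lock_count++;
		young = vm->shadow_accessed;
		vm->shadow_accessed = false;
	}

	/* TDP MMU aging is lockless in this model. */
	if (vm->tdp_mmu_enabled) {
		young |= vm->tdp_accessed;
		vm->tdp_accessed = false;
	}

	return young;
}

/* Caller side: use the fast path if advertised, else the rmap slow path. */
static bool toy_mglru_age(struct toy_vm *vm)
{
	return toy_age_gfn(vm, toy_fast_path_supported(vm));
}
```

In this model, a VM with shadow roots and shadow_accessed set still
reports young through toy_mglru_age(), at the cost of one write-lock
acquisition. Calling toy_age_gfn() with have_bitmap set on the same VM
would miss that access, which is the blind spot the fast path would have
if it were advertised while shadow roots exist.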