Date: Thu, 11 Apr 2024 10:08:44 -0700
From: David Matlack
To: James Houghton
Cc: Andrew Morton, Paolo Bonzini, Yu Zhao, Marc Zyngier, Oliver Upton,
 Sean Christopherson, Jonathan Corbet, James Morse, Suzuki K Poulose,
 Zenghui Yu, Catalin Marinas, Will Deacon, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Dave Hansen, "H. Peter Anvin", Steven Rostedt,
 Masami Hiramatsu, Mathieu Desnoyers, Shaoqin Huang, Gavin Shan,
 Ricardo Koller, Raghavendra Rao Ananta, Ryan Roberts, David Rientjes,
 Axel Rasmussen, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, linux-mm@kvack.org,
 linux-trace-kernel@vger.kernel.org
Subject: Re: [PATCH v3 5/7] KVM: x86: Participate in bitmap-based PTE aging
References: <20240401232946.1837665-1-jthoughton@google.com>
 <20240401232946.1837665-6-jthoughton@google.com>
In-Reply-To: <20240401232946.1837665-6-jthoughton@google.com>

On 2024-04-01 11:29 PM, James Houghton wrote:
> Only handle the TDP MMU case for now. In other cases, if a bitmap was
> not provided, fallback to the slowpath that takes mmu_lock, or, if a
> bitmap was provided, inform the caller that the bitmap is unreliable.
>
> Suggested-by: Yu Zhao
> Signed-off-by: James Houghton
> ---
>  arch/x86/include/asm/kvm_host.h | 14 ++++++++++++++
>  arch/x86/kvm/mmu/mmu.c          | 16 ++++++++++++++--
>  arch/x86/kvm/mmu/tdp_mmu.c      | 10 +++++++++-
>  3 files changed, 37 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 3b58e2306621..c30918d0887e 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -2324,4 +2324,18 @@ int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages);
>   */
>  #define KVM_EXIT_HYPERCALL_MBZ		GENMASK_ULL(31, 1)
>
> +#define kvm_arch_prepare_bitmap_age kvm_arch_prepare_bitmap_age
> +static inline bool kvm_arch_prepare_bitmap_age(struct mmu_notifier *mn)
> +{
> +	/*
> +	 * Indicate that we support bitmap-based aging when using the TDP MMU
> +	 * and the accessed bit is available in the TDP page tables.
> +	 *
> +	 * We have no other preparatory work to do here, so we do not need to
> +	 * redefine kvm_arch_finish_bitmap_age().
> +	 */
> +	return IS_ENABLED(CONFIG_X86_64) && tdp_mmu_enabled
> +			 && shadow_accessed_mask;
> +}
> +
>  #endif /* _ASM_X86_KVM_HOST_H */
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 992e651540e8..fae1a75750bb 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -1674,8 +1674,14 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
>  {
>  	bool young = false;
>
> -	if (kvm_memslots_have_rmaps(kvm))
> +	if (kvm_memslots_have_rmaps(kvm)) {
> +		if (range->lockless) {
> +			kvm_age_set_unreliable(range);
> +			return false;
> +		}

If a VM has the TDP MMU enabled, supports A/D bits, and is using nested
virtualization, MGLRU will effectively be blind to all accesses made by
the VM.

kvm_arch_prepare_bitmap_age() will return true, indicating that the
bitmap is supported. But then kvm_age_gfn() and kvm_test_age_gfn() will
return false immediately and indicate that the bitmap is unreliable,
because a shadow root is allocated. The notifier will then return
MMU_NOTIFIER_YOUNG_BITMAP_UNRELIABLE.

Looking at the callers, MMU_NOTIFIER_YOUNG_BITMAP_UNRELIABLE is never
consumed or used. So I think MGLRU will assume all memory is
unaccessed?

One way to improve the situation would be to reorder the code so the
TDP MMU function is called first, and to return young instead of false.
That way MGLRU at least has visibility into accesses made by L1 (and by
L2 if EPT is disabled in L2). But that still means MGLRU is blind to
accesses made by L2.

What about grabbing the mmu_lock if there's a shadow root allocated and
getting rid of MMU_NOTIFIER_YOUNG_BITMAP_UNRELIABLE altogether?

	if (kvm_memslots_have_rmaps(kvm)) {
		write_lock(&kvm->mmu_lock);
		young |= kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
		write_unlock(&kvm->mmu_lock);
	}

The TDP MMU walk would still be lockless. KVM only has to take the
mmu_lock to collect accesses made by L2.

kvm_age_rmap() and kvm_test_age_rmap() will need to become bitmap-aware
as well, but that seems relatively simple with the helper functions.