From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28591EB64D7 for ; Tue, 20 Jun 2023 21:18:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A8F358D0002; Tue, 20 Jun 2023 17:18:50 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A3EE58D0001; Tue, 20 Jun 2023 17:18:50 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 906428D0002; Tue, 20 Jun 2023 17:18:50 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 7DA458D0001 for ; Tue, 20 Jun 2023 17:18:50 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 499171205EB for ; Tue, 20 Jun 2023 21:18:50 +0000 (UTC) X-FDA: 80924390820.14.22A2815 Received: from mail-oa1-f52.google.com (mail-oa1-f52.google.com [209.85.160.52]) by imf18.hostedemail.com (Postfix) with ESMTP id 6921D1C0015 for ; Tue, 20 Jun 2023 21:18:48 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=gmail.com header.s=20221208 header.b=jYUxatN9; spf=pass (imf18.hostedemail.com: domain of isaku.yamahata@gmail.com designates 209.85.160.52 as permitted sender) smtp.mailfrom=isaku.yamahata@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1687295928; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=pPIrkDOEnetRxudGIQqwVRWbtrDT/RDlBKZ80849Peg=; b=gyePndlGpXT7bLrCRd2N8oGy289SgDYBLu7+gXDBawmTVxpRUy9WQbT04ywwdwgrvS4xRP vk6lHPTqfLiqKhhvdxzaXMLKj0AXHoMUWIGe9oBzVPd75QpzzkGnG8OiW7nFfM5qWJvfQS 0TyU3BWTbwyLMlcWkUBHY0TgqJCr8m0= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1687295928; a=rsa-sha256; cv=none; b=d9KFhch4yk+RTZn2XpeImLOe014jSQBrprDFbIJ0CQS7DKsJcMqRzh5FtODIIyycClGIBn HsTfvIaekAvwBY3+s+kohhjg25N/u/Cb+eK/QLOe0cMOUCbW8ZJL8zrzMygpIqY+6PdSkB wcdZEWHpmvR8NrrFWS7y3J/QUCburQg= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=pass header.d=gmail.com header.s=20221208 header.b=jYUxatN9; spf=pass (imf18.hostedemail.com: domain of isaku.yamahata@gmail.com designates 209.85.160.52 as permitted sender) smtp.mailfrom=isaku.yamahata@gmail.com; dmarc=pass (policy=none) header.from=gmail.com Received: by mail-oa1-f52.google.com with SMTP id 586e51a60fabf-1a98a7fde3fso4923968fac.3 for ; Tue, 20 Jun 2023 14:18:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1687295927; x=1689887927; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=pPIrkDOEnetRxudGIQqwVRWbtrDT/RDlBKZ80849Peg=; b=jYUxatN9gZBGTUs0XIxhoKSn1D5lt5Y5Vxr5pBmpTD1cGve3wFw3YrEJd1ECpInXH8 9gPsSUYOVuUc9i0Kah4UzDCbB/pFdnB47txwId6sW6GwBrUzT4vDUimtQmIO/XI4xuqp NlJz8VrScTNqLft0nR6MYiSMpTkf8srTxaYuThtjytTPBqt6XNGURjV7pA0JHDQkg/v/ SsQZJrIfsWqTLEBF0CNxMCDzWSuR3ncxGKx1HJEDdpM38eCtXXI14KOsZVydct8SwggE 2Xbs7OJRYrqGD/0rBBj8kz7R1QdSRY+1A9vq/M8zFNga2fFJBkeo1w9T5JOsT0zPQXMQ O7gQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687295927; x=1689887927; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=pPIrkDOEnetRxudGIQqwVRWbtrDT/RDlBKZ80849Peg=; b=ZcgdMqP9LJRwfdFl7Fv2WU2x0lsgcKFp2aRszo9C1VVTEaxx/7iIJHegt1pyu4zPGz Z39Ia/o0buR5nqhuEfKJSm1VVLXCVRYPif1X59z4LFdqA2b7cQvC9PO9RssbgQircBSh Y+ErynXTbeB3keMBxUKLn0fLqwDiKS58UaNFGYXJ9+98U9529OQ7idS6YdJVJDtjhWsJ crL/TnzMTU2tqjA7S1Qi+7ExKIBnH57C75GYsffFgqClSFuU+YumEuJAnsd2knU0n6xz LGLGfx/vRVzPFjX/EYsyia5uwUdq1p84Hcr1sw8yykwuIiT45G2s/gEHY1QrmyBwDX2m ROpQ== X-Gm-Message-State: AC+VfDwo+/X1fFn4UkgWxSTDiHtG9MRF33V9QtkQR0izAHDmJiLcvjsJ JXSNXWYqmDqoEpCl2CSk2NE= X-Google-Smtp-Source: ACHHUZ5IJr0CXOlTtOybC0Ytm0ge2PHofa8ddUm0IVeq3u7zpfVE/i4EfefH19Y8S38FSxTm/LLAPw== X-Received: by 2002:a05:6870:ab84:b0:1a3:67a6:4676 with SMTP id gs4-20020a056870ab8400b001a367a64676mr13548407oab.29.1687295927231; Tue, 20 Jun 2023 14:18:47 -0700 (PDT) Received: from localhost ([192.55.54.50]) by smtp.gmail.com with ESMTPSA id fs9-20020a17090af28900b00256504e0937sm7763908pjb.34.2023.06.20.14.18.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jun 2023 14:18:46 -0700 (PDT) Date: Tue, 20 Jun 2023 14:18:45 -0700 From: Isaku Yamahata To: Michael Roth Cc: Isaku Yamahata , kvm@vger.kernel.org, linux-coco@lists.linux.dev, linux-mm@kvack.org, linux-crypto@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, jroedel@suse.de, thomas.lendacky@amd.com, hpa@zytor.com, ardb@kernel.org, pbonzini@redhat.com, seanjc@google.com, vkuznets@redhat.com, jmattson@google.com, luto@kernel.org, dave.hansen@linux.intel.com, slp@redhat.com, pgonda@google.com, peterz@infradead.org, srinivas.pandruvada@linux.intel.com, rientjes@google.com, dovmurik@linux.ibm.com, tobin@ibm.com, bp@alien8.de, vbabka@suse.cz, kirill@shutemov.name, ak@linux.intel.com, tony.luck@intel.com, marcorr@google.com, sathyanarayanan.kuppuswamy@linux.intel.com, alpergun@google.com, dgilbert@redhat.com, jarkko@kernel.org, ashish.kalra@amd.com, nikunj.dadhania@amd.com, liam.merwick@oracle.com, zhi.a.wang@intel.com Subject: Re: [PATCH RFC v9 04/51] KVM: x86: Determine shared/private faults using a configurable mask Message-ID: <20230620211845.GV2244082@ls.amr.corp.intel.com> References: <20230612042559.375660-1-michael.roth@amd.com> <20230612042559.375660-5-michael.roth@amd.com> <20230614164709.GT2244082@ls.amr.corp.intel.com> <20230620202841.7qizls3u3kcck45g@amd.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20230620202841.7qizls3u3kcck45g@amd.com> X-Rspamd-Queue-Id: 6921D1C0015 X-Rspam-User: X-Rspamd-Server: rspam11 X-Stat-Signature: 7n41kxepbknnk9fdjpoopjucxyzbp5uh X-HE-Tag: 1687295928-566324 X-HE-Meta: U2FsdGVkX18VJ7GRNqGQKSkNnBtv6VWtRriKIQEyJgwxvq60KI3EEgM1kggEutZdy40SmkcUNK/1EU8pi0I30aIrGn+nSrB2p149b7mERi0yZUKNr+45/cUyo7czH7uAeVWrOj4MtpIdK4rJVua5LdspxRbUwuiqQpTs2SMW3NrdESHy6YE4OtY77t1a16R3v7YlG12z6w1HI0QYOVzf5dE+oJAgi3x0vId0sMfppspqGqEmkA46CNPZ5mupkOqci4Ewwwt3dZNX5/ZD0TT1daSIoKwQEZdz9dPsKkKwYAns7NyNFgdcXjL0TEo04Zp2iIiInn0mBdM9xsTMLPbT9Vz01MixGaxgD961QoITr+YF04dCfE5TXtv7hpRfjPqUOnDbyIkVIKgOrrxn9LY5k74UmakvD2Cr913UWQ4GegYRSW0e/AqyuFI7llt0nBQJBK0S90WX7gL2/bzyHhttdRV5FXiTt+r306zFrktWkokjHE7FBpv+1+inDsFc3POt6Hvmzg4yvHPS3+X37RQiC/gMHTYBvSw2HpY+NOFLyfI9RVu7Cy5clJ73Qb3DvVhtnOtV0ptumCuMGh+zi4vn56tSNt8eTeIQ1juysPqX2/HS1I091NAGn5yEmNhlK09GdecKXBQUUa8l/V8EgPOhZfWno6AVNPHK9J63MqIIJ1BSHaht4ZIkglrjd3YhP52gtN2RApi+kJXwqYf5iVCSqwJg+ObAzxaVYjkF88b9qWCGCENpetgKEo+ippbmyA3HJKaSUPyU5+t79Z+GnKFwr3WXYhCKvgdC3MpQaOSdFGLXt86OBsUz9y1zVPF3x150wtb9gftjR4ToPzN8y5K2ps2ZeMIIZs83o9q7dRUkfvqvNJQ4ovnuFanmlplR6GrydatfD2hxJai39fo5m6eoV7E3fgZc6z36vmZTF6r1bjX+zhSuXVL0OsvvQIN8FV8llUX5uidCx4+QvMiyV6g 09Mg6al6 h2ShnXoOf19xfWqK+qIPKHl0LQb8KrxoF02ECE4LiNHW0lIJ/wDLbzHY1GSwJIgvu/LSlDWKXCs8et8gJbu1SpLEqjNXmhmKLvH7Lci716exLEbvWnyoiWxZg++jHa0a8DP04lRTJMkWxpE2PyUbbhh45fNqHZAWw/nvzfXVe6h5GGHEFATaOPDyT9jneqA28IR5Lz25OI1d9oaPbwCFmNScCbFFBTth/rdZEmHQwvvm2WMhanmcVfGiHXdtSw0hFXX4wkCU9Q7d4MW/sOy4UvZB87E3dC1iaWOdVnhsEhTQBzpoY65Puo2SmZACjvh1EWtQ/SkOg6fl3wGhSRAEqYKWHKAQ9/Yoe/tdmzl190hEk/EsXQgIBJYHqCq4MGkuMVpGPIrXOrkCBuwnIC6QWH1+J5o8JxyWGqheHEu7ZRudsKNSa6DLjmyotYqTMluZgjkF5ckF1hQ7qcn95Wi7SkFVuU1u6zWtxxSaBRvdEh2dY7XVongwcNJk6LblOXzhctHRM X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, Jun 20, 2023 at 03:28:41PM -0500, Michael Roth wrote: > On Wed, Jun 14, 2023 at 09:47:09AM -0700, Isaku Yamahata wrote: > > On Sun, Jun 11, 2023 at 11:25:12PM -0500, > > Michael Roth wrote: > > > > > This will be used to determine whether or not an #NPF should be serviced > > > using a normal page vs. a guarded/gmem one. > > > > > > Signed-off-by: Michael Roth > > > --- > > > arch/x86/include/asm/kvm_host.h | 7 +++++++ > > > arch/x86/kvm/mmu/mmu_internal.h | 35 ++++++++++++++++++++++++++++++++- > > > 2 files changed, 41 insertions(+), 1 deletion(-) > > > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > > > index b3bd24f2a390..c26f76641121 100644 > > > --- a/arch/x86/include/asm/kvm_host.h > > > +++ b/arch/x86/include/asm/kvm_host.h > > > @@ -1445,6 +1445,13 @@ struct kvm_arch { > > > */ > > > #define SPLIT_DESC_CACHE_MIN_NR_OBJECTS (SPTE_ENT_PER_PAGE + 1) > > > struct kvm_mmu_memory_cache split_desc_cache; > > > + > > > + /* > > > + * When set, used to determine whether a fault should be treated as > > > + * private in the context of protected VMs which use a separate gmem > > > + * pool to back private guest pages. > > > + */ > > > + u64 mmu_private_fault_mask; > > > }; > > > > > > struct kvm_vm_stat { > > > diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h > > > index 780b91e1da9f..9b9e75aa43f4 100644 > > > --- a/arch/x86/kvm/mmu/mmu_internal.h > > > +++ b/arch/x86/kvm/mmu/mmu_internal.h > > > @@ -252,6 +252,39 @@ struct kvm_page_fault { > > > > > > int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault); > > > > > > +static bool kvm_mmu_fault_is_private(struct kvm *kvm, gpa_t gpa, u64 err) > > > +{ > > > + struct kvm_memory_slot *slot; > > > + bool private_fault = false; > > > + gfn_t gfn = gpa_to_gfn(gpa); > > > + > > > + slot = gfn_to_memslot(kvm, gfn); > > > + if (!slot) { > > > + pr_debug("%s: no slot, GFN: 0x%llx\n", __func__, gfn); > > > + goto out; > > > + } > > > + > > > + if (!kvm_slot_can_be_private(slot)) { > > > + pr_debug("%s: slot is not private, GFN: 0x%llx\n", __func__, gfn); > > > + goto out; > > > + } > > > + > > > + if (kvm->arch.mmu_private_fault_mask) { > > > + private_fault = !!(err & kvm->arch.mmu_private_fault_mask); > > > + goto out; > > > + } > > > > What's the convention of err? Can we abstract it by introducing a new bit > > PFERR_PRIVATE_MASK? The caller sets it based on arch specific value. > > the logic will be > > .is_private = err & PFERR_PRIVATE_MASK; > > I'm not sure I understand the question. 'err' is just the page fault flags, > and arch.mmu_private_fault_mask is something that can be set on a > per-platform basis when running in a mode where shared/private access > is recorded in the page fault flags during a #NPF. > > I'm not sure how we'd keep the handling cross-platform by moving to a macro, > since TDX uses a different bit, and we'd want to be able to build a > SNP+TDX kernel that could run on either type of hardware. > > Are you suggesting to reverse that and have err be set in a platform-specific > way and then use a common PFERR_PRIVATE_MASK that's software-defined and > consistent across platforms? That could work, but existing handling seems > to use page fault flags as-is, keeping the hardware-set values, rather than > modifying them to pass additional metadata, so it seems like it might > make things more confusing to make an exception to that here. Or are > there other cases where it's done that way? I meant the latter, making PFERR_PRIVATE_MASK common software-defined. I think the SVM fault handler can use hardware value directly by carefully defining those PFERR values. TDX doesn't have architectural bit in error code to indicate the private fault. It's coded in faulted address as shared bit. GPA bit 51 or 47. PFERR_{USER, WRITE, FETCH, PRESENT} are already software-defined value for VMX (and TDX). The fault handler for VMX, handle_ept_violation(), converts encoding. For TDX, PFERR_PRIVATE_MASK is just one more software defined bit. I'm fine with either way, variable or macro. Which do you prefer? - Define variable mmu_private_fault_mask (global or per struct kvm) The module initialization code, hardware_setup(), sets mmu_private_fault_mask. - Define the software defined value, PFERR_PRIVATE_MASK. The caller of kvm_mmu_page_fault() parses the hardware value and construct software defined error_code. - any other? > > > + > > > + /* > > > + * Handling below is for UPM self-tests and guests that treat userspace > > > + * as the authority on whether a fault should be private or not. > > > + */ > > > + private_fault = kvm_mem_is_private(kvm, gpa >> PAGE_SHIFT); > > > > This code path is sad. One extra slot lookup and xarray look up. > > Without mmu lock, the result can change by other vcpu. > > Let's find a better way. > > The intention was to rely on fault->mmu_seq to determine if a > KVM_SET_MEMORY_ATTRIBUTES update came in after .private_fault was set so > that fault handling could be retried, but that doesn't happen until > kvm_faultin_pfn() which is *after* this is logged. So yes, I think there > is a race here, and the approach you took in your Misc. series of > keeping the kvm_mem_is_private() check inside kvm_faultin_pfn() is more > efficient/correct. > > If we can figure out a way to handle checking the fault flags in a way > that works for both TDX/SNP (and KVM self-test use-case) we can > consolidate around that. I can think of the following ways. I think the second option is better because we don't need exit bit for error code. - Introduce software defined error code - Add a flags to struct kvm_arch for self-test use-case VM_TYPE_PROTECTED_VM. Set it to true for VM_TYPE_PROTECTED_VM case. - any other? Thanks, -- Isaku Yamahata