Date: Thu, 17 Nov 2022 21:25:51 +0800
From: Chao Peng
To: Sean Christopherson
Cc: Ackerley Tng, aarcange@redhat.com, ak@linux.intel.com,
	akpm@linux-foundation.org, bfields@fieldses.org, bp@alien8.de,
	corbet@lwn.net, dave.hansen@intel.com, david@redhat.com,
	ddutile@redhat.com, dhildenb@redhat.com, hpa@zytor.com,
	hughd@google.com, jlayton@kernel.org, jmattson@google.com,
	joro@8bytes.org, jun.nakajima@intel.com,
	kirill.shutemov@linux.intel.com, kvm@vger.kernel.org,
	linux-api@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, luto@kernel.org,
	mail@maciej.szmigiero.name, mhocko@suse.com, michael.roth@amd.com,
	mingo@redhat.com, pbonzini@redhat.com, qemu-devel@nongnu.org,
	qperret@google.com, rppt@kernel.org, shuah@kernel.org,
	songmuchun@bytedance.com, steven.price@arm.com, tabba@google.com,
	tglx@linutronix.de, vannapurve@google.com, vbabka@suse.cz,
	vkuznets@redhat.com, wanpengli@tencent.com, wei.w.wang@intel.com,
	x86@kernel.org, yu.c.zhang@linux.intel.com
Subject: Re: [PATCH v9 7/8] KVM: Handle page fault for private memory
Message-ID: <20221117132551.GB422408@chaop.bj.intel.com>
Reply-To: Chao Peng
References: <20221025151344.3784230-8-chao.p.peng@linux.intel.com>
	<20221116205025.1510291-1-ackerleytng@google.com>

On Wed, Nov 16, 2022 at 10:13:07PM +0000, Sean Christopherson wrote:
> On Wed, Nov 16, 2022, Ackerley Tng wrote:
> > >@@ -4173,6 +4203,22 @@ static int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> > > 			return RET_PF_EMULATE;
> > > 	}
> > >
> > >+	if (kvm_slot_can_be_private(slot) &&
> > >+	    fault->is_private != kvm_mem_is_private(vcpu->kvm, fault->gfn)) {
> > >+		vcpu->run->exit_reason = KVM_EXIT_MEMORY_FAULT;
> > >+		if (fault->is_private)
> > >+			vcpu->run->memory.flags = KVM_MEMORY_EXIT_FLAG_PRIVATE;
> > >+		else
> > >+			vcpu->run->memory.flags = 0;
> > >+		vcpu->run->memory.padding = 0;
> > >+		vcpu->run->memory.gpa = fault->gfn << PAGE_SHIFT;
> > >+		vcpu->run->memory.size = PAGE_SIZE;
> > >+		return RET_PF_USER;
> > >+	}
> > >+
> > >+	if (fault->is_private)
> > >+		return kvm_faultin_pfn_private(fault);
> > >+
> >
> > Since each memslot may also not be backed by restricted memory, we
> > should also check if the memslot has been set up for private memory
> > with
> >
> > 	if (fault->is_private && kvm_slot_can_be_private(slot))
> > 		return kvm_faultin_pfn_private(fault);
> >
> > Without this check, restrictedmem_get_page will get called with NULL
> > in slot->restricted_file, which causes a NULL pointer dereference.
>
> Hmm, silently skipping the faultin would result in KVM faulting in the shared
> portion of the memslot, and I believe would end up mapping that pfn as private,
> i.e. would map a non-UPM PFN as a private mapping. For TDX and SNP, that would
> be double ungood as it would let the host access memory that is mapped private,
> i.e. lead to #MC or #PF(RMP) in the host.

That's correct.

>
> I believe the correct solution is to drop the "can be private" check from the
> above check, and instead handle that in kvm_faultin_pfn_private(). That would fix
> another bug, e.g. if the fault is shared, the slot can't be private, but for
> whatever reason userspace marked the gfn as private. Even though KVM might be
> able service the fault, the correct thing to do in that case is to exit to userspace.

It makes sense to me.

Chao
>
> E.g.
>
> ---
>  arch/x86/kvm/mmu/mmu.c | 36 ++++++++++++++++++++++--------------
>  1 file changed, 22 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 10017a9f26ee..e2ac8873938e 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4158,11 +4158,29 @@ static inline u8 order_to_level(int order)
>  	return PG_LEVEL_4K;
>  }
>
> -static int kvm_faultin_pfn_private(struct kvm_page_fault *fault)
> +static int kvm_do_memory_fault_exit(struct kvm_vcpu *vcpu,
> +				    struct kvm_page_fault *fault)
> +{
> +	vcpu->run->exit_reason = KVM_EXIT_MEMORY_FAULT;
> +	if (fault->is_private)
> +		vcpu->run->memory.flags = KVM_MEMORY_EXIT_FLAG_PRIVATE;
> +	else
> +		vcpu->run->memory.flags = 0;
> +	vcpu->run->memory.padding = 0;
> +	vcpu->run->memory.gpa = fault->gfn << PAGE_SHIFT;
> +	vcpu->run->memory.size = PAGE_SIZE;
> +	return RET_PF_USER;
> +}
> +
> +static int kvm_faultin_pfn_private(struct kvm_vcpu *vcpu,
> +				   struct kvm_page_fault *fault)
>  {
>  	int order;
>  	struct kvm_memory_slot *slot = fault->slot;
>
> +	if (kvm_slot_can_be_private(slot))
> +		return kvm_do_memory_fault_exit(vcpu, fault);
> +
>  	if (kvm_restricted_mem_get_pfn(slot, fault->gfn, &fault->pfn, &order))
>  		return RET_PF_RETRY;
>
> @@ -4203,21 +4221,11 @@ static int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
>  			return RET_PF_EMULATE;
>  	}
>
> -	if (kvm_slot_can_be_private(slot) &&
> -	    fault->is_private != kvm_mem_is_private(vcpu->kvm, fault->gfn)) {
> -		vcpu->run->exit_reason = KVM_EXIT_MEMORY_FAULT;
> -		if (fault->is_private)
> -			vcpu->run->memory.flags = KVM_MEMORY_EXIT_FLAG_PRIVATE;
> -		else
> -			vcpu->run->memory.flags = 0;
> -		vcpu->run->memory.padding = 0;
> -		vcpu->run->memory.gpa = fault->gfn << PAGE_SHIFT;
> -		vcpu->run->memory.size = PAGE_SIZE;
> -		return RET_PF_USER;
> -	}
> +	if (fault->is_private != kvm_mem_is_private(vcpu->kvm, fault->gfn))
> +		return kvm_do_memory_fault_exit(vcpu, fault);
>
>  	if (fault->is_private)
> -		return kvm_faultin_pfn_private(fault);
> +		return kvm_faultin_pfn_private(vcpu, fault);
>
>  	async = false;
>  	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, &async,
>
> base-commit: 969d761bb7b8654605937f31ae76123dcb7f15a3
> --
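
For reference, the decision flow discussed above can be modelled in a few lines of standalone C. This is only a rough sketch, not kernel code: the types, enum values and function names (slot_model, fault_model, faultin_model and so on) are hypothetical stand-ins for the KVM structures. It also assumes that the slot check inside kvm_faultin_pfn_private() is meant to be negated, i.e. the exit to userspace happens when the slot cannot be private, which is what the description in the thread asks for (the quoted diff has the check without the negation).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes; these are NOT the kernel's RET_PF_* values. */
enum fault_outcome {
	FAULTIN_SHARED,   /* continue with the normal (shared) faultin path */
	FAULTIN_PRIVATE,  /* fetch the pfn from the restricted backing store */
	EXIT_TO_USER,     /* report a memory fault exit to userspace */
};

/* Hypothetical stand-in for the relevant part of a memslot. */
struct slot_model {
	bool can_be_private;  /* slot was set up with restricted memory */
};

/* Hypothetical stand-in for the relevant part of a page fault. */
struct fault_model {
	bool is_private;               /* guest accessed the gfn as private */
	const struct slot_model *slot;
};

/* Models the private faultin helper. Assumption: the check is negated. */
static enum fault_outcome faultin_private_model(const struct fault_model *f)
{
	assert(f->slot != NULL);

	/* A slot without restricted memory cannot service a private fault. */
	if (!f->slot->can_be_private)
		return EXIT_TO_USER;

	return FAULTIN_PRIVATE;
}

/*
 * Models the top-level check. gfn_marked_private stands in for the
 * userspace-controlled attribute lookup on the gfn.
 */
static enum fault_outcome faultin_model(const struct fault_model *f,
					bool gfn_marked_private)
{
	/* Any mismatch exits to userspace, regardless of the slot type. */
	if (f->is_private != gfn_marked_private)
		return EXIT_TO_USER;

	if (f->is_private)
		return faultin_private_model(f);

	return FAULTIN_SHARED;
}
```

With this shape, a private fault on a slot that has no restricted memory exits to userspace instead of reaching the restricted memory lookup, and a shared fault on a gfn that userspace marked private also exits even when the slot cannot be private.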