Date: Thu, 21 Sep 2023 07:59:10 -0700
References: <20230914015531.1419405-1-seanjc@google.com>
 <20230914015531.1419405-19-seanjc@google.com>
Subject: Re: [RFC PATCH v12 18/33] KVM: x86/mmu: Handle page fault for private memory
From: Sean Christopherson
To: Yan Zhao
Cc: Paolo Bonzini, Marc Zyngier, Oliver Upton, Huacai Chen, Michael Ellerman,
 Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
 "Matthew Wilcox (Oracle)", Andrew Morton, Paul Moore, James Morris,
 "Serge E. Hallyn", kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 kvmarm@lists.linux.dev, linux-mips@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-security-module@vger.kernel.org,
 linux-kernel@vger.kernel.org, Chao Peng, Fuad Tabba, Jarkko Sakkinen,
 Anish Moorthy, Yu Zhang, Isaku Yamahata, Xu Yilun, Vlastimil Babka,
 Vishal Annapurve, Ackerley Tng, Maciej Szmigiero, David Hildenbrand,
 Quentin Perret, Michael Roth, Wang, Liam Merwick, Isaku Yamahata,
 "Kirill A . Shutemov"
Content-Type: text/plain; charset="us-ascii"

On Mon, Sep 18, 2023, Yan Zhao wrote:
> On Fri, Sep 15, 2023 at 07:26:16AM -0700, Sean Christopherson wrote:
> > On Fri, Sep 15, 2023, Yan Zhao wrote:
> > > >  static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> > > >  {
> > > >  	struct kvm_memory_slot *slot = fault->slot;
> > > > @@ -4293,6 +4356,14 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
> > > >  			return RET_PF_EMULATE;
> > > >  	}
> > > >
> > > > +	if (fault->is_private != kvm_mem_is_private(vcpu->kvm, fault->gfn)) {
> > > In patch 21,
> > > fault->is_private is set as:
> > > ".is_private = kvm_mem_is_private(vcpu->kvm, cr2_or_gpa >> PAGE_SHIFT)",
> > > then, the inequality here means memory attribute has been updated after
> > > last check.
> > > So, why an exit to user space for converting is required instead of a mere retry?
> > >
> > > Or, is it because how .is_private is assigned in patch 21 is subjected to change
> > > in future?
> >
> > This.  Retrying on SNP or TDX would hang the guest.  I suppose we could special
> Is this because if the guest accesses a page in private way (e.g. via
> private key in TDX), the returned page must be a private page?

Yes, the returned page must be private, because the GHCI (TDX) and GHCB (SNP)
require that the host allow implicit conversions.  I.e. if the guest accesses
memory as private (or shared), then the host must map memory as private (or
shared).  Simply resuming the guest will not change the guest access, nor will it
change KVM's memory attributes.

Ideally (IMO), implicit conversions would be disallowed, but even if implicit
conversions weren't a thing, retrying would still be wrong as KVM would either
inject an exception into the guest or exit to userspace to let userspace handle
the illegal access.

> > case VMs where .is_private is derived from the memory attributes, but the
> > SW_PROTECTED_VM type is primarily a development vehicle at this point.  I'd like to
> > have it mimic SNP/TDX as much as possible; performance is a secondary concern.
> Ok. But this mimic is somewhat confusing as it may be problematic in below scenario,
> though sane guest should ensure no one is accessing a page before doing memory
> conversion.
>
>
> CPU 0                           CPU 1
> access GFN A in private way
> fault->is_private=true
>                                 convert GFN A to shared
>                                 set memory attribute of A to shared
>
> faultin, mismatch and exit
> set memory attribute of A
> to private
>
>                                 vCPU access GFN A in shared way
>                                 fault->is_private = true
>                                 faultin, match and map a private PFN B
>
>                                 vCPU accesses private PFN B in shared way

If this is a TDX or SNP VM, then the private vs. shared information comes from
the guest itself, e.g. this sequence

                                  vCPU access GFN A in shared way
                                  fault->is_private = true

cannot happen because is_private will be false based on the error code (SNP) or
the GPA (TDX).

And when hardware doesn't generate page faults based on private vs. shared, i.e.
for non-TDX/SNP VMs, from a fault handling perspective there is no concept of the
guest accessing a GFN in a "private way" or a "shared way".  I.e. there are no
implicit conversions.

For SEV and SEV-ES, the guest can access memory as private vs. shared, but the
guest and the host VMM absolutely must be in agreement and synchronized with
respect to the state of a page, otherwise guest memory will be corrupted.  But
that has nothing to do with fault handling.  E.g. creating aliases in the guest
to access a single GFN as shared and private from two CPUs will create incoherent
cache entries and/or corrupt data without any involvement from KVM.

In other words, the above isn't possible for TDX/SNP, and for all other types,
the conflict between CPU0 and CPU1 is unequivocally a guest bug.
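
To make the reasoning above concrete, here is a rough, standalone model of the
check.  It is only a sketch, not the KVM code: the enum names, the helper
names, and both bit definitions are made up for illustration, and the bit
positions are placeholders rather than the real hardware encodings.  The only
things taken from the thread are that is_private comes from the error code
(SNP) or the GPA (TDX), that other VM types derive it from the memory
attributes, and that a mismatch results in an exit to userspace rather than a
retry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical VM types and outcomes, for illustration only. */
enum vm_type { VM_SNP, VM_TDX, VM_SW_PROTECTED };
enum fault_result { FAULT_MAP, FAULT_EXIT_TO_USERSPACE };

/* Placeholder bits, NOT the real hardware encodings. */
#define ERR_PRIVATE_BIT (1ull << 0)	/* stand-in for "error code says private" */
#define GPA_SHARED_BIT  (1ull << 1)	/* stand-in for "GPA says shared" */

/*
 * Where the private vs. shared information comes from.  For SNP/TDX it is
 * dictated by the guest access itself; for everything else it is a snapshot
 * of the host-side memory attribute.
 */
static bool fault_is_private(enum vm_type type, uint64_t gpa,
			     uint64_t error_code, bool attr_private)
{
	switch (type) {
	case VM_SNP:
		return (error_code & ERR_PRIVATE_BIT) != 0;
	case VM_TDX:
		return (gpa & GPA_SHARED_BIT) == 0;
	default:
		return attr_private;
	}
}

/* A mismatch against the current attribute is punted to userspace. */
static enum fault_result faultin(bool is_private, bool attr_private)
{
	return is_private != attr_private ? FAULT_EXIT_TO_USERSPACE : FAULT_MAP;
}

static void example(void)
{
	/* TDX-style guest accesses a GPA as private; host attribute is shared. */
	bool is_private = fault_is_private(VM_TDX, 0, 0, false);

	assert(is_private);

	/* Retrying changes neither input, so the mismatch simply repeats. */
	assert(faultin(is_private, false) == FAULT_EXIT_TO_USERSPACE);
	assert(faultin(is_private, false) == FAULT_EXIT_TO_USERSPACE);

	/* Once userspace converts the GFN to private, the fault can be mapped. */
	assert(faultin(is_private, true) == FAULT_MAP);
}
```

In this model, a shared access on a TDX or SNP VM can never yield
is_private == true, because the attribute argument is ignored for those types.
That is the step in the CPU 0 / CPU 1 scenario that cannot occur.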