From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5DD97C6FA82 for ; Wed, 14 Sep 2022 16:32:21 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D762F8D0003; Wed, 14 Sep 2022 12:32:20 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D262B8D0001; Wed, 14 Sep 2022 12:32:20 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BC6BD8D0003; Wed, 14 Sep 2022 12:32:20 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id ADFAC8D0001 for ; Wed, 14 Sep 2022 12:32:20 -0400 (EDT) Received: from smtpin01.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 8A4B61C5B9C for ; Wed, 14 Sep 2022 16:32:20 +0000 (UTC) X-FDA: 79911233640.01.FF6267B Received: from mail-yw1-f173.google.com (mail-yw1-f173.google.com [209.85.128.173]) by imf16.hostedemail.com (Postfix) with ESMTP id 360FC1800A5 for ; Wed, 14 Sep 2022 16:32:20 +0000 (UTC) Received: by mail-yw1-f173.google.com with SMTP id 00721157ae682-3454e58fe53so187575547b3.2 for ; Wed, 14 Sep 2022 09:32:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date; bh=TLSZ0fG4jkVsKxHaOJUp349np8FVF8YfoTMtMUTuG8A=; b=lYAYteHJi3IbnOFnnIGskPJm4LFXQyIlElbzGWypdvdmuOzPYj/pxEWqsuDYmXdtDT +J1fUhuR0nhJSF9vE2IpcB6JMQQVhOxL1FrdrpmKfq/+Pf/0NtZGKAKyIJNPCIIbB3kj vWi4nL+PVud7ZLGOFm7hlsQsP4WGqcGQY5y652IsMJt0X4Satpn4pwnGti6aqOjR7WNb b5tss/uSLmxVOvAGyxZK6jMqSNUzxkIwbjBSIb/bf0uu+j7cYjltNEAMKT0SrlRij08o vO6mJbmXCXzYCHWU2jM86xagKPMTdIsdFaNh0fAXa8lsKrIt9q1/gcJSJ01jSLIuS/+g IKcw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date; bh=TLSZ0fG4jkVsKxHaOJUp349np8FVF8YfoTMtMUTuG8A=; b=yIWoeX/kVTBUzpdsIt8sacMnV5f81s7S+QnsRD1h18kNqU7VsJJoXk2jEzNPjYibAm zjuFno3+0oylNq4nGJGuzna5TzC6yRIe/jchx/CpPEB1ocEUzVKYHlrP16NYOEXMwU/X tEKvpuplvaGyRsucr0hirTwGF+pVSbMEopAahUXQvRceUR9FfdxP6bJtuafV32VpBEG4 YV0gHPkg84BuDJSULUQHE0sLIz3b6WuvmLkupJbdVI1qGEooFEW9Lz+UHm9si1+OKeuI dWtSpVjf+NbE4pnMUy/XghnT1FeQRrPT5u6QIjRNJ6mpVfiluXJM1lOSLspAdxpDoC3Z /O5Q== X-Gm-Message-State: ACgBeo0MJ48ZYEsDruD6YBZJ9LvV75L0ax4DKXDFVMJAOOYZgZld0TWg ydCtbc1vvUoHQQV/0+YQ1jiK0S1GPoywwHOCtRR0HA== X-Google-Smtp-Source: AA6agR5JyKlU4G6TTbmq/f1TwUurqwIm3Dn+pMyu//jmSXVD3ZQAN3lmbo6PGjQVmw+oTh1R9/QR0RnFraMNNxG5k98= X-Received: by 2002:a81:7cd7:0:b0:345:221c:5671 with SMTP id x206-20020a817cd7000000b00345221c5671mr30574218ywc.297.1663173139128; Wed, 14 Sep 2022 09:32:19 -0700 (PDT) MIME-Version: 1.0 References: <20210820155918.7518-1-brijesh.singh@amd.com> <20210820155918.7518-40-brijesh.singh@amd.com> <4e41dcff-7c7b-cf36-434a-c7732e7e8ff2@amd.com> <20220908212114.sqne7awimfwfztq7@amd.com> In-Reply-To: From: Marc Orr Date: Wed, 14 Sep 2022 17:32:08 +0100 Message-ID: Subject: Re: [PATCH Part2 v5 39/45] KVM: SVM: Introduce ops for the post gfn map and unmap To: Sean Christopherson Cc: Michael Roth , Brijesh Singh , x86 , LKML , kvm list , linux-coco@lists.linux.dev, Linux Memory Management List , Linux Crypto Mailing List , Thomas Gleixner , Ingo Molnar , Joerg Roedel , Tom Lendacky , "H. Peter Anvin" , Ard Biesheuvel , Paolo Bonzini , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Andy Lutomirski , Dave Hansen , Sergio Lopez , Peter Gonda , Peter Zijlstra , Srinivas Pandruvada , David Rientjes , Dov Murik , Tobin Feldman-Fitzthum , Borislav Petkov , Vlastimil Babka , "Kirill A . Shutemov" , Andi Kleen , Tony Luck , Sathyanarayanan Kuppuswamy , jarkko@profian.com Content-Type: text/plain; charset="UTF-8" ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1663173140; a=rsa-sha256; cv=none; b=oM1hJYGl1QQg/xCX/WbCFPxrWj3XyrgHBKMFVIRlOPcaVGF3ZLHDk/k0JjG2tkNWl9NLKa vr99VR37fgK1O9Hechh+GZI/w2kmXLO/5wJvTnPWvFe0ZmxOK4fL8PrCANBYBNuw4sX5IE aXOMAva1cIsof//nLzkUbjZdZlhBaJY= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=lYAYteHJ; spf=pass (imf16.hostedemail.com: domain of marcorr@google.com designates 209.85.128.173 as permitted sender) smtp.mailfrom=marcorr@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1663173140; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=TLSZ0fG4jkVsKxHaOJUp349np8FVF8YfoTMtMUTuG8A=; b=el59e+JnbBFbEXbXBzgil2Pk9aO2y432GCgfPWq0aaSIG1Skm9eVOvDbA+jUVjzxNQLDEx BIzn8mVDvEzzHzT/1KFwC8r2v1dEKCYcDeVPuz3YQl+fbuU+FwbfLcqWSrf16jJiFvlNQ/ vWbhf8ZqB7EkMfz+q+tRVYxWipLTc4I= X-Rspam-User: Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=lYAYteHJ; spf=pass (imf16.hostedemail.com: domain of marcorr@google.com designates 209.85.128.173 as permitted sender) smtp.mailfrom=marcorr@google.com; dmarc=pass (policy=reject) header.from=google.com X-Rspamd-Server: rspam02 X-Stat-Signature: sapnfupy65qwmtnpdx5dqcsmbq9f99d8 X-Rspamd-Queue-Id: 360FC1800A5 X-HE-Tag: 1663173140-14854 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Sep 14, 2022 at 5:15 PM Sean Christopherson wrote: > > On Wed, Sep 14, 2022, Marc Orr wrote: > > On Wed, Sep 14, 2022 at 9:05 AM Sean Christopherson wrote: > > > > > > On Thu, Sep 08, 2022, Michael Roth wrote: > > > > On Fri, Oct 15, 2021 at 05:16:28PM +0000, Sean Christopherson wrote: > > > > So in the context of this interim solution, we're trying to look for a > > > > solution that's simple enough that it can be used reliably, without > > > > introducing too much additional complexity into KVM. There is one > > > > approach that seems to fit that bill, that Brijesh attempted in an > > > > earlier version of this series (I'm not sure what exactly was the > > > > catalyst to changing the approach, as I wasn't really in the loop at > > > > the time, but AIUI there weren't any showstoppers there, but please > > > > correct me if I'm missing anything): > > > > > > > > - if the host is writing to a page that it thinks is supposed to be > > > > shared, and the guest switches it to private, we get an RMP fault > > > > (actually, we will get a !PRESENT fault, since as of v5 we now > > > > remove the mapping from the directmap as part of conversion) > > > > - in the host #PF handler, if we see that the page is marked private > > > > in the RMP table, simply switch it back to shared > > > > - if this was a bug on the part of the host, then the guest will see > > > > > > As discussed off-list, attempting to fix up RMP violations in the host #PF handler > > > is not a viable approach. There was also extensive discussion on-list a while back: > > > > > > https://lore.kernel.org/all/8a244d34-2b10-4cf8-894a-1bf12b59cf92@www.fastmail.com > > > > I mentioned this during Mike's talk at the micro-conference: For pages > > mapped in by the kernel can we disallow them to be converted to > > private? > > In theory, yes. Do we want to do something like this? No. kmap() does something > vaguely similar for 32-bit PAE/PSE kernels, but that's a lot of complexity and > overhead to take on. And this issue goes far beyond a kmap(); when the kernel gup()s > a page, the kernel expects the pfn to be available, no exceptions (pun intended). > > > Note, userspace accesses are already handled by UPM. > > I'm confused by the UPM comment. Isn't the gist of this thread about the ability > to merge SNP _without_ UPM? Or am I out in left field? I think that was the overall gist: yes. But it's not what I was trying to comment on :-). HOWEVER, thinking about this more: I was confused when I wrote out my last reply. I had thought that the issue that Michael brought up applied even with UPM. That is, I was thinking it was still possibly for a guest to maliciously convert a page to private mapped in by the kernel and assumed to be shared. But I now realize that is not what will actually happen. To be concrete, let's assume the GHCB page. What will happen is: - KVM has GHCB page mapped in. GHCB is always assumed to be shared. So far so good. - Malicious guest converts GHCB page to private (e.g., via Page State Change request) - Guest exits to KVM - KVM exits to userspace VMM - Userspace VM allocates page in private FD. Now, what happens here depends on how UPM works. If we allow double allocation then our host kernel is safe. However, now we have the "double allocation problem". If on the other hand, we deallocate the page in the shared FD, the host kernel can segfault. And now we actually do have essentially the same problem Michael was describing that we have without UPM. Because we'll end up in fault.c in the kernel context and likely panic the host. I hope I got this right this time. Sorry for the confusion on my last reply. > > In pseudo-code, I'm thinking something like this: > > > > kmap_helper() { > > // And all other interfaces where the kernel can map a GPA > > // into the kernel page tables > > mapped_into_kernel_mem_set[hpa] = true; > > } > > > > kunmap_helper() { > > // And all other interfaces where the kernel can unmap a GPA > > // into the kernel page tables > > mapped_into_kernel_mem_set[hpa] = false; > > > > // Except it's not this simple because we probably need ref counting > > // for multiple mappings. Sigh. But you get the idea. > > A few issues off the top of my head: > > - It's not just refcounting, there would also likely need to be locking to > guarantee sane behavior. > - kmap() isn't allowed to fail and RMPUPDATE isn't strictly guaranteed to succeed, > which is problematic if the kernel attempts to kmap() a page that's already > private, especially for kmap_atomic(), which isn't allowed to sleep. > - Not all kernel code is well behaved and bounces through kmap(); undoubtedly > some of the 1200+ users of page_address() will be problematic. > > $ git grep page_address | wc -l > 1267 > - It's not sufficient for TDX. Merging something this complicated when we know > we still need UPM would be irresponsible from a maintenance perspective. > - KVM would need to support two separate APIs for SNP, which I very much don't > want to do. Ack on merging without UPM. I wasn't trying to chime in on merging before/after UPM. See my other comment above. Sorry for the confusion. Ack on other concerns about "enlightening kmap" as well. I agree.