Date: Mon, 15 Nov 2021 12:30:59 +0000
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Borislav Petkov, Dave Hansen, Peter Gonda, Brijesh Singh,
    x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    linux-coco@lists.linux.dev, linux-mm@kvack.org,
    linux-crypto@vger.kernel.org, Thomas Gleixner, Ingo Molnar,
    Joerg Roedel, Tom Lendacky, "H. Peter Anvin", Ard Biesheuvel,
    Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Zijlstra,
    Srinivas Pandruvada, David Rientjes, Dov Murik,
    Tobin Feldman-Fitzthum, Michael Roth, Vlastimil Babka,
    "Kirill A . Shutemov", Andi Kleen, tony.luck@intel.com,
    marcorr@google.com, sathyanarayanan.kuppuswamy@linux.intel.com
Subject: Re: [PATCH Part2 v5 00/45] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support
References: <20210820155918.7518-1-brijesh.singh@amd.com>
    <061ccd49-3b9f-d603-bafd-61a067c3f6fa@intel.com>

* Sean Christopherson (seanjc@google.com) wrote:
> On Fri, Nov 12, 2021, Borislav Petkov wrote:
> > On Fri, Nov 12, 2021 at 09:59:46AM -0800, Dave Hansen wrote:
> > > Or, is there some mechanism that prevents guest-private memory from being
> > > accessed in random host kernel code?
>
> Or random host userspace code...
>
> > So I'm currently under the impression that random host->guest accesses
> > should not happen if not previously agreed upon by both.
>
> Key word "should".
>
> > Because, as explained on IRC, if host touches a private guest page,
> > whatever the host does to that page, the next time the guest runs, it'll
> > get a #VC where it will see that that page doesn't belong to it anymore
> > and then, out of paranoia, it will simply terminate to protect itself.
> >
> > So cloud providers should have an interest to prevent such random stray
> > accesses if they wanna have guests. :)
>
> Yes, but IMO inducing a fault in the guest because of a _host_ bug is wrong.

Would it necessarily have been a host bug?  A guest telling the host a
bad GPA to DMA into would trigger this, wouldn't it?

Still, I wonder if it's best to kill the guest - maybe it's best for
the host to kill the guest and leave behind diagnostics of what
happened; for someone debugging the crash, it's going to be less useful
to know that page X was wrongly accessed (which is what the guest would
see), and more useful to know that it was the kernel's vhost-... driver
that accessed it.

Dave

> On Fri, Nov 12, 2021, Peter Gonda wrote:
> > Here is an alternative to the current approach: On RMP violation (host
> > or userspace) the page fault handler converts the page from private to
> > shared to allow the write to continue. This pulls from s390’s error
> > handling which does exactly this. See ‘arch_make_page_accessible()’.
>
> Ah, after further reading, s390 does _not_ do implicit private=>shared conversions.
>
> s390's arch_make_page_accessible() is somewhat similar, but it is not a direct
> comparison.  IIUC, it exports and integrity-protects the data and thus preserves
> the guest's data in an encrypted form, e.g. so that it can be swapped to disk.
> And if the host corrupts the data, attempting to convert it back to secure on a
> subsequent guest access will fail.
>
> The host kernel's handling of the "convert to secure" failures doesn't appear to
> be all that robust, e.g. it looks like there are multiple paths where the error
> is dropped on the floor and the guest is resumed, but IMO soft hanging the guest
> is still better than inducing a fault in the guest, and far better than potentially
> coercing the guest into reading corrupted memory ("spurious" PVALIDATE).  And s390's
> behavior is fixable since it's purely a host error handling problem.
>
> To truly make a page shared, s390 requires the guest to call into the ultravisor
> to make a page shared.  And on the host side, the host can pin a page as shared
> to prevent the guest from unsharing it while the host is accessing it as a shared
> page.
>
> So, inducing #VC is similar in the sense that a malicious s390 guest can also DoS
> itself, but is quite different in that (AFAICT) s390 does not create an attack
> surface where a malicious or buggy host userspace can induce faults in the guest,
> or worst case in SNP, exploit a buggy guest into accepting and accessing corrupted
> data.
>
> It's also different in that s390 doesn't implicitly convert between shared and
> private.  Functionally, it doesn't really change the end result because a buggy
> host that writes guest private memory will DoS the guest (by inducing a #VC or
> corrupting exported data), but at least for s390 there's a sane, legitimate use
> case for accessing guest private memory (swap and maybe migration?), whereas for
> SNP, IMO implicitly converting to shared on a host access is straight up wrong.
>
> > Additionally it adds less complexity to the SNP kernel patches, and
> > requires no new ABI.
>
> I disagree, this would require "new" ABI in the sense that it commits KVM to
> supporting SNP without requiring userspace to initiate any and all conversions
> between shared and private.  Which in my mind is the big elephant in the room:
> do we want to require new KVM (and kernel?) ABI to allow/force userspace to
> explicitly declare guest private memory for TDX _and_ SNP, or just TDX?
>
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK