From: Atish Kumar Patra
Date: Fri, 21 Apr 2023 00:43:49 +0530
Subject: Re: [RFC 00/48] RISC-V CoVE support
To: Sean Christopherson
Cc: linux-kernel@vger.kernel.org, Alexandre Ghiti, Andrew Jones,
 Andrew Morton, Anup Patel, Atish Patra, Björn Töpel, Suzuki K Poulose,
 Will Deacon, Marc Zyngier, linux-coco@lists.linux.dev, Dylan Reid,
 abrestic@rivosinc.com, Samuel Ortiz, Christoph Hellwig, Conor Dooley,
 Greg Kroah-Hartman, Guo Ren, Heiko Stuebner, Jiri Slaby,
 kvm-riscv@lists.infradead.org, kvm@vger.kernel.org, linux-mm@kvack.org,
 linux-riscv@lists.infradead.org, Mayuresh Chitale, Palmer Dabbelt,
 Paolo Bonzini, Paul Walmsley, Rajnesh Kanwal, Uladzislau Rezki
References: <20230419221716.3603068-1-atishp@rivosinc.com>

On Thu, Apr 20, 2023 at 10:00 PM Sean Christopherson wrote:
>
> On Wed, Apr 19, 2023, Atish Patra wrote:
> > 2. Lazy gstage page allocation vs upfront allocation with page pool.
> > Currently, all gstage mappings happen at runtime during the fault. This is expensive
> > as we need to convert that page to confidential memory as well.
> > A page pool framework may be a better choice which can hold all the
> > confidential pages which can be pre-allocated upfront. A generic page pool
> > infrastructure may benefit other CC solutions ?
>
> I'm sorry, what? Do y'all really not pay any attention to what is happening
> outside of the RISC-V world?
>
> We, where "we" is KVM x86 and ARM, with folks contributing from 5+ companies,
> have been working on this problem for going on three *years*. And that's just
> from the first public posting[1], there have been discussions about how to approach
> this for even longer. There have been multiple related presentations at KVM Forum,
> something like 4 or 5 just at KVM Forum 2022 alone.
>

Yes. We are following the restrictedmem effort and were reviewing v10 this week.
I did mention that in the 1st item in the TODO list. We are planning to use the
restrictedmem feature once it is closer to upstream (which seems to be the case
looking at v10).

Another reason is that this initial series is based on kvmtool only. We are
working on qemu-kvm right now but have some RISC-V specific dependencies
(interrupt controller stuff) which are not there yet. As the restrictedmem
patches are already available in qemu-kvm too, our plan was to support CoVE in
qemu-kvm first and work on restrictedmem after that.

This item was just based on this RFC implementation, which uses a lazy gstage
page allocation. The idea was to check if there is any interest at all in this
approach. I should have mentioned the restrictedmem plan in this section as
well. Sorry for the confusion.

Thanks for your suggestion. It seems we should just directly move to
restrictedmem asap.

> Patch 1 says "This patch is based on pkvm patches", so clearly you are at least
> aware that there is other work going on in this space.
>

Yes. We have been following pkvm, tdx & CCA patches. The MMIO section has more
details on TDX/pkvm related aspects.

> At a very quick glance, this series suffers from all of the same flaws that SNP,
> TDX, and pKVM have encountered. E.g. assuming guest memory is backed by struct page
> memory, relying on pinning to solve all problems (hint, it doesn't), and so on and
> so forth.
>
> And to make things worse, this series is riddled with bugs. E.g. patch 19 alone
> manages to squeeze in multiple fatal bugs in five new lines of code: deadlock due
> to not releasing mmap_lock on failure, failure to correctly handle MOVE, failure to

That's an oversight. Apologies for that. Thanks for pointing it out.

> handle DELETE at all, failure to honor (or reject) READONLY, and probably several
> others.
>

READONLY should be rejected, as our APIs don't have any permission flags yet.
I think we should add those flags so that the CoVE APIs can support READONLY
as well ?

The same goes for DELETE ops, as we don't have an API to delete a confidential
memory region yet. I was not very sure about the use case for MOVE though
(migration possibly ?)

kvm_riscv_cove_vm_add_memreg should have been invoked only for CREATE, and the
other operations should be rejected for now. I will revise the patch accordingly
and leave a TODO comment for the future about API updates.

> diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
> index 4b0f09e..63889d9 100644
> --- a/arch/riscv/kvm/mmu.c
> +++ b/arch/riscv/kvm/mmu.c
> @@ -499,6 +499,11 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>
>         mmap_read_lock(current->mm);
>
> +       if (is_cove_vm(kvm)) {
> +               ret = kvm_riscv_cove_vm_add_memreg(kvm, base_gpa, size);
> +               if (ret)
> +                       return ret;
> +       }
>         /*
>          * A memory region could potentially cover multiple VMAs, and
>          * any holes between them, so iterate over all of them to find
>
> I get that this is an RFC, but for a series of this size, operating in an area that
> is under heavy development by multiple other architectures, to have a diffstat that
> shows _zero_ changes to common KVM is simply unacceptable.
>

Thanks for the valuable feedback. This is pretty much pre-RFC, as the spec is
very much in draft state. We want to share it with the larger Linux community
to gather feedback sooner rather than later, so that we can incorporate any
feedback into the spec.

> Please, go look at restrictedmem[2] and work on building CoVE support on top of
> that. If the current proposal doesn't fit CoVE's needs, then we need to know _before_
> all of that code gets merged.
>

Absolutely. That has always been the plan.

> [1] https://lore.kernel.org/linux-mm/20200522125214.31348-1-kirill.shutemov@linux.intel.com
> [2] https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng%40linux.intel.com
>
> >  arch/riscv/Kbuild                       |    2 +
> >  arch/riscv/Kconfig                      |   27 +
> >  arch/riscv/cove/Makefile                |    2 +
> >  arch/riscv/cove/core.c                  |   40 +
> >  arch/riscv/cove/cove_guest_sbi.c        |  109 +++
> >  arch/riscv/include/asm/cove.h           |   27 +
> >  arch/riscv/include/asm/covg_sbi.h       |   38 +
> >  arch/riscv/include/asm/csr.h            |    2 +
> >  arch/riscv/include/asm/kvm_cove.h       |  206 +++++
> >  arch/riscv/include/asm/kvm_cove_sbi.h   |  101 +++
> >  arch/riscv/include/asm/kvm_host.h       |   10 +-
> >  arch/riscv/include/asm/kvm_vcpu_sbi.h   |    3 +
> >  arch/riscv/include/asm/mem_encrypt.h    |   26 +
> >  arch/riscv/include/asm/sbi.h            |  107 +++
> >  arch/riscv/include/uapi/asm/kvm.h       |   17 +
> >  arch/riscv/kernel/irq.c                 |   12 +
> >  arch/riscv/kernel/setup.c               |    2 +
> >  arch/riscv/kvm/Makefile                 |    1 +
> >  arch/riscv/kvm/aia.c                    |  101 ++-
> >  arch/riscv/kvm/aia_device.c             |   41 +-
> >  arch/riscv/kvm/aia_imsic.c              |  127 ++-
> >  arch/riscv/kvm/cove.c                   | 1005 +++++++++++++++++++++++
> >  arch/riscv/kvm/cove_sbi.c               |  490 +++++++++++
> >  arch/riscv/kvm/main.c                   |   30 +-
> >  arch/riscv/kvm/mmu.c                    |   45 +-
> >  arch/riscv/kvm/tlb.c                    |   11 +-
> >  arch/riscv/kvm/vcpu.c                   |   69 +-
> >  arch/riscv/kvm/vcpu_exit.c              |   34 +-
> >  arch/riscv/kvm/vcpu_insn.c              |  115 ++-
> >  arch/riscv/kvm/vcpu_sbi.c               |   16 +
> >  arch/riscv/kvm/vcpu_sbi_covg.c          |  232 ++++++
> >  arch/riscv/kvm/vcpu_timer.c             |   26 +-
> >  arch/riscv/kvm/vm.c                     |   34 +-
> >  arch/riscv/kvm/vmid.c                   |   17 +-
> >  arch/riscv/mm/Makefile                  |    3 +
> >  arch/riscv/mm/init.c                    |   17 +-
> >  arch/riscv/mm/ioremap.c                 |   45 +
> >  arch/riscv/mm/mem_encrypt.c             |   61 ++
> >  drivers/tty/hvc/hvc_riscv_sbi.c         |    5 +
> >  drivers/tty/serial/earlycon-riscv-sbi.c |   51 +-
> >  include/uapi/linux/kvm.h                |    8 +
> >  mm/vmalloc.c                            |   16 +
> >  42 files changed, 3222 insertions(+), 109 deletions(-)
> >  create mode 100644 arch/riscv/cove/Makefile
> >  create mode 100644 arch/riscv/cove/core.c
> >  create mode 100644 arch/riscv/cove/cove_guest_sbi.c
> >  create mode 100644 arch/riscv/include/asm/cove.h
> >  create mode 100644 arch/riscv/include/asm/covg_sbi.h
> >  create mode 100644 arch/riscv/include/asm/kvm_cove.h
> >  create mode 100644 arch/riscv/include/asm/kvm_cove_sbi.h
> >  create mode 100644 arch/riscv/include/asm/mem_encrypt.h
> >  create mode 100644 arch/riscv/kvm/cove.c
> >  create mode 100644 arch/riscv/kvm/cove_sbi.c
> >  create mode 100644 arch/riscv/kvm/vcpu_sbi_covg.c
> >  create mode 100644 arch/riscv/mm/ioremap.c
> >  create mode 100644 arch/riscv/mm/mem_encrypt.c
> >
> > --
> > 2.25.1
> >
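
To make the patch 19 discussion above concrete, here is a rough userspace
sketch of the control flow the revision is aiming for: reject everything
except CREATE (and reject READONLY) before taking the lock, and release the
lock on the error path instead of returning with it held. This is not the
actual kernel patch. All names below (mr_change, mock_read_lock,
mock_add_memreg, prepare_memory_region) are hypothetical stand-ins, the lock
is a plain counter so a leak is observable, and -EINVAL is an assumed error
code since the thread does not say which one would be used.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the memslot change types named in the thread. */
enum mr_change { MR_CREATE, MR_DELETE, MR_MOVE, MR_FLAGS_ONLY };

/* Stand-in for mmap_lock: a counter, so an unreleased lock shows up as != 0. */
static int lock_depth;

static void mock_read_lock(void)
{
	lock_depth++;
}

static void mock_read_unlock(void)
{
	assert(lock_depth > 0);	/* catch an unlock without a matching lock */
	lock_depth--;
}

/* Hypothetical stand-in for kvm_riscv_cove_vm_add_memreg(); failure is forced. */
static int mock_add_memreg(bool fail)
{
	return fail ? -ENOMEM : 0;
}

static int prepare_memory_region(bool is_cove, enum mr_change change,
				 bool readonly, bool fail_add)
{
	int ret = 0;

	/*
	 * Reject what the confidential memory API cannot express yet.
	 * This runs before the lock is taken, so a plain return is safe.
	 * TODO: revisit once the API gains delete/move/permission support.
	 */
	if (is_cove && (change != MR_CREATE || readonly))
		return -EINVAL;	/* assumed error code */

	mock_read_lock();

	if (is_cove) {
		ret = mock_add_memreg(fail_add);
		if (ret)
			goto out;	/* do not return with the lock held */
	}

	/* The VMA walk of the real function would go here. */

out:
	mock_read_unlock();
	return ret;
}
```

With this shape, a failing add leaves lock_depth at 0, and MOVE, DELETE and
READONLY requests for a CoVE VM never reach the add call at all.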