Date: Thu, 20 Apr 2023 09:30:29 -0700
From: Sean Christopherson
To: Atish Patra
Cc: linux-kernel@vger.kernel.org, Alexandre Ghiti, Andrew Jones,
	Andrew Morton, Anup Patel, Atish Patra, Björn Töpel,
	Suzuki K Poulose, Will Deacon, Marc Zyngier,
	linux-coco@lists.linux.dev, Dylan Reid, abrestic@rivosinc.com,
	Samuel Ortiz, Christoph Hellwig, Conor Dooley, Greg Kroah-Hartman,
	Guo Ren, Heiko Stuebner, Jiri Slaby, kvm-riscv@lists.infradead.org,
	kvm@vger.kernel.org, linux-mm@kvack.org,
	linux-riscv@lists.infradead.org, Mayuresh Chitale, Palmer Dabbelt,
	Paolo Bonzini, Paul Walmsley, Rajnesh Kanwal, Uladzislau Rezki
Subject: Re: [RFC 00/48] RISC-V CoVE support
In-Reply-To: <20230419221716.3603068-1-atishp@rivosinc.com>

On Wed, Apr 19, 2023, Atish Patra wrote:
> 2. Lazy gstage page allocation vs upfront allocation with page pool.
> Currently, all gstage mappings happen at runtime during the fault. This is expensive
> as we need to convert that page to confidential memory as well. A page pool framework
> may be a better choice which can hold all the confidential pages which can be
> pre-allocated upfront. A generic page pool infrastructure may benefit other CC solutions ?

I'm sorry, what? Do y'all really not pay any attention to what is happening
outside of the RISC-V world?

We, where "we" is KVM x86 and ARM, with folks contributing from 5+ companies,
have been working on this problem for going on three *years*. And that's just
from the first public posting[1]; there have been discussions about how to
approach this for even longer. There have been multiple related presentations
at KVM Forum, something like 4 or 5 at KVM Forum 2022 alone.

Patch 1 says "This patch is based on pkvm patches", so clearly you are at least
aware that there is other work going on in this space.

At a very quick glance, this series suffers from all of the same flaws that
SNP, TDX, and pKVM have encountered. E.g. assuming guest memory is backed by
struct page memory, relying on pinning to solve all problems (hint, it doesn't),
and so on and so forth.

And to make things worse, this series is riddled with bugs. E.g. patch 19 alone
manages to squeeze in multiple fatal bugs in five new lines of code: deadlock
due to not releasing mmap_lock on failure, failure to correctly handle MOVE,
failure to handle DELETE at all, failure to honor (or reject) READONLY, and
probably several others.

diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
index 4b0f09e..63889d9 100644
--- a/arch/riscv/kvm/mmu.c
+++ b/arch/riscv/kvm/mmu.c
@@ -499,6 +499,11 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 
 	mmap_read_lock(current->mm);
 
+	if (is_cove_vm(kvm)) {
+		ret = kvm_riscv_cove_vm_add_memreg(kvm, base_gpa, size);
+		if (ret)
+			return ret;
+	}
 	/*
 	 * A memory region could potentially cover multiple VMAs, and
 	 * any holes between them, so iterate over all of them to find

I get that this is an RFC, but for a series of this size, operating in an area
that is under heavy development by multiple other architectures, to have a
diffstat that shows _zero_ changes to common KVM is simply unacceptable.

Please, go look at restrictedmem[2] and work on building CoVE support on top of
that. If the current proposal doesn't fit CoVE's needs, then we need to know
_before_ all of that code gets merged.

[1] https://lore.kernel.org/linux-mm/20200522125214.31348-1-kirill.shutemov@linux.intel.com
[2] https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng%40linux.intel.com

>  arch/riscv/Kbuild                       |    2 +
>  arch/riscv/Kconfig                      |   27 +
>  arch/riscv/cove/Makefile                |    2 +
>  arch/riscv/cove/core.c                  |   40 +
>  arch/riscv/cove/cove_guest_sbi.c        |  109 +++
>  arch/riscv/include/asm/cove.h           |   27 +
>  arch/riscv/include/asm/covg_sbi.h       |   38 +
>  arch/riscv/include/asm/csr.h            |    2 +
>  arch/riscv/include/asm/kvm_cove.h       |  206 +++++
>  arch/riscv/include/asm/kvm_cove_sbi.h   |  101 +++
>  arch/riscv/include/asm/kvm_host.h       |   10 +-
>  arch/riscv/include/asm/kvm_vcpu_sbi.h   |    3 +
>  arch/riscv/include/asm/mem_encrypt.h    |   26 +
>  arch/riscv/include/asm/sbi.h            |  107 +++
>  arch/riscv/include/uapi/asm/kvm.h       |   17 +
>  arch/riscv/kernel/irq.c                 |   12 +
>  arch/riscv/kernel/setup.c               |    2 +
>  arch/riscv/kvm/Makefile                 |    1 +
>  arch/riscv/kvm/aia.c                    |  101 ++-
>  arch/riscv/kvm/aia_device.c             |   41 +-
>  arch/riscv/kvm/aia_imsic.c              |  127 ++-
>  arch/riscv/kvm/cove.c                   | 1005 +++++++++++++++++++++++
>  arch/riscv/kvm/cove_sbi.c               |  490 +++++++++++
>  arch/riscv/kvm/main.c                   |   30 +-
>  arch/riscv/kvm/mmu.c                    |   45 +-
>  arch/riscv/kvm/tlb.c                    |   11 +-
>  arch/riscv/kvm/vcpu.c                   |   69 +-
>  arch/riscv/kvm/vcpu_exit.c              |   34 +-
>  arch/riscv/kvm/vcpu_insn.c              |  115 ++-
>  arch/riscv/kvm/vcpu_sbi.c               |   16 +
>  arch/riscv/kvm/vcpu_sbi_covg.c          |  232 ++++++
>  arch/riscv/kvm/vcpu_timer.c             |   26 +-
>  arch/riscv/kvm/vm.c                     |   34 +-
>  arch/riscv/kvm/vmid.c                   |   17 +-
>  arch/riscv/mm/Makefile                  |    3 +
>  arch/riscv/mm/init.c                    |   17 +-
>  arch/riscv/mm/ioremap.c                 |   45 +
>  arch/riscv/mm/mem_encrypt.c             |   61 ++
>  drivers/tty/hvc/hvc_riscv_sbi.c         |    5 +
>  drivers/tty/serial/earlycon-riscv-sbi.c |   51 +-
>  include/uapi/linux/kvm.h                |    8 +
>  mm/vmalloc.c                            |   16 +
>  42 files changed, 3222 insertions(+), 109 deletions(-)
>  create mode 100644 arch/riscv/cove/Makefile
>  create mode 100644 arch/riscv/cove/core.c
>  create mode 100644 arch/riscv/cove/cove_guest_sbi.c
>  create mode 100644 arch/riscv/include/asm/cove.h
>  create mode 100644 arch/riscv/include/asm/covg_sbi.h
>  create mode 100644 arch/riscv/include/asm/kvm_cove.h
>  create mode 100644 arch/riscv/include/asm/kvm_cove_sbi.h
>  create mode 100644 arch/riscv/include/asm/mem_encrypt.h
>  create mode 100644 arch/riscv/kvm/cove.c
>  create mode 100644 arch/riscv/kvm/cove_sbi.c
>  create mode 100644 arch/riscv/kvm/vcpu_sbi_covg.c
>  create mode 100644 arch/riscv/mm/ioremap.c
>  create mode 100644 arch/riscv/mm/mem_encrypt.c
>
> --
> 2.25.1
>