Date: Thu, 21 Jul 2022 17:37:15 +0800
From: Chao Peng
To: Sean Christopherson
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
	linux-doc@vger.kernel.org, qemu-devel@nongnu.org,
	linux-kselftest@vger.kernel.org, Paolo Bonzini, Jonathan Corbet,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org,
	"H . Peter Anvin", Hugh Dickins, Jeff Layton, "J . Bruce Fields",
	Andrew Morton, Shuah Khan, Mike Rapoport, Steven Price,
	"Maciej S . Szmigiero", Vlastimil Babka, Vishal Annapurve, Yu Zhang,
	"Kirill A . Shutemov", luto@kernel.org, jun.nakajima@intel.com,
	dave.hansen@intel.com, ak@linux.intel.com, david@redhat.com,
	aarcange@redhat.com, ddutile@redhat.com, dhildenb@redhat.com,
	Quentin Perret, Michael Roth, mhocko@suse.com, Muchun Song
Subject: Re: [PATCH v7 11/14] KVM: Register/unregister the guest private memory regions
Message-ID: <20220721093715.GB153288@chaop.bj.intel.com>
Reply-To: Chao Peng
References: <20220706082016.2603916-1-chao.p.peng@linux.intel.com>
	<20220706082016.2603916-12-chao.p.peng@linux.intel.com>

On Wed, Jul 20, 2022 at 04:44:32PM +0000, Sean Christopherson wrote:
> On Wed, Jul 06, 2022, Chao Peng wrote:
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 230c8ff9659c..bb714c2a4b06 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -914,6 +914,35 @@ static int kvm_init_mmu_notifier(struct kvm *kvm)
> >  
> >  #endif /* CONFIG_MMU_NOTIFIER && KVM_ARCH_WANT_MMU_NOTIFIER */
> >  
> > +#ifdef CONFIG_HAVE_KVM_PRIVATE_MEM
> > +#define KVM_MEM_ATTR_PRIVATE	0x0001
> > +static int kvm_vm_ioctl_set_encrypted_region(struct kvm *kvm, unsigned int ioctl,
> > +					     struct kvm_enc_region *region)
> > +{
> > +	unsigned long start, end;
> 
> As alluded to in a different reply, because this will track GPAs instead of HVAs,
> the type needs to be "gpa_t", not "unsigned long".  Oh, actually, they need to
> be gfn_t, since those are what gets shoved into the xarray.

Agreed, it should be gfn_t. My original reason for using unsigned long was
so that 32-bit architectures (if any) could also work with this, since the
xarray index is 32 bits wide on those architectures. But the fields of
kvm_enc_region are u64, so that is not even possible.

> 
> > +	void *entry;
> > +	int r;
> > +
> > +	if (region->size == 0 || region->addr + region->size < region->addr)
> > +		return -EINVAL;
> > +	if (region->addr & (PAGE_SIZE - 1) || region->size & (PAGE_SIZE - 1))
> > +		return -EINVAL;
> > +
> > +	start = region->addr >> PAGE_SHIFT;
> > +	end = (region->addr + region->size - 1) >> PAGE_SHIFT;
> > +
> > +	entry = ioctl == KVM_MEMORY_ENCRYPT_REG_REGION ?
> > +				xa_mk_value(KVM_MEM_ATTR_PRIVATE) : NULL;
> > +
> > +	r = xa_err(xa_store_range(&kvm->mem_attr_array, start, end,
> > +					entry, GFP_KERNEL_ACCOUNT));
> 
> IIUC, this series treats memory as shared by default.  I think we should invert
> that and have KVM's ABI be that all guest memory as private by default, i.e.
> require the guest to opt into sharing memory instead of opt out of sharing memory.
> 
> And then the xarray would track which regions are shared.

Maybe I missed some information discussed elsewhere? I followed
https://lkml.org/lkml/2022/5/23/772. In that model, KVM treats memory as
shared by default, but userspace is expected to set all guest memory to
private before launching the guest, so the guest then sees all memory as
private. Defaulting to private also sounds good to me. If we only talk
about private/shared in the context of private memory (which I think is
the case), then there is no ambiguity.

> 
> Regarding mem_attr_array, it probably makes sense to explicitly include what it's
> tracking in the name, i.e. name it {private,shared}_mem_array depending on whether
> it's used to track private vs. shared memory.  If we ever need to track metadata
> beyond shared/private then we can tweak the name as needed, e.g. if hardware ever
> supports secondary non-ephemeral encryption keys.

I chose the generic name because I think there may be other state to
track beyond shared/private. But I am fine with considering only
private/shared for now, and it sounds reasonable to leave the rename to
whoever adds support for additional state.

Chao
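
For reference, the range check and GPA-to-GFN conversion discussed in this
thread can be sketched as a standalone userspace C fragment. This is only an
illustration under stated assumptions, not the code from the patch: the
names enc_region_sketch, region_to_gfn_range and SKETCH_PAGE_SHIFT are
hypothetical, gfn_t is modelled as a plain uint64_t, and a 4 KiB page size
is assumed. It shows why a 64-bit frame number type is the natural fit when
the region fields are 64-bit: a guest physical address of 2^44 already
yields a frame number of 2^32, which would not fit in a 32-bit index.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: stand-in for the kernel's gfn_t, modelled as 64-bit. */
typedef uint64_t gfn_t;

/* Assumption: 4 KiB pages; the kernel uses PAGE_SHIFT/PAGE_SIZE. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/* Hypothetical mirror of the two u64 fields used from kvm_enc_region. */
struct enc_region_sketch {
	uint64_t addr;	/* guest physical address of the region */
	uint64_t size;	/* length of the region in bytes */
};

/*
 * Validate a region and convert it to an inclusive [start, end] range of
 * guest frame numbers. Returns 0 on success, -EINVAL for an empty,
 * wrapping, or non-page-aligned region.
 */
static int region_to_gfn_range(const struct enc_region_sketch *region,
			       gfn_t *start, gfn_t *end)
{
	/* Reject empty regions and regions whose end wraps past 2^64. */
	if (region->size == 0 || region->addr + region->size < region->addr)
		return -EINVAL;

	/* Both the base and the length must be page aligned. */
	if ((region->addr & (SKETCH_PAGE_SIZE - 1)) ||
	    (region->size & (SKETCH_PAGE_SIZE - 1)))
		return -EINVAL;

	*start = region->addr >> SKETCH_PAGE_SHIFT;
	*end = (region->addr + region->size - 1) >> SKETCH_PAGE_SHIFT;
	return 0;
}
```

The resulting inclusive range is what would then be handed to something like
xa_store_range() as the first and last index.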