Message-ID: <3031b949-c42a-49bc-be0c-f95a62c792e2@intel.com>
Date: Wed, 18 Jun 2025 20:17:58 +0800
Subject: Re: [PATCH v12 08/18] KVM: guest_memfd: Allow host to map guest_memfd pages
From: Xiaoyao Li
To: David Hildenbrand, Sean Christopherson
Cc: Fuad Tabba, Ira Weiny, kvm@vger.kernel.org,
 linux-arm-msm@vger.kernel.org, linux-mm@kvack.org, kvmarm@lists.linux.dev,
 pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
 anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, viro@zeniv.linux.org.uk, brauner@kernel.org,
 willy@infradead.org, akpm@linux-foundation.org, yilun.xu@intel.com,
 chao.p.peng@linux.intel.com, jarkko@kernel.org, amoorthy@google.com,
 dmatlack@google.com, isaku.yamahata@intel.com, mic@digikod.net,
 vbabka@suse.cz, vannapurve@google.com, ackerleytng@google.com,
 mail@maciej.szmigiero.name, michael.roth@amd.com, wei.w.wang@intel.com,
 liam.merwick@oracle.com, isaku.yamahata@gmail.com,
 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
 steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
 quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com,
 quic_cvanscha@quicinc.com, quic_pderrin@quicinc.com,
 quic_pheragu@quicinc.com, catalin.marinas@arm.com, james.morse@arm.com,
 yuzenghui@huawei.com, oliver.upton@linux.dev, maz@kernel.org,
 will@kernel.org, qperret@google.com, keirf@google.com,
 roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
 rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com,
 hughd@google.com, jthoughton@google.com, peterx@redhat.com,
 pankaj.gupta@amd.com
References: <20250611133330.1514028-1-tabba@google.com>
 <20250611133330.1514028-9-tabba@google.com>
 <68501fa5dce32_2376af294d1@iweiny-mobl.notmuch>
 <701c8716-dd69-4bf6-9d36-4f8847f96e18@redhat.com>
 <3fb0e82b-f4ef-402d-a33c-0b12e8aa990c@redhat.com>
 <5ee9bbb8-d100-408c-ac07-ea9c5b603545@intel.com>
 <5a55d95e-5e32-4239-a445-be13228ea80b@redhat.com>
 <45af2c0d-a416-49bc-8011-4ec57a56d6f5@intel.com>
 <40a5903b-f747-4eab-8959-06ddd6e88f82@redhat.com>
 <38101158-4475-4885-83e7-654045ca0f9b@redhat.com>
In-Reply-To: <38101158-4475-4885-83e7-654045ca0f9b@redhat.com>

On 6/18/2025 7:14 PM, David Hildenbrand wrote:
> On 18.06.25 12:42, Xiaoyao Li wrote:
>> On 6/18/2025 5:59 PM, David Hildenbrand wrote:
>>> On 18.06.25 11:44, Xiaoyao Li wrote:
>>>> On 6/18/2025 5:27 PM, David Hildenbrand wrote:
>>>>> On 18.06.25 11:20, Xiaoyao Li wrote:
>>>>>> On 6/18/2025 4:15 PM, David Hildenbrand wrote:
>>>>>>>> If we are really dead set on having SHARED in the name, it could be
>>>>>>>> GUEST_MEMFD_FLAG_USER_MAPPABLE_SHARED or
>>>>>>>> GUEST_MEMFD_FLAG_USER_MAP_SHARED?  But to me that's _too_ specific and
>>>>>>>> again somewhat confusing given the unfortunate private vs. shared usage
>>>>>>>> in CoCo-land.  And just playing the odds, I'm fine taking a risk of
>>>>>>>> ending up with GUEST_MEMFD_FLAG_USER_MAPPABLE_PRIVATE or whatever,
>>>>>>>> because I think that is comically unlikely to happen.
>>>>>>>
>>>>>>> I think in addition to GUEST_MEMFD_FLAG_MMAP we want something to
>>>>>>> express "this is not your old guest_memfd that only supports private
>>>>>>> memory". And that's what I am struggling with.
>>>>>>
>>>>>> Sorry for chiming in.
>>>>>>
>>>>>> Per my understanding, (old) guest memfd only means it's the memory that
>>>>>> cannot be accessed by userspace. There should be no shared/private
>>>>>> concept on it.
>>>>>>
>>>>>> And "private" is the concept of KVM.
>>>>>> Guest memfd can serve as private memory, is just due to the
>>>>>> character of it cannot be accessed from userspace.
>>>>>>
>>>>>> So if the guest memfd can be mmap'ed, then it become userspace
>>>>>> accessible and cannot serve as private memory.
>>>>>>
>>>>>>> Now, if you argue "support for mmap() implies support for non-private
>>>>>>> memory", I'm probably okay for that.
>>>>>>
>>>>>> I would say, support for mmap() implies cannot be used as private
>>>>>> memory.
>>>>>
>>>>> That's not where we're heading with in-place conversion support: you
>>>>> will have private (inaccessible) and non-private (accessible) parts, and
>>>>> while guest_memfd will support mmap() only the accessible parts can
>>>>> actually be accessed (faulted in etc).
>>>>
>>>> That's OK. The guestmemfd can be fine-grained, i.e., different
>>>> range/part of it can have different access property. But one rule never
>>>> change: only the sub-range is not accessible by userspace can it be
>>>> serve as private memory.
>>>
>>> I'm sorry, I don't understand what you are getting at.
>>>
>>> You said "So if the guest memfd can be mmap'ed, then it become userspace
>>> accessible and cannot serve as private memory." and I say, with in-place
>>> conversion support you are wrong.
>>>
>>> The whole file can be mmaped(), that does not tell us anything about
>>> which parts can be private or not.
>>
>> So there is nothing prevent userspace from accessing it after a range is
>> converted to private via KVM_GMEM_CONVERT_PRIVATE since the whole file
>> can be mmaped()?
>>
>> If so, then for TDX case, userspace can change the TD-owner bit of the
>> private part by accessing it and later guest access will poison it and
>> trigger #MC. If the #MC is only delivered to the PCPU that triggers it,
>> it just leads to the TD guest being killed. If the #MC is broadcasted,
>> it affects other in the system.
>>
>> I just give it a try on real TDX system with in-place conversion.
>> The TD is killed due to SIGBUS (host kernel handles the #MC and sends
>> the SIGBUS). It seems OK if only the TD guest being affected due to
>> userspace accesses the private memory. But I'm not sure if there is any
>> corner case that will affect the host.
>
> I suggest you go ahead and read all about in-place conversion support,
> and how it all relates to the #MC problem you mention here.
>
> Long story short: SIGBUS is triggered by the fault handler, not by the
> #MC, because private pages cannot be faulted in and accessed.
>

Sorry for the wrong information, and thanks for your patience!

It is clearer to me now that this series and the in-place conversion work
try to make shared/private a property of guest memfd itself. Under that
big picture, it looks reasonable to put "shared" in the flag name. Looking
at this patch alone, though, Sean's concern makes more sense.
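
For readers following the thread, David's point (the whole file can be
mmap'ed, yet private pages cannot be faulted in, so the SIGBUS comes from
the fault handler and not from an #MC) can be pictured with a toy model.
This is only a sketch and not kernel code: GuestMemfdModel, convert(),
fault() and BusError are made-up names, and both the per-page
shared/private tracking and the "pages start out private" default are
assumptions made purely for illustration.

```python
class BusError(Exception):
    """Stands in for the SIGBUS delivered to userspace (hypothetical)."""


class GuestMemfdModel:
    """Toy model: one mmap()able file with per-page shared/private state."""

    def __init__(self, nr_pages):
        # Assumption for this sketch only: every page starts out private.
        self.private = [True] * nr_pages

    def convert(self, start, nr_pages, to_private):
        """Model an in-place conversion of a page range."""
        for idx in range(start, start + nr_pages):
            self.private[idx] = to_private

    def fault(self, idx):
        """Model the userspace page-fault path on the mmap()ed file."""
        if self.private[idx]:
            # The page is never mapped, so the access never reaches the
            # memory itself; the error is raised by the fault path.
            raise BusError(f"page {idx} is private")
        return f"page {idx} mapped"


gmem = GuestMemfdModel(nr_pages=4)
gmem.convert(start=0, nr_pages=2, to_private=False)  # pages 0-1 become shared
print(gmem.fault(0))
try:
    gmem.fault(3)
except BusError as err:
    print("SIGBUS:", err)
```

In this model, being able to mmap() the file says nothing about which
pages are accessible; that is decided per page at fault time.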