From: Patrick Roy
To: David Hildenbrand
Subject: Re: [PATCH v4 03/12] KVM: guest_memfd: Add flag to remove from direct map
Date: Wed, 26 Feb 2025 08:48:16 +0000
Message-ID: <8642de57-553a-47ec-81af-803280a360ec@amazon.co.uk>
References: <20250221160728.1584559-1-roypat@amazon.co.uk> <20250221160728.1584559-4-roypat@amazon.co.uk>

On Tue, 2025-02-25 at 16:54 +0000, David Hildenbrand wrote:
> On 21.02.25 17:07, Patrick Roy wrote:
>> Add KVM_GMEM_NO_DIRECT_MAP flag for KVM_CREATE_GUEST_MEMFD() ioctl. When
>> set, guest_memfd folios will be removed from the direct map after
>> preparation, with direct map entries only restored when the folios are
>> freed.
>>
>> To ensure these folios do not end up in places where the kernel cannot
>> deal with them, set AS_NO_DIRECT_MAP on the guest_memfd's struct
>> address_space if KVM_GMEM_NO_DIRECT_MAP is requested.
>>
>> Note that this flag causes removal of direct map entries for all
>> guest_memfd folios independent of whether they are "shared" or "private"
>> (although current guest_memfd only supports either all folios in the
>> "shared" state, or all folios in the "private" state if
>> !IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM)).
>> The use case for removing direct map entries for the shared parts of
>> guest_memfd as well is a special type of non-CoCo VM where host
>> userspace is trusted to have access to all of guest memory, but where
>> Spectre-style transient execution attacks through the host kernel's
>> direct map should still be mitigated.
>>
>> Note that KVM retains access to guest memory via userspace
>> mappings of guest_memfd, which are reflected back into KVM's memslots
>> via userspace_addr. This is needed for things like MMIO emulation on
>> x86_64 to work. Previous iterations attempted to instead have KVM
>> temporarily restore direct map entries whenever such an access to guest
>> memory was needed, but this turned out to have a significant performance
>> impact, as well as additional complexity due to needing to refcount
>> direct map reinsertion operations and making them play nicely with gmem
>> truncations.
>>
>> This iteration also doesn't have KVM perform TLB flushes after direct
>> map manipulations. This is because TLB flushes resulted in an up to 40x
>> elongation of page faults in guest_memfd (scaling with the number of CPU
>> cores), or a 5x elongation of memory population. On the one hand, TLB
>> flushes are not needed for functional correctness (the virt->phys
>> mapping technically stays "correct", the kernel should simply not use it
>> for a while), so this is a correct optimization to make. On the other
>> hand, it means that the desired protection from Spectre-style attacks is
>> not perfect, as an attacker could try to prevent a stale TLB entry from
>> getting evicted, keeping it alive until the page it refers to is used by
>> the guest for some sensitive data, and then targeting it using a
>> spectre-gadget.
>>
>> Signed-off-by: Patrick Roy
>
> ...
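
For readers skimming the thread, the lifecycle the commit message
describes (direct map entries dropped after preparation, restored only
on free) can be modelled in a few lines of userspace C. This is only a
sketch under stated assumptions: the model_* names, the flag value, and
the in_direct_map boolean are made up for illustration and are not the
kernel's types or the patch's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag value, chosen for illustration only. */
#define MODEL_GMEM_NO_DIRECT_MAP (1UL << 0)

/* Stand-in for struct inode: flags live in a pointer-sized field. */
struct model_inode {
	void *i_private;
};

/* Stand-in for struct folio; in_direct_map models the direct map entry. */
struct model_folio {
	struct model_inode *inode;
	bool in_direct_map;
	bool prepared;
};

/* Test the flag bit stored in the inode's private field. */
bool model_test_no_direct_map(const struct model_inode *inode)
{
	return ((uintptr_t)inode->i_private & MODEL_GMEM_NO_DIRECT_MAP) != 0;
}

/* On preparation, drop the direct map entry only if the flag is set. */
void model_mark_prepared(struct model_folio *folio)
{
	if (model_test_no_direct_map(folio->inode))
		folio->in_direct_map = false;
	folio->prepared = true;
}

/* On free, the direct map entry is restored unconditionally. */
void model_free_folio(struct model_folio *folio)
{
	folio->in_direct_map = true;
	folio->prepared = false;
}
```

In this model a folio whose inode lacks the flag stays mapped across
model_mark_prepared(), while a flagged folio is unmapped until
model_free_folio() runs.
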
>
>>
>> +static bool kvm_gmem_test_no_direct_map(struct inode *inode)
>> +{
>> +	return ((unsigned long) inode->i_private) & KVM_GMEM_NO_DIRECT_MAP;
>> +}
>> +
>>  static inline void kvm_gmem_mark_prepared(struct folio *folio)
>>  {
>> +	struct inode *inode = folio_inode(folio);
>> +
>> +	if (kvm_gmem_test_no_direct_map(inode)) {
>> +		int r = set_direct_map_valid_noflush(folio_page(folio, 0), folio_nr_pages(folio),
>> +						     false);
>
> Will this work if KVM is built as a module, or is this another good
> reason why we might want guest_memfd core part of core-mm?

Hm, I'm admittedly not too familiar with the differences between
building KVM as a module and building it in. I do remember something
about the direct map accessors not being available to modules, so this
would indeed not work. Does that mean moving gmem into core-mm will be a
prerequisite for the direct map removal work?

> --
> Cheers,
>
> David / dhildenb
>

Best,
Patrick