Date: Mon, 7 Oct 2024 16:56:42 +0100
From: Patrick Roy
To: Ackerley Tng, Elliot Berman
CC: James Gowans, "Kalyazin, Nikita", "Manwaring, Derek"
Subject: Re: [RFC PATCH 30/39] KVM: guest_memfd: Handle folio preparation for guest_memfd mmap

Hi Ackerley,

On Thu, 2024-10-03 at 22:32 +0100, Ackerley Tng wrote:
> Elliot Berman writes:
>
>> On Tue, Sep 10, 2024 at 11:44:01PM +0000, Ackerley Tng wrote:
>>> Since guest_memfd now supports mmap(), folios have to be prepared
>>> before they are faulted into userspace.
>>>
>>> When memory attributes are switched between shared and private, the
>>> up-to-date flags will be cleared.
>>>
>>> Use the folio's up-to-date flag to indicate being ready for the guest
>>> usage and can be used to mark whether the folio is ready for shared OR
>>> private use.
>>
>> Clearing the up-to-date flag also means that the page gets zero'd out
>> whenever it transitions between shared and private (either direction).
>> pKVM (Android) hypervisor policy can allow in-place conversion between
>> shared/private.
>>
>> I believe the important thing is that sev_gmem_prepare() needs to be
>> called prior to giving page to guest. In my series, I had made a
>> ->prepare_inaccessible() callback where KVM would only do this part.
>> When transitioning to inaccessible, only that callback would be made,
>> besides the bookkeeping. The folio zeroing happens once when allocating
>> the folio if the folio is initially accessible (faultable).
>>
>> From x86 CoCo perspective, I think it also makes sense to not zero
>> the folio when changing faultability from private to shared:
>>  - If guest is sharing some data with host, you've wiped the data and
>>    guest has to copy again.
>>  - Or, if SEV/TDX enforces that page is zero'd between transitions,
>>    Linux has duplicated the work that trusted entity has already done.
>>
>> Fuad and I can help add some details for the conversion. Hopefully we
>> can figure out some of the plan at plumbers this week.
>
> Zeroing the page prevents leaking host data (see function docstring for
> kvm_gmem_prepare_folio() introduced in [1]), so we definitely don't want
> to introduce a kernel data leak bug here.
>
> In-place conversion does require preservation of data, so for
> conversions, shall we zero depending on VM type?
>
> + Gunyah: don't zero since ->prepare_inaccessible() is a no-op
> + pKVM: don't zero
> + TDX: don't zero
> + SEV: AMD Architecture Programmers Manual 7.10.6 says there is no
>   automatic encryption and implies no zeroing, hence perform zeroing
> + KVM_X86_SW_PROTECTED_VM: Doesn't have a formal definition so I guess
>   we could require zeroing on transition?

Maybe for KVM_X86_SW_PROTECTED_VM we could make zeroing configurable via
some CREATE_GUEST_MEMFD flag, instead of forcing one specific behavior.

For the "non-CoCo with direct map entries removed" VMs that we at AWS
are going for, we'd like a VM type with host-controlled in-place
conversions which doesn't zero on transitions, so if
KVM_X86_SW_PROTECTED_VM ends up zeroing, we'd need to add another new VM
type for that.

Somewhat related sidenote: For VMs that allow in-place conversions and do
not zero, we do not need to zap the stage-2 mappings on memory attribute
changes, right?
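
To make the per-VM-type zeroing policy above concrete, here is a minimal
userspace sketch of the decision. It is not kernel code. Every identifier
in it (enum vm_type, GMEM_FLAG_NO_ZERO_ON_CONVERSION and
zero_on_conversion()) is hypothetical and invented for illustration. It
assumes the policy from the quoted list, with KVM_X86_SW_PROTECTED_VM
zeroing by default and a creation-time flag opting out.

```c
#include <stdbool.h>

/* Hypothetical model of the VM types discussed in this thread. */
enum vm_type {
	VM_GUNYAH,
	VM_PKVM,
	VM_TDX,
	VM_SEV,
	VM_SW_PROTECTED,	/* stands in for KVM_X86_SW_PROTECTED_VM */
};

/* Hypothetical creation-time flag, named here only for illustration. */
#define GMEM_FLAG_NO_ZERO_ON_CONVERSION (1u << 0)

/*
 * Decide whether a folio should be zeroed when it is converted between
 * shared and private. The flag is only honoured for the software
 * protected VM type; the other types have a fixed policy.
 */
static bool zero_on_conversion(enum vm_type type, unsigned int gmem_flags)
{
	switch (type) {
	case VM_GUNYAH:
	case VM_PKVM:
	case VM_TDX:
		return false;
	case VM_SEV:
		return true;
	case VM_SW_PROTECTED:
		return !(gmem_flags & GMEM_FLAG_NO_ZERO_ON_CONVERSION);
	}
	/* Unknown type: zero, so that host data cannot leak. */
	return true;
}
```

With something along these lines, the AWS use case would create its
guest_memfd with the opt-out flag set and would not need a separate VM
type just to skip zeroing.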

> This way, the uptodate flag means that it has been prepared (as in
> sev_gmem_prepare()), and zeroed if required by VM type.
>
> Regarding flushing the dcache/tlb in your other question [2], if we
> don't use folio_zero_user(), can we rely on unmapping within core-mm
> to flush after shared use, and unmapping within KVM to flush after
> private use?
>
> Or should flush_dcache_folio() be explicitly called on kvm_gmem_fault()?
>
> clear_highpage(), used in the non-hugetlb (original) path, doesn't flush
> the dcache. Was that intended?
>
>> Thanks,
>> Elliot
>>
>>>
>>>
>
> [1] https://lore.kernel.org/all/20240726185157.72821-8-pbonzini@redhat.com/
> [2] https://lore.kernel.org/all/diqz34ldszp3.fsf@ackerleytng-ctop.c.googlers.com/

Best,
Patrick