Date: Thu, 03 Oct 2024 23:43:35 +0000
In-Reply-To: (message from Ackerley Tng on Thu, 03 Oct 2024 21:32:08 +0000)
Subject: Re: [RFC PATCH 30/39] KVM: guest_memfd: Handle folio preparation for guest_memfd mmap
From: Ackerley Tng
To: Ackerley Tng
Cc: quic_eberman@quicinc.com, tabba@google.com, roypat@amazon.co.uk,
 jgg@nvidia.com, peterx@redhat.com, david@redhat.com, rientjes@google.com,
 fvdl@google.com, jthoughton@google.com, seanjc@google.com,
 pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com,
 jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev,
 mike.kravetz@oracle.com, erdemaktas@google.com, vannapurve@google.com,
 qperret@google.com, jhubbard@nvidia.com, willy@infradead.org,
 shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com,
 kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org,
 richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com,
 ajones@ventanamicro.com, vkuznets@redhat.com,
 maciej.wieczor-retman@intel.com, pgonda@google.com,
 oliver.upton@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 kvm@vger.kernel.org, linux-kselftest@vger.kernel.org,
 linux-fsdevel@kvack.org

Ackerley Tng writes:

> Elliot Berman writes:
>
>> On Tue, Sep 10, 2024 at 11:44:01PM +0000, Ackerley Tng wrote:
>>> Since guest_memfd now supports mmap(), folios have to be prepared
>>> before they are faulted into userspace.
>>>
>>> When memory attributes are switched between shared and private, the
>>> up-to-date flags will be cleared.
>>>
>>> Use the folio's up-to-date flag to indicate being ready for the guest
>>> usage and can be used to mark whether the folio is ready for shared OR
>>> private use.
>>
>> Clearing the up-to-date flag also means that the page gets zero'd out
>> whenever it transitions between shared and private (either direction).
>> pKVM (Android) hypervisor policy can allow in-place conversion between
>> shared/private.
>>
>> I believe the important thing is that sev_gmem_prepare() needs to be
>> called prior to giving page to guest.
>> In my series, I had made a
>> ->prepare_inaccessible() callback where KVM would only do this part.
>> When transitioning to inaccessible, only that callback would be made,
>> besides the bookkeeping. The folio zeroing happens once when allocating
>> the folio if the folio is initially accessible (faultable).
>>
>> From x86 CoCo perspective, I think it also makes sense to not zero
>> the folio when changing faultability from private to shared:
>>  - If guest is sharing some data with host, you've wiped the data and
>>    guest has to copy again.
>>  - Or, if SEV/TDX enforces that page is zero'd between transitions,
>>    Linux has duplicated the work that trusted entity has already done.
>>
>> Fuad and I can help add some details for the conversion. Hopefully we
>> can figure out some of the plan at plumbers this week.
>
> Zeroing the page prevents leaking host data (see function docstring for
> kvm_gmem_prepare_folio() introduced in [1]), so we definitely don't want
> to introduce a kernel data leak bug here.

Actually it seems like filemap_grab_folio() already gets a zeroed page.

filemap_grab_folio() eventually calls __alloc_pages_noprof()
-> get_page_from_freelist()
   -> prep_new_page()
      -> post_alloc_hook()

and post_alloc_hook() calls kernel_init_pages(), which zeroes the page,
depending on kernel config.

Paolo, was calling clear_highpage() in kvm_gmem_prepare_folio() zeroing an
already-zeroed page returned from filemap_grab_folio()?

> In-place conversion does require preservation of data, so for
> conversions, shall we zero depending on VM type?
>
> + Gunyah: don't zero since ->prepare_inaccessible() is a no-op
> + pKVM: don't zero
> + TDX: don't zero
> + SEV: AMD Architecture Programmers Manual 7.10.6 says there is no
>   automatic encryption and implies no zeroing, hence perform zeroing
> + KVM_X86_SW_PROTECTED_VM: Doesn't have a formal definition so I guess
>   we could require zeroing on transition?
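
The per-VM-type proposal in the list above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the enum,
its values, and gmem_zero_on_conversion() are hypothetical names invented
here, not existing KVM definitions, and the SW_PROTECTED case encodes the
tentative "I guess we could require zeroing" answer rather than a settled
decision.

```c
#include <stdbool.h>

/* Hypothetical VM-type tags for illustration; not the kernel's definitions. */
enum gmem_vm_type {
	GMEM_VM_GUNYAH,
	GMEM_VM_PKVM,
	GMEM_VM_TDX,
	GMEM_VM_SEV,
	GMEM_VM_SW_PROTECTED,
};

/*
 * Hypothetical helper: should a folio be zeroed when it is converted
 * between shared and private? Mirrors the list quoted above.
 */
static bool gmem_zero_on_conversion(enum gmem_vm_type type)
{
	switch (type) {
	case GMEM_VM_GUNYAH:		/* ->prepare_inaccessible() is a no-op */
	case GMEM_VM_PKVM:		/* listed as "don't zero" */
	case GMEM_VM_TDX:		/* listed as "don't zero" */
		return false;
	case GMEM_VM_SEV:		/* no automatic encryption, so zero here */
	case GMEM_VM_SW_PROTECTED:	/* no formal definition; zeroing assumed */
		return true;
	}
	/* Unknown type: assume zeroing, to err on the side of not leaking data. */
	return true;
}
```

With a helper of this shape, the conversion path would call
gmem_zero_on_conversion() once per folio before clearing or setting the
uptodate flag, so in-place conversion keeps data only for the VM types
that ask for it.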
>
> This way, the uptodate flag means that it has been prepared (as in
> sev_gmem_prepare()), and zeroed if required by VM type.
>
> Regarding flushing the dcache/tlb in your other question [2], if we
> don't use folio_zero_user(), can we rely on unmapping within core-mm
> to flush after shared use, and unmapping within KVM to flush after
> private use?
>
> Or should flush_dcache_folio() be explicitly called on kvm_gmem_fault()?
>
> clear_highpage(), used in the non-hugetlb (original) path, doesn't flush
> the dcache. Was that intended?
>
>> Thanks,
>> Elliot
>>
>>>
>>>
>
> [1] https://lore.kernel.org/all/20240726185157.72821-8-pbonzini@redhat.com/
> [2] https://lore.kernel.org/all/diqz34ldszp3.fsf@ackerleytng-ctop.c.googlers.com/
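
On the "depending on kernel config" caveat for post_alloc_hook() earlier
in this message: a simplified userspace model of when a freshly allocated
page arrives zeroed might look like the sketch below. It is an assumption
based on a reading of the init_on_alloc / init_on_free / __GFP_ZERO checks
in mm/page_alloc.c, it ignores details such as hardware tag-based KASAN,
and struct alloc_model and model_page_arrives_zeroed() are hypothetical
names, not kernel APIs.

```c
#include <stdbool.h>

/* Simplified stand-ins for kernel state; names are illustrative only. */
struct alloc_model {
	bool init_on_alloc;	/* init_on_alloc enabled (boot param or Kconfig default) */
	bool init_on_free;	/* init_on_free enabled: pages are zeroed when freed */
	bool gfp_zero;		/* caller passed __GFP_ZERO */
};

/*
 * Hypothetical model: is the page content zero when the caller gets it?
 * With init_on_free the page was already cleared at free time; otherwise
 * it is cleared at allocation only if init_on_alloc or __GFP_ZERO applies.
 */
static bool model_page_arrives_zeroed(const struct alloc_model *m)
{
	return m->init_on_free || m->init_on_alloc || m->gfp_zero;
}
```

If this model is right, a caller that does not pass __GFP_ZERO gets a
zeroed page only when one of the init_on_* options is enabled, so an
explicit clear in kvm_gmem_prepare_folio() would be redundant in some
configurations but not in all of them.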