Date: Thu, 03 Oct 2024 21:32:08 +0000
In-Reply-To: <20240916120939512-0700.eberman@hu-eberman-lv.qualcomm.com>
	(message from Elliot Berman on Mon, 16 Sep 2024 13:00:56 -0700)
Subject: Re: [RFC PATCH 30/39] KVM: guest_memfd: Handle folio preparation for guest_memfd mmap
From: Ackerley Tng
To: Elliot Berman
Cc: tabba@google.com, roypat@amazon.co.uk, jgg@nvidia.com, peterx@redhat.com,
	david@redhat.com, rientjes@google.com, fvdl@google.com, jthoughton@google.com,
	seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com, fan.du@intel.com,
	jun.miao@intel.com, isaku.yamahata@intel.com, muchun.song@linux.dev,
	mike.kravetz@oracle.com, erdemaktas@google.com, vannapurve@google.com,
	qperret@google.com, jhubbard@nvidia.com, willy@infradead.org, shuah@kernel.org,
	brauner@kernel.org, bfoster@redhat.com, kent.overstreet@linux.dev, pvorel@suse.cz,
	rppt@kernel.org, richard.weiyang@gmail.com, anup@brainfault.org,
	haibo1.xu@intel.com, ajones@ventanamicro.com, vkuznets@redhat.com,
	maciej.wieczor-retman@intel.com, pgonda@google.com, oliver.upton@linux.dev,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org,
	linux-kselftest@vger.kernel.org, linux-fsdevel@kvack.org
Content-Type: text/plain; charset="UTF-8"

Elliot Berman writes:

> On Tue, Sep 10, 2024 at 11:44:01PM +0000, Ackerley Tng wrote:
>> Since guest_memfd now supports mmap(), folios have to be prepared
>> before they are faulted into userspace.
>>
>> When memory attributes are switched between shared and private, the
>> up-to-date flags will be cleared.
>>
>> Use the folio's up-to-date flag to indicate being ready for the guest
>> usage and can be used to mark whether the folio is ready for shared OR
>> private use.
>
> Clearing the up-to-date flag also means that the page gets zero'd out
> whenever it transitions between shared and private (either direction).
> pKVM (Android) hypervisor policy can allow in-place conversion between
> shared/private.
>
> I believe the important thing is that sev_gmem_prepare() needs to be
> called prior to giving page to guest. In my series, I had made a
> ->prepare_inaccessible() callback where KVM would only do this part.
> When transitioning to inaccessible, only that callback would be made,
> besides the bookkeeping. The folio zeroing happens once when allocating
> the folio if the folio is initially accessible (faultable).
>
> From x86 CoCo perspective, I think it also makes sense to not zero
> the folio when changing faultiblity from private to shared:
>  - If guest is sharing some data with host, you've wiped the data and
>    guest has to copy again.
>  - Or, if SEV/TDX enforces that page is zero'd between transitions,
>    Linux has duplicated the work that trusted entity has already done.
>
> Fuad and I can help add some details for the conversion. Hopefully we
> can figure out some of the plan at plumbers this week.

Zeroing the page prevents leaking host data (see the function docstring
for kvm_gmem_prepare_folio() introduced in [1]), so we definitely don't
want to introduce a kernel data leak bug here.

In-place conversion does require preservation of data, so for
conversions, shall we zero depending on VM type?

+ Gunyah: don't zero, since ->prepare_inaccessible() is a no-op
+ pKVM: don't zero
+ TDX: don't zero
+ SEV: AMD Architecture Programmer's Manual 7.10.6 says there is no
  automatic encryption and implies no zeroing, hence perform zeroing
+ KVM_X86_SW_PROTECTED_VM: doesn't have a formal definition, so I guess
  we could require zeroing on transition?

This way, the uptodate flag means that the folio has been prepared (as
in sev_gmem_prepare()), and zeroed if required by VM type.

Regarding flushing the dcache/tlb in your other question [2], if we
don't use folio_zero_user(), can we rely on unmapping within core-mm
to flush after shared use, and unmapping within KVM to flush after
private use?

Or should flush_dcache_folio() be explicitly called in
kvm_gmem_fault()?

clear_highpage(), used in the non-hugetlb (original) path, doesn't flush
the dcache. Was that intended?
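
To make the per-VM-type question above concrete, here is a rough
userspace sketch of the policy, not kernel code. Everything in it is an
assumption made for illustration: the gmem_vm_type enum, struct
toy_folio, gmem_zero_on_conversion() and gmem_convert() are hypothetical
names, and the table simply mirrors the list above (zero for SEV and
KVM_X86_SW_PROTECTED_VM, preserve data for Gunyah, pKVM and TDX).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical VM-type tags for illustration; not the kernel's identifiers. */
enum gmem_vm_type {
	GMEM_VM_GUNYAH,
	GMEM_VM_PKVM,
	GMEM_VM_TDX,
	GMEM_VM_SEV,
	GMEM_VM_SW_PROTECTED,
};

/* Should a folio be zeroed when converted between shared and private? */
static bool gmem_zero_on_conversion(enum gmem_vm_type type)
{
	switch (type) {
	case GMEM_VM_SEV:		/* no zeroing implied by hardware */
	case GMEM_VM_SW_PROTECTED:	/* no formal definition: be safe */
		return true;
	case GMEM_VM_GUNYAH:
	case GMEM_VM_PKVM:
	case GMEM_VM_TDX:
		return false;		/* in-place conversion keeps data */
	}
	return true;	/* unknown type: zero rather than risk a leak */
}

/* Toy stand-in for a folio: a small buffer plus an up-to-date flag. */
struct toy_folio {
	unsigned char data[64];
	bool uptodate;
};

/* Model one shared<->private conversion for the given VM type. */
static void gmem_convert(struct toy_folio *f, enum gmem_vm_type type)
{
	assert(f != NULL);

	f->uptodate = false;	/* conversion clears the flag */
	if (gmem_zero_on_conversion(type))
		memset(f->data, 0, sizeof(f->data));
	/* Arch-specific preparation (e.g. sev_gmem_prepare()) would go here. */
	f->uptodate = true;	/* prepared, and zeroed if the type requires */
}
```

With this shape, uptodate keeps a single meaning (prepared, and zeroed
where the VM type demands it), and the fallback for an unrecognised type
errs on the side of zeroing.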

> Thanks,
> Elliot

[1] https://lore.kernel.org/all/20240726185157.72821-8-pbonzini@redhat.com/
[2] https://lore.kernel.org/all/diqz34ldszp3.fsf@ackerleytng-ctop.c.googlers.com/