Date: Wed, 02 Apr 2025 16:48:40 -0700
In-Reply-To: <20250328153133.3504118-5-tabba@google.com> (message from Fuad Tabba on Fri, 28 Mar 2025 15:31:30 +0000)
Subject: Re: [PATCH v7 4/7] KVM: guest_memfd: Folio sharing states and functions that manage their transition
From: Ackerley Tng
To: Fuad Tabba
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
    pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
    anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
    aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
    brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
    xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
    jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
    isaku.yamahata@intel.com, mic@digikod.net, vbabka@suse.cz,
    vannapurve@google.com, mail@maciej.szmigiero.name, david@redhat.com,
    michael.roth@amd.com, wei.w.wang@intel.com, liam.merwick@oracle.com,
    isaku.yamahata@gmail.com, kirill.shutemov@linux.intel.com,
    suzuki.poulose@arm.com, steven.price@arm.com, quic_eberman@quicinc.com,
    quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com,
    quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
    quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
    catalin.marinas@arm.com, james.morse@arm.com, yuzenghui@huawei.com,
    oliver.upton@linux.dev, maz@kernel.org, will@kernel.org,
    qperret@google.com, keirf@google.com, roypat@amazon.co.uk,
    shuah@kernel.org, hch@infradead.org, jgg@nvidia.com, rientjes@google.com,
    jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
    jthoughton@google.com, peterx@redhat.com, pankaj.gupta@amd.com,
    tabba@google.com

Fuad Tabba writes:

> To allow in-place sharing of guest_memfd folios with the host,
> guest_memfd needs to track their sharing state, because mapping of
> shared folios will only be allowed where it safe to access these folios.
> It is safe to map and access these folios when explicitly shared with
> the host, or potentially if not yet exposed to the guest (e.g., at
> initialization).
>
> This patch introduces sharing states for guest_memfd folios as well as
> the functions that manage transitioning between those states.
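Just to make sure I'm reading the intended state machine correctly, here
is a minimal, purely illustrative sketch of a caller driving these
transitions. The handler below is made up and not part of this patch;
the actual callers are presumably wired up elsewhere in the series
(e.g. from the architecture's share/unshare handling):

/* Illustrative only: the handler name and its plumbing are assumptions. */
static int example_handle_share_request(struct kvm *kvm, gfn_t start,
					gfn_t end, bool share)
{
	/*
	 * share == true : the range becomes KVM_GMEM_ALL_SHARED (both the
	 *                 host and the guest may fault the folios in).
	 * share == false: folios with no outstanding host references go
	 *                 straight to KVM_GMEM_GUEST_SHARED; the rest sit in
	 *                 the transient KVM_GMEM_NONE_SHARED state until the
	 *                 last host reference is dropped.
	 */
	if (share)
		return kvm_gmem_set_shared(kvm, start, end);

	return kvm_gmem_clear_shared(kvm, start, end);
}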
>
> Signed-off-by: Fuad Tabba
> ---
>  include/linux/kvm_host.h |  39 +++++++-
>  virt/kvm/guest_memfd.c   | 208 ++++++++++++++++++++++++++++++++++++---
>  virt/kvm/kvm_main.c      |  62 ++++++++++++
>  3 files changed, 295 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index bc73d7426363..bf82faf16c53 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2600,7 +2600,44 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
>  #endif
>
>  #ifdef CONFIG_KVM_GMEM_SHARED_MEM
> +int kvm_gmem_set_shared(struct kvm *kvm, gfn_t start, gfn_t end);
> +int kvm_gmem_clear_shared(struct kvm *kvm, gfn_t start, gfn_t end);
> +int kvm_gmem_slot_set_shared(struct kvm_memory_slot *slot, gfn_t start,
> +			     gfn_t end);
> +int kvm_gmem_slot_clear_shared(struct kvm_memory_slot *slot, gfn_t start,
> +			       gfn_t end);
> +bool kvm_gmem_slot_is_guest_shared(struct kvm_memory_slot *slot, gfn_t gfn);
>  void kvm_gmem_handle_folio_put(struct folio *folio);
> -#endif
> +#else
> +static inline int kvm_gmem_set_shared(struct kvm *kvm, gfn_t start, gfn_t end)
> +{
> +	WARN_ON_ONCE(1);
> +	return -EINVAL;
> +}
> +static inline int kvm_gmem_clear_shared(struct kvm *kvm, gfn_t start,
> +					gfn_t end)
> +{
> +	WARN_ON_ONCE(1);
> +	return -EINVAL;
> +}
> +static inline int kvm_gmem_slot_set_shared(struct kvm_memory_slot *slot,
> +					   gfn_t start, gfn_t end)
> +{
> +	WARN_ON_ONCE(1);
> +	return -EINVAL;
> +}
> +static inline int kvm_gmem_slot_clear_shared(struct kvm_memory_slot *slot,
> +					     gfn_t start, gfn_t end)
> +{
> +	WARN_ON_ONCE(1);
> +	return -EINVAL;
> +}
> +static inline bool kvm_gmem_slot_is_guest_shared(struct kvm_memory_slot *slot,
> +						 gfn_t gfn)
> +{
> +	WARN_ON_ONCE(1);
> +	return false;
> +}
> +#endif /* CONFIG_KVM_GMEM_SHARED_MEM */
>
>  #endif
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index cde16ed3b230..3b4d724084a8 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -29,14 +29,6 @@ static struct kvm_gmem_inode_private *kvm_gmem_private(struct inode *inode)
>  	return inode->i_mapping->i_private_data;
>  }
>
> -#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> -void kvm_gmem_handle_folio_put(struct folio *folio)
> -{
> -	WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
> -}
> -EXPORT_SYMBOL_GPL(kvm_gmem_handle_folio_put);
> -#endif /* CONFIG_KVM_GMEM_SHARED_MEM */
> -
>  /**
>   * folio_file_pfn - like folio_file_page, but return a pfn.
>   * @folio: The folio which contains this index.
> @@ -389,22 +381,211 @@ static void kvm_gmem_init_mount(void)
>  }
>
>  #ifdef CONFIG_KVM_GMEM_SHARED_MEM
> -static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
> +/*
> + * An enum of the valid folio sharing states:
> + * Bit 0: set if not shared with the guest (guest cannot fault it in)
> + * Bit 1: set if not shared with the host (host cannot fault it in)
> + */
> +enum folio_shareability {
> +	KVM_GMEM_ALL_SHARED	= 0b00,	/* Shared with the host and the guest. */
> +	KVM_GMEM_GUEST_SHARED	= 0b10,	/* Shared only with the guest. */
> +	KVM_GMEM_NONE_SHARED	= 0b11,	/* Not shared, transient state. */
> +};
> +
> +static int kvm_gmem_offset_set_shared(struct inode *inode, pgoff_t index)
>  {
> -	struct kvm_gmem *gmem = file->private_data;
> +	struct xarray *shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
> +	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
> +	void *xval = xa_mk_value(KVM_GMEM_ALL_SHARED);
> +
> +	lockdep_assert_held_write(offsets_lock);
> +
> +	return xa_err(xa_store(shared_offsets, index, xval, GFP_KERNEL));
> +}
> +
> +/*
> + * Marks the range [start, end) as shared with both the host and the guest.
> + * Called when guest shares memory with the host.
> + */
> +static int kvm_gmem_offset_range_set_shared(struct inode *inode,
> +					    pgoff_t start, pgoff_t end)
> +{
> +	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
> +	pgoff_t i;
> +	int r = 0;
> +
> +	write_lock(offsets_lock);
> +	for (i = start; i < end; i++) {
> +		r = kvm_gmem_offset_set_shared(inode, i);
> +		if (WARN_ON_ONCE(r))
> +			break;
> +	}
> +	write_unlock(offsets_lock);
> +
> +	return r;
> +}
> +
> +static int kvm_gmem_offset_clear_shared(struct inode *inode, pgoff_t index)
> +{
> +	struct xarray *shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
> +	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
> +	void *xval_guest = xa_mk_value(KVM_GMEM_GUEST_SHARED);
> +	void *xval_none = xa_mk_value(KVM_GMEM_NONE_SHARED);
> +	struct folio *folio;
> +	int refcount;
> +	int r;
> +
> +	lockdep_assert_held_write(offsets_lock);
> +
> +	folio = filemap_lock_folio(inode->i_mapping, index);
> +	if (!IS_ERR(folio)) {
> +		/* +1 references are expected because of filemap_lock_folio(). */
> +		refcount = folio_nr_pages(folio) + 1;
> +	} else {
> +		r = PTR_ERR(folio);
> +		if (WARN_ON_ONCE(r != -ENOENT))
> +			return r;
> +
> +		folio = NULL;
> +	}
> +
> +	if (!folio || folio_ref_freeze(folio, refcount)) {
> +		/*
> +		 * No outstanding references: transition to guest shared.
> +		 */
> +		r = xa_err(xa_store(shared_offsets, index, xval_guest, GFP_KERNEL));
> +
> +		if (folio)
> +			folio_ref_unfreeze(folio, refcount);
> +	} else {
> +		/*
> +		 * Outstanding references: the folio cannot be faulted in by
> +		 * anyone until they're dropped.
> +		 */
> +		r = xa_err(xa_store(shared_offsets, index, xval_none, GFP_KERNEL));

Once we do this on elevated refcounts, the truncate path needs to be
updated to handle the case where some folio is still in the
KVM_GMEM_NONE_SHARED state.

When a folio is found in the KVM_GMEM_NONE_SHARED state, its
shareability should be fast-forwarded to KVM_GMEM_GUEST_SHARED and the
filemap's refcounts restored. The folio can then be truncated from the
filemap as usual, which will drop the filemap's refcounts. (A rough
sketch of what I mean is at the end of this mail.)

> +	}
> +
> +	if (folio) {
> +		folio_unlock(folio);
> +		folio_put(folio);
> +	}
> +
> +	return r;
> +}
>
> +/*
> + * Marks the range [start, end) as not shared with the host. If the host doesn't
> + * have any references to a particular folio, then that folio is marked as
> + * shared with the guest.
> + *
> + * However, if the host still has references to the folio, then the folio is
> + * marked and not shared with anyone. Marking it as not shared allows draining
> + * all references from the host, and ensures that the hypervisor does not
> + * transition the folio to private, since the host still might access it.
> + *
> + * Called when guest unshares memory with the host.
> + */
> +static int kvm_gmem_offset_range_clear_shared(struct inode *inode,
> +					      pgoff_t start, pgoff_t end)
> +{
> +	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
> +	pgoff_t i;
> +	int r = 0;
> +
> +	write_lock(offsets_lock);
> +	for (i = start; i < end; i++) {
> +		r = kvm_gmem_offset_clear_shared(inode, i);
> +		if (WARN_ON_ONCE(r))
> +			break;
> +	}
> +	write_unlock(offsets_lock);
> +
> +	return r;
> +}
> +
> +void kvm_gmem_handle_folio_put(struct folio *folio)
> +{
> +	WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
> +}
> +EXPORT_SYMBOL_GPL(kvm_gmem_handle_folio_put);
> +
> +/*
> + * Returns true if the folio is shared with the host and the guest.
> + *
> + * Must be called with the offsets_lock lock held.
> + */
> +static bool kvm_gmem_offset_is_shared(struct inode *inode, pgoff_t index)
> +{
> +	struct xarray *shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
> +	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
> +	unsigned long r;
> +
> +	lockdep_assert_held(offsets_lock);
>
> -	/* For now, VMs that support shared memory share all their memory. */
> -	return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
> +	r = xa_to_value(xa_load(shared_offsets, index));
> +
> +	return r == KVM_GMEM_ALL_SHARED;
> +}
> +
> +/*
> + * Returns true if the folio is shared with the guest (not transitioning).
> + *
> + * Must be called with the offsets_lock lock held.
> + */
> +static bool kvm_gmem_offset_is_guest_shared(struct inode *inode, pgoff_t index)
> +{
> +	struct xarray *shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
> +	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
> +	unsigned long r;
> +
> +	lockdep_assert_held(offsets_lock);
> +
> +	r = xa_to_value(xa_load(shared_offsets, index));
> +
> +	return (r == KVM_GMEM_ALL_SHARED || r == KVM_GMEM_GUEST_SHARED);
> +}
> +
> +int kvm_gmem_slot_set_shared(struct kvm_memory_slot *slot, gfn_t start, gfn_t end)
> +{
> +	struct inode *inode = file_inode(READ_ONCE(slot->gmem.file));
> +	pgoff_t start_off = slot->gmem.pgoff + start - slot->base_gfn;
> +	pgoff_t end_off = start_off + end - start;
> +
> +	return kvm_gmem_offset_range_set_shared(inode, start_off, end_off);
> +}
> +
> +int kvm_gmem_slot_clear_shared(struct kvm_memory_slot *slot, gfn_t start, gfn_t end)
> +{
> +	struct inode *inode = file_inode(READ_ONCE(slot->gmem.file));
> +	pgoff_t start_off = slot->gmem.pgoff + start - slot->base_gfn;
> +	pgoff_t end_off = start_off + end - start;
> +
> +	return kvm_gmem_offset_range_clear_shared(inode, start_off, end_off);
> +}
> +
> +bool kvm_gmem_slot_is_guest_shared(struct kvm_memory_slot *slot, gfn_t gfn)
> +{
> +	struct inode *inode = file_inode(READ_ONCE(slot->gmem.file));
> +	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
> +	unsigned long pgoff = slot->gmem.pgoff + gfn - slot->base_gfn;
> +	bool r;
> +
> +	read_lock(offsets_lock);
> +	r = kvm_gmem_offset_is_guest_shared(inode, pgoff);
> +	read_unlock(offsets_lock);
> +
> +	return r;
>  }
>
>  static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
>  {
>  	struct inode *inode = file_inode(vmf->vma->vm_file);
> +	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
>  	struct folio *folio;
>  	vm_fault_t ret = VM_FAULT_LOCKED;
>
>  	filemap_invalidate_lock_shared(inode->i_mapping);
> +	read_lock(offsets_lock);
>
>  	folio = kvm_gmem_get_folio(inode, vmf->pgoff);
>  	if (IS_ERR(folio)) {
> @@ -423,7 +604,7 @@ static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
>  		goto out_folio;
>  	}
>
> -	if (!kvm_gmem_offset_is_shared(vmf->vma->vm_file, vmf->pgoff)) {
> +	if (!kvm_gmem_offset_is_shared(inode, vmf->pgoff)) {
>  		ret = VM_FAULT_SIGBUS;
>  		goto out_folio;
>  	}
> @@ -457,6 +638,7 @@ static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
>  	}
>
>  out_filemap:
> +	read_unlock(offsets_lock);
>  	filemap_invalidate_unlock_shared(inode->i_mapping);
>
>  	return ret;
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 3e40acb9f5c0..90762252381c 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3091,6 +3091,68 @@ static int next_segment(unsigned long len, int offset)
>  	return len;
>  }
>
> +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> +int kvm_gmem_set_shared(struct kvm *kvm, gfn_t start, gfn_t end)
> +{
> +	struct kvm_memslot_iter iter;
> +	int r = 0;
> +
> +	mutex_lock(&kvm->slots_lock);
> +
> +	kvm_for_each_memslot_in_gfn_range(&iter, kvm_memslots(kvm), start, end) {
> +		struct kvm_memory_slot *memslot = iter.slot;
> +		gfn_t gfn_start, gfn_end;
> +
> +		if (!kvm_slot_can_be_private(memslot))
> +			continue;
> +
> +		gfn_start = max(start, memslot->base_gfn);
> +		gfn_end = min(end, memslot->base_gfn + memslot->npages);
> +		if (WARN_ON_ONCE(start >= end))
> +			continue;
> +
> +		r = kvm_gmem_slot_set_shared(memslot, gfn_start, gfn_end);
> +		if (WARN_ON_ONCE(r))
> +			break;
> +	}
> +
> +	mutex_unlock(&kvm->slots_lock);
> +
> +	return r;
> +}
> +EXPORT_SYMBOL_GPL(kvm_gmem_set_shared);
> +
> +int kvm_gmem_clear_shared(struct kvm *kvm, gfn_t start, gfn_t end)
> +{
> +	struct kvm_memslot_iter iter;
> +	int r = 0;
> +
> +	mutex_lock(&kvm->slots_lock);
> +
> +	kvm_for_each_memslot_in_gfn_range(&iter, kvm_memslots(kvm), start, end) {
> +		struct kvm_memory_slot *memslot = iter.slot;
> +		gfn_t gfn_start, gfn_end;
> +
> +		if (!kvm_slot_can_be_private(memslot))
> +			continue;
> +
> +		gfn_start = max(start, memslot->base_gfn);
> +		gfn_end = min(end, memslot->base_gfn + memslot->npages);
> +		if (WARN_ON_ONCE(start >= end))
> +			continue;
> +
> +		r = kvm_gmem_slot_clear_shared(memslot, gfn_start, gfn_end);
> +		if (WARN_ON_ONCE(r))
> +			break;
> +	}
> +
> +	mutex_unlock(&kvm->slots_lock);
> +
> +	return r;
> +}
> +EXPORT_SYMBOL_GPL(kvm_gmem_clear_shared);
> +#endif /* CONFIG_KVM_GMEM_SHARED_MEM */
> +
>  /* Copy @len bytes from guest memory at '(@gfn * PAGE_SIZE) + @offset' to @data */
>  static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
>  				 void *data, int offset, int len)
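For reference, here is a rough, untested sketch of the truncate-path
handling I suggested above. It is illustrative only: the helper name is
made up, the call site (somewhere in the hole-punch path, with the folio
locked and offsets_lock held for write, before the folio is removed from
the filemap) is an assumption, and the number of references to restore
depends on how the pending-put bookkeeping ends up being implemented.

/*
 * Rough sketch, not part of this patch: fast-forward a folio that is
 * still in the transient KVM_GMEM_NONE_SHARED state so that it can be
 * truncated normally.
 */
static int kvm_gmem_fixup_none_shared(struct inode *inode, pgoff_t index,
				      struct folio *folio)
{
	struct xarray *shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;

	lockdep_assert_held_write(offsets_lock);

	if (xa_to_value(xa_load(shared_offsets, index)) != KVM_GMEM_NONE_SHARED)
		return 0;

	/*
	 * Restore the filemap's references (placeholder: the exact count to
	 * re-take is an assumption), then fast-forward the shareability so
	 * the folio is treated as if all host references had already been
	 * dropped. Truncation can then remove it from the filemap as usual,
	 * dropping those references again.
	 */
	folio_ref_add(folio, folio_nr_pages(folio));

	return xa_err(xa_store(shared_offsets, index,
			       xa_mk_value(KVM_GMEM_GUEST_SHARED), GFP_KERNEL));
}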