linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Fuad Tabba <tabba@google.com>
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org,
	linux-mm@kvack.org, pbonzini@redhat.com, chenhuacai@kernel.org,
	mpe@ellerman.id.au, anup@brainfault.org,
		paul.walmsley@sifive.com, palmer@dabbelt.com,
	aou@eecs.berkeley.edu, seanjc@google.com,
		viro@zeniv.linux.org.uk, brauner@kernel.org,
	willy@infradead.org, 	akpm@linux-foundation.org,
	xiaoyao.li@intel.com, yilun.xu@intel.com,
		chao.p.peng@linux.intel.com, jarkko@kernel.org,
	amoorthy@google.com, dmatlack@google.com,
	yu.c.zhang@linux.intel.com, isaku.yamahata@intel.com,
	mic@digikod.net, vbabka@suse.cz, 	vannapurve@google.com,
	ackerleytng@google.com, mail@maciej.szmigiero.name,
		david@redhat.com, michael.roth@amd.com, wei.w.wang@intel.com,
		liam.merwick@oracle.com, isaku.yamahata@gmail.com,
	kirill.shutemov@linux.intel.com, 	suzuki.poulose@arm.com,
	steven.price@arm.com, quic_eberman@quicinc.com,
	quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com,
	quic_svaddagi@quicinc.com, 	quic_cvanscha@quicinc.com,
	quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
	catalin.marinas@arm.com, james.morse@arm.com,
	yuzenghui@huawei.com, oliver.upton@linux.dev, maz@kernel.org,
	will@kernel.org, qperret@google.com, 	keirf@google.com,
	roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org,
		jgg@nvidia.com, rientjes@google.com, jhubbard@nvidia.com,
	fvdl@google.com, hughd@google.com, jthoughton@google.com
Subject: Re: [RFC PATCH v5 05/15] KVM: guest_memfd: Folio mappability states and functions that manage their transition
Date: Mon, 20 Jan 2025 12:30:03 +0200	[thread overview]
Message-ID: <r425iid27x5ybtm4awz3gx2sxibuhlsr6me3e6e3kjtl5nuwia@2xgh3frgoovs> (raw)
In-Reply-To: <20250117163001.2326672-6-tabba@google.com>

On Fri, Jan 17, 2025 at 04:29:51PM +0000, Fuad Tabba wrote:
> +/*
> + * Marks the range [start, end) as not mappable by the host. If the host doesn't
> + * have any references to a particular folio, then that folio is marked as
> + * mappable by the guest.
> + *
> + * However, if the host still has references to the folio, then the folio is
> + * marked and not mappable by anyone. Marking it is not mappable allows it to
> + * drain all references from the host, and to ensure that the hypervisor does
> + * not transition the folio to private, since the host still might access it.
> + *
> + * Usually called when guest unshares memory with the host.
> + */
> +static int gmem_clear_mappable(struct inode *inode, pgoff_t start, pgoff_t end)
> +{
> +	struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets;
> +	void *xval_guest = xa_mk_value(KVM_GMEM_GUEST_MAPPABLE);
> +	void *xval_none = xa_mk_value(KVM_GMEM_NONE_MAPPABLE);
> +	pgoff_t i;
> +	int r = 0;
> +
> +	filemap_invalidate_lock(inode->i_mapping);
> +	for (i = start; i < end; i++) {
> +		struct folio *folio;
> +		int refcount = 0;
> +
> +		folio = filemap_lock_folio(inode->i_mapping, i);
> +		if (!IS_ERR(folio)) {
> +			refcount = folio_ref_count(folio);
> +		} else {
> +			r = PTR_ERR(folio);
> +			if (WARN_ON_ONCE(r != -ENOENT))
> +				break;
> +
> +			folio = NULL;
> +		}
> +
> +		/* +1 references are expected because of filemap_lock_folio(). */
> +		if (folio && refcount > folio_nr_pages(folio) + 1) {

Looks racy.

What prevent anybody from obtaining a reference just after check?

Lock on folio doesn't stop random filemap_get_entry() from elevating the
refcount.

folio_ref_freeze() might be required.

> +			/*
> +			 * Outstanding references, the folio cannot be faulted
> +			 * in by anyone until they're dropped.
> +			 */
> +			r = xa_err(xa_store(mappable_offsets, i, xval_none, GFP_KERNEL));
> +		} else {
> +			/*
> +			 * No outstanding references. Transition the folio to
> +			 * guest mappable immediately.
> +			 */
> +			r = xa_err(xa_store(mappable_offsets, i, xval_guest, GFP_KERNEL));
> +		}
> +
> +		if (folio) {
> +			folio_unlock(folio);
> +			folio_put(folio);
> +		}
> +
> +		if (WARN_ON_ONCE(r))
> +			break;
> +	}
> +	filemap_invalidate_unlock(inode->i_mapping);
> +
> +	return r;
> +}

-- 
  Kiryl Shutsemau / Kirill A. Shutemov


  reply	other threads:[~2025-01-20 10:30 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-17 16:29 [RFC PATCH v5 00/15] KVM: Restricted mapping of guest_memfd at the host and arm64 support Fuad Tabba
2025-01-17 16:29 ` [RFC PATCH v5 01/15] mm: Consolidate freeing of typed folios on final folio_put() Fuad Tabba
2025-01-17 22:05   ` Elliot Berman
2025-01-19 14:39     ` Fuad Tabba
2025-01-20 10:39     ` David Hildenbrand
2025-01-20 10:50       ` Fuad Tabba
2025-01-20 10:39   ` David Hildenbrand
2025-01-20 10:43     ` Fuad Tabba
2025-01-20 10:43     ` Vlastimil Babka
2025-01-20 11:12       ` Vlastimil Babka
2025-01-20 11:28       ` David Hildenbrand
2025-01-17 16:29 ` [RFC PATCH v5 02/15] KVM: guest_memfd: Make guest mem use guest mem inodes instead of anonymous inodes Fuad Tabba
2025-01-24  4:25   ` Gavin Shan
2025-01-29 10:12     ` Fuad Tabba
2025-02-11 15:58     ` Ackerley Tng
2025-01-17 16:29 ` [RFC PATCH v5 03/15] KVM: guest_memfd: Introduce kvm_gmem_get_pfn_locked(), which retains the folio lock Fuad Tabba
2025-01-17 16:29 ` [RFC PATCH v5 04/15] KVM: guest_memfd: Track mappability within a struct kvm_gmem_private Fuad Tabba
2025-01-24  5:31   ` Gavin Shan
2025-01-29 10:15     ` Fuad Tabba
2025-02-26 22:29       ` Ackerley Tng
2025-01-17 16:29 ` [RFC PATCH v5 05/15] KVM: guest_memfd: Folio mappability states and functions that manage their transition Fuad Tabba
2025-01-20 10:30   ` Kirill A. Shutemov [this message]
2025-01-20 10:40     ` Fuad Tabba
2025-02-06  3:14       ` Ackerley Tng
2025-02-06  9:45         ` Fuad Tabba
2025-02-19 23:33   ` Ackerley Tng
2025-02-20  9:26     ` Fuad Tabba
2025-01-17 16:29 ` [RFC PATCH v5 06/15] KVM: guest_memfd: Handle final folio_put() of guestmem pages Fuad Tabba
2025-01-20 11:37   ` Vlastimil Babka
2025-01-20 12:14     ` Fuad Tabba
2025-01-22 22:24       ` Ackerley Tng
2025-01-23 11:00         ` Fuad Tabba
2025-02-06  3:18           ` Ackerley Tng
2025-02-06  3:28           ` Ackerley Tng
2025-02-06  9:47             ` Fuad Tabba
2025-01-30 14:23         ` Fuad Tabba
2025-01-22 22:16   ` Ackerley Tng
2025-01-23  9:50     ` Fuad Tabba
2025-02-05  1:28       ` Vishal Annapurve
2025-02-05  4:31         ` Ackerley Tng
2025-02-05  5:58           ` Vishal Annapurve
2025-02-05  0:42   ` Vishal Annapurve
2025-02-05 10:06     ` Fuad Tabba
2025-02-05 17:39       ` Vishal Annapurve
2025-02-05 17:42         ` Vishal Annapurve
2025-02-07 10:46           ` Ackerley Tng
2025-02-10 16:04             ` Fuad Tabba
2025-02-05  0:51   ` Vishal Annapurve
2025-02-05 10:07     ` Fuad Tabba
2025-02-06  3:37   ` Ackerley Tng
2025-02-06  9:49     ` Fuad Tabba
2025-01-17 16:29 ` [RFC PATCH v5 07/15] KVM: guest_memfd: Allow host to mmap guest_memfd() pages when shared Fuad Tabba
2025-01-17 16:29 ` [RFC PATCH v5 08/15] KVM: guest_memfd: Add guest_memfd support to kvm_(read|/write)_guest_page() Fuad Tabba
2025-01-17 16:29 ` [RFC PATCH v5 09/15] KVM: guest_memfd: Add KVM capability to check if guest_memfd is host mappable Fuad Tabba
2025-01-17 16:29 ` [RFC PATCH v5 10/15] KVM: guest_memfd: Add a guest_memfd() flag to initialize it as mappable Fuad Tabba
2025-01-17 16:29 ` [RFC PATCH v5 11/15] KVM: guest_memfd: selftests: guest_memfd mmap() test when mapping is allowed Fuad Tabba
2025-01-17 16:29 ` [RFC PATCH v5 12/15] KVM: arm64: Skip VMA checks for slots without userspace address Fuad Tabba
2025-01-17 16:29 ` [RFC PATCH v5 13/15] KVM: arm64: Refactor user_mem_abort() calculation of force_pte Fuad Tabba
2025-01-17 16:30 ` [RFC PATCH v5 14/15] KVM: arm64: Handle guest_memfd()-backed guest page faults Fuad Tabba
2025-01-17 16:30 ` [RFC PATCH v5 15/15] KVM: arm64: Enable guest_memfd private memory when pKVM is enabled Fuad Tabba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=r425iid27x5ybtm4awz3gx2sxibuhlsr6me3e6e3kjtl5nuwia@2xgh3frgoovs \
    --to=kirill@shutemov.name \
    --cc=ackerleytng@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=amoorthy@google.com \
    --cc=anup@brainfault.org \
    --cc=aou@eecs.berkeley.edu \
    --cc=brauner@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=chao.p.peng@linux.intel.com \
    --cc=chenhuacai@kernel.org \
    --cc=david@redhat.com \
    --cc=dmatlack@google.com \
    --cc=fvdl@google.com \
    --cc=hch@infradead.org \
    --cc=hughd@google.com \
    --cc=isaku.yamahata@gmail.com \
    --cc=isaku.yamahata@intel.com \
    --cc=james.morse@arm.com \
    --cc=jarkko@kernel.org \
    --cc=jgg@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=jthoughton@google.com \
    --cc=keirf@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=liam.merwick@oracle.com \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mail@maciej.szmigiero.name \
    --cc=maz@kernel.org \
    --cc=mic@digikod.net \
    --cc=michael.roth@amd.com \
    --cc=mpe@ellerman.id.au \
    --cc=oliver.upton@linux.dev \
    --cc=palmer@dabbelt.com \
    --cc=paul.walmsley@sifive.com \
    --cc=pbonzini@redhat.com \
    --cc=qperret@google.com \
    --cc=quic_cvanscha@quicinc.com \
    --cc=quic_eberman@quicinc.com \
    --cc=quic_mnalajal@quicinc.com \
    --cc=quic_pderrin@quicinc.com \
    --cc=quic_pheragu@quicinc.com \
    --cc=quic_svaddagi@quicinc.com \
    --cc=quic_tsoni@quicinc.com \
    --cc=rientjes@google.com \
    --cc=roypat@amazon.co.uk \
    --cc=seanjc@google.com \
    --cc=shuah@kernel.org \
    --cc=steven.price@arm.com \
    --cc=suzuki.poulose@arm.com \
    --cc=tabba@google.com \
    --cc=vannapurve@google.com \
    --cc=vbabka@suse.cz \
    --cc=viro@zeniv.linux.org.uk \
    --cc=wei.w.wang@intel.com \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    --cc=xiaoyao.li@intel.com \
    --cc=yilun.xu@intel.com \
    --cc=yu.c.zhang@linux.intel.com \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox