From: Vishal Annapurve <vannapurve@google.com>
To: Shivank Garg <shivankg@amd.com>
Cc: akpm@linux-foundation.org, willy@infradead.org,
	pbonzini@redhat.com,  linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, linux-coco@lists.linux.dev,
	 chao.gao@intel.com, seanjc@google.com, ackerleytng@google.com,
	 david@redhat.com, vbabka@suse.cz, bharata@amd.com,
	nikunj@amd.com,  michael.day@amd.com, Neeraj.Upadhyay@amd.com,
	thomas.lendacky@amd.com,  michael.roth@amd.com, tabba@google.com
Subject: Re: [PATCH v6 0/5] Add NUMA mempolicy support for KVM guest-memfd
Date: Sat, 8 Mar 2025 17:09:46 -0800	[thread overview]
Message-ID: <CAGtprH8akKUF=8+RkX_QMjp35C0bU1zxGi4v1Zm5AWCw=8V8AQ@mail.gmail.com> (raw)
In-Reply-To: <20250226082549.6034-1-shivankg@amd.com>

On Wed, Feb 26, 2025 at 12:28 AM Shivank Garg <shivankg@amd.com> wrote:
>
> In this patch series:
> Based on the discussion in the bi-weekly guest_memfd upstream call on
> 2025-02-20[4], I have dropped the RFC tag, documented the memory allocation
> behavior after policy changes, and added selftests.
>
>
> KVM's guest-memfd memory backend currently lacks support for NUMA policy
> enforcement, causing guest memory allocations to be distributed arbitrarily
> across host NUMA nodes regardless of the policy specified by the VMM. This
> occurs because conventional userspace NUMA control mechanisms like mbind()
> are ineffective with guest-memfd, as the memory isn't directly mapped to
> userspace when allocations occur.
>
> This patch series adds NUMA binding capabilities for guest_memfd-backed
> KVM guests. It has evolved through several approaches based on community
> feedback:
>
> - v1,v2: Extended the KVM_CREATE_GUEST_MEMFD IOCTL to pass mempolicy.
> - v3: Introduced fbind() syscall for VMM memory-placement configuration.
> - v4-v6: Current approach using shared_policy support and vm_ops (based on
>       suggestions from David[1] and guest_memfd biweekly upstream call[2]).
>
> For SEV-SNP guests, which use the guest-memfd memory backend, NUMA-aware
> memory placement is essential for optimal performance, particularly for
> memory-intensive workloads.
>
> This series implements proper NUMA policy support for guest-memfd by:
>
> 1. Adding mempolicy-aware allocation APIs to the filemap layer.
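
To make item 1 concrete for myself, below is a rough sketch of what a
mempolicy-aware allocation path could look like when combined with the
shared_policy/vm_ops approach. This is purely illustrative (all names
are invented and the node selection is oversimplified), not the actual
patches:

/*
 * Purely illustrative: invented names, not the actual patches.
 * guest_memfd would keep a struct shared_policy per inode, expose it
 * via vm_ops->get_policy/set_policy so mbind() on a mapping updates
 * it, and the allocation path would look up the per-offset policy
 * instead of following the task policy.
 */
#include <linux/fs.h>
#include <linux/mempolicy.h>
#include <linux/pagemap.h>
#include <linux/topology.h>

struct gmem_inode_info {                        /* hypothetical */
        struct shared_policy policy;            /* offset -> mempolicy */
};

/* Returns a locked folio for @index, allocating per the shared policy. */
static struct folio *gmem_grab_folio_mpol(struct address_space *mapping,
                                          struct gmem_inode_info *info,
                                          pgoff_t index)
{
        struct mempolicy *mpol;
        struct folio *folio;
        int nid;

        folio = filemap_lock_folio(mapping, index);
        if (!IS_ERR_OR_NULL(folio))
                return folio;           /* already in the page cache */

        /* Policy installed earlier via mbind() -> vm_ops->set_policy(). */
        mpol = mpol_shared_policy_lookup(&info->policy, index);

        /*
         * Oversimplified node selection: honour the first node of a
         * BIND/PREFERRED style policy, else fall back to the local
         * node. The real series would plumb mpol into the filemap
         * allocation helpers instead.
         */
        nid = (mpol && !nodes_empty(mpol->nodes)) ?
                first_node(mpol->nodes) : numa_node_id();

        folio = __folio_alloc(mapping_gfp_mask(mapping), 0, nid,
                              mpol ? &mpol->nodes : NULL);
        mpol_cond_put(mpol);
        if (!folio)
                return ERR_PTR(-ENOMEM);

        if (filemap_add_folio(mapping, folio, index,
                              mapping_gfp_mask(mapping))) {
                folio_put(folio);
                return ERR_PTR(-EEXIST);
        }
        return folio;
}

That looks fine for memory that stays in the filemap, but it also shows
how tightly the series is coupled to it, which brings me to my larger
concern.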

I have been thinking more about this after the last guest_memfd
upstream call on March 6th.

To allow 1G page support with guest_memfd [1] without incurring
significant memory overheads, it's important to support in-place memory
conversion, with private hugepages getting split/merged upon
conversion. Private pages can be seamlessly split/merged only if the
refcounts of all their subpages are frozen; the most effective way to
achieve and enforce this is to simply not have struct pages for private
memory. All guest_memfd private range users (including the IOMMU [2] in
the future) can then request pfns for offsets and get notified about
invalidation when the pfns go away.
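
In rough terms, the interface I am imagining for private ranges looks
something like the sketch below. This is purely illustrative; all names
are invented and none of this is an existing guest_memfd API:

/*
 * Purely illustrative sketch of a struct-page-less private range API;
 * every name here is invented.
 */
#include <linux/fs.h>
#include <linux/list.h>
#include <linux/types.h>

/* A consumer of private guest_memfd memory (KVM, IOMMU, ...). */
struct gmem_private_user {
        /*
         * Called before a pfn range goes away (conversion to shared,
         * truncation, ...). The user must drop its mappings; there is
         * no struct page refcount pinning the memory.
         */
        void (*invalidate)(struct gmem_private_user *user,
                           pgoff_t start, pgoff_t end);
        struct list_head node;  /* linked into the inode's user list */
};

struct gmem_private_ops {
        /* Resolve offset -> pfn (plus max contiguous order). */
        int (*get_pfn)(struct inode *inode, pgoff_t index,
                       unsigned long *pfn, unsigned int *max_order);
        /* Register/unregister for invalidation notifications. */
        int (*register_user)(struct inode *inode,
                             struct gmem_private_user *user);
        void (*unregister_user)(struct inode *inode,
                                struct gmem_private_user *user);
};

The key point is that consumers never hold struct page references; they
only hold pfns that stay valid until the invalidate callback fires.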

Not having struct pages for private memory also provides additional benefits:
* Significantly less memory overhead for handling split/merge operations
    - With struct pages around, every split of a 1G page needs struct
page allocations for 512 * 512 4K pages in the worst case (at the
typical 64 bytes per struct page, roughly 16 MiB of metadata per 1G
page).
* Enables a roadmap for PFN range allocators in the backend, and use
cases like KHO [3] that target use of memory without struct pages.

IIRC, the filemap was used mostly as a matter of convenience for the
initial guest_memfd implementation.

As pointed out by David in the call, to get rid of struct pages for
private memory ranges, the filemap/pagecache needs to be replaced by a
lightweight mechanism that tracks offset -> pfn mappings for private
memory ranges, while still keeping the filemap/pagecache for shared
memory ranges (it's still needed to allow GUP use cases). I am starting
to think that the filemap replacement for private memory ranges should
be done sooner rather than later; otherwise it will only get harder as
more features land in guest_memfd that rely on the presence of the
filemap.
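
As a strawman, the private-side replacement for the filemap could be as
small as an xarray keyed by page index. Again, this is just a sketch
with invented names, ignoring huge page granularity and the
invalidation bookkeeping:

/*
 * Strawman offset -> pfn tracking for private ranges: an xarray keyed
 * by page index storing raw pfns, with no struct page involvement.
 * Names are invented for illustration.
 */
#include <linux/errno.h>
#include <linux/types.h>
#include <linux/xarray.h>

struct gmem_private_mapping {
        struct xarray pfns;     /* page index -> pfn (as xa value) */
};

static inline void gmem_private_init(struct gmem_private_mapping *m)
{
        xa_init(&m->pfns);
}

static inline int gmem_private_store(struct gmem_private_mapping *m,
                                     pgoff_t index, unsigned long pfn)
{
        /*
         * xa_mk_value() limits us to values below 2^(BITS_PER_LONG-1),
         * which is plenty for any realistic pfn.
         */
        return xa_err(xa_store(&m->pfns, index, xa_mk_value(pfn),
                               GFP_KERNEL));
}

static inline int gmem_private_lookup(struct gmem_private_mapping *m,
                                      pgoff_t index, unsigned long *pfn)
{
        void *entry = xa_load(&m->pfns, index);

        if (!entry)
                return -ENOENT;
        *pfn = xa_to_value(entry);
        return 0;
}

static inline void gmem_private_remove(struct gmem_private_mapping *m,
                                       pgoff_t index)
{
        xa_erase(&m->pfns, index);
}

Obviously not the whole story, but it illustrates how little of the
filemap machinery the private side actually needs.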

This discussion matters more for hugepages and PFN range allocations.
I would like to ensure that we have consensus on this direction.

[1] https://lpc.events/event/18/contributions/1764/
[2] https://lore.kernel.org/kvm/CAGtprH8C4MQwVTFPBMbFWyW4BrK8-mDqjJn-UUFbFhw4w23f3A@mail.gmail.com/
[3] https://lore.kernel.org/linux-mm/20240805093245.889357-1-jgowans@amazon.com/


Thread overview: 17+ messages
2025-02-26  8:25 Shivank Garg
2025-02-26  8:25 ` [PATCH v6 1/5] mm/filemap: add mempolicy support to the filemap layer Shivank Garg
2025-02-28 14:17   ` Vlastimil Babka
2025-02-28 17:51     ` Ackerley Tng
2025-02-26  8:25 ` [PATCH v6 2/5] mm/mempolicy: export memory policy symbols Shivank Garg
2025-02-26 13:59   ` Vlastimil Babka
2025-02-26  8:25 ` [PATCH v6 3/5] KVM: guest_memfd: Pass file pointer instead of inode pointer Shivank Garg
2025-02-26  8:25 ` [PATCH v6 4/5] KVM: guest_memfd: Enforce NUMA mempolicy using shared policy Shivank Garg
2025-02-28 17:25   ` Ackerley Tng
2025-03-03  8:58     ` Vlastimil Babka
2025-03-04  0:19       ` Ackerley Tng
2025-03-04 15:30         ` Sean Christopherson
2025-03-04 15:51           ` David Hildenbrand
2025-03-04 16:59             ` Sean Christopherson
2025-02-26  8:25 ` [PATCH v6 5/5] KVM: guest_memfd: selftests: add tests for mmap and NUMA policy support Shivank Garg
2025-03-09  1:09 ` Vishal Annapurve [this message]
2025-03-09 18:52   ` [PATCH v6 0/5] Add NUMA mempolicy support for KVM guest-memfd Vishal Annapurve
