From: Elliot Berman <quic_eberman@quicinc.com>
To: James Gowans <jgowans@amazon.com>
Cc: <linux-kernel@vger.kernel.org>,
Sean Christopherson <seanjc@google.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Steve Sistare <steven.sistare@oracle.com>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
Anthony Yznaga <anthony.yznaga@oracle.com>,
Mike Rapoport <rppt@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>, <linux-mm@kvack.org>,
Jason Gunthorpe <jgg@ziepe.ca>, <linux-fsdevel@vger.kernel.org>,
Usama Arif <usama.arif@bytedance.com>, <kvm@vger.kernel.org>,
Alexander Graf <graf@amazon.com>,
David Woodhouse <dwmw@amazon.co.uk>,
Paul Durrant <pdurrant@amazon.co.uk>,
Nicolas Saenz Julienne <nsaenz@amazon.es>
Subject: Re: [PATCH 05/10] guestmemfs: add file mmap callback
Date: Tue, 29 Oct 2024 16:05:54 -0700
Message-ID: <20241029120232032-0700.eberman@hu-eberman-lv.qualcomm.com>
In-Reply-To: <20240805093245.889357-6-jgowans@amazon.com>
On Mon, Aug 05, 2024 at 11:32:40AM +0200, James Gowans wrote:
> Make the file data usable to userspace by adding mmap. That's all that
> QEMU needs for guest RAM, so that's all we bother implementing for now.
>
> When mmapping the file the VMA is marked as PFNMAP to indicate that there
> are no struct pages for the memory in this VMA. remap_pfn_range() is
> used to actually populate the page tables. All PTEs are pre-faulted into
> the pgtables at mmap time so that the pgtables are usable when this
> virtual address range is given to VFIO's MAP_DMA.

Thanks for sending this out! I'm going through the series to see how it
might fit within the existing guest_memfd work for pKVM/CoCo/Gunyah.

It might've been mentioned in the MM alignment session -- you might be
interested in joining the guest_memfd bi-weekly call to see where we
overlap [1].

[1]: https://lore.kernel.org/kvm/ae794891-fe69-411a-b82e-6963b594a62a@redhat.com/T/
---
Was the decision to pre-fault everything made because it was convenient,
or is it intentionally different from hugetlb?
>
> Signed-off-by: James Gowans <jgowans@amazon.com>
> ---
> fs/guestmemfs/file.c | 43 +++++++++++++++++++++++++++++++++++++-
> fs/guestmemfs/guestmemfs.c | 2 +-
> fs/guestmemfs/guestmemfs.h | 3 +++
> 3 files changed, 46 insertions(+), 2 deletions(-)
>
> diff --git a/fs/guestmemfs/file.c b/fs/guestmemfs/file.c
> index 618c93b12196..b1a52abcde65 100644
> --- a/fs/guestmemfs/file.c
> +++ b/fs/guestmemfs/file.c
> @@ -1,6 +1,7 @@
> // SPDX-License-Identifier: GPL-2.0-only
>
> #include "guestmemfs.h"
> +#include <linux/mm.h>
>
> static int truncate(struct inode *inode, loff_t newsize)
> {
> @@ -41,6 +42,46 @@ static int inode_setattr(struct mnt_idmap *idmap, struct dentry *dentry, struct
> 	return 0;
> }
>
> +/*
> + * To be able to use PFNMAP VMAs for VFIO DMA mapping we need the page tables
> + * populated with mappings. Pre-fault everything.
> + */
> +static int mmap(struct file *filp, struct vm_area_struct *vma)
> +{
> +	int rc;
> +	unsigned long *mappings_block;
> +	struct guestmemfs_inode *guestmemfs_inode;
> +
> +	guestmemfs_inode = guestmemfs_get_persisted_inode(filp->f_inode->i_sb,
> +			filp->f_inode->i_ino);
> +
> +	mappings_block = guestmemfs_inode->mappings;
> +
> +	/* Remap-pfn-range will mark the range VM_IO */
> +	for (unsigned long vma_addr_offset = vma->vm_start;
> +			vma_addr_offset < vma->vm_end;
> +			vma_addr_offset += PMD_SIZE) {
> +		int block, mapped_block;
> +		unsigned long map_size = min(PMD_SIZE, vma->vm_end - vma_addr_offset);
> +
> +		block = (vma_addr_offset - vma->vm_start) / PMD_SIZE;
> +		mapped_block = *(mappings_block + block);
> +		/*
> +		 * It's wrong to use remap_pfn_range; this will install PTE-level entries.
> +		 * The whole point of 2 MiB allocs is to improve TLB perf!
> +		 * We should use something like mm/huge_memory.c#insert_pfn_pmd
> +		 * but that is currently static.
> +		 * TODO: figure out the best way to install PMDs.
> +		 */
> +		rc = remap_pfn_range(vma,
> +				vma_addr_offset,
> +				(guestmemfs_base >> PAGE_SHIFT) + (mapped_block * 512),
> +				map_size,
> +				vma->vm_page_prot);
> +	}
> +	return 0;
> +}
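
A side note on the address arithmetic in the loop above, written as a
minimal userspace sketch rather than kernel code. It models a single
iteration: the block index comes from the offset into the VMA, the block
table gives the backing block, and the final chunk is clamped to the end
of the VMA. All `sketch_` names are made up for illustration, and the
4 KiB page / 2 MiB PMD sizes are assumptions; the literal 512 in the
patch only equals PMD_SIZE / PAGE_SIZE under that assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: 4 KiB base pages and 2 MiB PMD-sized blocks. */
#define SKETCH_PAGE_SHIFT	12
#define SKETCH_PMD_SHIFT	21
#define SKETCH_PMD_SIZE		(UINT64_C(1) << SKETCH_PMD_SHIFT)
/* Pages per block: 512 with the sizes above, derived rather than hard-coded. */
#define SKETCH_PAGES_PER_BLOCK	(SKETCH_PMD_SIZE >> SKETCH_PAGE_SHIFT)

/* What one loop iteration would hand to remap_pfn_range(). */
struct sketch_chunk {
	uint64_t pfn;	/* first page frame number of the chunk */
	uint64_t size;	/* number of bytes to map */
};

/*
 * Model one iteration of the quoted loop. `addr` is the current virtual
 * address inside [vm_start, vm_end), `base_phys` is the physical base of
 * the reserved region in bytes, and `mappings` is the per-file table that
 * translates a file block index into a backing block number.
 */
static struct sketch_chunk sketch_chunk_for(uint64_t vm_start, uint64_t vm_end,
					    uint64_t addr, uint64_t base_phys,
					    const unsigned long *mappings)
{
	struct sketch_chunk c;
	uint64_t block, remaining;

	assert(addr >= vm_start && addr < vm_end);

	/* Which PMD-sized block of the file this address falls into. */
	block = (addr - vm_start) / SKETCH_PMD_SIZE;
	remaining = vm_end - addr;

	/* Base PFN of the region plus the page offset of the backing block. */
	c.pfn = (base_phys >> SKETCH_PAGE_SHIFT) +
		(uint64_t)mappings[block] * SKETCH_PAGES_PER_BLOCK;
	/* Clamp the last chunk so it does not run past the end of the VMA. */
	c.size = remaining < SKETCH_PMD_SIZE ? remaining : SKETCH_PMD_SIZE;
	return c;
}
```

With these assumptions, a 3 MiB VMA produces one full 2 MiB chunk
followed by a 1 MiB tail, each starting at the PFN of its backing block.
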
> +
> const struct inode_operations guestmemfs_file_inode_operations = {
> 	.setattr = inode_setattr,
> 	.getattr = simple_getattr,
> @@ -48,5 +89,5 @@ const struct inode_operations guestmemfs_file_inode_operations = {
>
> const struct file_operations guestmemfs_file_fops = {
> 	.owner = THIS_MODULE,
> -	.iterate_shared = NULL,
> +	.mmap = mmap,
> };
> diff --git a/fs/guestmemfs/guestmemfs.c b/fs/guestmemfs/guestmemfs.c
> index c45c796c497a..38f20ad25286 100644
> --- a/fs/guestmemfs/guestmemfs.c
> +++ b/fs/guestmemfs/guestmemfs.c
> @@ -9,7 +9,7 @@
> #include <linux/memblock.h>
> #include <linux/statfs.h>
>
> -static phys_addr_t guestmemfs_base, guestmemfs_size;
> +phys_addr_t guestmemfs_base, guestmemfs_size;
> struct guestmemfs_sb *psb;
>
> static int statfs(struct dentry *root, struct kstatfs *buf)
> diff --git a/fs/guestmemfs/guestmemfs.h b/fs/guestmemfs/guestmemfs.h
> index 7ea03ac8ecca..0f2788ce740e 100644
> --- a/fs/guestmemfs/guestmemfs.h
> +++ b/fs/guestmemfs/guestmemfs.h
> @@ -8,6 +8,9 @@
> #define GUESTMEMFS_FILENAME_LEN 255
> #define GUESTMEMFS_PSB(sb) ((struct guestmemfs_sb *)sb->s_fs_info)
>
> +/* Units of bytes */
> +extern phys_addr_t guestmemfs_base, guestmemfs_size;
> +
> struct guestmemfs_sb {
> 	/* Inode number */
> 	unsigned long next_free_ino;
> --
> 2.34.1
>
>
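
A note on the mmap loop in file.c: rc is assigned from
remap_pfn_range() on every iteration but never read, and the function
returns 0 unconditionally, so a failed mapping would go unreported. A
minimal userspace sketch of stopping at the first failure follows. The
`sketch_` names and the callback type are hypothetical stand-ins for
illustration, not kernel APIs, and this is not meant as the definitive
fix for the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for one remap_pfn_range() call: returns 0 on
 * success or a negative errno-style value on failure.
 */
typedef int (*sketch_map_fn)(size_t chunk_index, void *ctx);

/*
 * Map chunks in order and stop at the first failure, handing its error
 * code back to the caller instead of discarding it.
 */
static int sketch_map_all(size_t nr_chunks, sketch_map_fn map_one, void *ctx)
{
	size_t i;
	int rc;

	assert(map_one != NULL);

	for (i = 0; i < nr_chunks; i++) {
		rc = map_one(i, ctx);
		if (rc)
			return rc;
	}
	return 0;
}

/* Example callback: fails with -ENOMEM at the chunk index stored in *ctx. */
static int sketch_fail_at(size_t chunk_index, void *ctx)
{
	const size_t *bad = ctx;

	return chunk_index == *bad ? -ENOMEM : 0;
}
```

Passing `sketch_fail_at` with a failing index inside the range makes
`sketch_map_all()` return -ENOMEM; with an index outside the range it
returns 0.
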