From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: "David Hildenbrand (Red Hat)" <david@kernel.org>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>, Jann Horn <jannh@google.com>,
Pedro Falcato <pfalcato@suse.de>,
Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Subject: Re: [PATCH] mm/mremap: allow VMAs with VM_DONTEXPAND|VM_PFNMAP when creating new mapping
Date: Thu, 20 Nov 2025 09:35:08 +0000
Message-ID: <4fdd31d7-2814-43ed-9674-d4b15b0ed780@lucifer.local>
In-Reply-To: <6e415c85-9ccd-4029-91fe-557d3946ef51@kernel.org>
On Thu, Nov 20, 2025 at 10:16:26AM +0100, David Hildenbrand (Red Hat) wrote:
> On 11/20/25 10:04, Lorenzo Stoakes wrote:
> > Hi Vivek, thanks for the patch.
> >
> > In general though, let's please not make a fundamental change to mremap()
> > behaviour in late -rc6. Late in cycle/during merge window we're really only
> > interested in existing series, series that are less involved than this.
> >
> > On Wed, Nov 19, 2025 at 09:35:46PM -0800, Vivek Kasireddy wrote:
> > > When mremap is used to create a new mapping, we should not return
> > > -EFAULT for VMAs with VM_DONTEXPAND or VM_PFNMAP flags set because
> > > the old VMA would neither be expanded nor shrunk in this case. This
> >
> > I guess you're trying to be succinct here and 'clone' each input VMA using
> > the 0 source size input.
> >
> > However this can't work.
> >
> > This operation is not equivalent to an mmap(). It may seem to be for
> > ordinary mappings but in practice it isn't:
> >
> > (syscall)
> > -> do_mremap()
> > -> mremap_at()
> > -> expand_vma()
> > -> move_vma()
> > -> copy_vma_and_data()
> > -> copy_vma()
> >
> > Essentially copying the properties of the VMA to the new region.
> >
> > But this doesn't work for PFN map.
> >
> > At _no point_ are you invoking the original f_op->mmap or
> > f_op->mmap_prepare handler.
> >
> > And these handlers for PFN maps set up page tables, because PFN maps
> > literally do not exist as VMAs which have properties independent of their
> > page tables like this.
>
> vfio-pci is a bit different, though, as it uses
> vmf_insert_pfn()/vmf_insert_pfn_pmd()/vmf_insert_pfn_pud() at fault time to
> insert PFNs, not at mmap time using remap_pfn_range() and friends.
>
> (see vfio_pci_mmap_page_fault() )

It sets VM_DONTEXPAND but is fine with being expanded? :) That sounds like a
bug there:

	vm_flags_set(vma, VM_ALLOW_ANY_UNCACHED | VM_IO | VM_PFNMAP |
			VM_DONTEXPAND | VM_DONTDUMP);

Drop the VM_DONTEXPAND then?

But on the other hand... I see a _whole bunch_ of logic in vfio_pci_core_mmap()
that just wouldn't be executed on expansion (and the requested case here is 100%
an expand due to not invoking mmap callbacks...)

So we'd still be in a broken state here surely if we allowed it?

>
> Now, I have no idea if that is the main use case we want to target here and
> how we could handle it, just wanted to point it out :)

Right, the objections all remain, we absolutely cannot take this patch or
anything like it.

I guess VM_DONTEXPAND is really 'we do stuff in the (soon to be) mmap_prepare
callback that is dependent on region size'.

Most notably in this case:

	phys_len = PAGE_ALIGN(pci_resource_len(pdev, index));
	...
	if (req_start + req_len > phys_len)
		return -EINVAL;

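To make the size dependence concrete, here is a minimal user-space model of that check. It is a sketch only, not the vfio-pci code: `model_mmap_check()`, `model_expand_without_recheck()` and the 0x4000/0x8000 lengths are hypothetical names and values, and only the comparison itself is taken from the snippet above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of the quoted check; not the vfio-pci implementation. */
int model_mmap_check(uint64_t req_start, uint64_t req_len, uint64_t phys_len)
{
	if (req_start + req_len > phys_len)
		return -EINVAL;
	return 0;
}

/* Walk through: validated at mmap time, then grown without re-validation. */
void model_expand_without_recheck(void)
{
	const uint64_t phys_len = 0x4000;	/* assumed page-aligned BAR length */
	uint64_t mapped_len = 0x4000;

	/* mmap time: the handler runs, and the request fits. */
	assert(model_mmap_check(0, mapped_len, phys_len) == 0);

	/* Expansion that skips the handler: nothing runs the check. */
	mapped_len = 0x8000;

	/* Had the handler run again, it would have refused this size. */
	assert(model_mmap_check(0, mapped_len, phys_len) == -EINVAL);
}
```

The point of the model is only that the check is a function of the requested length, so any path that changes the length without calling the handler leaves the check stale.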
>
> --
> Cheers
>
> David
>

I wouldn't be entirely opposed to a _new system call_ explicitly for cloning an
existing memory region, one that invokes mmap with equivalent parameters in the
VM_DONTEXPAND/VM_PFNMAP cases and does a copy in ordinary cases.

But I'm not interested in doing that for mremap(ptr, 0, new_size, ...).
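
For reference, the existing mremap(ptr, 0, new_size, ...) behaviour on an ordinary shareable mapping, as described in mremap(2), can be sketched from user space as below. This is an illustration under assumptions: `alias_shared_anon()` is a hypothetical helper, it uses a MAP_SHARED anonymous mapping rather than a device mapping, and it says nothing about VM_PFNMAP or VM_DONTEXPAND VMAs, which are rejected today.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: returns 0 if mremap(old, 0, len, MREMAP_MAYMOVE)
 * yields a second mapping of the same pages, -1 on any failure.
 */
int alias_shared_anon(size_t len)
{
	char *orig, *alias;
	int ret = -1;

	/* The demo string must fit in the mapping. */
	assert(len >= sizeof("hello"));

	orig = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (orig == MAP_FAILED)
		return -1;

	strcpy(orig, "hello");

	/* old_size == 0: 'orig' stays mapped and the same pages are mapped again. */
	alias = mremap(orig, 0, len, MREMAP_MAYMOVE);
	if (alias == MAP_FAILED)
		goto out;

	/* The alias is a distinct address and sees data written through 'orig'. */
	if (alias != orig && strcmp(alias, "hello") == 0) {
		/* A write through the alias must be visible through 'orig'. */
		alias[0] = 'J';
		ret = (orig[0] == 'J') ? 0 : -1;
	}
	munmap(alias, len);
out:
	munmap(orig, len);
	return ret;
}
```

Calling `alias_shared_anon(4096)` exercises the zero-old-size path; no driver mmap handler is involved at any point, which is exactly why the same trick is unsafe for mappings whose setup lives in that handler.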

An alternative would be to have a new VMA callback for expansion where the
driver could explicitly do these checks :) but I'm not sure it's worth it.

Cheers, Lorenzo