From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>,
Matthew Wilcox <willy@infradead.org>, Guo Ren <guoren@kernel.org>,
Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
Heiko Carstens <hca@linux.ibm.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Alexander Gordeev <agordeev@linux.ibm.com>,
Christian Borntraeger <borntraeger@linux.ibm.com>,
Sven Schnelle <svens@linux.ibm.com>,
"David S . Miller" <davem@davemloft.net>,
Andreas Larsson <andreas@gaisler.com>,
Arnd Bergmann <arnd@arndb.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Dan Williams <dan.j.williams@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
Dave Jiang <dave.jiang@intel.com>,
Nicolas Pitre <nico@fluxnic.net>,
Muchun Song <muchun.song@linux.dev>,
Oscar Salvador <osalvador@suse.de>,
David Hildenbrand <david@redhat.com>,
Konstantin Komarov <almaz.alexandrovich@paragon-software.com>,
Baoquan He <bhe@redhat.com>, Vivek Goyal <vgoyal@redhat.com>,
Dave Young <dyoung@redhat.com>, Tony Luck <tony.luck@intel.com>,
Reinette Chatre <reinette.chatre@intel.com>,
Dave Martin <Dave.Martin@arm.com>,
James Morse <james.morse@arm.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>, Hugh Dickins <hughd@google.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
Uladzislau Rezki <urezki@gmail.com>,
Dmitry Vyukov <dvyukov@google.com>,
Andrey Konovalov <andreyknvl@gmail.com>,
Jann Horn <jannh@google.com>, Pedro Falcato <pfalcato@suse.de>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-csky@vger.kernel.org,
linux-mips@vger.kernel.org, linux-s390@vger.kernel.org,
sparclinux@vger.kernel.org, nvdimm@lists.linux.dev,
linux-cxl@vger.kernel.org, linux-mm@kvack.org,
ntfs3@lists.linux.dev, kexec@lists.infradead.org,
kasan-dev@googlegroups.com, Jason Gunthorpe <jgg@nvidia.com>,
iommu@lists.linux.dev, Kevin Tian <kevin.tian@intel.com>,
Will Deacon <will@kernel.org>,
Robin Murphy <robin.murphy@arm.com>,
Sumanth Korikkar <sumanthk@linux.ibm.com>
Subject: [PATCH v5 14/15] mm: update mem char driver to use mmap_prepare
Date: Mon, 20 Oct 2025 13:11:31 +0100 [thread overview]
Message-ID: <48f60764d7a6901819d1af778fa33b775d2e8c77.1760959442.git.lorenzo.stoakes@oracle.com> (raw)
In-Reply-To: <cover.1760959441.git.lorenzo.stoakes@oracle.com>
Update the mem char driver (backing /dev/mem and /dev/zero) to use
f_op->mmap_prepare hook rather than the deprecated f_op->mmap.
The /dev/zero implementation has a very unique and rather concerning
characteristic in that it converts MAP_PRIVATE mmap() mappings anonymous
when they are, in fact, not.
The new f_op->mmap_prepare() can support this, but rather than introducing
a helper function to perform this hack (and risk introducing other users),
utilise the success hook to do so.
We utilise the newly introduced shmem_zero_setup_desc() to allow for the
shared mapping case via an f_op->mmap_prepare() hook.
We also use the desc->action_error_hook to filter the remap error to
-EAGAIN to keep behaviour consistent.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/char/mem.c | 84 +++++++++++++++++++++++++++-------------------
1 file changed, 50 insertions(+), 34 deletions(-)
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index db1ca53a6d01..52039fae1594 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -304,13 +304,13 @@ static unsigned zero_mmap_capabilities(struct file *file)
}
/* can't do an in-place private mapping if there's no MMU */
-static inline int private_mapping_ok(struct vm_area_struct *vma)
+static inline int private_mapping_ok(struct vm_area_desc *desc)
{
- return is_nommu_shared_mapping(vma->vm_flags);
+ return is_nommu_shared_mapping(desc->vm_flags);
}
#else
-static inline int private_mapping_ok(struct vm_area_struct *vma)
+static inline int private_mapping_ok(struct vm_area_desc *desc)
{
return 1;
}
@@ -322,46 +322,49 @@ static const struct vm_operations_struct mmap_mem_ops = {
#endif
};
-static int mmap_mem(struct file *file, struct vm_area_struct *vma)
+static int mmap_filter_error(int err)
{
- size_t size = vma->vm_end - vma->vm_start;
- phys_addr_t offset = (phys_addr_t)vma->vm_pgoff << PAGE_SHIFT;
+ return -EAGAIN;
+}
+
+static int mmap_mem_prepare(struct vm_area_desc *desc)
+{
+ struct file *file = desc->file;
+ const size_t size = vma_desc_size(desc);
+ const phys_addr_t offset = (phys_addr_t)desc->pgoff << PAGE_SHIFT;
/* Does it even fit in phys_addr_t? */
- if (offset >> PAGE_SHIFT != vma->vm_pgoff)
+ if (offset >> PAGE_SHIFT != desc->pgoff)
return -EINVAL;
/* It's illegal to wrap around the end of the physical address space. */
if (offset + (phys_addr_t)size - 1 < offset)
return -EINVAL;
- if (!valid_mmap_phys_addr_range(vma->vm_pgoff, size))
+ if (!valid_mmap_phys_addr_range(desc->pgoff, size))
return -EINVAL;
- if (!private_mapping_ok(vma))
+ if (!private_mapping_ok(desc))
return -ENOSYS;
- if (!range_is_allowed(vma->vm_pgoff, size))
+ if (!range_is_allowed(desc->pgoff, size))
return -EPERM;
- if (!phys_mem_access_prot_allowed(file, vma->vm_pgoff, size,
- &vma->vm_page_prot))
+ if (!phys_mem_access_prot_allowed(file, desc->pgoff, size,
+ &desc->page_prot))
return -EINVAL;
- vma->vm_page_prot = phys_mem_access_prot(file, vma->vm_pgoff,
- size,
- vma->vm_page_prot);
+ desc->page_prot = phys_mem_access_prot(file, desc->pgoff,
+ size,
+ desc->page_prot);
- vma->vm_ops = &mmap_mem_ops;
+ desc->vm_ops = &mmap_mem_ops;
+
+ /* Remap-pfn-range will mark the range VM_IO. */
+ mmap_action_remap_full(desc, desc->pgoff);
+ /* We filter remap errors to -EAGAIN. */
+ desc->action.error_hook = mmap_filter_error;
- /* Remap-pfn-range will mark the range VM_IO */
- if (remap_pfn_range(vma,
- vma->vm_start,
- vma->vm_pgoff,
- size,
- vma->vm_page_prot)) {
- return -EAGAIN;
- }
return 0;
}
@@ -501,14 +504,26 @@ static ssize_t read_zero(struct file *file, char __user *buf,
return cleared;
}
-static int mmap_zero(struct file *file, struct vm_area_struct *vma)
+static int mmap_zero_private_success(const struct vm_area_struct *vma)
+{
+ /*
+ * This is a highly unique situation where we mark a MAP_PRIVATE mapping
+ * of /dev/zero anonymous, despite it not being.
+ */
+ vma_set_anonymous((struct vm_area_struct *)vma);
+
+ return 0;
+}
+
+static int mmap_zero_prepare(struct vm_area_desc *desc)
{
#ifndef CONFIG_MMU
return -ENOSYS;
#endif
- if (vma->vm_flags & VM_SHARED)
- return shmem_zero_setup(vma);
- vma_set_anonymous(vma);
+ if (desc->vm_flags & VM_SHARED)
+ return shmem_zero_setup_desc(desc);
+
+ desc->action.success_hook = mmap_zero_private_success;
return 0;
}
@@ -526,10 +541,11 @@ static unsigned long get_unmapped_area_zero(struct file *file,
{
if (flags & MAP_SHARED) {
/*
- * mmap_zero() will call shmem_zero_setup() to create a file,
- * so use shmem's get_unmapped_area in case it can be huge;
- * and pass NULL for file as in mmap.c's get_unmapped_area(),
- * so as not to confuse shmem with our handle on "/dev/zero".
+ * mmap_zero_prepare() will call shmem_zero_setup() to create a
+ * file, so use shmem's get_unmapped_area in case it can be
+ * huge; and pass NULL for file as in mmap.c's
+ * get_unmapped_area(), so as not to confuse shmem with our
+ * handle on "/dev/zero".
*/
return shmem_get_unmapped_area(NULL, addr, len, pgoff, flags);
}
@@ -632,7 +648,7 @@ static const struct file_operations __maybe_unused mem_fops = {
.llseek = memory_lseek,
.read = read_mem,
.write = write_mem,
- .mmap = mmap_mem,
+ .mmap_prepare = mmap_mem_prepare,
.open = open_mem,
#ifndef CONFIG_MMU
.get_unmapped_area = get_unmapped_area_mem,
@@ -668,7 +684,7 @@ static const struct file_operations zero_fops = {
.write_iter = write_iter_zero,
.splice_read = copy_splice_read,
.splice_write = splice_write_zero,
- .mmap = mmap_zero,
+ .mmap_prepare = mmap_zero_prepare,
.get_unmapped_area = get_unmapped_area_zero,
#ifndef CONFIG_MMU
.mmap_capabilities = zero_mmap_capabilities,
--
2.51.0
next prev parent reply other threads:[~2025-10-20 12:12 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-20 12:11 [PATCH v5 00/15] expand mmap_prepare functionality, port more users Lorenzo Stoakes
2025-10-20 12:11 ` [PATCH v5 01/15] mm/shmem: update shmem to use mmap_prepare Lorenzo Stoakes
2025-10-20 12:11 ` [PATCH v5 02/15] device/dax: update devdax " Lorenzo Stoakes
2025-10-20 12:11 ` [PATCH v5 03/15] mm/vma: remove unused function, make internal functions static Lorenzo Stoakes
2025-10-20 14:07 ` Jason Gunthorpe
2025-10-20 12:11 ` [PATCH v5 04/15] mm: add vma_desc_size(), vma_desc_pages() helpers Lorenzo Stoakes
2025-10-20 12:11 ` [PATCH v5 05/15] relay: update relay to use mmap_prepare Lorenzo Stoakes
2025-10-20 12:11 ` [PATCH v5 06/15] mm/vma: rename __mmap_prepare() function to avoid confusion Lorenzo Stoakes
2025-10-20 12:11 ` [PATCH v5 07/15] mm: add remap_pfn_range_prepare(), remap_pfn_range_complete() Lorenzo Stoakes
2025-10-20 12:11 ` [PATCH v5 08/15] mm: abstract io_remap_pfn_range() based on PFN Lorenzo Stoakes
2025-10-20 12:11 ` [PATCH v5 09/15] mm: introduce io_remap_pfn_range_[prepare, complete]() Lorenzo Stoakes
2025-10-20 12:11 ` [PATCH v5 10/15] mm: add ability to take further action in vm_area_desc Lorenzo Stoakes
2025-10-20 12:11 ` [PATCH v5 11/15] doc: update porting, vfs documentation for mmap_prepare actions Lorenzo Stoakes
2025-10-20 12:11 ` [PATCH v5 12/15] mm/hugetlbfs: update hugetlbfs to use mmap_prepare Lorenzo Stoakes
2025-10-20 13:39 ` Sumanth Korikkar
2025-10-20 14:13 ` Lorenzo Stoakes
2025-10-20 12:11 ` [PATCH v5 13/15] mm: add shmem_zero_setup_desc() Lorenzo Stoakes
2025-10-20 12:11 ` Lorenzo Stoakes [this message]
2025-10-20 12:11 ` [PATCH v5 15/15] mm: update resctl to use mmap_prepare Lorenzo Stoakes
2025-10-20 18:16 ` [PATCH v5 00/15] expand mmap_prepare functionality, port more users Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=48f60764d7a6901819d1af778fa33b775d2e8c77.1760959442.git.lorenzo.stoakes@oracle.com \
--to=lorenzo.stoakes@oracle.com \
--cc=Dave.Martin@arm.com \
--cc=Liam.Howlett@oracle.com \
--cc=agordeev@linux.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=almaz.alexandrovich@paragon-software.com \
--cc=andreas@gaisler.com \
--cc=andreyknvl@gmail.com \
--cc=arnd@arndb.de \
--cc=baolin.wang@linux.alibaba.com \
--cc=bhe@redhat.com \
--cc=borntraeger@linux.ibm.com \
--cc=brauner@kernel.org \
--cc=corbet@lwn.net \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=davem@davemloft.net \
--cc=david@redhat.com \
--cc=dvyukov@google.com \
--cc=dyoung@redhat.com \
--cc=gor@linux.ibm.com \
--cc=gregkh@linuxfoundation.org \
--cc=guoren@kernel.org \
--cc=hca@linux.ibm.com \
--cc=hughd@google.com \
--cc=iommu@lists.linux.dev \
--cc=jack@suse.cz \
--cc=james.morse@arm.com \
--cc=jannh@google.com \
--cc=jgg@nvidia.com \
--cc=kasan-dev@googlegroups.com \
--cc=kevin.tian@intel.com \
--cc=kexec@lists.infradead.org \
--cc=linux-csky@vger.kernel.org \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mips@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-s390@vger.kernel.org \
--cc=mhocko@suse.com \
--cc=muchun.song@linux.dev \
--cc=nico@fluxnic.net \
--cc=ntfs3@lists.linux.dev \
--cc=nvdimm@lists.linux.dev \
--cc=osalvador@suse.de \
--cc=pfalcato@suse.de \
--cc=reinette.chatre@intel.com \
--cc=robin.murphy@arm.com \
--cc=rppt@kernel.org \
--cc=sparclinux@vger.kernel.org \
--cc=sumanthk@linux.ibm.com \
--cc=surenb@google.com \
--cc=svens@linux.ibm.com \
--cc=tony.luck@intel.com \
--cc=tsbogend@alpha.franken.de \
--cc=urezki@gmail.com \
--cc=vbabka@suse.cz \
--cc=vgoyal@redhat.com \
--cc=viro@zeniv.linux.org.uk \
--cc=vishal.l.verma@intel.com \
--cc=will@kernel.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox