linux-mm.kvack.org archive mirror
* [PATCH v2 0/4] mm/vfio: huge pfnmaps with !MAP_FIXED mappings
@ 2025-12-04 15:09 Peter Xu
  2025-12-04 15:10 ` [PATCH v2 1/4] mm/thp: Allow thp_get_unmapped_area_vmflags() to take alignment Peter Xu
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Peter Xu @ 2025-12-04 15:09 UTC (permalink / raw)
  To: kvm, linux-mm, linux-kernel
  Cc: Jason Gunthorpe, Nico Pache, Zi Yan, Alex Mastro,
	David Hildenbrand, Alex Williamson, Zhi Wang, David Laight,
	Yi Liu, Ankit Agrawal, peterx, Kevin Tian, Andrew Morton

This series is based on v6.18.  It allows mmap() without MAP_FIXED to use
huge pfnmaps on a best-effort basis, and enables this for vfio-pci as the
first user.

v1: https://lore.kernel.org/r/20250613134111.469884-1-peterx@redhat.com

A changelog would not really apply, because all the patches were rewritten
on top of the new interface that this v2 introduces.  Hence omitted.

In this version, a new file operation, get_mapping_order(), is introduced
(based on discussion with Jason on v1) to minimize the code needed for
drivers to implement this.  It also helps avoid exporting any mm functions.
One can refer to the discussion in v1 for more information.

Currently, the get_mapping_order() API is defined as:

  int (*get_mapping_order)(struct file *file, unsigned long pgoff, size_t len);

The first argument is the file pointer; the 2nd and 3rd are the pgoff and
len specified by an mmap() request.  A driver can use this interface to opt
in to providing mapping order hints to core mm on VA allocations for the
specified range of the file.  I kept the interface simple for now: core mm
always aligns the VA against pgoff, assuming that always works.  The driver
can only report the order for the pgoff+len range, which is then used to do
the alignment.

Before this series, a userspace application in most cases needs to be
modified to benefit from huge mappings, by providing a huge-size-aligned VA
with MAP_FIXED.  After this series, the application benefits from huge
pfnmaps automatically once the kernel is upgraded, with no userspace
modifications.
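
For reference, the manual approach that userspace needs today can be
sketched roughly as below.  This is only an illustration: map_aligned() is a
hypothetical helper, an anonymous mapping stands in for the device fd, and
both len and align are assumed to be multiples of the page size (align a
power of two).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes at a VA aligned to align by
 * reserving an oversized PROT_NONE window, mapping over the aligned part
 * with MAP_FIXED, and trimming the unused head and tail.
 * Returns MAP_FAILED on error.
 */
static void *map_aligned(size_t len, size_t align)
{
	size_t span = len + align;
	size_t head, tail;
	uint8_t *window, *aligned;
	void *ret;

	assert(align && !(align & (align - 1)));

	/* Reserve address space only; nothing is accessible yet. */
	window = mmap(NULL, span, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (window == MAP_FAILED)
		return MAP_FAILED;

	/* Distance from the window start to the next aligned address. */
	head = (size_t)(-(uintptr_t)window & (align - 1));
	aligned = window + head;
	tail = span - head - len;

	/*
	 * With VFIO this call would instead pass MAP_SHARED | MAP_FIXED,
	 * the device fd and the region offset.
	 */
	ret = mmap(aligned, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	if (ret == MAP_FAILED) {
		munmap(window, span);
		return MAP_FAILED;
	}

	/* Give back the parts of the reservation that are not used. */
	if (head)
		munmap(window, head);
	if (tail)
		munmap(aligned + len, tail);
	return ret;
}
```

A caller would use something like map_aligned(bar_len, 2UL << 20) to get a
2MB-aligned mapping.  With this series the plain mmap(NULL, ...) call is
expected to return a suitably aligned address on its own.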

It's still best-effort, because the auto-alignment requires a larger VA
range to be allocated via the per-arch allocator; if a huge-mapping-aligned
VA cannot be allocated, it still falls back to small mappings as before.
That is the theory, though: in practice I don't yet know when it would
fail, especially on a 64-bit system.
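
To make the "larger VA range" part concrete, here is a small userspace
sketch of the padding arithmetic.  It is a model only, not the kernel code:
padded_len() and place_in_window() are hypothetical names, and align is
assumed to be a power of two.  The idea is to ask the allocator for
len + align bytes, then slide the start forward so that the VA is congruent
to the file offset modulo align.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Size of the window to request from the VA allocator.  Returns -1 when
 * the padded length would overflow, in which case the caller falls back
 * to an unaligned allocation.
 */
static int padded_len(uint64_t len, uint64_t align, uint64_t *padded)
{
	if (len + align < len)
		return -1;
	*padded = len + align;
	return 0;
}

/*
 * Given the start of a free window of at least len + align bytes, return
 * the address inside it whose low bits match those of the file offset.
 */
static uint64_t place_in_window(uint64_t window, uint64_t off, uint64_t align)
{
	assert(align && !(align & (align - 1)));
	return window + ((off - window) & (align - 1));
}
```

For example, a window starting at 0x7f0000123000 with off = 0 and a 2MB
alignment is placed at 0x7f0000200000.  The extra align bytes are what makes
the request larger than a plain mmap(), and why the alignment may fail when
the address space is tight.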

So far, only vfio-pci is supported.  But the logic should be applicable to
all the drivers that support or will support huge pfnmaps.  I've also
copied a few more people on this version for the hardware perspective.

Testing:

- checkpatch.pl
- cross build harness
- a unit test that I got from Alex [1], checking mmap() alignments on a
  QEMU instance with a 128MB BAR.

With mmap(!MAP_FIXED), the alignments all look sane and huge mappings are
properly installed.  I didn't observe anything wrong.

I currently lack larger BARs to test PUD sizes.  Please kindly report if
you can run this with 1G+ BARs and hit issues.

Alex Mastro: thanks for the testing offered in v1, but since this series
was rewritten, a re-test will be needed.  I hence didn't collect the T-b.

Comments welcome, thanks.

[1] https://github.com/awilliam/tests/blob/vfio-pci-device-map-alignment/vfio-pci-device-map-alignment.c

Peter Xu (4):
  mm/thp: Allow thp_get_unmapped_area_vmflags() to take alignment
  mm: Add file_operations.get_mapping_order()
  vfio: Introduce vfio_device_ops.get_mapping_order hook
  vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings

 Documentation/filesystems/vfs.rst |  4 +++
 drivers/vfio/pci/vfio_pci.c       |  1 +
 drivers/vfio/pci/vfio_pci_core.c  | 49 ++++++++++++++++++++++++++
 drivers/vfio/vfio_main.c          | 14 ++++++++
 include/linux/fs.h                |  1 +
 include/linux/huge_mm.h           |  5 +--
 include/linux/vfio.h              |  5 +++
 include/linux/vfio_pci_core.h     |  2 ++
 mm/huge_memory.c                  |  7 ++--
 mm/mmap.c                         | 58 +++++++++++++++++++++++++++----
 10 files changed, 135 insertions(+), 11 deletions(-)

-- 
2.50.1




* [PATCH v2 1/4] mm/thp: Allow thp_get_unmapped_area_vmflags() to take alignment
  2025-12-04 15:09 [PATCH v2 0/4] mm/vfio: huge pfnmaps with !MAP_FIXED mappings Peter Xu
@ 2025-12-04 15:10 ` Peter Xu
  2025-12-04 15:10 ` [PATCH v2 2/4] mm: Add file_operations.get_mapping_order() Peter Xu
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Peter Xu @ 2025-12-04 15:10 UTC (permalink / raw)
  To: kvm, linux-mm, linux-kernel
  Cc: Jason Gunthorpe, Nico Pache, Zi Yan, Alex Mastro,
	David Hildenbrand, Alex Williamson, Zhi Wang, David Laight,
	Yi Liu, Ankit Agrawal, peterx, Kevin Tian, Andrew Morton

Add an "align" parameter to thp_get_unmapped_area_vmflags() so that it can
get an unmapped area with any alignment.

There are two existing callers; pass PMD_SIZE explicitly for them.

No functional change intended.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/linux/huge_mm.h | 5 +++--
 mm/huge_memory.c        | 7 ++++---
 mm/mmap.c               | 3 ++-
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 71ac78b9f834f..1c221550362d7 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -362,7 +362,7 @@ unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
 		unsigned long len, unsigned long pgoff, unsigned long flags);
 unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long addr,
 		unsigned long len, unsigned long pgoff, unsigned long flags,
-		vm_flags_t vm_flags);
+		unsigned long align, vm_flags_t vm_flags);
 
 bool can_split_folio(struct folio *folio, int caller_pins, int *pextra_pins);
 int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
@@ -559,7 +559,8 @@ static inline unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
 static inline unsigned long
 thp_get_unmapped_area_vmflags(struct file *filp, unsigned long addr,
 			      unsigned long len, unsigned long pgoff,
-			      unsigned long flags, vm_flags_t vm_flags)
+			      unsigned long flags, unsigned long align,
+			      vm_flags_t vm_flags)
 {
 	return 0;
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 6cba1cb14b23a..ab2450b985171 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1155,12 +1155,12 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
 
 unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long addr,
 		unsigned long len, unsigned long pgoff, unsigned long flags,
-		vm_flags_t vm_flags)
+		unsigned long align, vm_flags_t vm_flags)
 {
 	unsigned long ret;
 	loff_t off = (loff_t)pgoff << PAGE_SHIFT;
 
-	ret = __thp_get_unmapped_area(filp, addr, len, off, flags, PMD_SIZE, vm_flags);
+	ret = __thp_get_unmapped_area(filp, addr, len, off, flags, align, vm_flags);
 	if (ret)
 		return ret;
 
@@ -1171,7 +1171,8 @@ unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long add
 unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
 		unsigned long len, unsigned long pgoff, unsigned long flags)
 {
-	return thp_get_unmapped_area_vmflags(filp, addr, len, pgoff, flags, 0);
+	return thp_get_unmapped_area_vmflags(filp, addr, len, pgoff, flags,
+					     PMD_SIZE, 0);
 }
 EXPORT_SYMBOL_GPL(thp_get_unmapped_area);
 
diff --git a/mm/mmap.c b/mm/mmap.c
index 5fd3b80fda1d5..8fa397a18252e 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -846,7 +846,8 @@ __get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 		   && IS_ALIGNED(len, PMD_SIZE)) {
 		/* Ensures that larger anonymous mappings are THP aligned. */
 		addr = thp_get_unmapped_area_vmflags(file, addr, len,
-						     pgoff, flags, vm_flags);
+						     pgoff, flags, PMD_SIZE,
+						     vm_flags);
 	} else {
 		addr = mm_get_unmapped_area_vmflags(current->mm, file, addr, len,
 						    pgoff, flags, vm_flags);
-- 
2.50.1




* [PATCH v2 2/4] mm: Add file_operations.get_mapping_order()
  2025-12-04 15:09 [PATCH v2 0/4] mm/vfio: huge pfnmaps with !MAP_FIXED mappings Peter Xu
  2025-12-04 15:10 ` [PATCH v2 1/4] mm/thp: Allow thp_get_unmapped_area_vmflags() to take alignment Peter Xu
@ 2025-12-04 15:10 ` Peter Xu
  2025-12-04 15:19   ` Peter Xu
  2025-12-04 15:10 ` [PATCH v2 3/4] vfio: Introduce vfio_device_ops.get_mapping_order hook Peter Xu
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Peter Xu @ 2025-12-04 15:10 UTC (permalink / raw)
  To: kvm, linux-mm, linux-kernel
  Cc: Jason Gunthorpe, Nico Pache, Zi Yan, Alex Mastro,
	David Hildenbrand, Alex Williamson, Zhi Wang, David Laight,
	Yi Liu, Ankit Agrawal, peterx, Kevin Tian, Andrew Morton

Add one new file operation, get_mapping_order().  It can be used by file
backends to report mapping order hints.

By default, Linux assumes mappings are made in PAGE_SIZE chunks.  With this
hint, the driver can report the possibility of mapping chunks that are
larger than PAGE_SIZE.  The VA allocator will then try to use that as the
alignment when allocating the VA range.

This is useful because when the chunks to be mapped are larger than
PAGE_SIZE, VA alignment matters: the VA needs to be aligned to the size of
the chunk to be mapped.

That said, whatever alignment is used for the VA allocation, the driver can
still decide the size at which to map the chunks.  It is also not an issue
if it keeps mapping in PAGE_SIZE.

get_mapping_order() takes three parameters.  The 1st is the file object
pointer; the 2nd and 3rd are the pgoff and size of the mmap() request.  Its
return value is the order, which must be non-negative to enable the
alignment.  When zero is returned, it behaves as if no hint was provided,
IOW, the alignment is still PAGE_SIZE.

When the order is too big, the hint is ignored.  Drivers are normally
trusted, so this is more of an extra layer of safety.
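
The return value handling described above can be modelled in userspace
roughly as follows.  This is a sketch only: order_to_align() and the
SKETCH_* constants are made up for illustration, 4K base pages are assumed,
and a plain range check stands in for the overflow check done in the patch.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12UL			/* assumption: 4K pages */
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/*
 * Turn a driver-provided order hint into an alignment in bytes.
 * Negative orders are treated as zero; orders whose shift would not fit
 * in an unsigned long are ignored, leaving the PAGE_SIZE default.
 */
static unsigned long order_to_align(int order)
{
	unsigned long bits = sizeof(unsigned long) * CHAR_BIT;

	if (order < 0)
		order = 0;
	if ((unsigned long)order >= bits - SKETCH_PAGE_SHIFT)
		return SKETCH_PAGE_SIZE;	/* too big: hint ignored */
	return SKETCH_PAGE_SIZE << order;
}
```

Under these assumptions an order of 9 gives a 2MB alignment and an order of
18 gives 1GB, while 0, -1 or an absurd value such as 1000 all end up at
PAGE_SIZE.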

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 Documentation/filesystems/vfs.rst |  4 +++
 include/linux/fs.h                |  1 +
 mm/mmap.c                         | 59 +++++++++++++++++++++++++++----
 3 files changed, 57 insertions(+), 7 deletions(-)

diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 4f13b01e42eb5..b707ddbebbf52 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -1069,6 +1069,7 @@ This describes how the VFS can manipulate an open file.  As of kernel
 		int (*fasync) (int, struct file *, int);
 		int (*lock) (struct file *, int, struct file_lock *);
 		unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
+		int (*get_mapping_order)(struct file *, unsigned long, size_t);
 		int (*check_flags)(int);
 		int (*flock) (struct file *, int, struct file_lock *);
 		ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, loff_t *, size_t, unsigned int);
@@ -1165,6 +1166,9 @@ otherwise noted.
 ``get_unmapped_area``
 	called by the mmap(2) system call
 
+``get_mapping_order``
+	called by the mmap(2) system call to get mapping order hint
+
 ``check_flags``
 	called by the fcntl(2) system call for F_SETFL command
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index dd3b57cfadeeb..5ba373576bfe5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2287,6 +2287,7 @@ struct file_operations {
 	int (*fasync) (int, struct file *, int);
 	int (*lock) (struct file *, int, struct file_lock *);
 	unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
+	int (*get_mapping_order)(struct file *file, unsigned long pgoff, size_t len);
 	int (*check_flags)(int);
 	int (*flock) (struct file *, int, struct file_lock *);
 	ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, loff_t *, size_t, unsigned int);
diff --git a/mm/mmap.c b/mm/mmap.c
index 8fa397a18252e..be3dd0623f00c 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -808,6 +808,33 @@ unsigned long mm_get_unmapped_area_vmflags(struct mm_struct *mm, struct file *fi
 	return arch_get_unmapped_area(filp, addr, len, pgoff, flags, vm_flags);
 }
 
+static inline bool file_has_mmap_order_hint(struct file *file)
+{
+	return file && file->f_op && file->f_op->get_mapping_order;
+}
+
+static inline bool
+mmap_should_align(struct file *file, unsigned long addr, unsigned long len)
+{
+	/* When THP not enabled at all, skip */
+	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
+		return false;
+
+	/* Never try any alignment if the mmap() address hint is provided */
+	if (addr)
+		return false;
+
+	/* Anonymous THP could use some better alignment when len aligned */
+	if (!file)
+		return IS_ALIGNED(len, PMD_SIZE);
+
+	/*
+	 * It's a file mapping, no address hint provided by caller, try any
+	 * alignment if the file backend would provide a hint
+	 */
+	return file_has_mmap_order_hint(file);
+}
+
 unsigned long
 __get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 		unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags)
@@ -815,8 +842,9 @@ __get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 	unsigned long (*get_area)(struct file *, unsigned long,
 				  unsigned long, unsigned long, unsigned long)
 				  = NULL;
-
 	unsigned long error = arch_mmap_check(addr, len, flags);
+	unsigned long align;
+
 	if (error)
 		return error;
 
@@ -841,13 +869,30 @@ __get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 
 	if (get_area) {
 		addr = get_area(file, addr, len, pgoff, flags);
-	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) && !file
-		   && !addr /* no hint */
-		   && IS_ALIGNED(len, PMD_SIZE)) {
-		/* Ensures that larger anonymous mappings are THP aligned. */
+	} else if (mmap_should_align(file, addr, len)) {
+		if (file_has_mmap_order_hint(file)) {
+			int order;
+			/*
+			 * Allow driver to opt-in on the order hint.
+			 *
+			 * Sanity check on the order returned. Treating
+			 * either negative or too big order to be invalid,
+			 * where alignment will be skipped.
+			 */
+			order = file->f_op->get_mapping_order(file, pgoff, len);
+			if (order < 0)
+				order = 0;
+			if (check_shl_overflow(PAGE_SIZE, order, &align))
+				/* No alignment applied */
+				align = PAGE_SIZE;
+		} else {
+			/* Default alignment for anonymous THPs */
+			align = PMD_SIZE;
+		}
+
 		addr = thp_get_unmapped_area_vmflags(file, addr, len,
-						     pgoff, flags, PMD_SIZE,
-						     vm_flags);
+						     pgoff, flags,
+						     align, vm_flags);
 	} else {
 		addr = mm_get_unmapped_area_vmflags(current->mm, file, addr, len,
 						    pgoff, flags, vm_flags);
-- 
2.50.1




* [PATCH v2 3/4] vfio: Introduce vfio_device_ops.get_mapping_order hook
  2025-12-04 15:09 [PATCH v2 0/4] mm/vfio: huge pfnmaps with !MAP_FIXED mappings Peter Xu
  2025-12-04 15:10 ` [PATCH v2 1/4] mm/thp: Allow thp_get_unmapped_area_vmflags() to take alignment Peter Xu
  2025-12-04 15:10 ` [PATCH v2 2/4] mm: Add file_operations.get_mapping_order() Peter Xu
@ 2025-12-04 15:10 ` Peter Xu
  2025-12-04 15:10 ` [PATCH v2 4/4] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings Peter Xu
  2025-12-04 18:16 ` [PATCH v2 0/4] mm/vfio: " Cédric Le Goater
  4 siblings, 0 replies; 9+ messages in thread
From: Peter Xu @ 2025-12-04 15:10 UTC (permalink / raw)
  To: kvm, linux-mm, linux-kernel
  Cc: Jason Gunthorpe, Nico Pache, Zi Yan, Alex Mastro,
	David Hildenbrand, Alex Williamson, Zhi Wang, David Laight,
	Yi Liu, Ankit Agrawal, peterx, Kevin Tian, Andrew Morton

Add a hook to vfio_device_ops to allow sub-modules to provide a mapping
order hint for an mmap() request.  When not available, use the default
value (0).

Note that this patch changes the code path for vfio on mmap() when
allocating the virtual address range to be mapped.  However, it should not
change the resulting VA, because the default value (0) preserves the old
behavior.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 drivers/vfio/vfio_main.c | 14 ++++++++++++++
 include/linux/vfio.h     |  5 +++++
 2 files changed, 19 insertions(+)

diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 38c8e9350a60e..3f2107ff93e5d 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -1372,6 +1372,19 @@ static void vfio_device_show_fdinfo(struct seq_file *m, struct file *filep)
 }
 #endif
 
+static int vfio_device_get_mapping_order(struct file *file,
+					 unsigned long pgoff,
+					 size_t len)
+{
+	struct vfio_device_file *df = file->private_data;
+	struct vfio_device *device = df->device;
+
+	if (device->ops->get_mapping_order)
+		return device->ops->get_mapping_order(device, pgoff, len);
+
+	return 0;
+}
+
 const struct file_operations vfio_device_fops = {
 	.owner		= THIS_MODULE,
 	.open		= vfio_device_fops_cdev_open,
@@ -1384,6 +1397,7 @@ const struct file_operations vfio_device_fops = {
 #ifdef CONFIG_PROC_FS
 	.show_fdinfo	= vfio_device_show_fdinfo,
 #endif
+	.get_mapping_order	= vfio_device_get_mapping_order,
 };
 
 static struct vfio_device *vfio_device_from_file(struct file *file)
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index eb563f538dee5..46a4d85fc4953 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -111,6 +111,8 @@ struct vfio_device {
  * @dma_unmap: Called when userspace unmaps IOVA from the container
  *             this device is attached to.
  * @device_feature: Optional, fill in the VFIO_DEVICE_FEATURE ioctl
+ * @get_mapping_order: Optional, provide mapping order hints for mmap().
+ *                     When unavailable, use the default order (zero).
  */
 struct vfio_device_ops {
 	char	*name;
@@ -139,6 +141,9 @@ struct vfio_device_ops {
 	void	(*dma_unmap)(struct vfio_device *vdev, u64 iova, u64 length);
 	int	(*device_feature)(struct vfio_device *device, u32 flags,
 				  void __user *arg, size_t argsz);
+	int	(*get_mapping_order)(struct vfio_device *device,
+				     unsigned long pgoff,
+				     size_t len);
 };
 
 #if IS_ENABLED(CONFIG_IOMMUFD)
-- 
2.50.1




* [PATCH v2 4/4] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings
  2025-12-04 15:09 [PATCH v2 0/4] mm/vfio: huge pfnmaps with !MAP_FIXED mappings Peter Xu
                   ` (2 preceding siblings ...)
  2025-12-04 15:10 ` [PATCH v2 3/4] vfio: Introduce vfio_device_ops.get_mapping_order hook Peter Xu
@ 2025-12-04 15:10 ` Peter Xu
  2025-12-05  4:33   ` kernel test robot
  2025-12-05  7:45   ` kernel test robot
  2025-12-04 18:16 ` [PATCH v2 0/4] mm/vfio: " Cédric Le Goater
  4 siblings, 2 replies; 9+ messages in thread
From: Peter Xu @ 2025-12-04 15:10 UTC (permalink / raw)
  To: kvm, linux-mm, linux-kernel
  Cc: Jason Gunthorpe, Nico Pache, Zi Yan, Alex Mastro,
	David Hildenbrand, Alex Williamson, Zhi Wang, David Laight,
	Yi Liu, Ankit Agrawal, peterx, Kevin Tian, Andrew Morton

This patch enables best-effort mmap() for vfio-pci BARs even without
MAP_FIXED, so as to utilize huge pfnmaps as much as possible.  It should
also avoid the need for userspace changes (switching to MAP_FIXED with
pre-aligned VA addresses) to start enabling huge pfnmaps on VFIO BARs.

The trick here is to make sure the MMIO PFNs are aligned with the VAs
allocated by mmap() when !MAP_FIXED, so that whatever is returned from
mmap(!MAP_FIXED) of vfio-pci MMIO regions is automatically suitable for
huge pfnmaps as much as possible.

To achieve that, a custom vfio_device get_mapping_order() hook for vfio-pci
devices is needed.

Note that a BAR's MMIO physical address should normally be guaranteed to be
BAR-size aligned.  It means the MMIO address is also always aligned with
vfio-pci's file offset address space, per VFIO_PCI_OFFSET_SHIFT.

With that guaranteed, the VA allocator can calculate the alignment from
pgoff, and the result will also be aligned with the MMIO physical addresses
to be mapped in the VMA later.

So far, stick with the simple plan of relying on this hardware assumption,
which should always hold.  Adjusting pgoff when calculating the alignment
is left for later, if there is real demand for it.
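
As a worked illustration of the order selection, the following userspace
model decodes the region index and the offset within the region from pgoff
and picks an order from the usable length.  It is only a sketch under stated
assumptions: x86-64-like sizes (4K pages, 2MB PMD, 1GB PUD), a 40-bit region
offset mirroring VFIO_PCI_OFFSET_SHIFT, both huge pfnmap levels available,
and page-aligned BAR lengths passed in an array.  All sk_/SK_ names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_PAGE_SHIFT	12	/* assumption: 4K pages */
#define SK_PMD_ORDER	9	/* assumption: 2MB PMD mappings */
#define SK_PUD_ORDER	18	/* assumption: 1GB PUD mappings */
#define SK_REGION_SHIFT	40	/* mirrors VFIO_PCI_OFFSET_SHIFT */
#define SK_NUM_BARS	6

/* Model of the hint: which order could a mapping of this range use? */
static int sk_mapping_order(uint64_t pgoff, uint64_t len,
			    const uint64_t bar_len[SK_NUM_BARS])
{
	uint64_t index = pgoff >> (SK_REGION_SHIFT - SK_PAGE_SHIFT);
	uint64_t start = (pgoff << SK_PAGE_SHIFT) &
			 ((1ULL << SK_REGION_SHIFT) - 1);
	uint64_t avail;

	/* Only BARs 0-5 are considered; anything else gets no hint. */
	if (index >= SK_NUM_BARS)
		return 0;

	/* A request starting beyond the BAR will fail mmap() later anyway. */
	if (start >= bar_len[index])
		return 0;

	/* The usable length is capped by both the BAR and the request. */
	avail = bar_len[index] - start;
	if (avail > len)
		avail = len;

	if (avail >= (1ULL << (SK_PUD_ORDER + SK_PAGE_SHIFT)))
		return SK_PUD_ORDER;
	if (avail >= (1ULL << (SK_PMD_ORDER + SK_PAGE_SHIFT)))
		return SK_PMD_ORDER;
	return 0;
}
```

With a 128MB BAR 2, a full-BAR request (pgoff = 2 << 28, len = 128MB) yields
the PMD order, whereas a 1MB request into the same BAR yields 0.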

For discussion on the requirement of this feature, see:

https://lore.kernel.org/linux-pci/20250529214414.1508155-1-amastro@fb.com/

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 drivers/vfio/pci/vfio_pci.c      |  1 +
 drivers/vfio/pci/vfio_pci_core.c | 49 ++++++++++++++++++++++++++++++++
 include/linux/vfio_pci_core.h    |  2 ++
 3 files changed, 52 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index ac10f14417f2f..8f29037cee6eb 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -145,6 +145,7 @@ static const struct vfio_device_ops vfio_pci_ops = {
 	.detach_ioas	= vfio_iommufd_physical_detach_ioas,
 	.pasid_attach_ioas	= vfio_iommufd_physical_pasid_attach_ioas,
 	.pasid_detach_ioas	= vfio_iommufd_physical_pasid_detach_ioas,
+	.get_mapping_order	= vfio_pci_core_get_mapping_order,
 };
 
 static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 7dcf5439dedc9..28ab37715acc0 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1640,6 +1640,55 @@ static unsigned long vma_to_pfn(struct vm_area_struct *vma)
 	return (pci_resource_start(vdev->pdev, index) >> PAGE_SHIFT) + pgoff;
 }
 
+/*
+ * Hint function for mmap() about the size of mapping to be carried out.
+ * This helps to enable huge pfnmaps as much as possible on BAR mappings.
+ *
+ * This function does the minimum check on mmap() parameters to make the
+ * hint valid only. The majority of mmap() sanity check will be done later
+ * in mmap().
+ */
+int vfio_pci_core_get_mapping_order(struct vfio_device *device,
+				    unsigned long pgoff, size_t len)
+{
+	struct vfio_pci_core_device *vdev =
+	    container_of(device, struct vfio_pci_core_device, vdev);
+	struct pci_dev *pdev = vdev->pdev;
+	unsigned int index = pgoff >> (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT);
+	unsigned long req_start;
+	size_t phys_len;
+
+	/* Currently, only bars 0-5 supports huge pfnmap */
+	if (index >= VFIO_PCI_ROM_REGION_INDEX)
+		return 0;
+
+	/*
+	 * NOTE: we're keeping things simple as of now, assuming the
+	 * physical address of BARs (aka, pci_resource_start(pdev, index))
+	 * should always be aligned with pgoff in vfio-pci's address space.
+	 */
+	req_start = (pgoff << PAGE_SHIFT) & ((1UL << VFIO_PCI_OFFSET_SHIFT) - 1);
+	phys_len = PAGE_ALIGN(pci_resource_len(pdev, index));
+
+	/*
+	 * If this happens, it will probably fail mmap() later.. mapping
+	 * hint isn't important anymore.
+	 */
+	if (req_start >= phys_len)
+		return 0;
+
+	phys_len = MIN(phys_len - req_start, len);
+
+	if (IS_ENABLED(CONFIG_ARCH_SUPPORTS_PUD_PFNMAP) && phys_len >= PUD_SIZE)
+		return PUD_ORDER;
+
+	if (IS_ENABLED(CONFIG_ARCH_SUPPORTS_PMD_PFNMAP) && phys_len >= PMD_SIZE)
+		return PMD_ORDER;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vfio_pci_core_get_mapping_order);
+
 static vm_fault_t vfio_pci_mmap_huge_fault(struct vm_fault *vmf,
 					   unsigned int order)
 {
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index f541044e42a2a..d320dfacc5681 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -119,6 +119,8 @@ ssize_t vfio_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
 		size_t count, loff_t *ppos);
 ssize_t vfio_pci_core_write(struct vfio_device *core_vdev, const char __user *buf,
 		size_t count, loff_t *ppos);
+int vfio_pci_core_get_mapping_order(struct vfio_device *device,
+		unsigned long pgoff, size_t len);
 int vfio_pci_core_mmap(struct vfio_device *core_vdev, struct vm_area_struct *vma);
 void vfio_pci_core_request(struct vfio_device *core_vdev, unsigned int count);
 int vfio_pci_core_match(struct vfio_device *core_vdev, char *buf);
-- 
2.50.1




* Re: [PATCH v2 2/4] mm: Add file_operations.get_mapping_order()
  2025-12-04 15:10 ` [PATCH v2 2/4] mm: Add file_operations.get_mapping_order() Peter Xu
@ 2025-12-04 15:19   ` Peter Xu
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Xu @ 2025-12-04 15:19 UTC (permalink / raw)
  To: kvm, linux-mm, linux-kernel
  Cc: Jason Gunthorpe, Nico Pache, Zi Yan, Alex Mastro,
	David Hildenbrand, Alex Williamson, Zhi Wang, David Laight,
	Yi Liu, Ankit Agrawal, Kevin Tian, Andrew Morton,
	David Hildenbrand, Matthew Wilcox, Jonathan Corbet,
	linux-fsdevel, linux-doc, Liam R. Howlett, Lorenzo Stoakes,
	Vlastimil Babka, Jann Horn, Pedro Falcato, Alexander Viro,
	Christian Brauner, Jan Kara, Mike Rapoport, Suren Baghdasaryan,
	Michal Hocko

I forgot to copy the mm/fs maintainers on the 1st/2nd patches in this
series, my apologies.  The whole series can be found here:

https://lore.kernel.org/r/20251204151003.171039-1-peterx@redhat.com

I'll fix the cc list when I repost.

Thanks,


-- 
Peter Xu




* Re: [PATCH v2 0/4] mm/vfio: huge pfnmaps with !MAP_FIXED mappings
  2025-12-04 15:09 [PATCH v2 0/4] mm/vfio: huge pfnmaps with !MAP_FIXED mappings Peter Xu
                   ` (3 preceding siblings ...)
  2025-12-04 15:10 ` [PATCH v2 4/4] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings Peter Xu
@ 2025-12-04 18:16 ` Cédric Le Goater
  4 siblings, 0 replies; 9+ messages in thread
From: Cédric Le Goater @ 2025-12-04 18:16 UTC (permalink / raw)
  To: Peter Xu, kvm, linux-mm, linux-kernel
  Cc: Jason Gunthorpe, Nico Pache, Zi Yan, Alex Mastro,
	David Hildenbrand, Alex Williamson, Zhi Wang, David Laight,
	Yi Liu, Ankit Agrawal, Kevin Tian, Andrew Morton

On 12/4/25 16:09, Peter Xu wrote:
> This series is based on v6.18.  It allows mmap(!MAP_FIXED) to work with
> huge pfnmaps with best effort.  Meanwhile, it enables it for vfio-pci as
> the first user.
> 
> v1: https://lore.kernel.org/r/20250613134111.469884-1-peterx@redhat.com
> 
> A changelog would not be meaningful because all the patches were rewritten
> on top of a new interface introduced in this v2, so it is omitted.
> 
> In this version, a new file operation, get_mapping_order(), is introduced
> (based on discussion with Jason on v1) to minimize the code needed for
> drivers to implement this.  It also helps avoid exporting any mm functions.
> One can refer to the discussion in v1 for more information.
> 
> Currently, the get_mapping_order() API is defined as:
> 
>    int (*get_mapping_order)(struct file *file, unsigned long pgoff, size_t len);
> 
> The first argument is the file pointer; the 2nd and 3rd are the pgoff and
> len specified in a mmap() request.  The driver can use this interface to
> opt in to providing mapping order hints to core mm on VA allocations for
> the specified range of the file.  I kept the interface simple for now, so
> that core mm always aligns against pgoff, assuming that will always work.
> The driver can only report the order for pgoff+len, which will be used to
> do the alignment.
> 
> Before this series, a userapp in most cases needs to be modified to benefit
> from huge mappings, by providing a huge-size-aligned VA with MAP_FIXED.
> After this series, the userapp can benefit from huge pfnmaps automatically
> after a kernel upgrade, with no userspace modifications.
> 
> It's still best-effort, because the auto-alignment requires a larger VA
> range to be allocated via the per-arch allocator; if a huge-mapping-aligned
> VA cannot be allocated, it will still fall back to small mappings like
> before.  However, that is only in theory: in practice I don't yet know
> when it would fail, especially on a 64-bit system.
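
To illustrate the "larger VA range" point in the paragraph above: one way
such alignment can work is to search for len + align bytes and then pick the
address inside that range whose low bits match those of the file offset.  The
helper below is a hypothetical userspace sketch of that arithmetic only; the
name is made up, the real allocator lives in core mm, and align is assumed to
be a power of two.

```c
#include <assert.h>

/*
 * Given the start of a free range that was found by searching for
 * len + align bytes, return the address inside it whose offset within an
 * align-sized block equals that of file_off (pgoff << PAGE_SHIFT).
 *
 * The result is at most align - 1 bytes past range_start, so the
 * requested len bytes still fit in the padded range.
 */
static unsigned long sketch_align_in_padded_range(unsigned long range_start,
						  unsigned long file_off,
						  unsigned long align)
{
	/* align must be a non-zero power of two */
	assert(align && (align & (align - 1)) == 0);

	return range_start + ((file_off - range_start) & (align - 1));
}
```

If the padded search fails, the caller would simply retry without padding,
which is the fallback to small mappings described above.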
> 
> So far, only vfio-pci is supported.  But the logic should be applicable to
> all the drivers that support or will support huge pfnmaps.  I've also
> copied some more people on this version for the hardware perspective.
> 
> Testing done:
> 
> - checkpatch.pl
> - cross build harness
> - unit test that I got from Alex [1], checking mmap() alignments on a QEMU
>    instance with a 128MB BAR.
> 
> The alignments all look sane with mmap(!MAP_FIXED), and huge mappings are
> properly installed.  I didn't observe anything wrong.
> 
> I currently lack larger BARs to test PUD sizes.  Please kindly report if
> you can run this with 1G+ BARs and hit issues.

LGTM, with a 32G BAR:

Using device 0000:02:00.0 in IOMMU group 27
Device 0000:02:00.0 supports 9 regions, 5 irqs
[BAR0]: size 0x1000000, order 24, offset 0x0, flags 0xf
Testing BAR0, require at least 21 bit alignment
[PASS] Minimum alignment 21
Testing random offset
[PASS] Random offset
Testing random size
[PASS] Random size
[BAR1]: size 0x800000000, order 35, offset 0x10000000000, flags 0x7
Testing BAR1, require at least 30 bit alignment
[PASS] Minimum alignment 31
Testing random offset
[PASS] Random offset
Testing random size
[PASS] Random size
[BAR3]: size 0x2000000, order 25, offset 0x30000000000, flags 0x7
Testing BAR3, require at least 21 bit alignment
[PASS] Minimum alignment 21
Testing random offset
[PASS] Random offset
Testing random size
[PASS] Random size


C.

> 
> Alex Mastro: thanks for the testing offered in v1, but since this series
> was rewritten, a re-test will be needed.  I hence didn't collect the T-b.
> 
> Comments welcome, thanks.
> 
> [1] https://github.com/awilliam/tests/blob/vfio-pci-device-map-alignment/vfio-pci-device-map-alignment.c
> 
> Peter Xu (4):
>    mm/thp: Allow thp_get_unmapped_area_vmflags() to take alignment
>    mm: Add file_operations.get_mapping_order()
>    vfio: Introduce vfio_device_ops.get_mapping_order hook
>    vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings
> 
>   Documentation/filesystems/vfs.rst |  4 +++
>   drivers/vfio/pci/vfio_pci.c       |  1 +
>   drivers/vfio/pci/vfio_pci_core.c  | 49 ++++++++++++++++++++++++++
>   drivers/vfio/vfio_main.c          | 14 ++++++++
>   include/linux/fs.h                |  1 +
>   include/linux/huge_mm.h           |  5 +--
>   include/linux/vfio.h              |  5 +++
>   include/linux/vfio_pci_core.h     |  2 ++
>   mm/huge_memory.c                  |  7 ++--
>   mm/mmap.c                         | 58 +++++++++++++++++++++++++++----
>   10 files changed, 135 insertions(+), 11 deletions(-)
> 




* Re: [PATCH v2 4/4] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings
  2025-12-04 15:10 ` [PATCH v2 4/4] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings Peter Xu
@ 2025-12-05  4:33   ` kernel test robot
  2025-12-05  7:45   ` kernel test robot
  1 sibling, 0 replies; 9+ messages in thread
From: kernel test robot @ 2025-12-05  4:33 UTC (permalink / raw)
  To: Peter Xu, kvm, linux-mm, linux-kernel
  Cc: oe-kbuild-all, Jason Gunthorpe, Nico Pache, Zi Yan, Alex Mastro,
	David Hildenbrand, Alex Williamson, Zhi Wang, David Laight,
	Yi Liu, Ankit Agrawal, peterx, Kevin Tian, Andrew Morton,
	Linux Memory Management List

Hi Peter,

kernel test robot noticed the following build warnings:

[auto build test WARNING on awilliam-vfio/for-linus]
[also build test WARNING on linus/master v6.18]
[cannot apply to akpm-mm/mm-everything awilliam-vfio/next brauner-vfs/vfs.all next-20251204]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Peter-Xu/mm-thp-Allow-thp_get_unmapped_area_vmflags-to-take-alignment/20251204-231258
base:   https://github.com/awilliam/linux-vfio.git for-linus
patch link:    https://lore.kernel.org/r/20251204151003.171039-5-peterx%40redhat.com
patch subject: [PATCH v2 4/4] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings
config: i386-buildonly-randconfig-004-20251205 (https://download.01.org/0day-ci/archive/20251205/202512051241.QtfYgqkx-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251205/202512051241.QtfYgqkx-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512051241.QtfYgqkx-lkp@intel.com/

All warnings (new ones prefixed by >>):

   drivers/vfio/pci/vfio_pci_core.c: In function 'vfio_pci_core_get_mapping_order':
>> drivers/vfio/pci/vfio_pci_core.c:1670:51: warning: left shift count >= width of type [-Wshift-count-overflow]
    1670 |         req_start = (pgoff << PAGE_SHIFT) & ((1UL << VFIO_PCI_OFFSET_SHIFT) - 1);
         |                                                   ^~


vim +1670 drivers/vfio/pci/vfio_pci_core.c

  1642	
  1643	/*
  1644	 * Hint function for mmap() about the size of mapping to be carried out.
  1645	 * This helps to enable huge pfnmaps as much as possible on BAR mappings.
  1646	 *
  1647	 * This function does the minimum check on mmap() parameters to make the
  1648	 * hint valid only. The majority of mmap() sanity check will be done later
  1649	 * in mmap().
  1650	 */
  1651	int vfio_pci_core_get_mapping_order(struct vfio_device *device,
  1652					    unsigned long pgoff, size_t len)
  1653	{
  1654		struct vfio_pci_core_device *vdev =
  1655		    container_of(device, struct vfio_pci_core_device, vdev);
  1656		struct pci_dev *pdev = vdev->pdev;
  1657		unsigned int index = pgoff >> (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT);
  1658		unsigned long req_start;
  1659		size_t phys_len;
  1660	
  1661		/* Currently, only bars 0-5 supports huge pfnmap */
  1662		if (index >= VFIO_PCI_ROM_REGION_INDEX)
  1663			return 0;
  1664	
  1665		/*
  1666		 * NOTE: we're keeping things simple as of now, assuming the
  1667		 * physical address of BARs (aka, pci_resource_start(pdev, index))
  1668		 * should always be aligned with pgoff in vfio-pci's address space.
  1669		 */
> 1670		req_start = (pgoff << PAGE_SHIFT) & ((1UL << VFIO_PCI_OFFSET_SHIFT) - 1);
  1671		phys_len = PAGE_ALIGN(pci_resource_len(pdev, index));
  1672	
  1673		/*
  1674		 * If this happens, it will probably fail mmap() later.. mapping
  1675		 * hint isn't important anymore.
  1676		 */
  1677		if (req_start >= phys_len)
  1678			return 0;
  1679	
  1680		phys_len = MIN(phys_len - req_start, len);
  1681	
  1682		if (IS_ENABLED(CONFIG_ARCH_SUPPORTS_PUD_PFNMAP) && phys_len >= PUD_SIZE)
  1683			return PUD_ORDER;
  1684	
  1685		if (IS_ENABLED(CONFIG_ARCH_SUPPORTS_PMD_PFNMAP) && phys_len >= PMD_SIZE)
  1686			return PMD_ORDER;
  1687	
  1688		return 0;
  1689	}
  1690	EXPORT_SYMBOL_GPL(vfio_pci_core_get_mapping_order);
  1691	
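
For context on the warning: on i386 an unsigned long is 32 bits wide, so
"1UL << VFIO_PCI_OFFSET_SHIFT" shifts by at least the width of the type.  A
possible way around it is to build the mask in a 64-bit type.  The sketch
below is plain userspace C with hypothetical SKETCH_* names; the shift value
of 40 is an assumption (it matches the 0x10000000000 offset step between BARs
in the test log earlier in this thread), as is the 4 KiB page size.  It is
not the fix chosen by the author.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT   12	/* assumption: 4 KiB pages */
#define SKETCH_OFFSET_SHIFT 40	/* assumption: stands in for VFIO_PCI_OFFSET_SHIFT */

/* Build the mask in 64 bits so the shift is valid on 32-bit builds too. */
#define SKETCH_OFFSET_MASK  ((UINT64_C(1) << SKETCH_OFFSET_SHIFT) - 1)

/* Byte offset of the request within its region. */
static uint64_t sketch_req_start(uint64_t pgoff)
{
	return (pgoff << SKETCH_PAGE_SHIFT) & SKETCH_OFFSET_MASK;
}

/* Region index encoded in the upper bits of the page offset. */
static unsigned int sketch_region_index(uint64_t pgoff)
{
	return (unsigned int)(pgoff >> (SKETCH_OFFSET_SHIFT - SKETCH_PAGE_SHIFT));
}
```

Note that pgoff is widened to 64 bits before shifting as well; with a 32-bit
unsigned long, "pgoff << PAGE_SHIFT" would otherwise lose its upper bits.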

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [PATCH v2 4/4] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings
  2025-12-04 15:10 ` [PATCH v2 4/4] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings Peter Xu
  2025-12-05  4:33   ` kernel test robot
@ 2025-12-05  7:45   ` kernel test robot
  1 sibling, 0 replies; 9+ messages in thread
From: kernel test robot @ 2025-12-05  7:45 UTC (permalink / raw)
  To: Peter Xu, kvm, linux-mm, linux-kernel
  Cc: llvm, oe-kbuild-all, Jason Gunthorpe, Nico Pache, Zi Yan,
	Alex Mastro, David Hildenbrand, Alex Williamson, Zhi Wang,
	David Laight, Yi Liu, Ankit Agrawal, peterx, Kevin Tian,
	Andrew Morton, Linux Memory Management List

Hi Peter,

kernel test robot noticed the following build warnings:

[auto build test WARNING on awilliam-vfio/for-linus]
[also build test WARNING on v6.18]
[cannot apply to akpm-mm/mm-everything awilliam-vfio/next brauner-vfs/vfs.all linus/master next-20251205]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Peter-Xu/mm-thp-Allow-thp_get_unmapped_area_vmflags-to-take-alignment/20251204-231258
base:   https://github.com/awilliam/linux-vfio.git for-linus
patch link:    https://lore.kernel.org/r/20251204151003.171039-5-peterx%40redhat.com
patch subject: [PATCH v2 4/4] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings
config: i386-randconfig-006-20251205 (https://download.01.org/0day-ci/archive/20251205/202512051509.bh8Oncoq-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251205/202512051509.bh8Oncoq-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512051509.bh8Oncoq-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/vfio/pci/vfio_pci_core.c:1670:44: warning: shift count >= width of type [-Wshift-count-overflow]
    1670 |         req_start = (pgoff << PAGE_SHIFT) & ((1UL << VFIO_PCI_OFFSET_SHIFT) - 1);
         |                                                   ^  ~~~~~~~~~~~~~~~~~~~~~
   1 warning generated.


vim +1670 drivers/vfio/pci/vfio_pci_core.c

  1642	
  1643	/*
  1644	 * Hint function for mmap() about the size of mapping to be carried out.
  1645	 * This helps to enable huge pfnmaps as much as possible on BAR mappings.
  1646	 *
  1647	 * This function does the minimum check on mmap() parameters to make the
  1648	 * hint valid only. The majority of mmap() sanity check will be done later
  1649	 * in mmap().
  1650	 */
  1651	int vfio_pci_core_get_mapping_order(struct vfio_device *device,
  1652					    unsigned long pgoff, size_t len)
  1653	{
  1654		struct vfio_pci_core_device *vdev =
  1655		    container_of(device, struct vfio_pci_core_device, vdev);
  1656		struct pci_dev *pdev = vdev->pdev;
  1657		unsigned int index = pgoff >> (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT);
  1658		unsigned long req_start;
  1659		size_t phys_len;
  1660	
  1661		/* Currently, only bars 0-5 supports huge pfnmap */
  1662		if (index >= VFIO_PCI_ROM_REGION_INDEX)
  1663			return 0;
  1664	
  1665		/*
  1666		 * NOTE: we're keeping things simple as of now, assuming the
  1667		 * physical address of BARs (aka, pci_resource_start(pdev, index))
  1668		 * should always be aligned with pgoff in vfio-pci's address space.
  1669		 */
> 1670		req_start = (pgoff << PAGE_SHIFT) & ((1UL << VFIO_PCI_OFFSET_SHIFT) - 1);
  1671		phys_len = PAGE_ALIGN(pci_resource_len(pdev, index));
  1672	
  1673		/*
  1674		 * If this happens, it will probably fail mmap() later.. mapping
  1675		 * hint isn't important anymore.
  1676		 */
  1677		if (req_start >= phys_len)
  1678			return 0;
  1679	
  1680		phys_len = MIN(phys_len - req_start, len);
  1681	
  1682		if (IS_ENABLED(CONFIG_ARCH_SUPPORTS_PUD_PFNMAP) && phys_len >= PUD_SIZE)
  1683			return PUD_ORDER;
  1684	
  1685		if (IS_ENABLED(CONFIG_ARCH_SUPPORTS_PMD_PFNMAP) && phys_len >= PMD_SIZE)
  1686			return PMD_ORDER;
  1687	
  1688		return 0;
  1689	}
  1690	EXPORT_SYMBOL_GPL(vfio_pci_core_get_mapping_order);
  1691	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



Thread overview: 9+ messages
2025-12-04 15:09 [PATCH v2 0/4] mm/vfio: huge pfnmaps with !MAP_FIXED mappings Peter Xu
2025-12-04 15:10 ` [PATCH v2 1/4] mm/thp: Allow thp_get_unmapped_area_vmflags() to take alignment Peter Xu
2025-12-04 15:10 ` [PATCH v2 2/4] mm: Add file_operations.get_mapping_order() Peter Xu
2025-12-04 15:19   ` Peter Xu
2025-12-04 15:10 ` [PATCH v2 3/4] vfio: Introduce vfio_device_ops.get_mapping_order hook Peter Xu
2025-12-04 15:10 ` [PATCH v2 4/4] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings Peter Xu
2025-12-05  4:33   ` kernel test robot
2025-12-05  7:45   ` kernel test robot
2025-12-04 18:16 ` [PATCH v2 0/4] mm/vfio: " Cédric Le Goater
