* [PATCH v2 00/11] Remove device private pages from physical address space
@ 2026-01-07  9:18 Jordan Niethe
  2026-01-07  9:18 ` [PATCH v2 01/11] mm/migrate_device: Introduce migrate_pfn_from_page() helper Jordan Niethe
                   ` (12 more replies)
  0 siblings, 13 replies; 33+ messages in thread
From: Jordan Niethe @ 2026-01-07  9:18 UTC (permalink / raw)
  To: linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

Today, when creating device private struct pages, the first step is to
use request_free_mem_region() to get a range of physical address space
large enough to represent the device's memory. This allocated physical
address range is then remapped as device private memory using
memremap_pages().
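
Roughly, the existing flow looks like the following (condensed and
slightly simplified from the chunk allocation in lib/test_hmm.c; error
handling omitted):

        res = request_free_mem_region(&iomem_resource, DEVMEM_CHUNK_SIZE,
                                      "hmm_dmirror");

        devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
        devmem->pagemap.range.start = res->start;
        devmem->pagemap.range.end = res->end;
        devmem->pagemap.nr_range = 1;
        devmem->pagemap.ops = &dmirror_devmem_ops;

        ptr = memremap_pages(&devmem->pagemap, numa_node_id());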

Needing to allocate physical address space has some problems:

  1) There may be insufficient physical address space to represent the
     device memory. KASLR reducing the physical address space and VM
     configurations with limited physical address space increase the
     likelihood of hitting this, especially as device memory increases.
     This has been observed to prevent device private memory from being
     initialized.

  2) Attempting to add the device private pages to the linear map at
     addresses beyond the actual physical memory causes issues on
     architectures like aarch64 - meaning the feature does not work
     there [0].

This series changes device private memory so that it does not require
an allocation of physical address space, avoiding these problems.
Instead of using the physical address space, we introduce a "device
private address space" and allocate from there.

A consequence of placing the device private pages outside of the
physical address space is that they no longer have a PFN. However, it is
still necessary to be able to look up a corresponding device private
page from a device private PTE entry, which means that we still require
some way to index into this device private address space. Instead of a
PFN, device private pages use an offset into this device private address
space to look up device private struct pages.
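
As a rough sketch of what that lookup side could be (purely
illustrative - the real interface is introduced in the final patch, and
the xarray backing store here is only an assumption for the example;
device_private_offset_to_page() is the helper named below):

        /* Illustrative only: index device private pages by offset. */
        static DEFINE_XARRAY_ALLOC(device_private_space);

        struct page *device_private_offset_to_page(unsigned long offset)
        {
                return xa_load(&device_private_space, offset);
        }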

The problem that then needs to be addressed is how to avoid confusing
these device private offsets with PFNs. It is the inherently limited
usage of device private pages themselves which makes this possible. A
device private page is only used for userspace mappings, so we do not
need to be concerned with them being used within the mm more broadly.
This means that the only way the core kernel looks up these pages is
via the page table, where their PTE already indicates whether they
refer to a device private page via their swap type, e.g.
SWP_DEVICE_WRITE. We can use this information to determine if the PTE
contains a PFN which should be looked up in the page map, or a device
private offset which should be looked up elsewhere.
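
For illustration, a PTE-level lookup could then branch roughly like
this (a sketch only, reusing the softleaf helpers that appear in this
series and the illustrative device_private_offset_to_page() above):

        softleaf_t entry = softleaf_from_pte(pte);
        struct page *page = NULL;

        switch (softleaf_type(entry)) {
        case SOFTLEAF_DEVICE_PRIVATE_READ:
        case SOFTLEAF_DEVICE_PRIVATE_WRITE:
                /* The entry encodes a device private offset, not a PFN. */
                page = device_private_offset_to_page(softleaf_to_pfn(entry));
                break;
        default:
                if (softleaf_has_pfn(entry))
                        /* The entry encodes a real PFN. */
                        page = pfn_to_page(softleaf_to_pfn(entry));
                break;
        }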

This applies when we are creating PTE entries for device private pages -
because they have their own type they already must be handled
separately, so it is a small step to convert them to a device private
PFN now too.

The first part of the series updates callers where device private
offsets might now be encountered to track this extra state.

The last patch contains the bulk of the work where we change how we
convert between device private pages and device private offsets and
then use a new interface for allocating device private pages without
the need for reserving physical address space.

By removing the device private pages from the physical address space,
this series also opens up the possibility of moving away from tracking
device private memory using struct pages in the future. This is
desirable as on systems with large amounts of memory these device
private struct pages use a significant amount of memory and take a
significant amount of time to initialize.

*** Changes in v2 ***

The most significant change in v2 is addressing code paths that are
common between MEMORY_DEVICE_PRIVATE and MEMORY_DEVICE_COHERENT devices.

This had been overlooked in previous revisions.

To do this we introduce a migrate_pfn_from_page() helper which will call
device_private_offset_to_page() and set the MIGRATE_PFN_DEVICE_PRIVATE
flag if required.

In places where we could have a device private offset
(MEMORY_DEVICE_PRIVATE) or a pfn (MEMORY_DEVICE_COHERENT) we update to
use an mpfn to disambiguate.  This includes some users in the drivers
and migrate_device_{pfns,range}().
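
For example, after the MIGRATE_PFN_DEVICE_PRIVATE patch below, the
lib/test_hmm.c eviction path ends up tagging its starting mpfn like
this (condensed from that patch's hunk):

        unsigned long start_mpfn = migrate_pfn(start_pfn);

        if (chunk->pagemap.type == MEMORY_DEVICE_PRIVATE)
                start_mpfn |= MIGRATE_PFN_DEVICE_PRIVATE;

        migrate_device_range(src_pfns, start_mpfn, npages);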

Seeking opinions on using the mpfns like this, or whether a new type
would be preferred.

  - mm/migrate_device: Introduce migrate_pfn_from_page() helper
    - New to series

  - drm/amdkfd: Use migrate pfns internally
    - New to series

  - mm/migrate_device: Make migrate_device_{pfns,range}() take mpfns
    - New to series

  - mm/migrate_device: Add migrate PFN flag to track device private pages
    - Update for migrate_pfn_from_page()
    - Rename to MIGRATE_PFN_DEVICE_PRIVATE
    - drm/amd: Check adev->gmc.xgmi.connected_to_cpu
    - lib/test_hmm.c: Check chunk->pagemap.type == MEMORY_DEVICE_PRIVATE

  - mm: Add helpers to create migration entries from struct pages
    - Add a flags param

  - mm: Add a new swap type for migration entries of device private pages
    - Add softleaf_is_migration_device_private_read()

  - mm: Add helpers to create device private entries from struct pages
    - Add a flags param

  - mm: Remove device private pages from the physical address space
    - Make sure last member of struct dev_pagemap remains DECLARE_FLEX_ARRAY(struct range, ranges);

Testing:
- selftests/mm/hmm-tests on an amd64 VM

* NOTE: I will need help in testing the driver changes *

Revisions:
- RFC: https://lore.kernel.org/all/20251128044146.80050-1-jniethe@nvidia.com/
- v1: https://lore.kernel.org/all/20251231043154.42931-1-jniethe@nvidia.com/

[0] https://lore.kernel.org/lkml/CAMj1kXFZ=4hLL1w6iCV5O5uVoVLHAJbc0rr40j24ObenAjXe9w@mail.gmail.com/

Jordan Niethe (11):
  mm/migrate_device: Introduce migrate_pfn_from_page() helper
  drm/amdkfd: Use migrate pfns internally
  mm/migrate_device: Make migrate_device_{pfns,range}() take mpfns
  mm/migrate_device: Add migrate PFN flag to track device private pages
  mm/page_vma_mapped: Add flags to page_vma_mapped_walk::pfn to track
    device private pages
  mm: Add helpers to create migration entries from struct pages
  mm: Add a new swap type for migration entries of device private pages
  mm: Add helpers to create device private entries from struct pages
  mm/util: Add flag to track device private pages in page snapshots
  mm/hmm: Add flag to track device private pages
  mm: Remove device private pages from the physical address space

 Documentation/mm/hmm.rst                 |  11 +-
 arch/powerpc/kvm/book3s_hv_uvmem.c       |  43 ++---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  45 +++---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.h |   2 +-
 drivers/gpu/drm/drm_pagemap.c            |  11 +-
 drivers/gpu/drm/nouveau/nouveau_dmem.c   |  45 ++----
 drivers/gpu/drm/xe/xe_svm.c              |  37 ++---
 fs/proc/page.c                           |   6 +-
 include/drm/drm_pagemap.h                |   8 +-
 include/linux/hmm.h                      |   7 +-
 include/linux/leafops.h                  | 116 ++++++++++++--
 include/linux/memremap.h                 |  64 +++++++-
 include/linux/migrate.h                  |  23 ++-
 include/linux/mm.h                       |   9 +-
 include/linux/rmap.h                     |  33 +++-
 include/linux/swap.h                     |   8 +-
 include/linux/swapops.h                  | 136 ++++++++++++++++
 lib/test_hmm.c                           |  86 ++++++----
 mm/debug.c                               |   9 +-
 mm/hmm.c                                 |   5 +-
 mm/huge_memory.c                         |  43 ++---
 mm/hugetlb.c                             |  15 +-
 mm/memory.c                              |   5 +-
 mm/memremap.c                            | 193 ++++++++++++++++++-----
 mm/migrate.c                             |   6 +-
 mm/migrate_device.c                      |  76 +++++----
 mm/mm_init.c                             |   8 +-
 mm/mprotect.c                            |  10 +-
 mm/page_vma_mapped.c                     |  32 +++-
 mm/rmap.c                                |  59 ++++---
 mm/util.c                                |   8 +-
 mm/vmscan.c                              |   2 +-
 32 files changed, 822 insertions(+), 339 deletions(-)


base-commit: f8f9c1f4d0c7a64600e2ca312dec824a0bc2f1da
-- 
2.34.1




* [PATCH v2 01/11] mm/migrate_device: Introduce migrate_pfn_from_page() helper
  2026-01-07  9:18 [PATCH v2 00/11] Remove device private pages from physical address space Jordan Niethe
@ 2026-01-07  9:18 ` Jordan Niethe
  2026-01-08 20:03   ` Felix Kuehling
  2026-01-07  9:18 ` [PATCH v2 02/11] drm/amdkfd: Use migrate pfns internally Jordan Niethe
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Jordan Niethe @ 2026-01-07  9:18 UTC (permalink / raw)
  To: linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

To create a migrate pfn from a given struct page, that page is first
converted to its pfn before passing the pfn to migrate_pfn().

A future change will remove device private pages from the physical
address space. This will mean that device private pages no longer have a
pfn and must be handled separately.

Prepare for this with a new helper:

    - migrate_pfn_from_page()

This helper takes a struct page as a parameter instead of a pfn. This
will allow more flexibility for handling the mpfn differently for
device private pages.

Signed-off-by: Jordan Niethe <jniethe@nvidia.com>
---
v2: New to series
---
 arch/powerpc/kvm/book3s_hv_uvmem.c       |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  2 +-
 drivers/gpu/drm/drm_pagemap.c            |  2 +-
 drivers/gpu/drm/nouveau/nouveau_dmem.c   |  4 ++--
 include/linux/migrate.h                  |  5 +++++
 lib/test_hmm.c                           | 11 ++++++-----
 mm/migrate_device.c                      |  7 +++----
 7 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index e5000bef90f2..67910900af7b 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -784,7 +784,7 @@ static int kvmppc_svm_page_in(struct vm_area_struct *vma,
 		}
 	}
 
-	*mig.dst = migrate_pfn(page_to_pfn(dpage));
+	*mig.dst = migrate_pfn_from_page(dpage);
 	migrate_vma_pages(&mig);
 out_finalize:
 	migrate_vma_finalize(&mig);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index af53e796ea1b..ca552c34ece2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -646,7 +646,7 @@ svm_migrate_copy_to_ram(struct amdgpu_device *adev, struct svm_range *prange,
 		pr_debug_ratelimited("dma mapping dst to 0x%llx, pfn 0x%lx\n",
 				     dst[i] >> PAGE_SHIFT, page_to_pfn(dpage));
 
-		migrate->dst[i] = migrate_pfn(page_to_pfn(dpage));
+		migrate->dst[i] = migrate_pfn_from_page(dpage);
 		j++;
 	}
 
diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
index 37d7cfbbb3e8..5ddf395847ef 100644
--- a/drivers/gpu/drm/drm_pagemap.c
+++ b/drivers/gpu/drm/drm_pagemap.c
@@ -490,7 +490,7 @@ static int drm_pagemap_migrate_populate_ram_pfn(struct vm_area_struct *vas,
 			goto free_pages;
 
 		page = folio_page(folio, 0);
-		mpfn[i] = migrate_pfn(page_to_pfn(page));
+		mpfn[i] = migrate_pfn_from_page(page);
 
 next:
 		if (page)
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 58071652679d..a7edcdca9701 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -249,7 +249,7 @@ static vm_fault_t nouveau_dmem_migrate_to_ram(struct vm_fault *vmf)
 		goto done;
 	}
 
-	args.dst[0] = migrate_pfn(page_to_pfn(dpage));
+	args.dst[0] = migrate_pfn_from_page(dpage);
 	if (order)
 		args.dst[0] |= MIGRATE_PFN_COMPOUND;
 	dfolio = page_folio(dpage);
@@ -766,7 +766,7 @@ static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
 		((paddr >> PAGE_SHIFT) << NVIF_VMM_PFNMAP_V0_ADDR_SHIFT);
 	if (src & MIGRATE_PFN_WRITE)
 		*pfn |= NVIF_VMM_PFNMAP_V0_W;
-	mpfn = migrate_pfn(page_to_pfn(dpage));
+	mpfn = migrate_pfn_from_page(dpage);
 	if (folio_order(page_folio(dpage)))
 		mpfn |= MIGRATE_PFN_COMPOUND;
 	return mpfn;
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 26ca00c325d9..d269ec1400be 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -140,6 +140,11 @@ static inline unsigned long migrate_pfn(unsigned long pfn)
 	return (pfn << MIGRATE_PFN_SHIFT) | MIGRATE_PFN_VALID;
 }
 
+static inline unsigned long migrate_pfn_from_page(struct page *page)
+{
+	return migrate_pfn(page_to_pfn(page));
+}
+
 enum migrate_vma_direction {
 	MIGRATE_VMA_SELECT_SYSTEM = 1 << 0,
 	MIGRATE_VMA_SELECT_DEVICE_PRIVATE = 1 << 1,
diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 8af169d3873a..7e5248404d00 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -727,7 +727,8 @@ static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
 				rpage = BACKING_PAGE(dpage);
 				rpage->zone_device_data = dmirror;
 
-				*dst = migrate_pfn(page_to_pfn(dpage)) | write;
+				*dst = migrate_pfn_from_page(dpage) |
+				       write;
 				src_page = pfn_to_page(spfn + i);
 
 				if (spage)
@@ -754,7 +755,7 @@ static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
 		pr_debug("migrating from sys to dev pfn src: 0x%lx pfn dst: 0x%lx\n",
 			 page_to_pfn(spage), page_to_pfn(dpage));
 
-		*dst = migrate_pfn(page_to_pfn(dpage)) | write;
+		*dst = migrate_pfn_from_page(dpage) | write;
 
 		if (is_large) {
 			int i;
@@ -989,7 +990,7 @@ static vm_fault_t dmirror_devmem_fault_alloc_and_copy(struct migrate_vma *args,
 
 		if (dpage) {
 			lock_page(dpage);
-			*dst |= migrate_pfn(page_to_pfn(dpage));
+			*dst |= migrate_pfn_from_page(dpage);
 		}
 
 		for (i = 0; i < (1 << order); i++) {
@@ -1000,7 +1001,7 @@ static vm_fault_t dmirror_devmem_fault_alloc_and_copy(struct migrate_vma *args,
 			if (!dpage && order) {
 				dpage = alloc_page_vma(GFP_HIGHUSER_MOVABLE, args->vma, addr);
 				lock_page(dpage);
-				dst[i] = migrate_pfn(page_to_pfn(dpage));
+				dst[i] = migrate_pfn_from_page(dpage);
 				dst_page = pfn_to_page(page_to_pfn(dpage));
 				dpage = NULL; /* For the next iteration */
 			} else {
@@ -1412,7 +1413,7 @@ static void dmirror_device_evict_chunk(struct dmirror_chunk *chunk)
 
 		/* TODO Support splitting here */
 		lock_page(dpage);
-		dst_pfns[i] = migrate_pfn(page_to_pfn(dpage));
+		dst_pfns[i] = migrate_pfn_from_page(dpage);
 		if (src_pfns[i] & MIGRATE_PFN_WRITE)
 			dst_pfns[i] |= MIGRATE_PFN_WRITE;
 		if (order)
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 23379663b1e1..1a2067f830da 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -207,9 +207,8 @@ static int migrate_vma_collect_huge_pmd(pmd_t *pmdp, unsigned long start,
 			.vma = walk->vma,
 		};
 
-		unsigned long pfn = page_to_pfn(folio_page(folio, 0));
-
-		migrate->src[migrate->npages] = migrate_pfn(pfn) | write
+		migrate->src[migrate->npages] = migrate_pfn_from_page(folio_page(folio, 0))
+						| write
 						| MIGRATE_PFN_MIGRATE
 						| MIGRATE_PFN_COMPOUND;
 		migrate->dst[migrate->npages++] = 0;
@@ -328,7 +327,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 				goto again;
 			}
 
-			mpfn = migrate_pfn(page_to_pfn(page)) |
+			mpfn = migrate_pfn_from_page(page) |
 					MIGRATE_PFN_MIGRATE;
 			if (softleaf_is_device_private_write(entry))
 				mpfn |= MIGRATE_PFN_WRITE;
-- 
2.34.1




* [PATCH v2 02/11] drm/amdkfd: Use migrate pfns internally
  2026-01-07  9:18 [PATCH v2 00/11] Remove device private pages from physical address space Jordan Niethe
  2026-01-07  9:18 ` [PATCH v2 01/11] mm/migrate_device: Introduce migrate_pfn_from_page() helper Jordan Niethe
@ 2026-01-07  9:18 ` Jordan Niethe
  2026-01-08 22:00   ` Felix Kuehling
  2026-01-07  9:18 ` [PATCH v2 03/11] mm/migrate_device: Make migrate_device_{pfns,range}() take mpfns Jordan Niethe
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Jordan Niethe @ 2026-01-07  9:18 UTC (permalink / raw)
  To: linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

A future change will remove device private pages from the physical
address space. This will mean that device private pages no longer have a
pfn.

A MIGRATE_PFN flag will be introduced that distinguishes between mpfns
that contain a pfn vs an offset into device private memory.

Replace usages of pfns and page_to_pfn() with mpfns and
migrate_pfn_to_page() to prepare for handling this distinction. This
will assist in continuing to use the same code paths for both
MEMORY_DEVICE_PRIVATE and MEMORY_DEVICE_COHERENT devices.

Signed-off-by: Jordan Niethe <jniethe@nvidia.com>
---
v2:
  - New to series
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 15 +++++++--------
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.h |  2 +-
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index ca552c34ece2..c493b19268cc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -204,17 +204,17 @@ svm_migrate_copy_done(struct amdgpu_device *adev, struct dma_fence *mfence)
 }
 
 unsigned long
-svm_migrate_addr_to_pfn(struct amdgpu_device *adev, unsigned long addr)
+svm_migrate_addr_to_mpfn(struct amdgpu_device *adev, unsigned long addr)
 {
-	return (addr + adev->kfd.pgmap.range.start) >> PAGE_SHIFT;
+	return migrate_pfn((addr + adev->kfd.pgmap.range.start) >> PAGE_SHIFT);
 }
 
 static void
-svm_migrate_get_vram_page(struct svm_range *prange, unsigned long pfn)
+svm_migrate_get_vram_page(struct svm_range *prange, unsigned long mpfn)
 {
 	struct page *page;
 
-	page = pfn_to_page(pfn);
+	page = migrate_pfn_to_page(mpfn);
 	svm_range_bo_ref(prange->svm_bo);
 	page->zone_device_data = prange->svm_bo;
 	zone_device_page_init(page, 0);
@@ -225,7 +225,7 @@ svm_migrate_put_vram_page(struct amdgpu_device *adev, unsigned long addr)
 {
 	struct page *page;
 
-	page = pfn_to_page(svm_migrate_addr_to_pfn(adev, addr));
+	page = migrate_pfn_to_page(svm_migrate_addr_to_mpfn(adev, addr));
 	unlock_page(page);
 	put_page(page);
 }
@@ -235,7 +235,7 @@ svm_migrate_addr(struct amdgpu_device *adev, struct page *page)
 {
 	unsigned long addr;
 
-	addr = page_to_pfn(page) << PAGE_SHIFT;
+	addr = (migrate_pfn_from_page(page) >> MIGRATE_PFN_SHIFT) << PAGE_SHIFT;
 	return (addr - adev->kfd.pgmap.range.start);
 }
 
@@ -301,9 +301,8 @@ svm_migrate_copy_to_vram(struct kfd_node *node, struct svm_range *prange,
 
 		if (migrate->src[i] & MIGRATE_PFN_MIGRATE) {
 			dst[i] = cursor.start + (j << PAGE_SHIFT);
-			migrate->dst[i] = svm_migrate_addr_to_pfn(adev, dst[i]);
+			migrate->dst[i] = svm_migrate_addr_to_mpfn(adev, dst[i]);
 			svm_migrate_get_vram_page(prange, migrate->dst[i]);
-			migrate->dst[i] = migrate_pfn(migrate->dst[i]);
 			mpages++;
 		}
 		spage = migrate_pfn_to_page(migrate->src[i]);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.h b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.h
index 2b7fd442d29c..a80b72abe1e0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.h
@@ -48,7 +48,7 @@ int svm_migrate_vram_to_ram(struct svm_range *prange, struct mm_struct *mm,
 			    uint32_t trigger, struct page *fault_page);
 
 unsigned long
-svm_migrate_addr_to_pfn(struct amdgpu_device *adev, unsigned long addr);
+svm_migrate_addr_to_mpfn(struct amdgpu_device *adev, unsigned long addr);
 
 #endif /* IS_ENABLED(CONFIG_HSA_AMD_SVM) */
 
-- 
2.34.1




* [PATCH v2 03/11] mm/migrate_device: Make migrate_device_{pfns,range}() take mpfns
  2026-01-07  9:18 [PATCH v2 00/11] Remove device private pages from physical address space Jordan Niethe
  2026-01-07  9:18 ` [PATCH v2 01/11] mm/migrate_device: Introduce migrate_pfn_from_page() helper Jordan Niethe
  2026-01-07  9:18 ` [PATCH v2 02/11] drm/amdkfd: Use migrate pfns internally Jordan Niethe
@ 2026-01-07  9:18 ` Jordan Niethe
  2026-01-07  9:18 ` [PATCH v2 04/11] mm/migrate_device: Add migrate PFN flag to track device private pages Jordan Niethe
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 33+ messages in thread
From: Jordan Niethe @ 2026-01-07  9:18 UTC (permalink / raw)
  To: linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

A future change will remove device private pages from the physical
address space. This will mean that device private pages no longer have a
pfn.

This causes an issue for migrate_device_{pfns,range}(), which take pfn
parameters, because whether the device is MEMORY_DEVICE_PRIVATE or
MEMORY_DEVICE_COHERENT affects how that parameter should be
interpreted.

A MIGRATE_PFN flag will be introduced that distinguishes between mpfns
that contain a pfn vs an offset into device private memory; we will
take advantage of that here.

Update migrate_device_{pfns,range}() to take an mpfn instead of a pfn.

Update the users of migrate_device_{pfns,range}() to pass in an mpfn.

To support this change, update
dpagemap_devmem_ops::populate_devmem_pfn() to instead return mpfns and
rename it accordingly.

Signed-off-by: Jordan Niethe <jniethe@nvidia.com>
---
v2: New to series
---
 drivers/gpu/drm/drm_pagemap.c          |  9 +++---
 drivers/gpu/drm/nouveau/nouveau_dmem.c |  5 +--
 drivers/gpu/drm/xe/xe_svm.c            |  9 +++---
 include/drm/drm_pagemap.h              |  8 ++---
 lib/test_hmm.c                         |  2 +-
 mm/migrate_device.c                    | 45 ++++++++++++++------------
 6 files changed, 41 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
index 5ddf395847ef..e4c73a9ce68b 100644
--- a/drivers/gpu/drm/drm_pagemap.c
+++ b/drivers/gpu/drm/drm_pagemap.c
@@ -337,7 +337,7 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
 
 	mmap_assert_locked(mm);
 
-	if (!ops->populate_devmem_pfn || !ops->copy_to_devmem ||
+	if (!ops->populate_devmem_mpfn || !ops->copy_to_devmem ||
 	    !ops->copy_to_ram)
 		return -EOPNOTSUPP;
 
@@ -390,7 +390,7 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
 		goto err_finalize;
 	}
 
-	err = ops->populate_devmem_pfn(devmem_allocation, npages, migrate.dst);
+	err = ops->populate_devmem_mpfn(devmem_allocation, npages, migrate.dst);
 	if (err)
 		goto err_finalize;
 
@@ -401,10 +401,9 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
 		goto err_finalize;
 
 	for (i = 0; i < npages; ++i) {
-		struct page *page = pfn_to_page(migrate.dst[i]);
+		struct page *page = migrate_pfn_to_page(migrate.dst[i]);
 
 		pages[i] = page;
-		migrate.dst[i] = migrate_pfn(migrate.dst[i]);
 		drm_pagemap_get_devmem_page(page, zdd);
 	}
 
@@ -575,7 +574,7 @@ int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
 	pagemap_addr = buf + (2 * sizeof(*src) * npages);
 	pages = buf + (2 * sizeof(*src) + sizeof(*pagemap_addr)) * npages;
 
-	err = ops->populate_devmem_pfn(devmem_allocation, npages, src);
+	err = ops->populate_devmem_mpfn(devmem_allocation, npages, src);
 	if (err)
 		goto err_free;
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index a7edcdca9701..bd3f7102c3f9 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -483,8 +483,9 @@ nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk)
 	dst_pfns = kvcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL | __GFP_NOFAIL);
 	dma_info = kvcalloc(npages, sizeof(*dma_info), GFP_KERNEL | __GFP_NOFAIL);
 
-	migrate_device_range(src_pfns, chunk->pagemap.range.start >> PAGE_SHIFT,
-			npages);
+	migrate_device_range(src_pfns,
+			     migrate_pfn(chunk->pagemap.range.start >> PAGE_SHIFT),
+			     npages);
 
 	for (i = 0; i < npages; i++) {
 		if (src_pfns[i] & MIGRATE_PFN_MIGRATE) {
diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
index 55c5a0eb82e1..260676b0d246 100644
--- a/drivers/gpu/drm/xe/xe_svm.c
+++ b/drivers/gpu/drm/xe/xe_svm.c
@@ -5,6 +5,7 @@
 
 #include <drm/drm_drv.h>
 
+#include <linux/migrate.h>
 #include "xe_bo.h"
 #include "xe_exec_queue_types.h"
 #include "xe_gt_stats.h"
@@ -681,8 +682,8 @@ static struct drm_buddy *vram_to_buddy(struct xe_vram_region *vram)
 	return &vram->ttm.mm;
 }
 
-static int xe_svm_populate_devmem_pfn(struct drm_pagemap_devmem *devmem_allocation,
-				      unsigned long npages, unsigned long *pfn)
+static int xe_svm_populate_devmem_mpfn(struct drm_pagemap_devmem *devmem_allocation,
+				       unsigned long npages, unsigned long *pfn)
 {
 	struct xe_bo *bo = to_xe_bo(devmem_allocation);
 	struct ttm_resource *res = bo->ttm.resource;
@@ -697,7 +698,7 @@ static int xe_svm_populate_devmem_pfn(struct drm_pagemap_devmem *devmem_allocati
 		int i;
 
 		for (i = 0; i < drm_buddy_block_size(buddy, block) >> PAGE_SHIFT; ++i)
-			pfn[j++] = block_pfn + i;
+			pfn[j++] = migrate_pfn(block_pfn + i);
 	}
 
 	return 0;
@@ -705,7 +706,7 @@ static int xe_svm_populate_devmem_pfn(struct drm_pagemap_devmem *devmem_allocati
 
 static const struct drm_pagemap_devmem_ops dpagemap_devmem_ops = {
 	.devmem_release = xe_svm_devmem_release,
-	.populate_devmem_pfn = xe_svm_populate_devmem_pfn,
+	.populate_devmem_mpfn = xe_svm_populate_devmem_mpfn,
 	.copy_to_devmem = xe_svm_copy_to_devmem,
 	.copy_to_ram = xe_svm_copy_to_ram,
 };
diff --git a/include/drm/drm_pagemap.h b/include/drm/drm_pagemap.h
index f6e7e234c089..0d1d083b778a 100644
--- a/include/drm/drm_pagemap.h
+++ b/include/drm/drm_pagemap.h
@@ -157,17 +157,17 @@ struct drm_pagemap_devmem_ops {
 	void (*devmem_release)(struct drm_pagemap_devmem *devmem_allocation);
 
 	/**
-	 * @populate_devmem_pfn: Populate device memory PFN (required for migration)
+	 * @populate_devmem_mpfn: Populate device memory PFN (required for migration)
 	 * @devmem_allocation: device memory allocation
 	 * @npages: Number of pages to populate
-	 * @pfn: Array of page frame numbers to populate
+	 * @mpfn: Array of migrate page frame numbers to populate
 	 *
 	 * Populate device memory page frame numbers (PFN).
 	 *
 	 * Return: 0 on success, a negative error code on failure.
 	 */
-	int (*populate_devmem_pfn)(struct drm_pagemap_devmem *devmem_allocation,
-				   unsigned long npages, unsigned long *pfn);
+	int (*populate_devmem_mpfn)(struct drm_pagemap_devmem *devmem_allocation,
+				    unsigned long npages, unsigned long *pfn);
 
 	/**
 	 * @copy_to_devmem: Copy to device memory (required for migration)
diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 7e5248404d00..a6ff292596f3 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -1389,7 +1389,7 @@ static void dmirror_device_evict_chunk(struct dmirror_chunk *chunk)
 	src_pfns = kvcalloc(npages, sizeof(*src_pfns), GFP_KERNEL | __GFP_NOFAIL);
 	dst_pfns = kvcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL | __GFP_NOFAIL);
 
-	migrate_device_range(src_pfns, start_pfn, npages);
+	migrate_device_range(src_pfns, migrate_pfn(start_pfn), npages);
 	for (i = 0; i < npages; i++) {
 		struct page *dpage, *spage;
 
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 1a2067f830da..a2baaa2a81f9 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -1354,11 +1354,11 @@ void migrate_vma_finalize(struct migrate_vma *migrate)
 }
 EXPORT_SYMBOL(migrate_vma_finalize);
 
-static unsigned long migrate_device_pfn_lock(unsigned long pfn)
+static unsigned long migrate_device_pfn_lock(unsigned long mpfn)
 {
 	struct folio *folio;
 
-	folio = folio_get_nontail_page(pfn_to_page(pfn));
+	folio = folio_get_nontail_page(migrate_pfn_to_page(mpfn));
 	if (!folio)
 		return 0;
 
@@ -1367,13 +1367,14 @@ static unsigned long migrate_device_pfn_lock(unsigned long pfn)
 		return 0;
 	}
 
-	return migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE;
+	return mpfn | MIGRATE_PFN_MIGRATE;
 }
 
 /**
  * migrate_device_range() - migrate device private pfns to normal memory.
- * @src_pfns: array large enough to hold migrating source device private pfns.
- * @start: starting pfn in the range to migrate.
+ * @src_mpfns: array large enough to hold migrating source device private
+ * migrate pfns.
+ * @start: starting migrate pfn in the range to migrate.
  * @npages: number of pages to migrate.
  *
  * migrate_vma_setup() is similar in concept to migrate_vma_setup() except that
@@ -1389,28 +1390,29 @@ static unsigned long migrate_device_pfn_lock(unsigned long pfn)
  * allocate destination pages and start copying data from the device to CPU
  * memory before calling migrate_device_pages().
  */
-int migrate_device_range(unsigned long *src_pfns, unsigned long start,
+int migrate_device_range(unsigned long *src_mpfns, unsigned long start,
 			unsigned long npages)
 {
-	unsigned long i, j, pfn;
+	unsigned long i, j, mpfn;
 
-	for (pfn = start, i = 0; i < npages; pfn++, i++) {
-		struct page *page = pfn_to_page(pfn);
+	for (mpfn = start, i = 0; i < npages; i++) {
+		struct page *page = migrate_pfn_to_page(mpfn);
 		struct folio *folio = page_folio(page);
 		unsigned int nr = 1;
 
-		src_pfns[i] = migrate_device_pfn_lock(pfn);
+		src_mpfns[i] = migrate_device_pfn_lock(mpfn);
 		nr = folio_nr_pages(folio);
 		if (nr > 1) {
-			src_pfns[i] |= MIGRATE_PFN_COMPOUND;
+			src_mpfns[i] |= MIGRATE_PFN_COMPOUND;
 			for (j = 1; j < nr; j++)
-				src_pfns[i+j] = 0;
+				src_mpfns[i+j] = 0;
 			i += j - 1;
-			pfn += j - 1;
+			mpfn += (j - 1) << MIGRATE_PFN_SHIFT;
 		}
+		mpfn += 1 << MIGRATE_PFN_SHIFT;
 	}
 
-	migrate_device_unmap(src_pfns, npages, NULL);
+	migrate_device_unmap(src_mpfns, npages, NULL);
 
 	return 0;
 }
@@ -1418,32 +1420,33 @@ EXPORT_SYMBOL(migrate_device_range);
 
 /**
  * migrate_device_pfns() - migrate device private pfns to normal memory.
- * @src_pfns: pre-popluated array of source device private pfns to migrate.
+ * @src_mpfns: pre-popluated array of source device private migrate pfns to
+ * migrate.
  * @npages: number of pages to migrate.
  *
  * Similar to migrate_device_range() but supports non-contiguous pre-popluated
  * array of device pages to migrate.
  */
-int migrate_device_pfns(unsigned long *src_pfns, unsigned long npages)
+int migrate_device_pfns(unsigned long *src_mpfns, unsigned long npages)
 {
 	unsigned long i, j;
 
 	for (i = 0; i < npages; i++) {
-		struct page *page = pfn_to_page(src_pfns[i]);
+		struct page *page = migrate_pfn_to_page(src_mpfns[i]);
 		struct folio *folio = page_folio(page);
 		unsigned int nr = 1;
 
-		src_pfns[i] = migrate_device_pfn_lock(src_pfns[i]);
+		src_mpfns[i] = migrate_device_pfn_lock(src_mpfns[i]);
 		nr = folio_nr_pages(folio);
 		if (nr > 1) {
-			src_pfns[i] |= MIGRATE_PFN_COMPOUND;
+			src_mpfns[i] |= MIGRATE_PFN_COMPOUND;
 			for (j = 1; j < nr; j++)
-				src_pfns[i+j] = 0;
+				src_mpfns[i+j] = 0;
 			i += j - 1;
 		}
 	}
 
-	migrate_device_unmap(src_pfns, npages, NULL);
+	migrate_device_unmap(src_mpfns, npages, NULL);
 
 	return 0;
 }
-- 
2.34.1




* [PATCH v2 04/11] mm/migrate_device: Add migrate PFN flag to track device private pages
  2026-01-07  9:18 [PATCH v2 00/11] Remove device private pages from physical address space Jordan Niethe
                   ` (2 preceding siblings ...)
  2026-01-07  9:18 ` [PATCH v2 03/11] mm/migrate_device: Make migrate_device_{pfns,range}() take mpfns Jordan Niethe
@ 2026-01-07  9:18 ` Jordan Niethe
  2026-01-08 20:01   ` Felix Kuehling
  2026-01-07  9:18 ` [PATCH v2 05/11] mm/page_vma_mapped: Add flags to page_vma_mapped_walk::pfn " Jordan Niethe
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Jordan Niethe @ 2026-01-07  9:18 UTC (permalink / raw)
  To: linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

A future change will remove device private pages from the physical
address space. This will mean that device private pages no longer have
a normal PFN and must be handled separately.

Prepare for this by adding a MIGRATE_PFN_DEVICE_PRIVATE flag to indicate
that a migrate pfn contains a PFN for a device private page.

Signed-off-by: Jordan Niethe <jniethe@nvidia.com>
Signed-off-by: Alistair Popple <apopple@nvidia.com>

---
v1:
- Update for HMM huge page support
- Update existing drivers to use MIGRATE_PFN_DEVICE
v2:
- Include changes to migrate_pfn_from_page()
- Rename to MIGRATE_PFN_DEVICE_PRIVATE
- drm/amd: Check adev->gmc.xgmi.connected_to_cpu
- lib/test_hmm.c: Check chunk->pagemap.type == MEMORY_DEVICE_PRIVATE
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  7 ++++++-
 drivers/gpu/drm/nouveau/nouveau_dmem.c   |  3 ++-
 drivers/gpu/drm/xe/xe_svm.c              |  2 +-
 include/linux/migrate.h                  | 14 +++++++++-----
 lib/test_hmm.c                           |  6 +++++-
 5 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index c493b19268cc..1a07a8b92e8f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -206,7 +206,12 @@ svm_migrate_copy_done(struct amdgpu_device *adev, struct dma_fence *mfence)
 unsigned long
 svm_migrate_addr_to_mpfn(struct amdgpu_device *adev, unsigned long addr)
 {
-	return migrate_pfn((addr + adev->kfd.pgmap.range.start) >> PAGE_SHIFT);
+	unsigned long flags = 0;
+
+	if (!adev->gmc.xgmi.connected_to_cpu)
+		flags |= MIGRATE_PFN_DEVICE_PRIVATE;
+	return migrate_pfn((addr + adev->kfd.pgmap.range.start) >> PAGE_SHIFT) |
+	       flags;
 }
 
 static void
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index bd3f7102c3f9..adfa3df5cbc5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -484,7 +484,8 @@ nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk)
 	dma_info = kvcalloc(npages, sizeof(*dma_info), GFP_KERNEL | __GFP_NOFAIL);
 
 	migrate_device_range(src_pfns,
-			     migrate_pfn(chunk->pagemap.range.start >> PAGE_SHIFT),
+			     migrate_pfn(chunk->pagemap.range.start >> PAGE_SHIFT) |
+			     MIGRATE_PFN_DEVICE_PRIVATE,
 			     npages);
 
 	for (i = 0; i < npages; i++) {
diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
index 260676b0d246..f82790d7e7e6 100644
--- a/drivers/gpu/drm/xe/xe_svm.c
+++ b/drivers/gpu/drm/xe/xe_svm.c
@@ -698,7 +698,7 @@ static int xe_svm_populate_devmem_mpfn(struct drm_pagemap_devmem *devmem_allocat
 		int i;
 
 		for (i = 0; i < drm_buddy_block_size(buddy, block) >> PAGE_SHIFT; ++i)
-			pfn[j++] = migrate_pfn(block_pfn + i);
+			pfn[j++] = migrate_pfn(block_pfn + i) | MIGRATE_PFN_DEVICE_PRIVATE;
 	}
 
 	return 0;
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index d269ec1400be..5fd2ee080bc0 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -122,11 +122,12 @@ static inline int migrate_misplaced_folio(struct folio *folio, int node)
  * have enough bits to store all physical address and flags. So far we have
  * enough room for all our flags.
  */
-#define MIGRATE_PFN_VALID	(1UL << 0)
-#define MIGRATE_PFN_MIGRATE	(1UL << 1)
-#define MIGRATE_PFN_WRITE	(1UL << 3)
-#define MIGRATE_PFN_COMPOUND	(1UL << 4)
-#define MIGRATE_PFN_SHIFT	6
+#define MIGRATE_PFN_VALID		(1UL << 0)
+#define MIGRATE_PFN_MIGRATE		(1UL << 1)
+#define MIGRATE_PFN_WRITE		(1UL << 3)
+#define MIGRATE_PFN_COMPOUND		(1UL << 4)
+#define MIGRATE_PFN_DEVICE_PRIVATE	(1UL << 5)
+#define MIGRATE_PFN_SHIFT		6
 
 static inline struct page *migrate_pfn_to_page(unsigned long mpfn)
 {
@@ -142,6 +143,9 @@ static inline unsigned long migrate_pfn(unsigned long pfn)
 
 static inline unsigned long migrate_pfn_from_page(struct page *page)
 {
+	if (is_device_private_page(page))
+		return migrate_pfn(page_to_pfn(page)) |
+		       MIGRATE_PFN_DEVICE_PRIVATE;
 	return migrate_pfn(page_to_pfn(page));
 }
 
diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index a6ff292596f3..872d3846af7b 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -1385,11 +1385,15 @@ static void dmirror_device_evict_chunk(struct dmirror_chunk *chunk)
 	unsigned long *src_pfns;
 	unsigned long *dst_pfns;
 	unsigned int order = 0;
+	unsigned long flags = 0;
 
 	src_pfns = kvcalloc(npages, sizeof(*src_pfns), GFP_KERNEL | __GFP_NOFAIL);
 	dst_pfns = kvcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL | __GFP_NOFAIL);
 
-	migrate_device_range(src_pfns, migrate_pfn(start_pfn), npages);
+	if (chunk->pagemap.type == MEMORY_DEVICE_PRIVATE)
+		flags |= MIGRATE_PFN_DEVICE_PRIVATE;
+
+	migrate_device_range(src_pfns, migrate_pfn(start_pfn) | flags, npages);
 	for (i = 0; i < npages; i++) {
 		struct page *dpage, *spage;
 
-- 
2.34.1




* [PATCH v2 05/11] mm/page_vma_mapped: Add flags to page_vma_mapped_walk::pfn to track device private pages
  2026-01-07  9:18 [PATCH v2 00/11] Remove device private pages from physical address space Jordan Niethe
                   ` (3 preceding siblings ...)
  2026-01-07  9:18 ` [PATCH v2 04/11] mm/migrate_device: Add migrate PFN flag to track device private pages Jordan Niethe
@ 2026-01-07  9:18 ` Jordan Niethe
  2026-01-07  9:18 ` [PATCH v2 06/11] mm: Add helpers to create migration entries from struct pages Jordan Niethe
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 33+ messages in thread
From: Jordan Niethe @ 2026-01-07  9:18 UTC (permalink / raw)
  To: linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

A future change will remove device private pages from the physical
address space. This will mean that device private pages no longer have
a normal PFN and must be handled separately.

Prepare for this by modifying page_vma_mapped_walk::pfn to contain flags
as well as a PFN. Introduce a PVMW_PFN_DEVICE_PRIVATE flag to indicate
that a page_vma_mapped_walk::pfn contains a PFN for a device private
page.

Signed-off-by: Jordan Niethe <jniethe@nvidia.com>
Signed-off-by: Alistair Popple <apopple@nvidia.com>
---
v1:
  - Update for HMM huge page support
v2:
  - Move adding device_private param to check_pmd() until final patch
---
 include/linux/rmap.h | 30 +++++++++++++++++++++++++++++-
 mm/page_vma_mapped.c | 13 +++++++------
 mm/rmap.c            |  4 ++--
 mm/vmscan.c          |  2 +-
 4 files changed, 39 insertions(+), 10 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index daa92a58585d..57c63b6a8f65 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -939,9 +939,37 @@ struct page_vma_mapped_walk {
 	unsigned int flags;
 };
 
+/* pfn is a device private offset */
+#define PVMW_PFN_DEVICE_PRIVATE	(1UL << 0)
+#define PVMW_PFN_SHIFT		1
+
+static inline unsigned long page_vma_walk_pfn(unsigned long pfn)
+{
+	return (pfn << PVMW_PFN_SHIFT);
+}
+
+static inline unsigned long folio_page_vma_walk_pfn(const struct folio *folio)
+{
+	if (folio_is_device_private(folio))
+		return page_vma_walk_pfn(folio_pfn(folio)) |
+		       PVMW_PFN_DEVICE_PRIVATE;
+
+	return page_vma_walk_pfn(folio_pfn(folio));
+}
+
+static inline struct page *page_vma_walk_pfn_to_page(unsigned long pvmw_pfn)
+{
+	return pfn_to_page(pvmw_pfn >> PVMW_PFN_SHIFT);
+}
+
+static inline struct folio *page_vma_walk_pfn_to_folio(unsigned long pvmw_pfn)
+{
+	return page_folio(page_vma_walk_pfn_to_page(pvmw_pfn));
+}
+
 #define DEFINE_FOLIO_VMA_WALK(name, _folio, _vma, _address, _flags)	\
 	struct page_vma_mapped_walk name = {				\
-		.pfn = folio_pfn(_folio),				\
+		.pfn = folio_page_vma_walk_pfn(_folio),			\
 		.nr_pages = folio_nr_pages(_folio),			\
 		.pgoff = folio_pgoff(_folio),				\
 		.vma = _vma,						\
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index b38a1d00c971..96c525785d78 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -129,9 +129,9 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 		pfn = softleaf_to_pfn(entry);
 	}
 
-	if ((pfn + pte_nr - 1) < pvmw->pfn)
+	if ((pfn + pte_nr - 1) < (pvmw->pfn >> PVMW_PFN_SHIFT))
 		return false;
-	if (pfn > (pvmw->pfn + pvmw->nr_pages - 1))
+	if (pfn > ((pvmw->pfn >> PVMW_PFN_SHIFT) + pvmw->nr_pages - 1))
 		return false;
 	return true;
 }
@@ -139,9 +139,9 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 /* Returns true if the two ranges overlap.  Careful to not overflow. */
 static bool check_pmd(unsigned long pfn, struct page_vma_mapped_walk *pvmw)
 {
-	if ((pfn + HPAGE_PMD_NR - 1) < pvmw->pfn)
+	if ((pfn + HPAGE_PMD_NR - 1) < (pvmw->pfn >> PVMW_PFN_SHIFT))
 		return false;
-	if (pfn > pvmw->pfn + pvmw->nr_pages - 1)
+	if (pfn > (pvmw->pfn >> PVMW_PFN_SHIFT) + pvmw->nr_pages - 1)
 		return false;
 	return true;
 }
@@ -254,7 +254,8 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 				entry = softleaf_from_pmd(pmde);
 
 				if (!softleaf_is_migration(entry) ||
-				    !check_pmd(softleaf_to_pfn(entry), pvmw))
+				    !check_pmd(softleaf_to_pfn(entry),
+					       pvmw))
 					return not_found(pvmw);
 				return true;
 			}
@@ -350,7 +351,7 @@ unsigned long page_mapped_in_vma(const struct page *page,
 {
 	const struct folio *folio = page_folio(page);
 	struct page_vma_mapped_walk pvmw = {
-		.pfn = page_to_pfn(page),
+		.pfn = folio_page_vma_walk_pfn(folio),
 		.nr_pages = 1,
 		.vma = vma,
 		.flags = PVMW_SYNC,
diff --git a/mm/rmap.c b/mm/rmap.c
index f955f02d570e..79a2478b4aa9 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1112,7 +1112,7 @@ static bool mapping_wrprotect_range_one(struct folio *folio,
 {
 	struct wrprotect_file_state *state = (struct wrprotect_file_state *)arg;
 	struct page_vma_mapped_walk pvmw = {
-		.pfn		= state->pfn,
+		.pfn		= page_vma_walk_pfn(state->pfn),
 		.nr_pages	= state->nr_pages,
 		.pgoff		= state->pgoff,
 		.vma		= vma,
@@ -1190,7 +1190,7 @@ int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
 		      struct vm_area_struct *vma)
 {
 	struct page_vma_mapped_walk pvmw = {
-		.pfn		= pfn,
+		.pfn		= page_vma_walk_pfn(pfn),
 		.nr_pages	= nr_pages,
 		.pgoff		= pgoff,
 		.vma		= vma,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 670fe9fae5ba..be5682d345b5 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4203,7 +4203,7 @@ bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 	pte_t *pte = pvmw->pte;
 	unsigned long addr = pvmw->address;
 	struct vm_area_struct *vma = pvmw->vma;
-	struct folio *folio = pfn_folio(pvmw->pfn);
+	struct folio *folio = page_vma_walk_pfn_to_folio(pvmw->pfn);
 	struct mem_cgroup *memcg = folio_memcg(folio);
 	struct pglist_data *pgdat = folio_pgdat(folio);
 	struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);
-- 
2.34.1




* [PATCH v2 06/11] mm: Add helpers to create migration entries from struct pages
  2026-01-07  9:18 [PATCH v2 00/11] Remove device private pages from physical address space Jordan Niethe
                   ` (4 preceding siblings ...)
  2026-01-07  9:18 ` [PATCH v2 05/11] mm/page_vma_mapped: Add flags to page_vma_mapped_walk::pfn " Jordan Niethe
@ 2026-01-07  9:18 ` Jordan Niethe
  2026-01-07  9:18 ` [PATCH v2 07/11] mm: Add a new swap type for migration entries of device private pages Jordan Niethe
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 33+ messages in thread
From: Jordan Niethe @ 2026-01-07  9:18 UTC (permalink / raw)
  To: linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

To create a new migration entry for a given struct page, that page is
first converted to its pfn, before passing the pfn to
make_readable_migration_entry() (and friends).

A future change will remove device private pages from the physical
address space. This will mean that device private pages no longer have a
pfn and must be handled separately.

Prepare for this with a new set of helpers:

  - make_readable_migration_entry_from_page()
  - make_readable_exclusive_migration_entry_from_page()
  - make_writable_migration_entry_from_page()

These helpers take a struct page as a parameter instead of a pfn. This
will allow more flexibility for handling the swap offset field
differently for device private pages.

Signed-off-by: Jordan Niethe <jniethe@nvidia.com>
---
v1:
  - New to series
v2:
  - Add flags param
---
 include/linux/leafops.h | 14 ++++++++++++++
 include/linux/swapops.h | 33 +++++++++++++++++++++++++++++++++
 mm/huge_memory.c        | 29 +++++++++++++++++------------
 mm/hugetlb.c            | 15 +++++++++------
 mm/memory.c             |  5 +++--
 mm/migrate_device.c     | 12 ++++++------
 mm/mprotect.c           | 10 +++++++---
 mm/rmap.c               | 12 ++++++------
 8 files changed, 95 insertions(+), 35 deletions(-)

diff --git a/include/linux/leafops.h b/include/linux/leafops.h
index cfafe7a5e7b1..2fde8208da13 100644
--- a/include/linux/leafops.h
+++ b/include/linux/leafops.h
@@ -363,6 +363,20 @@ static inline unsigned long softleaf_to_pfn(softleaf_t entry)
 	return swp_offset(entry) & SWP_PFN_MASK;
 }
 
+/**
+ * softleaf_to_flags() - Obtain flags encoded within leaf entry.
+ * @entry: Leaf entry, softleaf_has_pfn(@entry) must return true.
+ *
+ * Returns: The flags associated with the leaf entry.
+ */
+static inline unsigned long softleaf_to_flags(softleaf_t entry)
+{
+	VM_WARN_ON_ONCE(!softleaf_has_pfn(entry));
+
+	/* Temporary until swp_entry_t eliminated. */
+	return swp_offset(entry) & (SWP_MIG_YOUNG | SWP_MIG_DIRTY);
+}
+
 /**
  * softleaf_to_page() - Obtains struct page for PFN encoded within leaf entry.
  * @entry: Leaf entry, softleaf_has_pfn(@entry) must return true.
diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 8cfc966eae48..a9ad997bd5ec 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -173,16 +173,33 @@ static inline swp_entry_t make_readable_migration_entry(pgoff_t offset)
 	return swp_entry(SWP_MIGRATION_READ, offset);
 }
 
+static inline swp_entry_t make_readable_migration_entry_from_page(struct page *page, pgoff_t flags)
+{
+	return swp_entry(SWP_MIGRATION_READ, page_to_pfn(page) | flags);
+}
+
 static inline swp_entry_t make_readable_exclusive_migration_entry(pgoff_t offset)
 {
 	return swp_entry(SWP_MIGRATION_READ_EXCLUSIVE, offset);
 }
 
+static inline swp_entry_t make_readable_exclusive_migration_entry_from_page(struct page *page,
+									    pgoff_t flags)
+{
+	return swp_entry(SWP_MIGRATION_READ_EXCLUSIVE, page_to_pfn(page) | flags);
+}
+
 static inline swp_entry_t make_writable_migration_entry(pgoff_t offset)
 {
 	return swp_entry(SWP_MIGRATION_WRITE, offset);
 }
 
+static inline swp_entry_t make_writable_migration_entry_from_page(struct page *page,
+								  pgoff_t flags)
+{
+	return swp_entry(SWP_MIGRATION_WRITE, page_to_pfn(page) | flags);
+}
+
 /*
  * Returns whether the host has large enough swap offset field to support
  * carrying over pgtable A/D bits for page migrations.  The result is
@@ -222,11 +239,27 @@ static inline swp_entry_t make_readable_migration_entry(pgoff_t offset)
 	return swp_entry(0, 0);
 }
 
+static inline swp_entry_t make_readable_migration_entry_from_page(struct page *page, pgoff_t flags)
+{
+	return swp_entry(0, 0);
+}
+
+static inline swp_entry_t make_writeable_migration_entry_from_page(struct page *page, pgoff_t flags)
+{
+	return swp_entry(0, 0);
+}
+
 static inline swp_entry_t make_readable_exclusive_migration_entry(pgoff_t offset)
 {
 	return swp_entry(0, 0);
 }
 
+static inline swp_entry_t make_readable_exclusive_migration_entry_from_page(struct page *page,
+									    pgoff_t flags)
+{
+	return swp_entry(0, 0);
+}
+
 static inline swp_entry_t make_writable_migration_entry(pgoff_t offset)
 {
 	return swp_entry(0, 0);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 40cf59301c21..e3a448cdb34d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1800,7 +1800,8 @@ static void copy_huge_non_present_pmd(
 
 	if (softleaf_is_migration_write(entry) ||
 	    softleaf_is_migration_read_exclusive(entry)) {
-		entry = make_readable_migration_entry(swp_offset(entry));
+		entry = make_readable_migration_entry_from_page(softleaf_to_page(entry),
+								softleaf_to_flags(entry));
 		pmd = swp_entry_to_pmd(entry);
 		if (pmd_swp_soft_dirty(*src_pmd))
 			pmd = pmd_swp_mksoft_dirty(pmd);
@@ -2524,9 +2525,13 @@ static void change_non_present_huge_pmd(struct mm_struct *mm,
 		 * just be safe and disable write
 		 */
 		if (folio_test_anon(folio))
-			entry = make_readable_exclusive_migration_entry(swp_offset(entry));
+			entry = make_readable_exclusive_migration_entry_from_page(
+						softleaf_to_page(entry),
+						softleaf_to_flags(entry));
 		else
-			entry = make_readable_migration_entry(swp_offset(entry));
+			entry = make_readable_migration_entry_from_page(
+						softleaf_to_page(entry),
+						softleaf_to_flags(entry));
 		newpmd = swp_entry_to_pmd(entry);
 		if (pmd_swp_soft_dirty(*pmd))
 			newpmd = pmd_swp_mksoft_dirty(newpmd);
@@ -3183,14 +3188,14 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 
 		for (i = 0, addr = haddr; i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE) {
 			if (write)
-				swp_entry = make_writable_migration_entry(
-							page_to_pfn(page + i));
+				swp_entry = make_writable_migration_entry_from_page(
+							page + i, 0);
 			else if (anon_exclusive)
-				swp_entry = make_readable_exclusive_migration_entry(
-							page_to_pfn(page + i));
+				swp_entry = make_readable_exclusive_migration_entry_from_page(
+							page + i, 0);
 			else
-				swp_entry = make_readable_migration_entry(
-							page_to_pfn(page + i));
+				swp_entry = make_readable_migration_entry_from_page(
+							page + i, 0);
 			if (young)
 				swp_entry = make_migration_entry_young(swp_entry);
 			if (dirty)
@@ -4890,11 +4895,11 @@ int set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw,
 	if (pmd_dirty(pmdval))
 		folio_mark_dirty(folio);
 	if (pmd_write(pmdval))
-		entry = make_writable_migration_entry(page_to_pfn(page));
+		entry = make_writable_migration_entry_from_page(page, 0);
 	else if (anon_exclusive)
-		entry = make_readable_exclusive_migration_entry(page_to_pfn(page));
+		entry = make_readable_exclusive_migration_entry_from_page(page, 0);
 	else
-		entry = make_readable_migration_entry(page_to_pfn(page));
+		entry = make_readable_migration_entry_from_page(page, 0);
 	if (pmd_young(pmdval))
 		entry = make_migration_entry_young(entry);
 	if (pmd_dirty(pmdval))
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 51273baec9e5..6a5e40d4cfc2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4939,8 +4939,9 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 				 * COW mappings require pages in both
 				 * parent and child to be set to read.
 				 */
-				softleaf = make_readable_migration_entry(
-							swp_offset(softleaf));
+				softleaf = make_readable_migration_entry_from_page(
+							softleaf_to_page(softleaf),
+							softleaf_to_flags(softleaf));
 				entry = swp_entry_to_pte(softleaf);
 				if (userfaultfd_wp(src_vma) && uffd_wp)
 					entry = pte_swp_mkuffd_wp(entry);
@@ -6491,11 +6492,13 @@ long hugetlb_change_protection(struct vm_area_struct *vma,
 
 			if (softleaf_is_migration_write(entry)) {
 				if (folio_test_anon(folio))
-					entry = make_readable_exclusive_migration_entry(
-								swp_offset(entry));
+					entry = make_readable_exclusive_migration_entry_from_page(
+								softleaf_to_page(entry),
+								softleaf_to_flags(entry));
 				else
-					entry = make_readable_migration_entry(
-								swp_offset(entry));
+					entry = make_readable_migration_entry_from_page(
+								softleaf_to_page(entry),
+								softleaf_to_flags(entry));
 				newpte = swp_entry_to_pte(entry);
 				pages++;
 			}
diff --git a/mm/memory.c b/mm/memory.c
index 2a55edc48a65..16493fbb3adb 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -963,8 +963,9 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 			 * to be set to read. A previously exclusive entry is
 			 * now shared.
 			 */
-			entry = make_readable_migration_entry(
-							swp_offset(entry));
+			entry = make_readable_migration_entry_from_page(
+							softleaf_to_page(entry),
+							softleaf_to_flags(entry));
 			pte = softleaf_to_pte(entry);
 			if (pte_swp_soft_dirty(orig_pte))
 				pte = pte_swp_mksoft_dirty(pte);
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index a2baaa2a81f9..c876526ac6a3 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -432,14 +432,14 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 
 			/* Setup special migration page table entry */
 			if (mpfn & MIGRATE_PFN_WRITE)
-				entry = make_writable_migration_entry(
-							page_to_pfn(page));
+				entry = make_writable_migration_entry_from_page(
+							page, 0);
 			else if (anon_exclusive)
-				entry = make_readable_exclusive_migration_entry(
-							page_to_pfn(page));
+				entry = make_readable_exclusive_migration_entry_from_page(
+							page, 0);
 			else
-				entry = make_readable_migration_entry(
-							page_to_pfn(page));
+				entry = make_readable_migration_entry_from_page(
+							page, 0);
 			if (pte_present(pte)) {
 				if (pte_young(pte))
 					entry = make_migration_entry_young(entry);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 283889e4f1ce..adfe1b7a4a19 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -328,10 +328,14 @@ static long change_pte_range(struct mmu_gather *tlb,
 				 * just be safe and disable write
 				 */
 				if (folio_test_anon(folio))
-					entry = make_readable_exclusive_migration_entry(
-							     swp_offset(entry));
+					entry = make_readable_exclusive_migration_entry_from_page(
+							softleaf_to_page(entry),
+							softleaf_to_flags(entry));
 				else
-					entry = make_readable_migration_entry(swp_offset(entry));
+					entry = make_readable_migration_entry_from_page(
+							softleaf_to_page(entry),
+							softleaf_to_flags(entry));
+
 				newpte = swp_entry_to_pte(entry);
 				if (pte_swp_soft_dirty(oldpte))
 					newpte = pte_swp_mksoft_dirty(newpte);
diff --git a/mm/rmap.c b/mm/rmap.c
index 79a2478b4aa9..6a63333f8722 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2539,14 +2539,14 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 			 * pte is removed and then restart fault handling.
 			 */
 			if (writable)
-				entry = make_writable_migration_entry(
-							page_to_pfn(subpage));
+				entry = make_writable_migration_entry_from_page(
+							subpage, 0);
 			else if (anon_exclusive)
-				entry = make_readable_exclusive_migration_entry(
-							page_to_pfn(subpage));
+				entry = make_readable_exclusive_migration_entry_from_page(
+							subpage, 0);
 			else
-				entry = make_readable_migration_entry(
-							page_to_pfn(subpage));
+				entry = make_readable_migration_entry_from_page(
+							subpage, 0);
 			if (likely(pte_present(pteval))) {
 				if (pte_young(pteval))
 					entry = make_migration_entry_young(entry);
-- 
2.34.1




* [PATCH v2 07/11] mm: Add a new swap type for migration entries of device private pages
  2026-01-07  9:18 [PATCH v2 00/11] Remove device private pages from physical address space Jordan Niethe
                   ` (5 preceding siblings ...)
  2026-01-07  9:18 ` [PATCH v2 06/11] mm: Add helpers to create migration entries from struct pages Jordan Niethe
@ 2026-01-07  9:18 ` Jordan Niethe
  2026-01-07  9:18 ` [PATCH v2 08/11] mm: Add helpers to create device private entries from struct pages Jordan Niethe
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 33+ messages in thread
From: Jordan Niethe @ 2026-01-07  9:18 UTC (permalink / raw)
  To: linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

A future change will remove device private pages from the physical
address space. This will mean that device private pages no longer have
pfns and must be handled separately.

When migrating a device private page a migration entry is created for
that page - this includes the pfn for that page. Once device private
pages begin using device memory offsets instead of pfns we will need to
be able to determine which kind of value is in the entry so we can
associate it with the correct page.

Introduce new swap types:

  - SWP_MIGRATION_DEVICE_READ
  - SWP_MIGRATION_DEVICE_WRITE
  - SWP_MIGRATION_DEVICE_READ_EXCLUSIVE

These correspond to

  - SWP_MIGRATION_READ
  - SWP_MIGRATION_WRITE
  - SWP_MIGRATION_READ_EXCLUSIVE

except the swap entry contains a device private offset.

The SWP_MIGRATION_DEVICE swap types are treated as specializations of
the SWP_MIGRATION types. That is, the existing helpers such as
is_writable_migration_entry() will still return true for a
SWP_MIGRATION_DEVICE_WRITE entry. Likewise, the
make_*_migration_entry_from_page() helpers will create either
a SWP_MIGRATION_DEVICE or a SWP_MIGRATION type, as the page requires.

Introduce new helpers such as
is_writable_device_migration_private_entry() to disambiguate between a
SWP_MIGRATION_WRITE and a SWP_MIGRATION_DEVICE_WRITE entry.
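
For illustration only (this sketch is not part of the patch), the intended
behaviour of the generic predicates and the new disambiguation helper when
a device private page is passed in:

    /*
     * Sketch: the generic migration predicates keep working for the new
     * device private migration types, while the new helper narrows the check.
     */
    static void example_check_entry(struct page *page)
    {
            swp_entry_t entry = make_writable_migration_entry_from_page(page, 0);

            /* True for SWP_MIGRATION_WRITE and SWP_MIGRATION_DEVICE_WRITE alike. */
            WARN_ON(!is_writable_migration_entry(entry));

            /* True only when @page is a device private page. */
            if (is_writable_device_migration_private_entry(entry))
                    pr_debug("entry refers to a device private page\n");
    }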

Introduce corresponding softleaf types and helpers.

Signed-off-by: Jordan Niethe <jniethe@nvidia.com>
Signed-off-by: Alistair Popple <apopple@nvidia.com>
---
v1:
  - Update for softleaf infrastructure
  - Handle make_readable_migration_entry_from_page() and friends
  - s/make_device_migration_readable_exclusive_migration_entry/make_readable_exclusive_migration_device_private_entry
  - s/is_device_migration_readable_exclusive_entry/is_readable_exclusive_device_private_migration_entry/
v2:
  - Add softleaf_is_migration_device_private_read()
---
 include/linux/leafops.h | 86 +++++++++++++++++++++++++++++++++++++----
 include/linux/swap.h    |  8 +++-
 include/linux/swapops.h | 79 +++++++++++++++++++++++++++++++++++++
 3 files changed, 164 insertions(+), 9 deletions(-)

diff --git a/include/linux/leafops.h b/include/linux/leafops.h
index 2fde8208da13..2fa09ffe9e34 100644
--- a/include/linux/leafops.h
+++ b/include/linux/leafops.h
@@ -28,6 +28,9 @@ enum softleaf_type {
 	SOFTLEAF_DEVICE_PRIVATE_READ,
 	SOFTLEAF_DEVICE_PRIVATE_WRITE,
 	SOFTLEAF_DEVICE_EXCLUSIVE,
+	SOFTLEAF_MIGRATION_DEVICE_READ,
+	SOFTLEAF_MIGRATION_DEVICE_READ_EXCLUSIVE,
+	SOFTLEAF_MIGRATION_DEVICE_WRITE,
 	/* H/W posion types. */
 	SOFTLEAF_HWPOISON,
 	/* Marker types. */
@@ -165,6 +168,12 @@ static inline enum softleaf_type softleaf_type(softleaf_t entry)
 		return SOFTLEAF_DEVICE_PRIVATE_READ;
 	case SWP_DEVICE_EXCLUSIVE:
 		return SOFTLEAF_DEVICE_EXCLUSIVE;
+	case SWP_MIGRATION_DEVICE_READ:
+		return SOFTLEAF_MIGRATION_DEVICE_READ;
+	case SWP_MIGRATION_DEVICE_WRITE:
+		return SOFTLEAF_MIGRATION_DEVICE_WRITE;
+	case SWP_MIGRATION_DEVICE_READ_EXCLUSIVE:
+		return SOFTLEAF_MIGRATION_DEVICE_READ_EXCLUSIVE;
 #endif
 #ifdef CONFIG_MEMORY_FAILURE
 	case SWP_HWPOISON:
@@ -190,16 +199,75 @@ static inline bool softleaf_is_swap(softleaf_t entry)
 	return softleaf_type(entry) == SOFTLEAF_SWAP;
 }
 
+/**
+ * softleaf_is_migration_device_private() - Is this leaf entry a migration
+ * device private entry?
+ * @entry: Leaf entry.
+ *
+ * Returns: true if the leaf entry is a migration device private entry, otherwise false.
+ */
+static inline bool softleaf_is_migration_device_private(softleaf_t entry)
+{
+	switch (softleaf_type(entry)) {
+	case SOFTLEAF_MIGRATION_DEVICE_READ:
+	case SOFTLEAF_MIGRATION_DEVICE_WRITE:
+	case SOFTLEAF_MIGRATION_DEVICE_READ_EXCLUSIVE:
+		return true;
+	default:
+		return false;
+	}
+}
+
+/**
+ * softleaf_is_migration_device_private_write() - Is this leaf entry a writable
+ * device private migration entry?
+ * @entry: Leaf entry.
+ *
+ * Returns: true if the leaf entry is a writable device private migration entry,
+ * otherwise false.
+ */
+static inline bool softleaf_is_migration_device_private_write(softleaf_t entry)
+{
+	return softleaf_type(entry) == SOFTLEAF_MIGRATION_DEVICE_WRITE;
+}
+
+/**
+ * softleaf_is_migration_device_private_read() - Is this leaf entry a readable
+ * device private migration entry?
+ * @entry: Leaf entry.
+ *
+ * Returns: true if the leaf entry is a readable device private migration
+ * entry, otherwise false.
+ */
+static inline bool softleaf_is_migration_device_private_read(softleaf_t entry)
+{
+	return softleaf_type(entry) == SOFTLEAF_MIGRATION_DEVICE_READ;
+}
+
+/**
+ * softleaf_is_migration_device_private_read_exclusive() - Is this leaf entry
+ * an exclusive readable device private migration entry?
+ * @entry: Leaf entry.
+ *
+ * Returns: true if the leaf entry is an exclusive readable device private
+ * migration entry, otherwise false.
+ */
+static inline bool softleaf_is_migration_device_private_read_exclusive(softleaf_t entry)
+{
+	return softleaf_type(entry) == SOFTLEAF_MIGRATION_DEVICE_READ_EXCLUSIVE;
+}
+
 /**
  * softleaf_is_migration_write() - Is this leaf entry a writable migration entry?
  * @entry: Leaf entry.
  *
- * Returns: true if the leaf entry is a writable migration entry, otherwise
- * false.
+ * Returns: true if the leaf entry is a writable migration entry or a writable
+ * device private migration entry, otherwise false.
  */
 static inline bool softleaf_is_migration_write(softleaf_t entry)
 {
-	return softleaf_type(entry) == SOFTLEAF_MIGRATION_WRITE;
+	return softleaf_type(entry) == SOFTLEAF_MIGRATION_WRITE ||
+	       softleaf_is_migration_device_private_write(entry);
 }
 
 /**
@@ -211,7 +279,8 @@ static inline bool softleaf_is_migration_write(softleaf_t entry)
  */
 static inline bool softleaf_is_migration_read(softleaf_t entry)
 {
-	return softleaf_type(entry) == SOFTLEAF_MIGRATION_READ;
+	return softleaf_type(entry) == SOFTLEAF_MIGRATION_READ ||
+	       softleaf_is_migration_device_private_read(entry);
 }
 
 /**
@@ -219,12 +288,13 @@ static inline bool softleaf_is_migration_read(softleaf_t entry)
  * readable migration entry?
  * @entry: Leaf entry.
  *
- * Returns: true if the leaf entry is an exclusive readable migration entry,
- * otherwise false.
+ * Returns: true if the leaf entry is an exclusive readable migration entry or
+ * exclusive readable device private migration entry, otherwise false.
  */
 static inline bool softleaf_is_migration_read_exclusive(softleaf_t entry)
 {
-	return softleaf_type(entry) == SOFTLEAF_MIGRATION_READ_EXCLUSIVE;
+	return softleaf_type(entry) == SOFTLEAF_MIGRATION_READ_EXCLUSIVE ||
+	       softleaf_is_migration_device_private_read_exclusive(entry);
 }
 
 /**
@@ -241,7 +311,7 @@ static inline bool softleaf_is_migration(softleaf_t entry)
 	case SOFTLEAF_MIGRATION_WRITE:
 		return true;
 	default:
-		return false;
+		return softleaf_is_migration_device_private(entry);
 	}
 }
 
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 38ca3df68716..c15e3b3067cd 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -74,12 +74,18 @@ static inline int current_is_kswapd(void)
  *
  * When a page is mapped by the device for exclusive access we set the CPU page
  * table entries to a special SWP_DEVICE_EXCLUSIVE entry.
+ *
+ * Because device private pages do not use regular PFNs, special migration
+ * entries are also needed.
  */
 #ifdef CONFIG_DEVICE_PRIVATE
-#define SWP_DEVICE_NUM 3
+#define SWP_DEVICE_NUM 6
 #define SWP_DEVICE_WRITE (MAX_SWAPFILES+SWP_HWPOISON_NUM+SWP_MIGRATION_NUM)
 #define SWP_DEVICE_READ (MAX_SWAPFILES+SWP_HWPOISON_NUM+SWP_MIGRATION_NUM+1)
 #define SWP_DEVICE_EXCLUSIVE (MAX_SWAPFILES+SWP_HWPOISON_NUM+SWP_MIGRATION_NUM+2)
+#define SWP_MIGRATION_DEVICE_READ (MAX_SWAPFILES+SWP_HWPOISON_NUM+SWP_MIGRATION_NUM+3)
+#define SWP_MIGRATION_DEVICE_READ_EXCLUSIVE (MAX_SWAPFILES+SWP_HWPOISON_NUM+SWP_MIGRATION_NUM+4)
+#define SWP_MIGRATION_DEVICE_WRITE (MAX_SWAPFILES+SWP_HWPOISON_NUM+SWP_MIGRATION_NUM+5)
 #else
 #define SWP_DEVICE_NUM 0
 #endif
diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index a9ad997bd5ec..bae76d3831fb 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -148,6 +148,43 @@ static inline swp_entry_t make_device_exclusive_entry(pgoff_t offset)
 	return swp_entry(SWP_DEVICE_EXCLUSIVE, offset);
 }
 
+static inline swp_entry_t make_readable_migration_device_private_entry(pgoff_t offset)
+{
+	return swp_entry(SWP_MIGRATION_DEVICE_READ, offset);
+}
+
+static inline swp_entry_t make_writable_migration_device_private_entry(pgoff_t offset)
+{
+	return swp_entry(SWP_MIGRATION_DEVICE_WRITE, offset);
+}
+
+static inline bool is_device_private_migration_entry(swp_entry_t entry)
+{
+	return unlikely(swp_type(entry) == SWP_MIGRATION_DEVICE_READ ||
+			swp_type(entry) == SWP_MIGRATION_DEVICE_READ_EXCLUSIVE ||
+			swp_type(entry) == SWP_MIGRATION_DEVICE_WRITE);
+}
+
+static inline bool is_readable_device_migration_private_entry(swp_entry_t entry)
+{
+	return unlikely(swp_type(entry) == SWP_MIGRATION_DEVICE_READ);
+}
+
+static inline bool is_writable_device_migration_private_entry(swp_entry_t entry)
+{
+	return unlikely(swp_type(entry) == SWP_MIGRATION_DEVICE_WRITE);
+}
+
+static inline swp_entry_t make_readable_exclusive_migration_device_private_entry(pgoff_t offset)
+{
+	return swp_entry(SWP_MIGRATION_DEVICE_READ_EXCLUSIVE, offset);
+}
+
+static inline bool is_readable_exclusive_device_private_migration_entry(swp_entry_t entry)
+{
+	return swp_type(entry) == SWP_MIGRATION_DEVICE_READ_EXCLUSIVE;
+}
+
 #else /* CONFIG_DEVICE_PRIVATE */
 static inline swp_entry_t make_readable_device_private_entry(pgoff_t offset)
 {
@@ -164,6 +201,36 @@ static inline swp_entry_t make_device_exclusive_entry(pgoff_t offset)
 	return swp_entry(0, 0);
 }
 
+static inline swp_entry_t make_readable_migration_device_private_entry(pgoff_t offset)
+{
+	return swp_entry(0, 0);
+}
+
+static inline swp_entry_t make_writable_migration_device_private_entry(pgoff_t offset)
+{
+	return swp_entry(0, 0);
+}
+
+static inline bool is_device_private_migration_entry(swp_entry_t entry)
+{
+	return false;
+}
+
+static inline bool is_writable_device_migration_private_entry(swp_entry_t entry)
+{
+	return false;
+}
+
+static inline swp_entry_t make_readable_exclusive_migration_device_private_entry(pgoff_t offset)
+{
+	return swp_entry(0, 0);
+}
+
+static inline bool is_readable_exclusive_device_private_migration_entry(swp_entry_t entry)
+{
+	return false;
+}
+
 #endif /* CONFIG_DEVICE_PRIVATE */
 
 #ifdef CONFIG_MIGRATION
@@ -175,6 +242,10 @@ static inline swp_entry_t make_readable_migration_entry(pgoff_t offset)
 
 static inline swp_entry_t make_readable_migration_entry_from_page(struct page *page, pgoff_t flags)
 {
+	if (is_device_private_page(page))
+		return make_readable_migration_device_private_entry(
+				page_to_pfn(page) | flags);
+
 	return swp_entry(SWP_MIGRATION_READ, page_to_pfn(page) | flags);
 }
 
@@ -186,6 +257,10 @@ static inline swp_entry_t make_readable_exclusive_migration_entry(pgoff_t offset
 static inline swp_entry_t make_readable_exclusive_migration_entry_from_page(struct page *page,
 									    pgoff_t flags)
 {
+	if (is_device_private_page(page))
+		return make_readable_exclusive_migration_device_private_entry(
+				page_to_pfn(page) | flags);
+
 	return swp_entry(SWP_MIGRATION_READ_EXCLUSIVE, page_to_pfn(page) | flags);
 }
 
@@ -197,6 +272,10 @@ static inline swp_entry_t make_writable_migration_entry(pgoff_t offset)
 static inline swp_entry_t make_writable_migration_entry_from_page(struct page *page,
 								  pgoff_t flags)
 {
+	if (is_device_private_page(page))
+		return make_writable_migration_device_private_entry(
+				page_to_pfn(page) | flags);
+
 	return swp_entry(SWP_MIGRATION_WRITE, page_to_pfn(page) | flags);
 }
 
-- 
2.34.1




* [PATCH v2 08/11] mm: Add helpers to create device private entries from struct pages
  2026-01-07  9:18 [PATCH v2 00/11] Remove device private pages from physical address space Jordan Niethe
                   ` (6 preceding siblings ...)
  2026-01-07  9:18 ` [PATCH v2 07/11] mm: Add a new swap type for migration entries of device private pages Jordan Niethe
@ 2026-01-07  9:18 ` Jordan Niethe
  2026-01-07  9:18 ` [PATCH v2 09/11] mm/util: Add flag to track device private pages in page snapshots Jordan Niethe
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 33+ messages in thread
From: Jordan Niethe @ 2026-01-07  9:18 UTC (permalink / raw)
  To: linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

To create a new device private entry for a given struct page, the page
is first converted to its pfn, which is then passed to
make_writable_device_private_entry() (and friends).

A future change will remove device private pages from the physical
address space. This will mean that device private pages no longer have a
pfn and must be handled separately.

Prepare for this with a new set of helpers:

- make_readable_device_private_entry_from_page()
- make_writable_device_private_entry_from_page()

These helpers take a struct page as a parameter instead of a pfn. This
will allow more flexibility for handling the swap offset field
differently for device private pages.
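
Roughly, call sites convert as in the following sketch (illustrative only,
mirroring the conversions in the diff below; example_make_entry() is not a
real function):

    /* Sketch of the call-site conversion this patch performs. */
    static swp_entry_t example_make_entry(struct page *page, bool writable)
    {
            /* Before: make_writable_device_private_entry(page_to_pfn(page)) */
            if (writable)
                    return make_writable_device_private_entry_from_page(page, 0);

            /* Before: make_readable_device_private_entry(page_to_pfn(page)) */
            return make_readable_device_private_entry_from_page(page, 0);
    }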

Signed-off-by: Jordan Niethe <jniethe@nvidia.com>
---
v1:
  - New to series
v2:
  - Add flag param
---
 include/linux/swapops.h | 24 ++++++++++++++++++++++++
 mm/huge_memory.c        | 14 ++++++--------
 mm/migrate.c            |  6 ++----
 mm/migrate_device.c     | 12 ++++--------
 4 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index bae76d3831fb..f7d85a451a2b 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -138,11 +138,23 @@ static inline swp_entry_t make_readable_device_private_entry(pgoff_t offset)
 	return swp_entry(SWP_DEVICE_READ, offset);
 }
 
+static inline swp_entry_t make_readable_device_private_entry_from_page(struct page *page,
+								       pgoff_t flags)
+{
+	return swp_entry(SWP_DEVICE_READ, page_to_pfn(page) | flags);
+}
+
 static inline swp_entry_t make_writable_device_private_entry(pgoff_t offset)
 {
 	return swp_entry(SWP_DEVICE_WRITE, offset);
 }
 
+static inline swp_entry_t make_writable_device_private_entry_from_page(struct page *page,
+								       pgoff_t flags)
+{
+	return swp_entry(SWP_DEVICE_WRITE, page_to_pfn(page) | flags);
+}
+
 static inline swp_entry_t make_device_exclusive_entry(pgoff_t offset)
 {
 	return swp_entry(SWP_DEVICE_EXCLUSIVE, offset);
@@ -191,11 +203,23 @@ static inline swp_entry_t make_readable_device_private_entry(pgoff_t offset)
 	return swp_entry(0, 0);
 }
 
+static inline swp_entry_t make_readable_device_private_entry_from_page(struct page *page,
+								       pgoff_t flags)
+{
+	return swp_entry(0, 0);
+}
+
 static inline swp_entry_t make_writable_device_private_entry(pgoff_t offset)
 {
 	return swp_entry(0, 0);
 }
 
+static inline swp_entry_t make_writable_device_private_entry_from_page(struct page *page,
+								       pgoff_t flags)
+{
+	return swp_entry(0, 0);
+}
+
 static inline swp_entry_t make_device_exclusive_entry(pgoff_t offset)
 {
 	return swp_entry(0, 0);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e3a448cdb34d..03f1f13bb24c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3219,11 +3219,11 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 			 * is false.
 			 */
 			if (write)
-				swp_entry = make_writable_device_private_entry(
-							page_to_pfn(page + i));
+				swp_entry = make_writable_device_private_entry_from_page(
+							page + i, 0);
 			else
-				swp_entry = make_readable_device_private_entry(
-							page_to_pfn(page + i));
+				swp_entry = make_readable_device_private_entry_from_page(
+							page + i, 0);
 			/*
 			 * Young and dirty bits are not progated via swp_entry
 			 */
@@ -4950,11 +4950,9 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
 		swp_entry_t entry;
 
 		if (pmd_write(pmde))
-			entry = make_writable_device_private_entry(
-							page_to_pfn(new));
+			entry = make_writable_device_private_entry_from_page(new, 0);
 		else
-			entry = make_readable_device_private_entry(
-							page_to_pfn(new));
+			entry = make_readable_device_private_entry_from_page(new, 0);
 		pmde = swp_entry_to_pmd(entry);
 
 		if (pmd_swp_soft_dirty(*pvmw->pmd))
diff --git a/mm/migrate.c b/mm/migrate.c
index 5169f9717f60..6cc6c989ab6b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -399,11 +399,9 @@ static bool remove_migration_pte(struct folio *folio,
 
 		if (unlikely(is_device_private_page(new))) {
 			if (pte_write(pte))
-				entry = make_writable_device_private_entry(
-							page_to_pfn(new));
+				entry = make_writable_device_private_entry_from_page(new, 0);
 			else
-				entry = make_readable_device_private_entry(
-							page_to_pfn(new));
+				entry = make_readable_device_private_entry_from_page(new, 0);
 			pte = softleaf_to_pte(entry);
 			if (pte_swp_soft_dirty(old_pte))
 				pte = pte_swp_mksoft_dirty(pte);
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index c876526ac6a3..0ca6f78df0e2 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -836,11 +836,9 @@ static int migrate_vma_insert_huge_pmd_page(struct migrate_vma *migrate,
 		swp_entry_t swp_entry;
 
 		if (vma->vm_flags & VM_WRITE)
-			swp_entry = make_writable_device_private_entry(
-						page_to_pfn(page));
+			swp_entry = make_writable_device_private_entry_from_page(page, 0);
 		else
-			swp_entry = make_readable_device_private_entry(
-						page_to_pfn(page));
+			swp_entry = make_readable_device_private_entry_from_page(page, 0);
 		entry = swp_entry_to_pmd(swp_entry);
 	} else {
 		if (folio_is_zone_device(folio) &&
@@ -1033,11 +1031,9 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 		swp_entry_t swp_entry;
 
 		if (vma->vm_flags & VM_WRITE)
-			swp_entry = make_writable_device_private_entry(
-						page_to_pfn(page));
+			swp_entry = make_writable_device_private_entry_from_page(page, 0);
 		else
-			swp_entry = make_readable_device_private_entry(
-						page_to_pfn(page));
+			swp_entry = make_readable_device_private_entry_from_page(page, 0);
 		entry = swp_entry_to_pte(swp_entry);
 	} else {
 		if (folio_is_zone_device(folio) &&
-- 
2.34.1




* [PATCH v2 09/11] mm/util: Add flag to track device private pages in page snapshots
  2026-01-07  9:18 [PATCH v2 00/11] Remove device private pages from physical address space Jordan Niethe
                   ` (7 preceding siblings ...)
  2026-01-07  9:18 ` [PATCH v2 08/11] mm: Add helpers to create device private entries from struct pages Jordan Niethe
@ 2026-01-07  9:18 ` Jordan Niethe
  2026-01-07  9:18 ` [PATCH v2 10/11] mm/hmm: Add flag to track device private pages Jordan Niethe
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 33+ messages in thread
From: Jordan Niethe @ 2026-01-07  9:18 UTC (permalink / raw)
  To: linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

A future change will remove device private pages from the physical
address space. This will mean that device private pages no longer have
normal pfns and must be handled separately.

Add a new flag, PAGE_SNAPSHOT_DEVICE_PRIVATE, to track when the pfn of a
page snapshot refers to a device private page.
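
As an illustrative sketch (not part of the patch), a snapshot consumer gates
any pfn-based checks on the new flag, as the fs/proc/page.c hunk below does;
example_snapshot_is_zero_page() is a made-up name:

    /* Sketch: only treat ps->pfn as a real pfn for non device private pages. */
    static bool example_snapshot_is_zero_page(const struct page_snapshot *ps)
    {
            if (ps->flags & PAGE_SNAPSHOT_DEVICE_PRIVATE)
                    return false;

            return is_zero_pfn(ps->pfn);
    }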

Signed-off-by: Jordan Niethe <jniethe@nvidia.com>
Signed-off-by: Alistair Popple <apopple@nvidia.com>
---
v1:
  - No change
v2:
  - No change
---
 fs/proc/page.c     | 6 ++++--
 include/linux/mm.h | 7 ++++---
 mm/util.c          | 3 +++
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/fs/proc/page.c b/fs/proc/page.c
index f9b2c2c906cd..adca0e681442 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -191,10 +191,12 @@ u64 stable_page_flags(const struct page *page)
 	         folio_test_large_rmappable(folio)) {
 		/* Note: we indicate any THPs here, not just PMD-sized ones */
 		u |= 1 << KPF_THP;
-	} else if (is_huge_zero_pfn(ps.pfn)) {
+	} else if (!(ps.flags & PAGE_SNAPSHOT_DEVICE_PRIVATE) &&
+		   is_huge_zero_pfn(ps.pfn)) {
 		u |= 1 << KPF_ZERO_PAGE;
 		u |= 1 << KPF_THP;
-	} else if (is_zero_pfn(ps.pfn)) {
+	} else if (!(ps.flags & PAGE_SNAPSHOT_DEVICE_PRIVATE)
+		   && is_zero_pfn(ps.pfn)) {
 		u |= 1 << KPF_ZERO_PAGE;
 	}
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 15076261d0c2..e65329e1969f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4623,9 +4623,10 @@ static inline bool page_pool_page_is_pp(const struct page *page)
 }
 #endif
 
-#define PAGE_SNAPSHOT_FAITHFUL (1 << 0)
-#define PAGE_SNAPSHOT_PG_BUDDY (1 << 1)
-#define PAGE_SNAPSHOT_PG_IDLE  (1 << 2)
+#define PAGE_SNAPSHOT_FAITHFUL		(1 << 0)
+#define PAGE_SNAPSHOT_PG_BUDDY		(1 << 1)
+#define PAGE_SNAPSHOT_PG_IDLE		(1 << 2)
+#define PAGE_SNAPSHOT_DEVICE_PRIVATE	(1 << 3)
 
 struct page_snapshot {
 	struct folio folio_snapshot;
diff --git a/mm/util.c b/mm/util.c
index 97cae40c0209..65e3f1a97d76 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1218,6 +1218,9 @@ static void set_ps_flags(struct page_snapshot *ps, const struct folio *folio,
 
 	if (folio_test_idle(folio))
 		ps->flags |= PAGE_SNAPSHOT_PG_IDLE;
+
+	if (is_device_private_page(page))
+		ps->flags |= PAGE_SNAPSHOT_DEVICE_PRIVATE;
 }
 
 /**
-- 
2.34.1




* [PATCH v2 10/11] mm/hmm: Add flag to track device private pages
  2026-01-07  9:18 [PATCH v2 00/11] Remove device private pages from physical address space Jordan Niethe
                   ` (8 preceding siblings ...)
  2026-01-07  9:18 ` [PATCH v2 09/11] mm/util: Add flag to track device private pages in page snapshots Jordan Niethe
@ 2026-01-07  9:18 ` Jordan Niethe
  2026-01-07  9:18 ` [PATCH v2 11/11] mm: Remove device private pages from the physical address space Jordan Niethe
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 33+ messages in thread
From: Jordan Niethe @ 2026-01-07  9:18 UTC (permalink / raw)
  To: linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

A future change will remove device private pages from the physical
address space. This will mean that device private pages no longer have
normal pfns and must be handled separately.

Prepare for this by adding a HMM_PFN_DEVICE_PRIVATE flag to indicate
that a hmm_pfn contains a PFN for a device private page.
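
For illustration (not from the patch), a consumer of hmm_range_fault() output
could use the flag as below, on the assumption that the pfn field must not be
treated as a regular pfn when the flag is set:

    /* Sketch: only the non device private case may use pfn_to_page() here. */
    static struct page *example_hmm_pfn_to_normal_page(unsigned long hmm_pfn)
    {
            if (!(hmm_pfn & HMM_PFN_VALID))
                    return NULL;

            if (hmm_pfn & HMM_PFN_DEVICE_PRIVATE)
                    return NULL;    /* left to device private aware code */

            return pfn_to_page(hmm_pfn & ~HMM_PFN_FLAGS);
    }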

Signed-off-by: Jordan Niethe <jniethe@nvidia.com>
Signed-off-by: Alistair Popple <apopple@nvidia.com>

---
v1:
  - Update HMM_PFN_ORDER_SHIFT
  - Handle hmm_vma_handle_absent_pmd()
v2:
  - No change
---
 include/linux/hmm.h | 4 +++-
 mm/hmm.c            | 5 +++--
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index db75ffc949a7..d8756c341620 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -23,6 +23,7 @@ struct mmu_interval_notifier;
  * HMM_PFN_WRITE - if the page memory can be written to (requires HMM_PFN_VALID)
  * HMM_PFN_ERROR - accessing the pfn is impossible and the device should
  *                 fail. ie poisoned memory, special pages, no vma, etc
+ * HMM_PFN_DEVICE_PRIVATE - the pfn field contains a DEVICE_PRIVATE pfn.
  * HMM_PFN_P2PDMA - P2P page
  * HMM_PFN_P2PDMA_BUS - Bus mapped P2P transfer
  * HMM_PFN_DMA_MAPPED - Flag preserved on input-to-output transformation
@@ -40,6 +41,7 @@ enum hmm_pfn_flags {
 	HMM_PFN_VALID = 1UL << (BITS_PER_LONG - 1),
 	HMM_PFN_WRITE = 1UL << (BITS_PER_LONG - 2),
 	HMM_PFN_ERROR = 1UL << (BITS_PER_LONG - 3),
+	HMM_PFN_DEVICE_PRIVATE = 1UL << (BITS_PER_LONG - 7),
 	/*
 	 * Sticky flags, carried from input to output,
 	 * don't forget to update HMM_PFN_INOUT_FLAGS
@@ -48,7 +50,7 @@ enum hmm_pfn_flags {
 	HMM_PFN_P2PDMA     = 1UL << (BITS_PER_LONG - 5),
 	HMM_PFN_P2PDMA_BUS = 1UL << (BITS_PER_LONG - 6),
 
-	HMM_PFN_ORDER_SHIFT = (BITS_PER_LONG - 11),
+	HMM_PFN_ORDER_SHIFT = (BITS_PER_LONG - 12),
 
 	/* Input flags */
 	HMM_PFN_REQ_FAULT = HMM_PFN_VALID,
diff --git a/mm/hmm.c b/mm/hmm.c
index 4ec74c18bef6..14895fa6575f 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -267,7 +267,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 		if (softleaf_is_device_private(entry) &&
 		    page_pgmap(softleaf_to_page(entry))->owner ==
 		    range->dev_private_owner) {
-			cpu_flags = HMM_PFN_VALID;
+			cpu_flags = HMM_PFN_VALID | HMM_PFN_DEVICE_PRIVATE;
 			if (softleaf_is_device_private_write(entry))
 				cpu_flags |= HMM_PFN_WRITE;
 			new_pfn_flags = softleaf_to_pfn(entry) | cpu_flags;
@@ -347,7 +347,8 @@ static int hmm_vma_handle_absent_pmd(struct mm_walk *walk, unsigned long start,
 	    softleaf_to_folio(entry)->pgmap->owner ==
 	    range->dev_private_owner) {
 		unsigned long cpu_flags = HMM_PFN_VALID |
-			hmm_pfn_flags_order(PMD_SHIFT - PAGE_SHIFT);
+			hmm_pfn_flags_order(PMD_SHIFT - PAGE_SHIFT) |
+			HMM_PFN_DEVICE_PRIVATE;
 		unsigned long pfn = softleaf_to_pfn(entry);
 		unsigned long i;
 
-- 
2.34.1




* [PATCH v2 11/11] mm: Remove device private pages from the physical address space
  2026-01-07  9:18 [PATCH v2 00/11] Remove device private pages from physical address space Jordan Niethe
                   ` (9 preceding siblings ...)
  2026-01-07  9:18 ` [PATCH v2 10/11] mm/hmm: Add flag to track device private pages Jordan Niethe
@ 2026-01-07  9:18 ` Jordan Niethe
  2026-01-07 18:36 ` [PATCH v2 00/11] Remove device private pages from " Matthew Brost
  2026-01-07 20:06 ` Andrew Morton
  12 siblings, 0 replies; 33+ messages in thread
From: Jordan Niethe @ 2026-01-07  9:18 UTC (permalink / raw)
  To: linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

Currently, when creating device private struct pages, the first
step is to use request_free_mem_region() to get a range of physical
address space large enough to represent the device's memory. This
allocated physical address range is then remapped as device private
memory using memremap_pages().

Needing allocation of physical address space has some problems:

  1) There may be insufficient physical address space to represent the
     device memory. KASLR reducing the physical address space and VM
     configurations with limited physical address space increase the
     likelihood of hitting this, especially as device memory increases. This
     has been observed to prevent device private memory from being initialized.

  2) Attempting to add the device private pages to the linear map at
     addresses beyond the actual physical memory causes issues on
     architectures like aarch64, meaning the feature does not work there.

Instead of using the physical address space, introduce a device private
address space and allocate device regions from there to represent the
device private pages.

Introduce a new interface memremap_device_private_pagemap() that
allocates a requested amount of device private address space and creates
the necessary device private pages.

To support this new interface, struct dev_pagemap needs some changes:

  - Add a new dev_pagemap::nr_pages field as an input parameter.
  - Add a new dev_pagemap::pages array to store the device
    private pages.

When using memremap_device_private_pagemap(), rather than passing in
dev_pagemap::ranges[dev_pagemap::nr_range] of physical address space to
be remapped, dev_pagemap::nr_range will always be 1, and the reserved
device private range is returned in dev_pagemap::range.

Forbid calling memremap_pages() with dev_pagemap::type =
MEMORY_DEVICE_PRIVATE.
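
A minimal driver-side sketch of the new flow, mirroring the
Documentation/mm/hmm.rst update below. It assumes the call-site convention of
returning 0 on success and a negative errno on failure; example_devmem_init()
and its ops argument are placeholders:

    /* Sketch: allocate device private address space instead of a mem region. */
    static int example_devmem_init(struct dev_pagemap *pgmap,
                                   const struct dev_pagemap_ops *ops,
                                   unsigned long nr_pages)
    {
            int ret;

            pgmap->type = MEMORY_DEVICE_PRIVATE;
            pgmap->nr_pages = nr_pages;     /* input */
            pgmap->nr_range = 1;
            pgmap->ops = ops;

            ret = memremap_device_private_pagemap(pgmap, numa_node_id());
            if (ret)
                    return ret;

            /* The reserved device private range is returned in pgmap->range. */
            pr_debug("device private range [%llx-%llx]\n",
                     (unsigned long long)pgmap->range.start,
                     (unsigned long long)pgmap->range.end);
            return 0;
    }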

Represent this device private address space using a new
device_private_pgmap_tree maple tree. This tree maps a given device
private address to a struct dev_pagemap, where a specific device private
page may then be looked up in that dev_pagemap::pages array.
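
The lookup described above works roughly as in the following sketch; this is
purely illustrative since the mm/memremap.c implementation is not shown here,
and it assumes the maple tree is indexed by device private page offset:

    /* Sketch only: offset -> dev_pagemap via the maple tree, then its pages[]. */
    static struct page *example_offset_to_page(unsigned long offset)
    {
            struct dev_pagemap *pgmap;

            pgmap = mtree_load(&device_private_pgmap_tree, offset);
            if (!pgmap)
                    return NULL;

            return &pgmap->pages[offset - (pgmap->range.start >> PAGE_SHIFT)];
    }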

Device private address space can be reclaimed and the associated device
private pages freed using the corresponding new
memunmap_device_private_pagemap() interface.

Because the device private pages now live outside the physical address
space, they no longer have a normal PFN. This means that page_to_pfn(),
et al. are no longer meaningful.

Introduce helpers:

  - device_private_page_to_offset()
  - device_private_folio_to_offset()

to take a given device private page / folio and return its offset within
the device private address space.

Update the places where we previously converted a device private page to
a PFN to use these new helpers. When we encounter a device private
offset, use device_private_offset_to_page() to look up its page rather
than the normal page map.
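
As an illustrative sketch, the conversion helpers are expected to round trip
for device private pages, replacing the page_to_pfn()/pfn_to_page() pairing:

    /* Sketch: offset round trip instead of page_to_pfn()/pfn_to_page(). */
    static void example_round_trip(struct page *page)
    {
            pgoff_t offset;

            if (!is_device_private_page(page))
                    return;

            offset = device_private_page_to_offset(page);
            WARN_ON(device_private_offset_to_page(offset) != page);
    }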

Update the existing users:

 - lib/test_hmm.c
 - ppc ultravisor
 - drm/amd/amdkfd
 - gpu/drm/xe
 - gpu/drm/nouveau

to use the new memremap_device_private_pagemap() interface.

Signed-off-by: Jordan Niethe <jniethe@nvidia.com>
Signed-off-by: Alistair Popple <apopple@nvidia.com>

---

NOTE: The updates to the existing drivers have only been compile tested.
I'll need some help in testing these drivers.

v1:
- Include NUMA node parameter for memremap_device_private_pagemap()
- Add devm_memremap_device_private_pagemap() and friends
- Update existing users of memremap_pages():
    - ppc ultravisor
    - drm/amd/amdkfd
    - gpu/drm/xe
    - gpu/drm/nouveau
- Update for HMM huge page support
- Guard device_private_offset_to_page and friends with CONFIG_ZONE_DEVICE

v2:
- Make sure last member of struct dev_pagemap remains DECLARE_FLEX_ARRAY(struct range, ranges);
---
 Documentation/mm/hmm.rst                 |  11 +-
 arch/powerpc/kvm/book3s_hv_uvmem.c       |  41 ++---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  23 +--
 drivers/gpu/drm/nouveau/nouveau_dmem.c   |  35 ++--
 drivers/gpu/drm/xe/xe_svm.c              |  28 +---
 include/linux/hmm.h                      |   3 +
 include/linux/leafops.h                  |  16 +-
 include/linux/memremap.h                 |  64 +++++++-
 include/linux/migrate.h                  |   6 +-
 include/linux/mm.h                       |   2 +
 include/linux/rmap.h                     |   5 +-
 include/linux/swapops.h                  |  10 +-
 lib/test_hmm.c                           |  69 ++++----
 mm/debug.c                               |   9 +-
 mm/memremap.c                            | 193 ++++++++++++++++++-----
 mm/mm_init.c                             |   8 +-
 mm/page_vma_mapped.c                     |  19 ++-
 mm/rmap.c                                |  43 +++--
 mm/util.c                                |   5 +-
 19 files changed, 391 insertions(+), 199 deletions(-)

diff --git a/Documentation/mm/hmm.rst b/Documentation/mm/hmm.rst
index 7d61b7a8b65b..27067a6a2408 100644
--- a/Documentation/mm/hmm.rst
+++ b/Documentation/mm/hmm.rst
@@ -276,17 +276,12 @@ These can be allocated and freed with::
     struct resource *res;
     struct dev_pagemap pagemap;
 
-    res = request_free_mem_region(&iomem_resource, /* number of bytes */,
-                                  "name of driver resource");
     pagemap.type = MEMORY_DEVICE_PRIVATE;
-    pagemap.range.start = res->start;
-    pagemap.range.end = res->end;
-    pagemap.nr_range = 1;
+    pagemap.nr_pages = /* number of pages */;
     pagemap.ops = &device_devmem_ops;
-    memremap_pages(&pagemap, numa_node_id());
+    memremap_device_private_pagemap(&pagemap, numa_node_id());
 
-    memunmap_pages(&pagemap);
-    release_mem_region(pagemap.range.start, range_len(&pagemap.range));
+    memunmap_device_private_pagemap(&pagemap);
 
 There are also devm_request_free_mem_region(), devm_memremap_pages(),
 devm_memunmap_pages(), and devm_release_mem_region() when the resources can
diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 67910900af7b..948747db8231 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -636,7 +636,7 @@ void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *slot,
 		mutex_lock(&kvm->arch.uvmem_lock);
 
 		if (kvmppc_gfn_is_uvmem_pfn(gfn, kvm, &uvmem_pfn)) {
-			uvmem_page = pfn_to_page(uvmem_pfn);
+			uvmem_page = device_private_offset_to_page(uvmem_pfn);
 			pvt = uvmem_page->zone_device_data;
 			pvt->skip_page_out = skip_page_out;
 			pvt->remove_gfn = true;
@@ -721,7 +721,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
 	pvt->gpa = gpa;
 	pvt->kvm = kvm;
 
-	dpage = pfn_to_page(uvmem_pfn);
+	dpage = device_private_offset_to_page(uvmem_pfn);
 	dpage->zone_device_data = pvt;
 	zone_device_page_init(dpage, 0);
 	return dpage;
@@ -888,7 +888,7 @@ static unsigned long kvmppc_share_page(struct kvm *kvm, unsigned long gpa,
 	srcu_idx = srcu_read_lock(&kvm->srcu);
 	mutex_lock(&kvm->arch.uvmem_lock);
 	if (kvmppc_gfn_is_uvmem_pfn(gfn, kvm, &uvmem_pfn)) {
-		uvmem_page = pfn_to_page(uvmem_pfn);
+		uvmem_page = device_private_offset_to_page(uvmem_pfn);
 		pvt = uvmem_page->zone_device_data;
 		pvt->skip_page_out = true;
 		/*
@@ -906,7 +906,7 @@ static unsigned long kvmppc_share_page(struct kvm *kvm, unsigned long gpa,
 
 	mutex_lock(&kvm->arch.uvmem_lock);
 	if (kvmppc_gfn_is_uvmem_pfn(gfn, kvm, &uvmem_pfn)) {
-		uvmem_page = pfn_to_page(uvmem_pfn);
+		uvmem_page = device_private_offset_to_page(uvmem_pfn);
 		pvt = uvmem_page->zone_device_data;
 		pvt->skip_page_out = true;
 		pvt->remove_gfn = false; /* it continues to be a valid GFN */
@@ -1017,7 +1017,7 @@ static vm_fault_t kvmppc_uvmem_migrate_to_ram(struct vm_fault *vmf)
 static void kvmppc_uvmem_folio_free(struct folio *folio)
 {
 	struct page *page = &folio->page;
-	unsigned long pfn = page_to_pfn(page) -
+	unsigned long pfn = device_private_page_to_offset(page) -
 			(kvmppc_uvmem_pgmap.range.start >> PAGE_SHIFT);
 	struct kvmppc_uvmem_page_pvt *pvt;
 
@@ -1159,8 +1159,6 @@ int kvmppc_uvmem_init(void)
 {
 	int ret = 0;
 	unsigned long size;
-	struct resource *res;
-	void *addr;
 	unsigned long pfn_last, pfn_first;
 
 	size = kvmppc_get_secmem_size();
@@ -1174,27 +1172,18 @@ int kvmppc_uvmem_init(void)
 		goto out;
 	}
 
-	res = request_free_mem_region(&iomem_resource, size, "kvmppc_uvmem");
-	if (IS_ERR(res)) {
-		ret = PTR_ERR(res);
-		goto out;
-	}
-
 	kvmppc_uvmem_pgmap.type = MEMORY_DEVICE_PRIVATE;
-	kvmppc_uvmem_pgmap.range.start = res->start;
-	kvmppc_uvmem_pgmap.range.end = res->end;
 	kvmppc_uvmem_pgmap.nr_range = 1;
+	kvmppc_uvmem_pgmap.nr_pages = size / PAGE_SIZE;
 	kvmppc_uvmem_pgmap.ops = &kvmppc_uvmem_ops;
 	/* just one global instance: */
 	kvmppc_uvmem_pgmap.owner = &kvmppc_uvmem_pgmap;
-	addr = memremap_pages(&kvmppc_uvmem_pgmap, NUMA_NO_NODE);
-	if (IS_ERR(addr)) {
-		ret = PTR_ERR(addr);
-		goto out_free_region;
-	}
+	ret = memremap_device_private_pagemap(&kvmppc_uvmem_pgmap, NUMA_NO_NODE);
+	if (ret)
+		goto out;
 
-	pfn_first = res->start >> PAGE_SHIFT;
-	pfn_last = pfn_first + (resource_size(res) >> PAGE_SHIFT);
+	pfn_first = kvmppc_uvmem_pgmap.range.start >> PAGE_SHIFT;
+	pfn_last = pfn_first + (range_len(&kvmppc_uvmem_pgmap.range) >> PAGE_SHIFT);
 	kvmppc_uvmem_bitmap = bitmap_zalloc(pfn_last - pfn_first, GFP_KERNEL);
 	if (!kvmppc_uvmem_bitmap) {
 		ret = -ENOMEM;
@@ -1204,9 +1193,7 @@ int kvmppc_uvmem_init(void)
 	pr_info("KVMPPC-UVMEM: Secure Memory size 0x%lx\n", size);
 	return ret;
 out_unmap:
-	memunmap_pages(&kvmppc_uvmem_pgmap);
-out_free_region:
-	release_mem_region(res->start, size);
+	memunmap_device_private_pagemap(&kvmppc_uvmem_pgmap);
 out:
 	return ret;
 }
@@ -1216,8 +1203,6 @@ void kvmppc_uvmem_free(void)
 	if (!kvmppc_uvmem_bitmap)
 		return;
 
-	memunmap_pages(&kvmppc_uvmem_pgmap);
-	release_mem_region(kvmppc_uvmem_pgmap.range.start,
-			   range_len(&kvmppc_uvmem_pgmap.range));
+	memunmap_device_private_pagemap(&kvmppc_uvmem_pgmap);
 	bitmap_free(kvmppc_uvmem_bitmap);
 }
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index 1a07a8b92e8f..1e7768b91d9f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -1024,9 +1024,9 @@ int kgd2kfd_init_zone_device(struct amdgpu_device *adev)
 {
 	struct amdgpu_kfd_dev *kfddev = &adev->kfd;
 	struct dev_pagemap *pgmap;
-	struct resource *res = NULL;
 	unsigned long size;
 	void *r;
+	int ret;
 
 	/* Page migration works on gfx9 or newer */
 	if (amdgpu_ip_version(adev, GC_HWIP, 0) < IP_VERSION(9, 0, 1))
@@ -1047,11 +1047,7 @@ int kgd2kfd_init_zone_device(struct amdgpu_device *adev)
 		pgmap->range.end = adev->gmc.aper_base + adev->gmc.aper_size - 1;
 		pgmap->type = MEMORY_DEVICE_COHERENT;
 	} else {
-		res = devm_request_free_mem_region(adev->dev, &iomem_resource, size);
-		if (IS_ERR(res))
-			return PTR_ERR(res);
-		pgmap->range.start = res->start;
-		pgmap->range.end = res->end;
+		pgmap->nr_pages = size / PAGE_SIZE;
 		pgmap->type = MEMORY_DEVICE_PRIVATE;
 	}
 
@@ -1062,14 +1058,19 @@ int kgd2kfd_init_zone_device(struct amdgpu_device *adev)
 	/* Device manager releases device-specific resources, memory region and
 	 * pgmap when driver disconnects from device.
 	 */
-	r = devm_memremap_pages(adev->dev, pgmap);
-	if (IS_ERR(r)) {
+	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
+		ret = devm_memremap_device_private_pagemap(adev->dev, pgmap);
+	} else {
+		r = devm_memremap_pages(adev->dev, pgmap);
+		if (IS_ERR(r))
+			ret = PTR_ERR(r);
+	}
+
+	if (ret) {
 		pr_err("failed to register HMM device memory\n");
-		if (pgmap->type == MEMORY_DEVICE_PRIVATE)
-			devm_release_mem_region(adev->dev, res->start, resource_size(res));
 		/* Disable SVM support capability */
 		pgmap->type = 0;
-		return PTR_ERR(r);
+		return ret;
 	}
 
 	pr_debug("reserve %ldMB system memory for VRAM pages struct\n",
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index adfa3df5cbc5..37fe1cfba414 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -109,7 +109,7 @@ static struct nouveau_drm *page_to_drm(struct page *page)
 unsigned long nouveau_dmem_page_addr(struct page *page)
 {
 	struct nouveau_dmem_chunk *chunk = nouveau_page_to_chunk(page);
-	unsigned long off = (page_to_pfn(page) << PAGE_SHIFT) -
+	unsigned long off = (device_private_page_to_offset(page) << PAGE_SHIFT) -
 				chunk->pagemap.range.start;
 
 	return chunk->bo->offset + off;
@@ -297,9 +297,7 @@ nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage,
 			 bool is_large)
 {
 	struct nouveau_dmem_chunk *chunk;
-	struct resource *res;
 	struct page *page;
-	void *ptr;
 	unsigned long i, pfn_first, pfn;
 	int ret;
 
@@ -309,39 +307,28 @@ nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage,
 		goto out;
 	}
 
-	/* Allocate unused physical address space for device private pages. */
-	res = request_free_mem_region(&iomem_resource, DMEM_CHUNK_SIZE * NR_CHUNKS,
-				      "nouveau_dmem");
-	if (IS_ERR(res)) {
-		ret = PTR_ERR(res);
-		goto out_free;
-	}
-
 	chunk->drm = drm;
 	chunk->pagemap.type = MEMORY_DEVICE_PRIVATE;
-	chunk->pagemap.range.start = res->start;
-	chunk->pagemap.range.end = res->end;
 	chunk->pagemap.nr_range = 1;
+	chunk->pagemap.nr_pages = DMEM_CHUNK_SIZE * NR_CHUNKS / PAGE_SIZE;
 	chunk->pagemap.ops = &nouveau_dmem_pagemap_ops;
 	chunk->pagemap.owner = drm->dev;
 
 	ret = nouveau_bo_new_pin(&drm->client, NOUVEAU_GEM_DOMAIN_VRAM, DMEM_CHUNK_SIZE,
 				 &chunk->bo);
 	if (ret)
-		goto out_release;
+		goto out_free;
 
-	ptr = memremap_pages(&chunk->pagemap, numa_node_id());
-	if (IS_ERR(ptr)) {
-		ret = PTR_ERR(ptr);
+	ret = memremap_device_private_pagemap(&chunk->pagemap, numa_node_id());
+	if (ret)
 		goto out_bo_free;
-	}
 
 	mutex_lock(&drm->dmem->mutex);
 	list_add(&chunk->list, &drm->dmem->chunks);
 	mutex_unlock(&drm->dmem->mutex);
 
 	pfn_first = chunk->pagemap.range.start >> PAGE_SHIFT;
-	page = pfn_to_page(pfn_first);
+	page = device_private_offset_to_page(pfn_first);
 	spin_lock(&drm->dmem->lock);
 
 	pfn = pfn_first;
@@ -350,12 +337,12 @@ nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage,
 
 		if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) || !is_large) {
 			for (j = 0; j < DMEM_CHUNK_NPAGES - 1; j++, pfn++) {
-				page = pfn_to_page(pfn);
+				page = device_private_offset_to_page(pfn);
 				page->zone_device_data = drm->dmem->free_pages;
 				drm->dmem->free_pages = page;
 			}
 		} else {
-			page = pfn_to_page(pfn);
+			page = device_private_offset_to_page(pfn);
 			page->zone_device_data = drm->dmem->free_folios;
 			drm->dmem->free_folios = page_folio(page);
 			pfn += DMEM_CHUNK_NPAGES;
@@ -382,8 +369,6 @@ nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage,
 
 out_bo_free:
 	nouveau_bo_unpin_del(&chunk->bo);
-out_release:
-	release_mem_region(chunk->pagemap.range.start, range_len(&chunk->pagemap.range));
 out_free:
 	kfree(chunk);
 out:
@@ -543,9 +528,7 @@ nouveau_dmem_fini(struct nouveau_drm *drm)
 		nouveau_bo_unpin_del(&chunk->bo);
 		WARN_ON(chunk->callocated);
 		list_del(&chunk->list);
-		memunmap_pages(&chunk->pagemap);
-		release_mem_region(chunk->pagemap.range.start,
-				   range_len(&chunk->pagemap.range));
+		memunmap_device_private_pagemap(&chunk->pagemap);
 		kfree(chunk);
 	}
 
diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
index f82790d7e7e6..cce7b4fc5db9 100644
--- a/drivers/gpu/drm/xe/xe_svm.c
+++ b/drivers/gpu/drm/xe/xe_svm.c
@@ -404,7 +404,7 @@ static u64 xe_vram_region_page_to_dpa(struct xe_vram_region *vr,
 				      struct page *page)
 {
 	u64 dpa;
-	u64 pfn = page_to_pfn(page);
+	u64 pfn = device_private_page_to_offset(page);
 	u64 offset;
 
 	xe_assert(vr->xe, is_device_private_page(page));
@@ -1471,39 +1471,27 @@ int xe_devm_add(struct xe_tile *tile, struct xe_vram_region *vr)
 {
 	struct xe_device *xe = tile_to_xe(tile);
 	struct device *dev = &to_pci_dev(xe->drm.dev)->dev;
-	struct resource *res;
-	void *addr;
 	int ret;
 
-	res = devm_request_free_mem_region(dev, &iomem_resource,
-					   vr->usable_size);
-	if (IS_ERR(res)) {
-		ret = PTR_ERR(res);
-		return ret;
-	}
-
 	vr->pagemap.type = MEMORY_DEVICE_PRIVATE;
-	vr->pagemap.range.start = res->start;
-	vr->pagemap.range.end = res->end;
 	vr->pagemap.nr_range = 1;
+	vr->pagemap.nr_pages = vr->usable_size / PAGE_SIZE;
 	vr->pagemap.ops = drm_pagemap_pagemap_ops_get();
 	vr->pagemap.owner = xe_svm_devm_owner(xe);
-	addr = devm_memremap_pages(dev, &vr->pagemap);
+	ret = devm_memremap_device_private_pagemap(dev, &vr->pagemap);
 
 	vr->dpagemap.dev = dev;
 	vr->dpagemap.ops = &xe_drm_pagemap_ops;
 
-	if (IS_ERR(addr)) {
-		devm_release_mem_region(dev, res->start, resource_size(res));
-		ret = PTR_ERR(addr);
-		drm_err(&xe->drm, "Failed to remap tile %d memory, errno %pe\n",
-			tile->id, ERR_PTR(ret));
+	if (ret) {
+		drm_err(&xe->drm, "Failed to remap tile %d memory, errno %d\n",
+			tile->id, ret);
 		return ret;
 	}
-	vr->hpa_base = res->start;
+	vr->hpa_base = vr->pagemap.range.start;
 
 	drm_dbg(&xe->drm, "Added tile %d memory [%llx-%llx] to devm, remapped to %pr\n",
-		tile->id, vr->io_start, vr->io_start + vr->usable_size, res);
+		tile->id, vr->io_start, vr->io_start + vr->usable_size, &vr->pagemap.range);
 	return 0;
 }
 #else
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index d8756c341620..25bb4df298f7 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -68,6 +68,9 @@ enum hmm_pfn_flags {
  */
 static inline struct page *hmm_pfn_to_page(unsigned long hmm_pfn)
 {
+	if (hmm_pfn & HMM_PFN_DEVICE_PRIVATE)
+		return device_private_offset_to_page(hmm_pfn & ~HMM_PFN_FLAGS);
+
 	return pfn_to_page(hmm_pfn & ~HMM_PFN_FLAGS);
 }
 
diff --git a/include/linux/leafops.h b/include/linux/leafops.h
index 2fa09ffe9e34..5c315af273a5 100644
--- a/include/linux/leafops.h
+++ b/include/linux/leafops.h
@@ -455,7 +455,13 @@ static inline unsigned long softleaf_to_flags(softleaf_t entry)
  */
 static inline struct page *softleaf_to_page(softleaf_t entry)
 {
-	struct page *page = pfn_to_page(softleaf_to_pfn(entry));
+	struct page *page;
+
+	if (softleaf_is_migration_device_private(entry) ||
+	    softleaf_is_device_private(entry))
+		page = device_private_entry_to_page(entry);
+	else
+		page = pfn_to_page(softleaf_to_pfn(entry));
 
 	VM_WARN_ON_ONCE(!softleaf_has_pfn(entry));
 	/*
@@ -475,7 +481,13 @@ static inline struct page *softleaf_to_page(softleaf_t entry)
  */
 static inline struct folio *softleaf_to_folio(softleaf_t entry)
 {
-	struct folio *folio = pfn_folio(softleaf_to_pfn(entry));
+	struct folio *folio;
+
+	if (softleaf_is_migration_device_private(entry) ||
+	    softleaf_is_device_private(entry))
+		folio = page_folio(device_private_entry_to_page(entry));
+	else
+		folio = pfn_folio(softleaf_to_pfn(entry));
 
 	VM_WARN_ON_ONCE(!softleaf_has_pfn(entry));
 	/*
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 713ec0435b48..7fad53f0f6ba 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -37,6 +37,7 @@ struct vmem_altmap {
  * backing the device memory. Doing so simplifies the implementation, but it is
  * important to remember that there are certain points at which the struct page
  * must be treated as an opaque object, rather than a "normal" struct page.
+ * Unlike "normal" struct pages, page_to_pfn() is not valid for them.
  *
  * A more complete discussion of unaddressable memory may be found in
  * include/linux/hmm.h and Documentation/mm/hmm.rst.
@@ -126,8 +127,12 @@ struct dev_pagemap_ops {
  * @owner: an opaque pointer identifying the entity that manages this
  *	instance.  Used by various helpers to make sure that no
  *	foreign ZONE_DEVICE memory is accessed.
- * @nr_range: number of ranges to be mapped
- * @range: range to be mapped when nr_range == 1
+ * @nr_pages: number of pages requested to be mapped for MEMORY_DEVICE_PRIVATE.
+ * @pages: array of nr_pages initialized for MEMORY_DEVICE_PRIVATE.
+ * @nr_range: number of ranges to be mapped. Always == 1 for
+ *	MEMORY_DEVICE_PRIVATE.
+ * @range: range to be mapped when nr_range == 1. Used as an output param for
+ *	MEMORY_DEVICE_PRIVATE.
  * @ranges: array of ranges to be mapped when nr_range > 1
  */
 struct dev_pagemap {
@@ -139,6 +144,8 @@ struct dev_pagemap {
 	unsigned long vmemmap_shift;
 	const struct dev_pagemap_ops *ops;
 	void *owner;
+	unsigned long nr_pages;
+	struct page *pages;
 	int nr_range;
 	union {
 		struct range range;
@@ -224,7 +231,14 @@ static inline bool is_fsdax_page(const struct page *page)
 }
 
 #ifdef CONFIG_ZONE_DEVICE
+void __init_zone_device_page(struct page *page, unsigned long pfn,
+	unsigned long zone_idx, int nid,
+	struct dev_pagemap *pgmap);
 void zone_device_page_init(struct page *page, unsigned int order);
+unsigned long memremap_device_private_pagemap(struct dev_pagemap *pgmap, int nid);
+void memunmap_device_private_pagemap(struct dev_pagemap *pgmap);
+int devm_memremap_device_private_pagemap(struct device *dev, struct dev_pagemap *pgmap);
+void devm_memunmap_device_private_pagemap(struct device *dev, struct dev_pagemap *pgmap);
 void *memremap_pages(struct dev_pagemap *pgmap, int nid);
 void memunmap_pages(struct dev_pagemap *pgmap);
 void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap);
@@ -234,6 +248,15 @@ bool pgmap_pfn_valid(struct dev_pagemap *pgmap, unsigned long pfn);
 
 unsigned long memremap_compat_align(void);
 
+struct page *device_private_offset_to_page(unsigned long offset);
+struct page *device_private_entry_to_page(softleaf_t entry);
+pgoff_t device_private_page_to_offset(const struct page *page);
+
+static inline pgoff_t device_private_folio_to_offset(const struct folio *folio)
+{
+	return device_private_page_to_offset((const struct page *)&folio->page);
+}
+
 static inline void zone_device_folio_init(struct folio *folio, unsigned int order)
 {
 	zone_device_page_init(&folio->page, order);
@@ -276,6 +299,23 @@ static inline void devm_memunmap_pages(struct device *dev,
 {
 }
 
+static inline int devm_memremap_device_private_pagemap(struct device *dev,
+		struct dev_pagemap *pgmap)
+{
+	/*
+	 * Fail attempts to call devm_memremap_device_private_pagemap() without
+	 * ZONE_DEVICE support enabled, this requires callers to fall
+	 * back to plain devm_memremap() based on config
+	 */
+	WARN_ON_ONCE(1);
+	return -ENXIO;
+}
+
+static inline void devm_memunmap_device_private_pagemap(struct device *dev,
+		struct dev_pagemap *pgmap)
+{
+}
+
 static inline struct dev_pagemap *get_dev_pagemap(unsigned long pfn)
 {
 	return NULL;
@@ -296,6 +336,26 @@ static inline void zone_device_private_split_cb(struct folio *original_folio,
 						struct folio *new_folio)
 {
 }
+
+static inline struct page *device_private_offset_to_page(unsigned long offset)
+{
+	return NULL;
+}
+
+static inline struct page *device_private_entry_to_page(softleaf_t entry)
+{
+	return NULL;
+}
+
+static inline pgoff_t device_private_page_to_offset(const struct page *page)
+{
+	return 0;
+}
+
+static inline pgoff_t device_private_folio_to_offset(const struct folio *folio)
+{
+	return 0;
+}
 #endif /* CONFIG_ZONE_DEVICE */
 
 static inline void put_dev_pagemap(struct dev_pagemap *pgmap)
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 5fd2ee080bc0..2921b3abddf3 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -133,6 +133,10 @@ static inline struct page *migrate_pfn_to_page(unsigned long mpfn)
 {
 	if (!(mpfn & MIGRATE_PFN_VALID))
 		return NULL;
+
+	if (mpfn & MIGRATE_PFN_DEVICE_PRIVATE)
+		return device_private_offset_to_page(mpfn >> MIGRATE_PFN_SHIFT);
+
 	return pfn_to_page(mpfn >> MIGRATE_PFN_SHIFT);
 }
 
@@ -144,7 +148,7 @@ static inline unsigned long migrate_pfn(unsigned long pfn)
 static inline unsigned long migrate_pfn_from_page(struct page *page)
 {
 	if (is_device_private_page(page))
-		return migrate_pfn(page_to_pfn(page)) |
+		return migrate_pfn(device_private_page_to_offset(page)) |
 		       MIGRATE_PFN_DEVICE_PRIVATE;
 	return migrate_pfn(page_to_pfn(page));
 }
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e65329e1969f..b36599ab41ba 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2038,6 +2038,8 @@ static inline unsigned long memdesc_section(memdesc_flags_t mdf)
  */
 static inline unsigned long folio_pfn(const struct folio *folio)
 {
+	VM_BUG_ON(folio_is_device_private(folio));
+
 	return page_to_pfn(&folio->page);
 }
 
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 57c63b6a8f65..c1561a92864f 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -951,7 +951,7 @@ static inline unsigned long page_vma_walk_pfn(unsigned long pfn)
 static inline unsigned long folio_page_vma_walk_pfn(const struct folio *folio)
 {
 	if (folio_is_device_private(folio))
-		return page_vma_walk_pfn(folio_pfn(folio)) |
+		return page_vma_walk_pfn(device_private_folio_to_offset(folio)) |
 		       PVMW_PFN_DEVICE_PRIVATE;
 
 	return page_vma_walk_pfn(folio_pfn(folio));
@@ -959,6 +959,9 @@ static inline unsigned long folio_page_vma_walk_pfn(const struct folio *folio)
 
 static inline struct page *page_vma_walk_pfn_to_page(unsigned long pvmw_pfn)
 {
+	if (pvmw_pfn & PVMW_PFN_DEVICE_PRIVATE)
+		return device_private_offset_to_page(pvmw_pfn >> PVMW_PFN_SHIFT);
+
 	return pfn_to_page(pvmw_pfn >> PVMW_PFN_SHIFT);
 }
 
diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index f7d85a451a2b..def1d079d69b 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -141,7 +141,7 @@ static inline swp_entry_t make_readable_device_private_entry(pgoff_t offset)
 static inline swp_entry_t make_readable_device_private_entry_from_page(struct page *page,
 								       pgoff_t flags)
 {
-	return swp_entry(SWP_DEVICE_READ, page_to_pfn(page) | flags);
+	return swp_entry(SWP_DEVICE_READ, device_private_page_to_offset(page) | flags);
 }
 
 static inline swp_entry_t make_writable_device_private_entry(pgoff_t offset)
@@ -152,7 +152,7 @@ static inline swp_entry_t make_writable_device_private_entry(pgoff_t offset)
 static inline swp_entry_t make_writable_device_private_entry_from_page(struct page *page,
 								       pgoff_t flags)
 {
-	return swp_entry(SWP_DEVICE_WRITE, page_to_pfn(page) | flags);
+	return swp_entry(SWP_DEVICE_WRITE, device_private_page_to_offset(page) | flags);
 }
 
 static inline swp_entry_t make_device_exclusive_entry(pgoff_t offset)
@@ -268,7 +268,7 @@ static inline swp_entry_t make_readable_migration_entry_from_page(struct page *p
 {
 	if (is_device_private_page(page))
 		return make_readable_migration_device_private_entry(
-				page_to_pfn(page) | flags);
+				device_private_page_to_offset(page) | flags);
 
 	return swp_entry(SWP_MIGRATION_READ, page_to_pfn(page) | flags);
 }
@@ -283,7 +283,7 @@ static inline swp_entry_t make_readable_exclusive_migration_entry_from_page(stru
 {
 	if (is_device_private_page(page))
 		return make_readable_exclusive_migration_device_private_entry(
-				page_to_pfn(page) | flags);
+				device_private_page_to_offset(page) | flags);
 
 	return swp_entry(SWP_MIGRATION_READ_EXCLUSIVE, page_to_pfn(page) | flags);
 }
@@ -298,7 +298,7 @@ static inline swp_entry_t make_writable_migration_entry_from_page(struct page *p
 {
 	if (is_device_private_page(page))
 		return make_writable_migration_device_private_entry(
-				page_to_pfn(page) | flags);
+				device_private_page_to_offset(page) | flags);
 
 	return swp_entry(SWP_MIGRATION_WRITE, page_to_pfn(page) | flags);
 }
diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 872d3846af7b..b6e20041e448 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -497,7 +497,7 @@ static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
 				  struct page **ppage, bool is_large)
 {
 	struct dmirror_chunk *devmem;
-	struct resource *res = NULL;
+	bool device_private = false;
 	unsigned long pfn;
 	unsigned long pfn_first;
 	unsigned long pfn_last;
@@ -510,13 +510,9 @@ static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
 
 	switch (mdevice->zone_device_type) {
 	case HMM_DMIRROR_MEMORY_DEVICE_PRIVATE:
-		res = request_free_mem_region(&iomem_resource, DEVMEM_CHUNK_SIZE,
-					      "hmm_dmirror");
-		if (IS_ERR_OR_NULL(res))
-			goto err_devmem;
-		devmem->pagemap.range.start = res->start;
-		devmem->pagemap.range.end = res->end;
+		device_private = true;
 		devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
+		devmem->pagemap.nr_pages = DEVMEM_CHUNK_SIZE / PAGE_SIZE;
 		break;
 	case HMM_DMIRROR_MEMORY_DEVICE_COHERENT:
 		devmem->pagemap.range.start = (MINOR(mdevice->cdevice.dev) - 2) ?
@@ -525,13 +521,13 @@ static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
 		devmem->pagemap.range.end = devmem->pagemap.range.start +
 					    DEVMEM_CHUNK_SIZE - 1;
 		devmem->pagemap.type = MEMORY_DEVICE_COHERENT;
+		devmem->pagemap.nr_range = 1;
 		break;
 	default:
 		ret = -EINVAL;
 		goto err_devmem;
 	}
 
-	devmem->pagemap.nr_range = 1;
 	devmem->pagemap.ops = &dmirror_devmem_ops;
 	devmem->pagemap.owner = mdevice;
 
@@ -551,13 +547,20 @@ static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
 		mdevice->devmem_capacity = new_capacity;
 		mdevice->devmem_chunks = new_chunks;
 	}
-	ptr = memremap_pages(&devmem->pagemap, numa_node_id());
-	if (IS_ERR_OR_NULL(ptr)) {
-		if (ptr)
-			ret = PTR_ERR(ptr);
-		else
-			ret = -EFAULT;
-		goto err_release;
+
+	if (device_private) {
+		ret = memremap_device_private_pagemap(&devmem->pagemap, numa_node_id());
+		if (ret)
+			goto err_release;
+	} else {
+		ptr = memremap_pages(&devmem->pagemap, numa_node_id());
+		if (IS_ERR_OR_NULL(ptr)) {
+			if (ptr)
+				ret = PTR_ERR(ptr);
+			else
+				ret = -EFAULT;
+			goto err_release;
+		}
 	}
 
 	devmem->mdevice = mdevice;
@@ -567,15 +570,21 @@ static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
 
 	mutex_unlock(&mdevice->devmem_lock);
 
-	pr_info("added new %u MB chunk (total %u chunks, %u MB) PFNs [0x%lx 0x%lx)\n",
+	pr_info("added new %u MB chunk (total %u chunks, %u MB) %sPFNs [0x%lx 0x%lx)\n",
 		DEVMEM_CHUNK_SIZE / (1024 * 1024),
 		mdevice->devmem_count,
 		mdevice->devmem_count * (DEVMEM_CHUNK_SIZE / (1024 * 1024)),
+		device_private ? "device " : "",
 		pfn_first, pfn_last);
 
 	spin_lock(&mdevice->lock);
 	for (pfn = pfn_first; pfn < pfn_last; ) {
-		struct page *page = pfn_to_page(pfn);
+		struct page *page;
+
+		if (device_private)
+			page = device_private_offset_to_page(pfn);
+		else
+			page = pfn_to_page(pfn);
 
 		if (is_large && IS_ALIGNED(pfn, HPAGE_PMD_NR)
 			&& (pfn + HPAGE_PMD_NR <= pfn_last)) {
@@ -616,9 +625,6 @@ static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
 
 err_release:
 	mutex_unlock(&mdevice->devmem_lock);
-	if (res && devmem->pagemap.type == MEMORY_DEVICE_PRIVATE)
-		release_mem_region(devmem->pagemap.range.start,
-				   range_len(&devmem->pagemap.range));
 err_devmem:
 	kfree(devmem);
 
@@ -696,8 +702,8 @@ static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
 		 */
 		spage = migrate_pfn_to_page(*src);
 		if (WARN(spage && is_zone_device_page(spage),
-		     "page already in device spage pfn: 0x%lx\n",
-		     page_to_pfn(spage)))
+		     "page already in device spage mpfn: 0x%lx\n",
+		     migrate_pfn_from_page(spage)))
 			goto next;
 
 		if (dmirror->flags & HMM_DMIRROR_FLAG_FAIL_ALLOC) {
@@ -752,8 +758,9 @@ static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
 		 */
 		rpage->zone_device_data = dmirror;
 
-		pr_debug("migrating from sys to dev pfn src: 0x%lx pfn dst: 0x%lx\n",
-			 page_to_pfn(spage), page_to_pfn(dpage));
+		pr_debug("migrating from sys to dev pfn src: 0x%lx mpfn dst: 0x%lx\n",
+			 page_to_pfn(spage),
+			 migrate_pfn_from_page(dpage));
 
 		*dst = migrate_pfn_from_page(dpage) | write;
 
@@ -1462,10 +1469,10 @@ static void dmirror_device_remove_chunks(struct dmirror_device *mdevice)
 			spin_unlock(&mdevice->lock);
 
 			dmirror_device_evict_chunk(devmem);
-			memunmap_pages(&devmem->pagemap);
 			if (devmem->pagemap.type == MEMORY_DEVICE_PRIVATE)
-				release_mem_region(devmem->pagemap.range.start,
-						   range_len(&devmem->pagemap.range));
+				memunmap_device_private_pagemap(&devmem->pagemap);
+			else
+				memunmap_pages(&devmem->pagemap);
 			kfree(devmem);
 		}
 		mdevice->devmem_count = 0;
@@ -1710,7 +1717,12 @@ static void dmirror_devmem_folio_split(struct folio *head, struct folio *tail)
 		return;
 	}
 
-	offset = folio_pfn(tail) - folio_pfn(head);
+	tail->pgmap = head->pgmap;
+
+	if (folio_is_device_private(head))
+		offset = device_private_folio_to_offset(tail) - device_private_folio_to_offset(head);
+	else
+		offset = folio_pfn(tail) - folio_pfn(head);
 
 	rpage_tail = folio_page(rfolio, offset);
 	tail->page.zone_device_data = rpage_tail;
@@ -1719,7 +1731,6 @@ static void dmirror_devmem_folio_split(struct folio *head, struct folio *tail)
 	rpage_tail->mapping = NULL;
 
 	folio_page(tail, 0)->mapping = folio_page(head, 0)->mapping;
-	tail->pgmap = head->pgmap;
 	folio_set_count(page_folio(rpage_tail), 1);
 }
 
diff --git a/mm/debug.c b/mm/debug.c
index 77fa8fe1d641..04fcc62d440f 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -77,9 +77,11 @@ static void __dump_folio(const struct folio *folio, const struct page *page,
 	if (page_mapcount_is_type(mapcount))
 		mapcount = 0;
 
-	pr_warn("page: refcount:%d mapcount:%d mapping:%p index:%#lx pfn:%#lx\n",
+	pr_warn("page: refcount:%d mapcount:%d mapping:%p index:%#lx %spfn:%#lx\n",
 			folio_ref_count(folio), mapcount, mapping,
-			folio->index + idx, pfn);
+			folio->index + idx,
+			folio_is_device_private(folio) ? "device " : "",
+			pfn);
 	if (folio_test_large(folio)) {
 		int pincount = 0;
 
@@ -113,7 +115,8 @@ static void __dump_folio(const struct folio *folio, const struct page *page,
 	 * inaccuracy here due to racing.
 	 */
 	pr_warn("%sflags: %pGp%s\n", type, &folio->flags,
-		is_migrate_cma_folio(folio, pfn) ? " CMA" : "");
+		(!folio_is_device_private(folio) &&
+		 is_migrate_cma_folio(folio, pfn)) ? " CMA" : "");
 	if (page_has_type(&folio->page))
 		pr_warn("page_type: %x(%s)\n", folio->page.page_type >> 24,
 				page_type_name(folio->page.page_type));
diff --git a/mm/memremap.c b/mm/memremap.c
index 4c2e0d68eb27..f0fe92c3227a 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -12,9 +12,12 @@
 #include <linux/types.h>
 #include <linux/wait_bit.h>
 #include <linux/xarray.h>
+#include <linux/maple_tree.h>
 #include "internal.h"
 
 static DEFINE_XARRAY(pgmap_array);
+static struct maple_tree device_private_pgmap_tree =
+	MTREE_INIT(device_private_pgmap_tree, MT_FLAGS_ALLOC_RANGE);
 
 /*
  * The memremap() and memremap_pages() interfaces are alternately used
@@ -113,9 +116,10 @@ void memunmap_pages(struct dev_pagemap *pgmap)
 {
 	int i;
 
+	WARN_ONCE(pgmap->type == MEMORY_DEVICE_PRIVATE, "Type should not be MEMORY_DEVICE_PRIVATE\n");
+
 	percpu_ref_kill(&pgmap->ref);
-	if (pgmap->type != MEMORY_DEVICE_PRIVATE &&
-	    pgmap->type != MEMORY_DEVICE_COHERENT)
+	if (pgmap->type != MEMORY_DEVICE_COHERENT)
 		for (i = 0; i < pgmap->nr_range; i++)
 			percpu_ref_put_many(&pgmap->ref, pfn_len(pgmap, i));
 
@@ -144,7 +148,6 @@ static void dev_pagemap_percpu_release(struct percpu_ref *ref)
 static int pagemap_range(struct dev_pagemap *pgmap, struct mhp_params *params,
 		int range_id, int nid)
 {
-	const bool is_private = pgmap->type == MEMORY_DEVICE_PRIVATE;
 	struct range *range = &pgmap->ranges[range_id];
 	struct dev_pagemap *conflict_pgmap;
 	int error, is_ram;
@@ -190,7 +193,7 @@ static int pagemap_range(struct dev_pagemap *pgmap, struct mhp_params *params,
 	if (error)
 		goto err_pfn_remap;
 
-	if (!mhp_range_allowed(range->start, range_len(range), !is_private)) {
+	if (!mhp_range_allowed(range->start, range_len(range), true)) {
 		error = -EINVAL;
 		goto err_kasan;
 	}
@@ -198,30 +201,19 @@ static int pagemap_range(struct dev_pagemap *pgmap, struct mhp_params *params,
 	mem_hotplug_begin();
 
 	/*
-	 * For device private memory we call add_pages() as we only need to
-	 * allocate and initialize struct page for the device memory. More-
-	 * over the device memory is un-accessible thus we do not want to
-	 * create a linear mapping for the memory like arch_add_memory()
-	 * would do.
-	 *
-	 * For all other device memory types, which are accessible by
-	 * the CPU, we do want the linear mapping and thus use
+	 * All device memory types except device private memory are accessible
+	 * by the CPU, so we want the linear mapping and thus use
 	 * arch_add_memory().
 	 */
-	if (is_private) {
-		error = add_pages(nid, PHYS_PFN(range->start),
-				PHYS_PFN(range_len(range)), params);
-	} else {
-		error = kasan_add_zero_shadow(__va(range->start), range_len(range));
-		if (error) {
-			mem_hotplug_done();
-			goto err_kasan;
-		}
-
-		error = arch_add_memory(nid, range->start, range_len(range),
-					params);
+	error = kasan_add_zero_shadow(__va(range->start), range_len(range));
+	if (error) {
+		mem_hotplug_done();
+		goto err_kasan;
 	}
 
+	error = arch_add_memory(nid, range->start, range_len(range),
+				params);
+
 	if (!error) {
 		struct zone *zone;
 
@@ -248,8 +240,7 @@ static int pagemap_range(struct dev_pagemap *pgmap, struct mhp_params *params,
 	return 0;
 
 err_add_memory:
-	if (!is_private)
-		kasan_remove_zero_shadow(__va(range->start), range_len(range));
+	kasan_remove_zero_shadow(__va(range->start), range_len(range));
 err_kasan:
 	pfnmap_untrack(PHYS_PFN(range->start), range_len(range));
 err_pfn_remap:
@@ -281,22 +272,8 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
 
 	switch (pgmap->type) {
 	case MEMORY_DEVICE_PRIVATE:
-		if (!IS_ENABLED(CONFIG_DEVICE_PRIVATE)) {
-			WARN(1, "Device private memory not supported\n");
-			return ERR_PTR(-EINVAL);
-		}
-		if (!pgmap->ops || !pgmap->ops->migrate_to_ram) {
-			WARN(1, "Missing migrate_to_ram method\n");
-			return ERR_PTR(-EINVAL);
-		}
-		if (!pgmap->ops->folio_free) {
-			WARN(1, "Missing folio_free method\n");
-			return ERR_PTR(-EINVAL);
-		}
-		if (!pgmap->owner) {
-			WARN(1, "Missing owner\n");
-			return ERR_PTR(-EINVAL);
-		}
+		WARN(1, "Use memremap_device_private_pagemap()\n");
+		return ERR_PTR(-EINVAL);
 		break;
 	case MEMORY_DEVICE_COHERENT:
 		if (!pgmap->ops->folio_free) {
@@ -394,6 +371,31 @@ void devm_memunmap_pages(struct device *dev, struct dev_pagemap *pgmap)
 }
 EXPORT_SYMBOL_GPL(devm_memunmap_pages);
 
+static void devm_memremap_device_private_pagemap_release(void *data)
+{
+	memunmap_device_private_pagemap(data);
+}
+
+int devm_memremap_device_private_pagemap(struct device *dev, struct dev_pagemap *pgmap)
+{
+	int ret;
+
+	ret = memremap_device_private_pagemap(pgmap, dev_to_node(dev));
+	if (ret)
+		return ret;
+
+	ret = devm_add_action_or_reset(dev, devm_memremap_device_private_pagemap_release,
+			pgmap);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(devm_memremap_device_private_pagemap);
+
+void devm_memunmap_device_private_pagemap(struct device *dev, struct dev_pagemap *pgmap)
+{
+	devm_release_action(dev, devm_memremap_device_private_pagemap_release, pgmap);
+}
+EXPORT_SYMBOL_GPL(devm_memunmap_device_private_pagemap);
+
 /**
  * get_dev_pagemap() - take a new live reference on the dev_pagemap for @pfn
  * @pfn: page frame number to lookup page_map
@@ -495,3 +497,110 @@ void zone_device_page_init(struct page *page, unsigned int order)
 		prep_compound_page(page, order);
 }
 EXPORT_SYMBOL_GPL(zone_device_page_init);
+
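+/*
+ * Set up device private struct pages for @pgmap: allocate a range of
+ * device private address space from the maple tree, allocate the backing
+ * struct pages and initialise them as ZONE_DEVICE pages.
+ */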
+unsigned long memremap_device_private_pagemap(struct dev_pagemap *pgmap, int nid)
+{
+	unsigned long dpfn, dpfn_first, dpfn_last = 0;
+	unsigned long start;
+	int rc;
+
+	if (pgmap->type != MEMORY_DEVICE_PRIVATE) {
+		WARN(1, "Not device private memory\n");
+		return -EINVAL;
+	}
+	if (!IS_ENABLED(CONFIG_DEVICE_PRIVATE)) {
+		WARN(1, "Device private memory not supported\n");
+		return -EINVAL;
+	}
+	if (!pgmap->ops || !pgmap->ops->migrate_to_ram) {
+		WARN(1, "Missing migrate_to_ram method\n");
+		return -EINVAL;
+	}
+	if (!pgmap->owner) {
+		WARN(1, "Missing owner\n");
+		return -EINVAL;
+	}
+
+	pgmap->pages = kvzalloc(sizeof(struct page) * pgmap->nr_pages,
+			       GFP_KERNEL);
+	if (!pgmap->pages)
+		return -ENOMEM;
+
+	rc = mtree_alloc_range(&device_private_pgmap_tree, &start, pgmap,
+			       pgmap->nr_pages * PAGE_SIZE, 0,
+			       1ull << MAX_PHYSMEM_BITS, GFP_KERNEL);
+	if (rc < 0)
+		goto err_mtree_alloc;
+
+	pgmap->range.start = start;
+	pgmap->range.end = pgmap->range.start + (pgmap->nr_pages * PAGE_SIZE) - 1;
+	pgmap->nr_range = 1;
+
+	init_completion(&pgmap->done);
+	rc = percpu_ref_init(&pgmap->ref, dev_pagemap_percpu_release, 0,
+		GFP_KERNEL);
+	if (rc < 0)
+		goto err_ref_init;
+
+	dpfn_first = pgmap->range.start >> PAGE_SHIFT;
+	dpfn_last = dpfn_first + (range_len(&pgmap->range) >> PAGE_SHIFT);
+	for (dpfn = dpfn_first; dpfn < dpfn_last; dpfn++) {
+		struct page *page = device_private_offset_to_page(dpfn);
+
+		__init_zone_device_page(page, dpfn, ZONE_DEVICE, nid, pgmap);
+		page_folio(page)->pgmap = (void *) pgmap;
+	}
+
+	return 0;
+
+err_ref_init:
+	mtree_erase(&device_private_pgmap_tree, pgmap->range.start);
+err_mtree_alloc:
+	kvfree(pgmap->pages);
+	return rc;
+}
+EXPORT_SYMBOL_GPL(memremap_device_private_pagemap);
+
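+/*
+ * Tear down a device private pagemap: wait for all page references to be
+ * dropped, free the struct pages and release the device private address
+ * space range back to the maple tree.
+ */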
+void memunmap_device_private_pagemap(struct dev_pagemap *pgmap)
+{
+	percpu_ref_kill(&pgmap->ref);
+	wait_for_completion(&pgmap->done);
+	percpu_ref_exit(&pgmap->ref);
+	kvfree(pgmap->pages);
+	mtree_erase(&device_private_pgmap_tree, pgmap->range.start);
+}
+EXPORT_SYMBOL_GPL(memunmap_device_private_pagemap);
+
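+/*
+ * Look up the struct page for a device private offset by finding the owning
+ * dev_pagemap in the maple tree and indexing into its pages array.
+ */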
+struct page *device_private_offset_to_page(unsigned long offset)
+{
+	struct dev_pagemap *pgmap;
+
+	pgmap = mtree_load(&device_private_pgmap_tree, offset << PAGE_SHIFT);
+	if (WARN_ON_ONCE(!pgmap))
+		return NULL;
+
+	return &pgmap->pages[offset - (pgmap->range.start >> PAGE_SHIFT)];
+}
+EXPORT_SYMBOL_GPL(device_private_offset_to_page);
+
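+/*
+ * Return the struct page for a device private or device private migration
+ * softleaf entry, or NULL if the entry is neither.
+ */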
+struct page *device_private_entry_to_page(softleaf_t entry)
+{
+	unsigned long offset;
+
+	if (!(softleaf_is_device_private(entry) ||
+	      softleaf_is_migration_device_private(entry)))
+		return NULL;
+
+	offset = softleaf_to_pfn(entry);
+	return device_private_offset_to_page(offset);
+}
+
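+/*
+ * Convert a device private struct page back to its offset in the device
+ * private address space. The inverse of device_private_offset_to_page().
+ */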
+pgoff_t device_private_page_to_offset(const struct page *page)
+{
+	struct dev_pagemap *pgmap = (struct dev_pagemap *) page_pgmap(page);
+
+	VM_BUG_ON_PAGE(!is_device_private_page(page), page);
+
+	return (pgmap->range.start >> PAGE_SHIFT) + (page - pgmap->pages);
+}
+EXPORT_SYMBOL_GPL(device_private_page_to_offset);
diff --git a/mm/mm_init.c b/mm/mm_init.c
index fc2a6f1e518f..4a9420cb610c 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1004,9 +1004,9 @@ static void __init memmap_init(void)
 }
 
 #ifdef CONFIG_ZONE_DEVICE
-static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
-					  unsigned long zone_idx, int nid,
-					  struct dev_pagemap *pgmap)
+void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
+				   unsigned long zone_idx, int nid,
+				   struct dev_pagemap *pgmap)
 {
 
 	__init_single_page(page, pfn, zone_idx, nid);
@@ -1038,7 +1038,7 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
 	 * Please note that MEMINIT_HOTPLUG path doesn't clear memmap
 	 * because this is done early in section_activate()
 	 */
-	if (pageblock_aligned(pfn)) {
+	if (pgmap->type != MEMORY_DEVICE_PRIVATE && pageblock_aligned(pfn)) {
 		init_pageblock_migratetype(page, MIGRATE_MOVABLE, false);
 		cond_resched();
 	}
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 96c525785d78..141fe5abd33f 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -107,6 +107,7 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
 static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 {
 	unsigned long pfn;
+	bool device_private = false;
 	pte_t ptent = ptep_get(pvmw->pte);
 
 	if (pvmw->flags & PVMW_MIGRATION) {
@@ -115,6 +116,9 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 		if (!softleaf_is_migration(entry))
 			return false;
 
+		if (softleaf_is_migration_device_private(entry))
+			device_private = true;
+
 		pfn = softleaf_to_pfn(entry);
 	} else if (pte_present(ptent)) {
 		pfn = pte_pfn(ptent);
@@ -127,8 +131,14 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 			return false;
 
 		pfn = softleaf_to_pfn(entry);
+
+		if (softleaf_is_device_private(entry))
+			device_private = true;
 	}
 
+	if ((device_private) ^ !!(pvmw->pfn & PVMW_PFN_DEVICE_PRIVATE))
+		return false;
+
 	if ((pfn + pte_nr - 1) < (pvmw->pfn >> PVMW_PFN_SHIFT))
 		return false;
 	if (pfn > ((pvmw->pfn >> PVMW_PFN_SHIFT) + pvmw->nr_pages - 1))
@@ -137,8 +147,11 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 }
 
 /* Returns true if the two ranges overlap.  Careful to not overflow. */
-static bool check_pmd(unsigned long pfn, struct page_vma_mapped_walk *pvmw)
+static bool check_pmd(unsigned long pfn, bool device_private, struct page_vma_mapped_walk *pvmw)
 {
+	if ((device_private) ^ !!(pvmw->pfn & PVMW_PFN_DEVICE_PRIVATE))
+		return false;
+
 	if ((pfn + HPAGE_PMD_NR - 1) < (pvmw->pfn >> PVMW_PFN_SHIFT))
 		return false;
 	if (pfn > (pvmw->pfn >> PVMW_PFN_SHIFT) + pvmw->nr_pages - 1)
@@ -255,6 +268,8 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 
 				if (!softleaf_is_migration(entry) ||
 				    !check_pmd(softleaf_to_pfn(entry),
+					       softleaf_is_device_private(entry) ||
+					       softleaf_is_migration_device_private(entry),
 					       pvmw))
 					return not_found(pvmw);
 				return true;
@@ -262,7 +277,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 			if (likely(pmd_trans_huge(pmde))) {
 				if (pvmw->flags & PVMW_MIGRATION)
 					return not_found(pvmw);
-				if (!check_pmd(pmd_pfn(pmde), pvmw))
+				if (!check_pmd(pmd_pfn(pmde), false, pvmw))
 					return not_found(pvmw);
 				return true;
 			}
diff --git a/mm/rmap.c b/mm/rmap.c
index 6a63333f8722..3f708ed5c89f 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1860,7 +1860,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
 	unsigned long nr_pages = 1, end_addr;
-	unsigned long pfn;
+	unsigned long nr;
 	unsigned long hsz = 0;
 	int ptes = 0;
 
@@ -1967,15 +1967,20 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		 */
 		pteval = ptep_get(pvmw.pte);
 		if (likely(pte_present(pteval))) {
-			pfn = pte_pfn(pteval);
+			nr = pte_pfn(pteval) - folio_pfn(folio);
 		} else {
 			const softleaf_t entry = softleaf_from_pte(pteval);
 
-			pfn = softleaf_to_pfn(entry);
+			if (softleaf_is_device_private(entry) ||
+			    softleaf_is_migration_device_private(entry))
+				nr = softleaf_to_pfn(entry) - device_private_folio_to_offset(folio);
+			else
+				nr = softleaf_to_pfn(entry) - folio_pfn(folio);
+
 			VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
 		}
 
-		subpage = folio_page(folio, pfn - folio_pfn(folio));
+		subpage = folio_page(folio, nr);
 		address = pvmw.address;
 		anon_exclusive = folio_test_anon(folio) &&
 				 PageAnonExclusive(subpage);
@@ -2289,7 +2294,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 	struct page *subpage;
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
-	unsigned long pfn;
+	unsigned long nr;
 	unsigned long hsz = 0;
 
 	/*
@@ -2328,7 +2333,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 	while (page_vma_mapped_walk(&pvmw)) {
 		/* PMD-mapped THP migration entry */
 		if (!pvmw.pte) {
-			__maybe_unused unsigned long pfn;
+			__maybe_unused softleaf_t entry;
 			__maybe_unused pmd_t pmdval;
 
 			if (flags & TTU_SPLIT_HUGE_PMD) {
@@ -2340,12 +2345,17 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 			}
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
 			pmdval = pmdp_get(pvmw.pmd);
+			entry = softleaf_from_pmd(pmdval);
 			if (likely(pmd_present(pmdval)))
-				pfn = pmd_pfn(pmdval);
-			else
-				pfn = softleaf_to_pfn(softleaf_from_pmd(pmdval));
+				nr = pmd_pfn(pmdval) - folio_pfn(folio);
+			else if (softleaf_is_device_private(entry) ||
+				 softleaf_is_migration_device_private(entry)) {
+				nr = softleaf_to_pfn(entry) - device_private_folio_to_offset(folio);
+			} else {
+				nr = softleaf_to_pfn(entry) - folio_pfn(folio);
+			}
 
-			subpage = folio_page(folio, pfn - folio_pfn(folio));
+			subpage = folio_page(folio, nr);
 
 			VM_BUG_ON_FOLIO(folio_test_hugetlb(folio) ||
 					!folio_test_pmd_mappable(folio), folio);
@@ -2368,15 +2378,20 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 		 */
 		pteval = ptep_get(pvmw.pte);
 		if (likely(pte_present(pteval))) {
-			pfn = pte_pfn(pteval);
+			nr = pte_pfn(pteval) - folio_pfn(folio);
 		} else {
 			const softleaf_t entry = softleaf_from_pte(pteval);
 
-			pfn = softleaf_to_pfn(entry);
+			if (softleaf_is_device_private(entry) ||
+			    is_device_private_migration_entry(entry))
+				nr = softleaf_to_pfn(entry) - device_private_folio_to_offset(folio);
+			else
+				nr = softleaf_to_pfn(entry) - folio_pfn(folio);
+
 			VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
 		}
 
-		subpage = folio_page(folio, pfn - folio_pfn(folio));
+		subpage = folio_page(folio, nr);
 		address = pvmw.address;
 		anon_exclusive = folio_test_anon(folio) &&
 				 PageAnonExclusive(subpage);
@@ -2436,7 +2451,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 				folio_mark_dirty(folio);
 			writable = pte_write(pteval);
 		} else if (likely(pte_present(pteval))) {
-			flush_cache_page(vma, address, pfn);
+			flush_cache_page(vma, address, pte_pfn(pteval));
 			/* Nuke the page table entry. */
 			if (should_defer_flush(mm, flags)) {
 				/*
diff --git a/mm/util.c b/mm/util.c
index 65e3f1a97d76..8482ebc5c394 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1244,7 +1244,10 @@ void snapshot_page(struct page_snapshot *ps, const struct page *page)
 	struct folio *foliop;
 	int loops = 5;
 
-	ps->pfn = page_to_pfn(page);
+	if (is_device_private_page(page))
+		ps->pfn = device_private_page_to_offset(page);
+	else
+		ps->pfn = page_to_pfn(page);
 	ps->flags = PAGE_SNAPSHOT_FAITHFUL;
 
 again:
-- 
2.34.1



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/11] Remove device private pages from physical address space
  2026-01-07  9:18 [PATCH v2 00/11] Remove device private pages from physical address space Jordan Niethe
                   ` (10 preceding siblings ...)
  2026-01-07  9:18 ` [PATCH v2 11/11] mm: Remove device private pages from the physical address space Jordan Niethe
@ 2026-01-07 18:36 ` Matthew Brost
  2026-01-07 20:21   ` Zi Yan
  2026-01-08  2:25   ` Jordan Niethe
  2026-01-07 20:06 ` Andrew Morton
  12 siblings, 2 replies; 33+ messages in thread
From: Matthew Brost @ 2026-01-07 18:36 UTC (permalink / raw)
  To: Jordan Niethe
  Cc: linux-mm, balbirs, akpm, linux-kernel, dri-devel, david, ziy,
	apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

On Wed, Jan 07, 2026 at 08:18:12PM +1100, Jordan Niethe wrote:
> Today, when creating these device private struct pages, the first step
> is to use request_free_mem_region() to get a range of physical address
> space large enough to represent the devices memory. This allocated
> physical address range is then remapped as device private memory using
> memremap_pages.
> 
> Needing allocation of physical address space has some problems:
> 
>   1) There may be insufficient physical address space to represent the
>      device memory. KASLR reducing the physical address space and VM
>      configurations with limited physical address space increase the
>      likelihood of hitting this especially as device memory increases. This
>      has been observed to prevent device private from being initialized.  
> 
>   2) Attempting to add the device private pages to the linear map at
>      addresses beyond the actual physical memory causes issues on
>      architectures like aarch64  - meaning the feature does not work there [0].
> 
> This series changes device private memory so that it does not require
> allocation of physical address space and these problems are avoided.
> Instead of using the physical address space, we introduce a "device
> private address space" and allocate from there.
> 
> A consequence of placing the device private pages outside of the
> physical address space is that they no longer have a PFN. However, it is
> still necessary to be able to look up a corresponding device private
> page from a device private PTE entry, which means that we still require
> some way to index into this device private address space. Instead of a
> PFN, device private pages use an offset into this device private address
> space to look up device private struct pages.
> 
> The problem that then needs to be addressed is how to avoid confusing
> these device private offsets with PFNs. It is the inherent limited usage
> of the device private pages themselves which make this possible. A
> device private page is only used for userspace mappings, we do not need
> to be concerned with them being used within the mm more broadly. This
> means that the only way that the core kernel looks up these pages is via
> the page table, where their PTE already indicates if they refer to a
> device private page via their swap type, e.g.  SWP_DEVICE_WRITE. We can
> use this information to determine if the PTE contains a PFN which should
> be looked up in the page map, or a device private offset which should be
> looked up elsewhere.
> 
> This applies when we are creating PTE entries for device private pages -
> because they have their own type there are already must be handled
> separately, so it is a small step to convert them to a device private
> PFN now too.
> 
> The first part of the series updates callers where device private
> offsets might now be encountered to track this extra state.
> 
> The last patch contains the bulk of the work where we change how we
> convert between device private pages to device private offsets and then
> use a new interface for allocating device private pages without the need
> for reserving physical address space.
> 
> By removing the device private pages from the physical address space,
> this series also opens up the possibility to moving away from tracking
> device private memory using struct pages in the future. This is
> desirable as on systems with large amounts of memory these device
> private struct pages use a signifiant amount of memory and take a
> significant amount of time to initialize.
> 
> *** Changes in v2 ***
> 
> The most significant change in v2 is addressing code paths that are
> common between MEMORY_DEVICE_PRIVATE and MEMORY_DEVICE_COHERENT devices.
> 
> This had been overlooked in previous revisions.
> 
> To do this we introduce a migrate_pfn_from_page() helper which will call
> device_private_offset_to_page() and set the MIGRATE_PFN_DEVICE_PRIVATE
> flag if required.
> 
> In places where we could have a device private offset
> (MEMORY_DEVICE_PRIVATE) or a pfn (MEMORY_DEVICE_COHERENT) we update to
> use an mpfn to disambiguate.  This includes some users in the drivers
> and migrate_device_{pfns,range}().
> 
> Seeking opinions on using the mpfns like this or if a new type would be
> preferred.
> 
>   - mm/migrate_device: Introduce migrate_pfn_from_page() helper
>     - New to series
> 
>   - drm/amdkfd: Use migrate pfns internally
>     - New to series
> 
>   - mm/migrate_device: Make migrate_device_{pfns,range}() take mpfns
>     - New to series
> 
>   - mm/migrate_device: Add migrate PFN flag to track device private pages
>     - Update for migrate_pfn_from_page()
>     - Rename to MIGRATE_PFN_DEVICE_PRIVATE
>     - drm/amd: Check adev->gmc.xgmi.connected_to_cpu
>     - lib/test_hmm.c: Check chunk->pagemap.type == MEMORY_DEVICE_PRIVATE
> 
>   - mm: Add helpers to create migration entries from struct pages
>     - Add a flags param
> 
>   - mm: Add a new swap type for migration entries of device private pages
>     - Add softleaf_is_migration_device_private_read()
> 
>   - mm: Add helpers to create device private entries from struct pages
>     - Add a flags param
> 
>   - mm: Remove device private pages from the physical address space
>     - Make sure last member of struct dev_pagemap remains DECLARE_FLEX_ARRAY(struct range, ranges);
> 
> Testing:
> - selftests/mm/hmm-tests on an amd64 VM
> 
> * NOTE: I will need help in testing the driver changes *
> 

Thanks for the series. For some reason Intel's CI couldn't apply this
series to drm-tip to get results [1]. I'll manually apply this and run all
our SVM tests and get back to you on results + review the changes here.
For future reference, if you want to use our CI system the series must
apply to drm-tip; feel free to rebase this series and just send it to the
intel-xe list if you want CI results.

I was also wondering if Nvidia could help review one of our core MM patches
[2] which is gating the enabling of 2M device pages too?

Matt

[1] https://patchwork.freedesktop.org/series/159738/
[2] https://patchwork.freedesktop.org/patch/694775/?series=159119&rev=1 

> Revisions:
> - RFC: https://lore.kernel.org/all/20251128044146.80050-1-jniethe@nvidia.com/
> - v1: https://lore.kernel.org/all/20251231043154.42931-1-jniethe@nvidia.com/
> 
> [0] https://lore.kernel.org/lkml/CAMj1kXFZ=4hLL1w6iCV5O5uVoVLHAJbc0rr40j24ObenAjXe9w@mail.gmail.com/
> 
> Jordan Niethe (11):
>   mm/migrate_device: Introduce migrate_pfn_from_page() helper
>   drm/amdkfd: Use migrate pfns internally
>   mm/migrate_device: Make migrate_device_{pfns,range}() take mpfns
>   mm/migrate_device: Add migrate PFN flag to track device private pages
>   mm/page_vma_mapped: Add flags to page_vma_mapped_walk::pfn to track
>     device private pages
>   mm: Add helpers to create migration entries from struct pages
>   mm: Add a new swap type for migration entries of device private pages
>   mm: Add helpers to create device private entries from struct pages
>   mm/util: Add flag to track device private pages in page snapshots
>   mm/hmm: Add flag to track device private pages
>   mm: Remove device private pages from the physical address space
> 
>  Documentation/mm/hmm.rst                 |  11 +-
>  arch/powerpc/kvm/book3s_hv_uvmem.c       |  43 ++---
>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  45 +++---
>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.h |   2 +-
>  drivers/gpu/drm/drm_pagemap.c            |  11 +-
>  drivers/gpu/drm/nouveau/nouveau_dmem.c   |  45 ++----
>  drivers/gpu/drm/xe/xe_svm.c              |  37 ++---
>  fs/proc/page.c                           |   6 +-
>  include/drm/drm_pagemap.h                |   8 +-
>  include/linux/hmm.h                      |   7 +-
>  include/linux/leafops.h                  | 116 ++++++++++++--
>  include/linux/memremap.h                 |  64 +++++++-
>  include/linux/migrate.h                  |  23 ++-
>  include/linux/mm.h                       |   9 +-
>  include/linux/rmap.h                     |  33 +++-
>  include/linux/swap.h                     |   8 +-
>  include/linux/swapops.h                  | 136 ++++++++++++++++
>  lib/test_hmm.c                           |  86 ++++++----
>  mm/debug.c                               |   9 +-
>  mm/hmm.c                                 |   5 +-
>  mm/huge_memory.c                         |  43 ++---
>  mm/hugetlb.c                             |  15 +-
>  mm/memory.c                              |   5 +-
>  mm/memremap.c                            | 193 ++++++++++++++++++-----
>  mm/migrate.c                             |   6 +-
>  mm/migrate_device.c                      |  76 +++++----
>  mm/mm_init.c                             |   8 +-
>  mm/mprotect.c                            |  10 +-
>  mm/page_vma_mapped.c                     |  32 +++-
>  mm/rmap.c                                |  59 ++++---
>  mm/util.c                                |   8 +-
>  mm/vmscan.c                              |   2 +-
>  32 files changed, 822 insertions(+), 339 deletions(-)
> 
> 
> base-commit: f8f9c1f4d0c7a64600e2ca312dec824a0bc2f1da
> -- 
> 2.34.1
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/11] Remove device private pages from physical address space
  2026-01-07  9:18 [PATCH v2 00/11] Remove device private pages from physical address space Jordan Niethe
                   ` (11 preceding siblings ...)
  2026-01-07 18:36 ` [PATCH v2 00/11] Remove device private pages from " Matthew Brost
@ 2026-01-07 20:06 ` Andrew Morton
  2026-01-07 20:54   ` Jason Gunthorpe
                     ` (2 more replies)
  12 siblings, 3 replies; 33+ messages in thread
From: Andrew Morton @ 2026-01-07 20:06 UTC (permalink / raw)
  To: Jordan Niethe
  Cc: linux-mm, balbirs, matthew.brost, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

On Wed,  7 Jan 2026 20:18:12 +1100 Jordan Niethe <jniethe@nvidia.com> wrote:

> Today, when creating these device private struct pages, the first step
> is to use request_free_mem_region() to get a range of physical address
> space large enough to represent the devices memory. This allocated
> physical address range is then remapped as device private memory using
> memremap_pages.

Welcome to Linux MM.  That's a heck of an opening salvo ;)

> Needing allocation of physical address space has some problems:
> 
>   1) There may be insufficient physical address space to represent the
>      device memory. KASLR reducing the physical address space and VM
>      configurations with limited physical address space increase the
>      likelihood of hitting this especially as device memory increases. This
>      has been observed to prevent device private from being initialized.  
> 
>   2) Attempting to add the device private pages to the linear map at
>      addresses beyond the actual physical memory causes issues on
>      architectures like aarch64  - meaning the feature does not work there [0].

Can you better help us understand the seriousness of these problems? 
How much are our users really hurting from this?

> Seeking opinions on using the mpfns like this or if a new type would be
> preferred.

Whose opinions?  IOW, can you suggest who you'd like to see review this
work?

> 
> * NOTE: I will need help in testing the driver changes *
> 

Again, please name names ;)  I'm not afraid to prod.


I'm reluctant to add this to mm.git's development/testing branches at
this time.  Your advice on when you think we're ready for that step
would be valuable, thanks.




^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/11] Remove device private pages from physical address space
  2026-01-07 18:36 ` [PATCH v2 00/11] Remove device private pages from " Matthew Brost
@ 2026-01-07 20:21   ` Zi Yan
  2026-01-08  2:25   ` Jordan Niethe
  1 sibling, 0 replies; 33+ messages in thread
From: Zi Yan @ 2026-01-07 20:21 UTC (permalink / raw)
  To: Matthew Brost
  Cc: Jordan Niethe, linux-mm, balbirs, akpm, linux-kernel, dri-devel,
	david, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

On 7 Jan 2026, at 13:36, Matthew Brost wrote:

> On Wed, Jan 07, 2026 at 08:18:12PM +1100, Jordan Niethe wrote:
>> Today, when creating these device private struct pages, the first step
>> is to use request_free_mem_region() to get a range of physical address
>> space large enough to represent the devices memory. This allocated
>> physical address range is then remapped as device private memory using
>> memremap_pages.
>>
>> Needing allocation of physical address space has some problems:
>>
>>   1) There may be insufficient physical address space to represent the
>>      device memory. KASLR reducing the physical address space and VM
>>      configurations with limited physical address space increase the
>>      likelihood of hitting this especially as device memory increases. This
>>      has been observed to prevent device private from being initialized.
>>
>>   2) Attempting to add the device private pages to the linear map at
>>      addresses beyond the actual physical memory causes issues on
>>      architectures like aarch64  - meaning the feature does not work there [0].
>>
>> This series changes device private memory so that it does not require
>> allocation of physical address space and these problems are avoided.
>> Instead of using the physical address space, we introduce a "device
>> private address space" and allocate from there.
>>
>> A consequence of placing the device private pages outside of the
>> physical address space is that they no longer have a PFN. However, it is
>> still necessary to be able to look up a corresponding device private
>> page from a device private PTE entry, which means that we still require
>> some way to index into this device private address space. Instead of a
>> PFN, device private pages use an offset into this device private address
>> space to look up device private struct pages.
>>
>> The problem that then needs to be addressed is how to avoid confusing
>> these device private offsets with PFNs. It is the inherent limited usage
>> of the device private pages themselves which make this possible. A
>> device private page is only used for userspace mappings, we do not need
>> to be concerned with them being used within the mm more broadly. This
>> means that the only way that the core kernel looks up these pages is via
>> the page table, where their PTE already indicates if they refer to a
>> device private page via their swap type, e.g.  SWP_DEVICE_WRITE. We can
>> use this information to determine if the PTE contains a PFN which should
>> be looked up in the page map, or a device private offset which should be
>> looked up elsewhere.
>>
>> This applies when we are creating PTE entries for device private pages -
>> because they have their own type there are already must be handled
>> separately, so it is a small step to convert them to a device private
>> PFN now too.
>>
>> The first part of the series updates callers where device private
>> offsets might now be encountered to track this extra state.
>>
>> The last patch contains the bulk of the work where we change how we
>> convert between device private pages to device private offsets and then
>> use a new interface for allocating device private pages without the need
>> for reserving physical address space.
>>
>> By removing the device private pages from the physical address space,
>> this series also opens up the possibility to moving away from tracking
>> device private memory using struct pages in the future. This is
>> desirable as on systems with large amounts of memory these device
>> private struct pages use a signifiant amount of memory and take a
>> significant amount of time to initialize.
>>
>> *** Changes in v2 ***
>>
>> The most significant change in v2 is addressing code paths that are
>> common between MEMORY_DEVICE_PRIVATE and MEMORY_DEVICE_COHERENT devices.
>>
>> This had been overlooked in previous revisions.
>>
>> To do this we introduce a migrate_pfn_from_page() helper which will call
>> device_private_offset_to_page() and set the MIGRATE_PFN_DEVICE_PRIVATE
>> flag if required.
>>
>> In places where we could have a device private offset
>> (MEMORY_DEVICE_PRIVATE) or a pfn (MEMORY_DEVICE_COHERENT) we update to
>> use an mpfn to disambiguate.  This includes some users in the drivers
>> and migrate_device_{pfns,range}().
>>
>> Seeking opinions on using the mpfns like this or if a new type would be
>> preferred.
>>
>>   - mm/migrate_device: Introduce migrate_pfn_from_page() helper
>>     - New to series
>>
>>   - drm/amdkfd: Use migrate pfns internally
>>     - New to series
>>
>>   - mm/migrate_device: Make migrate_device_{pfns,range}() take mpfns
>>     - New to series
>>
>>   - mm/migrate_device: Add migrate PFN flag to track device private pages
>>     - Update for migrate_pfn_from_page()
>>     - Rename to MIGRATE_PFN_DEVICE_PRIVATE
>>     - drm/amd: Check adev->gmc.xgmi.connected_to_cpu
>>     - lib/test_hmm.c: Check chunk->pagemap.type == MEMORY_DEVICE_PRIVATE
>>
>>   - mm: Add helpers to create migration entries from struct pages
>>     - Add a flags param
>>
>>   - mm: Add a new swap type for migration entries of device private pages
>>     - Add softleaf_is_migration_device_private_read()
>>
>>   - mm: Add helpers to create device private entries from struct pages
>>     - Add a flags param
>>
>>   - mm: Remove device private pages from the physical address space
>>     - Make sure last member of struct dev_pagemap remains DECLARE_FLEX_ARRAY(struct range, ranges);
>>
>> Testing:
>> - selftests/mm/hmm-tests on an amd64 VM
>>
>> * NOTE: I will need help in testing the driver changes *
>>
>
> Thanks for the series. For some reason Intel's CI couldn't apply this
> series to drm-tip to get results [1]. I'll manually apply this and run all
> our SVM tests and get back to you on results + review the changes here.
> For future reference, if you want to use our CI system the series must
> apply to drm-tip; feel free to rebase this series and just send it to the
> intel-xe list if you want CI results.
>
> I was also wondering if Nvidia could help review one of our core MM patches
> [2] which is gating the enabling of 2M device pages too?

I will take a look. But next time, do you mind Cc'ing MM maintainers and
reviewers based on the MAINTAINERS file? Otherwise, it is hard for people to
check every email from linux-mm.

Thanks.

>
> Matt
>
> [1] https://patchwork.freedesktop.org/series/159738/
> [2] https://patchwork.freedesktop.org/patch/694775/?series=159119&rev=1
>
>> Revisions:
>> - RFC: https://lore.kernel.org/all/20251128044146.80050-1-jniethe@nvidia.com/
>> - v1: https://lore.kernel.org/all/20251231043154.42931-1-jniethe@nvidia.com/
>>
>> [0] https://lore.kernel.org/lkml/CAMj1kXFZ=4hLL1w6iCV5O5uVoVLHAJbc0rr40j24ObenAjXe9w@mail.gmail.com/
>>
>> Jordan Niethe (11):
>>   mm/migrate_device: Introduce migrate_pfn_from_page() helper
>>   drm/amdkfd: Use migrate pfns internally
>>   mm/migrate_device: Make migrate_device_{pfns,range}() take mpfns
>>   mm/migrate_device: Add migrate PFN flag to track device private pages
>>   mm/page_vma_mapped: Add flags to page_vma_mapped_walk::pfn to track
>>     device private pages
>>   mm: Add helpers to create migration entries from struct pages
>>   mm: Add a new swap type for migration entries of device private pages
>>   mm: Add helpers to create device private entries from struct pages
>>   mm/util: Add flag to track device private pages in page snapshots
>>   mm/hmm: Add flag to track device private pages
>>   mm: Remove device private pages from the physical address space
>>
>>  Documentation/mm/hmm.rst                 |  11 +-
>>  arch/powerpc/kvm/book3s_hv_uvmem.c       |  43 ++---
>>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  45 +++---
>>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.h |   2 +-
>>  drivers/gpu/drm/drm_pagemap.c            |  11 +-
>>  drivers/gpu/drm/nouveau/nouveau_dmem.c   |  45 ++----
>>  drivers/gpu/drm/xe/xe_svm.c              |  37 ++---
>>  fs/proc/page.c                           |   6 +-
>>  include/drm/drm_pagemap.h                |   8 +-
>>  include/linux/hmm.h                      |   7 +-
>>  include/linux/leafops.h                  | 116 ++++++++++++--
>>  include/linux/memremap.h                 |  64 +++++++-
>>  include/linux/migrate.h                  |  23 ++-
>>  include/linux/mm.h                       |   9 +-
>>  include/linux/rmap.h                     |  33 +++-
>>  include/linux/swap.h                     |   8 +-
>>  include/linux/swapops.h                  | 136 ++++++++++++++++
>>  lib/test_hmm.c                           |  86 ++++++----
>>  mm/debug.c                               |   9 +-
>>  mm/hmm.c                                 |   5 +-
>>  mm/huge_memory.c                         |  43 ++---
>>  mm/hugetlb.c                             |  15 +-
>>  mm/memory.c                              |   5 +-
>>  mm/memremap.c                            | 193 ++++++++++++++++++-----
>>  mm/migrate.c                             |   6 +-
>>  mm/migrate_device.c                      |  76 +++++----
>>  mm/mm_init.c                             |   8 +-
>>  mm/mprotect.c                            |  10 +-
>>  mm/page_vma_mapped.c                     |  32 +++-
>>  mm/rmap.c                                |  59 ++++---
>>  mm/util.c                                |   8 +-
>>  mm/vmscan.c                              |   2 +-
>>  32 files changed, 822 insertions(+), 339 deletions(-)
>>
>>
>> base-commit: f8f9c1f4d0c7a64600e2ca312dec824a0bc2f1da
>> -- 
>> 2.34.1
>>


Best Regards,
Yan, Zi


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/11] Remove device private pages from physical address space
  2026-01-07 20:06 ` Andrew Morton
@ 2026-01-07 20:54   ` Jason Gunthorpe
  2026-01-07 21:02     ` Balbir Singh
  2026-01-08  1:08   ` John Hubbard
  2026-01-08  1:49   ` Alistair Popple
  2 siblings, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2026-01-07 20:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jordan Niethe, linux-mm, balbirs, matthew.brost, linux-kernel,
	dri-devel, david, ziy, apopple, lorenzo.stoakes, lyude, dakr,
	airlied, simona, rcampbell, mpenttil, willy, linuxppc-dev,
	intel-xe, Felix.Kuehling

On Wed, Jan 07, 2026 at 12:06:08PM -0800, Andrew Morton wrote:

> >   2) Attempting to add the device private pages to the linear map at
> >      addresses beyond the actual physical memory causes issues on
> >      architectures like aarch64  - meaning the feature does not work there [0].
> 
> Can you better help us understand the seriousness of these problems? 
> How much are our users really hurting from this?

We think it is pretty serious, in the future HW support sense, as it
means real systems being built do not work :)

Also Willy and others were cheering this work on at LPC. I think the
possible followup to move DEVICE_PRIVATE from struct page and reduce
the memory allocation would be well celebrated.

The Intel Xe and AMD GPU teams are the two drivers most important to
be testing this as they consume the feature.

Jason


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/11] Remove device private pages from physical address space
  2026-01-07 20:54   ` Jason Gunthorpe
@ 2026-01-07 21:02     ` Balbir Singh
  2026-01-08  1:29       ` Alistair Popple
  0 siblings, 1 reply; 33+ messages in thread
From: Balbir Singh @ 2026-01-07 21:02 UTC (permalink / raw)
  To: Jason Gunthorpe, Andrew Morton
  Cc: Jordan Niethe, linux-mm, matthew.brost, linux-kernel, dri-devel,
	david, ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied,
	simona, rcampbell, mpenttil, willy, linuxppc-dev, intel-xe,
	Felix.Kuehling

On 1/8/26 06:54, Jason Gunthorpe wrote:
> On Wed, Jan 07, 2026 at 12:06:08PM -0800, Andrew Morton wrote:
> 
>>>   2) Attempting to add the device private pages to the linear map at
>>>      addresses beyond the actual physical memory causes issues on
>>>      architectures like aarch64  - meaning the feature does not work there [0].
>>
>> Can you better help us understand the seriousness of these problems? 
>> How much are our users really hurting from this?
> 
> We think it is pretty serious, in the future HW support sense, as it
> means real systems being built do not work :)
> 
> Also Willy and others were cheering this work on at LPC. I think the
> possible followup to move DEVICE_PRIVATE from struct page and reduce
> the memory allocation would be well celebrated.
> 
> The Intel Xe and AMD GPU teams are the two drivers most important to
> be testing this as they consume the feature.
> 

And the ultravisor usage in powerpc as well (book3s_hv_uvmem).

Balbir


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/11] Remove device private pages from physical address space
  2026-01-07 20:06 ` Andrew Morton
  2026-01-07 20:54   ` Jason Gunthorpe
@ 2026-01-08  1:08   ` John Hubbard
  2026-01-08  1:49   ` Alistair Popple
  2 siblings, 0 replies; 33+ messages in thread
From: John Hubbard @ 2026-01-08  1:08 UTC (permalink / raw)
  To: Andrew Morton, Jordan Niethe
  Cc: linux-mm, balbirs, matthew.brost, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

On 1/7/26 12:06 PM, Andrew Morton wrote:
> On Wed,  7 Jan 2026 20:18:12 +1100 Jordan Niethe <jniethe@nvidia.com> wrote:
...
> Can you better help us understand the seriousness of these problems? 
> How much are our users really hurting from this?
> 

A lot! We have been involved in escalations from various customers
who have attempted to enable, say, KASLR and HMM at the same time.
They ran out of physical address space, often forcing them into an
awkward, ugly choice of one or the other.

This is a huge pain point and a barrier to HMM adoption.

thanks,
-- 
John Hubbard



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/11] Remove device private pages from physical address space
  2026-01-07 21:02     ` Balbir Singh
@ 2026-01-08  1:29       ` Alistair Popple
  0 siblings, 0 replies; 33+ messages in thread
From: Alistair Popple @ 2026-01-08  1:29 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Jason Gunthorpe, Andrew Morton, Jordan Niethe, linux-mm,
	matthew.brost, linux-kernel, dri-devel, david, ziy,
	lorenzo.stoakes, lyude, dakr, airlied, simona, rcampbell,
	mpenttil, willy, linuxppc-dev, intel-xe, Felix.Kuehling

On 2026-01-08 at 08:02 +1100, Balbir Singh <balbirs@nvidia.com> wrote...
> On 1/8/26 06:54, Jason Gunthorpe wrote:
> > On Wed, Jan 07, 2026 at 12:06:08PM -0800, Andrew Morton wrote:
> > 
> >>>   2) Attempting to add the device private pages to the linear map at
> >>>      addresses beyond the actual physical memory causes issues on
> >>>      architectures like aarch64  - meaning the feature does not work there [0].
> >>
> >> Can you better help us understand the seriousness of these problems? 
> >> How much are our users really hurting from this?
> > 
> > We think it is pretty serious, in the future HW support sense, as it
> > means real systems being built do not work :)

There's actually existing HW that could benefit from this support - after all
there is nothing stopping someone plugging an Intel/AMD/NVIDIA GPU into an ARM
machine today :-)

So it would be nice if we could support this feature there, as the lack of it
results in really sub-optimal performance compared with x86 when using the SVM
(shared virtual memory) feature: data has to be remote mapped (ie. accessed via
the PCIe link) rather than migrated to local GPU video memory.

Having the kernel steal physical address space has also caused problems on
x86 - we have encountered virtualised environments which, depending on the
specific firmware/BIOS, don't have enough free physical address space to
support device private pages and hence migration of memory to the GPU device,
again leading to sub-optimal performance.

> > Also Willy and others were cheering this work on at LPC. I think the
> > possible followup to move DEVICE_PRIVATE from struct page and reduce
> > the memory allocation would be well celebrated.

For reference, the recording of my LPC presentation covering both this series and
the above is here - https://www.youtube.com/watch?v=CFe_c8-tEuM

The hope is that, in addition to enabling support for this more broadly across
other platforms/architectures, it will also enable further clean-ups to
reduce memory allocation overhead (I almost convinced myself we wouldn't need a
struct at all ... almost).

> > The Intel Xe and AMD GPU teams are the two drivers most important to
> > be testing this as they consume the feature.
> > 
> 
> And the ultravisor usage in powerpc as well (book3s_hv_uvmem).

As does Nouveau (which I've tested). But I agree AMD GPU and Intel Xe are the
most important drivers here. I would be surprised if anyone was actually using
the powerpc ultravisor, and I don't have access to a setup for this, so unless
some PPC folk can offer to help I wouldn't like to see testing there hold up
the series.

Especially as I believe most of the driver-side changes are relatively
straightforward.

 - Alistair

> Balbir


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/11] Remove device private pages from physical address space
  2026-01-07 20:06 ` Andrew Morton
  2026-01-07 20:54   ` Jason Gunthorpe
  2026-01-08  1:08   ` John Hubbard
@ 2026-01-08  1:49   ` Alistair Popple
  2026-01-08  2:55     ` Jordan Niethe
  2 siblings, 1 reply; 33+ messages in thread
From: Alistair Popple @ 2026-01-08  1:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jordan Niethe, linux-mm, balbirs, matthew.brost, linux-kernel,
	dri-devel, david, ziy, lorenzo.stoakes, lyude, dakr, airlied,
	simona, rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe,
	jgg, Felix.Kuehling

On 2026-01-08 at 07:06 +1100, Andrew Morton <akpm@linux-foundation.org> wrote...
> On Wed,  7 Jan 2026 20:18:12 +1100 Jordan Niethe <jniethe@nvidia.com> wrote:
> 
> > Today, when creating these device private struct pages, the first step
> > is to use request_free_mem_region() to get a range of physical address
> > space large enough to represent the devices memory. This allocated
> > physical address range is then remapped as device private memory using
> > memremap_pages.
> 
> Welcome to Linux MM.  That's a heck of an opening salvo ;)
> 
> > Needing allocation of physical address space has some problems:
> > 
> >   1) There may be insufficient physical address space to represent the
> >      device memory. KASLR reducing the physical address space and VM
> >      configurations with limited physical address space increase the
> >      likelihood of hitting this especially as device memory increases. This
> >      has been observed to prevent device private from being initialized.  
> > 
> >   2) Attempting to add the device private pages to the linear map at
> >      addresses beyond the actual physical memory causes issues on
> >      architectures like aarch64  - meaning the feature does not work there [0].
> 
> Can you better help us understand the seriousness of these problems? 
> How much are our users really hurting from this?

Hopefully the rest of the thread helps address this.

> > Seeking opinions on using the mpfns like this or if a new type would be
> > preferred.
> 
> Whose opinions?  IOW, can you suggest who you'd like to see review this
> work?

I was going to see if I could find Lorenzo on IRC as I think it would be good to
get his opinion on the softleaf changes. And probably Felix's (and my) opinion
on the mpfn changes (I don't think Intel currently uses DEVICE_COHERENT, which
is where this bit has the biggest impact).

> > 
> > * NOTE: I will need help in testing the driver changes *
> > 
> 
> Again, please name names ;)  I'm not afraid to prod.

As noted in the other thread, Intel Xe and AMD GPU are the biggest. Matthew has
already offered to help test Intel (thanks!) and Felix saw the v1 posting, so
I'm hoping he can help with testing there.

> I'm reluctant to add this to mm.git's development/testing branches at
> this time.  Your advice on when you think we're ready for that step
> would be valuable, thanks.

Will leave the readiness call to Jordan, but we were hoping to get
this in for the v6.20 merge window if at all possible. I realise
we're probably running late given we generally like to let stuff
settle in development/testing branches for a while prior to the
merge window, but it did have an early round of review last year
(https://lore.kernel.org/linux-mm/20251128044146.80050-1-jniethe@nvidia.com/)
and I reviewed it internally and it looked very reasonable.

I will take a look at this latest version later today.

 - Alistair


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/11] Remove device private pages from physical address space
  2026-01-07 18:36 ` [PATCH v2 00/11] Remove device private pages from " Matthew Brost
  2026-01-07 20:21   ` Zi Yan
@ 2026-01-08  2:25   ` Jordan Niethe
  2026-01-08  5:42     ` Jordan Niethe
  1 sibling, 1 reply; 33+ messages in thread
From: Jordan Niethe @ 2026-01-08  2:25 UTC (permalink / raw)
  To: Matthew Brost
  Cc: linux-mm, balbirs, akpm, linux-kernel, dri-devel, david, ziy,
	apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

Hi,

On 8/1/26 05:36, Matthew Brost wrote:
> 
> Thanks for the series. For some reason Intel's CI couldn't apply this
> series to drm-tip to get results [1]. I'll manually apply this and run all
> our SVM tests and get back to you on results + review the changes here.
> For future reference, if you want to use our CI system the series must
> apply to drm-tip; feel free to rebase this series and just send it to the
> intel-xe list if you want CI results.

Thanks, I'll rebase on drm-tip and send to the intel-xe list.

Jordan.

> 
> I was also wondering if Nvidia could help review one of our core MM patches
> [2] which is gating the enabling of 2M device pages too?
> 
> Matt
> 
> [1] https://patchwork.freedesktop.org/series/159738/
> [2] https://patchwork.freedesktop.org/patch/694775/?series=159119&rev=1




^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/11] Remove device private pages from physical address space
  2026-01-08  1:49   ` Alistair Popple
@ 2026-01-08  2:55     ` Jordan Niethe
  0 siblings, 0 replies; 33+ messages in thread
From: Jordan Niethe @ 2026-01-08  2:55 UTC (permalink / raw)
  To: Alistair Popple, Andrew Morton
  Cc: linux-mm, balbirs, matthew.brost, linux-kernel, dri-devel, david,
	ziy, lorenzo.stoakes, lyude, dakr, airlied, simona, rcampbell,
	mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling, maddy

Hi,

On 8/1/26 12:49, Alistair Popple wrote:
> On 2026-01-08 at 07:06 +1100, Andrew Morton <akpm@linux-foundation.org> wrote...
>> On Wed,  7 Jan 2026 20:18:12 +1100 Jordan Niethe <jniethe@nvidia.com> wrote:
>>
>>> Today, when creating these device private struct pages, the first step
>>> is to use request_free_mem_region() to get a range of physical address
>>> space large enough to represent the devices memory. This allocated
>>> physical address range is then remapped as device private memory using
>>> memremap_pages.
>>
>> Welcome to Linux MM.  That's a heck of an opening salvo ;)
>>
>>> Needing allocation of physical address space has some problems:
>>>
>>>    1) There may be insufficient physical address space to represent the
>>>       device memory. KASLR reducing the physical address space and VM
>>>       configurations with limited physical address space increase the
>>>       likelihood of hitting this especially as device memory increases. This
>>>       has been observed to prevent device private from being initialized.
>>>
>>>    2) Attempting to add the device private pages to the linear map at
>>>       addresses beyond the actual physical memory causes issues on
>>>       architectures like aarch64  - meaning the feature does not work there [0].
>>
>> Can you better help us understand the seriousness of these problems?
>> How much are our users really hurting from this?
> 
> Hopefully the rest of the thread helps address this.
> 
>>> Seeking opinions on using the mpfns like this or if a new type would be
>>> preferred.
>>
>> Whose opinions?  IOW, can you suggest who you'd like to see review this
>> work?
> 
> I was going to see if I could find Lorenzo on IRC as I think it would be good to
> get his opinion on the softleaf changes. And probably Felix's (and my) opinion
> for the mpfn changes (I don't think Intel currently uses DEVICE_COHERENT which
> this bit has the biggest impact on).

It also affects Intel's driver because the mpfn changes also touch
migrate_device_pfns(), which gets used there.

So I'm also looking for Matthew's thoughts here as well as Felix's.

> 
>>>
>>> * NOTE: I will need help in testing the driver changes *
>>>
>>
>> Again, please name names ;)  I'm not afraid to prod.
> 
> As noted in the other thread, Intel Xe and AMD GPU are the biggest. Matthew has
> already offered to help test Intel (thanks!) and Felix saw the v1 posting, so
> I'm hoping he can help with testing there.

Yes, I should also be able to run this through the intel-xe CI.
The other area that needs testing is the powerpc ultravisor.
(+cc) Madhavan Srinivasan - are you able to help here?

> 
>> I'm reluctant to add this to mm.git's development/testing branches at
>> this time.  Your advice on when you think we're ready for that step
>> would be valuable, thanks.
> 
> Will leave the readiness call to Jordan, but we were hoping to get
> this in for the v6.20 merge window if at all possible. I realise
> we're probably running late given we generally like to let stuff
> settle in development/testing branches for a while prior to the
> merge window, but it did have an early round of review last year
> (https://lore.kernel.org/linux-mm/20251128044146.80050-1-jniethe@nvidia.com/)
> and I reviewed it internally and it looked very reasonable.

Matt has kindly said that he is reviewing the patches, so I will wait for
his feedback.
I'd also like to get the results from the intel-xe CI first.

Andrew, I'll advise on inclusion in mm.git after these steps - but I don't
expect any major issues at this stage.  The changes have been solid with the
hmm selftests and with updating our out-of-tree driver to use the new
interface.


Thanks,
Jordan.

> 
> I will take a look at this latest version later today.
> 
>   - Alistair



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/11] Remove device private pages from physical address space
  2026-01-08  2:25   ` Jordan Niethe
@ 2026-01-08  5:42     ` Jordan Niethe
  2026-01-09  0:01       ` Jordan Niethe
  0 siblings, 1 reply; 33+ messages in thread
From: Jordan Niethe @ 2026-01-08  5:42 UTC (permalink / raw)
  To: Matthew Brost
  Cc: linux-mm, balbirs, akpm, linux-kernel, dri-devel, david, ziy,
	apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

Hi,

On 8/1/26 13:25, Jordan Niethe wrote:
> Hi,
> 
> On 8/1/26 05:36, Matthew Brost wrote:
>>
>> Thanks for the series. For some reason Intel's CI couldn't apply this
>> series to drm-tip to get results [1]. I'll manually apply this and run 
>> all
>> our SVM tests and get back you on results + review the changes here. For
>> future reference if you want to use our CI system, the series must apply
>> to drm-tip, feel free to rebase this series and just send to intel-xe
>> list if you want CI 
> 
> Thanks, I'll rebase on drm-tip and send to the intel-xe list.

For reference, the rebase on drm-tip is on the intel-xe list:

https://patchwork.freedesktop.org/series/159738/

Will watch the CI results.

Thanks,
Jordan.

> 
> Jordan.
> 
>>
>> I was also wondering if Nvidia could help review one our core MM patches
>> [2] which is gating enabling 2M device pages too?
>>
>> Matt
>>
>> [1] https://patchwork.freedesktop.org/series/159738/
>> [2] https://patchwork.freedesktop.org/patch/694775/?series=159119&rev=1
> 
> 



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 04/11] mm/migrate_device: Add migrate PFN flag to track device private pages
  2026-01-07  9:18 ` [PATCH v2 04/11] mm/migrate_device: Add migrate PFN flag to track device private pages Jordan Niethe
@ 2026-01-08 20:01   ` Felix Kuehling
  2026-01-08 23:41     ` Jordan Niethe
  0 siblings, 1 reply; 33+ messages in thread
From: Felix Kuehling @ 2026-01-08 20:01 UTC (permalink / raw)
  To: Jordan Niethe, linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg


On 2026-01-07 04:18, Jordan Niethe wrote:
> A future change will remove device private pages from the physical
> address space. This will mean that device private pages no longer have
> normal PFN and must be handled separately.
>
> Prepare for this by adding a MIGRATE_PFN_DEVICE_PRIVATE flag to indicate
> that a migrate pfn contains a PFN for a device private page.
>
> Signed-off-by: Jordan Niethe <jniethe@nvidia.com>
> Signed-off-by: Alistair Popple <apopple@nvidia.com>
>
> ---
> v1:
> - Update for HMM huge page support
> - Update existing drivers to use MIGRATE_PFN_DEVICE
> v2:
> - Include changes to migrate_pfn_from_page()
> - Rename to MIGRATE_PFN_DEVICE_PRIVATE
> - drm/amd: Check adev->gmc.xgmi.connected_to_cpu
> - lib/test_hmm.c: Check chunk->pagemap.type == MEMORY_DEVICE_PRIVATE
> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  7 ++++++-
>   drivers/gpu/drm/nouveau/nouveau_dmem.c   |  3 ++-
>   drivers/gpu/drm/xe/xe_svm.c              |  2 +-
>   include/linux/migrate.h                  | 14 +++++++++-----
>   lib/test_hmm.c                           |  6 +++++-
>   5 files changed, 23 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> index c493b19268cc..1a07a8b92e8f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> @@ -206,7 +206,12 @@ svm_migrate_copy_done(struct amdgpu_device *adev, struct dma_fence *mfence)
>   unsigned long
>   svm_migrate_addr_to_mpfn(struct amdgpu_device *adev, unsigned long addr)
>   {
> -	return migrate_pfn((addr + adev->kfd.pgmap.range.start) >> PAGE_SHIFT);
> +	unsigned long flags = 0;
> +
> +	if (!adev->gmc.xgmi.connected_to_cpu)

We could probably use adev->kfd.pgmap.type == MEMORY_DEVICE_PRIVATE
here. This avoids making any assumptions about how KFD decides which device
page type to use, which may change on future HW generations.
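
I.e. something like this (untested, just to illustrate the suggestion):

	unsigned long
	svm_migrate_addr_to_mpfn(struct amdgpu_device *adev, unsigned long addr)
	{
		unsigned long flags = 0;

		/* Key off the pagemap type rather than the XGMI topology. */
		if (adev->kfd.pgmap.type == MEMORY_DEVICE_PRIVATE)
			flags |= MIGRATE_PFN_DEVICE_PRIVATE;
		return migrate_pfn((addr + adev->kfd.pgmap.range.start) >> PAGE_SHIFT) |
		       flags;
	}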

Other than that, this looks good to me.

Thanks,
   Felix


> +		flags |= MIGRATE_PFN_DEVICE_PRIVATE;
> +	return migrate_pfn((addr + adev->kfd.pgmap.range.start) >> PAGE_SHIFT) |
> +	       flags;
>   }
>   
>   static void
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index bd3f7102c3f9..adfa3df5cbc5 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -484,7 +484,8 @@ nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk)
>   	dma_info = kvcalloc(npages, sizeof(*dma_info), GFP_KERNEL | __GFP_NOFAIL);
>   
>   	migrate_device_range(src_pfns,
> -			     migrate_pfn(chunk->pagemap.range.start >> PAGE_SHIFT),
> +			     migrate_pfn(chunk->pagemap.range.start >> PAGE_SHIFT) |
> +			     MIGRATE_PFN_DEVICE_PRIVATE,
>   			     npages);
>   
>   	for (i = 0; i < npages; i++) {
> diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> index 260676b0d246..f82790d7e7e6 100644
> --- a/drivers/gpu/drm/xe/xe_svm.c
> +++ b/drivers/gpu/drm/xe/xe_svm.c
> @@ -698,7 +698,7 @@ static int xe_svm_populate_devmem_mpfn(struct drm_pagemap_devmem *devmem_allocat
>   		int i;
>   
>   		for (i = 0; i < drm_buddy_block_size(buddy, block) >> PAGE_SHIFT; ++i)
> -			pfn[j++] = migrate_pfn(block_pfn + i);
> +			pfn[j++] = migrate_pfn(block_pfn + i) | MIGRATE_PFN_DEVICE_PRIVATE;
>   	}
>   
>   	return 0;
> diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> index d269ec1400be..5fd2ee080bc0 100644
> --- a/include/linux/migrate.h
> +++ b/include/linux/migrate.h
> @@ -122,11 +122,12 @@ static inline int migrate_misplaced_folio(struct folio *folio, int node)
>    * have enough bits to store all physical address and flags. So far we have
>    * enough room for all our flags.
>    */
> -#define MIGRATE_PFN_VALID	(1UL << 0)
> -#define MIGRATE_PFN_MIGRATE	(1UL << 1)
> -#define MIGRATE_PFN_WRITE	(1UL << 3)
> -#define MIGRATE_PFN_COMPOUND	(1UL << 4)
> -#define MIGRATE_PFN_SHIFT	6
> +#define MIGRATE_PFN_VALID		(1UL << 0)
> +#define MIGRATE_PFN_MIGRATE		(1UL << 1)
> +#define MIGRATE_PFN_WRITE		(1UL << 3)
> +#define MIGRATE_PFN_COMPOUND		(1UL << 4)
> +#define MIGRATE_PFN_DEVICE_PRIVATE	(1UL << 5)
> +#define MIGRATE_PFN_SHIFT		6
>   
>   static inline struct page *migrate_pfn_to_page(unsigned long mpfn)
>   {
> @@ -142,6 +143,9 @@ static inline unsigned long migrate_pfn(unsigned long pfn)
>   
>   static inline unsigned long migrate_pfn_from_page(struct page *page)
>   {
> +	if (is_device_private_page(page))
> +		return migrate_pfn(page_to_pfn(page)) |
> +		       MIGRATE_PFN_DEVICE_PRIVATE;
>   	return migrate_pfn(page_to_pfn(page));
>   }
>   
> diff --git a/lib/test_hmm.c b/lib/test_hmm.c
> index a6ff292596f3..872d3846af7b 100644
> --- a/lib/test_hmm.c
> +++ b/lib/test_hmm.c
> @@ -1385,11 +1385,15 @@ static void dmirror_device_evict_chunk(struct dmirror_chunk *chunk)
>   	unsigned long *src_pfns;
>   	unsigned long *dst_pfns;
>   	unsigned int order = 0;
> +	unsigned long flags = 0;
>   
>   	src_pfns = kvcalloc(npages, sizeof(*src_pfns), GFP_KERNEL | __GFP_NOFAIL);
>   	dst_pfns = kvcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL | __GFP_NOFAIL);
>   
> -	migrate_device_range(src_pfns, migrate_pfn(start_pfn), npages);
> +	if (chunk->pagemap.type == MEMORY_DEVICE_PRIVATE)
> +		flags |= MIGRATE_PFN_DEVICE_PRIVATE;
> +
> +	migrate_device_range(src_pfns, migrate_pfn(start_pfn) | flags, npages);
>   	for (i = 0; i < npages; i++) {
>   		struct page *dpage, *spage;
>   


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 01/11] mm/migrate_device: Introduce migrate_pfn_from_page() helper
  2026-01-07  9:18 ` [PATCH v2 01/11] mm/migrate_device: Introduce migrate_pfn_from_page() helper Jordan Niethe
@ 2026-01-08 20:03   ` Felix Kuehling
  2026-01-08 23:49     ` Jordan Niethe
  0 siblings, 1 reply; 33+ messages in thread
From: Felix Kuehling @ 2026-01-08 20:03 UTC (permalink / raw)
  To: Jordan Niethe, linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg


On 2026-01-07 04:18, Jordan Niethe wrote:
> To create a migrate from a given struct page, that page is first
> converted to its pfn, before passing the pfn to migrate_pfn().
>
> A future change will remove device private pages from the physical
> address space. This will mean that device private pages no longer have a
> pfn and must be handled separately.
>
> Prepare for this with a new helper:
>
>      - migrate_pfn_from_page()
>
> This helper takes a struct page as parameter instead of a pfn. This will
> allow more flexibility for handling the mpfn differently for device
> private pages.
>
> Signed-off-by: Jordan Niethe <jniethe@nvidia.com>
> ---
> v2: New to series
> ---
>   arch/powerpc/kvm/book3s_hv_uvmem.c       |  2 +-
>   drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  2 +-
>   drivers/gpu/drm/drm_pagemap.c            |  2 +-
>   drivers/gpu/drm/nouveau/nouveau_dmem.c   |  4 ++--
>   include/linux/migrate.h                  |  5 +++++
>   lib/test_hmm.c                           | 11 ++++++-----
>   mm/migrate_device.c                      |  7 +++----
>   7 files changed, 19 insertions(+), 14 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
> index e5000bef90f2..67910900af7b 100644
> --- a/arch/powerpc/kvm/book3s_hv_uvmem.c
> +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
> @@ -784,7 +784,7 @@ static int kvmppc_svm_page_in(struct vm_area_struct *vma,
>   		}
>   	}
>   
> -	*mig.dst = migrate_pfn(page_to_pfn(dpage));
> +	*mig.dst = migrate_pfn_from_page(dpage);
>   	migrate_vma_pages(&mig);
>   out_finalize:
>   	migrate_vma_finalize(&mig);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> index af53e796ea1b..ca552c34ece2 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> @@ -646,7 +646,7 @@ svm_migrate_copy_to_ram(struct amdgpu_device *adev, struct svm_range *prange,
>   		pr_debug_ratelimited("dma mapping dst to 0x%llx, pfn 0x%lx\n",
>   				     dst[i] >> PAGE_SHIFT, page_to_pfn(dpage));
>   
> -		migrate->dst[i] = migrate_pfn(page_to_pfn(dpage));
> +		migrate->dst[i] = migrate_pfn_from_page(dpage);

You missed another instance of this in svm_migrate_copy_to_vram.

Regards,
   Felix


>   		j++;
>   	}
>   
> diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
> index 37d7cfbbb3e8..5ddf395847ef 100644
> --- a/drivers/gpu/drm/drm_pagemap.c
> +++ b/drivers/gpu/drm/drm_pagemap.c
> @@ -490,7 +490,7 @@ static int drm_pagemap_migrate_populate_ram_pfn(struct vm_area_struct *vas,
>   			goto free_pages;
>   
>   		page = folio_page(folio, 0);
> -		mpfn[i] = migrate_pfn(page_to_pfn(page));
> +		mpfn[i] = migrate_pfn_from_page(page);
>   
>   next:
>   		if (page)
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index 58071652679d..a7edcdca9701 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -249,7 +249,7 @@ static vm_fault_t nouveau_dmem_migrate_to_ram(struct vm_fault *vmf)
>   		goto done;
>   	}
>   
> -	args.dst[0] = migrate_pfn(page_to_pfn(dpage));
> +	args.dst[0] = migrate_pfn_from_page(dpage);
>   	if (order)
>   		args.dst[0] |= MIGRATE_PFN_COMPOUND;
>   	dfolio = page_folio(dpage);
> @@ -766,7 +766,7 @@ static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
>   		((paddr >> PAGE_SHIFT) << NVIF_VMM_PFNMAP_V0_ADDR_SHIFT);
>   	if (src & MIGRATE_PFN_WRITE)
>   		*pfn |= NVIF_VMM_PFNMAP_V0_W;
> -	mpfn = migrate_pfn(page_to_pfn(dpage));
> +	mpfn = migrate_pfn_from_page(dpage);
>   	if (folio_order(page_folio(dpage)))
>   		mpfn |= MIGRATE_PFN_COMPOUND;
>   	return mpfn;
> diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> index 26ca00c325d9..d269ec1400be 100644
> --- a/include/linux/migrate.h
> +++ b/include/linux/migrate.h
> @@ -140,6 +140,11 @@ static inline unsigned long migrate_pfn(unsigned long pfn)
>   	return (pfn << MIGRATE_PFN_SHIFT) | MIGRATE_PFN_VALID;
>   }
>   
> +static inline unsigned long migrate_pfn_from_page(struct page *page)
> +{
> +	return migrate_pfn(page_to_pfn(page));
> +}
> +
>   enum migrate_vma_direction {
>   	MIGRATE_VMA_SELECT_SYSTEM = 1 << 0,
>   	MIGRATE_VMA_SELECT_DEVICE_PRIVATE = 1 << 1,
> diff --git a/lib/test_hmm.c b/lib/test_hmm.c
> index 8af169d3873a..7e5248404d00 100644
> --- a/lib/test_hmm.c
> +++ b/lib/test_hmm.c
> @@ -727,7 +727,8 @@ static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
>   				rpage = BACKING_PAGE(dpage);
>   				rpage->zone_device_data = dmirror;
>   
> -				*dst = migrate_pfn(page_to_pfn(dpage)) | write;
> +				*dst = migrate_pfn_from_page(dpage) |
> +				       write;
>   				src_page = pfn_to_page(spfn + i);
>   
>   				if (spage)
> @@ -754,7 +755,7 @@ static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
>   		pr_debug("migrating from sys to dev pfn src: 0x%lx pfn dst: 0x%lx\n",
>   			 page_to_pfn(spage), page_to_pfn(dpage));
>   
> -		*dst = migrate_pfn(page_to_pfn(dpage)) | write;
> +		*dst = migrate_pfn_from_page(dpage) | write;
>   
>   		if (is_large) {
>   			int i;
> @@ -989,7 +990,7 @@ static vm_fault_t dmirror_devmem_fault_alloc_and_copy(struct migrate_vma *args,
>   
>   		if (dpage) {
>   			lock_page(dpage);
> -			*dst |= migrate_pfn(page_to_pfn(dpage));
> +			*dst |= migrate_pfn_from_page(dpage);
>   		}
>   
>   		for (i = 0; i < (1 << order); i++) {
> @@ -1000,7 +1001,7 @@ static vm_fault_t dmirror_devmem_fault_alloc_and_copy(struct migrate_vma *args,
>   			if (!dpage && order) {
>   				dpage = alloc_page_vma(GFP_HIGHUSER_MOVABLE, args->vma, addr);
>   				lock_page(dpage);
> -				dst[i] = migrate_pfn(page_to_pfn(dpage));
> +				dst[i] = migrate_pfn_from_page(dpage);
>   				dst_page = pfn_to_page(page_to_pfn(dpage));
>   				dpage = NULL; /* For the next iteration */
>   			} else {
> @@ -1412,7 +1413,7 @@ static void dmirror_device_evict_chunk(struct dmirror_chunk *chunk)
>   
>   		/* TODO Support splitting here */
>   		lock_page(dpage);
> -		dst_pfns[i] = migrate_pfn(page_to_pfn(dpage));
> +		dst_pfns[i] = migrate_pfn_from_page(dpage);
>   		if (src_pfns[i] & MIGRATE_PFN_WRITE)
>   			dst_pfns[i] |= MIGRATE_PFN_WRITE;
>   		if (order)
> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
> index 23379663b1e1..1a2067f830da 100644
> --- a/mm/migrate_device.c
> +++ b/mm/migrate_device.c
> @@ -207,9 +207,8 @@ static int migrate_vma_collect_huge_pmd(pmd_t *pmdp, unsigned long start,
>   			.vma = walk->vma,
>   		};
>   
> -		unsigned long pfn = page_to_pfn(folio_page(folio, 0));
> -
> -		migrate->src[migrate->npages] = migrate_pfn(pfn) | write
> +		migrate->src[migrate->npages] = migrate_pfn_from_page(folio_page(folio, 0))
> +						| write
>   						| MIGRATE_PFN_MIGRATE
>   						| MIGRATE_PFN_COMPOUND;
>   		migrate->dst[migrate->npages++] = 0;
> @@ -328,7 +327,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>   				goto again;
>   			}
>   
> -			mpfn = migrate_pfn(page_to_pfn(page)) |
> +			mpfn = migrate_pfn_from_page(page) |
>   					MIGRATE_PFN_MIGRATE;
>   			if (softleaf_is_device_private_write(entry))
>   				mpfn |= MIGRATE_PFN_WRITE;


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 02/11] drm/amdkfd: Use migrate pfns internally
  2026-01-07  9:18 ` [PATCH v2 02/11] drm/amdkfd: Use migrate pfns internally Jordan Niethe
@ 2026-01-08 22:00   ` Felix Kuehling
  2026-01-08 23:56     ` Jordan Niethe
  0 siblings, 1 reply; 33+ messages in thread
From: Felix Kuehling @ 2026-01-08 22:00 UTC (permalink / raw)
  To: Jordan Niethe, linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg


On 2026-01-07 04:18, Jordan Niethe wrote:
> A future change will remove device private pages from the physical
> address space. This will mean that device private pages no longer have a
> pfn.
>
> A MIGRATE_PFN flag will be introduced that distinguishes between mpfns
> that contain a pfn vs an offset into device private memory.
>
> Replace usages of pfns and page_to_pfn() to mpfns and
> migrate_pfn_to_page() to prepare for handling this distinction. This
> will assist in continuing to use the same code paths for both
> MEMORY_DEVICE_PRIVATE and MEMORY_DEVICE_COHERENT devices.
>
> Signed-off-by: Jordan Niethe <jniethe@nvidia.com>
> ---
> v2:
>    - New to series
> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 15 +++++++--------
>   drivers/gpu/drm/amd/amdkfd/kfd_migrate.h |  2 +-
>   2 files changed, 8 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> index ca552c34ece2..c493b19268cc 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> @@ -204,17 +204,17 @@ svm_migrate_copy_done(struct amdgpu_device *adev, struct dma_fence *mfence)
>   }
>   
>   unsigned long
> -svm_migrate_addr_to_pfn(struct amdgpu_device *adev, unsigned long addr)
> +svm_migrate_addr_to_mpfn(struct amdgpu_device *adev, unsigned long addr)
>   {
> -	return (addr + adev->kfd.pgmap.range.start) >> PAGE_SHIFT;
> +	return migrate_pfn((addr + adev->kfd.pgmap.range.start) >> PAGE_SHIFT);
>   }
>   
>   static void
> -svm_migrate_get_vram_page(struct svm_range *prange, unsigned long pfn)
> +svm_migrate_get_vram_page(struct svm_range *prange, unsigned long mpfn)
>   {
>   	struct page *page;
>   
> -	page = pfn_to_page(pfn);
> +	page = migrate_pfn_to_page(mpfn);
>   	svm_range_bo_ref(prange->svm_bo);
>   	page->zone_device_data = prange->svm_bo;
>   	zone_device_page_init(page, 0);
> @@ -225,7 +225,7 @@ svm_migrate_put_vram_page(struct amdgpu_device *adev, unsigned long addr)
>   {
>   	struct page *page;
>   
> -	page = pfn_to_page(svm_migrate_addr_to_pfn(adev, addr));
> +	page = migrate_pfn_to_page(svm_migrate_addr_to_mpfn(adev, addr));
>   	unlock_page(page);
>   	put_page(page);
>   }
> @@ -235,7 +235,7 @@ svm_migrate_addr(struct amdgpu_device *adev, struct page *page)
>   {
>   	unsigned long addr;
>   
> -	addr = page_to_pfn(page) << PAGE_SHIFT;
> +	addr = (migrate_pfn_from_page(page) >> MIGRATE_PFN_SHIFT) << PAGE_SHIFT;
>   	return (addr - adev->kfd.pgmap.range.start);

I guess we rely on the fact that for DEVICE_PRIVATE memory, 
adev->kfd.pgmap.range.start will be 0 after your patch 11. So we don't 
need a special condition here to handle DEVICE_PRIVATE differently.

In general, I like the way you handle mpfns as it keeps all the special 
casing out of the drivers.

Regards,
   Felix


>   }
>   
> @@ -301,9 +301,8 @@ svm_migrate_copy_to_vram(struct kfd_node *node, struct svm_range *prange,
>   
>   		if (migrate->src[i] & MIGRATE_PFN_MIGRATE) {
>   			dst[i] = cursor.start + (j << PAGE_SHIFT);
> -			migrate->dst[i] = svm_migrate_addr_to_pfn(adev, dst[i]);
> +			migrate->dst[i] = svm_migrate_addr_to_mpfn(adev, dst[i]);
>   			svm_migrate_get_vram_page(prange, migrate->dst[i]);
> -			migrate->dst[i] = migrate_pfn(migrate->dst[i]);
>   			mpages++;
>   		}
>   		spage = migrate_pfn_to_page(migrate->src[i]);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.h b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.h
> index 2b7fd442d29c..a80b72abe1e0 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.h
> @@ -48,7 +48,7 @@ int svm_migrate_vram_to_ram(struct svm_range *prange, struct mm_struct *mm,
>   			    uint32_t trigger, struct page *fault_page);
>   
>   unsigned long
> -svm_migrate_addr_to_pfn(struct amdgpu_device *adev, unsigned long addr);
> +svm_migrate_addr_to_mpfn(struct amdgpu_device *adev, unsigned long addr);
>   
>   #endif /* IS_ENABLED(CONFIG_HSA_AMD_SVM) */
>   


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 04/11] mm/migrate_device: Add migrate PFN flag to track device private pages
  2026-01-08 20:01   ` Felix Kuehling
@ 2026-01-08 23:41     ` Jordan Niethe
  0 siblings, 0 replies; 33+ messages in thread
From: Jordan Niethe @ 2026-01-08 23:41 UTC (permalink / raw)
  To: Felix Kuehling, linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg

Hi,

On 9/1/26 07:01, Felix Kuehling wrote:
> 
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
>> index c493b19268cc..1a07a8b92e8f 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
>> @@ -206,7 +206,12 @@ svm_migrate_copy_done(struct amdgpu_device *adev, struct dma_fence *mfence)
>>   unsigned long
>>   svm_migrate_addr_to_mpfn(struct amdgpu_device *adev, unsigned long addr)
>>   {
>> -    return migrate_pfn((addr + adev->kfd.pgmap.range.start) >> PAGE_SHIFT);
>> +    unsigned long flags = 0;
>> +
>> +    if (!adev->gmc.xgmi.connected_to_cpu)
> 
> We could probably use adev->kfd.pgmap.type == MEMORY_DEVICE_PRIVATE
> here. This avoids making any assumptions about how KFD decides which device
> page type to use, which may change on future HW generations.
> 
> Other than that, this looks good to me.

That's a good point - I'll update.

Thanks for review,
Jordan.

> 
> Thanks,
>    Felix
> 
> 




^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 01/11] mm/migrate_device: Introduce migrate_pfn_from_page() helper
  2026-01-08 20:03   ` Felix Kuehling
@ 2026-01-08 23:49     ` Jordan Niethe
  0 siblings, 0 replies; 33+ messages in thread
From: Jordan Niethe @ 2026-01-08 23:49 UTC (permalink / raw)
  To: Felix Kuehling, linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg

Hi,

On 9/1/26 07:03, Felix Kuehling wrote:
> 
>> @@ -646,7 +646,7 @@ svm_migrate_copy_to_ram(struct amdgpu_device *adev, struct svm_range *prange,
>>           pr_debug_ratelimited("dma mapping dst to 0x%llx, pfn 0x%lx\n",
>>                        dst[i] >> PAGE_SHIFT, page_to_pfn(dpage));
>> -        migrate->dst[i] = migrate_pfn(page_to_pfn(dpage));
>> +        migrate->dst[i] = migrate_pfn_from_page(dpage);
> 
> You missed another instance of this in svm_migrate_copy_to_vram.

I might be missing something, but is there a call to migrate_pfn() in
svm_migrate_copy_to_vram()?  I'm seeing that svm_migrate_copy_to_vram() calls
svm_migrate_addr_to_mpfn() - that should be handled already.

Thanks for reviewing,
Jordan.

> 
> Regards,
>    Felix
> 
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 02/11] drm/amdkfd: Use migrate pfns internally
  2026-01-08 22:00   ` Felix Kuehling
@ 2026-01-08 23:56     ` Jordan Niethe
  0 siblings, 0 replies; 33+ messages in thread
From: Jordan Niethe @ 2026-01-08 23:56 UTC (permalink / raw)
  To: Felix Kuehling, linux-mm
  Cc: balbirs, matthew.brost, akpm, linux-kernel, dri-devel, david,
	ziy, apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg

Hi,

On 9/1/26 09:00, Felix Kuehling wrote:
> 
>> @@ -235,7 +235,7 @@ svm_migrate_addr(struct amdgpu_device *adev, struct page *page)
>>   {
>>       unsigned long addr;
>> -    addr = page_to_pfn(page) << PAGE_SHIFT;
>> +    addr = (migrate_pfn_from_page(page) >> MIGRATE_PFN_SHIFT) << PAGE_SHIFT;
>>       return (addr - adev->kfd.pgmap.range.start);
> 
> I guess we rely on the fact that for DEVICE_PRIVATE memory,
> adev->kfd.pgmap.range.start will be 0 after your patch 11. So we don't need
> a special condition here to handle DEVICE_PRIVATE differently.

Actually pgmap.range.start won't be zero - part of the change to
memremap_device_private_pagemap() in patch 11 is that range is used as an
output parameter.  It returns the range we allocate for the pagemap from the
device_private_pgmap_tree maple tree, representing "device private address
space".

But you're correct that this means we don't need special handling here.
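
So the calculation from patch 2 keeps working as-is, e.g. (sketch):

	/*
	 * range.start is now the base of the range allocated from the device
	 * private address space, so subtracting it still gives the
	 * device-local offset.
	 */
	addr = (migrate_pfn_from_page(page) >> MIGRATE_PFN_SHIFT) << PAGE_SHIFT;
	return (addr - adev->kfd.pgmap.range.start);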

> 
> In general, I like the way you handle mpfns as it keeps all the special 
> casing out of the drivers.

Yeah, it does turn out quite neat.
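
The driver-facing pattern stays symmetric either way, e.g. (sketch based on
the hunks above):

	/* Encode: the helper sets MIGRATE_PFN_DEVICE_PRIVATE when needed. */
	migrate->dst[i] = migrate_pfn_from_page(dpage);

	/* Decode: works whether the mpfn holds a real PFN or a device
	 * private offset. */
	spage = migrate_pfn_to_page(migrate->src[i]);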

Thanks for review,
Jordan.

> 
> Regards,
>    Felix
> 
> 




^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/11] Remove device private pages from physical address space
  2026-01-08  5:42     ` Jordan Niethe
@ 2026-01-09  0:01       ` Jordan Niethe
  2026-01-09  0:31         ` Matthew Brost
  0 siblings, 1 reply; 33+ messages in thread
From: Jordan Niethe @ 2026-01-09  0:01 UTC (permalink / raw)
  To: Matthew Brost
  Cc: linux-mm, balbirs, akpm, linux-kernel, dri-devel, david, ziy,
	apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

Hi,

On 8/1/26 16:42, Jordan Niethe wrote:
> Hi,
> 
> On 8/1/26 13:25, Jordan Niethe wrote:
>> Hi,
>>
>> On 8/1/26 05:36, Matthew Brost wrote:
>>>
>>> Thanks for the series. For some reason Intel's CI couldn't apply this
>>> series to drm-tip to get results [1]. I'll manually apply this and 
>>> run all
>>> our SVM tests and get back you on results + review the changes here. For
>>> future reference if you want to use our CI system, the series must apply
>>> to drm-tip, feel free to rebase this series and just send to intel-xe
>>> list if you want CI 
>>
>> Thanks, I'll rebase on drm-tip and send to the intel-xe list.
> 
> For reference, the rebase on drm-tip is on the intel-xe list:
> 
> https://patchwork.freedesktop.org/series/159738/
> 
> Will watch the CI results.

The series causes some failures in the intel-xe tests:
https://patchwork.freedesktop.org/series/159738/#rev4

Working through the failures now.

Thanks,
Jordan.

> 
> Thanks,
> Jordan.
> 
>>
>> Jordan.
>>
>>>
>>> I was also wondering if Nvidia could help review one our core MM patches
>>> [2] which is gating enabling 2M device pages too?
>>>
>>> Matt
>>>
>>> [1] https://patchwork.freedesktop.org/series/159738/
>>> [2] https://patchwork.freedesktop.org/patch/694775/?series=159119&rev=1
>>
>>
> 



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/11] Remove device private pages from physical address space
  2026-01-09  0:01       ` Jordan Niethe
@ 2026-01-09  0:31         ` Matthew Brost
  2026-01-09  1:27           ` Jordan Niethe
  0 siblings, 1 reply; 33+ messages in thread
From: Matthew Brost @ 2026-01-09  0:31 UTC (permalink / raw)
  To: Jordan Niethe
  Cc: linux-mm, balbirs, akpm, linux-kernel, dri-devel, david, ziy,
	apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

On Fri, Jan 09, 2026 at 11:01:13AM +1100, Jordan Niethe wrote:
> Hi,
> 
> On 8/1/26 16:42, Jordan Niethe wrote:
> > Hi,
> > 
> > On 8/1/26 13:25, Jordan Niethe wrote:
> > > Hi,
> > > 
> > > On 8/1/26 05:36, Matthew Brost wrote:
> > > > 
> > > > Thanks for the series. For some reason Intel's CI couldn't apply this
> > > > series to drm-tip to get results [1]. I'll manually apply this
> > > > and run all
> > > > our SVM tests and get back you on results + review the changes here. For
> > > > future reference if you want to use our CI system, the series must apply
> > > > to drm-tip, feel free to rebase this series and just send to intel-xe
> > > > list if you want CI
> > > 
> > > Thanks, I'll rebase on drm-tip and send to the intel-xe list.
> > 
> > For reference the rebase on drm-tip on the intel-xe list:
> > 
> > https://patchwork.freedesktop.org/series/159738/
> > 
> > Will watch the CI results.
> 
> The series causes some failures in the intel-xe tests:
> https://patchwork.freedesktop.org/series/159738/#rev4
> 
> Working through the failures now.
> 

Yeah, I saw the failures. I haven't had time to look at the patches on my
end quite yet. I'm scrambling to get a few things into the 6.20/7.0 PR, so I
may not have bandwidth to look in depth until mid next week, but digging in
is on my TODO list.

Matt 

> Thanks,
> Jordan.
> 
> > 
> > Thanks,
> > Jordan.
> > 
> > > 
> > > Jordan.
> > > 
> > > > 
> > > > I was also wondering if Nvidia could help review one our core MM patches
> > > > [2] which is gating enabling 2M device pages too?
> > > > 
> > > > Matt
> > > > 
> > > > [1] https://patchwork.freedesktop.org/series/159738/
> > > > [2] https://patchwork.freedesktop.org/patch/694775/?series=159119&rev=1
> > > 
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/11] Remove device private pages from physical address space
  2026-01-09  0:31         ` Matthew Brost
@ 2026-01-09  1:27           ` Jordan Niethe
  2026-01-09  6:22             ` Matthew Brost
  0 siblings, 1 reply; 33+ messages in thread
From: Jordan Niethe @ 2026-01-09  1:27 UTC (permalink / raw)
  To: Matthew Brost
  Cc: linux-mm, balbirs, akpm, linux-kernel, dri-devel, david, ziy,
	apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

Hi
On 9/1/26 11:31, Matthew Brost wrote:
> On Fri, Jan 09, 2026 at 11:01:13AM +1100, Jordan Niethe wrote:
>> Hi,
>>
>> On 8/1/26 16:42, Jordan Niethe wrote:
>>> Hi,
>>>
>>> On 8/1/26 13:25, Jordan Niethe wrote:
>>>> Hi,
>>>>
>>>> On 8/1/26 05:36, Matthew Brost wrote:
>>>>>
>>>>> Thanks for the series. For some reason Intel's CI couldn't apply this
>>>>> series to drm-tip to get results [1]. I'll manually apply this
>>>>> and run all
>>>>> our SVM tests and get back you on results + review the changes here. For
>>>>> future reference if you want to use our CI system, the series must apply
>>>>> to drm-tip, feel free to rebase this series and just send to intel-xe
>>>>> list if you want CI
>>>>
>>>> Thanks, I'll rebase on drm-tip and send to the intel-xe list.
>>>
>>> For reference the rebase on drm-tip on the intel-xe list:
>>>
>>> https://patchwork.freedesktop.org/series/159738/
>>>
>>> Will watch the CI results.
>>
>> The series causes some failures in the intel-xe tests:
>> https://patchwork.freedesktop.org/series/159738/#rev4
>>
>> Working through the failures now.
>>
> 
> Yeah, I saw the failures. I haven't had time to look at the patches on my
> end quite yet. I'm scrambling to get a few things into the 6.20/7.0 PR, so I
> may not have bandwidth to look in depth until mid next week, but digging in
> is on my TODO list.

Sure, that's completely fine. The failures seem pretty directly related to
the series, so I think I'll be able to make good progress.

For example:
https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-159738v4/bat-bmg-2/igt@xe_evict@evict-beng-small.html

It looks like I missed that xe_pagemap_destroy_work() needs to be updated to
remove the call to devm_release_mem_region() now that we are no longer
reserving a mem region.
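
I.e. something roughly like this on the teardown side (illustrative only,
not the exact xe code):

	/*
	 * Illustrative sketch: the destroy path keeps devm_memunmap_pages()
	 * but drops the paired devm_release_mem_region(), since no physical
	 * region is reserved any more.
	 */
	devm_memunmap_pages(dev, pagemap);
	/* devm_release_mem_region(dev, ...) is no longer needed */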
  
  


Thanks,
Jordan.

> 
> Matt
> 
>> Thanks,
>> Jordan.
>>
>>>
>>> Thanks,
>>> Jordan.
>>>
>>>>
>>>> Jordan.
>>>>
>>>>>
>>>>> I was also wondering if Nvidia could help review one our core MM patches
>>>>> [2] which is gating enabling 2M device pages too?
>>>>>
>>>>> Matt
>>>>>
>>>>> [1] https://patchwork.freedesktop.org/series/159738/
>>>>> [2] https://patchwork.freedesktop.org/patch/694775/?series=159119&rev=1
>>>>
>>>>
>>>
>>



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/11] Remove device private pages from physical address space
  2026-01-09  1:27           ` Jordan Niethe
@ 2026-01-09  6:22             ` Matthew Brost
  0 siblings, 0 replies; 33+ messages in thread
From: Matthew Brost @ 2026-01-09  6:22 UTC (permalink / raw)
  To: Jordan Niethe
  Cc: linux-mm, balbirs, akpm, linux-kernel, dri-devel, david, ziy,
	apopple, lorenzo.stoakes, lyude, dakr, airlied, simona,
	rcampbell, mpenttil, jgg, willy, linuxppc-dev, intel-xe, jgg,
	Felix.Kuehling

On Fri, Jan 09, 2026 at 12:27:50PM +1100, Jordan Niethe wrote:
> Hi
> On 9/1/26 11:31, Matthew Brost wrote:
> > On Fri, Jan 09, 2026 at 11:01:13AM +1100, Jordan Niethe wrote:
> > > Hi,
> > > 
> > > On 8/1/26 16:42, Jordan Niethe wrote:
> > > > Hi,
> > > > 
> > > > On 8/1/26 13:25, Jordan Niethe wrote:
> > > > > Hi,
> > > > > 
> > > > > On 8/1/26 05:36, Matthew Brost wrote:
> > > > > > 
> > > > > > Thanks for the series. For some reason Intel's CI couldn't apply this
> > > > > > series to drm-tip to get results [1]. I'll manually apply this
> > > > > > and run all
> > > > > > our SVM tests and get back you on results + review the changes here. For
> > > > > > future reference if you want to use our CI system, the series must apply
> > > > > > to drm-tip, feel free to rebase this series and just send to intel-xe
> > > > > > list if you want CI
> > > > > 
> > > > > Thanks, I'll rebase on drm-tip and send to the intel-xe list.
> > > > 
> > > > For reference the rebase on drm-tip on the intel-xe list:
> > > > 
> > > > https://patchwork.freedesktop.org/series/159738/
> > > > 
> > > > Will watch the CI results.
> > > 
> > > The series causes some failures in the intel-xe tests:
> > > https://patchwork.freedesktop.org/series/159738/#rev4
> > > 
> > > Working through the failures now.
> > > 
> > 
> > Yea, I saw the failures. I haven't had time look at the patches on my
> > end quite yet. Scrabling to get a few things in 6.20/7.0 PR, so I may
> > not have bandwidth to look in depth until mid next week but digging is
> > on my TODO list.
> 
> Sure, that's completely fine. The failures seem pretty directly related to
> the series, so I think I'll be able to make good progress.
> 
> For example:
> https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-159738v4/bat-bmg-2/igt@xe_evict@evict-beng-small.html
> 
> It looks like I missed that xe_pagemap_destroy_work() needs to be updated to
> remove the call to devm_release_mem_region() now that we are no longer
> reserving a mem region.

+1

So this is the one I’d be most concerned about [1].
xe_exec_system_allocator is our SVM test, which does almost all the
ridiculous things possible in user space to stress SVM. It’s blowing up
in the core MM—but the source of the bug could be anywhere (e.g., Xe
SVM, GPU SVM, migrate device layer, or core MM). I’ll try to help when I
have bandwidth.

Matt

[1] https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-159738v4/shard-bmg-9/igt@xe_exec_system_allocator@threads-many-large-execqueues-free-nomemset.html

> 
> 
> Thanks,
> Jordan.
> 
> > 
> > Matt
> > 
> > > Thanks,
> > > Jordan.
> > > 
> > > > 
> > > > Thanks,
> > > > Jordan.
> > > > 
> > > > > 
> > > > > Jordan.
> > > > > 
> > > > > > 
> > > > > > I was also wondering if Nvidia could help review one our core MM patches
> > > > > > [2] which is gating enabling 2M device pages too?
> > > > > > 
> > > > > > Matt
> > > > > > 
> > > > > > [1] https://patchwork.freedesktop.org/series/159738/
> > > > > > [2] https://patchwork.freedesktop.org/patch/694775/?series=159119&rev=1
> > > > > 
> > > > > 
> > > > 
> > > 
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2026-01-09  6:22 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-01-07  9:18 [PATCH v2 00/11] Remove device private pages from physical address space Jordan Niethe
2026-01-07  9:18 ` [PATCH v2 01/11] mm/migrate_device: Introduce migrate_pfn_from_page() helper Jordan Niethe
2026-01-08 20:03   ` Felix Kuehling
2026-01-08 23:49     ` Jordan Niethe
2026-01-07  9:18 ` [PATCH v2 02/11] drm/amdkfd: Use migrate pfns internally Jordan Niethe
2026-01-08 22:00   ` Felix Kuehling
2026-01-08 23:56     ` Jordan Niethe
2026-01-07  9:18 ` [PATCH v2 03/11] mm/migrate_device: Make migrate_device_{pfns,range}() take mpfns Jordan Niethe
2026-01-07  9:18 ` [PATCH v2 04/11] mm/migrate_device: Add migrate PFN flag to track device private pages Jordan Niethe
2026-01-08 20:01   ` Felix Kuehling
2026-01-08 23:41     ` Jordan Niethe
2026-01-07  9:18 ` [PATCH v2 05/11] mm/page_vma_mapped: Add flags to page_vma_mapped_walk::pfn " Jordan Niethe
2026-01-07  9:18 ` [PATCH v2 06/11] mm: Add helpers to create migration entries from struct pages Jordan Niethe
2026-01-07  9:18 ` [PATCH v2 07/11] mm: Add a new swap type for migration entries of device private pages Jordan Niethe
2026-01-07  9:18 ` [PATCH v2 08/11] mm: Add helpers to create device private entries from struct pages Jordan Niethe
2026-01-07  9:18 ` [PATCH v2 09/11] mm/util: Add flag to track device private pages in page snapshots Jordan Niethe
2026-01-07  9:18 ` [PATCH v2 10/11] mm/hmm: Add flag to track device private pages Jordan Niethe
2026-01-07  9:18 ` [PATCH v2 11/11] mm: Remove device private pages from the physical address space Jordan Niethe
2026-01-07 18:36 ` [PATCH v2 00/11] Remove device private pages from " Matthew Brost
2026-01-07 20:21   ` Zi Yan
2026-01-08  2:25   ` Jordan Niethe
2026-01-08  5:42     ` Jordan Niethe
2026-01-09  0:01       ` Jordan Niethe
2026-01-09  0:31         ` Matthew Brost
2026-01-09  1:27           ` Jordan Niethe
2026-01-09  6:22             ` Matthew Brost
2026-01-07 20:06 ` Andrew Morton
2026-01-07 20:54   ` Jason Gunthorpe
2026-01-07 21:02     ` Balbir Singh
2026-01-08  1:29       ` Alistair Popple
2026-01-08  1:08   ` John Hubbard
2026-01-08  1:49   ` Alistair Popple
2026-01-08  2:55     ` Jordan Niethe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox