* [PATCH v5 0/5] Enable THP support in drm_pagemap
From: Francois Dugast @ 2026-01-14 19:19 UTC
To: intel-xe
Cc: dri-devel, Francois Dugast, Zi Yan, Madhavan Srinivasan,
Alistair Popple, Lorenzo Stoakes, Liam R . Howlett,
Suren Baghdasaryan, Michal Hocko, Mike Rapoport, Vlastimil Babka,
Nicholas Piggin, Michael Ellerman, Christophe Leroy (CS GROUP),
Felix Kuehling, Alex Deucher, Christian König, David Airlie,
Simona Vetter, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, Lyude Paul, Danilo Krummrich, Bjorn Helgaas,
Logan Gunthorpe, David Hildenbrand, Oscar Salvador,
Andrew Morton, Jason Gunthorpe, Leon Romanovsky, Balbir Singh,
Dan Williams, Matthew Wilcox, Jan Kara, Alexander Viro,
Christian Brauner, linuxppc-dev, kvm, linux-kernel, amd-gfx,
nouveau, linux-pci, linux-mm, linux-cxl, nvdimm, linux-fsdevel
Use Balbir Singh's series for device-private THP support [1] and previous
preparation work in drm_pagemap [2] to add 2MB/THP support in xe. This leads
to significant performance improvements when using SVM with 2MB pages.
[1] https://lore.kernel.org/linux-mm/20251001065707.920170-1-balbirs@nvidia.com/
[2] https://patchwork.freedesktop.org/series/151754/
v2:
- rebase on top of multi-device SVM
- add drm_pagemap_cpages() with temporary patch
- address other feedback from Matt Brost on v1
v3:
The major change is to remove the dependency on the mm/huge_memory
helper migrate_device_split_page() that was called explicitly when
a 2M buddy allocation backed by a large folio would later be reused
for a smaller allocation (4K or 64K). Instead, the first three patches
provided by Matthew Brost ensure large folios are split at the time
of freeing.
v4:
- add order argument to folio_free callback
- send complete series to linux-mm and MM folks as requested (Zi Yan
and Andrew Morton) and cover letter to anyone receiving at least
one of the patches (Liam R. Howlett)
v5:
- update zone_device_page_init() in patch #1 to reinitialize large
zone device private folios
Cc: Zi Yan <ziy@nvidia.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: linux-pci@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-cxl@vger.kernel.org
Cc: nvdimm@lists.linux.dev
Cc: linux-fsdevel@vger.kernel.org
Francois Dugast (3):
drm/pagemap: Unlock and put folios when possible
drm/pagemap: Add helper to access zone_device_data
drm/pagemap: Enable THP support for GPU memory migration
Matthew Brost (2):
mm/zone_device: Reinitialize large zone device private folios
drm/pagemap: Correct cpages calculation for migrate_vma_setup
arch/powerpc/kvm/book3s_hv_uvmem.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 2 +-
drivers/gpu/drm/drm_gpusvm.c | 7 +-
drivers/gpu/drm/drm_pagemap.c | 158 ++++++++++++++++++-----
drivers/gpu/drm/nouveau/nouveau_dmem.c | 2 +-
include/drm/drm_pagemap.h | 15 +++
include/linux/memremap.h | 9 +-
lib/test_hmm.c | 4 +-
mm/memremap.c | 20 ++-
9 files changed, 180 insertions(+), 39 deletions(-)
--
2.43.0
* [PATCH v5 1/5] mm/zone_device: Reinitialize large zone device private folios
From: Francois Dugast @ 2026-01-14 19:19 UTC
To: intel-xe
Cc: dri-devel, Matthew Brost, Zi Yan, Alistair Popple,
Madhavan Srinivasan, Nicholas Piggin, Michael Ellerman,
Christophe Leroy (CS GROUP),
Felix Kuehling, Alex Deucher, Christian König, David Airlie,
Simona Vetter, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, Lyude Paul, Danilo Krummrich,
David Hildenbrand, Oscar Salvador, Andrew Morton,
Jason Gunthorpe, Leon Romanovsky, Lorenzo Stoakes,
Liam R . Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Balbir Singh, linuxppc-dev,
kvm, linux-kernel, amd-gfx, nouveau, linux-mm, linux-cxl,
Francois Dugast
From: Matthew Brost <matthew.brost@intel.com>
Reinitialize metadata for large zone device private folios in
zone_device_page_init() prior to creating a higher-order zone device
private folio. This step is necessary when the folio's order changes
dynamically between zone_device_page_init() calls, to avoid building a
corrupt folio. As part of the metadata reinitialization, the dev_pagemap
must be passed in by the caller because the pgmap stored in the folio
page may have been overwritten by a compound head.
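As a minimal illustration of the new calling convention (the names below are hypothetical; the actual per-caller conversions are in the hunks that follow), a driver allocator now passes its own dev_pagemap explicitly rather than relying on the one stored in the page:

        /* 'dpage' is the first page of the device-private range being handed
         * out and may have belonged to a folio of a different order before;
         * 'my_pgmap' is the driver's own struct dev_pagemap. */
        zone_device_folio_init(page_folio(dpage), my_pgmap, order);
        dpage->zone_device_data = my_zone_device_data;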
Cc: Zi Yan <ziy@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: linux-mm@kvack.org
Cc: linux-cxl@vger.kernel.org
Fixes: d245f9b4ab80 ("mm/zone_device: support large zone device private folios")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
arch/powerpc/kvm/book3s_hv_uvmem.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 2 +-
drivers/gpu/drm/drm_pagemap.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_dmem.c | 2 +-
include/linux/memremap.h | 9 ++++++---
lib/test_hmm.c | 4 +++-
mm/memremap.c | 20 +++++++++++++++++++-
7 files changed, 32 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index e5000bef90f2..7cf9310de0ec 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -723,7 +723,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
dpage = pfn_to_page(uvmem_pfn);
dpage->zone_device_data = pvt;
- zone_device_page_init(dpage, 0);
+ zone_device_page_init(dpage, &kvmppc_uvmem_pgmap, 0);
return dpage;
out_clear:
spin_lock(&kvmppc_uvmem_bitmap_lock);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index af53e796ea1b..6ada7b4af7c6 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -217,7 +217,7 @@ svm_migrate_get_vram_page(struct svm_range *prange, unsigned long pfn)
page = pfn_to_page(pfn);
svm_range_bo_ref(prange->svm_bo);
page->zone_device_data = prange->svm_bo;
- zone_device_page_init(page, 0);
+ zone_device_page_init(page, page_pgmap(page), 0);
}
static void
diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
index 03ee39a761a4..c497726b0147 100644
--- a/drivers/gpu/drm/drm_pagemap.c
+++ b/drivers/gpu/drm/drm_pagemap.c
@@ -201,7 +201,7 @@ static void drm_pagemap_get_devmem_page(struct page *page,
struct drm_pagemap_zdd *zdd)
{
page->zone_device_data = drm_pagemap_zdd_get(zdd);
- zone_device_page_init(page, 0);
+ zone_device_page_init(page, zdd->dpagemap->pagemap, 0);
}
/**
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 58071652679d..3d8031296eed 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -425,7 +425,7 @@ nouveau_dmem_page_alloc_locked(struct nouveau_drm *drm, bool is_large)
order = ilog2(DMEM_CHUNK_NPAGES);
}
- zone_device_folio_init(folio, order);
+ zone_device_folio_init(folio, page_pgmap(folio_page(folio, 0)), order);
return page;
}
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 713ec0435b48..e3c2ccf872a8 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -224,7 +224,8 @@ static inline bool is_fsdax_page(const struct page *page)
}
#ifdef CONFIG_ZONE_DEVICE
-void zone_device_page_init(struct page *page, unsigned int order);
+void zone_device_page_init(struct page *page, struct dev_pagemap *pgmap,
+ unsigned int order);
void *memremap_pages(struct dev_pagemap *pgmap, int nid);
void memunmap_pages(struct dev_pagemap *pgmap);
void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap);
@@ -234,9 +235,11 @@ bool pgmap_pfn_valid(struct dev_pagemap *pgmap, unsigned long pfn);
unsigned long memremap_compat_align(void);
-static inline void zone_device_folio_init(struct folio *folio, unsigned int order)
+static inline void zone_device_folio_init(struct folio *folio,
+ struct dev_pagemap *pgmap,
+ unsigned int order)
{
- zone_device_page_init(&folio->page, order);
+ zone_device_page_init(&folio->page, pgmap, order);
if (order)
folio_set_large_rmappable(folio);
}
diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 8af169d3873a..455a6862ae50 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -662,7 +662,9 @@ static struct page *dmirror_devmem_alloc_page(struct dmirror *dmirror,
goto error;
}
- zone_device_folio_init(page_folio(dpage), order);
+ zone_device_folio_init(page_folio(dpage),
+ page_pgmap(folio_page(page_folio(dpage), 0)),
+ order);
dpage->zone_device_data = rpage;
return dpage;
diff --git a/mm/memremap.c b/mm/memremap.c
index 63c6ab4fdf08..6f46ab14662b 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -477,10 +477,28 @@ void free_zone_device_folio(struct folio *folio)
}
}
-void zone_device_page_init(struct page *page, unsigned int order)
+void zone_device_page_init(struct page *page, struct dev_pagemap *pgmap,
+ unsigned int order)
{
+ struct page *new_page = page;
+ unsigned int i;
+
VM_WARN_ON_ONCE(order > MAX_ORDER_NR_PAGES);
+ for (i = 0; i < (1UL << order); ++i, ++new_page) {
+ struct folio *new_folio = (struct folio *)new_page;
+
+ new_page->flags.f &= ~0xffUL; /* Clear possible order, page head */
+#ifdef NR_PAGES_IN_LARGE_FOLIO
+ ((struct folio *)(new_page - 1))->_nr_pages = 0;
+#endif
+ new_folio->mapping = NULL;
+ new_folio->pgmap = pgmap; /* Also clear compound head */
+ new_folio->share = 0; /* fsdax only, unused for device private */
+ VM_WARN_ON_FOLIO(folio_ref_count(new_folio), new_folio);
+ VM_WARN_ON_FOLIO(!folio_is_zone_device(new_folio), new_folio);
+ }
+
/*
* Drivers shouldn't be allocating pages after calling
* memunmap_pages().
--
2.43.0
* [PATCH v5 2/5] drm/pagemap: Unlock and put folios when possible
From: Francois Dugast @ 2026-01-14 19:19 UTC
To: intel-xe
Cc: dri-devel, Francois Dugast, Andrew Morton, David Hildenbrand,
Lorenzo Stoakes, Liam R . Howlett, Vlastimil Babka,
Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Zi Yan,
Alistair Popple, Balbir Singh, linux-mm, Matthew Brost
If the page is part of a folio, unlock and put the whole folio at once
instead of each page individually. This reduces the number of
operations once device THPs are in use.
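A minimal sketch of the resulting iteration pattern (illustrative only; the real loop in the hunk below also skips empty migrate_pfn entries and clears them after the put):

        for (i = 0; i < npages;) {
                struct folio *folio =
                        page_folio(migrate_pfn_to_page(migrate_pfn[i]));

                folio_unlock(folio);
                folio_put(folio);
                i += NR_PAGES(folio_order(folio)); /* step over the whole folio */
        }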
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: linux-mm@kvack.org
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
drivers/gpu/drm/drm_pagemap.c | 26 +++++++++++++++++---------
1 file changed, 17 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
index c497726b0147..b31090b8e97c 100644
--- a/drivers/gpu/drm/drm_pagemap.c
+++ b/drivers/gpu/drm/drm_pagemap.c
@@ -154,15 +154,15 @@ static void drm_pagemap_zdd_put(struct drm_pagemap_zdd *zdd)
}
/**
- * drm_pagemap_migration_unlock_put_page() - Put a migration page
- * @page: Pointer to the page to put
+ * drm_pagemap_migration_unlock_put_folio() - Put a migration folio
+ * @folio: Pointer to the folio to put
*
- * This function unlocks and puts a page.
+ * This function unlocks and puts a folio.
*/
-static void drm_pagemap_migration_unlock_put_page(struct page *page)
+static void drm_pagemap_migration_unlock_put_folio(struct folio *folio)
{
- unlock_page(page);
- put_page(page);
+ folio_unlock(folio);
+ folio_put(folio);
}
/**
@@ -177,15 +177,23 @@ static void drm_pagemap_migration_unlock_put_pages(unsigned long npages,
{
unsigned long i;
- for (i = 0; i < npages; ++i) {
+ for (i = 0; i < npages;) {
struct page *page;
+ struct folio *folio;
+ unsigned int order = 0;
if (!migrate_pfn[i])
- continue;
+ goto next;
page = migrate_pfn_to_page(migrate_pfn[i]);
- drm_pagemap_migration_unlock_put_page(page);
+ folio = page_folio(page);
+ order = folio_order(folio);
+
+ drm_pagemap_migration_unlock_put_folio(folio);
migrate_pfn[i] = 0;
+
+next:
+ i += NR_PAGES(order);
}
}
--
2.43.0
* [PATCH v5 3/5] drm/pagemap: Add helper to access zone_device_data
From: Francois Dugast @ 2026-01-14 19:19 UTC
To: intel-xe
Cc: dri-devel, Francois Dugast, Andrew Morton, David Hildenbrand,
Lorenzo Stoakes, Liam R . Howlett, Vlastimil Babka,
Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Zi Yan,
Alistair Popple, Balbir Singh, linux-mm, Matthew Brost
This new helper ensures that all accesses to zone_device_data use the
correct API, whether or not the page is part of a folio.
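A before/after sketch of a typical call site (illustrative only; the actual conversions are in the hunks below):

        /* Before: direct field access; for a tail page of a large folio this
         * may read stale per-page data. */
        zdd = page->zone_device_data;

        /* After: resolve the folio first, then read its zone_device_data. */
        zdd = drm_pagemap_page_zone_device_data(page);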
v2:
- Move to drm_pagemap.h, stick to folio_zone_device_data (Matthew Brost)
- Return struct drm_pagemap_zdd * (Matthew Brost)
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: linux-mm@kvack.org
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
drivers/gpu/drm/drm_gpusvm.c | 7 +++++--
drivers/gpu/drm/drm_pagemap.c | 21 ++++++++++++---------
include/drm/drm_pagemap.h | 15 +++++++++++++++
3 files changed, 32 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/drm_gpusvm.c b/drivers/gpu/drm/drm_gpusvm.c
index aa9a0b60e727..585d913d3d19 100644
--- a/drivers/gpu/drm/drm_gpusvm.c
+++ b/drivers/gpu/drm/drm_gpusvm.c
@@ -1488,12 +1488,15 @@ int drm_gpusvm_get_pages(struct drm_gpusvm *gpusvm,
order = drm_gpusvm_hmm_pfn_to_order(pfns[i], i, npages);
if (is_device_private_page(page) ||
is_device_coherent_page(page)) {
+ struct drm_pagemap_zdd *__zdd =
+ drm_pagemap_page_zone_device_data(page);
+
if (!ctx->allow_mixed &&
- zdd != page->zone_device_data && i > 0) {
+ zdd != __zdd && i > 0) {
err = -EOPNOTSUPP;
goto err_unmap;
}
- zdd = page->zone_device_data;
+ zdd = __zdd;
if (pagemap != page_pgmap(page)) {
if (i > 0) {
err = -EOPNOTSUPP;
diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
index b31090b8e97c..f613b4d48499 100644
--- a/drivers/gpu/drm/drm_pagemap.c
+++ b/drivers/gpu/drm/drm_pagemap.c
@@ -252,7 +252,7 @@ static int drm_pagemap_migrate_map_pages(struct device *dev,
order = folio_order(folio);
if (is_device_private_page(page)) {
- struct drm_pagemap_zdd *zdd = page->zone_device_data;
+ struct drm_pagemap_zdd *zdd = drm_pagemap_page_zone_device_data(page);
struct drm_pagemap *dpagemap = zdd->dpagemap;
struct drm_pagemap_addr addr;
@@ -323,7 +323,7 @@ static void drm_pagemap_migrate_unmap_pages(struct device *dev,
goto next;
if (is_zone_device_page(page)) {
- struct drm_pagemap_zdd *zdd = page->zone_device_data;
+ struct drm_pagemap_zdd *zdd = drm_pagemap_page_zone_device_data(page);
struct drm_pagemap *dpagemap = zdd->dpagemap;
dpagemap->ops->device_unmap(dpagemap, dev, pagemap_addr[i]);
@@ -611,7 +611,8 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
pages[i] = NULL;
if (src_page && is_device_private_page(src_page)) {
- struct drm_pagemap_zdd *src_zdd = src_page->zone_device_data;
+ struct drm_pagemap_zdd *src_zdd =
+ drm_pagemap_page_zone_device_data(src_page);
if (page_pgmap(src_page) == pagemap &&
!mdetails->can_migrate_same_pagemap) {
@@ -733,8 +734,8 @@ static int drm_pagemap_migrate_populate_ram_pfn(struct vm_area_struct *vas,
goto next;
if (fault_page) {
- if (src_page->zone_device_data !=
- fault_page->zone_device_data)
+ if (drm_pagemap_page_zone_device_data(src_page) !=
+ drm_pagemap_page_zone_device_data(fault_page))
goto next;
}
@@ -1075,7 +1076,7 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
void *buf;
int i, err = 0;
- zdd = page->zone_device_data;
+ zdd = drm_pagemap_page_zone_device_data(page);
if (time_before64(get_jiffies_64(), zdd->devmem_allocation->timeslice_expiration))
return 0;
@@ -1158,7 +1159,9 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
*/
static void drm_pagemap_folio_free(struct folio *folio)
{
- drm_pagemap_zdd_put(folio->page.zone_device_data);
+ struct page *page = folio_page(folio, 0);
+
+ drm_pagemap_zdd_put(drm_pagemap_page_zone_device_data(page));
}
/**
@@ -1174,7 +1177,7 @@ static void drm_pagemap_folio_free(struct folio *folio)
*/
static vm_fault_t drm_pagemap_migrate_to_ram(struct vm_fault *vmf)
{
- struct drm_pagemap_zdd *zdd = vmf->page->zone_device_data;
+ struct drm_pagemap_zdd *zdd = drm_pagemap_page_zone_device_data(vmf->page);
int err;
err = __drm_pagemap_migrate_to_ram(vmf->vma,
@@ -1240,7 +1243,7 @@ EXPORT_SYMBOL_GPL(drm_pagemap_devmem_init);
*/
struct drm_pagemap *drm_pagemap_page_to_dpagemap(struct page *page)
{
- struct drm_pagemap_zdd *zdd = page->zone_device_data;
+ struct drm_pagemap_zdd *zdd = drm_pagemap_page_zone_device_data(page);
return zdd->devmem_allocation->dpagemap;
}
diff --git a/include/drm/drm_pagemap.h b/include/drm/drm_pagemap.h
index 46e9c58f09e0..736fb6cb7b33 100644
--- a/include/drm/drm_pagemap.h
+++ b/include/drm/drm_pagemap.h
@@ -4,6 +4,7 @@
#include <linux/dma-direction.h>
#include <linux/hmm.h>
+#include <linux/memremap.h>
#include <linux/types.h>
#define NR_PAGES(order) (1U << (order))
@@ -359,4 +360,18 @@ int drm_pagemap_populate_mm(struct drm_pagemap *dpagemap,
void drm_pagemap_destroy(struct drm_pagemap *dpagemap, bool is_atomic_or_reclaim);
int drm_pagemap_reinit(struct drm_pagemap *dpagemap);
+
+/**
+ * drm_pagemap_page_zone_device_data() - Page to zone_device_data
+ * @page: Pointer to the page
+ *
+ * Return: Page's zone_device_data
+ */
+static inline struct drm_pagemap_zdd *drm_pagemap_page_zone_device_data(struct page *page)
+{
+ struct folio *folio = page_folio(page);
+
+ return folio_zone_device_data(folio);
+}
+
#endif
--
2.43.0
* [PATCH v5 4/5] drm/pagemap: Correct cpages calculation for migrate_vma_setup
From: Francois Dugast @ 2026-01-14 19:19 UTC
To: intel-xe
Cc: dri-devel, Matthew Brost, Andrew Morton, David Hildenbrand,
Lorenzo Stoakes, Liam R . Howlett, Vlastimil Babka,
Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Zi Yan,
Alistair Popple, Balbir Singh, linux-mm, Francois Dugast
From: Matthew Brost <matthew.brost@intel.com>
cpages returned from migrate_vma_setup() represents the total number of
individual pages found, not the number of 4K pages. The math for npages
in drm_pagemap_migrate_to_devmem() is based on the number of 4K pages,
so the cpages != npages check can fail even if the entire memory range
is found by migrate_vma_setup() (e.g., when a single 2M page is found).
Add drm_pagemap_cpages(), which computes the number of 4K pages found
from the collected migrate_pfn entries.
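As a concrete example (assuming 4K base pages and HPAGE_PMD_ORDER == 9, as on x86-64): for a 2MB range, npages is 512. If migrate_vma_setup() collects the range as a single compound entry, the raw cpages value may not match 512, but drm_pagemap_cpages() accounts for that entry as NR_PAGES(9) == 512 pages, so the all-or-nothing check in drm_pagemap_migrate_to_devmem() still passes.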
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: linux-mm@kvack.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
drivers/gpu/drm/drm_pagemap.c | 38 ++++++++++++++++++++++++++++++++++-
1 file changed, 37 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
index f613b4d48499..3fc466f04b13 100644
--- a/drivers/gpu/drm/drm_pagemap.c
+++ b/drivers/gpu/drm/drm_pagemap.c
@@ -452,6 +452,41 @@ static int drm_pagemap_migrate_range(struct drm_pagemap_devmem *devmem,
return ret;
}
+/**
+ * drm_pagemap_cpages() - Count collected pages
+ * @migrate_pfn: Array of migrate_pfn entries to account
+ * @npages: Number of entries in @migrate_pfn
+ *
+ * Compute the total number of minimum-sized pages represented by the
+ * collected entries in @migrate_pfn. The total is derived from the
+ * order encoded in each entry.
+ *
+ * Return: Total number of minimum-sized pages.
+ */
+static int drm_pagemap_cpages(unsigned long *migrate_pfn, unsigned long npages)
+{
+ unsigned long i, cpages = 0;
+
+ for (i = 0; i < npages;) {
+ struct page *page = migrate_pfn_to_page(migrate_pfn[i]);
+ struct folio *folio;
+ unsigned int order = 0;
+
+ if (page) {
+ folio = page_folio(page);
+ order = folio_order(folio);
+ cpages += NR_PAGES(order);
+ } else if (migrate_pfn[i] & MIGRATE_PFN_COMPOUND) {
+ order = HPAGE_PMD_ORDER;
+ cpages += NR_PAGES(order);
+ }
+
+ i += NR_PAGES(order);
+ }
+
+ return cpages;
+}
+
/**
* drm_pagemap_migrate_to_devmem() - Migrate a struct mm_struct range to device memory
* @devmem_allocation: The device memory allocation to migrate to.
@@ -564,7 +599,8 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
goto err_free;
}
- if (migrate.cpages != npages) {
+ if (migrate.cpages != npages &&
+ drm_pagemap_cpages(migrate.src, npages) != npages) {
/*
* Some pages to migrate. But we want to migrate all or
* nothing. Raced or unknown device pages.
--
2.43.0
* [PATCH v5 5/5] drm/pagemap: Enable THP support for GPU memory migration
From: Francois Dugast @ 2026-01-14 19:19 UTC
To: intel-xe
Cc: dri-devel, Francois Dugast, Matthew Brost, Thomas Hellström,
Michal Mrozek, Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
Liam R . Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Zi Yan, Alistair Popple,
Balbir Singh, linux-mm
Enable support for Transparent Huge Pages (THP) for device pages by
using MIGRATE_VMA_SELECT_COMPOUND during migration. This removes the
need to split folios and to loop multiple times over all pages to
perform the required operations at page level. Instead, rely on the
newly introduced support for higher orders in drm_pagemap and on the
folio-level API.
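A minimal sketch of the collection-side change (the field values are placeholders; the full hunks below also cover the destination side, eviction and folio splitting):

        struct migrate_vma migrate = {
                .start          = start,
                .end            = end,
                .pgmap_owner    = pgmap_owner,
                /* Also collect 2M THPs as single compound entries. */
                .flags          = MIGRATE_VMA_SELECT_SYSTEM |
                                  MIGRATE_VMA_SELECT_DEVICE_COHERENT |
                                  MIGRATE_VMA_SELECT_COMPOUND,
        };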
In Xe, this drastically improves performance when using SVM. The GT stats
below, collected after a 2MB page fault, show that overall servicing is
more than 7 times faster and that, thanks to reduced CPU overhead, the
share of time spent on the actual copy grows from 23% without THP to 80%
with THP:
Without THP:
svm_2M_pagefault_us: 966
svm_2M_migrate_us: 942
svm_2M_device_copy_us: 223
svm_2M_get_pages_us: 9
svm_2M_bind_us: 10
With THP:
svm_2M_pagefault_us: 132
svm_2M_migrate_us: 128
svm_2M_device_copy_us: 106
svm_2M_get_pages_us: 1
svm_2M_bind_us: 2
v2:
- Fix one occurrence of drm_pagemap_get_devmem_page() (Matthew Brost)
v3:
- Remove migrate_device_split_page() and folio_split_lock, instead rely on
free_zone_device_folio() to split folios before freeing (Matthew Brost)
- Assert folio order is HPAGE_PMD_ORDER (Matthew Brost)
- Always use folio_set_zone_device_data() in split (Matthew Brost)
v4:
- Warn on compound device page, s/continue/goto next/ (Matthew Brost)
v5:
- Revert warn on compound device page
- s/zone_device_page_init()/zone_device_folio_init()/ (Matthew Brost)
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: linux-mm@kvack.org
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
---
drivers/gpu/drm/drm_pagemap.c | 73 ++++++++++++++++++++++++++++++-----
1 file changed, 63 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
index 3fc466f04b13..e2aecd519f14 100644
--- a/drivers/gpu/drm/drm_pagemap.c
+++ b/drivers/gpu/drm/drm_pagemap.c
@@ -200,16 +200,19 @@ static void drm_pagemap_migration_unlock_put_pages(unsigned long npages,
/**
* drm_pagemap_get_devmem_page() - Get a reference to a device memory page
* @page: Pointer to the page
+ * @order: Order
* @zdd: Pointer to the GPU SVM zone device data
*
* This function associates the given page with the specified GPU SVM zone
* device data and initializes it for zone device usage.
*/
static void drm_pagemap_get_devmem_page(struct page *page,
+ unsigned int order,
struct drm_pagemap_zdd *zdd)
{
- page->zone_device_data = drm_pagemap_zdd_get(zdd);
- zone_device_page_init(page, zdd->dpagemap->pagemap, 0);
+ zone_device_folio_init((struct folio *)page, zdd->dpagemap->pagemap,
+ order);
+ folio_set_zone_device_data(page_folio(page), drm_pagemap_zdd_get(zdd));
}
/**
@@ -534,7 +537,8 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
* rare and only occur when the madvise attributes of memory are
* changed or atomics are being used.
*/
- .flags = MIGRATE_VMA_SELECT_SYSTEM | MIGRATE_VMA_SELECT_DEVICE_COHERENT,
+ .flags = MIGRATE_VMA_SELECT_SYSTEM | MIGRATE_VMA_SELECT_DEVICE_COHERENT |
+ MIGRATE_VMA_SELECT_COMPOUND,
};
unsigned long i, npages = npages_in_range(start, end);
unsigned long own_pages = 0, migrated_pages = 0;
@@ -640,11 +644,13 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
own_pages = 0;
- for (i = 0; i < npages; ++i) {
+ for (i = 0; i < npages;) {
+ unsigned long j;
struct page *page = pfn_to_page(migrate.dst[i]);
struct page *src_page = migrate_pfn_to_page(migrate.src[i]);
- cur.start = i;
+ unsigned int order = 0;
+ cur.start = i;
pages[i] = NULL;
if (src_page && is_device_private_page(src_page)) {
struct drm_pagemap_zdd *src_zdd =
@@ -654,7 +660,7 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
!mdetails->can_migrate_same_pagemap) {
migrate.dst[i] = 0;
own_pages++;
- continue;
+ goto next;
}
if (mdetails->source_peer_migrates) {
cur.dpagemap = src_zdd->dpagemap;
@@ -670,7 +676,20 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
pages[i] = page;
}
migrate.dst[i] = migrate_pfn(migrate.dst[i]);
- drm_pagemap_get_devmem_page(page, zdd);
+
+ if (migrate.src[i] & MIGRATE_PFN_COMPOUND) {
+ drm_WARN_ONCE(dpagemap->drm, src_page &&
+ folio_order(page_folio(src_page)) != HPAGE_PMD_ORDER,
+ "Unexpected folio order\n");
+
+ order = HPAGE_PMD_ORDER;
+ migrate.dst[i] |= MIGRATE_PFN_COMPOUND;
+
+ for (j = 1; j < NR_PAGES(order) && i + j < npages; j++)
+ migrate.dst[i + j] = 0;
+ }
+
+ drm_pagemap_get_devmem_page(page, order, zdd);
/* If we switched the migrating drm_pagemap, migrate previous pages now */
err = drm_pagemap_migrate_range(devmem_allocation, migrate.src, migrate.dst,
@@ -680,7 +699,11 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
npages = i + 1;
goto err_finalize;
}
+
+next:
+ i += NR_PAGES(order);
}
+
cur.start = npages;
cur.ops = NULL; /* Force migration */
err = drm_pagemap_migrate_range(devmem_allocation, migrate.src, migrate.dst,
@@ -789,6 +812,8 @@ static int drm_pagemap_migrate_populate_ram_pfn(struct vm_area_struct *vas,
page = folio_page(folio, 0);
mpfn[i] = migrate_pfn(page_to_pfn(page));
+ if (order)
+ mpfn[i] |= MIGRATE_PFN_COMPOUND;
next:
if (page)
addr += page_size(page);
@@ -1044,8 +1069,15 @@ int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
if (err)
goto err_finalize;
- for (i = 0; i < npages; ++i)
+ for (i = 0; i < npages;) {
+ unsigned int order = 0;
+
pages[i] = migrate_pfn_to_page(src[i]);
+ if (pages[i])
+ order = folio_order(page_folio(pages[i]));
+
+ i += NR_PAGES(order);
+ }
err = ops->copy_to_ram(pages, pagemap_addr, npages, NULL);
if (err)
@@ -1098,7 +1130,8 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
.vma = vas,
.pgmap_owner = page_pgmap(page)->owner,
.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE |
- MIGRATE_VMA_SELECT_DEVICE_COHERENT,
+ MIGRATE_VMA_SELECT_DEVICE_COHERENT |
+ MIGRATE_VMA_SELECT_COMPOUND,
.fault_page = page,
};
struct drm_pagemap_migrate_details mdetails = {};
@@ -1164,8 +1197,15 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
if (err)
goto err_finalize;
- for (i = 0; i < npages; ++i)
+ for (i = 0; i < npages;) {
+ unsigned int order = 0;
+
pages[i] = migrate_pfn_to_page(migrate.src[i]);
+ if (pages[i])
+ order = folio_order(page_folio(pages[i]));
+
+ i += NR_PAGES(order);
+ }
err = ops->copy_to_ram(pages, pagemap_addr, npages, NULL);
if (err)
@@ -1223,9 +1263,22 @@ static vm_fault_t drm_pagemap_migrate_to_ram(struct vm_fault *vmf)
return err ? VM_FAULT_SIGBUS : 0;
}
+static void drm_pagemap_folio_split(struct folio *orig_folio, struct folio *new_folio)
+{
+ struct drm_pagemap_zdd *zdd;
+
+ if (!new_folio)
+ return;
+
+ new_folio->pgmap = orig_folio->pgmap;
+ zdd = folio_zone_device_data(orig_folio);
+ folio_set_zone_device_data(new_folio, drm_pagemap_zdd_get(zdd));
+}
+
static const struct dev_pagemap_ops drm_pagemap_pagemap_ops = {
.folio_free = drm_pagemap_folio_free,
.migrate_to_ram = drm_pagemap_migrate_to_ram,
+ .folio_split = drm_pagemap_folio_split,
};
/**
--
2.43.0