From: Matthew Brost <matthew.brost@intel.com>
To: Francois Dugast <francois.dugast@intel.com>
Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
"Michal Mrozek" <michal.mrozek@intel.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
"David Hildenbrand" <david@kernel.org>,
"Lorenzo Stoakes" <lorenzo.stoakes@oracle.com>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
"Vlastimil Babka" <vbabka@suse.cz>,
"Mike Rapoport" <rppt@kernel.org>,
"Suren Baghdasaryan" <surenb@google.com>,
"Michal Hocko" <mhocko@suse.com>, "Zi Yan" <ziy@nvidia.com>,
"Alistair Popple" <apopple@nvidia.com>,
"Balbir Singh" <balbirs@nvidia.com>,
linux-mm@kvack.org
Subject: Re: [PATCH v4 7/7] drm/pagemap: Enable THP support for GPU memory migration
Date: Sun, 11 Jan 2026 13:37:41 -0800
Message-ID: <aWQYJYR3LXTgtYsg@lstrano-desk.jf.intel.com>
In-Reply-To: <20260111205820.830410-8-francois.dugast@intel.com>

On Sun, Jan 11, 2026 at 09:55:46PM +0100, Francois Dugast wrote:
> This enables support for Transparent Huge Pages (THP) for device pages by
> using MIGRATE_VMA_SELECT_COMPOUND during migration. It removes the need to
> split folios and loop multiple times over all pages to perform required
> operations at page level. Instead, we rely on newly introduced support for
> higher orders in drm_pagemap and the folio-level API.
>
> In Xe, this drastically improves performance when using SVM. The GT stats
> below, collected after a 2MB page fault, show that overall servicing is
> more than 7 times faster, and thanks to reduced CPU overhead the share of
> time spent on the actual copy goes from 23% without THP to 80% with THP:
>
> Without THP:
>
> svm_2M_pagefault_us: 966
> svm_2M_migrate_us: 942
> svm_2M_device_copy_us: 223
> svm_2M_get_pages_us: 9
> svm_2M_bind_us: 10
>
> With THP:
>
> svm_2M_pagefault_us: 132
> svm_2M_migrate_us: 128
> svm_2M_device_copy_us: 106
> svm_2M_get_pages_us: 1
> svm_2M_bind_us: 2
>
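
As a quick sanity check on the numbers above: 966 us / 132 us is roughly a
7.3x improvement in total fault service time, and the device copy share
goes from 223 / 966 (~23%) to 106 / 132 (~80%), so the prose matches the
raw stats.
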
> v2:
> - Fix one occurrence of drm_pagemap_get_devmem_page() (Matthew Brost)
>
> v3:
> - Remove migrate_device_split_page() and folio_split_lock, instead rely on
>   free_zone_device_folio() to split folios before freeing (Matthew Brost)
> - Assert folio order is HPAGE_PMD_ORDER (Matthew Brost)
> - Always use folio_set_zone_device_data() in split (Matthew Brost)
>
> v4:
> - Warn on compound device page, s/continue/goto next/ (Matthew Brost)
>
> Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Michal Mrozek <michal.mrozek@intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: David Hildenbrand <david@kernel.org>
> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: Balbir Singh <balbirs@nvidia.com>
> Cc: linux-mm@kvack.org
> Signed-off-by: Francois Dugast <francois.dugast@intel.com>
> ---
>  drivers/gpu/drm/drm_pagemap.c | 77 ++++++++++++++++++++++++++++++-----
>  1 file changed, 67 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
> index af2c8f4da00e..bd2c9af51564 100644
> --- a/drivers/gpu/drm/drm_pagemap.c
> +++ b/drivers/gpu/drm/drm_pagemap.c
> @@ -200,16 +200,20 @@ static void drm_pagemap_migration_unlock_put_pages(unsigned long npages,
>  /**
>   * drm_pagemap_get_devmem_page() - Get a reference to a device memory page
>   * @page: Pointer to the page
> + * @order: Order
>   * @zdd: Pointer to the GPU SVM zone device data
>   *
>   * This function associates the given page with the specified GPU SVM zone
>   * device data and initializes it for zone device usage.
>   */
>  static void drm_pagemap_get_devmem_page(struct page *page,
> +					unsigned int order,
>  					struct drm_pagemap_zdd *zdd)
>  {
> -	page->zone_device_data = drm_pagemap_zdd_get(zdd);
> -	zone_device_page_init(page, 0);
> +	struct folio *folio = page_folio(page);
> +
> +	folio_set_zone_device_data(folio, drm_pagemap_zdd_get(zdd));
> +	zone_device_page_init(page, order);
>  }
> 
>  /**
> @@ -534,7 +538,8 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
>  	 * rare and only occur when the madvise attributes of memory are
>  	 * changed or atomics are being used.
>  	 */
> -	.flags = MIGRATE_VMA_SELECT_SYSTEM | MIGRATE_VMA_SELECT_DEVICE_COHERENT,
> +	.flags = MIGRATE_VMA_SELECT_SYSTEM | MIGRATE_VMA_SELECT_DEVICE_COHERENT |
> +		 MIGRATE_VMA_SELECT_COMPOUND,
>  	};
>  	unsigned long i, npages = npages_in_range(start, end);
>  	unsigned long own_pages = 0, migrated_pages = 0;
> @@ -640,11 +645,16 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
> 
>  	own_pages = 0;
> 
> -	for (i = 0; i < npages; ++i) {
> +	for (i = 0; i < npages;) {
> +		unsigned long j;
>  		struct page *page = pfn_to_page(migrate.dst[i]);
>  		struct page *src_page = migrate_pfn_to_page(migrate.src[i]);
> -		cur.start = i;
> +		unsigned int order = 0;
> +
> +		drm_WARN_ONCE(dpagemap->drm, folio_order(page_folio(page)),
> +			      "Unexpected compound device page found\n");
> 
> +		cur.start = i;
>  		pages[i] = NULL;
>  		if (src_page && is_device_private_page(src_page)) {
>  			struct drm_pagemap_zdd *src_zdd =
> @@ -654,7 +664,7 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
>  			    !mdetails->can_migrate_same_pagemap) {
>  				migrate.dst[i] = 0;
>  				own_pages++;
> -				continue;
> +				goto next;
>  			}
>  			if (mdetails->source_peer_migrates) {
>  				cur.dpagemap = src_zdd->dpagemap;
> @@ -670,7 +680,20 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
>  			pages[i] = page;
>  		}
>  		migrate.dst[i] = migrate_pfn(migrate.dst[i]);
> -		drm_pagemap_get_devmem_page(page, zdd);
> +
> +		if (migrate.src[i] & MIGRATE_PFN_COMPOUND) {
> +			drm_WARN_ONCE(dpagemap->drm, src_page &&
> +				      folio_order(page_folio(src_page)) != HPAGE_PMD_ORDER,
> +				      "Unexpected folio order\n");
> +
> +			order = HPAGE_PMD_ORDER;
> +			migrate.dst[i] |= MIGRATE_PFN_COMPOUND;
> +
> +			for (j = 1; j < NR_PAGES(order) && i + j < npages; j++)
> +				migrate.dst[i + j] = 0;
> +		}
> +
> +		drm_pagemap_get_devmem_page(page, order, zdd);
> 
>  		/* If we switched the migrating drm_pagemap, migrate previous pages now */
>  		err = drm_pagemap_migrate_range(devmem_allocation, migrate.src, migrate.dst,
> @@ -680,7 +703,11 @@ int drm_pagemap_migrate_to_devmem(struct drm_pagemap_devmem *devmem_allocation,
>  			npages = i + 1;
>  			goto err_finalize;
>  		}
> +
> +next:
> +		i += NR_PAGES(order);
>  	}
> +
>  	cur.start = npages;
>  	cur.ops = NULL; /* Force migration */
>  	err = drm_pagemap_migrate_range(devmem_allocation, migrate.src, migrate.dst,
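
A side note for readers: the loop above now walks the PFN array folio by
folio rather than page by page. A minimal, self-contained sketch of the
stride pattern, outside any driver context (it assumes NR_PAGES(order)
expands to 1UL << order, and x86-64's HPAGE_PMD_ORDER of 9, i.e. 512
4 KiB pages per 2 MiB folio):

  #include <stdio.h>

  #define NR_PAGES(order)  (1UL << (order))
  #define HPAGE_PMD_ORDER  9  /* x86-64 with 4 KiB base pages */

  int main(void)
  {
          unsigned long i, npages = 1024;  /* e.g. two 2 MiB folios */

          for (i = 0; i < npages;) {
                  /* the real code derives this from the src PFN flags */
                  unsigned int order = HPAGE_PMD_ORDER;

                  printf("folio at index %lu covers %lu pages\n",
                         i, NR_PAGES(order));
                  i += NR_PAGES(order);  /* advance by a whole folio */
          }
          return 0;
  }

For a 2MB fault the loop body therefore runs once instead of 512 times,
which is presumably where much of the CPU overhead reduction quoted in
the commit message comes from.
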
> @@ -789,6 +816,8 @@ static int drm_pagemap_migrate_populate_ram_pfn(struct vm_area_struct *vas,
>  		page = folio_page(folio, 0);
>  		mpfn[i] = migrate_pfn(page_to_pfn(page));
> 
> +		if (order)
> +			mpfn[i] |= MIGRATE_PFN_COMPOUND;
>  next:
>  		if (page)
>  			addr += page_size(page);
> @@ -1044,8 +1073,15 @@ int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
>  	if (err)
>  		goto err_finalize;
> 
> -	for (i = 0; i < npages; ++i)
> +	for (i = 0; i < npages;) {
> +		unsigned int order = 0;
> +
>  		pages[i] = migrate_pfn_to_page(src[i]);
> +		if (pages[i])
> +			order = folio_order(page_folio(pages[i]));
> +
> +		i += NR_PAGES(order);
> +	}
> 
>  	err = ops->copy_to_ram(pages, pagemap_addr, npages, NULL);
>  	if (err)
> @@ -1098,7 +1134,8 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
>  		.vma = vas,
>  		.pgmap_owner = page_pgmap(page)->owner,
>  		.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE |
> -			MIGRATE_VMA_SELECT_DEVICE_COHERENT,
> +			 MIGRATE_VMA_SELECT_DEVICE_COHERENT |
> +			 MIGRATE_VMA_SELECT_COMPOUND,
>  		.fault_page = page,
>  	};
>  	struct drm_pagemap_migrate_details mdetails = {};
> @@ -1164,8 +1201,15 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
>  	if (err)
>  		goto err_finalize;
> 
> -	for (i = 0; i < npages; ++i)
> +	for (i = 0; i < npages;) {
> +		unsigned int order = 0;
> +
>  		pages[i] = migrate_pfn_to_page(migrate.src[i]);
> +		if (pages[i])
> +			order = folio_order(page_folio(pages[i]));
> +
> +		i += NR_PAGES(order);
> +	}
> 
>  	err = ops->copy_to_ram(pages, pagemap_addr, npages, NULL);
>  	if (err)
> @@ -1224,9 +1268,22 @@ static vm_fault_t drm_pagemap_migrate_to_ram(struct vm_fault *vmf)
>  	return err ? VM_FAULT_SIGBUS : 0;
>  }
> 
> +static void drm_pagemap_folio_split(struct folio *orig_folio, struct folio *new_folio)
> +{
> +	struct drm_pagemap_zdd *zdd;
> +
> +	if (!new_folio)
> +		return;
> +
> +	new_folio->pgmap = orig_folio->pgmap;
> +	zdd = folio_zone_device_data(orig_folio);
> +	folio_set_zone_device_data(new_folio, drm_pagemap_zdd_get(zdd));
> +}
> +
>  static const struct dev_pagemap_ops drm_pagemap_pagemap_ops = {
>  	.folio_free = drm_pagemap_folio_free,
>  	.migrate_to_ram = drm_pagemap_migrate_to_ram,
> +	.folio_split = drm_pagemap_folio_split,
>  };
> 
>  /**
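
One note on the split path, mostly for readers: drm_pagemap_folio_split()
takes a zdd reference for every new folio produced by a split, so each
device folio owns exactly one reference that the folio_free side can drop
unconditionally. A rough sketch of that pairing, assuming the kref-backed
drm_pagemap_zdd_get()/drm_pagemap_zdd_put() helpers from drm_pagemap.c;
the real drm_pagemap_folio_free() is not shown in this hunk (and its
signature changes elsewhere in this series), so the body below is
illustrative only:

  static void drm_pagemap_folio_free(struct folio *folio)
  {
          struct drm_pagemap_zdd *zdd = folio_zone_device_data(folio);

          /* Drops the reference taken in drm_pagemap_folio_split(), or in
           * drm_pagemap_get_devmem_page() for a folio that was never split.
           */
          drm_pagemap_zdd_put(zdd);
  }
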
> --
> 2.43.0
>