From: Zi Yan <ziy@nvidia.com>
To: Balbir Singh <balbirs@nvidia.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
linux-kernel@vger.kernel.org, "Karol Herbst" <kherbst@redhat.com>,
"Lyude Paul" <lyude@redhat.com>,
"Danilo Krummrich" <dakr@kernel.org>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Jérôme Glisse" <jglisse@redhat.com>,
"Shuah Khan" <shuah@kernel.org>,
"David Hildenbrand" <david@redhat.com>,
"Barry Song" <baohua@kernel.org>,
"Baolin Wang" <baolin.wang@linux.alibaba.com>,
"Ryan Roberts" <ryan.roberts@arm.com>,
"Matthew Wilcox" <willy@infradead.org>,
"Peter Xu" <peterx@redhat.com>,
"Kefeng Wang" <wangkefeng.wang@huawei.com>,
"Jane Chu" <jane.chu@oracle.com>,
"Alistair Popple" <apopple@nvidia.com>,
"Donet Tom" <donettom@linux.ibm.com>
Subject: Re: [v1 resend 00/12] THP support for zone device page migration
Date: Fri, 04 Jul 2025 12:16:05 -0400
Message-ID: <106174EE-0E89-49DC-AF9D-76BE74FD2C18@nvidia.com>
In-Reply-To: <20250703233511.2028395-1-balbirs@nvidia.com>
On 3 Jul 2025, at 19:34, Balbir Singh wrote:
> This patch series adds support for THP migration of zone device pages.
> To do so, the patches implement support for folio zone device pages
> by adding support for setting up larger order pages.
>
> These patches build on the earlier posts by Ralph Campbell [1]
>
> Two new flags are added in vma_migration to select and mark compound pages.
> migrate_vma_setup(), migrate_vma_pages() and migrate_vma_finalize()
> support migration of these pages when MIGRATE_VMA_SELECT_COMPOUND
> is passed in as an argument.
>
> The series also adds zone device awareness to (m)THP pages along
> with fault handling of large zone device private pages. The page vma
> walk and the rmap code are also zone device aware. Support has also
> been added for folios that might need to be split in the middle
> of migration (when the src and dst do not agree on
> MIGRATE_PFN_COMPOUND), which occurs when the src side of the migration
> can migrate large pages, but the destination has not been able to
> allocate large pages. The code supports and uses folio_split() when
> migrating THP pages; this is used when MIGRATE_VMA_SELECT_COMPOUND is
> not passed as an argument to migrate_vma_setup().
>
> The test infrastructure lib/test_hmm.c has been enhanced to support THP
> migration. A new ioctl to emulate failure of large page allocations has
> been added to test the folio split code path. hmm-tests.c has new test
> cases for huge page migration and to test the folio split path. A new
> throughput test has been added as well.
>
> The nouveau dmem code has been enhanced to use the new THP migration
> capability.
>
> Feedback from the RFC [2]:
>
> It was advised that prep_compound_page() not be exposed just for the purposes
> of testing (test driver lib/test_hmm.c). Workarounds that copy and split the
> folios did not work due to a lock order dependency in the callback for
> split folio.
>
> mTHP support:
>
> The patches hard-code HPAGE_PMD_NR in a few places, but the code has
> been kept generic to support various order sizes. With additional
> refactoring of the code, support for different order sizes should be
> possible.
>
> The future plan is to post enhancements to support mTHP with a rough
> design as follows:
>
> 1. Add the notion of allowable thp orders to the HMM based test driver
> 2. For non PMD based THP paths in migrate_device.c, check to see if
> a suitable order is found and supported by the driver
> 3. Iterate across orders to check the highest supported order for migration
> 4. Migrate and finalize
>
> The mTHP patches can be built on top of this series; the key design elements
> that need to be worked out are infrastructure and driver support for multiple
> ordered pages and their migration.
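
As a rough illustration of step 3 in the plan above (iterating across orders
to find the highest one usable for a migration), here is a minimal userspace
sketch in C. It is not code from this series: toy_highest_order() and all of
its parameters are hypothetical, and the alignment and range checks are only
an assumption about what a "suitable" order could mean.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper, not a kernel API.
 *
 * supported_orders: bitmask, bit N set means the driver accepts order N
 * max_order:        highest order worth trying (e.g. the PMD order)
 * start_pfn:        first page frame number of the range to migrate
 * nr_pages:         number of base pages available from start_pfn
 *
 * Returns the highest order that is supported, naturally aligned at
 * start_pfn and fits in the range; 0 means "fall back to single pages".
 */
static unsigned int toy_highest_order(unsigned long supported_orders,
				      unsigned int max_order,
				      unsigned long start_pfn,
				      unsigned long nr_pages)
{
	/* Keep the shifts below well defined. */
	assert(max_order < sizeof(unsigned long) * CHAR_BIT);

	for (unsigned int order = max_order; order > 0; order--) {
		unsigned long nr = 1UL << order;

		if (!(supported_orders & (1UL << order)))
			continue;	/* driver does not support it */
		if (start_pfn & (nr - 1))
			continue;	/* not naturally aligned */
		if (nr > nr_pages)
			continue;	/* does not fit in the range */
		return order;
	}
	return 0;
}
```

With a mask that allows orders 9 and 4, an aligned 512-page range selects
order 9, while a range starting at a pfn that is only 16-page aligned falls
back to order 4.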
To help me better review the patches, can you tell me if my mental model below
for device private folios is correct or not?
1. device private folios represent device memory, but the associated PFNs
do not exist in the system. folio->pgmap contains the meta info about
the device memory.
2. when data is migrated from system memory to device private memory, a device
private page table entry is established in place of the original entry.
A device private page table entry is a swap entry with a device private type,
and it points to the device private folio whose data resides in device
private memory.
3. when the CPU tries to access an address with a device private page table
entry, a fault happens and the data is migrated from device private memory
to system memory. The device private folio pointed to by the device private
page table entry tells the driver where to look for the data on the device.
4. one reason for splitting a large device private folio is that, when it is
migrated back to system memory, there may be no free large folio in system
memory. In that case the driver splits the large device private folio and
migrates only a subpage instead.
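
To make points 2 and 4 concrete, here is a small userspace sketch in C of the
model as I understand it. Everything in it is hypothetical: the toy_* names,
the 6-bit type field and the 58-bit offset are made up for illustration and
are not the kernel's swp_entry_t layout or any helper from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed toy layout: top 6 bits hold the entry type, the rest an offset. */
#define TOY_TYPE_SHIFT	58
#define TOY_OFFSET_MASK	((UINT64_C(1) << TOY_TYPE_SHIFT) - 1)

enum toy_type {
	TOY_SWAP = 1,		/* ordinary swapped-out page */
	TOY_DEVICE_PRIVATE = 2,	/* data lives in device private memory */
};

typedef struct { uint64_t val; } toy_entry_t;

/* Point 2: a non-present entry that carries a type and an offset. */
static toy_entry_t toy_make_entry(enum toy_type type, uint64_t offset)
{
	assert(offset <= TOY_OFFSET_MASK);
	return (toy_entry_t){ ((uint64_t)type << TOY_TYPE_SHIFT) | offset };
}

static enum toy_type toy_type_of(toy_entry_t e)
{
	return (enum toy_type)(e.val >> TOY_TYPE_SHIFT);
}

/* The offset is what would identify the device private folio. */
static uint64_t toy_offset_of(toy_entry_t e)
{
	return e.val & TOY_OFFSET_MASK;
}

static bool toy_is_device_private(toy_entry_t e)
{
	return toy_type_of(e) == TOY_DEVICE_PRIVATE;
}

/*
 * Point 4: on a CPU fault, migrate the whole folio back if a large folio
 * could be allocated in system memory; otherwise split and migrate only
 * the faulting subpage. Returns the number of base pages migrated.
 */
static unsigned long toy_pages_to_migrate(unsigned int folio_order,
					  bool large_alloc_ok)
{
	return large_alloc_ok ? 1UL << folio_order : 1UL;
}
```

In this toy model an order-9 device private folio migrates back as 512 pages
when the large allocation succeeds and as a single page after a split when
it does not.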
Thanks.
--
Best Regards,
Yan, Zi