linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Mika Penttilä" <mpenttil@redhat.com>
To: "David Hildenbrand (Arm)" <david@kernel.org>, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Jason Gunthorpe <jgg@nvidia.com>,
	Leon Romanovsky <leonro@nvidia.com>,
	Alistair Popple <apopple@nvidia.com>,
	Balbir Singh <balbirs@nvidia.com>, Zi Yan <ziy@nvidia.com>,
	Matthew Brost <matthew.brost@intel.com>
Subject: Re: [PATCH v5 2/6] mm: Add helper to convert HMM pfn to migrate pfn
Date: Tue, 17 Mar 2026 15:01:09 +0200	[thread overview]
Message-ID: <59d3c0e5-23fb-4a30-825a-d90fd07bd34e@redhat.com> (raw)
In-Reply-To: <3c0578a9-ce1c-4d26-86d5-681cae4b8200@kernel.org>

Hi,

On 3/17/26 11:05, David Hildenbrand (Arm) wrote:

> On 2/11/26 09:12, mpenttil@redhat.com wrote:
>> From: Mika Penttilä <mpenttil@redhat.com>
>>
>> The unified HMM/migrate_device pagewalk does the "collecting"
>> in HMM side, so need a helper to transfer pfns to migrate_vma
> s/in/on the/ ?
>
> "so we" ?
>
> "to the" ?

Yes will correct.

>> world.
>>
>> Cc: David Hildenbrand <david@kernel.org>
>> Cc: Jason Gunthorpe <jgg@nvidia.com>
>> Cc: Leon Romanovsky <leonro@nvidia.com>
>> Cc: Alistair Popple <apopple@nvidia.com>
>> Cc: Balbir Singh <balbirs@nvidia.com>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> Suggested-by: Alistair Popple <apopple@nvidia.com>
>> Signed-off-by: Mika Penttilä <mpenttil@redhat.com>
>> ---
>>  include/linux/hmm.h     | 18 +++++++++--
>>  include/linux/migrate.h |  3 +-
>>  mm/hmm.c                |  6 ----
>>  mm/migrate_device.c     | 69 +++++++++++++++++++++++++++++++++++++++++
>>  4 files changed, 87 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
>> index db75ffc949a7..b5418e318782 100644
>> --- a/include/linux/hmm.h
>> +++ b/include/linux/hmm.h
>> @@ -12,7 +12,7 @@
>>  #include <linux/mm.h>
>>  
>>  struct mmu_interval_notifier;
>> -
>> +struct migrate_vma;
> Keep the empty line, please.

ack

>
>>  /*
>>   * On output:
>>   * 0             - The page is faultable and a future call with 
>> @@ -27,6 +27,7 @@ struct mmu_interval_notifier;
>>   * HMM_PFN_P2PDMA_BUS - Bus mapped P2P transfer
>>   * HMM_PFN_DMA_MAPPED - Flag preserved on input-to-output transformation
>>   *                      to mark that page is already DMA mapped
>> + * HMM_PFN_MIGRATE    - Migrate PTE installed
> I have no idea what exactly that means. Can we a bit clearer?
> Especially, what does it mean when this is set alongside HMM_PFN_VALID? What if it
> is not set alongside HMM_PFN_VALID.
>
Agreed needs more documentation. Look below for explanation also.

> Shouldn't we document what HMM_PFN_COMPOUND and HMM_PFN_MIGRATE mean as well somewhere?

I think they deserve some words as well.

>
>>   *
>>   * On input:
>>   * 0                 - Return the current state of the page, do not fault it.
>> @@ -34,6 +35,7 @@ struct mmu_interval_notifier;
>>   *                     will fail
>>   * HMM_PFN_REQ_WRITE - The output must have HMM_PFN_WRITE or hmm_range_fault()
>>   *                     will fail. Must be combined with HMM_PFN_REQ_FAULT.
>> + * HMM_PFN_REQ_MIGRATE - For default_flags, request to migrate to device
>>   */
>>  enum hmm_pfn_flags {
>>  	/* Output fields and flags */
>> @@ -48,15 +50,25 @@ enum hmm_pfn_flags {
>>  	HMM_PFN_P2PDMA     = 1UL << (BITS_PER_LONG - 5),
>>  	HMM_PFN_P2PDMA_BUS = 1UL << (BITS_PER_LONG - 6),
>>  
>> -	HMM_PFN_ORDER_SHIFT = (BITS_PER_LONG - 11),
>> +	/* Migrate request */
>> +	HMM_PFN_MIGRATE    = 1UL << (BITS_PER_LONG - 7),
>> +	HMM_PFN_COMPOUND   = 1UL << (BITS_PER_LONG - 8),
>> +	HMM_PFN_ORDER_SHIFT = (BITS_PER_LONG - 13),
>>  
>>  	/* Input flags */
>>  	HMM_PFN_REQ_FAULT = HMM_PFN_VALID,
>>  	HMM_PFN_REQ_WRITE = HMM_PFN_WRITE,
>> +	HMM_PFN_REQ_MIGRATE = HMM_PFN_MIGRATE,
>>  
>>  	HMM_PFN_FLAGS = ~((1UL << HMM_PFN_ORDER_SHIFT) - 1),
>>  };
>>  
>> +enum {
>> +	/* These flags are carried from input-to-output */
>> +	HMM_PFN_INOUT_FLAGS = HMM_PFN_DMA_MAPPED | HMM_PFN_P2PDMA |
>> +		HMM_PFN_P2PDMA_BUS,
>> +};
> Why is that an anonymous enum with a single value? Ah, you are moving
> that. This should be spelled out in the patch description (and why you
> are doing it).

Yes I moved for visibility to migrate_device.

> Is there any reason we can't move that into hmm_pfn_flags?

Can't think of any, I think that would be good to do.

>
>> +
>>  /*
>>   * hmm_pfn_to_page() - return struct page pointed to by a device entry
>>   *
>> @@ -107,6 +119,7 @@ static inline unsigned int hmm_pfn_to_map_order(unsigned long hmm_pfn)
>>   * @default_flags: default flags for the range (write, read, ... see hmm doc)
>>   * @pfn_flags_mask: allows to mask pfn flags so that only default_flags matter
>>   * @dev_private_owner: owner of device private pages
>> + * @migrate: structure for migrating the associated vma
> Is it really for "migrating a vma"? I assume it's for migrating a range
> of a VMA.

Yes, a range. Will fix.

>
>>   */
>>  struct hmm_range {
>>  	struct mmu_interval_notifier *notifier;
>> @@ -117,6 +130,7 @@ struct hmm_range {
>>  	unsigned long		default_flags;
>>  	unsigned long		pfn_flags_mask;
>>  	void			*dev_private_owner;
>> +	struct migrate_vma      *migrate;
>>  };
>>  
>>  /*
>> diff --git a/include/linux/migrate.h b/include/linux/migrate.h
>> index 26ca00c325d9..8e6c28efd4f8 100644
>> --- a/include/linux/migrate.h
>> +++ b/include/linux/migrate.h
>> @@ -3,6 +3,7 @@
>>  #define _LINUX_MIGRATE_H
>>  
>>  #include <linux/mm.h>
>> +#include <linux/hmm.h>
>>  #include <linux/mempolicy.h>
>>  #include <linux/migrate_mode.h>
>>  #include <linux/hugetlb.h>
>> @@ -192,7 +193,7 @@ void migrate_device_pages(unsigned long *src_pfns, unsigned long *dst_pfns,
>>  			unsigned long npages);
>>  void migrate_device_finalize(unsigned long *src_pfns,
>>  			unsigned long *dst_pfns, unsigned long npages);
>> -
>> +void migrate_hmm_range_setup(struct hmm_range *range);
>>  #endif /* CONFIG_MIGRATION */
>>  
>>  #endif /* _LINUX_MIGRATE_H */
>> diff --git a/mm/hmm.c b/mm/hmm.c
>> index 4ec74c18bef6..21ff99379836 100644
>> --- a/mm/hmm.c
>> +++ b/mm/hmm.c
>> @@ -41,12 +41,6 @@ enum {
>>  	HMM_NEED_ALL_BITS = HMM_NEED_FAULT | HMM_NEED_WRITE_FAULT,
>>  };
>>  
>> -enum {
>> -	/* These flags are carried from input-to-output */
>> -	HMM_PFN_INOUT_FLAGS = HMM_PFN_DMA_MAPPED | HMM_PFN_P2PDMA |
>> -			      HMM_PFN_P2PDMA_BUS,
>> -};
>> -
>>  static int hmm_pfns_fill(unsigned long addr, unsigned long end,
>>  			 struct hmm_range *range, unsigned long cpu_flags)
>>  {
>> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
>> index 23379663b1e1..c773a82ea1ed 100644
>> --- a/mm/migrate_device.c
>> +++ b/mm/migrate_device.c
>> @@ -1489,3 +1489,72 @@ int migrate_device_coherent_folio(struct folio *folio)
>>  		return 0;
>>  	return -EBUSY;
>>  }
>> +
>> +/**
>> + * migrate_hmm_range_setup() - prepare to migrate a range of memory
>> + * @range: contains pointer to migrate_vma to be populated
> "pointer to hmm_range to be migrated" ?

Tried to say range->migrate's src and dst arrays, cpages and npages get populated.

>
> Why can't we just pass the migrate_vma? Or is there another use case for
> adding the migrate_vma to hmm_range?

We have to pass hmm_range anyway because of range->hmm_pfns.
migrate_hmm_range_setup() is supposed to be called after hmm_range_fault(), which
populates migrate_vma's .start .end and .vma. So passing a hmm_range
also emphasizes the intended usage.

>> + *
>> + * When collecting happens by hmm_range_fault(), this populates
>> + * the migrate->src[] and migrate->dst[] using range->hmm_pfns[].
>> + * Also, migrate->cpages and migrate->npages get initialized.
> Can you please document what exactly this function does with
> the magical HMM_PFN_VALID | HMM_PFN_MIGRATE, and why it skips
> everything else?
>
> What is the caller supposed to setup?
>
> I am getting the feeling that this is severely under-documented :)

Yes will add documentation. The caller of hmm_range_fault() is supposed to
fill migrate_vma's pgmap_owner, src and dst array pointers, and flags for migrate direction.
(there is minor change in v6 https://lore.kernel.org/linux-mm/20260316062407.3354636-1-mpenttil@redhat.com/)
for the flags requirement.

>
>> + */
>> +void migrate_hmm_range_setup(struct hmm_range *range)
>> +{
>> +
>> +	struct migrate_vma *migrate = range->migrate;
>> +
>> +	if (!migrate)
>> +		return;
>> +
>> +	migrate->npages = (migrate->end - migrate->start) >> PAGE_SHIFT;
>> +	migrate->cpages = 0;
>> +
>> +	for (unsigned long i = 0; i < migrate->npages; i++) {
>> +
>> +		unsigned long pfn = range->hmm_pfns[i];
>> +
>> +		pfn &= ~HMM_PFN_INOUT_FLAGS;
>> +
>> +		/*
>> +		 *
> Please don't add unnecessary empty comment lines above/below the text.

ack

>
>> +		 *  Don't do migration if valid and migrate flags are not both set.
>> +		 *
>> +		 */
>
> /*
>  * There is nothing to migrate if the entry is not valid
>  * migration entry.
>  */
>
> Which raises the question:
>
> When would HMM_PFN_MIGRATE be set without HMM_PFN_VALID

As is HMM_PFN_MIGRATE is not supposed to be set without HMM_PFN_VALID...
But, I just realized that in order to allow hmm pfn 0 we have to
make HMM_PFN_MIGRATE alone mean the "empty page" (hole, pte_none, zero page)

>
>> +		if ((pfn & (HMM_PFN_VALID | HMM_PFN_MIGRATE)) !=
>> +		    (HMM_PFN_VALID | HMM_PFN_MIGRATE)) {
>> +			migrate->src[i] = 0;
>> +			migrate->dst[i] = 0;
>> +			continue;
>> +		}
>> +
>> +		migrate->cpages++;
>> +
>> +		/*
>> +		 *
>> +		 * The zero page is encoded in a special way, valid and migrate is
>> +		 * set, and pfn part is zero. Encode specially for migrate also.
>> +		 *
>> +		 */
>> +		if (pfn == (HMM_PFN_VALID|HMM_PFN_MIGRATE)) {
>
> I think you can simplify all that by doing a
>
> if (pfn & ~(HMM_PFN_VALID|HMM_PFN_COMPOUND|HMM_PFN_MIGRATE))
> 	migrate->src[i] = migrate_pfn(page_to_pfn(hmm_pfn_to_page(pfn)));
> else
> 	migrate->src[i] = 0;
>
> migrate->src[i] |= (pfn & HMM_PFN_WRITE) ? MIGRATE_PFN_WRITE : 0;
> migrate->src[i] |= (pfn & HMM_PFN_COMPOUND) ? MIGRATE_PFN_COMPOUND : 0;
> migrate->dst[i] = 0;
>
>
> And I wonder whether you could replace the (HMM_PFN_VALID|HMM_PFN_COMPOUND|HMM_PFN_MIGRATE)
> by HMM_PFN_FLAGS, to only check for non-zero PFNs?

For current code the above would be ok. With HMM_PFN_MIGRATE alone for empty page
have to adjust. Will fix.

>
Thanks for the review!

--Mika



  reply	other threads:[~2026-03-17 13:01 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-11  8:12 [PATCH v5 0/6] Migrate on fault for device pages mpenttil
2026-02-11  8:12 ` [PATCH v5 1/6] mm:/Kconfig changes for migrate " mpenttil
2026-03-17  8:31   ` David Hildenbrand (Arm)
2026-02-11  8:12 ` [PATCH v5 2/6] mm: Add helper to convert HMM pfn to migrate pfn mpenttil
2026-03-17  9:05   ` David Hildenbrand (Arm)
2026-03-17 13:01     ` Mika Penttilä [this message]
2026-03-17 13:45       ` Leon Romanovsky
2026-03-23 17:58         ` Jason Gunthorpe
2026-02-11  8:12 ` [PATCH v5 3/6] mm/hmm: do the plumbing for HMM to participate in migration mpenttil
2026-02-11  8:12 ` [PATCH 4/6] mm: setup device page migration in HMM pagewalk mpenttil
2026-02-11  8:13 ` [PATCH v5 5/6] mm: add new testcase for the migrate on fault case mpenttil
2026-02-11  8:13 ` [PATCH v5 6/6] mm:/migrate_device.c: remove migrate_vma_collect_*() mpenttil
2026-03-12 19:58 ` [PATCH v5 0/6] Migrate on fault for device pages David Hildenbrand (Arm)
2026-03-13  0:27   ` Alistair Popple
2026-03-17 10:06 ` Balbir Singh
2026-03-17 10:28   ` Mika Penttilä

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=59d3c0e5-23fb-4a30-825a-d90fd07bd34e@redhat.com \
    --to=mpenttil@redhat.com \
    --cc=apopple@nvidia.com \
    --cc=balbirs@nvidia.com \
    --cc=david@kernel.org \
    --cc=jgg@nvidia.com \
    --cc=leonro@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=matthew.brost@intel.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox