linux-mm.kvack.org archive mirror
From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Jordan Niethe <jniethe@nvidia.com>, linux-mm@kvack.org
Cc: balbirs@nvidia.com, matthew.brost@intel.com,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org, ziy@nvidia.com,
	apopple@nvidia.com, lorenzo.stoakes@oracle.com, lyude@redhat.com,
	dakr@kernel.org, airlied@gmail.com, simona@ffwll.ch,
	rcampbell@nvidia.com, mpenttil@redhat.com, jgg@nvidia.com,
	willy@infradead.org, linuxppc-dev@lists.ozlabs.org,
	intel-xe@lists.freedesktop.org, jgg@ziepe.ca,
	Felix.Kuehling@amd.com, jhubbard@nvidia.com, maddy@linux.ibm.com,
	mpe@ellerman.id.au, ying.huang@linux.alibaba.com
Subject: Re: [PATCH v6 13/13] mm: Remove device private pages from the physical address space
Date: Fri, 6 Mar 2026 17:11:13 +0100	[thread overview]
Message-ID: <cd9885ff-a48a-424f-b9ee-9bd7d1514b73@kernel.org> (raw)
In-Reply-To: <20260202113642.59295-14-jniethe@nvidia.com>

On 2/2/26 12:36, Jordan Niethe wrote:
> The existing design of device private memory imposes limitations which
> render it non-functional for certain systems and configurations where
> the physical address space is limited.
> 
> Device private memory is implemented by first reserving a region of the
> physical address space. This is a problem. The physical address space is
> not a resource that is directly under the kernel's control. Availability
> of suitable physical address space is constrained by the underlying
> hardware and firmware and may not always be available.
> 
> Device private memory assumes that it will be able to reserve a device
> memory sized chunk of physical address space. However, there is nothing
> guaranteeing that this will succeed, and there are a number of factors that
> increase the likelihood of failure. We need to consider what else may
> exist in the physical address space. It is observed that certain VM
> configurations place very large PCI windows immediately after RAM. Large
> enough that there is no physical address space available at all for
> device private memory. This is more likely to occur on 43-bit physical
> width systems, which have less physical address space.
> 
> Instead of using the physical address space, introduce a device private
> address space and allocate device regions from there to represent the
> device private pages.
> 
> Introduce a new interface memremap_device_private_pagemap() that
> allocates a requested amount of device private address space and creates
> the necessary device private pages.
> 
> To support this new interface, struct dev_pagemap needs some changes:
> 
>   - Add a new dev_pagemap::nr_pages field as an input parameter.
>   - Add a new dev_pagemap::pages array to store the device
>     private pages.
> 
> When using memremap_device_private_pagemap(), rather than passing in
> dev_pagemap::ranges[dev_pagemap::nr_ranges] of physical address space to
> be remapped, dev_pagemap::nr_ranges will always be 1, and the device
> private range that is reserved is returned in dev_pagemap::range.
> 
> Forbid calling memremap_pages() with dev_pagemap::ranges::type =
> MEMORY_DEVICE_PRIVATE.
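
The intended allocation semantics can be sketched in userspace C. The
struct and function names below mirror the patch, but the bump allocator
and mock types are illustrative assumptions only, not the kernel
implementation (which reserves ranges via a maple tree):

```c
#include <assert.h>
#include <stdlib.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

struct page { int dummy; };
struct range { unsigned long start, end; };

/* Mock of the reworked struct dev_pagemap: nr_pages is now an input
 * parameter, pages[] stores the device private pages, and the reserved
 * device private range comes back as an output in ->range. */
struct dev_pagemap {
	struct range range;
	unsigned long nr_pages;
	struct page *pages;
};

/* Simple bump allocator standing in for the device private address
 * space; no physical address space is consumed. */
static unsigned long next_dp_offset;

static int memremap_device_private_pagemap(struct dev_pagemap *pgmap)
{
	unsigned long size = pgmap->nr_pages << PAGE_SHIFT;

	pgmap->pages = calloc(pgmap->nr_pages, sizeof(*pgmap->pages));
	if (!pgmap->pages)
		return -1;
	pgmap->range.start = next_dp_offset;
	pgmap->range.end = next_dp_offset + size - 1;
	next_dp_offset += size;
	return 0;
}
```

The point of the mock is only to show the changed contract: callers fill
in nr_pages, and the reserved range is returned rather than supplied.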
> 
> Represent this device private address space using a new
> device_private_pgmap_tree maple tree. This tree maps a given device
> private address to a struct dev_pagemap, where a specific device private
> page may then be looked up in that dev_pagemap::pages array.
> 
> Device private address space can be reclaimed and the associated device
> private pages freed using the corresponding new
> memunmap_device_private_pagemap() interface.
> 
> Because the device private pages now live outside the physical address
> space, they no longer have a normal PFN. This means that page_to_pfn(),
> et al. are no longer meaningful.
> 
> Introduce helpers:
> 
>   - device_private_page_to_offset()
>   - device_private_folio_to_offset()
> 
> to take a given device private page / folio and return its offset within
> the device private address space.
> 
> Update the places where we previously converted a device private page to
> a PFN to use these new helpers. When we encounter a device private
> offset, instead of looking up its page within the pagemap use
> device_private_offset_to_page() instead.
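
The two-level lookup described above (offset -> owning pgmap via the
tree, then pgmap -> page via the pages array) and its inverse can be
sketched as follows. A single static pagemap stands in for the
device_private_pgmap_tree lookup; everything here is a mock, not the
kernel code:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12

struct page { int dummy; };

struct dev_pagemap {
	unsigned long start;	/* start of the device private range */
	unsigned long nr_pages;
	struct page pages[8];
};

/* In the kernel, the owning pgmap is resolved from a maple tree keyed
 * by device private offset; one static instance suffices here. */
static struct dev_pagemap mock_pgmap = { .start = 0x100000, .nr_pages = 8 };

static struct page *device_private_offset_to_page(unsigned long offset)
{
	unsigned long idx = (offset - mock_pgmap.start) >> PAGE_SHIFT;

	if (idx >= mock_pgmap.nr_pages)
		return NULL;
	return &mock_pgmap.pages[idx];
}

static unsigned long device_private_page_to_offset(const struct page *page)
{
	return mock_pgmap.start +
	       ((unsigned long)(page - mock_pgmap.pages) << PAGE_SHIFT);
}
```

The round trip offset -> page -> offset is the invariant the helpers
replace page_to_pfn()/pfn_to_page() with for device private pages.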
> 
> Update the existing users:
> 
>  - lib/test_hmm.c
>  - ppc ultravisor
>  - drm/amd/amdkfd
>  - gpu/drm/xe
>  - gpu/drm/nouveau
> 
> to use the new memremap_device_private_pagemap() interface.
> 
> Acked-by: Felix Kuehling <felix.kuehling@amd.com>
> Reviewed-by: Zi Yan <ziy@nvidia.com> # for MM changes
> Signed-off-by: Jordan Niethe <jniethe@nvidia.com>
> Signed-off-by: Alistair Popple <apopple@nvidia.com>
> 
> ---
> v1:
> - Include NUMA node parameter for memremap_device_private_pagemap()
> - Add devm_memremap_device_private_pagemap() and friends
> - Update existing users of memremap_pages():
>     - ppc ultravisor
>     - drm/amd/amdkfd
>     - gpu/drm/xe
>     - gpu/drm/nouveau
> - Update for HMM huge page support
> - Guard device_private_offset_to_page and friends with CONFIG_ZONE_DEVICE
> 
> v2:
> - Make sure last member of struct dev_pagemap remains DECLARE_FLEX_ARRAY(struct range, ranges);
> 
> v3:
> - Use numa_mem_id() if memremap_device_private_pagemap is called with
>   NUMA_NO_NODE. This fixes a null pointer deref in
>   lruvec_stat_mod_folio().
> - drm/xe: Remove call to devm_release_mem_region() in xe_pagemap_destroy_work()
> - s/VM_BUG/VM_WARN/
> 
> v4:
> - Use devm_memunmap_device_private_pagemap() in
>   xe_pagemap_destroy_work()
> - Replace ^ with != for PVMW_DEVICE_PRIVATE comparisons
> - Minor style changes
> - remove discussion of aarch64 from commit message - not relevant post
>   eeb8fdfcf090 ("arm64: Expose the end of the linear map in PHYSMEM_END")
> 
> v6:
> - Fix maybe unused in kgd2kfd_init_zone_device()
> - Replace division by PAGE_SIZE with DIV_ROUND_UP() when setting
>   nr_pages. This mirrors the align up that previously happened in
>   get_free_mem_region()
> ---


There is just too much in this patch to review it reasonably.

You should probably have a patch that just introduces the helpers and
has them just do what we do today.

E.g., device_private_page_to_offset() would just do a page_to_pfn().

Then you can convert individual core-mm pieces so that people can review
them without making their brain hurt.

Afterwards, you can have a patch that does the real "mm: Remove device
private pages from the physical address space" and doesn't have to touch
too many core-mm pieces.
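
A transitional helper along those lines could look like the sketch
below. The kernel helper body would simply call the real page_to_pfn();
struct page and page_to_pfn() are mocked here so the sketch is
self-contained:

```c
#include <assert.h>

/* Mocked kernel types, for illustration only. */
struct page { int dummy; };
static struct page mock_pages[16];

/* Stand-in for the kernel's page_to_pfn(): index into a fake memmap
 * plus an arbitrary base pfn. */
static unsigned long page_to_pfn(const struct page *page)
{
	return (unsigned long)(page - mock_pages) + 0x1000;
}

/* Transitional helper: behaves exactly like today's pfn-based code,
 * so converting callers to it is a no-op; only the final patch would
 * switch it over to a device private offset. */
static unsigned long device_private_page_to_offset(const struct page *page)
{
	return page_to_pfn(page);
}
```

With this in place, each core-mm conversion patch is trivially
verifiable as behaviour-preserving.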

[...]

> diff --git a/mm/util.c b/mm/util.c
> index 65e3f1a97d76..8482ebc5c394 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -1244,7 +1244,10 @@ void snapshot_page(struct page_snapshot *ps, const struct page *page)
>  	struct folio *foliop;
>  	int loops = 5;
>  
> -	ps->pfn = page_to_pfn(page);
> +	if (is_device_private_page(page))
> +		ps->pfn = device_private_page_to_offset(page);
> +	else
> +		ps->pfn = page_to_pfn(page);
>  	ps->flags = PAGE_SNAPSHOT_FAITHFUL;

Why is that not done by the caller?

-- 
Cheers,

David


Thread overview: 30+ messages
2026-02-02 11:36 [PATCH v6 00/13] Remove device private pages from " Jordan Niethe
2026-02-02 11:36 ` [PATCH v6 01/13] mm/migrate_device: Introduce migrate_pfn_from_page() helper Jordan Niethe
2026-02-27 21:11   ` David Hildenbrand (Arm)
2026-03-01 23:38     ` Jordan Niethe
2026-03-02  9:22       ` David Hildenbrand (Arm)
2026-03-03  5:52         ` Jordan Niethe
2026-03-03 16:32           ` David Hildenbrand (Arm)
2026-02-02 11:36 ` [PATCH v6 02/13] drm/amdkfd: Use migrate pfns internally Jordan Niethe
2026-03-03 16:40   ` David Hildenbrand (Arm)
2026-02-02 11:36 ` [PATCH v6 03/13] mm/migrate_device: Make migrate_device_{pfns,range}() take mpfns Jordan Niethe
2026-03-03 16:52   ` David Hildenbrand (Arm)
2026-02-02 11:36 ` [PATCH v6 04/13] mm/migrate_device: Add migrate PFN flag to track device private pages Jordan Niethe
2026-03-03 16:58   ` David Hildenbrand (Arm)
2026-02-02 11:36 ` [PATCH v6 05/13] mm/page_vma_mapped: Add flag to page_vma_mapped_walk::flags " Jordan Niethe
2026-03-06 15:44   ` David Hildenbrand (Arm)
2026-02-02 11:36 ` [PATCH v6 06/13] mm: Add helpers to create migration entries from struct pages Jordan Niethe
2026-03-06 15:59   ` David Hildenbrand (Arm)
2026-02-02 11:36 ` [PATCH v6 07/13] mm: Add a new swap type for migration entries of device private pages Jordan Niethe
2026-02-02 11:36 ` [PATCH v6 08/13] mm: Add softleaf support for device private migration entries Jordan Niethe
2026-02-02 11:36 ` [PATCH v6 09/13] mm: Begin creating " Jordan Niethe
2026-02-02 11:36 ` [PATCH v6 10/13] mm: Add helpers to create device private entries from struct pages Jordan Niethe
2026-02-02 11:36 ` [PATCH v6 11/13] mm/util: Add flag to track device private pages in page snapshots Jordan Niethe
2026-03-06 16:02   ` David Hildenbrand (Arm)
2026-03-06 16:03     ` David Hildenbrand (Arm)
2026-02-02 11:36 ` [PATCH v6 12/13] mm/hmm: Add flag to track device private pages Jordan Niethe
2026-03-06 16:05   ` David Hildenbrand (Arm)
2026-02-02 11:36 ` [PATCH v6 13/13] mm: Remove device private pages from the physical address space Jordan Niethe
2026-03-06 16:11   ` David Hildenbrand (Arm) [this message]
2026-02-06 13:08 ` [PATCH v6 00/13] Remove device private pages from " David Hildenbrand (Arm)
2026-03-06 16:16 ` David Hildenbrand (Arm)
