From: Yonatan Maman <ymaman@nvidia.com>
To: <nouveau@lists.freedesktop.org>, <linux-kernel@vger.kernel.org>,
	<linux-rdma@vger.kernel.org>, <linux-mm@kvack.org>,
	<herbst@redhat.com>, <lyude@redhat.com>, <dakr@redhat.com>,
	<airlied@gmail.com>, <simona@ffwll.ch>, <jgg@ziepe.ca>,
	<leon@kernel.org>, <jglisse@redhat.com>,
	<akpm@linux-foundation.org>, <dri-devel@lists.freedesktop.org>,
	<apopple@nvidia.com>, <bskeggs@nvidia.com>
Cc: Yonatan Maman <Ymaman@Nvidia.com>, Gal Shalom <GalShalom@Nvidia.com>
Subject: [PATCH v1 1/4] mm/hmm: HMM API for P2P DMA to device zone pages
Date: Tue, 15 Oct 2024 18:23:45 +0300
Message-ID: <20241015152348.3055360-2-ymaman@nvidia.com>
In-Reply-To: <20241015152348.3055360-1-ymaman@nvidia.com>

From: Yonatan Maman <Ymaman@Nvidia.com>

hmm_range_fault() natively triggers a page fault on device private
pages, migrating them to RAM. In some cases, such as with RDMA devices,
the migration overhead between the device (e.g., GPU) and the CPU, and
vice-versa, significantly degrades performance. Enabling Peer-to-Peer
(P2P) DMA access to device private pages can therefore be crucial for
minimizing data transfer overhead.

This change introduces an API to support P2P connections for device
private pages by implementing the following:

 - Leveraging struct dev_pagemap_ops for a P2P page callback. The
   callback maps the private page to MMIO and returns the
   corresponding PCI P2PDMA page.

 - Utilizing hmm_range_fault() for initializing P2P connections. The
   API adds the HMM_PFN_REQ_ALLOW_P2P flag, which lets the
   hmm_range_fault() caller request P2P. If set, and if the owner
   device supports P2P, hmm_range_fault() first attempts to establish
   a P2P connection using the callback's P2P page. On failure or lack
   of support, hmm_range_fault() falls back to the regular flow of
   migrating the page to RAM (a usage sketch follows below).

This change does not affect existing users of hmm_range_fault(),
because both the caller and the page owner must explicitly opt in
before a P2P connection is initialized.
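
A minimal sketch of both sides of the API (all my_* names are
hypothetical helpers used for illustration; they are not part of this
patch):

  /* Driver side: publish the callback in dev_pagemap_ops. */
  static struct page *my_get_dma_page(struct page *private_page)
  {
  	/*
  	 * Map the private page to MMIO and return the matching PCI
  	 * P2PDMA page; my_dev_to_p2p_page() is an assumed driver
  	 * helper that may return an ERR_PTR() on failure.
  	 */
  	return my_dev_to_p2p_page(private_page);
  }

  static const struct dev_pagemap_ops my_pagemap_ops = {
  	.migrate_to_ram          = my_migrate_to_ram,
  	.get_dma_page_for_device = my_get_dma_page,
  };

  /* Caller side: opt in to P2P pages via default_flags. */
  static int my_fault_range(struct hmm_range *range)
  {
  	range->default_flags = HMM_PFN_REQ_FAULT | HMM_PFN_REQ_ALLOW_P2P;
  	return hmm_range_fault(range);
  }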

Signed-off-by: Yonatan Maman <Ymaman@Nvidia.com>
Reviewed-by: Gal Shalom <GalShalom@Nvidia.com>
---
 include/linux/hmm.h      |  2 ++
 include/linux/memremap.h |  7 +++++++
 mm/hmm.c                 | 28 ++++++++++++++++++++++++++++
 3 files changed, 37 insertions(+)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 126a36571667..7154f5ed73a1 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -41,6 +41,8 @@ enum hmm_pfn_flags {
 	/* Input flags */
 	HMM_PFN_REQ_FAULT = HMM_PFN_VALID,
 	HMM_PFN_REQ_WRITE = HMM_PFN_WRITE,
+	/* allow returning PCI P2PDMA pages */
+	HMM_PFN_REQ_ALLOW_P2P = 1,
 
 	HMM_PFN_FLAGS = 0xFFUL << HMM_PFN_ORDER_SHIFT,
 };
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 3f7143ade32c..0ecfd3d191fa 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -89,6 +89,13 @@ struct dev_pagemap_ops {
 	 */
 	vm_fault_t (*migrate_to_ram)(struct vm_fault *vmf);
 
+	/*
+	 * Used for private (un-addressable) device memory only. Returns a
+	 * corresponding struct page that can be mapped to the device
+	 * (e.g. using dma_map_page())
+	 */
+	struct page *(*get_dma_page_for_device)(struct page *private_page);
+
 	/*
 	 * Handle the memory failure happens on a range of pfns.  Notify the
 	 * processes who are using these pfns, and try to recover the data on
diff --git a/mm/hmm.c b/mm/hmm.c
index 7e0229ae4a5a..987dd143d697 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -230,6 +230,8 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 	unsigned long cpu_flags;
 	pte_t pte = ptep_get(ptep);
 	uint64_t pfn_req_flags = *hmm_pfn;
+	struct page *(*get_dma_page_handler)(struct page *private_page);
+	struct page *dma_page;
 
 	if (pte_none_mostly(pte)) {
 		required_fault =
@@ -257,6 +259,32 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 			return 0;
 		}
 
+		/*
+		 * For supported pages, and when the caller requests it,
+		 * translate the private page to the matching P2P page.
+		 * If that fails, continue with the regular fault flow.
+		 */
+		if (is_device_private_entry(entry)) {
+			get_dma_page_handler =
+				pfn_swap_entry_to_page(entry)
+					->pgmap->ops->get_dma_page_for_device;
+			if ((hmm_vma_walk->range->default_flags &
+			    HMM_PFN_REQ_ALLOW_P2P) &&
+			    get_dma_page_handler) {
+				dma_page = get_dma_page_handler(
+					pfn_swap_entry_to_page(entry));
+				if (!IS_ERR(dma_page)) {
+					cpu_flags = HMM_PFN_VALID;
+					if (is_writable_device_private_entry(
+						    entry))
+						cpu_flags |= HMM_PFN_WRITE;
+					*hmm_pfn = page_to_pfn(dma_page) |
+						   cpu_flags;
+					return 0;
+				}
+			}
+		}
+
 		required_fault =
 			hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags, 0);
 		if (!required_fault) {
-- 
2.34.1


