From: Yisheng Xie <xieyisheng1@huawei.com>
To: Jerome Glisse <jglisse@redhat.com>
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, John Hubbard <jhubbard@nvidia.com>,
Dan Williams <dan.j.williams@intel.com>,
David Nellans <dnellans@nvidia.com>
Subject: Re: [PATCH 00/15] HMM (Heterogeneous Memory Management) v24
Date: Fri, 21 Jul 2017 17:03:22 +0800
Message-ID: <40b9d534-4809-0a84-27dc-5c3faee3f69c@huawei.com>
In-Reply-To: <20170720171850.GC2767@redhat.com>
Hi Jerome,
On 2017/7/21 1:18, Jerome Glisse wrote:
> On Wed, Jul 19, 2017 at 07:48:08PM +0800, Yisheng Xie wrote:
>> Hi Jerome
>>
>> On 2017/6/29 2:00, Jerome Glisse wrote:
>>>
>>> Patchset is on top of git://git.cmpxchg.org/linux-mmotm.git so I
>>> test the same kernel as the kbuild system; git branch:
>>>
>>> https://cgit.freedesktop.org/~glisse/linux/log/?h=hmm-v24
>>>
>>> Changes since v23 are code comment fixes, simplified kernel configuration and
>>> improved allocation of new pages on migration to device memory (last patch
>>> in this patchset).
>>>
>>> Everything else is the same. Below is the long description of what HMM
>>> is about and why. At the end of this email I briefly describe each patch
>>> and suggest reviewers for each of them.
>>>
>>>
>>> Heterogeneous Memory Management (HMM) (description and justification)
>>>
>>> Today device drivers expose a dedicated memory allocation API through their
>>> device file, often relying on a combination of IOCTL and mmap calls. The
>>> device can only access and use memory allocated through this API. This
>>> effectively splits the program address space into objects allocated for the
>>> device and usable by the device, and other regular memory (malloc, mmap
>>> of a file, shared memory, ...) only accessible by the CPU (or in a very
>>> limited way by a device by pinning memory).
>>>
>>> Allowing different isolated components of a program to use a device thus
>>> requires duplication of the input data structures using the device memory
>>> allocator. This is reasonable for simple data structures (array, grid,
>>> image, ...) but this gets extremely complex with advanced data structures
>>> (list, tree, graph, ...) that rely on a web of memory pointers. This is
>>> becoming a serious limitation on the kind of workload that can be
>>> offloaded to devices like GPUs.
>>>
>>> New industry standards like C++, OpenCL or CUDA are pushing to remove this
>>> barrier. This requires a shared address space between GPU device and CPU so
>>> that the GPU can access any memory of a process (while still obeying memory
>>> protection like read only). This kind of feature is also appearing in
>>> various other operating systems.
>>>
>>> HMM is a set of helpers to facilitate several aspects of address space
>>> sharing and device memory management. Unlike existing sharing mechanisms
>>> that rely on pinning pages used by a device, HMM relies on mmu_notifier to
>>> propagate CPU page table updates to the device page table.
>>>
>>> Duplicating the CPU page table is only one aspect necessary for efficiently
>>> using a device like a GPU. GPU local memory has bandwidth in the
>>> TeraBytes/second range but it is connected to main memory through a system
>>> bus like PCIE that is limited to 32GigaBytes/second (PCIE 4.0 16x). Thus it
>>> is necessary to allow migration of process memory from main system memory
>>> to device memory. The issue is that on platforms that only have PCIE the
>>> device memory is not accessible by the CPU with the same properties as main
>>> memory (cache coherency, atomic operations, ...).
>>>
>>> To allow migration from main memory to device memory HMM provides a set
>>> of helpers to hotplug device memory as a new type of ZONE_DEVICE memory
>>> which is un-addressable by the CPU but still has struct page representing it.
>>> This allows most of the core kernel logic that deals with a process's memory
>>> to stay oblivious of the peculiarity of device memory.
>>>
>>> When a page backing an address of a process is migrated to device memory
>>> the CPU page table entry is set to a new specific swap entry. CPU access
>>> to such an address triggers a migration back to system memory, just as if
>>> the page was swapped to disk.
>>> [...]
>>> To allow efficient migration between device memory and main memory a new
>>> migrate_vma() helper is added with this patchset. It allows leveraging
>>> the device DMA engine to perform the copy operation.
>>>
>>
>> Does this mean that when the CPU accesses an address of a process that has been
>> migrated to device memory, it should call migrate_vma() to migrate a range of
>> addresses back to the CPU? If so, I think this function should be called
>> somewhere in this patchset; however, I cannot find any call to it.
>>
>> Or am I missing something?
>
> There is a page_fault callback in struct dev_pagemap. The device driver will
> set that callback to a device driver function that itself might call
> migrate_vma(). It might call a different helper though.
>
> For instance GPU drivers commonly use memory oversubscription, i.e. they
> evict device memory to system pages to make room for other stuff. If a
> page fault happens while there is already a system page for that memory
> then the device driver might only need to hand over that page, with no
> need to migrate anything.
>
> That is why you do not see migrate_vma() call in this patchset. Calls
> to that function will be inside the individual device driver.
>
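To check my understanding of the flow described above, here is a small
userspace sketch (not kernel code, and not taken from the patchset). All
names in it (model_page, model_pagemap, model_driver_fault,
model_migrate_to_system, model_cpu_access) are hypothetical; only the idea
of a page_fault callback that the driver fills in, and that may or may not
end up migrating, comes from the explanation above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum page_location { IN_DEVICE_MEMORY, IN_SYSTEM_MEMORY };

struct model_page {
	enum page_location where;
	bool has_system_copy;	/* driver already evicted it to a system page */
};

/* Stands in for the page_fault callback that the driver sets. */
struct model_pagemap {
	int (*page_fault)(struct model_page *page);
};

/* Counts how often the migration path was taken. */
static int migrate_calls;

/* Stands in for a driver-side call to a migration helper. */
static int model_migrate_to_system(struct model_page *page)
{
	migrate_calls++;
	page->where = IN_SYSTEM_MEMORY;
	page->has_system_copy = true;
	return 0;
}

/* Driver callback: hand over an existing system page, else migrate. */
static int model_driver_fault(struct model_page *page)
{
	if (page->has_system_copy) {
		page->where = IN_SYSTEM_MEMORY;
		return 0;
	}
	return model_migrate_to_system(page);
}

/* Core side: a CPU access to a device-resident page invokes the callback. */
static int model_cpu_access(struct model_pagemap *pgmap,
			    struct model_page *page)
{
	assert(pgmap->page_fault != NULL);
	if (page->where == IN_DEVICE_MEMORY)
		return pgmap->page_fault(page);
	return 0;
}
```

In this model the core code only knows about the callback; whether a
migration happens is decided entirely inside the driver function, which
would explain why the patchset itself contains no migrate_vma() caller.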
Got your point.
Without an open source driver, it is hard to get the whole view of this solution.
I hope to see your open source driver soon.
Thanks
Yisheng Xie