From: Valmiki <valmikibow@gmail.com>
To: Ralph Campbell <rcampbell@nvidia.com>, linux-mm@kvack.org
Cc: jglisse@redhat.com
Subject: Re: Regarding HMM
Date: Sun, 23 Aug 2020 18:38:16 +0530
Message-ID: <6b768b7d-e754-ebea-8467-005c38db6dd9@gmail.com>
In-Reply-To: <3482c2c7-6827-77f7-a581-69af8adc73c3@nvidia.com>


On 18-08-2020 10:36 pm, Ralph Campbell wrote:
> 
> On 8/18/20 12:15 AM, Valmiki wrote:
>> Hi All,
>>
>> I'm trying to understand heterogeneous memory management (HMM), and I
>> have the following doubts.
>>
>> If HMM is being used, do we not have to use a DMA controller on the
>> device for memory transfers? Without DMA, if software is managing page
>> faults and migrations, will there be any performance impact?
>>
>> Is HMM targeted at specific use cases where there is no DMA controller
>> on the device?
>>
>> Regards,
>> Valmiki
>>
> 
> There are two APIs that are part of "HMM", and they are independent of
> each other.
>
> hmm_range_fault() is for getting the physical address of a page resident
> in system memory so that a device can map it, without pinning the page
> the way I/O usually does (by raising the page reference count). The
> device driver has to handle invalidation callbacks to remove the device
> mapping. This lets the device access the page without moving it.
> 
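For reference, a minimal sketch of the hmm_range_fault() pattern described
above, assuming roughly the v5.8 API in include/linux/hmm.h. The helper
name dummy_fault_range() and the step that programs the device page tables
are hypothetical, driver-specific details:

#include <linux/hmm.h>
#include <linux/mm.h>
#include <linux/mmu_notifier.h>

/*
 * Fault in a user VA range and collect its PFNs without pinning the
 * pages. The caller is assumed to have registered @notifier on the mm
 * and to hold a reference on that mm.
 */
static int dummy_fault_range(struct mmu_interval_notifier *notifier,
			     unsigned long start, unsigned long end,
			     unsigned long *pfns)
{
	struct mm_struct *mm = notifier->mm;
	struct hmm_range range = {
		.notifier = notifier,
		.start = start,
		.end = end,
		.hmm_pfns = pfns,
		.default_flags = HMM_PFN_REQ_FAULT | HMM_PFN_REQ_WRITE,
	};
	int ret;

again:
	range.notifier_seq = mmu_interval_read_begin(notifier);
	mmap_read_lock(mm);
	ret = hmm_range_fault(&range);
	mmap_read_unlock(mm);
	if (ret == -EBUSY)
		goto again;		/* raced with an invalidation */
	if (ret)
		return ret;

	/*
	 * A real driver takes its device page table lock here, then checks
	 * that no invalidation ran since mmu_interval_read_begin() before
	 * programming device PTEs from range.hmm_pfns[].
	 */
	if (mmu_interval_read_retry(notifier, range.notifier_seq))
		goto again;

	return 0;
}

The invalidation callback registered with the notifier is what tears the
device mapping down again, which is why no extra page reference is needed.
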
> migrate_vma_setup(), migrate_vma_pages(), and migrate_vma_finalize() are
> used by the device driver to migrate data to device private memory.
> After migration, the system memory is freed and the CPU page table holds
> an invalid PTE that points to the device private struct page (similar to
> a swap PTE). If the CPU process faults on that address, there is a
> callback to the driver to migrate the data back to system memory. This
> is where device DMA engines can be used to copy data between system
> memory and device private memory.
> 
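As a rough illustration of that migrate_vma_*() sequence (loosely following
lib/test_hmm.c and the nouveau driver, against roughly the v5.9 API), where
dummy_alloc_device_page() and dummy_copy_to_device() stand in for the
driver's device-private page allocator and its DMA copy:

#include <linux/migrate.h>
#include <linux/mm.h>

/*
 * Migrate one anonymous page at @addr from system memory into device
 * private memory. dummy_alloc_device_page() is assumed to return a
 * locked device private struct page; dummy_copy_to_device() would
 * typically drive the device's DMA engine.
 */
static int dummy_migrate_to_device(struct vm_area_struct *vma,
				   unsigned long addr, void *pgmap_owner)
{
	unsigned long src_pfn = 0, dst_pfn = 0;
	struct migrate_vma args = {
		.vma		= vma,
		.start		= addr,
		.end		= addr + PAGE_SIZE,
		.src		= &src_pfn,
		.dst		= &dst_pfn,
		.pgmap_owner	= pgmap_owner,
		.flags		= MIGRATE_VMA_SELECT_SYSTEM,
	};
	struct page *dpage;
	int ret;

	ret = migrate_vma_setup(&args);	/* isolates and unmaps the CPU PTE */
	if (ret)
		return ret;

	if (args.cpages && (src_pfn & MIGRATE_PFN_MIGRATE)) {
		dpage = dummy_alloc_device_page();
		dummy_copy_to_device(dpage, migrate_pfn_to_page(src_pfn));
		dst_pfn = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
	}

	migrate_vma_pages(&args);	/* install the device private entry */
	migrate_vma_finalize(&args);	/* free the old system memory page */
	return 0;
}

The reverse path is the dev_pagemap_ops->migrate_to_ram() callback: when
the CPU faults on the device private PTE, the driver runs the same sequence
with MIGRATE_VMA_SELECT_DEVICE_PRIVATE and copies the data back to a newly
allocated system page before the faulting instruction is retried.
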
> The use case for the above is to be able to run code such as OpenCL on
> GPUs and CPUs using the same virtual addresses without having to call
> special memory allocators. In other words, just use mmap() and malloc()
> and not clSVMAlloc().
> 
> There is a performance consideration here. If the GPU accesses the data
> over PCIe to system memory, there is much less bandwidth than accessing
> local GPU memory. If the data is to be accessed many times, it can be
> more efficient to migrate the data to local GPU memory. If the data is
> only accessed a few times, then it is probably more efficient to map
> system memory.
Thanks, Ralph, for the clarification.

