Subject: Re: Regarding HMM
To: Ralph Campbell, Valmiki
CC: linux-mm@kvack.org
From: John Hubbard
Message-ID: <9af4d56c-61f5-9367-28bf-b6f1236e90fa@nvidia.com>
In-Reply-To: <3482c2c7-6827-77f7-a581-69af8adc73c3@nvidia.com>
Date: Tue, 18 Aug 2020 13:35:11 -0700

On 8/18/20 10:06 AM, Ralph Campbell wrote:
>
> On 8/18/20 12:15 AM, Valmiki wrote:
>> Hi All,
>>
>> I'm trying to understand heterogeneous memory management (HMM), and I
>> have a few questions.
>>
>> If HMM is being used, do we no longer need a DMA controller on the
>> device for memory transfers?

Hi,

Nothing about HMM either requires or prevents using DMA controllers.

>> Without DMA, if software is managing page faults and migrations, will
>> there be any performance impact?
>>
>> Is HMM targeted at specific use cases where there is no DMA controller
>> on the device?
>>
>> Regards,
>> Valmiki
>
> There are two APIs that are part of "HMM", and they are independent of
> each other.
>
> hmm_range_fault() is for getting the physical address of a
> system-resident memory page that a device can map, without pinning the
> page the way I/O usually does (by elevating the page reference count).
> The device driver has to handle invalidation callbacks and remove the
> device mapping when notified. This lets the device access the page
> without moving it.
>
> migrate_vma_setup(), migrate_vma_pages(), and migrate_vma_finalize()
> are used by the device driver to migrate data to device private memory.
> After migration, the system memory is freed and the CPU page table
> holds an invalid PTE that points to the device private struct page
> (similar to a swap PTE). If the CPU process faults on that address,
> there is a callback to the driver to migrate the data back to system
> memory. This is where device DMA engines can be used to copy data
> between system memory and device private memory.
>
> The use case for the above is to be able to run code such as OpenCL on
> GPUs and CPUs using the same virtual addresses, without having to call
> special memory allocators. In other words, just use mmap() and
> malloc(), not clSVMAlloc().
>
> There is a performance consideration here. If the GPU accesses data in
> system memory over PCIe, there is much less bandwidth available than
> when accessing local GPU memory. If the data will be accessed many
> times, it can be more efficient to migrate it to local GPU memory. If
> the data is only accessed a few times, it is probably more efficient to
> map system memory.

Ralph, that's a good write-up!

Valmiki, did you already read Documentation/vm/hmm.rst before posting
your question? It's OK to say "no"--I'm not asking in order to
criticize, but in order to calibrate the documentation. To make Ralph's
description concrete, I'll also sketch both calling patterns below.
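First, hmm_range_fault(). A driver typically calls it in a retry loop
paired with an mmu interval notifier, and only programs the device page
table after confirming that no invalidation raced with the fault. This
is a rough sketch modeled on the example in hmm.rst, for v5.8-era
kernels: struct driver_data, update_lock, and
driver_update_device_page_table() are hypothetical driver-side names,
and the exact struct fields have changed across kernel versions.

int driver_populate_range(struct driver_data *drv,
			  struct mmu_interval_notifier *notifier,
			  unsigned long start, unsigned long end)
{
	unsigned long npages = (end - start) >> PAGE_SHIFT;
	unsigned long *pfns;
	struct hmm_range range = {
		.notifier      = notifier,
		.start         = start,
		.end           = end,
		.default_flags = HMM_PFN_REQ_FAULT,
	};
	int ret;

	pfns = kcalloc(npages, sizeof(*pfns), GFP_KERNEL);
	if (!pfns)
		return -ENOMEM;
	range.hmm_pfns = pfns;

	if (!mmget_not_zero(notifier->mm)) {
		kfree(pfns);
		return -EFAULT;
	}

again:
	range.notifier_seq = mmu_interval_read_begin(notifier);
	mmap_read_lock(notifier->mm);
	ret = hmm_range_fault(&range);
	mmap_read_unlock(notifier->mm);
	if (ret) {
		if (ret == -EBUSY)
			goto again;	/* raced with an invalidation */
		goto out;
	}

	/* The driver's own lock serializes against invalidation callbacks. */
	mutex_lock(&drv->update_lock);
	if (mmu_interval_read_retry(notifier, range.notifier_seq)) {
		mutex_unlock(&drv->update_lock);
		goto again;
	}
	/* range.hmm_pfns[] is now stable; program the device page table. */
	driver_update_device_page_table(drv, &range);
	mutex_unlock(&drv->update_lock);
out:
	mmput(notifier->mm);
	kfree(pfns);
	return ret;
}

Note that nothing is pinned here: if the mm changes later, the interval
notifier fires and the driver has to unmap and, if needed, repeat.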
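Second, the migration path. The three-call sequence unmaps and collects
the source pages, lets the driver allocate device private pages and copy
the data (this is where a device DMA engine naturally slots in), then
commits the result. Again a sketch, not a drop-in:
alloc_device_private_page() and driver_copy_to_device() are
hypothetical, the chunk size is arbitrary, and struct migrate_vma's
fields vary by kernel version.

#define MIG_CHUNK 64	/* arbitrary: migrate up to 64 pages per call */

static int driver_migrate_to_device(struct driver_data *drv,
				    struct vm_area_struct *vma,
				    unsigned long start, unsigned long end)
{
	unsigned long src[MIG_CHUNK] = {}, dst[MIG_CHUNK] = {};
	struct migrate_vma args = {
		.vma   = vma,
		.start = start,
		.end   = end,	/* caller guarantees <= MIG_CHUNK pages */
		.src   = src,
		.dst   = dst,
	};
	unsigned long i;
	int ret;

	/* Unmap the CPU pages and collect them into args.src[]. */
	ret = migrate_vma_setup(&args);
	if (ret)
		return ret;

	for (i = 0; i < args.npages; i++) {
		struct page *dpage, *spage;

		if (!(args.src[i] & MIGRATE_PFN_MIGRATE))
			continue;	/* this page cannot be migrated */

		/* spage may be NULL (e.g. an unpopulated or zero page). */
		spage = migrate_pfn_to_page(args.src[i]);
		dpage = alloc_device_private_page(drv);	/* hypothetical */
		if (!dpage)
			continue;	/* leaving dst[i] empty skips the page */

		/* Typically done with the device's DMA engine. */
		driver_copy_to_device(drv, dpage, spage);

		args.dst[i] = migrate_pfn(page_to_pfn(dpage)) |
			      MIGRATE_PFN_LOCKED;
	}

	migrate_vma_pages(&args);	/* install the device pages */
	migrate_vma_finalize(&args);	/* free the old system pages */
	return 0;
}

As a rough rule of thumb for Ralph's bandwidth point: migration costs
about one extra copy of the data over the bus, so it starts to pay off
once the device touches the data more than a couple of times, and local
memory bandwidth (often an order of magnitude above PCIe) dominates
after that.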
We should consider merging Ralph's write-up above into hmm.rst,
depending on whether it helps (which I expect it does, but I've read
hmm.rst too many times by now to see what might be missing). Any time
someone new tries to understand the system, it's an opportunity to
"unit test" the documentation. Ideally, hmm.rst would answer most of a
first-time reader's questions; that's where we'd like to end up.

thanks,
--
John Hubbard
NVIDIA