From: John Hubbard <jhubbard@nvidia.com>
To: Anshuman Khandual <khandual@linux.vnet.ibm.com>,
	Dave Hansen <dave.hansen@intel.com>,
	"lsf-pc@lists.linux-foundation.org"
	<lsf-pc@lists.linux-foundation.org>
Cc: Linux-MM <linux-mm@kvack.org>
Subject: Re: [LSF/MM TOPIC/ATTEND] Memory Types
Date: Mon, 16 Jan 2017 14:45:18 -0800
Message-ID: <7cb30765-6e04-427f-47ea-23fd4158a910@nvidia.com>
In-Reply-To: <22fbcb9f-f69a-6532-691f-c0f757cf6b8b@linux.vnet.ibm.com>



On 01/16/2017 02:59 AM, Anshuman Khandual wrote:
> On 01/16/2017 10:59 AM, Dave Hansen wrote:
>> Historically, computers have sped up memory accesses by either adding
>> cache (or cache layers), or by moving to faster memory technologies
>> (like the DDR3 to DDR4 transition).  Today we are seeing new types of
>> memory being exposed not as caches, but as RAM [1].
>>
>> I'd like to discuss how the NUMA APIs are being reused to manage not
>> just the physical locality of memory, but the various types.  I'd also
>> like to discuss the parts of the NUMA API that are a bit lacking to
>> manage these types, like the inability to have fallback lists based on
>> memory type instead of location.
>>
>> I believe this needs to be a distinct discussion from Jerome's HMM
>> topic.  All of the cases we care about are cache-coherent and can be
>> treated as "normal" RAM by the VM.  The HMM model is for on-device
>> memory and is largely managed outside the core VM.
>
> Agreed. In the future, the core VM should be able to deal with this type of
> coherent memory directly, as part of the generic NUMA API and page
> allocator framework. The type of the coherent memory must be a factor,
> in addition to NUMA distance, both when dealing with it from a NUMA
> perspective and in the page allocation fallback sequence. I have been
> working on a very similar solution called CDM (Coherent Device Memory),
> where we change the zonelist building process as well as the mbind()
> interface to accommodate a type of coherent memory other than existing
> normal system RAM. Here are the related postings and discussions.
>
> https://lkml.org/lkml/2016/10/24/19 (CDM with modified zonelists)
> https://lkml.org/lkml/2016/11/22/339 (CDM with modified cpusets)
>
> Though named "device" for now, it can very well evolve into a generic
> solution that accommodates all kinds of coherent memory (which warrants
> treating it on par with system RAM in the core VM in the first
> place). I would like to attend to discuss this topic.

Yes. I'm also very interested in working through a clear way to describe memory, and use it, for 
CDM, NUMA devices, and HMM situations.
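
To make that concrete: today, the only "type" control userspace has is the node-based NUMA API that 
Dave mentions, so a new memory type ends up being addressed by node number. A minimal sketch of that 
existing interface follows (link with -lnuma; the node number is made up, and note there is no way 
to ask for "any node of this type", which is exactly the fallback-list gap):

#include <numaif.h>     /* mbind(), MPOL_BIND */
#include <sys/mman.h>
#include <stdio.h>

int main(void)
{
        size_t len = 16 * 4096;

        /* Reserve address space; pages are allocated on first touch. */
        void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED) {
                perror("mmap");
                return 1;
        }

        /*
         * Bind the range to (hypothetical) node 1. If node 1 is
         * coherent device memory exposed as a NUMA node -- the CDM
         * idea -- this is the only "type" control available, and there
         * is no way to say "fall back to another node of the same type".
         */
        unsigned long nodemask = 1UL << 1;
        if (mbind(buf, len, MPOL_BIND, &nodemask,
                  sizeof(nodemask) * 8, 0) != 0)
                perror("mbind");

        return 0;
}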

I agree that hardware-based memory coherence should be a major dividing line, as Anshuman mentions 
above. So for non-coherent memory, something very much like HMM still has to exist...or so I believe, 
in the absence of any better ideas. (16 proposed versions of HMM have gone by, and no better ideas 
have surfaced yet, so it's probably about right.)

We also need to work through the memory hotplug questions, and the pfn and struct page questions: 
avoiding conflicts with real physical memory. I'm still hoping for something like "just put the 
device struct pages above the physical limit for CPU struct pages", but I vaguely recall that 
a few CPU architectures didn't like that (I might be mis-remembering that point).
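
For reference, the current (4.9-era) route to struct pages for device memory looks roughly like the 
sketch below; the open question above is where the physical range is allowed to sit relative to real 
RAM pfns. All names here are illustrative, and the devm_memremap_pages() signature has been a moving 
target across releases:

#include <linux/memremap.h>     /* devm_memremap_pages() */
#include <linux/device.h>
#include <linux/err.h>

static int example_map_device_memory(struct device *dev,
                                     struct resource *res,
                                     struct percpu_ref *ref)
{
        void *addr;

        /*
         * 4.9-era signature: create ZONE_DEVICE struct pages covering
         * the physical range described by "res". The debate above is
         * about where that range may sit relative to real RAM pfns.
         */
        addr = devm_memremap_pages(dev, res, ref, NULL);
        if (IS_ERR(addr))
                return PTR_ERR(addr);

        /* From here on, pfn_to_page() works for pfns inside "res". */
        return 0;
}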

Beyond that, there is also the question of whether a CPU or device should move pages around, or 
map them. That's often a fairly dynamic (runtime) decision, depending on both the machine layout 
(there are a *lot* of subtle variations, and more are coming) and the workload.
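
The "move" half of that already has a userspace-visible form, for what it's worth: migration is 
explicit and per-page via move_pages(2), while mapping decisions live entirely in drivers. A tiny 
sketch of the existing interface (again link with -lnuma; the node number is made up):

#include <numaif.h>     /* move_pages(), MPOL_MF_MOVE */
#include <stdio.h>

/*
 * Migrate one page of the calling process to (hypothetical) node 1.
 * Whether to migrate like this, or instead map the page remotely, is
 * the dynamic, workload-dependent decision described above.
 */
static int move_one_page(void *addr)
{
        void *pages[1] = { addr };
        int nodes[1] = { 1 };
        int status[1];

        if (move_pages(0 /* self */, 1, pages, nodes, status,
                       MPOL_MF_MOVE) != 0) {
                perror("move_pages");
                return -1;
        }
        return status[0];       /* resulting node, or a negative errno */
}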

Furthermore, similar questions apply to memory that can be shared peer-to-peer. So far, both those 
points are handled in device drivers, but as the kernel becomes more aware of high-speed, 
page-fault-capable remote devices, we may want to move some of this into the main kernel.

Also, as Anshuman pointed out in another thread, these topics go together (although they're unlikely 
to fit into a single hour, so a couple of slots might be better):

(1) [LSF/MM TOPIC/ATTEND] Memory Types
(2) [LSF/MM TOPIC] Un-addressable device memory and block/fs implications
(3) [LSF/MM TOPIC] Memory hotplug, ZONE_DEVICE, and the future of struct page
(4) [LSF/MM ATTEND] HMM, CDM and other infrastructure for device memory management

I'm interested in all of the above.

thanks
John Hubbard
NVIDIA

