From: Jerome Glisse <jglisse@redhat.com>
To: Joerg Roedel <joro@8bytes.org>
Cc: ksummit-discuss@lists.linuxfoundation.org
Subject: Re: [Ksummit-discuss] [CORE TOPIC] Core Kernel support for Compute-Offload Devices
Date: Fri, 31 Jul 2015 12:34:53 -0400
Message-ID: <20150731163453.GB2039@redhat.com>
In-Reply-To: <20150730135440.GB14980@8bytes.org>
On Thu, Jul 30, 2015 at 13:54:40 UTC, Joerg Roedel wrote:
> On Thu, Jul 30, 2015 at 02:31:38PM +0100, David Woodhouse wrote:
> > On Thu, 2015-07-30 at 15:00 +0200, Joerg Roedel wrote:
> > > (2.3) How can we attach common state for off-CPU tasks to
> > > mm_struct (and what needs to be in there)?
> >
> > And how do we handle the assignment of Address Space IDs? The AMD
> > implementation currently allows the PASID space to be managed
> > per-device, but I understand ARM systems handle the TLB shootdown
> > broadcasts in hardware and need the PASID that the device sees to be
> > identical to the ASID on the CPU's MMU? And there are reasons why we
> > might actually want that model on Intel systems too. I'm working on the
> > Intel SVM right now, and looking at a single-PASID-space model (partly
> > because the PASID tables have to be physically contiguous, and they can
> > be huge!).
>
> True, ASIDs would be one thing that needs to be attached to a mm_struct,
> but I am also interested in what other platforms might need here. For
> example, is there a better way to track these off-cpu users than using
> mmu-notifiers?
No, the ASID should not be associated with the mm_struct. There are
too few ASIDs for that; I think there are currently only 8 bits worth
of ASID. So what happens is that the GPU device driver schedules
processes and recycles ASIDs as it goes.
Which means that ASIDs really need to be under device driver control.
As I explained in another mail, only the device driver knows how to
schedule things for a given device, and it is too hardware-specific to
be moved into common code.
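To make this concrete, here is a rough sketch of a driver-side ASID
allocator (hypothetical code, not taken from any existing driver):
with only 8 bits of ASID the driver hands them out from a small pool
and recycles them as it schedules work on the device.

#include <linux/bitmap.h>
#include <linux/errno.h>
#include <linux/spinlock.h>

#define DEV_NR_ASID	256	/* 8-bit ASID space */

struct dev_asid_pool {
	spinlock_t	lock;
	DECLARE_BITMAP(used, DEV_NR_ASID);
};

/* Return a free ASID, or -ENOSPC so the caller can preempt/recycle. */
static int dev_asid_alloc(struct dev_asid_pool *pool)
{
	int asid;

	spin_lock(&pool->lock);
	asid = find_first_zero_bit(pool->used, DEV_NR_ASID);
	if (asid < DEV_NR_ASID)
		set_bit(asid, pool->used);
	else
		asid = -ENOSPC;
	spin_unlock(&pool->lock);
	return asid;
}

static void dev_asid_free(struct dev_asid_pool *pool, int asid)
{
	spin_lock(&pool->lock);
	clear_bit(asid, pool->used);
	spin_unlock(&pool->lock);
}

The point is that allocation and recycling sit entirely in the
driver's scheduling path, not in core mm code.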
> > > (3) Does it make sense to implement automatic migration of
> > > system memory to device memory (when available) and vice
> > > versa? How do we decide what and when to migrate?
> >
> > This is quite a horrid one, but perhaps ties into generic NUMA
> > considerations — if a memory page is being frequently accessed by
> > something that it's far away from, can we move it to closer memory?
>
> Yeah, conceptually it is NUMA, so it might fit there. But the difference
> from the current NUMA handling is that the device memory is not always
> completely visible to the CPU, so I think some quite significant changes
> are necessary to make this work.
My HMM patchset already handles all of this for anonymous memory. I
showed a proof of concept for file-backed memory, but I am exploring
other methods for that.
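For reference, the tracking side of this builds on mmu-notifiers: the
driver registers against the mm and gets invalidation callbacks, so
it can tear down device copies when CPU mappings change. A minimal
sketch (the names here are made up; the real code is in the patchset):

#include <linux/mm.h>
#include <linux/mmu_notifier.h>

static void dev_invalidate_range_start(struct mmu_notifier *mn,
				       struct mm_struct *mm,
				       unsigned long start,
				       unsigned long end)
{
	/* Shoot down device PTEs / migrated pages covering [start, end). */
}

static const struct mmu_notifier_ops dev_mn_ops = {
	.invalidate_range_start	= dev_invalidate_range_start,
};

static struct mmu_notifier dev_mn = {
	.ops = &dev_mn_ops,
};

/* Called once per mirrored process, from that process's context. */
static int dev_mirror_mm(struct mm_struct *mm)
{
	return mmu_notifier_register(&dev_mn, mm);
}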
> > Another idea is to handle migration like swapping. The difference
> > from real swapping is that it would not rely on the LRU lists but on
> > the device access patterns we measure.
>
> > The question is how we handle that. We do have Extended Accessed bits
> > in the Intel implementation of SVM that let us know that a given PTE
> > was used from a device. Although not *which* device, in cases where
> > there might be more than one.
>
> One way would be to use separate page-tables for the devices (which,
> on the other hand, somehow contradicts the design of the hardware,
> because it is designed to reuse the cpu page-tables).
So HMM uses a separate page table for storing information related to
migrated memory. Note that not all hardware reuses the CPU page
tables; some does not, and it is very much a platform thing.
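As a rough illustration (hypothetical structures, not the actual ones
from the patchset, which uses its own page-table-like layout), the
side structure is just a lookup tree, separate from the CPU page
table, that records which pages currently live in device memory:

#include <linux/gfp.h>
#include <linux/radix-tree.h>
#include <linux/spinlock.h>

struct dev_page {
	unsigned long	dev_pfn;	/* location in device memory */
	bool		dirty;		/* copy back before CPU access */
};

struct dev_mirror {
	spinlock_t		lock;
	struct radix_tree_root	migrated;	/* pgoff -> struct dev_page */
};

static void dev_mirror_init(struct dev_mirror *m)
{
	spin_lock_init(&m->lock);
	INIT_RADIX_TREE(&m->migrated, GFP_ATOMIC);	/* inserts happen under lock */
}

static int dev_mirror_record(struct dev_mirror *m, unsigned long pgoff,
			     struct dev_page *dp)
{
	int ret;

	spin_lock(&m->lock);
	ret = radix_tree_insert(&m->migrated, pgoff, dp);
	spin_unlock(&m->lock);
	return ret;
}

On fault or invalidation the driver looks up the entry, copies the
page back if it is dirty, and removes it, so the CPU page table never
has to know about device memory it cannot address.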
> And I don't know which features other devices have (like the CAPI
> devices on Power that Paul wrote about) to help in this decision.
CAPI would not need special PTEs, as with CAPI the device memory is
accessible to the CPU as regular memory. Only platforms that cannot
offer this need special handling. AFAICT x86 and ARM have no plans to
offer that level of integration (though lately I have not paid close
attention to what new features the PCIe consortium is discussing).
Joerg, I think you really want to take a look at my patchset to see
how I implemented this. I have been discussing it with AMD, Mellanox,
NVidia, and a couple of other smaller, specialized hardware
manufacturers.
Cheers,
Jérôme