Re: [RFC v2 1/2] dma-buf: Introduce dma buffer sharing mechanism


From: "Semwal, Sumit" <sumit.semwal@ti.com>
To: Rob Clark <rob@ti.com>
Cc: Daniel Vetter <daniel@ffwll.ch>,
	t.stanislaws@samsung.com, linux@arm.linux.org.uk,
	Arnd Bergmann <arnd@arndb.de>,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linaro-mm-sig@lists.linaro.org, linux-mm@kvack.org,
	m.szyprowski@samsung.com, Sumit Semwal <sumit.semwal@linaro.org>,
	linux-arm-kernel@lists.infradead.org,
	linux-media@vger.kernel.org
Subject: Re: [RFC v2 1/2] dma-buf: Introduce dma buffer sharing mechanism
Date: Wed, 7 Dec 2011 18:57:17 +0530
Message-ID: <CAB2ybb-0mTdNXN82O1TUGVjhMZUQtQb07A3EVmmdxg3ngEc3Dw@mail.gmail.com>
In-Reply-To: <CAF6AEGto-+oSqguuWyPunUbtE65GpNiXh21srQzrChiBQMb1Nw@mail.gmail.com>

Hi Daniel, Rob,


On Tue, Dec 6, 2011 at 3:41 AM, Rob Clark <rob@ti.com> wrote:
> On Mon, Dec 5, 2011 at 3:23 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Mon, Dec 05, 2011 at 02:46:47PM -0600, Rob Clark wrote:
>>> On Mon, Dec 5, 2011 at 11:18 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>>> > In the patch 2, you have a section about migration that mentions that
>>> > it is possible to export a buffer that can be migrated after it
>>> > is already mapped into one user driver. How does that work when
>>> > the physical addresses are mapped into a consumer device already?
>>>
>>> I think you can do physical migration if you are attached, but
>>> probably not if you are mapped.
>>
>> Yeah, that's very much how I see this, and also why map/unmap (at least
>> for simple users like v4l) should only bracket actual usage. GPU memory
>> managers need to be able to move around buffers while no one is using
>> them.
>>
>> [snip]
>>
>>> >> +     /* allow allocator to take care of cache ops */
>>> >> +     void (*sync_sg_for_cpu) (struct dma_buf *, struct device *);
>>> >> +     void (*sync_sg_for_device)(struct dma_buf *, struct device *);
>>> >
>>> > I don't see how this works with multiple consumers: For the streaming
>>> > DMA mapping, there must be exactly one owner, either the device or
>>> > the CPU. Obviously, this rule needs to be extended when you get to
>>> > multiple devices and multiple device drivers, plus possibly user
>>> > mappings. Simply assigning the buffer to "the device" from one
>>> > driver does not block other drivers from touching the buffer, and
>>> > assigning it to "the cpu" does not stop other hardware that the
>>> > code calling sync_sg_for_cpu is not aware of.
>>> >
>>> > The only way to solve this that I can think of right now is to
>>> > mandate that the mappings are all coherent (i.e. noncachable
>>> > on noncoherent architectures like ARM). If you do that, you no
>>> > longer need the sync_sg_for_* calls.
>>>
>>> My original thinking was that you either need DMABUF_CPU_{PREP,FINI}
>>> ioctls and corresponding dmabuf ops, which userspace is required to
>>> call before / after CPU access.  Or just remove mmap() and do the
>>> mmap() via allocating device and use that device's equivalent
>>> DRM_XYZ_GEM_CPU_{PREP,FINI} or DRM_XYZ_GEM_SET_DOMAIN ioctls.  That
>>> would give you a way to (a) synchronize with gpu/asynchronous
>>> pipeline, (b) synchronize w/ multiple hw devices vs cpu accessing
>>> buffer (i.e. wait until all devices have dma_buf_unmap_attachment'd).  And
>>> that gives you a convenient place to do cache operations on
>>> noncoherent architecture.
>>>
>>> I sort of preferred having the DMABUF shim because that lets you pass
>>> a buffer around userspace without the receiving code knowing about a
>>> device specific API.  But the problem I eventually came around to: if
>>> your GL stack (or some other userspace component) is batching up
>>> commands before submission to kernel, the buffers you need to wait for
>>> completion might not even be submitted yet.  So from kernel
>>> perspective they are "ready" for cpu access.  Even though in fact they
>>> are not in a consistent state from rendering perspective.  I don't
>>> really know a sane way to deal with that.  Maybe the approach instead
>>> should be a userspace level API (in libkms/libdrm?) to provide
>>> abstraction for userspace access to buffers rather than dealing with
>>> this at the kernel level.
>>
>> Well, there's a reason GL has an explicit flush and extensions for sync
>> objects. It's to support such scenarios where the driver batches up gpu
>> commands before actually submitting them.
>
> Hmm.. what about other non-GL APIs..  maybe vaapi/vdpau or similar?
> (Or something that I haven't thought of.)
>
>> Also, recent gpus have all (or
>> shortly will grow) multiple execution pipelines, so it's also important
>> that you sync up with the right command stream. Syncing up with all of
>> them is generally frowned upon for obvious reasons ;-)
>
> Well, I guess I am happy enough with something that is at least
> functional.  Userspace access would (I think) mainly be weird edge case
> type stuff.  But...
>
<snip>
>
>> On the topic of a coherency model for dmabuf, I think we need to look at
>> dma_buf_attachment_map/unmap (and also the mmap variants cpu_start and
>> cpu_finish or whatever they might get called) as barriers:
>>
>> So after a dma_buf_map, all previously completed dma operations (i.e.
>> unmap already called) and any cpu writes (i.e. cpu_finish called) will be
>> coherent. A similar rule holds for cpu access through the userspace mmap:
>> only writes completed before the cpu_start will show up.
>>
>> Similarly, writes done by the device are only guaranteed to show up after
>> the _unmap. Ditto for cpu writes and cpu_finish.
>>
>> In short we always need two function calls to denote the start/end of the
>> "critical section".
>
> Yup, this was exactly my assumption.  But I guess it is better to spell it out.
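
To spell the barrier rule out a bit more, here is a minimal userspace
sketch. It is a toy model and not kernel code: all toy_* names are
hypothetical, the private "cache" array merely stands in for a
non-coherent CPU cache, and the device is assumed to read and write the
backing store directly. It only illustrates that CPU writes become
visible to the device at cpu_finish, and that device writes become
visible to the CPU at the next cpu_start.

```c
#include <assert.h>
#include <string.h>

#define TOY_BUF_SIZE 16

/* Hypothetical toy model: one backing store plus a private CPU-side copy. */
struct toy_buf {
	unsigned char mem[TOY_BUF_SIZE];   /* what a device sees */
	unsigned char cache[TOY_BUF_SIZE]; /* what the CPU sees */
};

/* Start of the CPU critical section: pick up all writes completed so far. */
static void toy_cpu_start(struct toy_buf *b)
{
	memcpy(b->cache, b->mem, TOY_BUF_SIZE);
}

/* End of the CPU critical section: only now are CPU writes visible to others. */
static void toy_cpu_finish(struct toy_buf *b)
{
	memcpy(b->mem, b->cache, TOY_BUF_SIZE);
}

/* CPU accesses only ever touch the private copy. */
static void toy_cpu_write(struct toy_buf *b, int i, unsigned char v)
{
	b->cache[i] = v;
}

static unsigned char toy_cpu_read(const struct toy_buf *b, int i)
{
	return b->cache[i];
}

/* Device accesses go straight to the backing store in this model. */
static void toy_device_write(struct toy_buf *b, int i, unsigned char v)
{
	b->mem[i] = v;
}

static unsigned char toy_device_read(const struct toy_buf *b, int i)
{
	return b->mem[i];
}
```

In this model, a toy_cpu_write() between toy_cpu_start() and
toy_cpu_finish() is not seen by toy_device_read() until the finish call,
which is the "two function calls per critical section" rule in miniature.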

Thanks for the excellent discussion - it has been a very good learning
experience for relatively-inexperienced me :)

So, for the purposes of the dma-buf framework, could I summarize as
follows and rework accordingly?
1. Remove the mmap() dma_buf_op [and the mmap fop], and introduce
cpu_start() / cpu_finish() ops to bracket cpu accesses to the buffer.
Also add DMABUF_CPU_START / DMABUF_CPU_FINI ioctls?
2. Remove the sync_sg_for_* ops for now (we can add them back later if
needed).
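
As a rough illustration of point 1, the sketch below shows one possible
shape for the bracketing ops. It is plain userspace C, not a patch: only
the names cpu_start and cpu_finish come from this thread, while the
signatures, the toy_* types, the counters and the -EBUSY behaviour are
assumptions made for the example. In particular, a real implementation
would probably wait for outstanding device mappings rather than fail.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_dma_buf;

/* Hypothetical op table; signatures are assumptions for illustration. */
struct toy_dma_buf_ops {
	int  (*cpu_start)(struct toy_dma_buf *buf);
	void (*cpu_finish)(struct toy_dma_buf *buf);
};

struct toy_dma_buf {
	const struct toy_dma_buf_ops *ops;
	int cpu_users; /* open CPU access sections */
	int dev_maps;  /* outstanding device mappings */
};

/* Open a CPU access section; refused while a device still has a mapping. */
static int toy_dma_buf_cpu_start(struct toy_dma_buf *buf)
{
	int ret;

	if (!buf || !buf->ops || !buf->ops->cpu_start)
		return -EINVAL;
	if (buf->dev_maps > 0)
		return -EBUSY; /* simplification: real code would likely wait */

	ret = buf->ops->cpu_start(buf);
	if (ret)
		return ret;

	buf->cpu_users++;
	return 0;
}

/* Close a CPU access section; an unbalanced call is rejected. */
static int toy_dma_buf_cpu_finish(struct toy_dma_buf *buf)
{
	if (!buf || !buf->ops || !buf->ops->cpu_finish)
		return -EINVAL;
	if (buf->cpu_users == 0)
		return -EINVAL;

	buf->ops->cpu_finish(buf);
	buf->cpu_users--;
	return 0;
}

/* Trivial exporter callbacks so the sketch can be exercised. */
static int toy_exporter_cpu_start(struct toy_dma_buf *buf)
{
	(void)buf; /* an exporter would invalidate caches here if needed */
	return 0;
}

static void toy_exporter_cpu_finish(struct toy_dma_buf *buf)
{
	(void)buf; /* an exporter would flush caches here if needed */
}

static const struct toy_dma_buf_ops toy_ops = {
	.cpu_start  = toy_exporter_cpu_start,
	.cpu_finish = toy_exporter_cpu_finish,
};
```

The exporter callbacks are where cache maintenance would naturally go,
which is why the separate sync ops in point 2 might not be needed.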
>
> BR,
> -R
>
>> Any concurrent operations are allowed to yield garbage, meaning any
>> combination of the old or either of the newly written contents (i.e.
>> non-overlapping writes might not actually all end up in the buffer,
>> but instead some old contents). Maybe we even need to loosen that to
>> the real "undefined behaviour", but atm I can't think of an example.
I guess that should be acceptable for our video / media use cases. What
about other potential users of dma-buf? [I am asking because Jesse told
me that some other subsystems are also interested in using dma-buf.]
>>
>> -Daniel
BR,
~Sumit.
>> --
>> Daniel Vetter
>> Mail: daniel@ffwll.ch
>> Mobile: +41 (0)79 365 57 48
<snip>


