From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-qg0-f46.google.com (mail-qg0-f46.google.com [209.85.192.46])
    by kanga.kvack.org (Postfix) with ESMTP id CEFC26B0005
    for ; Tue, 16 Feb 2016 23:03:54 -0500 (EST)
Received: by mail-qg0-f46.google.com with SMTP id b35so3620871qge.0
    for ; Tue, 16 Feb 2016 20:03:54 -0800 (PST)
Received: from mail-qg0-x229.google.com (mail-qg0-x229.google.com. [2607:f8b0:400d:c04::229])
    by mx.google.com with ESMTPS id v43si11070095qge.70.2016.02.16.20.03.54
    for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
    Tue, 16 Feb 2016 20:03:54 -0800 (PST)
Received: by mail-qg0-x229.google.com with SMTP id b35so3620762qge.0
    for ; Tue, 16 Feb 2016 20:03:54 -0800 (PST)
MIME-Version: 1.0
In-Reply-To: <20160216182212.GA21071@obsidianresearch.com>
References: <1455207177-11949-1-git-send-email-artemyko@mellanox.com>
    <20160211191838.GA23675@obsidianresearch.com>
    <56C08EC8.10207@mellanox.com>
    <20160216182212.GA21071@obsidianresearch.com>
From: davide rossetti
Date: Tue, 16 Feb 2016 20:03:34 -0800
Message-ID:
Subject: Re: [RFC 0/7] Peer-direct memory
Content-Type: multipart/alternative; boundary=001a11395a229f29ff052bef5677
Sender: owner-linux-mm@kvack.org
List-ID:
To: Jason Gunthorpe
Cc: Haggai Eran, Kovalyov Artemy, "dledford@redhat.com",
    "linux-rdma@vger.kernel.org", "linux-mm@kvack.org", "leon@leon.ro",
    Sagi Grimberg

--001a11395a229f29ff052bef5677
Content-Type: text/plain; charset=UTF-8

On Tue, Feb 16, 2016 at 10:22 AM, Jason Gunthorpe <jgunthorpe@obsidianresearch.com> wrote:

> On Sun, Feb 14, 2016 at 04:27:20PM +0200, Haggai Eran wrote:
> > [apologies: sending again because linux-mm address was wrong]
> >
> > On 11/02/2016 21:18, Jason Gunthorpe wrote:
> > > Resubmit those parts under the mm subsystem, or another more
> > > appropriate place.
> >
> > We want the feedback from linux-mm, and they are now Cced.
>
> Resubmit to mm means put this stuff someplace outside
> drivers/infiniband in the tree and don't try and inappropriately send
> memory management stuff through Doug's tree.
>

Jason,
I beg to differ.

1) I see mm as appropriate for real memory, i.e. something that user-space
apps can pass around. This is not entirely true of BAR memory, for instance
as long as CPU-initiated atomic ops are not supported on the BAR space of
PCIe devices. OTOH, CPU reads from a BAR are awful (BW being abysmal,
~10MB/s), while high-BW writes require the use of vector instructions (at
least on x86_64).

2) Instead, I see it as appropriate that two sophisticated devices, like an
IB NIC and a storage/accelerator device, can freely target each other for
I/O, i.e. exchange peer-to-peer PCIe transactions. And as long as the
existing sophisticated initiators are confined to the RDMA subsystem, that
is where this support belongs.

On a different note, this reminds me that the current patch set may be
missing a way to disable the use of platform PCIe atomics when the target
is the BAR of a peer device.

--
sincerely,
d.

email: davide DOT rossetti AT gmail DOT com
work: drossetti AT nvidia DOT com
facebook: http://www.facebook.com/dado.rossetti
twitter: @dado_rossetti
skype: d.rossetti

--001a11395a229f29ff052bef5677
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

--001a11395a229f29ff052bef5677--

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org