From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Greg KH <greg@kroah.com>
Cc: ksummit-discuss@lists.linuxfoundation.org
Subject: Re: [Ksummit-discuss] [CORE TOPIC] Redesign Memory Management layer and more core subsystem
Date: Fri, 13 Jun 2014 10:55:46 -0700
Message-ID: <1402682146.2224.44.camel@dabdike.int.hansenpartnership.com>
In-Reply-To: <20140613173041.GA19513@kroah.com>
On Fri, 2014-06-13 at 10:30 -0700, Greg KH wrote:
> On Fri, Jun 13, 2014 at 11:56:08AM -0500, Christoph Lameter wrote:
> > On Wed, 11 Jun 2014, Greg KH wrote:
> >
> > > > Often the kernel subsystems are impeding performance. In high speed
> > > > computing we regularly bypass the kernel network subsystems, block I/O
> > > > etc. Direct hardware access means though that one is exposed to the ugly
> > > > particularities of how a certain device has to be handled. Can we have the
> > > > cake and eat it too by defining APIs that allow low level hardware access
> > > > but also provide hardware abstraction (maybe limited to certain types of
> > > > devices).
> > >
> > > What type of devices are you wanting here, block and networking or
> > > something else? We have the uio interface if you want to (and know how
> > > to) talk to your hardware directly from userspace, what else do you want
> > > to do here that this doesn't provide?
> >
> > Block and networking mainly. The userspace VFIO API exposes device
> > specific registers. We need something that is a decent abstraction.
> > IBverbs is something like that but it could be done much better.
>
> Heh, we've been down this road before :)
>
> In the end, userspace wants a socket-like interface to the networking
> "stack", right? So either you provide that with a custom networking
> library that talks directly to a specific hardware card (like 3
> different companies provide), or you just deal with the in-kernel
> network stack. What else is there that we can do here?
>
> And as for block device, "raw access", really? What is lacking with
> what we already provide in "raw mode", and a no-op block scheduler? How
> much more "lean" can we possibly go without you having to write a custom
> userspace uio driver for every block controller out there?
Just remember there are lessons from Raw devices too. Oracle originally
forced the raw mode on our block devices for this reason ... just get
your block layer and filesystems mostly out of our way was their cry.
Then they discovered that not having a FS wrapper led to the system not
being able to recognise the raw devices as being raw, which led to an
awful lot of really expensive data loss cockups.

The compromise today is using filesystems with O_DIRECT to the file data
containers.
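
As a rough illustration of that compromise, here is a minimal sketch of
O_DIRECT I/O against an ordinary file. It is not taken from any database
product: the helper name direct_roundtrip, the 4096-byte BLOCK size and the
buffered fallback are all assumptions made for the example. The filesystem
still owns naming and metadata, while the data path skips the page cache.

```python
import errno
import mmap
import os

# Assumed alignment for this sketch; real code should query the device's
# logical block size instead of hard-coding it.
BLOCK = 4096


def direct_roundtrip(path: str, payload: bytes) -> bytes:
    """Write payload to a file and read it back, using O_DIRECT if available."""
    if len(payload) > BLOCK:
        raise ValueError("payload must fit in one block for this sketch")

    flags = os.O_RDWR | os.O_CREAT | os.O_TRUNC
    direct = getattr(os, "O_DIRECT", 0)  # 0 on platforms without O_DIRECT
    try:
        fd = os.open(path, flags | direct, 0o600)
    except OSError as exc:
        if exc.errno != errno.EINVAL:
            raise
        # Some filesystems reject O_DIRECT at open time; fall back to
        # ordinary buffered I/O so the sketch still runs there.
        fd = os.open(path, flags, 0o600)

    # Anonymous mmap memory is page-aligned, which meets the buffer
    # alignment that O_DIRECT typically requires.
    buf = mmap.mmap(-1, BLOCK)
    try:
        buf[:len(payload)] = payload   # rest of the block stays zero-filled
        os.pwritev(fd, [buf], 0)       # one whole block at an aligned offset
        buf[:] = bytes(BLOCK)          # clear so the read below is meaningful
        os.preadv(fd, [buf], 0)
        return bytes(buf[:len(payload)])
    finally:
        buf.close()
        os.close(fd)
```

Calling direct_roundtrip("container.dat", b"hello") leaves a one-block file
behind that the rest of the system sees as a normal, named, owned file, which
is the property the raw devices lacked.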

The point here is that lots of people say "just get your operating
system out of my way", but most realise they actually didn't mean it when
presented with the reality.

The abstractions most people who say this want are a zero-delay data
path with someone else taking care of all of the metadata and setup
problems ... effectively an MPI-type interface. Is that what you're
looking for, Christoph?

James