linux-mm.kvack.org archive mirror
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Jamie Lokier <jamie@shareable.org>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	jens.axboe@oracle.com, akpm@linux-foundation.org,
	nickpiggin@yahoo.com.au, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch v3] splice: fix race with page invalidation
Date: Thu, 31 Jul 2008 09:34:44 -0700 (PDT)
Message-ID: <alpine.LFD.1.10.0807310925360.3277@nehalem.linux-foundation.org>
In-Reply-To: <20080731061201.GA7156@shareable.org>


On Thu, 31 Jul 2008, Jamie Lokier wrote:
> 
> Having implemented an equivalent zero-copy thing in userspace, I can
> confidently say it's not fundamental at all.

Oh yes it is.

Doing it in user space is _trivial_, because you control everything, and 
there are no barriers.

> What is fundamental is that you either (a) treat sendfile as an async
> operation, and get a notification when it's finished with the data,
> just like any other async operation

Umm. And that's exactly what I *described*.

But it's trivial to do inside one program (either all in user space, or 
all in kernel space).

It's very difficult indeed to do across two totally different domains.

Have you _looked_ at the complexities of async IO in UNIX? They are 
horrible. The overhead to even just _track_ the notifiers basically undoes 
all relevant optimizations for doing zero-copy.

IOW, AIO is useful not because of zero-copy, but because it allows 
_overlapping_ IO. Anybody who confuses the two is seriously misguided.

>			, or (b) while sendfile claims those
> pages, they are marked COW.

.. and this one shows that you have no clue about performance of a memcpy.

Once you do that COW, you're actually MUCH BETTER OFF just copying.

Really.

Copying a page is much cheaper than doing COW on it. Doing a "write()" 
really isn't that expensive. People think that memory is slow, but memory 
isn't all that slow, and caches work really well. Yes, memory is slow 
compared to a few reference count increments, but memory is absolutely 
*not* slow when compared to the overhead of TLB invalidates across CPUs 
etc.

So don't do it. If you think you need it, you should not be using 
zero-copy in the first place.

In other words, let me repeat:

 - use splice() when you *understand* that it's just taking a refcount and 
   you don't care.

 - use read()/write() when you can't be bothered.
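
To make the two options above concrete, here is a minimal sketch in C that copies bytes between two file descriptors both ways. The helper names rw_copy() and splice_copy() are made up for illustration and are not from this thread. The splice() variant assumes Linux with glibc and routes the data through a temporary pipe, so it never lands in a user-space buffer. It does not show the refcount semantics or the page invalidation race under discussion.

```c
#define _GNU_SOURCE             /* needed for splice() in <fcntl.h> */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Plain copy: kernel -> user buffer -> kernel. Returns bytes copied or -1. */
static ssize_t rw_copy(int in_fd, int out_fd, size_t len)
{
    char buf[4096];
    size_t done = 0;

    while (done < len) {
        size_t want = len - done < sizeof(buf) ? len - done : sizeof(buf);
        ssize_t n = read(in_fd, buf, want);
        if (n < 0)
            return -1;
        if (n == 0)
            break;                      /* EOF on the input */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0)
                return -1;
            off += w;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* splice() copy: in_fd -> pipe -> out_fd. Returns bytes copied or -1. */
static ssize_t splice_copy(int in_fd, int out_fd, size_t len)
{
    int p[2];
    size_t done = 0;
    ssize_t result = -1;

    if (pipe(p) < 0)
        return -1;
    while (done < len) {
        /* Fill the pipe from the input (at most one pipe's worth). */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, len - done, SPLICE_F_MOVE);
        if (n < 0)
            goto out;
        if (n == 0)
            break;                      /* EOF on the input */
        /* Drain exactly what was queued into the output. */
        for (ssize_t left = n; left > 0; ) {
            ssize_t w = splice(p[0], NULL, out_fd, NULL, (size_t)left,
                               SPLICE_F_MOVE);
            if (w <= 0)
                goto out;
            left -= w;
        }
        done += (size_t)n;
    }
    result = (ssize_t)done;
out:
    close(p[0]);
    close(p[1]);
    return result;
}
```

Both helpers have the same signature, so a caller can start with rw_copy() and switch to splice_copy() only where the extra copy has been measured to matter.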

There's nothing wrong with read/write. The _normal_ situation should be 
that 99.9% of all IO is done using the regular interfaces. Splice() (and 
sendpage() before it) is a special case. You should be using splice if you 
have a DVR and you can do all the DMA from the tuner card into buffers 
that you can then split up and send off to show real-time at the same time 
as you copy them to disk.

THAT is when zero-copy is useful. If you think you need to play games with 
async notifiers, you're already off the deep end.
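
The DVR case above (show the stream live while also writing it to disk) can be roughly sketched with tee(2) plus splice(2). This is an assumption about how one might wire it up, not something spelled out in this message: fan_out() is a hypothetical helper, the tuner DMA is not modeled, and the source is assumed to already be a pipe holding the captured data.

```c
#define _GNU_SOURCE             /* needed for tee() and splice() */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Duplicate up to len bytes waiting in the pipe src_rd into the pipe
 * live_wr (tee() does not consume them), then move those same bytes
 * from src_rd to disk_fd. Returns the number of bytes handled, 0 if the
 * source pipe is at EOF, or -1 on error.
 */
static ssize_t fan_out(int src_rd, int live_wr, int disk_fd, size_t len)
{
    ssize_t n = tee(src_rd, live_wr, len, 0);
    if (n <= 0)
        return n;

    /* Consume exactly the bytes that were duplicated. */
    for (ssize_t left = n; left > 0; ) {
        ssize_t w = splice(src_rd, NULL, disk_fd, NULL, (size_t)left,
                           SPLICE_F_MOVE);
        if (w <= 0)
            return -1;
        left -= w;
    }
    return n;
}
```

Neither call takes the data through a user-space buffer, and nothing here waits for a completion notification: the caller simply does not touch the data again after handing it off.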

			Linus
