linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Howells <dhowells@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: dhowells@redhat.com, Matthew Wilcox <willy@infradead.org>,
	Matt Whitlock <kernel@mattwhitlock.name>,
	netdev@vger.kernel.org, Dave Chinner <david@fromorbit.com>,
	Jens Axboe <axboe@kernel.dk>,
	linux-fsdevel@kvack.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/4] splice: Fix corruption in data spliced to pipe
Date: Fri, 30 Jun 2023 17:50:43 +0100	[thread overview]
Message-ID: <663489.1688143843@warthog.procyon.org.uk> (raw)
In-Reply-To: <CAHk-=wjixHw6n_R5TQWW1r0a+GgFAPGw21KMj6obkzr3qXXbYA@mail.gmail.com>

Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> Quite the reverse. I'd be willing to *simplify* splice() by just
> saying "it was all a mistake", and just turning it into wrappers
> around read/write. But those patches would have to be radical
> simplifications, not adding yet more crud on top of the pain that is
> splice().
> 
> Because it will hurt performance. And I'm ok with that as long as it
> comes with huge simplifications. What I'm *not* ok with is "I mis-used
> splice, now I want splice to act differently, so let's make it even
> more complicated".

If we want to go down the simplification route, then the patches I posted
might be a good start.

The idea I tried to work towards is that the pipe only ever contains private
pages in it that only the pipe has a ref on and that no one else can access
until they come out the other end again.  I got rid of the ->confirm() pipe
buf op and would like to kill off all of the others too.

I simplified splice() by:

 - Making sure any candidate pages are uptodate right up front.

 - Allowing automatic stealing of pages from the pagecache if no one else is
   using them.  This should avoid losing a chunk of the performance that
   splice is supposed to gain - but if you're serving pages repeatedly in a
   webserver with this, it's going to be a problem.

   Possibly this should be contingent on SPLICE_F_MOVE - though the manpage
   says "*from* the pipe" implying it's only usable on the output side.

 - Copying in every other circumstance.

I simplified vmsplice() by:

 - If SPLICE_F_GIFT is set, attempting to steal whole pages in the buffer up
   front if not in use by anyone else.

 - Copying in every other circumstance.

That said, there are still sources that I didn't touch yet that attempt to
insert pages into a pipe: relayfs (which does some accounting stuff based on
the final consumption of the pages it inserted), sockets (which don't allow
inserted pages to be stolen) and notifications (which don't want to allocate
at notification time - but I can deal with that).  And there's tee() (which
would need to copy the data).  And pipe-to-pipe splice (which could steal
whole pages, but would otherwise have to copy).


If you would prefer to go for utter simplification, we could make sendfile()
from a buffered file just call sendmsg() directly with MSG_SPLICE_PAGES set
and ignore the pipe entirely (I'm tempted to do this anyway) and then make
splice() to a pipe just do copy_splice_read() and vmsplice() to a pipe do
writev().

I wonder how much splice() is used compared to sendfile().


I would prefer to leave splice() and vmsplice() as they are now and adjust the
documentation.  As you say, they can be considered a type of zerocopy.

David



  parent reply	other threads:[~2023-06-30 16:50 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-29 15:54 David Howells
2023-06-29 15:54 ` [RFC PATCH 1/4] splice: Fix corruption of spliced data after splice() returns David Howells
2023-07-19 10:17   ` Miklos Szeredi
2023-07-19 17:59     ` Matt Whitlock
2023-07-19 19:35       ` Miklos Szeredi
2023-07-19 19:44         ` Matthew Wilcox
2023-07-19 19:56           ` Miklos Szeredi
2023-07-19 20:04             ` Matthew Wilcox
2023-07-19 20:16           ` Linus Torvalds
2023-07-19 21:02             ` Matt Whitlock
2023-07-19 23:20               ` Linus Torvalds
2023-07-19 23:41                 ` Matt Whitlock
2023-07-20  0:00                   ` Linus Torvalds
2023-07-19 23:48                 ` Linus Torvalds
2023-07-24  9:44           ` David Howells
2023-07-24 13:55             ` Miklos Szeredi
2023-07-24 16:15             ` David Howells
2023-06-29 15:54 ` [RFC PATCH 2/4] splice: Make vmsplice() steal or copy David Howells
2023-06-30 13:44   ` Simon Horman
2023-06-30 15:29   ` David Howells
2023-06-30 17:32     ` Simon Horman
2023-06-29 15:54 ` [RFC PATCH 3/4] splice: Remove some now-unused bits David Howells
2023-06-29 15:54 ` [RFC PATCH 4/4] splice: Record some statistics David Howells
2023-06-29 17:56 ` [RFC PATCH 0/4] splice: Fix corruption in data spliced to pipe Linus Torvalds
2023-06-29 18:05   ` Matt Whitlock
2023-06-29 18:19     ` Linus Torvalds
2023-06-29 18:34       ` Matthew Wilcox
2023-06-29 18:53         ` Linus Torvalds
2023-06-30 16:50         ` David Howells [this message]
2023-06-29 18:42       ` Linus Torvalds
2023-06-29 18:16 ` Matt Whitlock
2023-06-30  0:01 ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=663489.1688143843@warthog.procyon.org.uk \
    --to=dhowells@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=david@fromorbit.com \
    --cc=kernel@mattwhitlock.name \
    --cc=linux-fsdevel@kvack.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=netdev@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox