linux-mm.kvack.org archive mirror
From: Miklos Szeredi <miklos@szeredi.hu>
To: Matthew Wilcox <willy@infradead.org>
Cc: Matt Whitlock <kernel@mattwhitlock.name>,
	David Howells <dhowells@redhat.com>,
	 netdev@vger.kernel.org, Dave Chinner <david@fromorbit.com>,
	 Linus Torvalds <torvalds@linux-foundation.org>,
	Jens Axboe <axboe@kernel.dk>,
	linux-fsdevel@kvack.org,  linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,  Christoph Hellwig <hch@lst.de>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [RFC PATCH 1/4] splice: Fix corruption of spliced data after splice() returns
Date: Wed, 19 Jul 2023 21:56:44 +0200
Message-ID: <CAJfpegtYQXgAyejoYWRVkf+9y91O70jaTu+mm+3zhnGPJhKwcA@mail.gmail.com>
In-Reply-To: <ZLg9HbhOVnLk1ogA@casper.infradead.org>

On Wed, 19 Jul 2023 at 21:44, Matthew Wilcox <willy@infradead.org> wrote:
>
> On Wed, Jul 19, 2023 at 09:35:33PM +0200, Miklos Szeredi wrote:
> > On Wed, 19 Jul 2023 at 19:59, Matt Whitlock <kernel@mattwhitlock.name> wrote:
> > >
> > > On Wednesday, 19 July 2023 06:17:51 EDT, Miklos Szeredi wrote:
> > > > On Thu, 29 Jun 2023 at 17:56, David Howells <dhowells@redhat.com> wrote:
> > > >>
> > > >> Splicing data from, say, a file into a pipe currently leaves the source
> > > >> pages in the pipe after splice() returns - but this means that those pages
> > > >> can be subsequently modified by shared-writable mmap(), write(),
> > > >> fallocate(), etc. before they're consumed.
> > > >
> > > > What is this trying to fix?   The above behavior is well known, so
> > > > it's not likely to be a problem.
> > >
> > > Respectfully, it's not well-known, as it's not documented. If the splice(2)
> > > man page had mentioned that pages can be mutated after they're already
> > > ostensibly at rest in the output pipe buffer, then my nightly backups
> > > wouldn't have been incurring corruption silently for many months.
> >
> > splice(2):
> >
> >        Though we talk of copying, actual copies are generally avoided.
> >        The kernel does this by implementing a pipe buffer as a set of
> >        reference-counted pointers to pages of kernel memory.  The kernel
> >        creates "copies" of pages in a buffer by creating new pointers
> >        (for the output buffer) referring to the pages, and increasing
> >        the reference counts for the pages: only pointers are copied,
> >        not the pages of the buffer.
> >
> > While not explicitly stating that the contents of the pages can change
> > after being spliced, this can easily be inferred from the above
> > semantics.
>
> So what's the API that provides the semantics of _copying_?

What's your definition of copying?

Thanks,
Miklos
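
The splice(2) passage quoted above can be made concrete with a toy model of
reference-counted page pointers. This is only an illustrative sketch in
Python, not kernel code and not anything posted in the thread: the names
Page, PipeBuffer, splice_pages, and copy_pages are hypothetical, and real
page-cache behaviour depends on the filesystem and kernel version. It
contrasts "copying pointers" with taking a private snapshot of the data.

```python
class Page:
    """Toy stand-in for a page of kernel memory (hypothetical, not a kernel API)."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)  # mutable, like a page-cache page
        self.refcount = 1            # one reference held by the "file"


class PipeBuffer:
    """Toy pipe buffer: holds references to pages, never their contents."""

    def __init__(self):
        self.pages = []

    def read(self) -> bytes:
        # Contents are only looked at when the reader consumes the buffer.
        out = b"".join(bytes(p.data) for p in self.pages)
        for p in self.pages:
            p.refcount -= 1
        self.pages.clear()
        return out


def splice_pages(src_pages, pipe):
    """Model of the zero-copy path: only pointers are copied."""
    for p in src_pages:
        p.refcount += 1
        pipe.pages.append(p)


def copy_pages(src_pages, pipe):
    """Model of a real copy: the pipe gets private snapshots of the data."""
    for p in src_pages:
        pipe.pages.append(Page(bytes(p.data)))


def demo():
    file_pages = [Page(b"old data")]
    spliced, copied = PipeBuffer(), PipeBuffer()
    splice_pages(file_pages, spliced)
    copy_pages(file_pages, copied)
    # The "file" is overwritten after both calls have returned,
    # but before either pipe has been read.
    file_pages[0].data[:] = b"NEW DATA"
    return spliced.read(), copied.read()
```

In this model the reader of the spliced pipe sees the later write, while the
reader of the copied pipe sees the data as it was when the call returned.
That difference is the one the question about a "definition of copying" turns
on.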

