From: Miklos Szeredi <miklos@szeredi.hu>
To: nickpiggin@yahoo.com.au
Cc: miklos@szeredi.hu, jamie@shareable.org,
torvalds@linux-foundation.org, jens.axboe@oracle.com,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, akpm@linux-foundation.org, hugh@veritas.com
Subject: Re: [patch 1/2] mm: dont clear PG_uptodate in invalidate_complete_page2()
Date: Mon, 07 Jul 2008 14:03:31 +0200 [thread overview]
Message-ID: <E1KFpRL-0001pA-Aq@pomaz-ex.szeredi.hu> (raw)
In-Reply-To: <200807072101.58963.nickpiggin@yahoo.com.au> (message from Nick Piggin on Mon, 7 Jul 2008 21:01:58 +1000)
On Mon, 7 Jul 2008, Nick Piggin wrote:
> And it isn't just a fuse problem is it? Other places can invalidate
> and truncate pages which might be spliced into a pipe, can't they?
Yes. There are several different problems here actually, and it's not
completely clear how splice should handle them:
Case 1: page is completely truncated while in the pipe
Currently splice() will detect this and return a short read count on
the splice-out. Which sounds sane, and consistent with the fact
that a zero return value can happen on EOF.
Case 2: page is partially truncated while in the pipe
Splice doesn't detect this, and returns the contents of the whole
page on splice-out, which will contain the zeroed-out part as well.
This is not so nice, but other than some elaborate COW schemes, I
don't see how this could be fixed.
Case 3: page is invalidated while in the pipe
This can happen on pages in the middle of the file, and splice-out
can return a zero count. This is *BAD*, callers of splice really
should be able to assume, that a zero return means EOF.
Page invalidation (the invalidate_inode_pages2 kind) is done by only a
few filesystems (FUSE, NFS, AFS, 9P), and by O_DIRECT hackery. So
case 3 only affects these, and only fuse can be re-exported by nfsd
(and that's only in -mm yet), hence this is very unlikely to be hit
for any of the others.
But it's bad despite being rare, because once it hits it can cause
data corruption (like with fuse/nfsd) and could be very hard to debug.
OK, case 2 could also cause corruption if caller is not expecting it.
Splice is a cool concept, but it isn't easy to get the implementation
right...
Miklos
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2008-07-07 12:03 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-06-25 12:40 [patch 0/2] splice: fix nfs export of fuse filesystems Miklos Szeredi
2008-06-25 12:40 ` [patch 1/2] mm: dont clear PG_uptodate in invalidate_complete_page2() Miklos Szeredi, Miklos Szeredi
2008-06-25 13:11 ` Evgeniy Polyakov
2008-06-25 13:32 ` Miklos Szeredi
2008-06-25 14:16 ` Evgeniy Polyakov
2008-06-25 14:41 ` Miklos Szeredi
2008-06-25 15:30 ` Evgeniy Polyakov
2008-06-25 15:59 ` Miklos Szeredi
2008-06-25 16:18 ` Evgeniy Polyakov
2008-06-25 15:47 ` Evgeniy Polyakov
2008-06-25 16:02 ` Miklos Szeredi
2008-06-25 16:19 ` Evgeniy Polyakov
2008-06-25 15:11 ` Linus Torvalds
2008-06-25 15:29 ` Miklos Szeredi
2008-06-25 16:30 ` Linus Torvalds
2008-06-25 16:42 ` Miklos Szeredi
2008-06-25 17:38 ` Jamie Lokier
2008-06-25 18:35 ` Miklos Szeredi
2008-07-07 6:38 ` Nick Piggin
2008-07-07 9:21 ` Miklos Szeredi
2008-07-07 10:12 ` Miklos Szeredi
2008-07-07 11:01 ` Nick Piggin
2008-07-07 12:03 ` Miklos Szeredi [this message]
2008-07-07 12:17 ` Nick Piggin
2008-07-07 12:52 ` Miklos Szeredi
2008-07-07 14:28 ` Nick Piggin
2008-07-07 15:08 ` Miklos Szeredi
2008-07-08 2:22 ` Nick Piggin
2008-07-07 10:43 ` Nick Piggin
2008-06-25 12:40 ` [patch 2/2] splice: fix generic_file_splice_read() race with page invalidation Miklos Szeredi, Miklos Szeredi
2008-06-25 13:00 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=E1KFpRL-0001pA-Aq@pomaz-ex.szeredi.hu \
--to=miklos@szeredi.hu \
--cc=akpm@linux-foundation.org \
--cc=hugh@veritas.com \
--cc=jamie@shareable.org \
--cc=jens.axboe@oracle.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nickpiggin@yahoo.com.au \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox