linux-mm.kvack.org archive mirror
From: Miklos Szeredi <miklos@szeredi.hu>
To: nickpiggin@yahoo.com.au
Cc: miklos@szeredi.hu, torvalds@linux-foundation.org,
	jens.axboe@oracle.com, akpm@linux-foundation.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [patch v3] splice: fix race with page invalidation
Date: Fri, 01 Aug 2008 20:28:47 +0200
Message-ID: <E1KOzMt-0003fa-Ah@pomaz-ex.szeredi.hu>
In-Reply-To: <200808011122.51792.nickpiggin@yahoo.com.au> (message from Nick Piggin on Fri, 1 Aug 2008 11:22:51 +1000)

On Fri, 1 Aug 2008, Nick Piggin wrote:
> Well, a) it probably makes sense in that case to provide another mode
> of operation which fills the data synchronously from the sender and
> copies it to the pipe (although the sender might just use read/write)
> And b) we could *also* look at clearing PG_uptodate as an optimisation
> iff that is found to help.

IMO it's not worth it to complicate the API just for the sake of
correctness in the so-very-rare read error case.  Users of the splice
API will simply ignore this requirement, because things will work fine
on ext3 and friends, and will break only rarely on NFS and FUSE.

So I think it's much better to make the API simple: invalid pages are
OK, and for I/O errors we return -EIO on the pipe.  It's not 100%
correct, but all in all it will result in less buggy programs.

Thanks,
Miklos
----

Subject: mm: dont clear PG_uptodate on truncate/invalidate

From: Miklos Szeredi <mszeredi@suse.cz>

Brian Wang reported that a FUSE filesystem exported through NFS could return
I/O errors on read.  This was traced to splice_direct_to_actor() returning a
short or zero count when racing with page invalidation.

However, this is not specific to FUSE or NFSD: other filesystems (notably NFS)
also call invalidate_inode_pages2() to purge stale data from the cache.

If this happens while such pages are sitting in a pipe buffer, then splice(2)
from the pipe can return zero, and read(2) from the pipe can return ENODATA.

The zero return is especially bad, since it implies end-of-file or
disconnected pipe/socket, and is documented as such for splice.  But returning
an error for read() is also nasty, when in fact there was no error (data
becoming stale is not an error).
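
To illustrate why the zero return is so harmful, here is a rough userspace
sketch of the kind of loop a splice(2) caller typically writes.  It is not
taken from any real program; drain_pipe() and the 64k chunk size are
hypothetical.  The point is only that such a loop has no way to tell a
spurious zero from a genuine end-of-file.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: move everything from pipe_fd to out_fd with splice(2).
 * Returns the number of bytes moved, or -1 with errno set.
 */
static long drain_pipe(int pipe_fd, int out_fd)
{
	long total = 0;

	assert(pipe_fd >= 0 && out_fd >= 0);

	for (;;) {
		ssize_t n = splice(pipe_fd, NULL, out_fd, NULL, 65536, 0);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, just retry */
			return -1;
		}
		if (n == 0) {
			/*
			 * Documented as end of input.  A zero caused by an
			 * invalidated page in the pipe buffer would look
			 * exactly the same here, so the rest of the data
			 * would be silently dropped.
			 */
			break;
		}
		total += n;
	}
	return total;
}
```

A caller would treat the returned total as the complete stream, which is
why a bogus zero turns into silent truncation rather than a visible error.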

The same problems can be triggered by "hole punching" with
madvise(MADV_REMOVE).
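
For reference, hole punching from userspace looks roughly like the sketch
below.  The helper name punch_hole_via_madvise() is made up for this
example, and it assumes fd refers to a file on a filesystem that supports
MADV_REMOVE (tmpfs, for instance).  It only shows the call that removes the
pages; it does not reproduce the race itself.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: punch a hole over the first len bytes of fd.
 * MADV_REMOVE needs a shared, writable file mapping, so map the range,
 * punch it, and unmap again.  Returns 0 on success, -1 with errno set.
 */
static int punch_hole_via_madvise(int fd, size_t len)
{
	void *map;
	int ret, saved_errno;

	assert(len > 0);

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		return -1;

	/* Frees the pages and the backing store for the range. */
	ret = madvise(map, len, MADV_REMOVE);

	saved_errno = errno;	/* munmap() must not clobber the result */
	munmap(map, len);
	errno = saved_errno;
	return ret;
}
```

Any page of that range which is sitting in a pipe buffer at the time of the
call is removed from the page cache in the same way as on truncation.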

Fix this by not clearing the PG_uptodate flag on truncation and
invalidation.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 mm/truncate.c |    2 --
 1 file changed, 2 deletions(-)

Index: linux-2.6/mm/truncate.c
===================================================================
--- linux-2.6.orig/mm/truncate.c	2008-07-28 17:45:02.000000000 +0200
+++ linux-2.6/mm/truncate.c	2008-08-01 20:18:51.000000000 +0200
@@ -104,7 +104,6 @@ truncate_complete_page(struct address_sp
 	cancel_dirty_page(page, PAGE_CACHE_SIZE);
 
 	remove_from_page_cache(page);
-	ClearPageUptodate(page);
 	ClearPageMappedToDisk(page);
 	page_cache_release(page);	/* pagecache ref */
 }
@@ -356,7 +355,6 @@ invalidate_complete_page2(struct address
 	BUG_ON(PagePrivate(page));
 	__remove_from_page_cache(page);
 	spin_unlock_irq(&mapping->tree_lock);
-	ClearPageUptodate(page);
 	page_cache_release(page);	/* pagecache ref */
 	return 1;
 failed:

