linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@osdl.org>
To: Nick Piggin <npiggin@suse.de>
Cc: Linux Memory Management <linux-mm@kvack.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: [patch 2/5] mm: fault vs invalidate/truncate race fix
Date: Tue, 10 Oct 2006 21:38:43 -0700	[thread overview]
Message-ID: <20061010213843.4478ddfc.akpm@osdl.org> (raw)
In-Reply-To: <20061010121332.19693.37204.sendpatchset@linux.site>

On Tue, 10 Oct 2006 16:21:49 +0200 (CEST)
Nick Piggin <npiggin@suse.de> wrote:

> Fix the race between invalidate_inode_pages and do_no_page.
> 
> Andrea Arcangeli identified a subtle race between invalidation of
> pages from pagecache with userspace mappings, and do_no_page.
> 
> The issue is that invalidation has to shoot down all mappings to the
> page, before it can be discarded from the pagecache. Between shooting
> down ptes to a particular page, and actually dropping the struct page
> from the pagecache, do_no_page from any process might fault on that
> page and establish a new mapping to the page just before it gets
> discarded from the pagecache.
> 
> The most common case where such invalidation is used is in file
> truncation. This case was catered for by doing a sort of open-coded
> seqlock between the file's i_size, and its truncate_count.
> 
> Truncation will decrease i_size, then increment truncate_count before
> unmapping userspace pages; do_no_page will read truncate_count, then
> find the page if it is within i_size, and then check truncate_count
> under the page table lock and back out and retry if it had
> subsequently been changed (ptl will serialise against unmapping, and
> ensure a potentially updated truncate_count is actually visible).
> 
> Complexity and documentation issues aside, the locking protocol fails
> in the case where we would like to invalidate pagecache inside i_size.
> do_no_page can come in anytime and filemap_nopage is not aware of the
> invalidation in progress (as it is when it is outside i_size). The
> end result is that dangling (->mapping == NULL) pages that appear to
> be from a particular file may be mapped into userspace with nonsense
> data. Valid mappings to the same place will see a different page.
> 
> Andrea implemented two working fixes, one using a real seqlock,
> another using a page->flags bit. He also proposed using the page lock
> in do_no_page, but that was initially considered too heavyweight.
> However, it is not a global or per-file lock, and the page cacheline
> is modified in do_no_page to increment _count and _mapcount anyway, so
> a further modification should not be a large performance hit.
> Scalability is not an issue.
> 
> This patch implements this latter approach. ->nopage implementations
> return with the page locked if it is possible for their underlying
> file to be invalidated (in that case, they must set a special vm_flags
> bit to indicate so). do_no_page only unlocks the page after setting
> up the mapping completely. invalidation is excluded because it holds
> the page lock during invalidation of each page (and ensures that the
> page is not mapped while holding the lock).
> 
> This allows significant simplifications in do_no_page.
> 

The (unchangelogged) changes to filemap_nopage() appear to have switched
the try-the-read-twice logic into try-it-forever logic.  I think it'll hang
if there's a bad sector?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2006-10-11  4:38 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-10-10 14:21 [rfc] 2.6.19-rc1-git5: consolidation of file backed fault handlers Nick Piggin
2006-10-10 14:21 ` [patch 1/5] mm: fault vs invalidate/truncate check Nick Piggin
2006-10-10 14:21 ` [patch 2/5] mm: fault vs invalidate/truncate race fix Nick Piggin
2006-10-11  4:38   ` Andrew Morton [this message]
2006-10-11  5:39     ` Nick Piggin
2006-10-11  6:00       ` Andrew Morton
2006-10-11  9:21         ` Nick Piggin
2006-10-11 16:21         ` Linus Torvalds
2006-10-11 16:57           ` SPAM: " Nick Piggin
2006-10-11 17:11             ` Linus Torvalds
2006-10-11 17:21               ` SPAM: " Nick Piggin
2006-10-11 17:38                 ` Linus Torvalds
2006-10-12  3:33                   ` Nick Piggin
2006-10-12 15:37                     ` Linus Torvalds
2006-10-12 15:40                       ` Nick Piggin
2006-10-11  5:13   ` Andrew Morton
2006-10-11  5:50     ` Nick Piggin
2006-10-11  6:10       ` Andrew Morton
2006-10-11  6:17       ` [patch 1/6] revert "generic_file_buffered_write(): handle zero length iovec segments" Andrew Morton, Andrew Morton
     [not found]       ` <20061010231150.fb9e30f5.akpm@osdl.org>
2006-10-11  6:17         ` [patch 2/6] revert "generic_file_buffered_write(): deadlock on vectored write" Andrew Morton, Andrew Morton
     [not found]         ` <20061010231243.bc8b834c.akpm@osdl.org>
2006-10-11  6:17           ` [patch 3/6] generic_file_buffered_write() cleanup Andrew Morton, Andrew Morton
     [not found]           ` <20061010231339.a79c1fae.akpm@osdl.org>
2006-10-11  6:18             ` [patch 4/6] generic_file_buffered_write(): fix page prefaulting Andrew Morton, Andrew Morton
     [not found]             ` <20061010231424.db88931f.akpm@osdl.org>
2006-10-11  6:18               ` [patch 5/6] generic_file_buffered_write(): max_len cleanup Andrew Morton, Andrew Morton
     [not found]               ` <20061010231514.c1da7355.akpm@osdl.org>
2006-10-11  6:18                 ` [patch 6/6] fix pagecache write deadlocks Andrew Morton, Andrew Morton
2006-10-21  1:53       ` [patch 2/5] mm: fault vs invalidate/truncate race fix Benjamin Herrenschmidt
2006-10-10 14:22 ` [patch 3/5] mm: fault handler to replace nopage and populate Nick Piggin
2006-10-10 14:22 ` [patch 4/5] mm: add vm_insert_pfn helpler Nick Piggin
2006-10-10 14:22 ` [patch 5/5] mm: merge nopfn with fault handler Nick Piggin
2006-10-10 14:26 ` [rfc] 2.6.19-rc1-git5: consolidation of file backed fault handlers Nick Piggin
2006-10-10 14:33 ` Christoph Hellwig
2006-10-10 15:01   ` Nick Piggin
2006-10-10 16:09     ` Arjan van de Ven
2006-10-11  0:46       ` SPAM: " Nick Piggin
2006-10-10 15:07   ` Arjan van de Ven
  -- strict thread matches above, loose matches on Subject: below --
2006-10-09 16:12 Nick Piggin
2006-10-09 16:12 ` [patch 2/5] mm: fault vs invalidate/truncate race fix Nick Piggin
2006-10-09 21:10   ` Mark Fasheh
2006-10-10  1:10     ` Nick Piggin
2006-10-11 18:34       ` Mark Fasheh
2006-10-12  3:28         ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20061010213843.4478ddfc.akpm@osdl.org \
    --to=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=npiggin@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox