From: Andrew Morton <akpm@osdl.org>
To: chuck.lever@oracle.com
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>,
Steve Dickson <steved@redhat.com>,
linux-mm@kvack.org
Subject: Re: Checking page_count(page) in invalidate_complete_page
Date: Mon, 25 Sep 2006 14:10:36 -0700 [thread overview]
Message-ID: <20060925141036.73f1e2b3.akpm@osdl.org> (raw)
In-Reply-To: <4518333E.2060101@oracle.com>
(Added linux-mm)
On Mon, 25 Sep 2006 15:51:26 -0400
Chuck Lever <chuck.lever@oracle.com> wrote:
> Hi Andrew-
>
> Steve Dickson and I have independently discovered some cache
> invalidation problems in 2.6.18's NFS client. Using git bisect, I was
> able to track it back to this commit:
>
> > commit 016eb4a0ed06a3677d67a584da901f0e9a63c666
> > Author: Andrew Morton <akpm@osdl.org>
> > Date: Fri Sep 8 09:48:38 2006 -0700
> >
> > [PATCH] invalidate_complete_page() race fix
> >
> > If a CPU faults this page into pagetables after invalidate_mapping_pages()
> > checked page_mapped(), invalidate_complete_page() will still proceed to remove
> > the page from pagecache. This leaves the page-faulting process with a
> > detached page. If it was MAP_SHARED then file data loss will ensue.
> >
> > Fix that up by checking the page's refcount after taking tree_lock.
> >
> > Cc: Nick Piggin <nickpiggin@yahoo.com.au>
> > Cc: Hugh Dickins <hugh@veritas.com>
> > Cc: <stable@kernel.org>
> > Signed-off-by: Andrew Morton <akpm@osdl.org>
> > Signed-off-by: Linus Torvalds <torvalds@osdl.org>
>
> Instrumenting get_page and put_page has shown that the page reclaim
> logic is temporarily bumping the page count on otherwise active but idle
> pages, racing with the new page_count check in invalidate_complete_page
> and preventing pages from being invalidated.
>
> One problem for the NFS client is that invalidate_inode_pages2 is being
> used to invalidate the pages associated with a cached directory. If the
> directory pages can't be invalidated because of this race, the contents
> of the directory pages don't match the dentry cache, and all kinds of
> strange behavior results.
NFS is presently ignoring the return value from invalidate_inode_pages2(),
in two places. Could I suggest you fix that? Then we'd at least not be
seeing "strange behaviour" and things will be easier to diagnose next time.
> I haven't checked the data invalidation behavior for regular files, but
> the result will be file data corruption that comes and goes.
>
> Would it be acceptable to revert that page_count(page) != 2 check in
> invalidate_complete_page ?
Unfortunately not - that patch fixes cramfs failures and potential file
corruption.
The way to keep memory reclaim away from that page is to take
zone->lru_lock, but that's quite impractical for several reasons.
We could retry the invalidation a few times, but that stinks.
I think invalidate_inode_pages2() is sufficiently different from (ie:
stronger than) invalidate_inode_pages() to justify the addition of a new
invalidate_complete_page2(), which skips the page refcount check.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next parent reply other threads:[~2006-09-25 21:10 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <4518333E.2060101@oracle.com>
2006-09-25 21:10 ` Andrew Morton [this message]
2006-09-25 22:30 ` Chuck Lever
2006-09-25 22:53 ` Andrew Morton
2006-09-25 22:57 ` Steve Dickson
2006-09-25 23:14 ` Nick Piggin
2006-09-25 22:40 ` Chuck Lever
2006-09-25 23:02 ` Andrew Morton
2006-09-25 22:50 ` Steve Dickson
2006-09-25 22:51 ` Nick Piggin
2006-09-25 23:14 ` Chuck Lever
2006-09-25 23:21 ` Nick Piggin
2006-09-26 0:01 ` Chuck Lever
2006-09-26 0:13 ` Nick Piggin
2006-09-26 1:33 ` Chuck Lever
2006-09-26 1:48 ` Nick Piggin
2006-09-28 16:26 ` Chuck Lever
2006-09-28 16:36 ` Andrew Morton
2006-09-28 16:40 ` Andrew Morton
2006-09-28 16:42 ` Chuck Lever
2006-09-28 17:03 ` Andrew Morton
2006-09-28 17:09 ` Chuck Lever
2006-09-29 0:37 ` Nick Piggin
2006-09-29 20:34 ` Chuck Lever
2006-09-29 20:45 ` Peter Zijlstra
2006-09-29 21:02 ` Chuck Lever
2006-09-29 21:17 ` Peter Zijlstra
2006-09-29 21:44 ` Andrew Morton
2006-09-29 21:48 ` Chuck Lever
2006-09-29 22:29 ` Andrew Morton
2006-09-29 23:05 ` Chuck Lever
2006-10-01 4:21 ` Chuck Lever
2006-10-02 12:01 ` Steve Dickson
2006-10-02 13:25 ` Trond Myklebust
2006-10-02 16:57 ` Andrew Morton
2006-10-02 17:02 ` Steve Dickson
2006-10-02 18:20 ` Andrew Morton
2006-10-02 19:02 ` Steve Dickson
2006-10-03 2:14 ` Chuck Lever
2006-10-03 4:18 ` Trond Myklebust
2006-10-03 4:24 ` Andrew Morton
2006-10-03 18:50 ` Chuck Lever
2006-10-03 19:10 ` Trond Myklebust
2006-10-03 19:21 ` Chuck Lever
2006-10-03 21:37 ` Andrew Morton
2006-10-04 19:29 ` Chuck Lever
2006-10-04 19:43 ` Andrew Morton
2006-10-04 19:53 ` Steve Dickson
2006-09-28 16:41 ` Chuck Lever
2006-09-26 6:25 ` Nick Piggin
2006-09-26 13:12 ` Chuck Lever
2006-09-27 4:47 ` Nick Piggin
2006-09-27 8:25 ` Andrew Morton
2006-09-27 8:39 ` Nick Piggin
2006-09-27 16:03 ` Andrew Morton
2006-09-27 15:54 ` Chuck Lever
2006-09-25 22:56 ` Chuck Lever
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060925141036.73f1e2b3.akpm@osdl.org \
--to=akpm@osdl.org \
--cc=Trond.Myklebust@netapp.com \
--cc=chuck.lever@oracle.com \
--cc=linux-mm@kvack.org \
--cc=steved@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox