From: David Howells <dhowells@redhat.com>
To: Christoph Lameter <clameter@sgi.com>
Cc: David Howells <dhowells@redhat.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Hugh Dickins <hugh@veritas.com>, Andrew Morton <akpm@osdl.org>,
Christoph Lameter <christoph@lameter.com>,
Martin Bligh <mbligh@google.com>, Nick Piggin <npiggin@suse.de>,
Linus Torvalds <torvalds@osdl.org>
Subject: Re: [PATCH 1/3] mm: tracking shared dirty pages
Date: Tue, 30 May 2006 09:00:35 +0100
Message-ID: <12042.1148976035@warthog.cambridge.redhat.com>
In-Reply-To: <Pine.LNX.4.64.0605260825160.31609@schroedinger.engr.sgi.com>
Christoph Lameter <clameter@sgi.com> wrote:
> > page_mkwrite() is called just before the _PTE_ is dirtied. Take
> > do_wp_page() for example, set_page_dirty() is called after a lot of stuff,
> > including some stuff that marks the PTE dirty... by which time it's too
> > late as another thread sharing the page tables can come along and modify
> > the page before the first thread calls set_page_dirty().
>
> Since we are terminating the application with extreme prejudice on an
> error (SIGBUS) it does not matter if another process has written to the
> page in the meantime.
Erm... Yes, it does matter, at least for AFS or NFS using FS-Cache, and whether
or not we're generating a SIGBUS or just proceeding normally. There are two
cases I've come across:
Firstly, I use page_mkwrite() to make sure that the page is written to the
cache before we let anyone modify it, so that we know with reasonable
certainty what's in the cache.
What we currently have is:

	invoke page_mkwrite()
	  - wait for page to be written to the cache
	lock
	modify PTE
	unlock
	invoke set_page_dirty()

What your suggestion gives is:

	lock
	modify PTE
	unlock
	invoke set_page_dirty()
	  - wait for page to be written to the cache

But what can happen is:

	CPU 1                           CPU 2
	=======================         =======================
	write to page (faults)
	lock
	modify PTE
	unlock
	                                write to page (succeeds)
	invoke set_page_dirty()
	  - wait for page to be written to the cache
	                                write to page (succeeds)

That potentially lets data of uncertain state into the cache, which means we
can't trust what's in the cache any longer.
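
The difference between the two orderings can be sketched with a small
single-threaded toy model in C. This is only an illustration under stated
assumptions: struct toy_page and every helper name below are made up, none of
it is kernel code, and CPU 2's store is simply injected at the point where the
race window would open.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model only: every name here is hypothetical, not a kernel API. */
struct toy_page {
	char data[8];      /* current page contents */
	char cached[8];    /* snapshot that was written to the cache */
	bool pte_writable; /* stands in for the writable PTE */
};

static void toy_page_init(struct toy_page *p, const char *s)
{
	assert(strlen(s) < sizeof(p->data));
	memset(p, 0, sizeof(*p));
	strcpy(p->data, s);
}

/* CPU 2: its store only lands if the PTE already permits writes. */
static bool other_cpu_write(struct toy_page *p, const char *s)
{
	if (!p->pte_writable)
		return false; /* would take a fault instead */
	strncpy(p->data, s, sizeof(p->data) - 1);
	return true;
}

static void write_to_cache(struct toy_page *p)
{
	memcpy(p->cached, p->data, sizeof(p->cached));
}

/* Current ordering: the cache write completes before the PTE changes. */
static void fault_mkwrite_first(struct toy_page *p)
{
	write_to_cache(p);          /* page_mkwrite(): wait for cache write */
	other_cpu_write(p, "racy"); /* CPU 2 tries here: rejected */
	p->pte_writable = true;     /* lock; modify PTE; unlock */
}

/* Suggested ordering: the PTE changes before the cache write. */
static void fault_dirty_last(struct toy_page *p)
{
	p->pte_writable = true;     /* lock; modify PTE; unlock */
	other_cpu_write(p, "racy"); /* CPU 2 slips in: succeeds */
	write_to_cache(p);          /* set_page_dirty(): caches CPU 2's data */
}
```

After toy_page_init(&p, "clean"), calling fault_mkwrite_first(&p) leaves
p.cached holding "clean", whereas fault_dirty_last(&p) leaves it holding
CPU 2's "racy" data.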
Secondly, some filesystems want to use page_mkwrite() to prevent a write from
occurring if a write to a shared writable mapping would require an allocation
from a filesystem that's currently in an ENOSPC state. That means that we may
not change the PTE until we're sure that the allocation is guaranteed to
succeed; otherwise the kernel could be left with dirty pages attached to
inodes it'd like to dispose of but can't, because there's nowhere to write the
data.
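
A minimal sketch of that second case, again as a toy model in C rather than
real kernel code. struct toy_fs, struct toy_mapping and both functions are
hypothetical names, and the one-block-per-page reservation is an assumption
made purely for illustration. The point is only that the reservation is tried
first and the PTE is left untouched if it fails.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model only: all names are hypothetical, not kernel APIs. */
struct toy_fs {
	long free_blocks;     /* blocks still available for allocation */
	long reserved_blocks; /* blocks promised to writable pages */
};

struct toy_mapping {
	bool pte_writable; /* stands in for the writable PTE */
	bool page_dirty;   /* page may now hold data needing writeback */
};

/* Reserve backing store for one page, or refuse if the fs is full. */
static int toy_page_mkwrite(struct toy_fs *fs)
{
	if (fs->free_blocks <= 0)
		return -ENOSPC;
	fs->free_blocks--;
	fs->reserved_blocks++;
	return 0;
}

/* Write fault: only touch the PTE once the reservation has succeeded. */
static int toy_write_fault(struct toy_fs *fs, struct toy_mapping *m)
{
	int err = toy_page_mkwrite(fs);

	if (err)
		return err; /* caller would raise SIGBUS; PTE is unchanged */

	m->pte_writable = true; /* lock; modify PTE; unlock */
	m->page_dirty = true;   /* set_page_dirty() */
	return 0;
}
```

With a single free block, the first toy_write_fault() succeeds and dirties its
page; a second fault on another mapping returns -ENOSPC and leaves that
mapping clean and read-only, so no unwritable dirty page is created.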
David