From: steve@chygwyn.com
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: npiggin@suse.de, akpm@linux-foundation.org, linux-mm@kvack.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [patch] fs: improved handling of page and buffer IO errors
Date: Tue, 21 Oct 2008 14:38:14 +0100
Message-ID: <20081021133814.GA26942@fogou.chygwyn.com>
In-Reply-To: <E1KsH4S-0005ya-6F@pomaz-ex.szeredi.hu>
Hi,
On Tue, Oct 21, 2008 at 03:14:48PM +0200, Miklos Szeredi wrote:
> On Tue, 21 Oct 2008, steve@chygwyn.com wrote:
> > > Is there a case where retrying in case of !PageUptodate() makes any
> > > sense?
> > >
> > Yes... cluster filesystems. Its very important in case a readpage
> > races with a lock demotion. Since the introduction of page_mkwrite
> > that hasn't worked quite right, but by retrying when the page is
> > not uptodate, that should fix the problem,
>
> I see.
>
> Could you please give some more details? In particular I don't know
> what's lock demotion in this context. And how page_mkwrite() come
> into the picture?
>
> Thanks,
> Miklos
page_mkwrite is really only in the picture because that was the
last time that code was changed. At that point GFS2 adopted
filemap_fault() as its ->fault handler rather than using its own
page fault routine.
So here are the basics of locking as far as GFS2 goes, although
other cluster filesystems are similar. Let's suppose that on
node A an inode is in cache and is being read/written, and that
on node B another process wants to perform some operation
(read/write/etc) on the same inode.
Node B requests a lock via the dlm, which causes node A to
receive a callback. The callback sets a flag in the glock[*]
state on node A corresponding to the inode in question. From
then on, all new requests for that particular lock on node A
block. At the same time, we unmap any mapped pages relating to
that inode and flush any dirty data back to disk (and if the
request was for an exclusive lock, we invalidate the pages as
well). That is what I was referring to above as lock demotion.
Once that's done, the dlm/glock is dropped (again, notification is
via the dlm) and if node A has outstanding requests queued up, it
re-requests the glock. This is a slightly simplified explanation,
but I hope it gives the general drift.
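
To make the sequence above a bit more concrete, here is a rough
user-space model of what happens on node A. It is only a sketch:
every name in it (model_glock, model_page, model_do_demote and so
on) is made up for illustration and is not the GFS2 or dlm API, and
the real glock state machine has many more states than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock modes; not the real dlm/glock mode names. */
enum model_mode { MODEL_UNLOCKED, MODEL_SHARED, MODEL_EXCLUSIVE };

/* Hypothetical stand-in for a cached page of the inode. */
struct model_page {
    bool mapped;    /* mapped into some process address space */
    bool dirty;     /* has data not yet written back to disk */
    bool uptodate;  /* contents valid with respect to disk */
};

/* Hypothetical stand-in for the glock state on node A. */
struct model_glock {
    enum model_mode held;
    bool demote_pending;   /* the flag set by the dlm callback */
    int queued_requests;   /* local requests blocked by the flag */
    struct model_page *pages;
    size_t nr_pages;
    int writebacks;        /* counts pages flushed, for the model only */
};

/* Node B asked for the lock: the callback only sets the flag. */
void model_demote_callback(struct model_glock *gl)
{
    gl->demote_pending = true;
}

/* A new local request blocks (here: is queued) while the flag is set. */
bool model_try_acquire(struct model_glock *gl)
{
    if (gl->demote_pending) {
        gl->queued_requests++;
        return false;
    }
    return true;
}

/*
 * Perform the demotion: unmap, flush dirty data, invalidate only if
 * the remote node wants an exclusive lock, then drop the lock.
 * Returns true if node A should re-request the glock afterwards.
 */
bool model_do_demote(struct model_glock *gl, enum model_mode remote_wants)
{
    for (size_t i = 0; i < gl->nr_pages; i++) {
        struct model_page *pg = &gl->pages[i];

        pg->mapped = false;
        if (pg->dirty) {
            gl->writebacks++;
            pg->dirty = false;
        }
        if (remote_wants == MODEL_EXCLUSIVE)
            pg->uptodate = false;
    }
    gl->held = MODEL_UNLOCKED;
    gl->demote_pending = false;
    return gl->queued_requests > 0;
}
```

The point to notice is that a page can go from uptodate to not
uptodate purely because of a request from another node, with no
I/O error involved.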
So, to return to the original subject: in order to allow all
this locking to occur with no lock ordering problems, we have
to define a suitable ordering of page locks vs. glocks, and the
ordering that we use is that glocks must come before page locks.
The full ordering of locks in GFS2 is in
Documentation/filesystems/gfs2-glocks.txt.
As a result, the VFS needs reads (and page_mkwrite) to
retry when !PageUptodate(), in case the returned page was
invalidated at some point while the page lock was dropped.
Obviously we hope that this doesn't happen too often, since it's
very inefficient (and we have a system to try to reduce the
frequency of such events), but it can and does happen at more
or less any time, so the VFS needs to take that into account.
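
As a rough illustration of the retry behaviour described above,
here is a small user-space model. It is a sketch, not the
filemap.c code: sim_page, sim_race, sim_readpage and
sim_read_with_retry are hypothetical names, and the max_retries
bound is my own assumption to keep the example finite.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page cache page. */
struct sim_page {
    bool uptodate;
    int reads_issued;   /* how many times the page was read, for the model */
};

/* Hypothetical knob: how many remote demotions race with our reads. */
struct sim_race {
    int invalidations_left;
};

/*
 * Model of a readpage call: the read itself succeeds, but a lock
 * demotion may invalidate the page again once the page lock has
 * been dropped and before the caller looks at the page.
 */
static void sim_readpage(struct sim_page *pg, struct sim_race *race)
{
    pg->reads_issued++;
    pg->uptodate = true;
    if (race->invalidations_left > 0) {
        race->invalidations_left--;
        pg->uptodate = false;   /* invalidated by the demotion */
    }
}

/*
 * Caller side: a page that is not uptodate after the read is treated
 * as "try again", not as an I/O error. Returns 0 once the page is
 * uptodate, or -1 after the (assumed) retry bound is exhausted.
 */
int sim_read_with_retry(struct sim_page *pg, struct sim_race *race,
                        int max_retries)
{
    for (int attempt = 0; attempt <= max_retries; attempt++) {
        if (pg->uptodate)
            return 0;
        sim_readpage(pg, race);
    }
    return pg->uptodate ? 0 : -1;
}
```

With two racing invalidations and a bound of three retries, the
third read is the one that sticks and the call returns 0.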
I hope that makes some kind of sense... let me know if it's
not clear,
Steve.
[*] The glock layer is a state machine which is associated with each
dlm lock and performs the required actions in response to dlm messages
and filesystem requests to keep the page cache coherent.