From: Nick Piggin <nickpiggin@yahoo.com.au>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Chris Mason <chris.mason@oracle.com>,
Christian Borntraeger <borntraeger@de.ibm.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Martin Schwidefsky <schwidefsky@de.ibm.com>,
Theodore Ts'o <tytso@mit.edu>,
stable@kernel.org
Subject: Re: [RFC][PATCH] block: Isolate the buffer cache in it's own mappings.
Date: Sun, 21 Oct 2007 15:36:24 +1000 [thread overview]
Message-ID: <200710211536.24722.nickpiggin@yahoo.com.au> (raw)
In-Reply-To: <m16411dq6b.fsf@ebiederm.dsl.xmission.com>
On Sunday 21 October 2007 14:53, Eric W. Biederman wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> writes:
> > On Saturday 20 October 2007 07:27, Eric W. Biederman wrote:
> >> Andrew Morton <akpm@linux-foundation.org> writes:
> >> > I don't think we little angels want to tread here. There are so many
> >> > weirdo things out there which will break if we bust the coherence
> >> > between the fs and /dev/hda1.
> >>
> >> We broke coherence between the fs and /dev/hda1 when we introduced
> >> the page cache years ago,
> >
> > Not for metadata. And I wouldn't expect many filesystem analysis
> > tools to care about data.
>
> Well tools like dump certainly weren't happy when we made the change.
Doesn't that give you any suspicion that other tools mightn't
be happy if we make this change, then?
> >> and weird hacky cases like
> >> unmap_underlying_metadata don't change that.
> >
> > unmap_underlying_metadata isn't about raw block device access at
> > all, though (if you write to the filesystem via the blockdevice
> > when it isn't expecting it, it's going to blow up regardless).
>
> Well my goal with separating things is so that we could decouple two
> pieces of code that have different usage scenarios, and where
> supporting both scenarios simultaneously appears to me to needlessly
> complicate the code.
>
> Added to that we could then tune the two pieces of code for their
> different users.
I don't see too much complication from it. If we can actually
simplify things or make useful tuning, maybe it will be worth
doing.
> >> Currently only
> >> metadata is more or less in sync with the contents of /dev/hda1.
> >
> > It either is or it isn't, right? And it is, isn't it? (at least
> > for the common filesystems).
>
> ext2 doesn't store directories in the buffer cache.
Oh that's what you mean. OK, agreed there. But for the filesystems
and types of metadata that can now expect to have coherency, doing
this will break that expectation.
Again, I have no opinions either way on whether we should do that
in the long run. But doing it as a kneejerk response to braindead
rd.c code is wrong because of what *might* go wrong and we don't
know about.
> Journaling filesystems and filesystems that do ordered writes
> game the buffer cache. Putting in data that should not yet
> be written to disk. That gaming is where reiserfs goes BUG
> and where JBD moves the dirty bit to a different dirty bit.
Filesystems really want better control of writeback, I think.
This isn't really a consequence of the unified blockdev pagecache
/ metadata buffer cache, it is just that most of the important
things they do are with metadata.
If they have their own metadata inode, then they'll need to game
the cache for it, or the writeback code for that inode somehow
too.
> So as far as I can tell what is in the buffer cache is not really
> in sync with what should be on disk at any given movement except
> when everything is clean.
Naturally. It is a writeback cache.
> My suspicion is that actually reading from disk is likely to
> give a more coherent view of things. Because there at least
> we have the writes as they are expected to be seen by fsck
> to recover the data, and a snapshot there should at least
> be recoverable. Whereas a snapshot provides not such guarantees.
ext3 fsck I don't think is supposed to be run under a read/write
filesystem, so it's going to explode if you do that regardless.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2007-10-21 5:36 UTC|newest]
Thread overview: 77+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-10-15 8:28 [PATCH resend] ramdisk: fix zeroed ramdisk pages on memory pressure Christian Borntraeger
2007-10-15 14:06 ` Nick Piggin
2007-10-15 9:05 ` Christian Borntraeger
2007-10-15 14:38 ` Nick Piggin
2007-10-15 18:38 ` Eric W. Biederman
2007-10-15 22:37 ` Eric W. Biederman
2007-10-15 22:40 ` [PATCH] rd: Preserve the dirty bit in init_page_buffers() Eric W. Biederman
2007-10-15 22:42 ` [PATCH] rd: Mark ramdisk buffers heads dirty Eric W. Biederman
2007-10-16 7:56 ` Christian Borntraeger
2007-10-16 9:22 ` Eric W. Biederman
2007-10-17 16:14 ` Christian Borntraeger
2007-10-17 17:57 ` Eric W. Biederman
2007-10-17 19:14 ` Chris Mason
2007-10-17 20:29 ` Eric W. Biederman
2007-10-17 20:54 ` Chris Mason
2007-10-17 21:30 ` Eric W. Biederman
2007-10-17 22:58 ` Chris Mason
2007-10-17 23:28 ` Eric W. Biederman
2007-10-18 0:03 ` Chris Mason
2007-10-18 3:27 ` Eric W. Biederman
2007-10-18 3:59 ` [RFC][PATCH] block: Isolate the buffer cache in it's own mappings Eric W. Biederman
2007-10-18 4:32 ` Andrew Morton
2007-10-19 21:27 ` Eric W. Biederman
2007-10-21 4:24 ` Nick Piggin
2007-10-21 4:53 ` Eric W. Biederman
2007-10-21 5:36 ` Nick Piggin [this message]
2007-10-21 7:09 ` Eric W. Biederman
2007-10-22 0:15 ` David Chinner
2007-10-18 5:10 ` Nick Piggin
2007-10-19 21:35 ` Eric W. Biederman
2007-10-17 21:48 ` [PATCH] rd: Mark ramdisk buffers heads dirty Christian Borntraeger
2007-10-17 22:22 ` Eric W. Biederman
2007-10-18 9:26 ` Christian Borntraeger
2007-10-19 22:46 ` Eric W. Biederman
2007-10-19 22:51 ` [PATCH] rd: Use a private inode for backing storage Eric W. Biederman
2007-10-21 4:28 ` Nick Piggin
2007-10-21 5:10 ` Eric W. Biederman
2007-10-21 5:24 ` Nick Piggin
2007-10-21 6:48 ` Eric W. Biederman
2007-10-21 7:28 ` Christian Borntraeger
2007-10-21 8:23 ` Eric W. Biederman
2007-10-21 9:56 ` Nick Piggin
2007-10-21 18:39 ` Eric W. Biederman
2007-10-22 1:56 ` Nick Piggin
2007-10-22 13:11 ` Chris Mason
2007-10-21 9:39 ` Nick Piggin
2007-10-21 17:56 ` Eric W. Biederman
2007-10-22 0:29 ` Nick Piggin
2007-10-16 8:19 ` [PATCH] rd: Mark ramdisk buffers heads dirty Nick Piggin
2007-10-16 8:48 ` Christian Borntraeger
2007-10-16 19:06 ` Eric W. Biederman
2007-10-16 22:06 ` Nick Piggin
2007-10-16 8:12 ` [PATCH] rd: Preserve the dirty bit in init_page_buffers() Nick Piggin
2007-10-16 9:35 ` Eric W. Biederman
2007-10-15 9:16 ` [PATCH resend] ramdisk: fix zeroed ramdisk pages on memory pressure Andrew Morton
2007-10-15 15:23 ` Nick Piggin
2007-10-16 3:14 ` Eric W. Biederman
2007-10-16 6:45 ` Nick Piggin
2007-10-16 4:57 ` Eric W. Biederman
2007-10-16 8:08 ` Nick Piggin
2007-10-16 7:47 ` [patch][rfc] rewrite ramdisk Nick Piggin
2007-10-16 7:52 ` Jan Engelhardt
2007-10-16 8:07 ` Nick Piggin
2007-10-16 8:17 ` Jan Engelhardt
2007-10-16 8:26 ` Nick Piggin
2007-10-16 8:53 ` Jan Engelhardt
2007-10-16 9:08 ` Eric W. Biederman
2007-10-16 21:28 ` Theodore Tso
2007-10-16 22:08 ` Nick Piggin
2007-10-16 23:48 ` Eric W. Biederman
2007-10-17 0:28 ` Nick Piggin
2007-10-17 1:13 ` Eric W. Biederman
2007-10-17 1:47 ` Nick Piggin
2007-10-17 10:30 ` Eric W. Biederman
2007-10-17 12:49 ` Nick Piggin
2007-10-17 18:45 ` Eric W. Biederman
2007-10-18 1:06 ` Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200710211536.24722.nickpiggin@yahoo.com.au \
--to=nickpiggin@yahoo.com.au \
--cc=akpm@linux-foundation.org \
--cc=borntraeger@de.ibm.com \
--cc=chris.mason@oracle.com \
--cc=ebiederm@xmission.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=schwidefsky@de.ibm.com \
--cc=stable@kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox