From: Nick Piggin <nickpiggin@yahoo.com.au>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Christian Borntraeger <borntraeger@de.ibm.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Martin Schwidefsky <schwidefsky@de.ibm.com>,
Theodore Ts'o <tytso@mit.edu>
Subject: Re: [PATCH resend] ramdisk: fix zeroed ramdisk pages on memory pressure
Date: Tue, 16 Oct 2007 18:08:06 +1000 [thread overview]
Message-ID: <200710161808.06405.nickpiggin@yahoo.com.au> (raw)
In-Reply-To: <m1abqjirmd.fsf@ebiederm.dsl.xmission.com>
On Tuesday 16 October 2007 14:57, Eric W. Biederman wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> writes:
> >> make_page_uptodate() is most hideous part I have run into.
> >> It has to know details about other layers to now what not
> >> to stomp. I think my incorrect simplification of this is what messed
> >> things up, last round.
> >
> > Not really, it's just named funny. That's just a minor utility
> > function that more or less does what it says it should do.
> >
> > The main problem is really that it's implementing a block device
> > who's data comes from its own buffercache :P. I think.
>
> Well to put it another way, mark_page_uptodate() is the only
> place where we really need to know about the upper layers.
> Given that you can kill ramdisks by coding it as:
>
> static void make_page_uptodate(struct page *page)
> {
> clear_highpage(page);
> flush_dcache_page(page);
> SetPageUptodate(page);
> }
>
> Something is seriously non-intuitive about that function if
> you understand the usual rules for how to use the page cache.
You're overwriting some buffers that were uptodate and dirty.
That would be expected to cause problems.
> The problem is that we support a case in the buffer cache
> where pages are partially uptodate and only the buffer_heads
> remember which parts are valid. Assuming we are using them
> correctly.
>
> Having to walk through all of the buffer heads in make_page_uptodate
> seems to me to be a nasty layering violation in rd.c
Sure, but it's not just about the buffers. It's the pagecache
in general. It is supposed to be invisible to the device driver
and sitting above it, and yet it is taking the buffercache and
using it to pull its data out of.
> > I think it's worthwhile, given that we'd have a "real" looking
> > block device and minus these bugs.
>
> For testing purposes I think I can agree with that.
What non-testing uses does it have?
> >> Having a separate store would
> >> solve some of the problems, and probably remove the need
> >> for carefully specifying the ramdisk block size. We would
> >> still need the magic restictions on page allocations though
> >> and it we would use them more often as the initial write to the
> >> ramdisk would not populate the pages we need.
> >
> > What magic restrictions on page allocations? Actually we have
> > fewer restrictions on page allocations because we can use
> > highmem!
>
> With the proposed rewrite yes.
>
> > And the lowmem buffercache pages that we currently pin
> > (unsuccessfully, in the case of this bug) are now completely
> > reclaimable. And all your buffer heads are now reclaimable.
>
> Hmm. Good point. So in net it should save memory even if
> it consumes a little more in the worst case.
Highmem systems would definitely like it. For others, yes, all
the duplicated pages should be able to get reclaimed if memory
gets tight, along with the buffer heads, so yeah footprint may
be a tad smaller.
> > If you mean GFP_NOIO... I don't see any problem. Block device
> > drivers have to allocate memory with GFP_NOIO; this may have
> > been considered magic or deep badness back when the code was
> > written, but it's pretty simple and accepted now.
>
> Well I always figured it was a bit rude allocating large amounts
> of memory GFP_NOIO but whatever.
You'd rather not, of course, but with dirty data limits now,
it doesn't matter much. (and I doubt anybody outside testing
is going to be hammering like crazy on rd).
Note that the buffercache based ramdisk driver is going to
also be allocating with GFP_NOFS if you're talking about a
filesystem writing to its metadata. In most systems, GFP_NOFS
isn't much different to GFP_NOIO.
We could introduce a mode which allocates pages up front
quite easily if it were a problem (which I doubt it ever would
be).
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2007-10-16 8:08 UTC|newest]
Thread overview: 77+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-10-15 8:28 Christian Borntraeger
2007-10-15 14:06 ` Nick Piggin
2007-10-15 9:05 ` Christian Borntraeger
2007-10-15 14:38 ` Nick Piggin
2007-10-15 18:38 ` Eric W. Biederman
2007-10-15 22:37 ` Eric W. Biederman
2007-10-15 22:40 ` [PATCH] rd: Preserve the dirty bit in init_page_buffers() Eric W. Biederman
2007-10-15 22:42 ` [PATCH] rd: Mark ramdisk buffers heads dirty Eric W. Biederman
2007-10-16 7:56 ` Christian Borntraeger
2007-10-16 9:22 ` Eric W. Biederman
2007-10-17 16:14 ` Christian Borntraeger
2007-10-17 17:57 ` Eric W. Biederman
2007-10-17 19:14 ` Chris Mason
2007-10-17 20:29 ` Eric W. Biederman
2007-10-17 20:54 ` Chris Mason
2007-10-17 21:30 ` Eric W. Biederman
2007-10-17 22:58 ` Chris Mason
2007-10-17 23:28 ` Eric W. Biederman
2007-10-18 0:03 ` Chris Mason
2007-10-18 3:27 ` Eric W. Biederman
2007-10-18 3:59 ` [RFC][PATCH] block: Isolate the buffer cache in it's own mappings Eric W. Biederman
2007-10-18 4:32 ` Andrew Morton
2007-10-19 21:27 ` Eric W. Biederman
2007-10-21 4:24 ` Nick Piggin
2007-10-21 4:53 ` Eric W. Biederman
2007-10-21 5:36 ` Nick Piggin
2007-10-21 7:09 ` Eric W. Biederman
2007-10-22 0:15 ` David Chinner
2007-10-18 5:10 ` Nick Piggin
2007-10-19 21:35 ` Eric W. Biederman
2007-10-17 21:48 ` [PATCH] rd: Mark ramdisk buffers heads dirty Christian Borntraeger
2007-10-17 22:22 ` Eric W. Biederman
2007-10-18 9:26 ` Christian Borntraeger
2007-10-19 22:46 ` Eric W. Biederman
2007-10-19 22:51 ` [PATCH] rd: Use a private inode for backing storage Eric W. Biederman
2007-10-21 4:28 ` Nick Piggin
2007-10-21 5:10 ` Eric W. Biederman
2007-10-21 5:24 ` Nick Piggin
2007-10-21 6:48 ` Eric W. Biederman
2007-10-21 7:28 ` Christian Borntraeger
2007-10-21 8:23 ` Eric W. Biederman
2007-10-21 9:56 ` Nick Piggin
2007-10-21 18:39 ` Eric W. Biederman
2007-10-22 1:56 ` Nick Piggin
2007-10-22 13:11 ` Chris Mason
2007-10-21 9:39 ` Nick Piggin
2007-10-21 17:56 ` Eric W. Biederman
2007-10-22 0:29 ` Nick Piggin
2007-10-16 8:19 ` [PATCH] rd: Mark ramdisk buffers heads dirty Nick Piggin
2007-10-16 8:48 ` Christian Borntraeger
2007-10-16 19:06 ` Eric W. Biederman
2007-10-16 22:06 ` Nick Piggin
2007-10-16 8:12 ` [PATCH] rd: Preserve the dirty bit in init_page_buffers() Nick Piggin
2007-10-16 9:35 ` Eric W. Biederman
2007-10-15 9:16 ` [PATCH resend] ramdisk: fix zeroed ramdisk pages on memory pressure Andrew Morton
2007-10-15 15:23 ` Nick Piggin
2007-10-16 3:14 ` Eric W. Biederman
2007-10-16 6:45 ` Nick Piggin
2007-10-16 4:57 ` Eric W. Biederman
2007-10-16 8:08 ` Nick Piggin [this message]
2007-10-16 7:47 ` [patch][rfc] rewrite ramdisk Nick Piggin
2007-10-16 7:52 ` Jan Engelhardt
2007-10-16 8:07 ` Nick Piggin
2007-10-16 8:17 ` Jan Engelhardt
2007-10-16 8:26 ` Nick Piggin
2007-10-16 8:53 ` Jan Engelhardt
2007-10-16 9:08 ` Eric W. Biederman
2007-10-16 21:28 ` Theodore Tso
2007-10-16 22:08 ` Nick Piggin
2007-10-16 23:48 ` Eric W. Biederman
2007-10-17 0:28 ` Nick Piggin
2007-10-17 1:13 ` Eric W. Biederman
2007-10-17 1:47 ` Nick Piggin
2007-10-17 10:30 ` Eric W. Biederman
2007-10-17 12:49 ` Nick Piggin
2007-10-17 18:45 ` Eric W. Biederman
2007-10-18 1:06 ` Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200710161808.06405.nickpiggin@yahoo.com.au \
--to=nickpiggin@yahoo.com.au \
--cc=akpm@linux-foundation.org \
--cc=borntraeger@de.ibm.com \
--cc=ebiederm@xmission.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=schwidefsky@de.ibm.com \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox