linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Theodore Ts'o <tytso@mit.edu>
Subject: Re: [PATCH resend] ramdisk: fix zeroed ramdisk pages on memory pressure
Date: Mon, 15 Oct 2007 22:57:14 -0600	[thread overview]
Message-ID: <m1abqjirmd.fsf@ebiederm.dsl.xmission.com> (raw)
In-Reply-To: <200710161645.58686.nickpiggin@yahoo.com.au> (Nick Piggin's message of "Tue, 16 Oct 2007 16:45:58 +1000")

Nick Piggin <nickpiggin@yahoo.com.au> writes:

>>
>> make_page_uptodate() is most hideous part I have run into.
>> It has to know details about other layers to now what not
>> to stomp.  I think my incorrect simplification of this is what messed
>> things up, last round.
>
> Not really, it's just named funny. That's just a minor utility
> function that more or less does what it says it should do.
>
> The main problem is really that it's implementing a block device
> who's data comes from its own buffercache :P. I think.

Well to put it another way, mark_page_uptodate() is the only
place where we really need to know about the upper layers.
Given that you can kill ramdisks by coding it as:

static void make_page_uptodate(struct page *page)
{
	clear_highpage(page);
	flush_dcache_page(page);
	SetPageUptodate(page);
}

Something is seriously non-intuitive about that function if
you understand the usual rules for how to use the page cache.

The problem is that we support a case in the buffer cache
where pages are partially uptodate and only the buffer_heads
remember which parts are valid.  Assuming we are using them
correctly.

Having to walk through all of the buffer heads in make_page_uptodate
seems to me to be a nasty layering violation in rd.c

>> > I guess it's not nice
>> > for operating on the pagecache from its request_fn, but the
>> > alternative is to duplicate pages for backing store and buffer
>> > cache (actually that might not be a bad alternative really).
>>
>> Cool. Triple buffering :)  Although I guess that would only
>> apply to metadata these days.
>
> Double buffering. You no longer serve data out of your buffer
> cache.  All filesystem data was already double buffered anyway,
> so we'd be just losing out on one layer of savings for metadata.

Yep we are in agreement there.

> I think it's worthwhile, given that we'd have a "real" looking
> block device and minus these bugs.

For testing purposes I think I can agree with that.

>> Having a separate store would 
>> solve some of the problems, and probably remove the need
>> for carefully specifying the ramdisk block size.  We would
>> still need the magic restictions on page allocations though
>> and it we would use them more often as the initial write to the
>> ramdisk would not populate the pages we need.
>
> What magic restrictions on page allocations? Actually we have
> fewer restrictions on page allocations because we can use
> highmem! 

With the proposed rewrite yes.

> And the lowmem buffercache pages that we currently pin
> (unsuccessfully, in the case of this bug) are now completely
> reclaimable. And all your buffer heads are now reclaimable.

Hmm.  Good point.  So in net it should save memory even if
it consumes a little more in the worst case.


> If you mean GFP_NOIO... I don't see any problem. Block device
> drivers have to allocate memory with GFP_NOIO; this may have
> been considered magic or deep badness back when the code was
> written, but it's pretty simple and accepted now.

Well I always figured it was a bit rude allocating large amounts
of memory GFP_NOIO but whatever.

>> A very ugly bit seems to be the fact that we assume we can
>> dereference bh->b_data without any special magic which
>> means the ramdisk must live in low memory on 32bit machines.
>
> Yeah but that's not rd.c. You need to rewrite the buffer layer
> to fix that (see fsblock ;)).

I'm not certain which way we should go.  Take fsblock and run it
in parallel until everything is converted or use fsblock as a
prototype and once we have figured out which way we should go
convert struct buffer_head into struct fsblock one patch at a time.

I'm inclined to think we should evolve the buffer_head.

Eric





--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2007-10-16  4:57 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-10-15  8:28 Christian Borntraeger
2007-10-15 14:06 ` Nick Piggin
2007-10-15  9:05   ` Christian Borntraeger
2007-10-15 14:38     ` Nick Piggin
2007-10-15 18:38       ` Eric W. Biederman
2007-10-15 22:37         ` Eric W. Biederman
2007-10-15 22:40           ` [PATCH] rd: Preserve the dirty bit in init_page_buffers() Eric W. Biederman
2007-10-15 22:42             ` [PATCH] rd: Mark ramdisk buffers heads dirty Eric W. Biederman
2007-10-16  7:56               ` Christian Borntraeger
2007-10-16  9:22                 ` Eric W. Biederman
2007-10-17 16:14                 ` Christian Borntraeger
2007-10-17 17:57                   ` Eric W. Biederman
2007-10-17 19:14                     ` Chris Mason
2007-10-17 20:29                       ` Eric W. Biederman
2007-10-17 20:54                         ` Chris Mason
2007-10-17 21:30                           ` Eric W. Biederman
2007-10-17 22:58                             ` Chris Mason
2007-10-17 23:28                               ` Eric W. Biederman
2007-10-18  0:03                                 ` Chris Mason
2007-10-18  3:27                                   ` Eric W. Biederman
2007-10-18  3:59                                   ` [RFC][PATCH] block: Isolate the buffer cache in it's own mappings Eric W. Biederman
2007-10-18  4:32                                     ` Andrew Morton
2007-10-19 21:27                                       ` Eric W. Biederman
2007-10-21  4:24                                         ` Nick Piggin
2007-10-21  4:53                                           ` Eric W. Biederman
2007-10-21  5:36                                             ` Nick Piggin
2007-10-21  7:09                                               ` Eric W. Biederman
2007-10-22  0:15                                           ` David Chinner
2007-10-18  5:10                                     ` Nick Piggin
2007-10-19 21:35                                       ` Eric W. Biederman
2007-10-17 21:48                     ` [PATCH] rd: Mark ramdisk buffers heads dirty Christian Borntraeger
2007-10-17 22:22                       ` Eric W. Biederman
2007-10-18  9:26                         ` Christian Borntraeger
2007-10-19 22:46                           ` Eric W. Biederman
2007-10-19 22:51                           ` [PATCH] rd: Use a private inode for backing storage Eric W. Biederman
2007-10-21  4:28                             ` Nick Piggin
2007-10-21  5:10                               ` Eric W. Biederman
2007-10-21  5:24                                 ` Nick Piggin
2007-10-21  6:48                                   ` Eric W. Biederman
2007-10-21  7:28                                     ` Christian Borntraeger
2007-10-21  8:23                                       ` Eric W. Biederman
2007-10-21  9:56                                         ` Nick Piggin
2007-10-21 18:39                                           ` Eric W. Biederman
2007-10-22  1:56                                             ` Nick Piggin
2007-10-22 13:11                                             ` Chris Mason
2007-10-21  9:39                                     ` Nick Piggin
2007-10-21 17:56                                       ` Eric W. Biederman
2007-10-22  0:29                                         ` Nick Piggin
2007-10-16  8:19               ` [PATCH] rd: Mark ramdisk buffers heads dirty Nick Piggin
2007-10-16  8:48                 ` Christian Borntraeger
2007-10-16 19:06                 ` Eric W. Biederman
2007-10-16 22:06                   ` Nick Piggin
2007-10-16  8:12             ` [PATCH] rd: Preserve the dirty bit in init_page_buffers() Nick Piggin
2007-10-16  9:35               ` Eric W. Biederman
2007-10-15  9:16   ` [PATCH resend] ramdisk: fix zeroed ramdisk pages on memory pressure Andrew Morton
2007-10-15 15:23     ` Nick Piggin
2007-10-16  3:14       ` Eric W. Biederman
2007-10-16  6:45         ` Nick Piggin
2007-10-16  4:57           ` Eric W. Biederman [this message]
2007-10-16  8:08             ` Nick Piggin
2007-10-16  7:47               ` [patch][rfc] rewrite ramdisk Nick Piggin
2007-10-16  7:52                 ` Jan Engelhardt
2007-10-16  8:07                   ` Nick Piggin
2007-10-16  8:17                     ` Jan Engelhardt
2007-10-16  8:26                       ` Nick Piggin
2007-10-16  8:53                         ` Jan Engelhardt
2007-10-16  9:08                 ` Eric W. Biederman
2007-10-16 21:28                 ` Theodore Tso
2007-10-16 22:08                   ` Nick Piggin
2007-10-16 23:48                     ` Eric W. Biederman
2007-10-17  0:28                       ` Nick Piggin
2007-10-17  1:13                         ` Eric W. Biederman
2007-10-17  1:47                           ` Nick Piggin
2007-10-17 10:30                 ` Eric W. Biederman
2007-10-17 12:49                   ` Nick Piggin
2007-10-17 18:45                     ` Eric W. Biederman
2007-10-18  1:06                       ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=m1abqjirmd.fsf@ebiederm.dsl.xmission.com \
    --to=ebiederm@xmission.com \
    --cc=akpm@linux-foundation.org \
    --cc=borntraeger@de.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nickpiggin@yahoo.com.au \
    --cc=schwidefsky@de.ibm.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox