From: Nick Piggin <npiggin@suse.de>
To: Jamie Lokier <jamie@shareable.org>
Cc: jim owens <jowens@hp.com>,
linux-fsdevel@vger.kernel.org,
Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: [patch][rfc] mm: hold page lock over page_mkwrite
Date: Wed, 4 Mar 2009 10:23:43 +0100
Message-ID: <20090304092343.GB27043@wotan.suse.de>
In-Reply-To: <20090303172535.GA16993@shareable.org>

On Tue, Mar 03, 2009 at 05:25:36PM +0000, Jamie Lokier wrote:
> Nick Piggin wrote:
> > The block layer below the filesystem should be robust. Well
> > actually the core block layer is (except maybe for the new
> > bio integrity stuff that looks pretty nasty). Not sure about
> > md/dm, but they really should be safe (they use mempools etc).
>
> Are mempools fully safe, or just statistically safer?

They guarantee forward progress if used correctly, so yes, they are
fully safe.

> > > it so "we can always make forward progress". But it won't
> > > matter because once a real user drives the system off this
> > > cliff there is no difference between "hung" and "really slow
> > > progress". They are going to crash it and report a hang.
> >
> > I don't think that is the case. These are situations that
> > would be *really* rare and transient. It is not like thrashing
> > in that your working set size exceeds physical RAM, but just
> > a combination of conditions that causes an unusual spike in the
> > required memory to clean some dirty pages (eg. Dave's example
> > of several IOs requiring btree splits over several AGs). Could
> > cause a resource deadlock.
>
> Suppose the systems has two pages to be written. The first must
> _reserve_ 40 pages of scratch space just in case the operation will
> need them. If the second page write is initiated concurrently with
> the first, the second must reserve another 40 pages concurrently.
>
> If 10 page writes are concurrent, that's 400 pages of scratch space
> needed in reserve...

You only need to guarantee forward progress, so you would reserve
40 pages up front for the entire machine. (Some mempools hold more
memory than strictly needed, to improve performance, so you could
toy with that, but let's just describe the baseline.)

So allocations happen as normal, except that when an allocation fails,
the task whose allocation failed is given access to this reserve
memory. Any other task that needs the reserve will then block.

Now the reserve provides enough pages to guarantee forward progress,
so that one task is going to be able to proceed, and eventually its
pages will become freeable and can be returned to the reserve. Once
the writeout has finished, the reserve becomes available to other
tasks.
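
To make the scheme above concrete, here is a toy, single-threaded
model in Python. It is only a sketch and not kernel code: the class
name ReserveModel, its methods, and the page counts in the demo are
all made up for illustration, and "blocked" stands in for a task that
would really sleep until the reserve is returned.

```python
class ReserveModel:
    """Toy, single-threaded model of a machine-wide emergency reserve."""

    def __init__(self, normal_pages, reserve_pages):
        self.normal_free = normal_pages    # ordinary allocator pages
        self.reserve_free = reserve_pages  # sized for ONE writeout, e.g. 40
        self.reserve_owner = None          # task currently using the reserve
        self.held = {}                     # task -> [normal, reserve] pages held

    def alloc(self, task, pages):
        """Try to allocate; return 'normal', 'reserve', or 'blocked'."""
        held = self.held.setdefault(task, [0, 0])
        if self.normal_free >= pages:
            # Normal path: the reserve is not involved at all.
            self.normal_free -= pages
            held[0] += pages
            return "normal"
        # The normal allocation failed, so fall back to the reserve,
        # but only one task may use it at a time.
        if self.reserve_owner in (None, task) and self.reserve_free >= pages:
            self.reserve_owner = task
            self.reserve_free -= pages
            held[1] += pages
            return "reserve"
        return "blocked"  # a real task would sleep here until finish()

    def finish(self, task):
        """Writeout done: the task's pages become freeable again."""
        normal, reserve = self.held.pop(task, (0, 0))
        self.normal_free += normal
        self.reserve_free += reserve
        if self.reserve_owner == task:
            self.reserve_owner = None


model = ReserveModel(normal_pages=10, reserve_pages=40)
print(model.alloc("A", 10))  # normal memory is still available
print(model.alloc("A", 30))  # normal allocation fails, A gets the reserve
print(model.alloc("B", 5))   # B also fails, but the reserve is taken
model.finish("A")            # A's writeout completes, pages are returned
print(model.alloc("B", 5))   # B can now proceed
```

Note that B is only refused at a point where its allocation had
already failed, which is the property the scheme relies on.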

So this way you only have to reserve enough to write out one page,
and you only start blocking tasks when their memory allocations
would have failed *anyway*. And you guarantee forward progress.
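
As a rough worked comparison of the two sizing strategies discussed
here, the snippet below contrasts a reserve per concurrent writer with
one shared reserve. The 40-page figure is the hypothetical number from
the quoted example, and the function names are invented for this
sketch.

```python
PAGES_PER_WRITEOUT = 40  # hypothetical worst case for writing out one page


def per_writer_reserve(concurrent_writers):
    """Pages needed if every in-flight write reserves its own scratch space."""
    return PAGES_PER_WRITEOUT * concurrent_writers


def global_reserve(concurrent_writers):
    """Pages needed with one shared reserve that a single task uses at a time."""
    # Independent of concurrency: only one writeout must be able to finish.
    return PAGES_PER_WRITEOUT


for writers in (1, 2, 10):
    print(writers, per_writer_reserve(writers), global_reserve(writers))
```

The shared reserve stays at 40 pages however many writes are in
flight, at the cost of serialising tasks once normal allocations
start failing.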