linux-mm.kvack.org archive mirror
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Nick Piggin <npiggin@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>, Pavel Machek <pavel@ucw.cz>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, dkegel@google.com,
	David Miller <davem@davemloft.net>
Subject: Re: [RFC 2/9] Use NOMEMALLOC reclaim to allow reclaim if PF_MEMALLOC is set
Date: Tue, 21 Aug 2007 16:07:15 +0200
Message-ID: <1187705235.6114.247.camel@twins>
In-Reply-To: <20070821003922.GD8414@wotan.suse.de>


On Tue, 2007-08-21 at 02:39 +0200, Nick Piggin wrote:
> On Mon, Aug 20, 2007 at 11:14:08PM +0200, Peter Zijlstra wrote:
> > On Mon, 2007-08-20 at 13:27 -0700, Christoph Lameter wrote:
> > > On Mon, 20 Aug 2007, Peter Zijlstra wrote:
> > > 
> > > > > Plus the same issue can happen today. Writes are usually not completed 
> > > > > during reclaim. If the writes are sufficiently deferred then you have the 
> > > > > same issue now.
> > > > 
> > > > Once we have initiated (disk) writeout we do not need more memory to
> > > > complete it, all we need to do is wait for the completion interrupt.
> > > 
> > > We cannot reclaim the page as long as the I/O is not complete. If you 
> > > have too many anonymous pages and the rest of memory is dirty then you can 
> > > get into OOM scenarios even without this patch.
> > 
> > As long as the reserve is large enough to completely initialize writeout
> > of a single page we can make progress. Once writeout is initialized the
> > completion interrupt is guaranteed to happen (assuming working
> > hardware).
> 
> Although interestingly, we are not guaranteed to have enough memory to
> completely initialise writeout of a single page.

Yes, that is due to the unbounded nature of direct reclaim, no?

I've been meaning to write some patches to address this problem in a way
that does not introduce the hard wall Linus objects to. If only I had
this extra day in the week :-/

And then there is the deadlock in add_to_swap() that I still have to
look into; I hope it can eventually be solved using reserve-based
allocation.

> The buffer layer doesn't require disk blocks to be allocated at page
> dirty-time. Allocating disk blocks can require complex filesystem operations
> and readin of buffer cache pages. The buffer_head structures themselves may
> not even be present and must be allocated :P
> 
> In _practice_, this isn't such a problem because we have dirty limits, and
> we're almost guaranteed to have some clean pages to be reclaimed. In this
> same way, networked filesystems are not a problem in practice. However
> network swap, because there is no dirty limits on swap, can actually see
> the deadlock problems.

The main problem with networked swap is not so much sending out the
pages (this has problems similar to those of the filesystems, but its
memory use is bounded).

The biggest issue is receiving the completion notification. The network
stack needs to fall back to a state where it neither blindly consumes
memory nor drops _all_ packets. An intermediate state is required, one
where we can receive and inspect incoming packets but commit to very few.
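
To make the intermediate state concrete, here is a minimal userspace
sketch of the keep/drop decision. It is an illustration only, not code
from any patch: enum rx_mode, struct pkt, the for_swap_socket flag and
rx_should_keep() are all hypothetical names, and the assumption that
"critical" means "belongs to a socket used for swap I/O" is mine.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical receive modes; illustrative names, not kernel API. */
enum rx_mode {
    RX_NORMAL,     /* enough memory: every packet is kept */
    RX_EMERGENCY,  /* memory is tight: inspect all, keep very few */
};

/* Hypothetical, minimal view of a packet that has already been received. */
struct pkt {
    bool for_swap_socket;  /* assumption: set when the packet belongs to a
                            * socket that carries swap I/O */
    size_t len;            /* payload length in bytes (unused by the policy) */
};

/*
 * Decide whether to commit memory to a packet after inspecting it.
 * In RX_EMERGENCY the packet is still received and looked at, so the
 * completion notification for swap writeout is not lost, but anything
 * that is not needed to make progress is dropped right away.
 */
static inline bool rx_should_keep(enum rx_mode mode, const struct pkt *p)
{
    if (mode == RX_NORMAL)
        return true;
    return p->for_swap_socket;
}
```

The point of the sketch is that neither extreme is used: the emergency
mode does not queue everything, and it does not drop without looking.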

In order to create such a network state and for it to be stable, a
certain amount of memory needs to be available, and an external trigger
is needed to enter and leave this state. Currently that trigger is
whether or not more memory is available than needed.
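
One way to picture such a trigger is a pair of thresholds on free
memory. The sketch below is only that, a sketch: struct rx_trigger,
its fields and rx_trigger_update() are hypothetical, and using two
separate thresholds (hysteresis) to keep the state from flapping is my
assumption, not something the current code is claimed to do.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical trigger state; illustrative names, not kernel API. */
struct rx_trigger {
    bool emergency;      /* true while in the restricted receive state */
    size_t enter_below;  /* enter the state when free pages drop below this */
    size_t leave_above;  /* leave it when free pages rise above this;
                          * assumed to be >= enter_below */
};

/*
 * Feed the current amount of free memory into the trigger and return
 * whether the restricted state is active afterwards. Between the two
 * thresholds the previous state is kept, which is what makes it stable.
 */
static inline bool rx_trigger_update(struct rx_trigger *t, size_t free_pages)
{
    if (!t->emergency && free_pages < t->enter_below)
        t->emergency = true;
    else if (t->emergency && free_pages > t->leave_above)
        t->emergency = false;
    return t->emergency;
}
```

With enter_below = 100 and leave_above = 200, a drop to 99 free pages
enters the state, a recovery to 150 stays in it, and only a recovery
past 200 leaves it again.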


