linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Christoph Lameter <clameter@sgi.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC 0/7] Postphone reclaim laundry to write at high water marks
Date: Tue, 21 Aug 2007 15:43:21 -0700 (PDT)	[thread overview]
Message-ID: <Pine.LNX.4.64.0708211532560.5728@schroedinger.engr.sgi.com> (raw)
In-Reply-To: <1187734144.5463.35.camel@lappy>

On Wed, 22 Aug 2007, Peter Zijlstra wrote:

> Also, all I want is for slab to honour gfp flags like page allocation
> does, nothing more, nothing less.
> 
> (well, actually slightly less, since I'm only really interrested in the
> ALLOC_MIN|ALLOC_HIGH|ALLOC_HARDER -> ALLOC_NO_WATERMARKS transition and
> not all higher ones)

I am still not sure what that brings you. There may be multiple 
PF_MEMALLOC going on at the same time. On a large system with N cpus
there may be more than N of these that can steal objects from one another. 

A NUMA system will be shot anyways if memory gets that problematic to 
handle since the OS cannot effectively place memory if all zones are 
overallocated so that only a few pages are left.


> I want slab to fail when a similar page alloc would fail, no magic.

Yes I know. I do not want allocations to fail but that reclaim occurs in 
order to avoid failing any allocation. We need provisions that 
make sure that we never get into such a bad memory situation that would
cause severe slowless and usually end up in a livelock anyways.

> > > Anonymous pages are a there to stay, and we cannot tell people how to
> > > use them. So we need some free or freeable pages in order to avoid the
> > > vm deadlock that arises from all memory dirty.
> > 
> > No one is trying to abolish Anonymous pages. Free memory is readily 
> > available on demand if one calls reclaim. Your scheme introduces complex 
> > negotiations over a few scraps of memory when large amounts of memory 
> > would still be readily available if one would do the right thing and call 
> > into reclaim.
> 
> This is the thing I contend, there need not be large amounts of memory
> around. In my test prog the hot code path fits into a single page, the
> rest can be anonymous.

Thats a bit extreme.... We need to make sure that there are larger amounts 
of memory around. Pages are used for all shorts of short term uses (like 
slab shrinking etc etc.). If memory is that low that a single page matters
then we are in very bad shape anyways.

> > Sounds like you would like to change the way we handle memory in general 
> > in the VM? Reclaim (and thus finding freeable pages) is basic to Linux 
> > memory management.
> 
> Not quite, currently we have free pages in the reserves, if you want to
> replace some (or all) of that by freeable pages then that is a change.

We have free pages primarily to optimize the allocation. Meaning we do not 
have to run reclaim on every call. We want to use all of memory. The 
reserves are there for the case that we cannot call into reclaim. The easy 
solution if that is problematic is to enhance the reclaim to work in the
critical situations that we care about.


> > Sorry I just got into this a short time ago and I may need a few cycles 
> > to get this all straight. An approach that uses memory instead of 
> > ignoring available memory is certainly better.
> 
> Sure if and when possible. There will always be need to fall back to the
> reserves.

Maybe. But we can certainly avoid that as much as possible which would 
also increase our ability to use all available memory instead of leaving 
some of it unused./

> A bit off-topic, re that reclaim from atomic context:
> Currently we try to hold spinlocks only for short periods of time so
> that reclaim can be preempted, if you run all of reclaim from a
> non-preemptible context you get very large preemption latencies and if
> done from int context it'd also generate large int latencies.

If you call into the page allocator from an interrupt context then you are 
already in bad shape since we may check pcps lists and then potentially 
have to traverse the zonelists and check all sorts of things. If we 
would implement atomic reclaim then the reserves may become a latency 
optimizations. At least we will not fail anymore if the reserves are out.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2007-08-21 22:43 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-08-20 21:50 Christoph Lameter
2007-08-20 21:50 ` [RFC 1/7] release_lru_pages(): Generic release of pages to the LRU Christoph Lameter
2007-08-21 14:52   ` Mel Gorman
2007-08-21 20:51     ` Christoph Lameter
2007-08-20 21:50 ` [RFC 2/7] Move checks from pageout() to shrink_page_list Christoph Lameter
2007-08-20 21:50 ` [RFC 3/7] shrink_page_list: Support isolating dirty pages on laundry list Christoph Lameter
2007-08-21 15:04   ` Mel Gorman
2007-08-21 20:53     ` Christoph Lameter
2007-08-20 21:50 ` [RFC 4/7] Pass laundry through shrink_inactive_list() and shrink_zone() Christoph Lameter
2007-08-20 21:50 ` [RFC 5/7] Laundry handling for direct reclaim Christoph Lameter
2007-08-21 15:06   ` Mel Gorman
2007-08-21 20:55     ` Christoph Lameter
2007-08-21 15:19   ` Mel Gorman
2007-08-21 21:00     ` Christoph Lameter
2007-08-20 21:50 ` [RFC 6/7] kswapd: Do laundry after reclaim Christoph Lameter
2007-08-20 21:50 ` [RFC 7/7] Switch of PF_MEMALLOC during writeout Christoph Lameter
2007-08-20 23:08   ` Andi Kleen
2007-08-20 23:19     ` Christoph Lameter
2007-08-21  1:13       ` Andi Kleen
2007-08-21 10:36 ` [RFC 0/7] Postphone reclaim laundry to write at high water marks Peter Zijlstra
2007-08-21 20:48   ` Christoph Lameter
2007-08-21 21:13     ` Peter Zijlstra
2007-08-21 21:29       ` Christoph Lameter
2007-08-21 21:43         ` Rik van Riel
2007-08-21 22:32           ` Christoph Lameter
2007-08-23 12:05             ` Andrea Arcangeli
2007-08-23 20:23               ` Christoph Lameter
2007-08-21 22:09         ` Peter Zijlstra
2007-08-21 22:43           ` Christoph Lameter [this message]
2007-08-22  7:02             ` Peter Zijlstra
2007-08-22 19:04               ` Christoph Lameter
2007-08-22 20:03                 ` Peter Zijlstra
2007-08-22 20:16                   ` Christoph Lameter
2007-08-23  7:39                     ` Peter Zijlstra
2007-08-26  4:52                     ` Rik van Riel
2007-08-23 12:16                   ` Andrea Arcangeli
2007-08-22  7:45             ` Ingo Molnar
2007-08-22 19:19               ` Christoph Lameter
2007-08-23 12:08           ` Andrea Arcangeli
2007-08-23 12:59             ` Peter Zijlstra
2007-08-21 15:16 ` Rik van Riel
2007-08-21 20:59   ` Christoph Lameter
2007-08-21 21:14     ` Rik van Riel
2007-08-21 21:30       ` Christoph Lameter
2007-08-21 15:51 ` Dave McCracken
2007-08-21 21:03   ` Christoph Lameter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0708211532560.5728@schroedinger.engr.sgi.com \
    --to=clameter@sgi.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox