Re: la la la la ... swappiness

From: Andrew Morton <akpm@osdl.org>
To: Christoph Lameter <clameter@sgi.com>
Cc: Linus Torvalds <torvalds@osdl.org>,
	Aucoin <aucoin@houston.rr.com>,
	'Nick Piggin' <nickpiggin@yahoo.com.au>,
	'Tim Schmielau' <tim@physik3.uni-rostock.de>,
	Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: la la la la ... swappiness
Date: Tue, 5 Dec 2006 12:48:59 -0800
Message-ID: <20061205124859.333d980d.akpm@osdl.org>
In-Reply-To: <Pine.LNX.4.64.0612051207240.18863@schroedinger.engr.sgi.com>

On Tue, 5 Dec 2006 12:15:46 -0800 (PST)
Christoph Lameter <clameter@sgi.com> wrote:

> On Tue, 5 Dec 2006, Andrew Morton wrote:
> 
> > But otoh, it's a very common scenario, and nobody has observed it before. 
> 
> This is the same scenario as mlocked memory.

Not quite - mlocked pages are on the page LRU and hence contribute to the
arithmetic in there.   The hugetlb pages are simply gone.
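The distinction can be pictured with a toy model. This is only a sketch, not kernel code: the ToyZone class, its field names, and the page counts are all hypothetical, and it assumes every page that is not a hugetlb page sits on the LRU.

```python
from dataclasses import dataclass


@dataclass
class ToyZone:
    """Hypothetical stand-in for a memory zone; not a kernel structure."""
    total_pages: int
    mlocked_pages: int = 0   # pinned, but still linked on the LRU
    hugetlb_pages: int = 0   # removed from the LRU altogether

    @property
    def lru_pages(self) -> int:
        # Assumption: everything except hugetlb pages is on the LRU.
        return self.total_pages - self.hugetlb_pages

    @property
    def reclaimable_pages(self) -> int:
        # mlocked pages are counted on the LRU but cannot be reclaimed.
        return self.lru_pages - self.mlocked_pages


mlock_case = ToyZone(total_pages=1000, mlocked_pages=800)
huge_case = ToyZone(total_pages=1000, hugetlb_pages=800)

for name, zone in (("mlock", mlock_case), ("hugetlb", huge_case)):
    print(f"{name:8s} lru={zone.lru_pages:5d} "
          f"reclaimable={zone.reclaimable_pages:5d}")
```

Both cases leave the same number of reclaimable pages, but only the mlock case keeps the pinned pages visible to any arithmetic that is based on the LRU size. In the hugetlb case the LRU itself is much smaller than the zone.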

> Kame-san has recently posted 
> an occurrence in ZONE_DMA. I have 3 customers where I have seen similar VM 
> behavior with a special shared memory thingy locking down lots of 
> memory.

I expect the mechanisms are different.  The mlocked shared-memory segment
will fill the LRU with unreclaimable pages and the machine will do lots of
scanning.  That's inefficient, but it is unexpected that this will lead to
a false declaration of OOM.

> In fact in the NUMA case with cpusets the limits being off is a very 
> common problem. F.e. the dirty balancing logic does not take into account 
> that the application can just run on a subset of the machine.

Yup.

> So if a 
> cpuset is just 1/10th of the whole machine then we will never be able to 
> reach the dirty limits, all the nodes of a cpuset may be filled up with 
> dirty pages. A simple cp of a large file will bring the machine into a 
> continual reclaim on all nodes.

It shouldn't be continual and it shouldn't be on all nodes.  What _should_
happen in this situation is that the dirty pages in those zones are written
back off the LRU by the vm scanner.

That's less efficient from an IO scheduling POV than writing them back via
the inodes, but it should work OK and it shouldn't affect other zones.

If the activity is really "continual" and "on all nodes" then we have some
bugs to fix.
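To put rough numbers on the cpuset scenario quoted above, here is a minimal sketch. It is a toy calculation, not kernel code: the 64 GiB machine size, the 40% dirty ratio, and the helper name dirty_limit_pages are assumptions chosen for illustration only.

```python
# Toy model of a machine-wide dirty threshold versus a small cpuset.
# All numbers are illustrative assumptions, not measured values.
PAGE_SIZE = 4096


def dirty_limit_pages(total_pages: int, dirty_ratio_pct: int) -> int:
    """Dirty threshold computed as a percentage of the given memory size."""
    return total_pages * dirty_ratio_pct // 100


total_pages = 64 * 2**30 // PAGE_SIZE        # assumed 64 GiB machine
cpuset_pages = total_pages // 10             # cpuset covers 1/10th of memory
limit = dirty_limit_pages(total_pages, 40)   # assumed 40% dirty ratio

# Even if every page in the cpuset is dirty, the machine-wide threshold is
# not reached, so a threshold of this kind never throttles the writer.
cpuset_can_hit_limit = cpuset_pages >= limit

# A threshold scaled to the cpuset's own memory would trip much earlier.
local_limit = dirty_limit_pages(cpuset_pages, 40)

print(f"global limit                 : {limit} pages")
print(f"cpuset size                  : {cpuset_pages} pages")
print(f"limit reachable inside cpuset: {cpuset_can_hit_limit}")
print(f"hypothetical per-cpuset limit: {local_limit} pages")
```

With these assumed figures the cpuset can fill completely with dirty pages while staying far below the global threshold, which is the situation where writeback falls to the vm scanner working off the LRU.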

> I am working on a solution for the dirty throttling but we have similar 
> issues for the other limits. I wonder if we should not account for 
> unreclaimable memory per zone and recalculate the limits if they change 
> significantly. A series of huge page allocations would then retune the 
> limits.

We should fix the existing code before even thinking about this sort of
thing.  Or at least, gain a full understanding of why it is failing.


