From: Christoph Lameter <clameter@sgi.com>
To: Andy Whitcroft <apw@shadowen.org>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>,
Lee.Schermerhorn@hp.com, ak@suse.de, anton@samba.org,
mel@csn.ul.ie, akpm@linux-foundation.org, linux-mm@kvack.org
Subject: Re: [PATCH v2] gfp.h: GFP_THISNODE can go to other nodes if some are unpopulated
Date: Mon, 11 Jun 2007 09:42:14 -0700 (PDT)
Message-ID: <Pine.LNX.4.64.0706110926110.15868@schroedinger.engr.sgi.com>
In-Reply-To: <Pine.LNX.4.64.0706110911080.15326@schroedinger.engr.sgi.com>
On Mon, 11 Jun 2007, Christoph Lameter wrote:
> Well maybe we better fix this? I put an effort into using only cachelines
> already used for GFP_THISNODE since this is in a very performance
> critical path but at that point I was not thinking that we
> would have memoryless nodes.
Duh. Too bad. The node information is not available in __alloc_pages at
all. The only thing we have to go on is a zonelist. And once memoryless
nodes come into play, the first element of that zonelist is no longer
guaranteed to belong to the node from which we picked up the zonelist.
We could check this for alloc_pages_node() and alloc_pages_current() by
putting some code into the place where we retrieve the zonelist based on
the current policy.
And looking at that code I can see some more bad consequences of
memoryless nodes:
1. Interleave to the memoryless node will be redirected to the nearest
node to the memoryless node. This will typically result in the nearest
node getting double the allocations if interleave is set.
So interleave is basically broken. It will no longer spread out the
allocations properly.
2. MPOL_BIND may allow allocations outside of the nodes specified.
It assumes that the first item of the zonelist of each node
is a zone on that node.
So we have a universal assumption in the VM that the first zone of a
zonelist belongs to the local node. The current way of generating
zonelists for memoryless nodes is broken (unsurprisingly since the NUMA
handling was never designed to handle memoryless nodes).
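The interleave skew from point 1 can be illustrated with a toy user-space model. This is only a sketch, not kernel code: the four-node topology, the fallback table, and every `toy_` name are assumptions made up for the example. Node 2 is assumed to be memoryless, with node 3 as its nearest neighbour.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_NODES 4

/* Hypothetical topology: node 2 is the only node without memory. */
static const int toy_has_memory[TOY_NR_NODES] = { 1, 1, 0, 1 };

/*
 * Hypothetical fallback order for each node, nearest first.
 * Node 3 is assumed to be the nearest neighbour of node 2.
 */
static const int toy_fallback[TOY_NR_NODES][TOY_NR_NODES] = {
	{ 0, 1, 3, 2 },
	{ 1, 0, 2, 3 },
	{ 2, 3, 1, 0 },
	{ 3, 2, 1, 0 },
};

/* Return the node that actually serves an allocation aimed at 'target'. */
static int toy_alloc_node(int target)
{
	assert(target >= 0 && target < TOY_NR_NODES);
	for (size_t i = 0; i < TOY_NR_NODES; i++) {
		int node = toy_fallback[target][i];

		if (toy_has_memory[node])
			return node;
	}
	return -1; /* no node has any memory */
}

/* Round-robin 'nr' allocations over all nodes and count where they land. */
static void toy_interleave(int nr, int counts[TOY_NR_NODES])
{
	for (int n = 0; n < TOY_NR_NODES; n++)
		counts[n] = 0;
	for (int i = 0; i < nr; i++) {
		int node = toy_alloc_node(i % TOY_NR_NODES);

		if (node >= 0)
			counts[node]++;
	}
}
```

In this model, 400 interleaved allocations land as 100 on node 0, 100 on node 1, none on node 2, and 200 on node 3, which is the "double the allocations" effect on the nearest node.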
I think we can fix all these troubles by adding an empty zone as
the first zone in the zonelist if the node has no memory of its own.
Then we need to make sure that we do the right thing and fall back
any time these empty zones are encountered.
This will have the following effects:
1. GFP_THISNODE will fail since there is no memory in the empty zone.
2. MPOL_BIND will not allocate on nodes outside of the specified set
since there will be an empty zone in the generated zonelist.
3. Interleave will still hit an empty zone and fall back to the next.
We should add detection of memoryless nodes to mempolicy.c to skip
those nodes.
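The effect of the empty first zone on GFP_THISNODE can be sketched with another toy user-space model. Again this is an illustration under stated assumptions, not the kernel implementation: `struct toy_zone`, `toy_pick_node()` and the two example zonelists are hypothetical, and the "this node" rule is modelled as "stop as soon as a zone is on a different node than the first zone in the list".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified zone: owning node and number of free pages. */
struct toy_zone {
	int node;
	long free_pages;
};

/*
 * Walk a zonelist and return the node that would satisfy the request,
 * or -1 on failure. With 'thisnode' set, only zones on the same node
 * as the first zonelist entry are considered (assumed model of the
 * "first zone is the local node" convention).
 */
static int toy_pick_node(const struct toy_zone *zl, size_t n, int thisnode)
{
	assert(zl != NULL);
	for (size_t i = 0; i < n; i++) {
		if (thisnode && zl[i].node != zl[0].node)
			break;
		if (zl[i].free_pages > 0)
			return zl[i].node;
	}
	return -1;
}

/* Zonelist of memoryless node 2 as generated today: starts at node 3. */
static const struct toy_zone toy_zl_today[] = {
	{ 3, 1024 }, { 1, 1024 },
};

/* Proposed zonelist: an empty zone for node 2 comes first. */
static const struct toy_zone toy_zl_proposed[] = {
	{ 2, 0 }, { 3, 1024 }, { 1, 1024 },
};
```

With `toy_zl_today`, a "this node" request issued for node 2 is silently satisfied from node 3. With `toy_zl_proposed`, the same request fails, while a request without the restriction still falls back to node 3.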