From: Mel Gorman <mel@csn.ul.ie>
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
linux-mm@kvack.org
Subject: Re: [patch] mm: default to node zonelist ordering when nodes have only lowmem
Date: Fri, 26 Mar 2010 14:07:35 +0000
Message-ID: <20100326140735.GB2024@csn.ul.ie>
In-Reply-To: <alpine.DEB.2.00.1003251532150.7950@chino.kir.corp.google.com>

On Thu, Mar 25, 2010 at 03:33:08PM -0700, David Rientjes wrote:
> There are two types of zonelist ordering methodologies:
>
> - node order, preferring allocations on a node to stay local to and
>
> - zone order, preferring allocations come from a higher zone to avoid
> allocating in lowmem zones even though they may not be local.
>
> The ordering technique used by the kernel is configurable on the command
> line, but also has some logic to determine what the default should be.
>
> This logic currently lacks knowledge of systems where a node may only
> have lowmem. For such systems, it is necessary to use node order so that
> GFP_KERNEL allocations may be satisfied by nodes consisting of only
> lowmem.
>
> If zone order is used, GFP_KERNEL allocations to such nodes are actually
> allocated on a node with local affinity that includes ZONE_NORMAL.
>
> This change defaults to node zonelist ordering if any node lacks
> ZONE_NORMAL.
>
> To force zone order, append 'numa_zonelist_order=zone' to the kernel
> command line.
>
> Cc: Mel Gorman <mel@csn.ul.ie>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
> mm/page_alloc.c | 11 ++++++++++-
> 1 files changed, 10 insertions(+), 1 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2582,7 +2582,7 @@ static int default_zonelist_order(void)
> * ZONE_DMA and ZONE_DMA32 can be very small area in the sytem.
> * If they are really small and used heavily, the system can fall
> * into OOM very easily.
> - * This function detect ZONE_DMA/DMA32 size and confgigures zone order.
> + * This function detect ZONE_DMA/DMA32 size and configures zone order.
> */
Spurious change here but it's not very important.
> /* Is there ZONE_NORMAL ? (ex. ppc has only DMA zone..) */
> low_kmem_size = 0;
> @@ -2594,6 +2594,15 @@ static int default_zonelist_order(void)
> if (zone_type < ZONE_NORMAL)
> low_kmem_size += z->present_pages;
> total_size += z->present_pages;
> + } else if (zone_type == ZONE_NORMAL) {
> + /*
What if it was ZONE_DMA32?
> + * If any node has only lowmem, then node order
> + * is preferred to allow kernel allocations
> + * locally; otherwise, they can easily infringe
> + * on other nodes when there is an abundance of
> + * lowmem available to allocate from.
> + */
> + return ZONELIST_ORDER_NODE;
It might be clearer if it was done alongside the similar check made later
in the function:

	if (low_kmem_size &&
	    total_size > average_size && /* ignore small node */
	    low_kmem_size > total_size * 70/100)
		return ZONELIST_ORDER_NODE;
That check says that if low memory is more than 70% of a node's total,
node order is used. To take yours into account, it'd look something like:

	if (low_kmem_size && total_size > average_size) {
		if (low_kmem_size == total_size)
			return ZONELIST_ORDER_ZONE;
		if (low_kmem_size > total_size * 70/100)
			return ZONELIST_ORDER_NODE;
	}
> }
> }
> }
>
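
For illustration only, here is a minimal userspace sketch of the per-node
check being discussed. It is not the kernel code: struct node_mem,
pick_order() and the ORDER_* values are hypothetical names, and the average
is assumed to be the plain mean over populated nodes. It follows the patch
description in choosing node order for a lowmem-only node, then applies the
70% rule to nodes larger than average.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel API. */
enum zonelist_order { ORDER_ZONE, ORDER_NODE };

struct node_mem {
	unsigned long low_kmem_pages;	/* pages in zones below ZONE_NORMAL */
	unsigned long total_pages;	/* all present pages on the node */
};

static enum zonelist_order pick_order(const struct node_mem *nodes, size_t n)
{
	unsigned long sum = 0, average;
	size_t i, populated = 0;

	/* Assumption: average node size is the mean over populated nodes. */
	for (i = 0; i < n; i++) {
		if (nodes[i].total_pages) {
			sum += nodes[i].total_pages;
			populated++;
		}
	}
	if (!populated)
		return ORDER_ZONE;
	average = sum / populated;

	for (i = 0; i < n; i++) {
		unsigned long low = nodes[i].low_kmem_pages;
		unsigned long total = nodes[i].total_pages;

		/* Patch description: a node with only lowmem wants node order. */
		if (total && low == total)
			return ORDER_NODE;

		/* Ignore small nodes, then apply the 70% lowmem threshold. */
		if (low && total > average && low > total * 70 / 100)
			return ORDER_NODE;
	}
	return ORDER_ZONE;
}
```

With two 1000-page nodes where one has 100 pages of lowmem, neither rule
fires and the sketch picks zone order; making either node lowmem-only
switches the result to node order.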
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab