linux-mm.kvack.org archive mirror
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
To: Robin Holt <holt@sgi.com>
Cc: kosaki.motohiro@jp.fujitsu.com, Rik van Riel <riel@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@linux-foundation.org>
Subject: Re: [PATCH 4/4] zone_reclaim_mode is always 0 by default
Date: Thu, 14 May 2009 21:02:32 +0900 (JST)
Message-ID: <20090514205654.9B8A.A69D9226@jp.fujitsu.com>
In-Reply-To: <20090514114827.GN7601@sgi.com>

> > Unfortunately no.
> > zone reclaim has two weaknesses by design.
> > 
> > 1.
> > zone reclaim doesn't work well when working set size > local node size,
> > and that can happen easily on a small machine.
> > when it happens, zone reclaim drops the process's own memory.
> > 
> > Plus, zone reclaim also doesn't fit a DB server, whose processes have
> > a large working set.
> 
> Large DB server is not your typical desktop application either.

ack.


> > 2.
> > zone reclaim has an inter-zone balancing issue.
> > 
> > example: an x86_64 2-node 8GB machine has the following zone assignment
> > 
> >    node 0 (DMA32):  3GB
> >    node 0 (Normal): 1GB
> >    node 1 (Normal): 4GB
> > 
> > if the page is allocated from DMA32, you are lucky. DMA32 isn't reclaimed
> > so frequently. but if it comes from node 0 Normal, you are unlucky.
> > that zone is reclaimed very frequently although it is smaller than the others.
> 
> I have seen that behavior on some of our mismatched large systems as well,
> although never had one so imbalanced because ia64 only has Normal.

not true.
some ia64 servers have a DMA zone of about 2GB. SGI ia64 is a special case.


> > I know my patch changes the large server default. but I believe the linux
> > default kernel parameters should be adapted to desktop and entry-level machines.
> 
> If this imbalance is an x86_64 only problem, then we could do something
> simple like the following untested patch.  This leaves the default
> for everyone except x86_64.

not x86_64 only.
many 64bit architectures have a 2 or 4GB DMA zone.

even so, your patch seems interesting. at least it solves the
desktop user issue, and we don't need to worry about users in other areas.

embedded and high-end server users are typically skillful. they can
change the kernel parameter by themselves.
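
for reference, the parameter in question is exposed as
/proc/sys/vm/zone_reclaim_mode (the vm.zone_reclaim_mode sysctl) on
NUMA kernels. below is a minimal userspace sketch of reading and
setting it. the two helper names are hypothetical, made up only for
this illustration, and writing the real file needs root.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel or libc API): read an integer mode
 * from a sysctl-style file such as /proc/sys/vm/zone_reclaim_mode.
 * Returns the value read, or -1 if the file is missing or unparsable.
 */
int read_zone_reclaim_mode(const char *path)
{
	FILE *f;
	int mode = -1;

	assert(path != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%d", &mode) != 1)
		mode = -1;
	fclose(f);
	return mode;
}

/*
 * Hypothetical helper: write a mode value to the same kind of file.
 * Returns 0 on success, -1 on failure (e.g. no permission).
 */
int write_zone_reclaim_mode(const char *path, int mode)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", mode) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

calling write_zone_reclaim_mode("/proc/sys/vm/zone_reclaim_mode", 1)
as root would turn zone reclaim back on for a large server, and
passing 0 would turn it off.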


> 
> Robin
> 
> ------------------------------------------------------------------------
> 
> Even if there is a great node distance on x86_64, disable zone reclaim
> by default.  This was done to handle the imbalanced zone sizes where a
> majority of the memory in node 0 is DMA32 with a small remaining Normal
> which will be aggressively reclaimed.
> 
> For other architectures, we leave the default behavior.
> 
> Signed-off-by: Robin Holt <holt@sgi.com>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Christoph Lameter <cl@linux-foundation.org>
> Cc: Rik van Riel <riel@redhat.com>
> 
> ---
>  arch/x86/include/asm/topology.h |    2 ++
>  include/linux/topology.h        |    5 +++++
>  mm/page_alloc.c                 |    2 +-
>  3 files changed, 8 insertions(+), 1 deletion(-)
> Index: page_reclaim_mode/arch/x86/include/asm/topology.h
> ===================================================================
> --- page_reclaim_mode.orig/arch/x86/include/asm/topology.h	2009-05-14 06:44:20.118925713 -0500
> +++ page_reclaim_mode/arch/x86/include/asm/topology.h	2009-05-14 06:44:21.251067716 -0500
> @@ -128,6 +128,8 @@ extern unsigned long node_remap_size[];
>  
>  #endif
>  
> +#define DEFAULT_ZONE_RECLAIM_MODE	0
> +
>  /* sched_domains SD_NODE_INIT for NUMA machines */
>  #define SD_NODE_INIT (struct sched_domain) {		\
>  	.min_interval		= 8,			\
> Index: page_reclaim_mode/include/linux/topology.h
> ===================================================================
> --- page_reclaim_mode.orig/include/linux/topology.h	2009-05-14 06:44:20.070919619 -0500
> +++ page_reclaim_mode/include/linux/topology.h	2009-05-14 06:44:21.279071382 -0500
> @@ -61,6 +61,11 @@ int arch_update_cpu_topology(void);
>   */
>  #define RECLAIM_DISTANCE 20
>  #endif
> +
> +#ifndef DEFAULT_ZONE_RECLAIM_MODE
> +#define DEFAULT_ZONE_RECLAIM_MODE	1
> +#endif
> +
>  #ifndef PENALTY_FOR_NODE_WITH_CPUS
>  #define PENALTY_FOR_NODE_WITH_CPUS	(1)
>  #endif
> Index: page_reclaim_mode/mm/page_alloc.c
> ===================================================================
> --- page_reclaim_mode.orig/mm/page_alloc.c	2009-05-14 06:44:20.138928363 -0500
> +++ page_reclaim_mode/mm/page_alloc.c	2009-05-14 06:44:21.311075244 -0500
> @@ -2331,7 +2331,7 @@ static void build_zonelists(pg_data_t *p
>  		 * to reclaim pages in a zone before going off node.
>  		 */
>  		if (distance > RECLAIM_DISTANCE)
> -			zone_reclaim_mode = 1;
> +			zone_reclaim_mode = DEFAULT_ZONE_RECLAIM_MODE;
>  
>  		/*
>  		 * We don't want to pressure a particular node.
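
to illustrate the override pattern used in the quoted patch, here is a
stand-alone userspace sketch, not kernel code. the arch header is
simulated by building with -DDEFAULT_ZONE_RECLAIM_MODE=0, and
pick_zone_reclaim_mode() is a hypothetical helper that only mirrors
the distance check in build_zonelists().

```c
#include <assert.h>

/* Same threshold as the generic include/linux/topology.h in the patch. */
#define RECLAIM_DISTANCE 20

/*
 * Generic fallback: used only when no "arch header" defined the macro
 * first. Building with -DDEFAULT_ZONE_RECLAIM_MODE=0 stands in for the
 * x86 definition added by the patch.
 */
#ifndef DEFAULT_ZONE_RECLAIM_MODE
#define DEFAULT_ZONE_RECLAIM_MODE 1
#endif

/*
 * Hypothetical helper: given the current mode and the distance to
 * another node, return the mode after the check. A node farther away
 * than RECLAIM_DISTANCE selects the per-arch default; otherwise the
 * current mode is kept.
 */
int pick_zone_reclaim_mode(int current_mode, int distance)
{
	assert(distance >= 0);
	if (distance > RECLAIM_DISTANCE)
		return DEFAULT_ZONE_RECLAIM_MODE;
	return current_mode;
}
```

built as is, pick_zone_reclaim_mode(0, 21) returns 1, the old
behavior. built with -DDEFAULT_ZONE_RECLAIM_MODE=0 it returns 0, which
is what the patch does for x86.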




