From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: kosaki.motohiro@jp.fujitsu.com,
LKML <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
Andrew Morton <akpm@linux-foundation.org>,
Rik van Riel <riel@redhat.com>,
Christoph Lameter <cl@linux-foundation.org>,
"Zhang, Yanmin" <yanmin.zhang@intel.com>
Subject: Re: [PATCH 4/4] zone_reclaim_mode is always 0 by default
Date: Tue, 19 May 2009 11:53:38 +0900 (JST)
Message-ID: <20090519102634.4EB4.A69D9226@jp.fujitsu.com>
In-Reply-To: <20090518034907.GF5869@localhost>
> On Wed, May 13, 2009 at 12:08:12PM +0900, KOSAKI Motohiro wrote:
> > Subject: [PATCH] zone_reclaim_mode is always 0 by default
> >
> > The current Linux policy is that, if the machine has a large remote node
> > distance, zone_reclaim_mode is enabled by default, because until recently
> > we could assume that a large distance means a large server.
> >
> > Unfortunately, recent x86 CPUs (e.g. Core i7, Opteron) have a P2P transport
> > memory controller. IOW, they are NUMA from the software's point of view.
> >
> > Some Core i7 machines have a large remote node distance, and zone_reclaim
> > doesn't fit desktops and small file servers. It causes a performance regression.
>
> I can confirm this, Yanmin recently ran into exactly such a
> regression, which was fixed by manually disabling the zone reclaim
> mode. So I guess you can safely add an
>
> Tested-by: "Zhang, Yanmin" <yanmin.zhang@intel.com>
>
> > Thus, zone_reclaim == 0 is better by default. Sorry, HPC guys.
> > You need to turn zone_reclaim_mode on manually now.
>
> I guess the borderline will continue to blur up. It will be more
> dependent on workloads instead of physical NUMA capabilities. So
>
> Acked-by: Wu Fengguang <fengguang.wu@intel.com>

OK, let me explain the zone reclaim design and its performance tendencies.

First, we can roughly classify the Linux ecosystem as follows:

 - HPC
 - high-end server
 - volume server
 - desktop
 - embedded

The classes are distinguished mainly by their typical workload.

Second, zone_reclaim means "I dislike remote node access much more than
disk access".
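
As a rough illustration of that trade-off, here is a toy model in Python. It is not kernel code: the `Node` class and the `alloc_page` function are hypothetical names made up for this sketch, and real zones, watermarks and reclaim batching are ignored. When the local node has no free page, zone_reclaim=1 drops local page cache to stay local, while zone_reclaim=0 falls back to a remote node and keeps the cache.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy memory node: page counts only, no real zones or watermarks."""
    free: int    # free pages
    cache: int   # clean page-cache pages that could be dropped


def alloc_page(nodes, local, zone_reclaim):
    """Allocate one page for a task running on node `local`.

    Returns the index of the node that supplied the page, or None if
    no node could. Hypothetical helper, for illustration only.
    """
    node = nodes[local]
    if node.free == 0 and zone_reclaim and node.cache > 0:
        # zone_reclaim=1: drop a local cache page rather than go remote.
        # The dropped data must be re-read from disk if it is needed again.
        node.cache -= 1
        node.free += 1
    # Try the local node first, then the other nodes in index order.
    order = [local] + [i for i in range(len(nodes)) if i != local]
    for i in order:
        if nodes[i].free > 0:
            nodes[i].free -= 1
            return i
    return None
```

With a full local node holding some cache and an empty remote node, `alloc_page(nodes, 0, zone_reclaim=True)` stays on node 0 at the cost of one cache page, and `zone_reclaim=False` takes the page from node 1 and leaves the cache alone.
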

That fits HPC workloads very well, because:

 - An HPC workload typically runs as many processes (or threads) as there are CPUs.
   IOW, the workload typically uses memory equally on each node.
 - An HPC workload is typically a CPU-bound job. CPU migration is rare.
 - An HPC workload is typically long lived (possibly >1 year).

IOW, a remote node allocation leads to a _very_ _very_ large number of remote node accesses.

But zone_reclaim doesn't fit a typical server workload.

 - A server workload often creates a thread pool, and some threads sleep until
   a request is received.

IOW, when a thread wakes up, it might move to another CPU.
Node distance makes little sense for a workload with weak CPU locality.

Plus, the disk cache is the file server's identity. We shouldn't treat it as unimportant.
Plus, DB software can consume almost all system memory, and (in general) RDB data is
harder to split equally across nodes than HPC data.

The desktop workload is special. Desktop people can run various workloads beyond
our assumptions, so we shouldn't assume any particular workload for desktop people.
However, AFAIK almost all desktop software uses memory as if it were UMA.

We don't need to care about embedded. It is typically UMA.

IOW, the benefit of zone reclaim depends on "strong CPU locality",
"the workload is CPU bound" and "threads are long lived".
But many workloads don't meet those requirements. IOW, zone reclaim is a
workload-dependent feature (as Wu said).

In general, a workload-dependent feature doesn't fit as a default option.
We can't know what workload the end user will run anyway.

Fortunately (or unfortunately), typical workload and machine size used to be
strongly correlated.
Thus, the current default setting calculation worked well in the past.
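
For reference, the default calculation being discussed can be sketched roughly like this in Python. This is a simplified model, not the kernel source: the function name is made up, the threshold value of 20 is an assumption for illustration (the kernel's constant is RECLAIM_DISTANCE, and its value may differ per architecture and version), and the distance matrices in the usage note are invented.

```python
# Assumed threshold, for illustration only. The kernel's real constant is
# RECLAIM_DISTANCE; its value may differ per architecture and version.
RECLAIM_DISTANCE = 20


def default_zone_reclaim_mode(distance):
    """Return 1 if any remote node is farther than RECLAIM_DISTANCE, else 0.

    `distance[i][j]` is a SLIT-style node distance matrix in which
    10 means "local". Hypothetical helper, not a kernel function.
    """
    n = len(distance)
    for i in range(n):
        for j in range(n):
            if i != j and distance[i][j] > RECLAIM_DISTANCE:
                return 1
    return 0
```

With made-up two-node matrices, `[[10, 21], [21, 10]]` gives 1 and `[[10, 20], [20, 10]]` gives 0. The point is that the result depends only on the reported distances and says nothing about the workload, so a small two-socket machine that reports a large distance gets the same default as a large HPC system.
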
Now it is broken. What should we do?

Yanmin, we know 99% of Linux people use Intel CPUs, and you are one of the
most persistent testers on lkml and have run many tests.
May I ask which machines and benchmarks you tested?

If workloads that favor zone_reclaim=0 outnumber workloads that favor zone_reclaim=1,
we can drop our fears, and we would prioritize your opinion, of course.

Thanks.