From: Robin Holt <holt@sgi.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
Andrew Morton <akpm@linux-foundation.org>,
Rik van Riel <riel@redhat.com>,
Christoph Lameter <cl@linux-foundation.org>,
Robin Holt <holt@sgi.com>,
"Zhang, Yanmin" <yanmin.zhang@intel.com>,
Wu Fengguang <fengguang.wu@intel.com>
Subject: Re: [PATCH v3] zone_reclaim is always 0 by default
Date: Fri, 22 May 2009 07:26:09 -0500
Message-ID: <20090522122609.GC29447@sgi.com>
In-Reply-To: <20090521114408.63D0.A69D9226@jp.fujitsu.com>
OK. While I did not object earlier, I am starting to feel a NACK
coming on.
How did you determine this is the source of your problems? What leads
you to believe this is the correct fix instead of an easy change which
affects some random benchmark?
Let me be clear: I believe you are seeing an impact from reclaim. I do
not agree it is necessarily a negative impact for the majority of users.
On Thu, May 21, 2009 at 11:47:01AM +0900, KOSAKI Motohiro wrote:
>
> Subject: [PATCH v3] zone_reclaim is always 0 by default
>
> Current linux policy is, zone_reclaim_mode is enabled by default if the machine
> has large remote node distance. it's because we could assume that large distance
> mean large server until recently.
>
> Unfortunately, recent modern x86 CPU (e.g. Core i7, Opeteron) have P2P transport
> memory controller. IOW it's seen as NUMA from software view.
> Some Core i7 machine has large remote node distance.
>
> Yanmin reported zone_reclaim_mode=1 cause large apache regression.
>
> One Nehalem machine has 12GB memory,
> but there is always 2GB free although applications accesses lots of files.
> Eventually we located the root cause as zone_reclaim_mode=1.
Your root cause analysis is suspect. You found a knob to turn which
suddenly improved performance for one specific un-tuned server workload.
> Actually, zone_reclaim_mode=1 mean "I dislike remote node allocation rather than
> disk access", it makes performance improvement to HPC workload.
> but it makes performance regression desktop, file server and web server.
zone_reclaim_mode merely means: try to free any unused local page before
going off node. I have never seen off-node allocations precluded as
long as the local node's pages are in use. The effect on your one test
shows that unused page cache pages get properly discarded and reused by
the allocator.
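
To illustrate the ordering described above, here is a toy model in C. It
is only a sketch and not the kernel's allocator: the struct, the enum and
toy_alloc_page() are hypothetical names made up for this example, and real
reclaim is far more involved than decrementing a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one NUMA node. All names here are hypothetical. */
struct toy_node {
	int free_pages;         /* pages immediately available */
	int unused_cache_pages; /* unused page cache that could be discarded */
};

enum toy_source {
	TOY_LOCAL,               /* satisfied from a free local page */
	TOY_LOCAL_AFTER_RECLAIM, /* satisfied by discarding unused local cache */
	TOY_REMOTE,              /* satisfied from the remote node */
	TOY_FAIL                 /* nothing available anywhere */
};

/*
 * Allocate one page, preferring the local node. With zone_reclaim set,
 * an unused local page cache page is discarded and reused before the
 * allocation goes off node. Off-node allocation is never precluded: it
 * is simply tried later.
 */
static enum toy_source toy_alloc_page(struct toy_node *local,
				      struct toy_node *remote,
				      bool zone_reclaim)
{
	assert(local && remote);

	if (local->free_pages > 0) {
		local->free_pages--;
		return TOY_LOCAL;
	}
	if (zone_reclaim && local->unused_cache_pages > 0) {
		local->unused_cache_pages--;
		return TOY_LOCAL_AFTER_RECLAIM;
	}
	if (remote->free_pages > 0) {
		remote->free_pages--;
		return TOY_REMOTE;
	}
	return TOY_FAIL;
}
```

In this model, once the local node has nothing unused left to discard, the
allocation goes to the remote node whether or not zone_reclaim is set.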
> In general, workload depended configuration shouldn't put into default settings.
> Plus, desktop and file/web server eco-system is much larger than hpc's.
I believe you are putting a workload-dependent configuration in as the
default. All you have shown is that a poorly configured system running
apache responds better on your tests. I can make a common-sense argument
that either =1 or =0 is the better default. I think the fact that it has
been =1 for so long without causing significant issues should at least
be factored in. Making an exception for the new hardware on the block
makes sense as well.
> Thus, zone_reclaim == 0 is better by default.
How did you determine it is better by default? I think we already established
that apache is a server workload and not a desktop workload. Earlier
you were arguing that we need this turned off to improve the desktop
environment. You have not established this improves desktop performance.
Actually, you have not established it improves apache performance or
server performance. You have documented it improves memory utilization,
but that is not always the same as faster.
Sorry for being difficult about this, but you are tweaking a knob that
completely changes performance for my typical workload. Reclaim has
been the source of great frustration for me over the years.
Hopefully this is not arrogance on my part, but if you went back to
something equivalent to my earlier patch which allowed the architecture
to decide the default, I would go back to not objecting despite the lack
of proof this is the right fix. You never did specify what was wrong
with that patch. It was simple to understand, accomplished your needs
as well as mine, and allowed flexibility in implementing the default,
since the #define could be expanded to include arch-specific checks if
sub-arches find they need a different default than the rest of the arch.
Compared to "Just remove the default", that seems preferable.
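
As a rough sketch of the idea of an architecture-chosen default (this is
not the earlier patch itself, which is not reproduced in this message),
something along these lines would do. The macro names, the helper
function and the threshold value of 20 are all assumptions made for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical names throughout. An architecture header could define
 * TOY_ARCH_ZONE_RECLAIM_DEFAULT before this point, possibly with
 * sub-arch specific checks. If it does not, the generic fallback below
 * keeps a distance-based default.
 */
#ifndef TOY_ARCH_ZONE_RECLAIM_DEFAULT
#define TOY_RECLAIM_DISTANCE 20 /* illustrative threshold, an assumption */
#define TOY_ARCH_ZONE_RECLAIM_DEFAULT(distance) \
	((distance) > TOY_RECLAIM_DISTANCE)
#endif

/* Decide the boot-time default from the distance to a remote node. */
static bool toy_default_zone_reclaim(int node_distance)
{
	assert(node_distance >= 0);
	return TOY_ARCH_ZONE_RECLAIM_DEFAULT(node_distance);
}
```

An architecture wanting the default off everywhere would define the macro
to 0 in its own header, and nobody else would see a change.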
Thanks,
Robin