* [PATCH v3] zone_reclaim is always 0 by default
@ 2009-05-21 2:47 KOSAKI Motohiro
2009-05-21 3:27 ` Zhang, Yanmin
2009-05-22 12:26 ` Robin Holt
0 siblings, 2 replies; 8+ messages in thread
From: KOSAKI Motohiro @ 2009-05-21 2:47 UTC (permalink / raw)
To: LKML, linux-mm, Andrew Morton, Rik van Riel, Christoph Lameter,
Robin Holt, Zhang, Yanmin, Wu Fengguang
Cc: kosaki.motohiro
Subject: [PATCH v3] zone_reclaim is always 0 by default
The current Linux policy enables zone_reclaim_mode by default if the machine
has a large remote node distance, because until recently we could assume that
a large distance meant a large server.
Unfortunately, recent x86 CPUs (e.g. Core i7, Opteron) have a P2P-transport
memory controller. IOW, they look like NUMA from the software's point of view.
Some Core i7 machines have a large remote node distance.
Yanmin reported that zone_reclaim_mode=1 causes a large apache regression:
One Nehalem machine has 12GB memory,
but there is always 2GB free although applications access lots of files.
Eventually we located the root cause as zone_reclaim_mode=1.
Actually, zone_reclaim_mode=1 means "I dislike remote node allocation more than
disk access". It improves performance for HPC workloads,
but it degrades performance for desktops, file servers and web servers.
In general, a workload-dependent configuration shouldn't be the default setting.
Plus, the desktop and file/web server ecosystem is much larger than HPC's.
Thus, zone_reclaim == 0 is better by default.
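For reference, the boot-time policy that this patch removes can be modeled in user space roughly as follows. This is only a sketch: boot_zone_reclaim_mode() and the flat distance matrix are illustrative assumptions, and the kernel makes the check node by node inside build_zonelists() rather than over a matrix.

```c
#include <stddef.h>

/* Generic default from include/linux/topology.h (ia64 used 15). */
#define RECLAIM_DISTANCE 20

/*
 * Hypothetical user-space model of the removed check.
 * distance is an nr_nodes x nr_nodes row-major matrix standing in for
 * node_distance(from, to). Returns 1 if any pair of nodes is farther
 * apart than reclaim_distance, 0 otherwise.
 */
static int boot_zone_reclaim_mode(const int *distance, size_t nr_nodes,
                                  int reclaim_distance)
{
    size_t from, to;

    for (from = 0; from < nr_nodes; from++)
        for (to = 0; to < nr_nodes; to++)
            if (distance[from * nr_nodes + to] > reclaim_distance)
                return 1;
    return 0;
}
```

With a made-up two-node matrix {10, 21, 21, 10} and RECLAIM_DISTANCE this returns 1; with {10, 20, 20, 10} it returns 0, because the comparison is strictly greater-than.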
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <holt@sgi.com>
Tested-by: "Zhang, Yanmin" <yanmin.zhang@intel.com>
Acked-by: Wu Fengguang <fengguang.wu@intel.com>
---
arch/ia64/include/asm/topology.h | 5 -----
include/linux/topology.h | 9 +--------
mm/page_alloc.c | 7 -------
3 files changed, 1 insertion(+), 20 deletions(-)
Index: b/mm/page_alloc.c
===================================================================
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2494,13 +2494,6 @@ static void build_zonelists(pg_data_t *p
int distance = node_distance(local_node, node);
/*
- * If another node is sufficiently far away then it is better
- * to reclaim pages in a zone before going off node.
- */
- if (distance > RECLAIM_DISTANCE)
- zone_reclaim_mode = 1;
-
- /*
* We don't want to pressure a particular node.
* So adding penalty to the first node in same
* distance group to make it round-robin.
Index: b/arch/ia64/include/asm/topology.h
===================================================================
--- a/arch/ia64/include/asm/topology.h
+++ b/arch/ia64/include/asm/topology.h
@@ -21,11 +21,6 @@
#define PENALTY_FOR_NODE_WITH_CPUS 255
/*
- * Distance above which we begin to use zone reclaim
- */
-#define RECLAIM_DISTANCE 15
-
-/*
* Returns the number of the node containing CPU 'cpu'
*/
#define cpu_to_node(cpu) (int)(cpu_to_node_map[cpu])
Index: b/include/linux/topology.h
===================================================================
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -53,14 +53,7 @@ int arch_update_cpu_topology(void);
#ifndef node_distance
#define node_distance(from,to) ((from) == (to) ? LOCAL_DISTANCE : REMOTE_DISTANCE)
#endif
-#ifndef RECLAIM_DISTANCE
-/*
- * If the distance between nodes in a system is larger than RECLAIM_DISTANCE
- * (in whatever arch specific measurement units returned by node_distance())
- * then switch on zone reclaim on boot.
- */
-#define RECLAIM_DISTANCE 20
-#endif
+
#ifndef PENALTY_FOR_NODE_WITH_CPUS
#define PENALTY_FOR_NODE_WITH_CPUS (1)
#endif
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* RE: [PATCH v3] zone_reclaim is always 0 by default
2009-05-21 2:47 [PATCH v3] zone_reclaim is always 0 by default KOSAKI Motohiro
@ 2009-05-21 3:27 ` Zhang, Yanmin
2009-05-22 12:26 ` Robin Holt
1 sibling, 0 replies; 8+ messages in thread
From: Zhang, Yanmin @ 2009-05-21 3:27 UTC (permalink / raw)
To: KOSAKI Motohiro, LKML, linux-mm, Andrew Morton, Rik van Riel,
Christoph Lameter, Robin Holt, Wu, Fengguang
>>-----Original Message-----
>>From: KOSAKI Motohiro [mailto:kosaki.motohiro@jp.fujitsu.com]
>>Sent: May 21, 2009 10:47
>>To: LKML; linux-mm; Andrew Morton; Rik van Riel; Christoph Lameter; Robin Holt;
>>Zhang, Yanmin; Wu, Fengguang
>>Cc: kosaki.motohiro@jp.fujitsu.com
>>Subject: [PATCH v3] zone_reclaim is always 0 by default
>>
>>
>>Subject: [PATCH v3] zone_reclaim is always 0 by default
>>
>>Current linux policy is, zone_reclaim_mode is enabled by default if the machine
>>has large remote node distance. it's because we could assume that large distance
>>mean large server until recently.
>>
>>Unfortunately, recent modern x86 CPU (e.g. Core i7, Opeteron) have P2P
>>transport
>>memory controller. IOW it's seen as NUMA from software view.
>>Some Core i7 machine has large remote node distance.
>>
>>Yanmin reported zone_reclaim_mode=1 cause large apache regression.
>>
>> One Nehalem machine has 12GB memory,
>> but there is always 2GB free although applications accesses lots of files.
>> Eventually we located the root cause as zone_reclaim_mode=1.
>>
>>Actually, zone_reclaim_mode=1 mean "I dislike remote node allocation rather
>>than
>>disk access", it makes performance improvement to HPC workload.
>>but it makes performance degression desktop, file server and web server.
>>
>>In general, workload depended configration shouldn't put into default
>>settings.
>>Plus, desktop and file/web server eco-system is much larger than hpc's.
>>
>>Thus, zone_reclaim == 0 is better by default.
[YM] Thanks. I started a series of tests on 2 Nehalem machines with
zone_reclaim_mode=0 (the default is 1 on both machines). I didn't find any
regression with non-disk-I/O (mostly CPU-bound) benchmarks. Disk I/O benchmarks
could benefit a little from zone_reclaim_mode=0. Because I now start the fio
benchmark with numactl --interleave=all, the fio improvement is not as big as before.
One thing I need to mention is that my non-disk-I/O tests might not be good examples
for this patch, because every node has far more memory than the tests need.
Only some disk I/O benchmarks have a big requirement for page cache memory, so
they could benefit from zone_reclaim_mode=0.
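To check which mode a test machine is actually running with, the sysctl can be read from /proc/sys/vm/zone_reclaim_mode. A minimal sketch in C, assuming a helper name (read_zone_reclaim_mode) that is not part of any kernel or libc API; the path is passed in so the function can also be pointed at an ordinary file:

```c
#include <stdio.h>

/*
 * Hypothetical helper: read one integer from a sysctl-style file such as
 * /proc/sys/vm/zone_reclaim_mode. Returns the value, or -1 if the file
 * cannot be opened or does not start with a non-negative integer.
 */
static int read_zone_reclaim_mode(const char *path)
{
    FILE *f = fopen(path, "r");
    int mode;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &mode) != 1 || mode < 0)
        mode = -1;
    fclose(f);
    return mode;
}
```

The sysctl file is only present on kernels built with NUMA support, so a return value of -1 may simply mean the knob does not exist on that machine.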
>>
>>
>>Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>>Cc: Christoph Lameter <cl@linux-foundation.org>
>>Cc: Rik van Riel <riel@redhat.com>
>>Cc: Robin Holt <holt@sgi.com>
>>Tested-by: "Zhang, Yanmin" <yanmin.zhang@intel.com>
>>Acked-by: Wu Fengguang <fengguang.wu@intel.com>
>>---
>> arch/ia64/include/asm/topology.h | 5 -----
>> include/linux/topology.h | 9 +--------
>> mm/page_alloc.c | 7 -------
>> 3 files changed, 1 insertion(+), 20 deletions(-)
>>
>>Index: b/mm/page_alloc.c
>>===================================================================
>>--- a/mm/page_alloc.c
>>+++ b/mm/page_alloc.c
>>@@ -2494,13 +2494,6 @@ static void build_zonelists(pg_data_t *p
>> int distance = node_distance(local_node, node);
>>
>> /*
>>- * If another node is sufficiently far away then it is better
>>- * to reclaim pages in a zone before going off node.
>>- */
>>- if (distance > RECLAIM_DISTANCE)
>>- zone_reclaim_mode = 1;
>>-
>>- /*
>> * We don't want to pressure a particular node.
>> * So adding penalty to the first node in same
>> * distance group to make it round-robin.
>>Index: b/arch/ia64/include/asm/topology.h
>>===================================================================
>>--- a/arch/ia64/include/asm/topology.h
>>+++ b/arch/ia64/include/asm/topology.h
>>@@ -21,11 +21,6 @@
>> #define PENALTY_FOR_NODE_WITH_CPUS 255
>>
>> /*
>>- * Distance above which we begin to use zone reclaim
>>- */
>>-#define RECLAIM_DISTANCE 15
>>-
>>-/*
>> * Returns the number of the node containing CPU 'cpu'
>> */
>> #define cpu_to_node(cpu) (int)(cpu_to_node_map[cpu])
>>Index: b/include/linux/topology.h
>>===================================================================
>>--- a/include/linux/topology.h
>>+++ b/include/linux/topology.h
>>@@ -53,14 +53,7 @@ int arch_update_cpu_topology(void);
>> #ifndef node_distance
>> #define node_distance(from,to) ((from) == (to) ? LOCAL_DISTANCE :
>>REMOTE_DISTANCE)
>> #endif
>>-#ifndef RECLAIM_DISTANCE
>>-/*
>>- * If the distance between nodes in a system is larger than RECLAIM_DISTANCE
>>- * (in whatever arch specific measurement units returned by node_distance())
>>- * then switch on zone reclaim on boot.
>>- */
>>-#define RECLAIM_DISTANCE 20
>>-#endif
>>+
>> #ifndef PENALTY_FOR_NODE_WITH_CPUS
>> #define PENALTY_FOR_NODE_WITH_CPUS (1)
>> #endif
>>
* Re: [PATCH v3] zone_reclaim is always 0 by default
2009-05-21 2:47 [PATCH v3] zone_reclaim is always 0 by default KOSAKI Motohiro
2009-05-21 3:27 ` Zhang, Yanmin
@ 2009-05-22 12:26 ` Robin Holt
2009-05-24 13:44 ` KOSAKI Motohiro
1 sibling, 1 reply; 8+ messages in thread
From: Robin Holt @ 2009-05-22 12:26 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: LKML, linux-mm, Andrew Morton, Rik van Riel, Christoph Lameter,
Robin Holt, Zhang, Yanmin, Wu Fengguang
OK. While I did not object earlier, I am starting to feel a NACK
coming on.
How did you determine this is the source of your problems? What leads
you to believe this is the correct fix instead of an easy change which
affects some random benchmark?
Let me be clear, I believe you are seeing an impact from reclaim. I do
not agree it is necessarily a negative impact for the majority of users.
On Thu, May 21, 2009 at 11:47:01AM +0900, KOSAKI Motohiro wrote:
>
> Subject: [PATCH v3] zone_reclaim is always 0 by default
>
> Current linux policy is, zone_reclaim_mode is enabled by default if the machine
> has large remote node distance. it's because we could assume that large distance
> mean large server until recently.
>
> Unfortunately, recent modern x86 CPU (e.g. Core i7, Opeteron) have P2P transport
> memory controller. IOW it's seen as NUMA from software view.
> Some Core i7 machine has large remote node distance.
>
> Yanmin reported zone_reclaim_mode=1 cause large apache regression.
>
> One Nehalem machine has 12GB memory,
> but there is always 2GB free although applications accesses lots of files.
> Eventually we located the root cause as zone_reclaim_mode=1.
Your root cause analysis is suspect. You found a knob to turn which
suddenly improved performance for one specific un-tuned server workload.
> Actually, zone_reclaim_mode=1 mean "I dislike remote node allocation rather than
> disk access", it makes performance improvement to HPC workload.
> but it makes performance regression desktop, file server and web server.
zone_reclaim_mode merely means try to free any local unused page before
going off node. I have never seen off-node allocations precluded as
long as the local node's pages are in use. The effect on your one test
shows that unused page cache pages get properly discarded and reused by
the allocator.
> In general, workload depended configuration shouldn't put into default settings.
> Plus, desktop and file/web server eco-system is much larger than hpc's.
I believe you are putting a workload dependent configuration in as the
default. You have not shown this improves anything other than a poorly
configured system running apache responds better on your tests. I can
make a common sense argument that both =1 and =0 are better. I think
the fact that it has been =1 for so long and not caused significant
issues should at least be factored in. Making an exception for the
new hardware on the block makes sense as well.
> Thus, zone_reclaim == 0 is better by default.
How did you determine better by default? I think we already established
that apache is a server workload and not a desktop workload. Earlier
you were arguing that we need this turned off to improve the desktop
environment. You have not established this improves desktop performance.
Actually, you have not established it improves apache performance or
server performance. You have documented it improves memory utilization,
but that is not always the same as faster.
Sorry for being difficult about this, but you are tweaking a knob that
completely changes performance for my typical workload. Reclaim has
been the source of great frustration for me over the years.
Hopefully this is not arrogance on my part, but if you went back to
something equivalent to my earlier patch which allowed the architecture
to decide the default, I would go back to not objecting despite the lack
of proof this is the right fix. You never did specify what was wrong
with that patch. It was simple to understand, accomplished your needs
as well as mine, allowed flexibility in implementing the default as the
#define could be expanded to include arch specific checks if sub-arches
find they need a different default than the rest of the arch. Compared to
"Just remove the default", that seems preferable.
Thanks,
Robin
* Re: [PATCH v3] zone_reclaim is always 0 by default
2009-05-22 12:26 ` Robin Holt
@ 2009-05-24 13:44 ` KOSAKI Motohiro
2009-05-25 11:41 ` Robin Holt
0 siblings, 1 reply; 8+ messages in thread
From: KOSAKI Motohiro @ 2009-05-24 13:44 UTC (permalink / raw)
To: Robin Holt
Cc: kosaki.motohiro, LKML, linux-mm, Andrew Morton, Rik van Riel,
Christoph Lameter, Zhang, Yanmin, Wu Fengguang
> OK. While I did not object earlier, I am starting to feel a NACK
> coming on.
>
> How did you determine this is the source of your problems? What leads
> you to believe this is the correct fix instead of an easy change which
> affects some random benchmark?
>
> Let me clear, I believe you are seeing an impact from reclaim. I do
> not agree it is necessarily a negative impact for the majority of users.
>
>
> On Thu, May 21, 2009 at 11:47:01AM +0900, KOSAKI Motohiro wrote:
> >
> > Subject: [PATCH v3] zone_reclaim is always 0 by default
> >
> > Current linux policy is, zone_reclaim_mode is enabled by default if the machine
> > has large remote node distance. it's because we could assume that large distance
> > mean large server until recently.
> >
> > Unfortunately, recent modern x86 CPU (e.g. Core i7, Opeteron) have P2P transport
> > memory controller. IOW it's seen as NUMA from software view.
> > Some Core i7 machine has large remote node distance.
> >
> > Yanmin reported zone_reclaim_mode=1 cause large apache regression.
> >
> > One Nehalem machine has 12GB memory,
> > but there is always 2GB free although applications accesses lots of files.
> > Eventually we located the root cause as zone_reclaim_mode=1.
>
> Your root cause analysis is suspect. You found a knob to turn which
> suddenly improved performance for one specific un-tuned server workload.
You'd think.
Actually, I have work experience in both the HPC and server areas.
I've seen zone reclaim improve the performance of some workloads and degrade
that of others (note that this includes HPC workloads).
If you haven't seen zone_reclaim decrease performance, it means
you haven't tested this feature widely enough.
The fact is, the workload-dependent characteristics of zone reclaim have been
widely known for a long time.
Even Documentation/sysctl/vm.txt says,
> It may be beneficial to switch off zone reclaim if the system is
> used for a file server and all of memory should be used for caching files
> from disk. In that case the caching effect is more important than
> data locality.
Nobody except you opposes this.
> > Actually, zone_reclaim_mode=1 mean "I dislike remote node allocation rather than
> > disk access", it makes performance improvement to HPC workload.
> > but it makes performance regression desktop, file server and web server.
>
> zone_reclaim_mode merely means try to free any local unused page before
> going off node. I have never seen off-node allocations precluded as
> long as the local node's pages are in use. The effect on your one test
> shows that unused page cache pages get properly discarded and reused by
> the allocator.
You'd think.
Don't you have an x86 machine? You can test zone_reclaim_mode on a desktop by
using fake NUMA.
Actually, your "local unused page" is _not_ unused. Zone reclaim drops the
oldest file-backed non-dirty pages.
If you think non-dirty means unused, you don't understand Linux memory management.
Only a system with overkill memory guarantees that the oldest page is unused.
> > In general, workload depended configuration shouldn't put into default settings.
> > Plus, desktop and file/web server eco-system is much larger than hpc's.
>
> I believe you are putting a workload dependent configuration in as the
> default. You have not shown this improves anything other than a poorly
> configured system running apache responds better on your tests. I can
> make a common sense argument that both =1 and =0 are better. I think
> the fact that it has been =1 for so long and not caused significant
> issues should at least be factored in. Making an exception for the
> new hardware on the block makes sense as well.
You'd think.
The performance issue has existed all along. It merely means that HPC and
high-end server users could avoid it, because they are skillful engineers.
> > Thus, zone_reclaim == 0 is better by default.
>
> How did you determine better by default? I think we already established
> that apache is a server workload and not a desktop workload. Earlier
> you were arguing that we need this turned off to improve the desktop
> environment. You have not established this improves desktop performance.
> Actually, you have not established it improves apache performance or
> server performance. You have documented it improves memory utilization,
> but that is not always the same as faster.
The fact is, low-end machine performance depends heavily on the cache hit ratio.
Improving memory utilization means improving the cache hit ratio.
Plus, I already explained the desktop use case. Multiple worst-case scenarios
can easily happen on it.
If a big process consumes more memory than the node size, zone reclaim
decreases performance greatly.
Zone reclaim decreases the page cache hit ratio. Some desktops don't have
much memory. Cache misses don't only increase latency, they also
cause unnecessary I/O. Desktops don't have as much I/O bandwidth as
servers or HPC machines, so the extra I/O hurts badly.
The inter-zone imbalance issue causes a further decrease in the cache hit ratio.
> Sorry for being difficult about this, but you are tweaking a knob that
> completely changes performance for my typical workload. Reclaim has
> been the source of great frustration for me over the years.
>
> Hopefully this is not arrogance on my part, but if you went back to
> something equivalent to my earlier patch which allowed the architecture
> to decide the default, I would go back to not objecting despite the lack
> of proof this is the right fix. You never did specify what was wrong
> with that patch. It was simple to understand, accomplished your needs
> as well as mine, allowed flexibility in implementing the default as the
> #define could be expanded to include arch specific checks if sub-arches
> find they need a different default than the rest of the arch. Compared to
> "Just remove the default", that seems preferable.
Firstly, I'd say I think your patch is well worth considering.
However, your past explanation is really wrong and bogus.
You wrote
> If this imbalance is an x86_64 only problem, then we could do something
> simple like the following untested patch. This leaves the default
> for everyone except x86_64.
and I wrote that it isn't true. Since then, you haven't provided any additional
explanation.
Nobody acks a CODE-ONLY PATCH. _You_ have to explain _why_ you think
your approach is better.
* Re: [PATCH v3] zone_reclaim is always 0 by default
2009-05-24 13:44 ` KOSAKI Motohiro
@ 2009-05-25 11:41 ` Robin Holt
2009-05-27 8:06 ` KOSAKI Motohiro
0 siblings, 1 reply; 8+ messages in thread
From: Robin Holt @ 2009-05-25 11:41 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: Robin Holt, LKML, linux-mm, Andrew Morton, Rik van Riel,
Christoph Lameter, Zhang, Yanmin, Wu Fengguang
On Sun, May 24, 2009 at 10:44:29PM +0900, KOSAKI Motohiro wrote:
...
> > Your root cause analysis is suspect. You found a knob to turn which
> > suddenly improved performance for one specific un-tuned server workload.
...
> The fact is, workload dependency charactetistics of zone reclaim is
> widely known from very ago.
> Even Documentaion/sysctl/vm.txt said,
>
> > It may be beneficial to switch off zone reclaim if the system is
> > used for a file server and all of memory should be used for caching files
> > from disk. In that case the caching effect is more important than
> > data locality.
>
> Nobody except you oppose this.
I don't disagree with that statement. I agree this is a workload specific
tuneable that for the case where you want to use the system for nothing
other than file serving, you need to turn it off. It has been this way
for ages. I am saying let's not change that default behavior.
> > How did you determine better by default? I think we already established
> > that apache is a server workload and not a desktop workload. Earlier
> > you were arguing that we need this turned off to improve the desktop
> > environment. You have not established this improves desktop performance.
> > Actually, you have not established it improves apache performance or
> > server performance. You have documented it improves memory utilization,
> > but that is not always the same as faster.
>
> The fact is, low-end machine performace depend on cache hitting ratio widely.
> improving memory utilization mean improving cache hitting ratio.
>
> Plus, I already explained about desktop use case. multiple worst case scenario
> can happend on it easily.
>
> if big process consume memory rather than node size, zone-reclaim
> decrease performance largely.
It may improve performance as well. I agree we can come up with
theoretical cases that show both. I am asking for documented cases where
it does. Your original post indicated an apache regression. In that
case apache was being used under server type loads. If you have a machine
with this condition, you should probably be considered the exception.
> zone reclaim decrease page-cache hitting ratio. some desktop don't have
> much memory. cache missies does'nt only increase latency, but also
> increase unnecessary I/O. desktop don't have rich I/O bandwidth rather than
> server or hpc. it makes bad I/O affect.
If low I/O performance should be turning it off, then shouldn't that
case be coded into the default as opposed to changing the default to
match your specific opinion?
> However, your past explanation is really wrong and bogus.
> I wrote
>
> > If this imbalance is an x86_64 only problem, then we could do something
> > simple like the following untested patch. This leaves the default
> > for everyone except x86_64.
>
> and I wrote it isn't true. after that, you haven't provide addisional
> explanation.
I don't recall seeing your response. Sorry, but this has been, and will
remain, low priority for me. If the default gets changed, we will detect
the performance regression very early after we start testing this bad
of a change on a low memory machine and then we will put a tweak into
place at the next distro release to turn this off following boot.
> Nobody ack CODE-ONLY-PATCH. _You_ have to explain _why_ you think
> your approach is better.
Because it doesn't throw out a lot of history based upon your opinion of
one server type test found under lab conditions on a poorly tuned machine.
Robin
* Re: [PATCH v3] zone_reclaim is always 0 by default
2009-05-25 11:41 ` Robin Holt
@ 2009-05-27 8:06 ` KOSAKI Motohiro
2009-05-27 9:50 ` Robin Holt
0 siblings, 1 reply; 8+ messages in thread
From: KOSAKI Motohiro @ 2009-05-27 8:06 UTC (permalink / raw)
To: Robin Holt
Cc: kosaki.motohiro, LKML, linux-mm, Andrew Morton, Rik van Riel,
Christoph Lameter, Zhang, Yanmin, Wu Fengguang
> On Sun, May 24, 2009 at 10:44:29PM +0900, KOSAKI Motohiro wrote:
> ...
> > > Your root cause analysis is suspect. You found a knob to turn which
> > > suddenly improved performance for one specific un-tuned server workload.
> ...
> > The fact is, workload dependency charactetistics of zone reclaim is
> > widely known from very ago.
> > Even Documentaion/sysctl/vm.txt said,
> >
> > > It may be beneficial to switch off zone reclaim if the system is
> > > used for a file server and all of memory should be used for caching files
> > > from disk. In that case the caching effect is more important than
> > > data locality.
> >
> > Nobody except you oppose this.
>
> I don't disagree with that statement. I agree this is a workload specific
> tuneable that for the case where you want to use the system for nothing
> other than file serving, you need to turn it off. It has been this way
> for ages. I am saying let's not change that default behavior.
>
> > > How did you determine better by default? I think we already established
> > > that apache is a server workload and not a desktop workload. Earlier
> > > you were arguing that we need this turned off to improve the desktop
> > > environment. You have not established this improves desktop performance.
> > > Actually, you have not established it improves apache performance or
> > > server performance. You have documented it improves memory utilization,
> > > but that is not always the same as faster.
> >
> > The fact is, low-end machine performace depend on cache hitting ratio widely.
> > improving memory utilization mean improving cache hitting ratio.
> >
> > Plus, I already explained about desktop use case. multiple worst case scenario
> > can happend on it easily.
> >
> > if big process consume memory rather than node size, zone-reclaim
> > decrease performance largely.
>
> It may improve performance as well. I agree we can come up with
> theoretical cases that show both. I am asking for documented cases where
> it does. Your original post indicated an apache regression. In that
> case apache was being used under server type loads. If you have a machine
> with this condition, you should probably be considered the exception.
>
> > zone reclaim decrease page-cache hitting ratio. some desktop don't have
> > much memory. cache missies does'nt only increase latency, but also
> > increase unnecessary I/O. desktop don't have rich I/O bandwidth rather than
> > server or hpc. it makes bad I/O affect.
>
> If low I/O performance should be turning it off, then shouldn't that
> case be coded into the default as opposed to changing the default to
> match your specific opinion?
>
> > However, your past explanation is really wrong and bogus.
> > I wrote
> >
> > > If this imbalance is an x86_64 only problem, then we could do something
> > > simple like the following untested patch. This leaves the default
> > > for everyone except x86_64.
> >
> > and I wrote it isn't true. after that, you haven't provide addisional
> > explanation.
>
> I don't recall seeing your response. Sorry, but this has been, and will
> remain, low priority for me. If the default gets changed, we will detect
> the performance regression very early after we start testing this bad
> of a change on a low memory machine and then we will put a tweak into
> place at the next distro release to turn this off following boot.
>
> > Nobody ack CODE-ONLY-PATCH. _You_ have to explain _why_ you think
> > your approach is better.
>
> Because it doesn't throw out a lot of history based upon your opinion of
> one server type test found under lab conditions on a poorly tuned machine.
Robin, sorry, if this is all you mean, I can't agree with it. Firstly,
a poorly tuned machine is not wrong at all. Volume-zone server (low-end server)
and desktop people never change kernel parameters. The default parameters should
be optimal for them, because they are the majority of users. Yanmin did test
under proper conditions.
Secondly, a lot of history is not a good enough reason in this case. In the past,
machines with a large remote node distance were very few. They were very expensive.
But Core i7 is cheap. There are thousands of times more Core i7 users than
high-end HPC machine users.
Your last patch is worth considering, but it has one weakness.
In general, "ifdef x86" is the wrong idea. Most minor architectures don't
have sufficient testers, and differences from x86 often cause bugs.
So unnecessary differences are disliked by many people.
So, I think we have two choices.
1. Remove the zone_reclaim default setting completely (this patch).
2. Only PowerPC and IA64 have a default zone_reclaim_mode setting;
other architectures always use zone_reclaim_mode=0.
That is, machines with a large remote node distance exist only on ia64 and power
as a matter of practice (nobody sells high-end Linux on parisc or sparc).
Changing "as a matter of practice" to "formally" does not cause the risk you
are worried about.
It's your turn. Comments?
* Re: [PATCH v3] zone_reclaim is always 0 by default
2009-05-27 8:06 ` KOSAKI Motohiro
@ 2009-05-27 9:50 ` Robin Holt
2009-05-28 4:30 ` KOSAKI Motohiro
0 siblings, 1 reply; 8+ messages in thread
From: Robin Holt @ 2009-05-27 9:50 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: Robin Holt, LKML, linux-mm, Andrew Morton, Rik van Riel,
Christoph Lameter, Zhang, Yanmin, Wu Fengguang
On Wed, May 27, 2009 at 05:06:18PM +0900, KOSAKI Motohiro wrote:
> your last patch is one of considerable thing. but it has one weakness.
> in general "ifdef x86" is wrong idea. almost minor architecture don't
> have sufficient tester. the difference against x86 often makes bug.
> Then, unnecessary difference is hated by much people.
Let me start by saying I can barely understand this entire email.
I appreciate that English is a second language for you and you are
doing a service to the Linux community with your contributions despite
the language barrier. I commend you for your efforts. I do ask that if
there was more information contained in your email than I am replying to,
please reword it so I may understand.
IIRC, my last patch made it an arch header option to set zone_reclaim_mode
to any value it desired while leaving the default as 1. The only arch
that changed the default was x86 (both 32 and 64 bit). That seems the
least disruptive to existing users.
> So, I think we have two selectable choice.
>
> 1. remove zone_reclaim default setting completely (this patch)
> 2. Only PowerPC and IA64 have default zone_reclaim_mode settings,
> other architecture always use zone_reclaim_mode=0.
Looks like 2 is the inverse of my patch. That is fine as well. The only
reason I formed the patch with the default of 1 and override on x86 is
it was one less line of change and one less file.
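The two variants discussed here can be read as the same "generic default plus arch override" pattern with the values swapped. A minimal sketch, using made-up macro names (ARCH_ZONE_RECLAIM_DEFAULT, EXAMPLE_ARCH_WANTS_NO_ZONE_RECLAIM) that do not come from either patch:

```c
/*
 * Arch header part (hypothetical): an architecture that wants a different
 * default defines the macro before the generic header is processed.
 */
#ifdef EXAMPLE_ARCH_WANTS_NO_ZONE_RECLAIM
#define ARCH_ZONE_RECLAIM_DEFAULT 0
#endif

/* Generic header part (hypothetical): fall back to 1 without an override. */
#ifndef ARCH_ZONE_RECLAIM_DEFAULT
#define ARCH_ZONE_RECLAIM_DEFAULT 1
#endif

/* Value zone_reclaim_mode would start with in this model. */
static int initial_zone_reclaim_mode(void)
{
    return ARCH_ZONE_RECLAIM_DEFAULT;
}
```

Built as is, the function returns 1; built with -DEXAMPLE_ARCH_WANTS_NO_ZONE_RECLAIM it returns 0. Option 2 would be the same shape with the fallback set to 0 and the override living in the ia64 and powerpc headers.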
Thanks,
Robin
* Re: [PATCH v3] zone_reclaim is always 0 by default
2009-05-27 9:50 ` Robin Holt
@ 2009-05-28 4:30 ` KOSAKI Motohiro
0 siblings, 0 replies; 8+ messages in thread
From: KOSAKI Motohiro @ 2009-05-28 4:30 UTC (permalink / raw)
To: Robin Holt
Cc: kosaki.motohiro, LKML, linux-mm, Andrew Morton, Rik van Riel,
Christoph Lameter, Zhang, Yanmin, Wu Fengguang
> On Wed, May 27, 2009 at 05:06:18PM +0900, KOSAKI Motohiro wrote:
> > your last patch is one of considerable thing. but it has one weakness.
> > in general "ifdef x86" is wrong idea. almost minor architecture don't
> > have sufficient tester. the difference against x86 often makes bug.
> > Then, unnecessary difference is hated by much people.
>
> Let me start by saying I can barely understand this entire email.
> I appreciate that english is a second language for you and you are
> doing a service to the linux community with your contributions despite
> the language barrier. I commend you for your efforts. I do ask that if
> there was more information contained in your email than I am replying too,
> please reword it so I may understand.
>
> IIRC, my last patch made it an arch header option to set zone_reclaim_mode
> to any value it desired while leaving the default as 1. The only arch
> that changed the default was x86 (both 32 and 64 bit). That seems the
> least disruptive to existing users.
>
> > So, I think we have two selectable choice.
> >
> > 1. remove zone_reclaim default setting completely (this patch)
> > 2. Only PowerPC and IA64 have default zone_reclaim_mode settings,
> > other architecture always use zone_reclaim_mode=0.
>
> Looks like 2 is the inverse of my patch. That is fine as well. The only
> reason I formed the patch with the default of 1 and override on x86 is
> it was one less line of change and one less file.
OK. I appreciate that we have reached a good agreement.
I'll try to make patch (2) this weekend.
Thanks.