linux-mm.kvack.org archive mirror
* [resend][PATCH] mm: increase RECLAIM_DISTANCE to 30
@ 2010-10-08  1:48 KOSAKI Motohiro
  2010-10-08  9:04 ` Balbir Singh
  0 siblings, 1 reply; 14+ messages in thread
From: KOSAKI Motohiro @ 2010-10-08  1:48 UTC (permalink / raw)
  To: Christoph Lameter, Mel Gorman, Rob Mueller, linux-kernel,
	Bron Gondwana, linux-mm, David Rientjes
  Cc: kosaki.motohiro

Recently, Robert Mueller reported that zone_reclaim_mode doesn't work
properly on his new NUMA server (Dual Xeon E5520 + Intel S5520UR MB).
He is using Cyrus IMAPd, which is built on a very traditional
process-per-connection model:

  * a master process which reads config files and manages the other
    processes
  * multiple imapd processes, one per connection
  * multiple pop3d processes, one per connection
  * multiple lmtpd processes, one per connection
  * periodic "cleanup" processes.

As a result, there are thousands of independent processes. The problem
is that recent Intel motherboards cause zone_reclaim_mode to be turned
on by default, and traditional prefork-model software doesn't work well
with it. Unfortunately, this model is still typical even in the 21st
century. We can't ignore it.

This patch raises the zone_reclaim_mode threshold to 30. The value 30
has no specific meaning, but 20 means one-hop QPI/HyperTransport, and
such relatively cheap 2-4 socket machines are often used for
traditional servers like the one above. The intention is that these
machines don't use zone_reclaim_mode.

Note: ia64 and Power have arch-specific RECLAIM_DISTANCE definitions,
so this patch doesn't change the behavior of such high-end NUMA
machines.

Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Bron Gondwana <brong@fastmail.fm>
Cc: Robert Mueller <robm@fastmail.fm>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 include/linux/topology.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/topology.h b/include/linux/topology.h
index 64e084f..bfbec49 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -60,7 +60,7 @@ int arch_update_cpu_topology(void);
  * (in whatever arch specific measurement units returned by node_distance())
  * then switch on zone reclaim on boot.
  */
-#define RECLAIM_DISTANCE 20
+#define RECLAIM_DISTANCE 30
 #endif
 #ifndef PENALTY_FOR_NODE_WITH_CPUS
 #define PENALTY_FOR_NODE_WITH_CPUS	(1)
-- 
1.6.5.2
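
For readers unfamiliar with the override mechanism behind the note on
arch-specific definitions, here is a minimal sketch of how an #ifndef
guard lets an architecture supply its own value. The
ARCH_HAS_OWN_RECLAIM_DISTANCE switch and the value 15 are assumptions
made up for illustration; they are not copied from any arch header.

```c
/*
 * Sketch only: models an arch header that is seen before the generic
 * one.  ARCH_HAS_OWN_RECLAIM_DISTANCE and the value 15 are hypothetical.
 */
#define ARCH_HAS_OWN_RECLAIM_DISTANCE 1

#if ARCH_HAS_OWN_RECLAIM_DISTANCE
#define RECLAIM_DISTANCE 15	/* arch-specific value wins */
#endif

/* Generic fallback, mirroring the guarded define touched by this patch. */
#ifndef RECLAIM_DISTANCE
#define RECLAIM_DISTANCE 30
#endif

/* Returns whichever definition the preprocessor ended up keeping. */
static int effective_reclaim_distance(void)
{
	return RECLAIM_DISTANCE;
}
```

Because the generic define sits behind the guard, changing 20 to 30
there has no effect on a build where the arch definition came first.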



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [resend][PATCH] mm: increase RECLAIM_DISTANCE to 30
  2010-10-08  1:48 [resend][PATCH] mm: increase RECLAIM_DISTANCE to 30 KOSAKI Motohiro
@ 2010-10-08  9:04 ` Balbir Singh
  2010-10-08 15:45   ` Christoph Lameter
  2010-10-12  1:55   ` KOSAKI Motohiro
  0 siblings, 2 replies; 14+ messages in thread
From: Balbir Singh @ 2010-10-08  9:04 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Christoph Lameter, Mel Gorman, Rob Mueller, linux-kernel,
	Bron Gondwana, linux-mm, David Rientjes

* KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> [2010-10-08 10:48:26]:

> Recently, Robert Mueller reported that zone_reclaim_mode doesn't work
> properly on his new NUMA server (Dual Xeon E5520 + Intel S5520UR MB).
> He is using Cyrus IMAPd, which is built on a very traditional
> process-per-connection model:
>
>   * a master process which reads config files and manages the other
>     processes
>   * multiple imapd processes, one per connection
>   * multiple pop3d processes, one per connection
>   * multiple lmtpd processes, one per connection
>   * periodic "cleanup" processes.
>
> As a result, there are thousands of independent processes. The problem
> is that recent Intel motherboards cause zone_reclaim_mode to be turned
> on by default, and traditional prefork-model software doesn't work well
> with it. Unfortunately, this model is still typical even in the 21st
> century. We can't ignore it.
>
> This patch raises the zone_reclaim_mode threshold to 30. The value 30
> has no specific meaning, but 20 means one-hop QPI/HyperTransport, and
> such relatively cheap 2-4 socket machines are often used for
> traditional servers like the one above. The intention is that these
> machines don't use zone_reclaim_mode.
>
> Note: ia64 and Power have arch-specific RECLAIM_DISTANCE definitions,
> so this patch doesn't change the behavior of such high-end NUMA
> machines.
> 
> Cc: Mel Gorman <mel@csn.ul.ie>
> Cc: Bron Gondwana <brong@fastmail.fm>
> Cc: Robert Mueller <robm@fastmail.fm>
> Acked-by: Christoph Lameter <cl@linux.com>
> Acked-by: David Rientjes <rientjes@google.com>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> ---
>  include/linux/topology.h |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/include/linux/topology.h b/include/linux/topology.h
> index 64e084f..bfbec49 100644
> --- a/include/linux/topology.h
> +++ b/include/linux/topology.h
> @@ -60,7 +60,7 @@ int arch_update_cpu_topology(void);
>   * (in whatever arch specific measurement units returned by node_distance())
>   * then switch on zone reclaim on boot.
>   */
> -#define RECLAIM_DISTANCE 20
> +#define RECLAIM_DISTANCE 30
>  #endif
>  #ifndef PENALTY_FOR_NODE_WITH_CPUS
>  #define PENALTY_FOR_NODE_WITH_CPUS	(1)

I am not sure if this makes sense, since RECLAIM_DISTANCE is supposed
to be a hardware parameter. Could you please help clarify how the
access latency of a node with RECLAIM_DISTANCE 20 compares to that of
a node with RECLAIM_DISTANCE 30? Has the hardware definition of
reclaim distance changed?

I suspect the side effect is that zone_reclaim_mode is not set to 1
on bootup for the 2-4 socket machines you mention, which results in
better VM behaviour?

-- 
	Three Cheers,
	Balbir


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [resend][PATCH] mm: increase RECLAIM_DISTANCE to 30
  2010-10-08  9:04 ` Balbir Singh
@ 2010-10-08 15:45   ` Christoph Lameter
  2010-10-08 16:59     ` Balbir Singh
  2010-10-12  1:55   ` KOSAKI Motohiro
  1 sibling, 1 reply; 14+ messages in thread
From: Christoph Lameter @ 2010-10-08 15:45 UTC (permalink / raw)
  To: Balbir Singh
  Cc: KOSAKI Motohiro, Mel Gorman, Rob Mueller, linux-kernel,
	Bron Gondwana, linux-mm, David Rientjes

On Fri, 8 Oct 2010, Balbir Singh wrote:

> I am not sure if this makes sense, since RECLAIM_DISTANCE is supposed
> to be a hardware parameter. Could you please help clarify how the
> access latency of a node with RECLAIM_DISTANCE 20 compares to that of
> a node with RECLAIM_DISTANCE 30? Has the hardware definition of
> reclaim distance changed?

10 is the local distance. So 30 should be 3x the latency that a local
access takes.
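
As a small worked example of that arithmetic, assuming the reported
distance really is proportional to access latency (the helper name
relative_cost() is made up for this sketch):

```c
#define LOCAL_DISTANCE 10	/* distance of a node to itself */

/* Converts a node distance into a multiple of the local access cost. */
static double relative_cost(int distance)
{
	return (double)distance / LOCAL_DISTANCE;
}
```

With this, relative_cost(20) is 2.0 and relative_cost(30) is 3.0, so the
old and new thresholds correspond to 2x and 3x the local cost.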

> I suspect the side effect is that zone_reclaim_mode is not set to 1
> on bootup for the 2-4 socket machines you mention, which results in
> better VM behaviour?

Right.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [resend][PATCH] mm: increase RECLAIM_DISTANCE to 30
  2010-10-08 15:45   ` Christoph Lameter
@ 2010-10-08 16:59     ` Balbir Singh
  2010-10-08 17:56       ` Christoph Lameter
  0 siblings, 1 reply; 14+ messages in thread
From: Balbir Singh @ 2010-10-08 16:59 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: KOSAKI Motohiro, Mel Gorman, Rob Mueller, linux-kernel,
	Bron Gondwana, linux-mm, David Rientjes

* Christoph Lameter <cl@linux.com> [2010-10-08 10:45:16]:

> On Fri, 8 Oct 2010, Balbir Singh wrote:
> 
> > I am not sure if this makes sense, since RECLAIM_DISTANCE is supposed
> > to be a hardware parameter. Could you please help clarify how the
> > access latency of a node with RECLAIM_DISTANCE 20 compares to that of
> > a node with RECLAIM_DISTANCE 30? Has the hardware definition of
> > reclaim distance changed?
> 
> 10 is the local distance. So 30 should be 3x the latency that a local
> access takes.
>

Does this patch then imply that we should do zone_reclaim only for 3x
nodes and not 2x nodes as we did earlier?
 
> > I suspect the side effect is that zone_reclaim_mode is not set to 1
> > on bootup for the 2-4 socket machines you mention, which results in
> > better VM behaviour?
> 
> Right.
> 

-- 
	Three Cheers,
	Balbir


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [resend][PATCH] mm: increase RECLAIM_DISTANCE to 30
  2010-10-08 16:59     ` Balbir Singh
@ 2010-10-08 17:56       ` Christoph Lameter
  2010-10-12  2:11         ` David Rientjes
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Lameter @ 2010-10-08 17:56 UTC (permalink / raw)
  To: Balbir Singh
  Cc: KOSAKI Motohiro, Mel Gorman, Rob Mueller, linux-kernel,
	Bron Gondwana, linux-mm, David Rientjes

On Fri, 8 Oct 2010, Balbir Singh wrote:

> * Christoph Lameter <cl@linux.com> [2010-10-08 10:45:16]:
>
> > On Fri, 8 Oct 2010, Balbir Singh wrote:
> >
> > > I am not sure if this makes sense, since RECLAIM_DISTANCE is supposed
> > > to be a hardware parameter. Could you please help clarify how the
> > > access latency of a node with RECLAIM_DISTANCE 20 compares to that of
> > > a node with RECLAIM_DISTANCE 30? Has the hardware definition of
> > > reclaim distance changed?
> >
> > 10 is the local distance. So 30 should be 3x the latency that a local
> > access takes.
> >
>
> Does this patch then imply that we should do zone_reclaim only for 3x
> nodes and not 2x nodes as we did earlier?

It implies that zone reclaim is going to be automatically enabled if the
maximum latency to the memory farthest away is 3 times or more that of a
local memory access.
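
A minimal userspace sketch of that boot-time decision, assuming the
check is a strict "greater than" comparison applied to every pair of
nodes (with a non-strict comparison, a distance of exactly 30 would
also qualify). zone_reclaim_wanted() and the distance tables in the
usage note are hypothetical and only model the behaviour; this is not
the kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns true if any pair of nodes in an n x n
 * row-major distance table is farther apart than `threshold`.
 */
static bool zone_reclaim_wanted(const int *dist, size_t n, int threshold)
{
	for (size_t i = 0; i < n; i++)
		for (size_t j = 0; j < n; j++)
			if (dist[i * n + j] > threshold)
				return true;	/* one far node is enough */
	return false;
}
```

For an illustrative two-node table { 10, 21, 21, 10 }, the function
returns true with a threshold of 20 and false with a threshold of 30,
which is the change in default behaviour this patch is after.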


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [resend][PATCH] mm: increase RECLAIM_DISTANCE to 30
  2010-10-08  9:04 ` Balbir Singh
  2010-10-08 15:45   ` Christoph Lameter
@ 2010-10-12  1:55   ` KOSAKI Motohiro
  1 sibling, 0 replies; 14+ messages in thread
From: KOSAKI Motohiro @ 2010-10-12  1:55 UTC (permalink / raw)
  To: balbir
  Cc: kosaki.motohiro, Christoph Lameter, Mel Gorman, Rob Mueller,
	linux-kernel, Bron Gondwana, linux-mm, David Rientjes

Hi

> > -#define RECLAIM_DISTANCE 20
> > +#define RECLAIM_DISTANCE 30
> >  #endif
> >  #ifndef PENALTY_FOR_NODE_WITH_CPUS
> >  #define PENALTY_FOR_NODE_WITH_CPUS	(1)
> 
> I am not sure if this makes sense, since RECLAIM_DISTANCE is supposed
> to be a hardware parameter. Could you please help clarify how the
> access latency of a node with RECLAIM_DISTANCE 20 compares to that of
> a node with RECLAIM_DISTANCE 30? Has the hardware definition of
> reclaim distance changed?

Recently, Intel and AMD implemented QPI and HyperTransport on their CPUs.
As a result, the average node distance of commodity servers changed
dramatically, and our threshold no longer fits the typical case.

So my intention is that commodity servers continue not to use
zone_reclaim_mode, because their workloads haven't changed.

The value 30 itself doesn't have a strong meaning.

> I suspect the side effect is that zone_reclaim_mode is not set to 1
> on bootup for the 2-4 socket machines you mention, which results in
> better VM behaviour?

It depends on the workload. If you are running a file/web/email server
(i.e. the most common case), it's better, but HPC workloads don't work
as well.




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [resend][PATCH] mm: increase RECLAIM_DISTANCE to 30
  2010-10-08 17:56       ` Christoph Lameter
@ 2010-10-12  2:11         ` David Rientjes
  2010-10-12  2:17           ` KOSAKI Motohiro
  0 siblings, 1 reply; 14+ messages in thread
From: David Rientjes @ 2010-10-12  2:11 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Balbir Singh, KOSAKI Motohiro, Mel Gorman, Rob Mueller,
	linux-kernel, Bron Gondwana, linux-mm

On Fri, 8 Oct 2010, Christoph Lameter wrote:

> It implies that zone reclaim is going to be automatically enabled if the
> maximum latency to the memory farthest away is 3 times or more that of a
> local memory access.
> 

It doesn't determine what the maximum latency to that memory is, it relies 
on whatever was defined in the SLIT; the only semantics of that distance 
comes from the ACPI spec that states those distances are relative to the 
local distance of 10.
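
The distances the kernel ended up with can be inspected from userspace.
Below is a minimal sketch that parses one line in the format of
/sys/devices/system/node/nodeN/distance (space-separated integers).
parse_distances() is a hypothetical helper, and the sample line in the
usage note is illustrative rather than measured.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Parses space-separated integers from `line` into `out` (at most `max`
 * entries) and returns how many were stored.
 */
static size_t parse_distances(const char *line, int *out, size_t max)
{
	size_t n = 0;

	while (n < max) {
		char *end;
		long v = strtol(line, &end, 10);

		if (end == line)	/* no further number found */
			break;
		out[n++] = (int)v;
		line = end;
	}
	return n;
}
```

Given the line "10 21", the helper stores 10 and 21 and returns 2; each
value is only meaningful relative to the local distance of 10.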


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [resend][PATCH] mm: increase RECLAIM_DISTANCE to 30
  2010-10-12  2:11         ` David Rientjes
@ 2010-10-12  2:17           ` KOSAKI Motohiro
  2010-10-12  3:12             ` David Rientjes
  0 siblings, 1 reply; 14+ messages in thread
From: KOSAKI Motohiro @ 2010-10-12  2:17 UTC (permalink / raw)
  To: David Rientjes
  Cc: kosaki.motohiro, Christoph Lameter, Balbir Singh, Mel Gorman,
	Rob Mueller, linux-kernel, Bron Gondwana, linux-mm

> On Fri, 8 Oct 2010, Christoph Lameter wrote:
> 
> > It implies that zone reclaim is going to be automatically enabled if the
> > maximum latency to the memory farthest away is 3 times or more that of a
> > local memory access.
> > 
> 
> It doesn't determine what the maximum latency to that memory is, it relies 
> on whatever was defined in the SLIT; the only semantics of that distance 
> comes from the ACPI spec that states those distances are relative to the 
> local distance of 10.

Right, but do we need to consider the fake SLIT case? I know such bogus
SLITs actually exist, but I haven't seen one cause serious trouble.




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [resend][PATCH] mm: increase RECLAIM_DISTANCE to 30
  2010-10-12  2:17           ` KOSAKI Motohiro
@ 2010-10-12  3:12             ` David Rientjes
  2010-10-12  4:07               ` KOSAKI Motohiro
  0 siblings, 1 reply; 14+ messages in thread
From: David Rientjes @ 2010-10-12  3:12 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Christoph Lameter, Balbir Singh, Mel Gorman, Rob Mueller,
	linux-kernel, Bron Gondwana, linux-mm

On Tue, 12 Oct 2010, KOSAKI Motohiro wrote:

> > It doesn't determine what the maximum latency to that memory is, it relies 
> > on whatever was defined in the SLIT; the only semantics of that distance 
> > comes from the ACPI spec that states those distances are relative to the 
> > local distance of 10.
> 
> Right, but do we need to consider the fake SLIT case? I know such bogus
> SLITs actually exist, but I haven't seen one cause serious trouble.
> 

If we can make the assumption that the SLIT entries are truly
representative of the latencies and are adhering to the semantics
presented in the ACPI spec, then this means the VM prefers to do zone
reclaim rather than allocate from other nodes when the latter is 3x more
costly.

That's fine by me, as I've mentioned we've done this for a couple years 
because we've had to explicitly disable zone_reclaim_mode for such 
configurations.  If that's the policy decision that's been made, though, 
we _could_ measure the cost at boot and set zone_reclaim_mode depending on 
the measured latency rather than relying on the SLIT at all in this case.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [resend][PATCH] mm: increase RECLAIM_DISTANCE to 30
  2010-10-12  3:12             ` David Rientjes
@ 2010-10-12  4:07               ` KOSAKI Motohiro
  2010-10-12  6:41                 ` Balbir Singh
  0 siblings, 1 reply; 14+ messages in thread
From: KOSAKI Motohiro @ 2010-10-12  4:07 UTC (permalink / raw)
  To: David Rientjes
  Cc: kosaki.motohiro, Christoph Lameter, Balbir Singh, Mel Gorman,
	Rob Mueller, linux-kernel, Bron Gondwana, linux-mm

> On Tue, 12 Oct 2010, KOSAKI Motohiro wrote:
> 
> > > It doesn't determine what the maximum latency to that memory is, it relies 
> > > on whatever was defined in the SLIT; the only semantics of that distance 
> > > comes from the ACPI spec that states those distances are relative to the 
> > > local distance of 10.
> > 
> > Right, but do we need to consider the fake SLIT case? I know such bogus
> > SLITs actually exist, but I haven't seen one cause serious trouble.
> > 
> 
> If we can make the assumption that the SLIT entries are truly
> representative of the latencies and are adhering to the semantics
> presented in the ACPI spec, then this means the VM prefers to do zone
> reclaim rather than allocate from other nodes when the latter is 3x more
> costly.
> 
> That's fine by me, as I've mentioned we've done this for a couple years 
> because we've had to explicitly disable zone_reclaim_mode for such 
> configurations.  If that's the policy decision that's been made, though, 
> we _could_ measure the cost at boot and set zone_reclaim_mode depending on 
> the measured latency rather than relying on the SLIT at all in this case.

ok, got it. thanks.




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [resend][PATCH] mm: increase RECLAIM_DISTANCE to 30
  2010-10-12  4:07               ` KOSAKI Motohiro
@ 2010-10-12  6:41                 ` Balbir Singh
  0 siblings, 0 replies; 14+ messages in thread
From: Balbir Singh @ 2010-10-12  6:41 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: David Rientjes, Christoph Lameter, Mel Gorman, Rob Mueller,
	linux-kernel, Bron Gondwana, linux-mm

* KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> [2010-10-12 13:07:35]:

> > On Tue, 12 Oct 2010, KOSAKI Motohiro wrote:
> > 
> > > > It doesn't determine what the maximum latency to that memory is, it relies 
> > > > on whatever was defined in the SLIT; the only semantics of that distance 
> > > > comes from the ACPI spec that states those distances are relative to the 
> > > > local distance of 10.
> > > 
> > > Right, but do we need to consider the fake SLIT case? I know such bogus
> > > SLITs actually exist, but I haven't seen one cause serious trouble.
> > > 
> > 
> > If we can make the assumption that the SLIT entries are truly
> > representative of the latencies and are adhering to the semantics
> > presented in the ACPI spec, then this means the VM prefers to do zone
> > reclaim rather than allocate from other nodes when the latter is 3x more
> > costly.
> > 
> > That's fine by me, as I've mentioned we've done this for a couple years 
> > because we've had to explicitly disable zone_reclaim_mode for such 
> > configurations.  If that's the policy decision that's been made, though, 
> > we _could_ measure the cost at boot and set zone_reclaim_mode depending on 
> > the measured latency rather than relying on the SLIT at all in this case.
> 
> ok, got it. thanks.
>

Could we please document the change and help people understand why,
with newer kernels, they may see the value of zone_reclaim_mode change
on their systems, and how to set it back if required?
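
As a possible starting point for such documentation: the current value
can be read from, and set back through, /proc/sys/vm/zone_reclaim_mode
(writing needs root). A minimal sketch follows; the helper names
read_int_file() and write_int_file() are hypothetical, and the path is
passed in so the helpers can be tried on an ordinary file first.

```c
#include <stdio.h>

/* Reads one integer from `path`; returns 0 on success, -1 on failure. */
static int read_int_file(const char *path, int *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%d", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Writes `value` and a newline to `path`; 0 on success, -1 on failure. */
static int write_int_file(const char *path, int value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = (fprintf(f, "%d\n", value) > 0);
	/* fclose() flushes, so a rejected write may only show up here. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Calling write_int_file("/proc/sys/vm/zone_reclaim_mode", 0) turns zone
reclaim off, and writing 1 turns it back on; the setting does not
survive a reboot unless it is also put in the system's sysctl
configuration.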

-- 
	Three Cheers,
	Balbir


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [resend][PATCH] mm: increase RECLAIM_DISTANCE to 30
  2010-10-25  3:24 KOSAKI Motohiro
@ 2010-10-25  4:35 ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 14+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-10-25  4:35 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Christoph Lameter, Mel Gorman, Rob Mueller, linux-kernel,
	Bron Gondwana, linux-mm, David Rientjes, Andrew Morton,
	Balbir Singh

On Mon, 25 Oct 2010 12:24:24 +0900 (JST)
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:

> Recently, Robert Mueller reported that zone_reclaim_mode doesn't work
> properly on his new NUMA server (Dual Xeon E5520 + Intel S5520UR MB).
> He is using Cyrus IMAPd, which is built on a very traditional
> process-per-connection model:
>
>   * a master process which reads config files and manages the other
>     processes
>   * multiple imapd processes, one per connection
>   * multiple pop3d processes, one per connection
>   * multiple lmtpd processes, one per connection
>   * periodic "cleanup" processes.
>
> As a result, there are thousands of independent processes. The problem
> is that recent Intel motherboards cause zone_reclaim_mode to be turned
> on by default, and traditional prefork-model software doesn't work well
> with it. Unfortunately, this model is still typical even in the 21st
> century. We can't ignore it.
>
> This patch raises the zone_reclaim_mode threshold to 30. The value 30
> has no specific meaning, but 20 means one-hop QPI/HyperTransport, and
> such relatively cheap 2-4 socket machines are often used for
> traditional servers like the one above. The intention is that these
> machines don't use zone_reclaim_mode.
>
> Note: ia64 and Power have arch-specific RECLAIM_DISTANCE definitions,
> so this patch doesn't change the behavior of such high-end NUMA
> machines.
> 
> Cc: Mel Gorman <mel@csn.ul.ie>
> Cc: Bron Gondwana <brong@fastmail.fm>
> Cc: Robert Mueller <robm@fastmail.fm>
> Acked-by: Christoph Lameter <cl@linux.com>
> Acked-by: David Rientjes <rientjes@google.com>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [resend][PATCH] mm: increase RECLAIM_DISTANCE to 30
@ 2010-10-25  3:24 KOSAKI Motohiro
  2010-10-25  4:35 ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 14+ messages in thread
From: KOSAKI Motohiro @ 2010-10-25  3:24 UTC (permalink / raw)
  To: Christoph Lameter, Mel Gorman, Rob Mueller, linux-kernel,
	Bron Gondwana, linux-mm, David Rientjes, Andrew Morton,
	Balbir Singh
  Cc: kosaki.motohiro

Recently, Robert Mueller reported that zone_reclaim_mode doesn't work
properly on his new NUMA server (Dual Xeon E5520 + Intel S5520UR MB).
He is using Cyrus IMAPd, which is built on a very traditional
process-per-connection model:

  * a master process which reads config files and manages the other
    processes
  * multiple imapd processes, one per connection
  * multiple pop3d processes, one per connection
  * multiple lmtpd processes, one per connection
  * periodic "cleanup" processes.

As a result, there are thousands of independent processes. The problem
is that recent Intel motherboards cause zone_reclaim_mode to be turned
on by default, and traditional prefork-model software doesn't work well
with it. Unfortunately, this model is still typical even in the 21st
century. We can't ignore it.

This patch raises the zone_reclaim_mode threshold to 30. The value 30
has no specific meaning, but 20 means one-hop QPI/HyperTransport, and
such relatively cheap 2-4 socket machines are often used for
traditional servers like the one above. The intention is that these
machines don't use zone_reclaim_mode.

Note: ia64 and Power have arch-specific RECLAIM_DISTANCE definitions,
so this patch doesn't change the behavior of such high-end NUMA
machines.

Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Bron Gondwana <brong@fastmail.fm>
Cc: Robert Mueller <robm@fastmail.fm>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 include/linux/topology.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/topology.h b/include/linux/topology.h
index 64e084f..bfbec49 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -60,7 +60,7 @@ int arch_update_cpu_topology(void);
  * (in whatever arch specific measurement units returned by node_distance())
  * then switch on zone reclaim on boot.
  */
-#define RECLAIM_DISTANCE 20
+#define RECLAIM_DISTANCE 30
 #endif
 #ifndef PENALTY_FOR_NODE_WITH_CPUS
 #define PENALTY_FOR_NODE_WITH_CPUS	(1)
-- 
1.6.5.2




^ permalink raw reply	[flat|nested] 14+ messages in thread

* [resend][PATCH] mm: increase RECLAIM_DISTANCE to 30
@ 2010-10-08  1:48 KOSAKI Motohiro
  0 siblings, 0 replies; 14+ messages in thread
From: KOSAKI Motohiro @ 2010-10-08  1:48 UTC (permalink / raw)
  To: Christoph Lameter, Mel Gorman, Rob Mueller, linux-kernel, Bron Gondwana
  Cc: kosaki.motohiro

Recently, Robert Mueller reported that zone_reclaim_mode doesn't work
properly on his new NUMA server (Dual Xeon E5520 + Intel S5520UR MB).
He is using Cyrus IMAPd, which is built on a very traditional
process-per-connection model:

  * a master process which reads config files and manages the other
    processes
  * multiple imapd processes, one per connection
  * multiple pop3d processes, one per connection
  * multiple lmtpd processes, one per connection
  * periodic "cleanup" processes.

As a result, there are thousands of independent processes. The problem
is that recent Intel motherboards cause zone_reclaim_mode to be turned
on by default, and traditional prefork-model software doesn't work well
with it. Unfortunately, this model is still typical even in the 21st
century. We can't ignore it.

This patch raises the zone_reclaim_mode threshold to 30. The value 30
has no specific meaning, but 20 means one-hop QPI/HyperTransport, and
such relatively cheap 2-4 socket machines are often used for
traditional servers like the one above. The intention is that these
machines don't use zone_reclaim_mode.

Note: ia64 and Power have arch-specific RECLAIM_DISTANCE definitions,
so this patch doesn't change the behavior of such high-end NUMA
machines.

Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Bron Gondwana <brong@fastmail.fm>
Cc: Robert Mueller <robm@fastmail.fm>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 include/linux/topology.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/topology.h b/include/linux/topology.h
index 64e084f..bfbec49 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -60,7 +60,7 @@ int arch_update_cpu_topology(void);
  * (in whatever arch specific measurement units returned by node_distance())
  * then switch on zone reclaim on boot.
  */
-#define RECLAIM_DISTANCE 20
+#define RECLAIM_DISTANCE 30
 #endif
 #ifndef PENALTY_FOR_NODE_WITH_CPUS
 #define PENALTY_FOR_NODE_WITH_CPUS	(1)
-- 
1.6.5.2




^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2010-10-25  4:41 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-10-08  1:48 [resend][PATCH] mm: increase RECLAIM_DISTANCE to 30 KOSAKI Motohiro
2010-10-08  9:04 ` Balbir Singh
2010-10-08 15:45   ` Christoph Lameter
2010-10-08 16:59     ` Balbir Singh
2010-10-08 17:56       ` Christoph Lameter
2010-10-12  2:11         ` David Rientjes
2010-10-12  2:17           ` KOSAKI Motohiro
2010-10-12  3:12             ` David Rientjes
2010-10-12  4:07               ` KOSAKI Motohiro
2010-10-12  6:41                 ` Balbir Singh
2010-10-12  1:55   ` KOSAKI Motohiro
  -- strict thread matches above, loose matches on Subject: below --
2010-10-25  3:24 KOSAKI Motohiro
2010-10-25  4:35 ` KAMEZAWA Hiroyuki
2010-10-08  1:48 KOSAKI Motohiro
