linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Mel Gorman <mgorman@suse.de>
Cc: Andrew Morton <akpm@linuxfoundation.org>,
	Rik van Riel <riel@redhat.com>,
	David Rientjes <rientjes@google.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Fengguang Wu <fengguang.wu@intel.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: page_alloc: Default node-ordering on 64-bit NUMA, zone-ordering on 32-bit v2
Date: Tue, 9 Sep 2014 10:52:42 -0400	[thread overview]
Message-ID: <20140909145242.GA24127@cmpxchg.org> (raw)
In-Reply-To: <20140909144630.GA12309@suse.de>

On Tue, Sep 09, 2014 at 03:46:30PM +0100, Mel Gorman wrote:
> Changelog since v1
> o Default to zone-ordering on 32-bit and remove heuristics
> o Expand changelog
> 
> Zones are allocated by the page allocator in either node or zone order.
> Node ordering is preferred in terms of locality and is applied automatically
> in one of three cases.
> 
>   1. If a node has only low memory
> 
>   2. If DMA/DMA32 is a high percentage of memory
> 
>   3. If low memory on a single node is greater than 70% of the node size
> 
> Otherwise zone ordering is used to preserve low memory for devices that
> require it. Unfortunately a consequence of this is that a machine with
> balanced NUMA nodes will experience different performance characteristics
> depending on which node they happen to start from.
> 
> The point of zone ordering is to protect lower nodes for devices that
> require DMA/DMA32 memory. When NUMA was first introduced, this was critical
> as 32-bit NUMA machines existed and exhausting low memory triggered OOMs
> easily as so many allocations required low memory. On 64-bit machines the
> primary concern is devices that are 32-bit only which is less severe than
> the low memory exhaustion problem on 32-bit NUMA. It seems there are really
> few devices that depends on it.
> 
> AGP -- I assume this is getting more rare but even then I think the allocations
> 	happen early in boot time where lowmem pressure is less of a problem
> 
> DRM -- If the device is 32-bit only then there may be low pressure. I didn't
> 	evaluate these in detail but it looks like some of these are mobile
> 	graphics card. Not many NUMA laptops out there. DRM folk should know
> 	better though.
> 
> Some TV cards -- Much demand for 32-bit capable TV cards on NUMA machines?
> 
> B43 wireless card -- again not really a NUMA thing.
> 
> I cannot find a good reason to incur a performance penalty on all 64-bit NUMA
> machines in case someone throws a brain damanged TV or graphics card in there.
> This patch defaults to node-ordering on 64-bit NUMA machines. I was tempted
> to make it default everywhere but I understand that some embedded arches may
> be using 32-bit NUMA where I cannot predict the consequences.
> 
> The performance impact depends on the workload and the characteristics of the
> machine and the machine I tested on had a large Normal zone on node 0 so the
> impact is within the noise for the majority of tests. The allocation stats
> show more allocation requests were from DMA32 and local node. Running SpecJBB
> with multiple JVMs and automatic NUMA balancing disabled the results were
> 
> specjbb
>                      3.17.0-rc2            3.17.0-rc2
>                         vanilla        nodeorder-v1r1
> Min    1      29534.00 (  0.00%)     30020.00 (  1.65%)
> Min    10    115717.00 (  0.00%)    134038.00 ( 15.83%)
> Min    19    109718.00 (  0.00%)    114186.00 (  4.07%)
> Min    28    104459.00 (  0.00%)    103639.00 ( -0.78%)
> Min    37     98245.00 (  0.00%)    103756.00 (  5.61%)
> Min    46     97198.00 (  0.00%)     96197.00 ( -1.03%)
> Mean   1      30953.25 (  0.00%)     31917.75 (  3.12%)
> Mean   10    124432.50 (  0.00%)    140904.00 ( 13.24%)
> Mean   19    116033.50 (  0.00%)    119294.75 (  2.81%)
> Mean   28    108365.25 (  0.00%)    106879.50 ( -1.37%)
> Mean   37    102984.75 (  0.00%)    106924.25 (  3.83%)
> Mean   46    100783.25 (  0.00%)    105368.50 (  4.55%)
> Stddev 1       1260.38 (  0.00%)      1109.66 ( 11.96%)
> Stddev 10      7434.03 (  0.00%)      5171.91 ( 30.43%)
> Stddev 19      8453.84 (  0.00%)      5309.59 ( 37.19%)
> Stddev 28      4184.55 (  0.00%)      2906.63 ( 30.54%)
> Stddev 37      5409.49 (  0.00%)      3192.12 ( 40.99%)
> Stddev 46      4521.95 (  0.00%)      7392.52 (-63.48%)
> Max    1      32738.00 (  0.00%)     32719.00 ( -0.06%)
> Max    10    136039.00 (  0.00%)    148614.00 (  9.24%)
> Max    19    130566.00 (  0.00%)    127418.00 ( -2.41%)
> Max    28    115404.00 (  0.00%)    111254.00 ( -3.60%)
> Max    37    112118.00 (  0.00%)    111732.00 ( -0.34%)
> Max    46    108541.00 (  0.00%)    116849.00 (  7.65%)
> TPut   1     123813.00 (  0.00%)    127671.00 (  3.12%)
> TPut   10    497730.00 (  0.00%)    563616.00 ( 13.24%)
> TPut   19    464134.00 (  0.00%)    477179.00 (  2.81%)
> TPut   28    433461.00 (  0.00%)    427518.00 ( -1.37%)
> TPut   37    411939.00 (  0.00%)    427697.00 (  3.83%)
> TPut   46    403133.00 (  0.00%)    421474.00 (  4.55%)
> 
>                             3.17.0-rc2  3.17.0-rc2
>                                vanillanodeorder-v1r1
> DMA allocs                           0           0
> DMA32 allocs                        57     1491992
> Normal allocs                 32543566    30026383
> Movable allocs                       0           0
> Direct pages scanned                 0           0
> Kswapd pages scanned                 0           0
> Kswapd pages reclaimed               0           0
> Direct pages reclaimed               0           0
> Kswapd efficiency                 100%        100%
> Kswapd velocity                  0.000       0.000
> Direct efficiency                 100%        100%
> Direct velocity                  0.000       0.000
> Percentage direct scans             0%          0%
> Zone normal velocity             0.000       0.000
> Zone dma32 velocity              0.000       0.000
> Zone dma velocity                0.000       0.000
> THP fault alloc                  55164       52987
> THP collapse alloc                 139         147
> THP splits                          26          21
> NUMA alloc hit                 4169066     4250692
> NUMA alloc miss                      0           0
> 
> Note that there were more DMA32 allocations with the patch applied.  In this
> particular case there was no difference in numa_hit and numa_miss. The
> expectation is that DMA32 was being used at the low watermark instead of
> falling into the slow path. kswapd was not woken but it's not worken for
> THP allocations.
> 
> On 32-bit, this patch defaults to zone-ordering as low memory depletion
> can be a serious problem on 32-bit large memory machines. If the default
> ordering was node then processes on node 0 will deplete the Normal zone
> due to normal activity.  The problem is worse if CONFIG_HIGHPTE is not
> set. If combined with large amounts of dirty/writeback pages in Normal
> zone then there is also a high risk of OOM. The heuristics are removed
> as it's not clear they were ever important on 32-bit. They were only
> relevant for setting node-ordering on 64-bit.
> 
> Signed-off-by: Mel Gorman <mgorman@suse.de>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

      reply	other threads:[~2014-09-09 14:52 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-01 12:55 [PATCH] mm: page_alloc: Default to node-ordering on 64-bit NUMA machines Mel Gorman
2014-09-02  1:12 ` Kamezawa Hiroyuki
2014-09-02 13:51 ` Johannes Weiner
2014-09-02 14:01   ` Kamezawa Hiroyuki
2014-09-02 15:21   ` Mel Gorman
2014-09-04 15:29     ` Johannes Weiner
2014-09-05 10:30       ` Mel Gorman
2014-09-09 14:46         ` [PATCH] mm: page_alloc: Default node-ordering on 64-bit NUMA, zone-ordering on 32-bit v2 Mel Gorman
2014-09-09 14:52           ` Johannes Weiner [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140909145242.GA24127@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linuxfoundation.org \
    --cc=fengguang.wu@intel.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=riel@redhat.com \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox