Re: NUMA? bisected performance regression 3.11->3.12

linux-mm.kvack.org archive mirror

From: Johannes Weiner <hannes@cmpxchg.org>
To: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Linux-MM <linux-mm@kvack.org>, Mel Gorman <mgorman@suse.de>,
	Rik van Riel <riel@redhat.com>, Kevin Hilman <khilman@linaro.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Paul Bolle <paul.bollee@gmail.com>,
	Zlatko Calusic <zcalusic@bitsync.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Andi Kleen <ak@linux.intel.com>
Subject: Re: NUMA? bisected performance regression 3.11->3.12
Date: Fri, 22 Nov 2013 00:22:19 -0500
Message-ID: <20131122052219.GL3556@cmpxchg.org>
In-Reply-To: <528E8FCE.1000707@intel.com>

Hi Dave,

On Thu, Nov 21, 2013 at 02:57:18PM -0800, Dave Hansen wrote:
> Hey Johannes,
> 
> I'm running an open/close microbenchmark from the will-it-scale set:
> > https://github.com/antonblanchard/will-it-scale/blob/master/tests/open1.c
> 
> I was seeing some weird symptoms on 3.12 vs 3.11.  The throughput in
> that test was going down from 50 million to 35 million.
> 
> The profiles show an increase in cpu time in _raw_spin_lock_irq.  The
> profiles pointed to slub code that hasn't been touched in quite a while.
>  I bisected it down to:
> 
> 81c0a2bb515fd4daae8cab64352877480792b515 is the first bad commit
> commit 81c0a2bb515fd4daae8cab64352877480792b515
> Author: Johannes Weiner <hannes@cmpxchg.org>
> Date:   Wed Sep 11 14:20:47 2013 -0700
> 
> Which also seems a bit weird, but I've tested with this and its
> preceding commit enough times to be fairly sure that I did it right.
> 
> __slab_free() and free_one_page() both seem to be spending more time
> spinning on their respective spinlocks, even though the throughput went
> down and we should have been doing fewer actual allocations/frees.  The
> best explanation for this would be if CPUs are tending to go after and
> contending for remote cachelines more often once this patch is applied.
> 
> Any ideas?
> 
> It's an 8-socket/160-thread (one NUMA node per socket) system that is not
> under memory pressure during the test.  The latencies are also such that
> vm.zone_reclaim_mode=0.

With zone_reclaim_mode=0, the change will definitely spread allocations
out to all nodes, and it's plausible that the remote references hurt
kernel object allocations in a tight loop.  Just to confirm, could you
rerun the test with zone_reclaim_mode enabled to make the allocator
stay in the local zones?
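
For reference, a minimal Python sketch for checking the current setting
before and after the rerun.  The bit meanings follow the kernel's vm
sysctl documentation; the helper names are made up for illustration,
and the path assumes a mounted /proc:

```python
# Hypothetical helpers; only the sysctl path and bit values come from
# the kernel's vm sysctl documentation.
ZONE_RECLAIM_BITS = {
    1: "zone reclaim on",
    2: "zone reclaim writes dirty pages out",
    4: "zone reclaim swaps pages",
}


def decode_zone_reclaim_mode(value):
    """Return descriptions of the bits set in a zone_reclaim_mode value."""
    return [desc for bit, desc in ZONE_RECLAIM_BITS.items() if value & bit]


def read_zone_reclaim_mode(path="/proc/sys/vm/zone_reclaim_mode"):
    """Read the current value from procfs (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())
```

A value of 0 decodes to an empty list, i.e. zone reclaim is off.
Enabling it for the rerun means writing 1 to the same file as root, or
running `sysctl -w vm.zone_reclaim_mode=1`.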

The fairness code was written for reclaimable memory, which is
longer-lived, and the only memory where it matters.  I might have to
bypass it for unreclaimable allocations...
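
To illustrate why a fairness policy spreads short-lived allocations
across nodes, here is a toy Python model.  It is not the kernel code:
the zone names, the uniform batch size, and the function name are
assumptions chosen for illustration.  Each zone gets an allocation
batch; the allocator walks the zonelist in order, skips zones whose
batch is used up, and resets all batches once every zone is exhausted:

```python
def allocate(zones, batch, n, fair):
    """Count allocations per zone for n requests from a CPU local to zones[0].

    zones is the zonelist in preference order (local zone first).
    With fair=False every request is served from the local zone.
    """
    remaining = {z: batch for z in zones}
    counts = {z: 0 for z in zones}
    for _ in range(n):
        if not fair:
            counts[zones[0]] += 1
            continue
        # First zone in zonelist order that still has batch left.
        target = next((z for z in zones if remaining[z] > 0), None)
        if target is None:
            # Every batch is exhausted: reset them and start over locally.
            remaining = {z: batch for z in zones}
            target = zones[0]
        remaining[target] -= 1
        counts[target] += 1
    return counts
```

With eight zones, a batch of 4, and 64 requests, the local-first case
serves all 64 from the first zone, while the fair case serves 8 from
each zone, so only one in eight allocations stays local.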

> Raw perf profiles and .config are in here:
> http://www.sr71.net/~dave/intel/201311-wisregress0/
> 
> Here's a chunk of the 'perf diff':
> >     17.65%   +3.47%  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave           
> >     13.80%   -0.31%  [kernel.kallsyms]  [k] _raw_spin_lock                   
> >      7.21%   -0.51%  [unknown]          [.] 0x00007f7849058640               
> >      3.43%   +0.15%  [kernel.kallsyms]  [k] setup_object                     
> >      2.99%   -0.31%  [kernel.kallsyms]  [k] file_free_rcu                    
> >      2.71%   -0.13%  [kernel.kallsyms]  [k] rcu_process_callbacks            
> >      2.26%   -0.09%  [kernel.kallsyms]  [k] get_empty_filp                   
> >      2.06%   -0.09%  [kernel.kallsyms]  [k] kmem_cache_alloc                 
> >      1.65%   -0.08%  [kernel.kallsyms]  [k] link_path_walk                   
> >      1.53%   -0.08%  [kernel.kallsyms]  [k] memset                           
> >      1.46%   -0.09%  [kernel.kallsyms]  [k] do_dentry_open                   
> >      1.44%   -0.04%  [kernel.kallsyms]  [k] __d_lookup_rcu                   
> >      1.27%   -0.04%  [kernel.kallsyms]  [k] do_last                          
> >      1.18%   -0.04%  [kernel.kallsyms]  [k] ext4_release_file                
> >      1.16%   -0.04%  [kernel.kallsyms]  [k] __call_rcu.constprop.11          

Thanks for the detailed report.

