From: Alex Thorlton <athorlton@sgi.com>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	"Eric W . Biederman" <ebiederm@xmission.com>,
	"Paul E . McKenney" <paulmck@linux.vnet.ibm.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Andi Kleen <ak@linux.intel.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Dave Jones <davej@redhat.com>,
	David Howells <dhowells@redhat.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Kees Cook <keescook@chromium.org>, Mel Gorman <mgorman@suse.de>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Rik van Riel <riel@redhat.com>, Robin Holt <robinmholt@gmail.com>,
	Sedat Dilek <sedat.dilek@gmail.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCHv4 00/10] split page table lock for PMD tables
Date: Fri, 4 Oct 2013 15:12:13 -0500	[thread overview]
Message-ID: <20131004201213.GB32110@sgi.com> (raw)
In-Reply-To: <1380287787-30252-1-git-send-email-kirill.shutemov@linux.intel.com>

Kirill,

I've pasted in my results for 512 cores below.  Things are looking 
really good here.  I don't have a test for HUGETLBFS, but if you want to
pass me the one you used, I can run that too.  I suppose I could write
one, but why reinvent the wheel? :)
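
For reference, the test is basically a threaded page-fault hammer: each
thread mmap()s its own private anonymous region and touches every page.
This isn't the actual thp_memscale source, just a minimal sketch of the
idea (thread count and region size here are made up):

/* Minimal sketch, not the real thp_memscale -- N threads each fault in
 * their own anonymous mapping so page faults happen concurrently. */
#include <pthread.h>
#include <sys/mman.h>
#include <unistd.h>

#define NTHREADS 64                /* along the lines of -c <cores> */
#define BYTES    (512UL << 20)     /* along the lines of -b 512m    */

static void *toucher(void *arg)
{
	long page = sysconf(_SC_PAGESIZE);
	char *buf = mmap(NULL, BYTES, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	(void)arg;
	if (buf == MAP_FAILED)
		return NULL;
	for (unsigned long off = 0; off < BYTES; off += page)
		buf[off] = 1;      /* first write faults the page in */
	munmap(buf, BYTES);
	return NULL;
}

int main(void)
{
	pthread_t tid[NTHREADS];
	int i;

	for (i = 0; i < NTHREADS; i++)
		pthread_create(&tid[i], NULL, toucher, NULL);
	for (i = 0; i < NTHREADS; i++)
		pthread_join(tid[i], NULL);
	return 0;
}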

Sorry for the delay on these results.  I hit some strange issues with
running thp_memscale on systems with either of the following
combinations of configuration options set:

[thp off]
HUGETLBFS=y
HUGETLB_PAGE=y
NUMA_BALANCING=y
NUMA_BALANCING_DEFAULT_ENABLED=y

[thp on or off]
HUGETLBFS=n
HUGETLB_PAGE=n
NUMA_BALANCING=y
NUMA_BALANCING_DEFAULT_ENABLED=y

I'm getting intermittent segfaults, as well as some weird RCU sched
errors.  This happens on vanilla 3.12-rc2, so it doesn't have anything
to do with your patches, but I thought I'd let you know.  This test
never used to have any issues, so I suspect there's a subtle kernel bug
here.  That's, of course, an entirely separate issue though.

As far as these patches go, I think everything looks good (save for the
bit of discussion you were having with Andrew earlier, which I think
you've worked out).  My testing shows that the page fault rates are
actually better on this threaded test than in the non-threaded case!

- Alex

On Fri, Sep 27, 2013 at 04:16:17PM +0300, Kirill A. Shutemov wrote:
> Alex Thorlton noticed that some massively threaded workloads work poorly
> if THP is enabled. This patchset fixes that by introducing a split page
> table lock for PMD tables. As of v4, hugetlbfs is converted as well.
> 
> This patchset is based on work by Naoya Horiguchi.
> 
> Please review and consider applying.
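
The caller-side pattern is nice and simple, by the way.  My rough
understanding of what the series converts callers to (my sketch, not
code from the patches) is a per-PMD-table lock in place of
mm->page_table_lock:

/* Rough sketch of the caller-side pattern, not code from the series. */
static void sketch_touch_huge_pmd(struct mm_struct *mm, pmd_t *pmd)
{
	spinlock_t *ptl;

	ptl = pmd_lock(mm, pmd);   /* lock covering this PMD table only */
	if (pmd_trans_huge(*pmd)) {
		/* ... operate on the huge PMD while ptl is held ... */
	}
	spin_unlock(ptl);
}
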
> 
> Changes:
>  v4:
>   - convert hugetlb to new locking;
>  v3:
>   - fix USE_SPLIT_PMD_PTLOCKS;
>   - fix warning in fs/proc/task_mmu.c;
>  v2:
>   - reuse CONFIG_SPLIT_PTLOCK_CPUS for PMD split lock;
>   - s/huge_pmd_lock/pmd_lock/g;
>   - assume pgtable_pmd_page_ctor() can fail;
>   - fix format line in task_mem() for VmPTE;
> 
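
Re the "assume pgtable_pmd_page_ctor() can fail" item: if I'm reading it
right, arch PMD page allocation ends up with the usual ctor-can-fail
pattern, roughly like this (my sketch, not the actual patch):

/* My sketch of the allocation-side pattern, not the actual patch. */
static pmd_t *sketch_pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
{
	struct page *page = alloc_pages(GFP_KERNEL | __GFP_ZERO, 0);

	if (!page)
		return NULL;
	if (!pgtable_pmd_page_ctor(page)) { /* split-lock setup can fail */
		__free_pages(page, 0);
		return NULL;
	}
	return (pmd_t *)page_address(page);
}
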
> THP off, v3.12-rc2:
> -------------------
> 
>  Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs):
> 
>     1037072.835207 task-clock                #   57.426 CPUs utilized            ( +-  3.59% )
>             95,093 context-switches          #    0.092 K/sec                    ( +-  3.93% )
>                140 cpu-migrations            #    0.000 K/sec                    ( +-  5.28% )
>         10,000,550 page-faults               #    0.010 M/sec                    ( +-  0.00% )
>  2,455,210,400,261 cycles                    #    2.367 GHz                      ( +-  3.62% ) [83.33%]
>  2,429,281,882,056 stalled-cycles-frontend   #   98.94% frontend cycles idle     ( +-  3.67% ) [83.33%]
>  1,975,960,019,659 stalled-cycles-backend    #   80.48% backend  cycles idle     ( +-  3.88% ) [66.68%]
>     46,503,296,013 instructions              #    0.02  insns per cycle
>                                              #   52.24  stalled cycles per insn  ( +-  3.21% ) [83.34%]
>      9,278,997,542 branches                  #    8.947 M/sec                    ( +-  4.00% ) [83.34%]
>         89,881,640 branch-misses             #    0.97% of all branches          ( +-  1.17% ) [83.33%]
> 
>       18.059261877 seconds time elapsed                                          ( +-  2.65% )
> 
> THP on, v3.12-rc2:
> ------------------
> 
>  Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs):
> 
>     3114745.395974 task-clock                #   73.875 CPUs utilized            ( +-  1.84% )
>            267,356 context-switches          #    0.086 K/sec                    ( +-  1.84% )
>                 99 cpu-migrations            #    0.000 K/sec                    ( +-  1.40% )
>             58,313 page-faults               #    0.019 K/sec                    ( +-  0.28% )
>  7,416,635,817,510 cycles                    #    2.381 GHz                      ( +-  1.83% ) [83.33%]
>  7,342,619,196,993 stalled-cycles-frontend   #   99.00% frontend cycles idle     ( +-  1.88% ) [83.33%]
>  6,267,671,641,967 stalled-cycles-backend    #   84.51% backend  cycles idle     ( +-  2.03% ) [66.67%]
>    117,819,935,165 instructions              #    0.02  insns per cycle
>                                              #   62.32  stalled cycles per insn  ( +-  4.39% ) [83.34%]
>     28,899,314,777 branches                  #    9.278 M/sec                    ( +-  4.48% ) [83.34%]
>         71,787,032 branch-misses             #    0.25% of all branches          ( +-  1.03% ) [83.33%]
> 
>       42.162306788 seconds time elapsed                                          ( +-  1.73% )

THP on, v3.12-rc2:
------------------

 Performance counter stats for './thp_memscale -C 0 -m 0 -c 512 -b 512m' (5 runs):

  568668865.944994 task-clock                #  528.547 CPUs utilized            ( +-  0.21% ) [100.00%]
         1,491,589 context-switches          #    0.000 M/sec                    ( +-  0.25% ) [100.00%]
             1,085 CPU-migrations            #    0.000 M/sec                    ( +-  1.80% ) [100.00%]
           400,822 page-faults               #    0.000 M/sec                    ( +-  0.41% )
1,306,612,476,049,478 cycles                    #    2.298 GHz                      ( +-  0.23% ) [100.00%]
1,277,211,694,318,724 stalled-cycles-frontend   #   97.75% frontend cycles idle     ( +-  0.21% ) [100.00%]
1,163,736,844,232,064 stalled-cycles-backend    #   89.07% backend  cycles idle     ( +-  0.20% ) [100.00%]
53,855,178,678,230 instructions              #    0.04  insns per cycle        
                                             #   23.72  stalled cycles per insn  ( +-  1.15% ) [100.00%]
21,041,661,816,782 branches                  #   37.002 M/sec                    ( +-  0.64% ) [100.00%]
       606,665,092 branch-misses             #    0.00% of all branches          ( +-  0.63% )

    1075.909782795 seconds time elapsed                                          ( +-  0.21% )

> HUGETLB, v3.12-rc2:
> -------------------
> 
>  Performance counter stats for './thp_memscale_hugetlbfs -c 80 -b 512M' (5 runs):
> 
>     2588052.787264 task-clock                #   54.400 CPUs utilized            ( +-  3.69% )
>            246,831 context-switches          #    0.095 K/sec                    ( +-  4.15% )
>                138 cpu-migrations            #    0.000 K/sec                    ( +-  5.30% )
>             21,027 page-faults               #    0.008 K/sec                    ( +-  0.01% )
>  6,166,666,307,263 cycles                    #    2.383 GHz                      ( +-  3.68% ) [83.33%]
>  6,086,008,929,407 stalled-cycles-frontend   #   98.69% frontend cycles idle     ( +-  3.77% ) [83.33%]
>  5,087,874,435,481 stalled-cycles-backend    #   82.51% backend  cycles idle     ( +-  4.41% ) [66.67%]
>    133,782,831,249 instructions              #    0.02  insns per cycle
>                                              #   45.49  stalled cycles per insn  ( +-  4.30% ) [83.34%]
>     34,026,870,541 branches                  #   13.148 M/sec                    ( +-  4.24% ) [83.34%]
>         68,670,942 branch-misses             #    0.20% of all branches          ( +-  3.26% ) [83.33%]
> 
>       47.574936948 seconds time elapsed                                          ( +-  2.09% )
> 
> THP off, patched:
> -----------------
> 
>  Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs):
> 
>      943301.957892 task-clock                #   56.256 CPUs utilized            ( +-  3.01% )
>             86,218 context-switches          #    0.091 K/sec                    ( +-  3.17% )
>                121 cpu-migrations            #    0.000 K/sec                    ( +-  6.64% )
>         10,000,551 page-faults               #    0.011 M/sec                    ( +-  0.00% )
>  2,230,462,457,654 cycles                    #    2.365 GHz                      ( +-  3.04% ) [83.32%]
>  2,204,616,385,805 stalled-cycles-frontend   #   98.84% frontend cycles idle     ( +-  3.09% ) [83.32%]
>  1,778,640,046,926 stalled-cycles-backend    #   79.74% backend  cycles idle     ( +-  3.47% ) [66.69%]
>     45,995,472,617 instructions              #    0.02  insns per cycle
>                                              #   47.93  stalled cycles per insn  ( +-  2.51% ) [83.34%]
>      9,179,700,174 branches                  #    9.731 M/sec                    ( +-  3.04% ) [83.35%]
>         89,166,529 branch-misses             #    0.97% of all branches          ( +-  1.45% ) [83.33%]
> 
>       16.768027318 seconds time elapsed                                          ( +-  2.47% )
> 
> THP on, patched:
> ----------------
> 
>  Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs):
> 
>      458793.837905 task-clock                #   54.632 CPUs utilized            ( +-  0.79% )
>             41,831 context-switches          #    0.091 K/sec                    ( +-  0.97% )
>                 98 cpu-migrations            #    0.000 K/sec                    ( +-  1.66% )
>             57,829 page-faults               #    0.126 K/sec                    ( +-  0.62% )
>  1,077,543,336,716 cycles                    #    2.349 GHz                      ( +-  0.81% ) [83.33%]
>  1,067,403,802,964 stalled-cycles-frontend   #   99.06% frontend cycles idle     ( +-  0.87% ) [83.33%]
>    864,764,616,143 stalled-cycles-backend    #   80.25% backend  cycles idle     ( +-  0.73% ) [66.68%]
>     16,129,177,440 instructions              #    0.01  insns per cycle
>                                              #   66.18  stalled cycles per insn  ( +-  7.94% ) [83.35%]
>      3,618,938,569 branches                  #    7.888 M/sec                    ( +-  8.46% ) [83.36%]
>         33,242,032 branch-misses             #    0.92% of all branches          ( +-  2.02% ) [83.32%]
> 
>        8.397885779 seconds time elapsed                                          ( +-  0.18% )

THP on, patched:
----------------

 Performance counter stats for './runt -t -c 512 -b 512m' (5 runs):

   15836198.490485 task-clock                #  533.304 CPUs utilized            ( +-  0.95% ) [100.00%]
           127,507 context-switches          #    0.000 M/sec                    ( +-  1.65% ) [100.00%]
             1,223 CPU-migrations            #    0.000 M/sec                    ( +-  3.23% ) [100.00%]
           302,080 page-faults               #    0.000 M/sec                    ( +-  6.88% )
18,925,875,973,975 cycles                    #    1.195 GHz                      ( +-  0.43% ) [100.00%]
18,325,469,464,007 stalled-cycles-frontend   #   96.83% frontend cycles idle     ( +-  0.44% ) [100.00%]
17,522,272,147,141 stalled-cycles-backend    #   92.58% backend  cycles idle     ( +-  0.49% ) [100.00%]
 2,686,490,067,197 instructions              #    0.14  insns per cycle        
                                             #    6.82  stalled cycles per insn  ( +-  2.16% ) [100.00%]
   944,712,646,402 branches                  #   59.655 M/sec                    ( +-  2.03% ) [100.00%]
       145,956,565 branch-misses             #    0.02% of all branches          ( +-  0.88% )

      29.694499652 seconds time elapsed                                          ( +-  0.95% )

(these results are from the test suite that I ripped thp_memscale out
of, but it's the same test)
 
> HUGETLB, patched
> -----------------
> 
>  Performance counter stats for './thp_memscale_hugetlbfs -c 80 -b 512M' (5 runs):
> 
>      395353.076837 task-clock                #   20.329 CPUs utilized            ( +-  8.16% )
>             55,730 context-switches          #    0.141 K/sec                    ( +-  5.31% )
>                138 cpu-migrations            #    0.000 K/sec                    ( +-  4.24% )
>             21,027 page-faults               #    0.053 K/sec                    ( +-  0.00% )
>    930,219,717,244 cycles                    #    2.353 GHz                      ( +-  8.21% ) [83.32%]
>    914,295,694,103 stalled-cycles-frontend   #   98.29% frontend cycles idle     ( +-  8.35% ) [83.33%]
>    704,137,950,187 stalled-cycles-backend    #   75.70% backend  cycles idle     ( +-  9.16% ) [66.69%]
>     30,541,538,385 instructions              #    0.03  insns per cycle
>                                              #   29.94  stalled cycles per insn  ( +-  3.98% ) [83.35%]
>      8,415,376,631 branches                  #   21.286 M/sec                    ( +-  3.61% ) [83.36%]
>         32,645,478 branch-misses             #    0.39% of all branches          ( +-  3.41% ) [83.32%]
> 
>       19.447481153 seconds time elapsed                                          ( +-  2.00% )
> 
> Kirill A. Shutemov (10):
>   mm: rename USE_SPLIT_PTLOCKS to USE_SPLIT_PTE_PTLOCKS
>   mm: convert mm->nr_ptes to atomic_t
>   mm: introduce api for split page table lock for PMD level
>   mm, thp: change pmd_trans_huge_lock() to return taken lock
>   mm, thp: move ptl taking inside page_check_address_pmd()
>   mm, thp: do not access mm->pmd_huge_pte directly
>   mm, hugetlb: convert hugetlbfs to use split pmd lock
>   mm: convert the rest to new page table lock api
>   mm: implement split page table lock for PMD level
>   x86, mm: enable split page table lock for PMD level
> 
>  arch/arm/mm/fault-armv.c       |   6 +-
>  arch/s390/mm/pgtable.c         |  12 +--
>  arch/sparc/mm/tlb.c            |  12 +--
>  arch/x86/Kconfig               |   4 +
>  arch/x86/include/asm/pgalloc.h |  11 ++-
>  arch/x86/xen/mmu.c             |   6 +-
>  fs/proc/meminfo.c              |   2 +-
>  fs/proc/task_mmu.c             |  16 ++--
>  include/linux/huge_mm.h        |  17 ++--
>  include/linux/hugetlb.h        |  25 +++++
>  include/linux/mm.h             |  52 ++++++++++-
>  include/linux/mm_types.h       |  18 ++--
>  include/linux/swapops.h        |   7 +-
>  kernel/fork.c                  |   6 +-
>  mm/Kconfig                     |   3 +
>  mm/huge_memory.c               | 201 ++++++++++++++++++++++++-----------------
>  mm/hugetlb.c                   | 108 +++++++++++++---------
>  mm/memcontrol.c                |  10 +-
>  mm/memory.c                    |  21 +++--
>  mm/mempolicy.c                 |   5 +-
>  mm/migrate.c                   |  14 +--
>  mm/mmap.c                      |   3 +-
>  mm/mprotect.c                  |   4 +-
>  mm/oom_kill.c                  |   6 +-
>  mm/pgtable-generic.c           |  16 ++--
>  mm/rmap.c                      |  15 ++-
>  26 files changed, 379 insertions(+), 221 deletions(-)
> 
> -- 
> 1.8.4.rc3
> 
