From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Alex Thorlton <athorlton@sgi.com>, Ingo Molnar <mingo@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
"Eric W . Biederman" <ebiederm@xmission.com>,
"Paul E . McKenney" <paulmck@linux.vnet.ibm.com>,
Al Viro <viro@zeniv.linux.org.uk>,
Andi Kleen <ak@linux.intel.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Dave Hansen <dave.hansen@intel.com>,
Dave Jones <davej@redhat.com>,
David Howells <dhowells@redhat.com>,
Frederic Weisbecker <fweisbec@gmail.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Kees Cook <keescook@chromium.org>, Mel Gorman <mgorman@suse.de>,
Michael Kerrisk <mtk.manpages@gmail.com>,
Oleg Nesterov <oleg@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Rik van Riel <riel@redhat.com>, Robin Holt <robinmholt@gmail.com>,
Sedat Dilek <sedat.dilek@gmail.com>,
Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
Thomas Gleixner <tglx@linutronix.de>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCHv2 0/9] split page table lock for PMD tables
Date: Thu, 26 Sep 2013 20:04:57 -0400 [thread overview]
Message-ID: <1380240297-ia3atfjx-mutt-n-horiguchi@ah.jp.nec.com> (raw)
In-Reply-To: <20130926154224.D2CFFE0090@blue.fi.intel.com>
On Thu, Sep 26, 2013 at 06:42:24PM +0300, Kirill A. Shutemov wrote:
> Kirill A. Shutemov wrote:
> > Alex Thorlton wrote:
> > > > THP off:
> > > > --------
> > ...
> > > > 36.540185552 seconds time elapsed ( +- 18.36% )
> > >
> > > I'm assuming this was THP off, no patchset, correct?
> >
> > Yes. But THP off patched is *very* close to this, so I didn't post it separately.
> >
> > > Here are my results from this test on 3.12-rc1:
> > ...
> > > 1138.759708820 seconds time elapsed ( +- 0.47% )
> > >
> > > And the same test on 3.12-rc1 with your patchset:
> > >
> > > Performance counter stats for './runt -t -c 512 -b 512m' (5 runs):
> > ...
> > > 1115.214191126 seconds time elapsed ( +- 0.18% )
> > >
> > > Looks like we're getting a mild performance increase here, but we still
> > > have a problem.
> >
> > Let me guess: you have HUGETLBFS enabled in your config, right? ;)
> >
> > HUGETLBFS hasn't converted to new locking and we disable split pmd lock if
> > HUGETLBFS is enabled.
> >
> > I'm going to convert HUGETLBFS too, but it might take some time.
>
> Okay, here is a bit reworked patch from Naoya Horiguchi.
> It might need more cleanup.
>
> Please, test and review.
>
> From 47e400fc308e0054c2cadf7df48f632555c83572 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Thu, 26 Sep 2013 17:51:33 +0300
> Subject: [PATCH] mm/hugetlb: convert hugetlbfs to use split pmd lock
>
> Hugetlb supports multiple page sizes. We use split lock only for PMD
> level, but not for PUD.
I like this simple approach, because I don't think the benefit of doing
split ptl for PUD is large enough comparing with the cost of adding spinlock
initialization on every pud. Maybe we might as well consider this when
pud hugepage will be widely used.
> I've run workload from Alex Thorlton[1], slightly modified to use
> mmap(MAP_HUGETLB) for memory allocation.
>
> hugetlbfs, v3.12-rc2:
> ---------------------
>
> Performance counter stats for './thp_memscale_hugetlbfs -c 80 -b 512M' (5 runs):
>
> 2588052.787264 task-clock # 54.400 CPUs utilized ( +- 3.69% )
> 246,831 context-switches # 0.095 K/sec ( +- 4.15% )
> 138 cpu-migrations # 0.000 K/sec ( +- 5.30% )
> 21,027 page-faults # 0.008 K/sec ( +- 0.01% )
> 6,166,666,307,263 cycles # 2.383 GHz ( +- 3.68% ) [83.33%]
> 6,086,008,929,407 stalled-cycles-frontend # 98.69% frontend cycles idle ( +- 3.77% ) [83.33%]
> 5,087,874,435,481 stalled-cycles-backend # 82.51% backend cycles idle ( +- 4.41% ) [66.67%]
> 133,782,831,249 instructions # 0.02 insns per cycle
> # 45.49 stalled cycles per insn ( +- 4.30% ) [83.34%]
> 34,026,870,541 branches # 13.148 M/sec ( +- 4.24% ) [83.34%]
> 68,670,942 branch-misses # 0.20% of all branches ( +- 3.26% ) [83.33%]
>
> 47.574936948 seconds time elapsed ( +- 2.09% )
>
> hugetlbfs, patched:
> -------------------
>
> Performance counter stats for './thp_memscale_hugetlbfs -c 80 -b 512M' (5 runs):
>
> 395353.076837 task-clock # 20.329 CPUs utilized ( +- 8.16% )
> 55,730 context-switches # 0.141 K/sec ( +- 5.31% )
> 138 cpu-migrations # 0.000 K/sec ( +- 4.24% )
> 21,027 page-faults # 0.053 K/sec ( +- 0.00% )
> 930,219,717,244 cycles # 2.353 GHz ( +- 8.21% ) [83.32%]
> 914,295,694,103 stalled-cycles-frontend # 98.29% frontend cycles idle ( +- 8.35% ) [83.33%]
> 704,137,950,187 stalled-cycles-backend # 75.70% backend cycles idle ( +- 9.16% ) [66.69%]
> 30,541,538,385 instructions # 0.03 insns per cycle
> # 29.94 stalled cycles per insn ( +- 3.98% ) [83.35%]
> 8,415,376,631 branches # 21.286 M/sec ( +- 3.61% ) [83.36%]
> 32,645,478 branch-misses # 0.39% of all branches ( +- 3.41% ) [83.32%]
>
> 19.447481153 seconds time elapsed ( +- 2.00% )
>
> Split lock helps, but hugetlbs is still significantly slower the THP
> (8.4 seconds).
The difference is interesting, but it can be the different problem
from our current problem. Even in vanilla kernel, hugetlbfs looks slower.
> [1] ftp://shell.sgi.com/collect/memscale/thp_memscale.tar.gz
>
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
> fs/proc/meminfo.c | 2 +-
> include/linux/hugetlb.h | 19 +++++++++
> include/linux/mm_types.h | 4 +-
> include/linux/swapops.h | 9 ++--
> mm/hugetlb.c | 108 ++++++++++++++++++++++++++++-------------------
> mm/mempolicy.c | 5 ++-
> mm/migrate.c | 7 +--
> mm/rmap.c | 2 +-
> 8 files changed, 100 insertions(+), 56 deletions(-)
>
> diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
> index 59d85d6088..6d061f5359 100644
> --- a/fs/proc/meminfo.c
> +++ b/fs/proc/meminfo.c
> @@ -1,8 +1,8 @@
> #include <linux/fs.h>
> -#include <linux/hugetlb.h>
> #include <linux/init.h>
> #include <linux/kernel.h>
> #include <linux/mm.h>
> +#include <linux/hugetlb.h>
> #include <linux/mman.h>
> #include <linux/mmzone.h>
> #include <linux/proc_fs.h>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 0393270466..4a4a73b1ec 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -401,6 +401,7 @@ struct hstate {};
> #define hstate_sizelog(s) NULL
> #define hstate_vma(v) NULL
> #define hstate_inode(i) NULL
> +#define page_hstate(page) NULL
> #define huge_page_size(h) PAGE_SIZE
> #define huge_page_mask(h) PAGE_MASK
> #define vma_kernel_pagesize(v) PAGE_SIZE
> @@ -423,4 +424,22 @@ static inline pgoff_t basepage_index(struct page *page)
> #define hugepage_migration_support(h) 0
> #endif /* CONFIG_HUGETLB_PAGE */
>
> +static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
> + struct mm_struct *mm, pte_t *pte)
> +{
> + if (huge_page_size(h) == PMD_SIZE)
> + return pmd_lockptr(mm, (pmd_t *) pte);
> + VM_BUG_ON(huge_page_size(h) == PAGE_SIZE);
> + return &mm->page_table_lock;
> +}
> +
> +static inline spinlock_t *huge_pte_lock(struct hstate *h,
> + struct mm_struct *mm, pte_t *pte)
> +{
> + spinlock_t *ptl;
> + ptl = huge_pte_lockptr(h, mm, pte);
> + spin_lock(ptl);
> + return ptl;
> +}
> +
> #endif /* _LINUX_HUGETLB_H */
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 8c2a3c3e28..90cc93b2cc 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -24,10 +24,8 @@
> struct address_space;
>
> #define USE_SPLIT_PTE_PTLOCKS (NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS)
> -/* hugetlb hasn't converted to split locking yet */
> #define USE_SPLIT_PMD_PTLOCKS (USE_SPLIT_PTE_PTLOCKS && \
> - IS_ENABLED(CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK) && \
> - !IS_ENABLED(CONFIG_HUGETLB_PAGE))
> + IS_ENABLED(CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK))
>
> /*
> * Each physical page in the system has a struct page associated with
> diff --git a/include/linux/swapops.h b/include/linux/swapops.h
> index 8d4fa82bfb..ad9f4b2964 100644
> --- a/include/linux/swapops.h
> +++ b/include/linux/swapops.h
> @@ -102,6 +102,8 @@ static inline void *swp_to_radix_entry(swp_entry_t entry)
> return (void *)(value | RADIX_TREE_EXCEPTIONAL_ENTRY);
> }
>
> +struct hstate;
> +
> #ifdef CONFIG_MIGRATION
> static inline swp_entry_t make_migration_entry(struct page *page, int write)
> {
We can avoid this workaround by passing vma to migration_entry_wait_huge(),
and getting hstate inside that function.
Thanks,
Naoya Horiguchi
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2013-09-27 0:05 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-09-16 11:25 Kirill A. Shutemov
2013-09-16 11:25 ` [PATCHv2 1/9] mm: rename USE_SPLIT_PTLOCKS to USE_SPLIT_PTE_PTLOCKS Kirill A. Shutemov
2013-09-16 11:25 ` [PATCHv2 2/9] mm: convert mm->nr_ptes to atomic_t Kirill A. Shutemov
2013-09-17 14:33 ` Kirill A. Shutemov
2013-09-16 11:25 ` [PATCHv2 3/9] mm: introduce api for split page table lock for PMD level Kirill A. Shutemov
2013-09-16 11:25 ` [PATCHv2 4/9] mm, thp: change pmd_trans_huge_lock() to return taken lock Kirill A. Shutemov
2013-09-16 11:25 ` [PATCHv2 5/9] mm, thp: move ptl taking inside page_check_address_pmd() Kirill A. Shutemov
2013-09-16 11:25 ` [PATCHv2 6/9] mm, thp: do not access mm->pmd_huge_pte directly Kirill A. Shutemov
2013-09-16 11:25 ` [PATCHv2 7/9] mm: convent the rest to new page table lock api Kirill A. Shutemov
2013-09-16 11:25 ` [PATCHv2 8/9] mm: implement split page table lock for PMD level Kirill A. Shutemov
2013-09-17 14:28 ` Kirill A. Shutemov
2013-09-16 11:25 ` [PATCHv2 9/9] x86, mm: enable " Kirill A. Shutemov
2013-09-16 11:44 ` [PATCHv2 0/9] split page table lock for PMD tables Peter Zijlstra
2013-09-16 12:11 ` Kirill A. Shutemov
2013-09-19 17:17 ` Alex Thorlton
2013-09-20 12:31 ` Kirill A. Shutemov
2013-09-24 16:44 ` Alex Thorlton
2013-09-26 10:50 ` Kirill A. Shutemov
2013-09-26 15:42 ` Kirill A. Shutemov
2013-09-27 0:04 ` Naoya Horiguchi [this message]
2013-09-26 21:19 ` Alex Thorlton
2013-09-26 21:38 ` Kirill A. Shutemov
2013-09-26 21:42 ` Kirill A. Shutemov
2013-09-26 21:44 ` Alex Thorlton
2013-09-26 21:43 ` Alex Thorlton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1380240297-ia3atfjx-mutt-n-horiguchi@ah.jp.nec.com \
--to=n-horiguchi@ah.jp.nec.com \
--cc=aarcange@redhat.com \
--cc=ak@linux.intel.com \
--cc=akpm@linux-foundation.org \
--cc=athorlton@sgi.com \
--cc=dave.hansen@intel.com \
--cc=davej@redhat.com \
--cc=dhowells@redhat.com \
--cc=ebiederm@xmission.com \
--cc=fweisbec@gmail.com \
--cc=hannes@cmpxchg.org \
--cc=keescook@chromium.org \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=mtk.manpages@gmail.com \
--cc=oleg@redhat.com \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=riel@redhat.com \
--cc=robinmholt@gmail.com \
--cc=sedat.dilek@gmail.com \
--cc=srikar@linux.vnet.ibm.com \
--cc=tglx@linutronix.de \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox