From: Johannes Weiner <hannes@cmpxchg.org>
To: Rik van Riel <riel@redhat.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
aarcange@redhat.com, peterz@infradead.org, minchan@gmail.com,
kosaki.motohiro@gmail.com, andi@firstfloor.org, mel@csn.ul.ie,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH -mm 0/7] mm: scalable and unified arch_get_unmapped_area
Date: Thu, 21 Jun 2012 12:18:00 +0200 [thread overview]
Message-ID: <20120621101800.GH27816@cmpxchg.org> (raw)
In-Reply-To: <1340057126-31143-1-git-send-email-riel@redhat.com>
On Mon, Jun 18, 2012 at 06:05:19PM -0400, Rik van Riel wrote:
> [actually include all 7 patches]
>
> A long time ago, we decided to limit the number of VMAs per
> process to 64k. As it turns out, there actually are programs
> using tens of thousands of VMAs.
>
> The linear search in arch_get_unmapped_area and
> arch_get_unmapped_area_topdown can be a real issue for
> those programs.
>
> This patch series aims to fix the scalability issue by
> tracking the size of each free hole in the VMA rbtree,
> propagating the free hole info up the tree.
>
> Another major goal is to put the bulk of the necessary
> arch_get_unmapped_area(_topdown) functionality into one
> set of functions, so we can eliminate the custom large
> functions per architecture, sticking to a few much smaller
> architecture specific functions instead.
>
> In this version I have only gotten rid of the x86, ARM
> and MIPS arch-specific code, and am already showing a
> fairly promising diffstat:
>
> arch/arm/include/asm/pgtable.h | 6
> arch/arm/mm/init.c | 3
> arch/arm/mm/mmap.c | 217 ------------------
> arch/mips/include/asm/page.h | 2
> arch/mips/include/asm/pgtable.h | 7
> arch/mips/mm/mmap.c | 177 --------------
> arch/x86/include/asm/elf.h | 3
> arch/x86/include/asm/pgtable_64.h | 4
> arch/x86/kernel/sys_x86_64.c | 200 ++--------------
> arch/x86/vdso/vma.c | 2
> include/linux/mm_types.h | 8
> include/linux/sched.h | 13 +
> mm/internal.h | 5
> mm/mmap.c | 455 ++++++++++++++++++++++++++++++--------
> 14 files changed, 420 insertions(+), 682 deletions(-)
>
> TODO:
> - eliminate arch-specific functions for more architectures
> - integrate hugetlbfs alignment (with Andi Kleen's patch?)
>
> Performance
>
> Testing performance with a benchmark that allocates tens
> of thousands of VMAs, unmaps them and mmaps them some more
> in a loop, shows promising results.
>
> Vanilla 3.4 kernel:
> $ ./agua_frag_test_64
> ..........
>
> Min Time (ms): 6
> Avg. Time (ms): 294.0000
> Max Time (ms): 609
> Std Dev (ms): 113.1664
> Standard deviation exceeds 10
>
> With patches:
> $ ./agua_frag_test_64
> ..........
>
> Min Time (ms): 14
> Avg. Time (ms): 38.0000
> Max Time (ms): 60
> Std Dev (ms): 3.9312
> All checks pass
>
> The total run time of the test goes down by about a
> factor 4. More importantly, the worst case performance
> of the loop (which is what really hurt some applications)
> has gone down by about a factor 10.
I ran eight 4-job kernel builds on the before and after kernels;
here are the results:
                       1x4kernbench-3.5.0-rc3-next-20120619  1x4kernbench-3.5.0-rc3-next-20120619-00007-g594e750
Elapsed time                         273.95 ( +0.00%)                        274.11 ( +0.06%)
Elapsed time (stddev)                  0.23 ( +0.00%)                          0.20 ( -2.81%)
User time                            463.38 ( +0.00%)                        463.13 ( -0.05%)
User time (stddev)                     0.16 ( +0.00%)                          0.23 ( +6.66%)
System time                           49.36 ( +0.00%)                         50.16 ( +1.60%)
System time (stddev)                   0.24 ( +0.00%)                          0.26 ( +1.43%)
Walltime is unchanged, but system time went up a tiny notch.
I suspect this comes from the freegap propagation, which can be quite
a bit of work if vmas are created, split/merged, or unmapped deep down
in the rb tree. Here is a worst-case scenario that creates a bunch of
vmas next to each other and then unmaps and remaps one in the middle
repeatedly in a tight loop (code below):
vanilla: 0.802003266 seconds time elapsed ( +- 0.66% )
patched: 1.710614276 seconds time elapsed ( +- 0.28% )
vanilla:
7.50% freegap [kernel.kallsyms] [k] perf_event_mmap
6.73% freegap [kernel.kallsyms] [k] __split_vma
6.06% freegap [kernel.kallsyms] [k] unmap_single_vma
5.96% freegap [kernel.kallsyms] [k] vma_adjust
5.55% freegap [kernel.kallsyms] [k] do_munmap
5.24% freegap [kernel.kallsyms] [k] find_vma_prepare
4.21% freegap [kernel.kallsyms] [k] rb_insert_color
4.13% freegap [kernel.kallsyms] [k] vma_merge
3.39% freegap [kernel.kallsyms] [k] find_vma
2.88% freegap [kernel.kallsyms] [k] do_mmap_pgoff
patched:
28.36% freegap [kernel.kallsyms] [k] vma_rb_augment_cb
12.20% freegap [kernel.kallsyms] [k] rb_augment_path
3.85% freegap [kernel.kallsyms] [k] unmap_single_vma
3.68% freegap [kernel.kallsyms] [k] perf_event_mmap
3.56% freegap [kernel.kallsyms] [k] vma_adjust
3.30% freegap [kernel.kallsyms] [k] __split_vma
3.03% freegap [kernel.kallsyms] [k] rb_erase
2.56% freegap [kernel.kallsyms] [k] arch_get_unmapped_area_topdown
1.97% freegap [kernel.kallsyms] [k] rb_insert_color
1.92% freegap [kernel.kallsyms] [k] vma_merge
One idea would be to be a bit more generous with free space and
update the holes lazily: chain up vmas that need gap propagation and
process them in a batch.  In the unmap/remap case, the mapping would
first move to the left-most suitable gap, but the vma that needs
propagation afterwards is always the same neighboring one, and
checking whether it is already linked to the lazy-update chain is
cheap, so this case adds no extra work.  When the chain is full (or
finding a hole failed), the batched propagation can at the same time
unlink any visited vma that is also on the lazy-update chain, in an
effort to avoid repeat tree path traversals.
And vma_adjust() could probably be more discriminating about splits
and merges where the holes don't change, no?
---
#include <sys/mman.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
	unsigned int i;
	char *map;

	/* Surround one target VMA with neighboring single-page VMAs */
	for (i = 0; i < 512; i++)
		mmap(NULL, 1 << 12, PROT_READ, MAP_PRIVATE | MAP_ANON, -1, 0);
	map = mmap(NULL, 1 << 12, PROT_READ, MAP_PRIVATE | MAP_ANON, -1, 0);
	sbrk(0);	/* marker syscall for tracing */
	for (i = 0; i < 512; i++)
		mmap(NULL, 1 << 12, PROT_READ, MAP_PRIVATE | MAP_ANON, -1, 0);
	sbrk(0);

	/* Hammer the VMA in the middle of the tree */
	for (i = 0; i < (1 << 20); i++) {
#if 1
		/* split and re-merge it via mprotect */
		mprotect(map, 1 << 6, PROT_NONE);
		mprotect(map, 1 << 6, PROT_READ);
#else
		/* or: unmap and remap it */
		munmap(map, 1 << 12);
		map = mmap(NULL, 1 << 12, PROT_READ, MAP_PRIVATE | MAP_ANON, -1, 0);
#endif
	}
	return 0;
}