From: Jesper Dangaard Brouer <brouer@redhat.com>
To: linux-mm@kvack.org, Christoph Lameter <cl@linux.com>,
Andrew Morton <akpm@linux-foundation.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Jesper Dangaard Brouer <brouer@redhat.com>
Subject: [PATCH V2 0/6] slub: bulk alloc and free for slub allocator
Date: Wed, 17 Jun 2015 16:26:54 +0200 [thread overview]
Message-ID: <20150617142613.11791.76008.stgit@devil> (raw)
With this patchset the SLUB allocator now has both bulk alloc and bulk
free implemented.
(This patchset is based on DaveM's net-next tree on-top of commit
89d256bb69f)
This patchset mostly optimizes the "fastpath", where objects are
available on the per-CPU fastpath page. It amortizes the cost of the
less-heavy, non-locked cmpxchg_double used on the fastpath.
The "fallback" bulking (e.g. __kmem_cache_free_bulk) provides a good
basis for comparison. For the measurements[1], the fallback functions
__kmem_cache_{free,alloc}_bulk were copied from slab_common.c and
marked "noinline" to force a function call, as in slab_common.c.
Measurements on CPU i7-4790K @ 4.00GHz
Baseline normal fastpath (alloc+free cost): 42 cycles(tsc) 10.601 ns
Measurements for the last patch, with debugging disabled:
Bulk- fallback - this-patch
1 - 57 cycles(tsc) 14.448 ns - 44 cycles(tsc) 11.236 ns improved 22.8%
2 - 51 cycles(tsc) 12.768 ns - 28 cycles(tsc) 7.019 ns improved 45.1%
3 - 48 cycles(tsc) 12.232 ns - 22 cycles(tsc) 5.526 ns improved 54.2%
4 - 48 cycles(tsc) 12.025 ns - 19 cycles(tsc) 4.786 ns improved 60.4%
8 - 46 cycles(tsc) 11.558 ns - 18 cycles(tsc) 4.572 ns improved 60.9%
16 - 45 cycles(tsc) 11.458 ns - 18 cycles(tsc) 4.658 ns improved 60.0%
30 - 45 cycles(tsc) 11.499 ns - 18 cycles(tsc) 4.568 ns improved 60.0%
32 - 79 cycles(tsc) 19.917 ns - 65 cycles(tsc) 16.454 ns improved 17.7%
34 - 78 cycles(tsc) 19.655 ns - 63 cycles(tsc) 15.932 ns improved 19.2%
48 - 68 cycles(tsc) 17.049 ns - 50 cycles(tsc) 12.506 ns improved 26.5%
64 - 80 cycles(tsc) 20.009 ns - 63 cycles(tsc) 15.929 ns improved 21.3%
128 - 94 cycles(tsc) 23.749 ns - 86 cycles(tsc) 21.583 ns improved 8.5%
158 - 97 cycles(tsc) 24.299 ns - 90 cycles(tsc) 22.552 ns improved 7.2%
250 - 102 cycles(tsc) 25.681 ns - 98 cycles(tsc) 24.589 ns improved 3.9%
Benchmarking shows impressive improvements in the "fastpath" with a
small number of objects in the working set. Once the working set
increases and activates the "slowpath" (which contains the heavier
locked cmpxchg_double), the improvement decreases.
I'm currently working on also optimizing the "slowpath" (as the
network stack use-case hits this), but this patchset should provide a
good foundation for further improvements.
The rest of my patch queue in this area needs some more work, but
preliminary results are good. I'm attending the Netfilter Workshop[2]
next week, and I'll hopefully return to working on further
improvements in this area afterwards.
[1] https://github.com/netoptimizer/prototype-kernel/blob/b4688559b/kernel/mm/slab_bulk_test01.c#L80
[2] http://workshop.netfilter.org/2015/
---
Christoph Lameter (1):
slab: infrastructure for bulk object allocation and freeing
Jesper Dangaard Brouer (5):
slub: fix spelling succedd to succeed
slub bulk alloc: extract objects from the per cpu slab
slub: improve bulk alloc strategy
slub: initial bulk free implementation
slub: add support for kmem_cache_debug in bulk calls
include/linux/slab.h | 10 +++++
mm/slab.c | 13 ++++++
mm/slab.h | 9 ++++
mm/slab_common.c | 23 +++++++++++
mm/slob.c | 13 ++++++
mm/slub.c | 109 ++++++++++++++++++++++++++++++++++++++++++++++++++
6 files changed, 176 insertions(+), 1 deletion(-)
--