From: Pekka Enberg <penberg@cs.helsinki.fi>
To: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <npiggin@suse.de>,
Linux Memory Management List <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Lin Ming <ming.m.lin@intel.com>,
"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>,
Christoph Lameter <cl@linux-foundation.org>
Subject: Re: [patch] SLQB slab allocator
Date: Fri, 23 Jan 2009 16:30:11 +0200 [thread overview]
Message-ID: <1232721011.6094.70.camel@penberg-laptop> (raw)
In-Reply-To: <Pine.LNX.4.64.0901231357250.9011@blonde.anvils>
Hi Hugh,
On Wed, Jan 21, 2009 at 8:10 PM, Hugh Dickins <hugh@veritas.com> wrote:
> > > > That's been making SLUB behave pretty badly (e.g. elapsed time 30%
> > > > more than SLAB) with swapping loads on most of my machines. Though
> > > > oddly one seems immune, and another takes four times as long: guess
> > > > it depends on how close to thrashing, but probably more to investigate
> > > > there. I think my original SLUB versus SLAB comparisons were done on
> > > > the immune one: as I remember, SLUB and SLAB were equivalent on those
> > > > loads when SLUB came in, but even with boot option slub_max_order=1,
> > > > SLUB is still slower than SLAB on such tests (e.g. 2% slower).
> > > > FWIW - swapping loads are not what anybody should tune for.
On Thu, 22 Jan 2009, Pekka Enberg wrote:
> > > What kind of machine are you seeing this on? It sounds like it could
> > > be a side-effect from commit 9b2cd506e5f2117f94c28a0040bf5da058105316
> > > ("slub: Calculate min_objects based on number of processors").
On Thu, 22 Jan 2009, Hugh Dickins wrote:
> > Thanks, yes, that could well account for the residual difference: the
> > machines in question have 2 or 4 cpus, so the old slub_min_objects=4
> > has effectively become slub_min_objects=12 or slub_min_objects=16.
> >
> > I'm now trying with slub_max_order=1 slub_min_objects=4 on the boot
> > lines (though I'll need to curtail tests on a couple of machines),
> > and will report back later.
On Fri, 2009-01-23 at 14:23 +0000, Hugh Dickins wrote:
> Yes, slub_max_order=1 with slub_min_objects=4 certainly helps this
> swapping load. I've not tried slub_max_order=0, but I'm running
> with 8kB stacks, so order 1 seems a reasonable choice.
Yanmin/Christoph, maybe we should revisit the min objects logic
calculate_order()?
On Fri, 2009-01-23 at 14:23 +0000, Hugh Dickins wrote:
> I can't say where I pulled that "e.g. 2% slower" from: on different
> machines slub was 5% or 10% or 20% slower than slab and slqb even with
> slub_max_order=1 (but not significantly slower on the "immune" machine).
> How much slub_min_objects=4 helps again varies widely, between halving
> or eliminating the difference.
>
> But I think it's more important that I focus on the worst case machine,
> try to understand what's going on there.
Yeah. Oprofile and CONFIG_SLUB_STATS are usually quite helpful. You
might want to test the included patch which targets one known SLAB vs.
SLUB regression discovered quite recently.
Pekka
Subject: [PATCH] SLUB: revert direct page allocator pass through
From: Pekka Enberg <penberg@cs.helsinki.fi>
This patch reverts page allocator pass-through logic from the SLUB allocator.
Commit aadb4bc4a1f9108c1d0fbd121827c936c2ed4217 ("SLUB: direct pass through of
page size or higher kmalloc requests") added page allocator pass-through to the
SLUB allocator for large sized allocations. This, however, results in a
performance regression compared to SLAB in the netperf UDP-U-4k test.
The regression comes from the kfree(skb->head) call in skb_release_data() that
is subject to page allocator pass-through as the size passed to __alloc_skb()
is larger than 4 KB in this test. With this patch, the performance regression
is almost closed:
<insert numbers here>
Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
---
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 2f5c16b..3bd3662 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -124,7 +124,7 @@ struct kmem_cache {
* We keep the general caches in an array of slab caches that are used for
* 2^x bytes of allocations.
*/
-extern struct kmem_cache kmalloc_caches[PAGE_SHIFT + 1];
+extern struct kmem_cache kmalloc_caches[KMALLOC_SHIFT_HIGH + 1];
/*
* Sorry that the following has to be that ugly but some versions of GCC
@@ -135,6 +135,9 @@ static __always_inline int kmalloc_index(size_t size)
if (!size)
return 0;
+ if (size > KMALLOC_MAX_SIZE)
+ return -1;
+
if (size <= KMALLOC_MIN_SIZE)
return KMALLOC_SHIFT_LOW;
@@ -154,10 +157,6 @@ static __always_inline int kmalloc_index(size_t size)
if (size <= 1024) return 10;
if (size <= 2 * 1024) return 11;
if (size <= 4 * 1024) return 12;
-/*
- * The following is only needed to support architectures with a larger page
- * size than 4k.
- */
if (size <= 8 * 1024) return 13;
if (size <= 16 * 1024) return 14;
if (size <= 32 * 1024) return 15;
@@ -167,6 +166,10 @@ static __always_inline int kmalloc_index(size_t size)
if (size <= 512 * 1024) return 19;
if (size <= 1024 * 1024) return 20;
if (size <= 2 * 1024 * 1024) return 21;
+ if (size <= 4 * 1024 * 1024) return 22;
+ if (size <= 8 * 1024 * 1024) return 23;
+ if (size <= 16 * 1024 * 1024) return 24;
+ if (size <= 32 * 1024 * 1024) return 25;
return -1;
/*
@@ -191,6 +194,19 @@ static __always_inline struct kmem_cache *kmalloc_slab(size_t size)
if (index == 0)
return NULL;
+ /*
+ * This function only gets expanded if __builtin_constant_p(size), so
+ * testing it here shouldn't be needed. But some versions of gcc need
+ * help.
+ */
+ if (__builtin_constant_p(size) && index < 0) {
+ /*
+ * Generate a link failure. Would be great if we could
+ * do something to stop the compile here.
+ */
+ extern void __kmalloc_size_too_large(void);
+ __kmalloc_size_too_large();
+ }
return &kmalloc_caches[index];
}
@@ -204,17 +220,9 @@ static __always_inline struct kmem_cache *kmalloc_slab(size_t size)
void *kmem_cache_alloc(struct kmem_cache *, gfp_t);
void *__kmalloc(size_t size, gfp_t flags);
-static __always_inline void *kmalloc_large(size_t size, gfp_t flags)
-{
- return (void *)__get_free_pages(flags | __GFP_COMP, get_order(size));
-}
-
static __always_inline void *kmalloc(size_t size, gfp_t flags)
{
if (__builtin_constant_p(size)) {
- if (size > PAGE_SIZE)
- return kmalloc_large(size, flags);
-
if (!(flags & SLUB_DMA)) {
struct kmem_cache *s = kmalloc_slab(size);
diff --git a/mm/slub.c b/mm/slub.c
index 6392ae5..8fad23f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2475,7 +2475,7 @@ EXPORT_SYMBOL(kmem_cache_destroy);
* Kmalloc subsystem
*******************************************************************/
-struct kmem_cache kmalloc_caches[PAGE_SHIFT + 1] __cacheline_aligned;
+struct kmem_cache kmalloc_caches[KMALLOC_SHIFT_HIGH + 1] __cacheline_aligned;
EXPORT_SYMBOL(kmalloc_caches);
static int __init setup_slub_min_order(char *str)
@@ -2537,7 +2537,7 @@ panic:
}
#ifdef CONFIG_ZONE_DMA
-static struct kmem_cache *kmalloc_caches_dma[PAGE_SHIFT + 1];
+static struct kmem_cache *kmalloc_caches_dma[KMALLOC_SHIFT_HIGH + 1];
static void sysfs_add_func(struct work_struct *w)
{
@@ -2643,8 +2643,12 @@ static struct kmem_cache *get_slab(size_t size, gfp_t flags)
return ZERO_SIZE_PTR;
index = size_index[(size - 1) / 8];
- } else
+ } else {
+ if (size > KMALLOC_MAX_SIZE)
+ return NULL;
+
index = fls(size - 1);
+ }
#ifdef CONFIG_ZONE_DMA
if (unlikely((flags & SLUB_DMA)))
@@ -2658,9 +2662,6 @@ void *__kmalloc(size_t size, gfp_t flags)
{
struct kmem_cache *s;
- if (unlikely(size > PAGE_SIZE))
- return kmalloc_large(size, flags);
-
s = get_slab(size, flags);
if (unlikely(ZERO_OR_NULL_PTR(s)))
@@ -2670,25 +2671,11 @@ void *__kmalloc(size_t size, gfp_t flags)
}
EXPORT_SYMBOL(__kmalloc);
-static void *kmalloc_large_node(size_t size, gfp_t flags, int node)
-{
- struct page *page = alloc_pages_node(node, flags | __GFP_COMP,
- get_order(size));
-
- if (page)
- return page_address(page);
- else
- return NULL;
-}
-
#ifdef CONFIG_NUMA
void *__kmalloc_node(size_t size, gfp_t flags, int node)
{
struct kmem_cache *s;
- if (unlikely(size > PAGE_SIZE))
- return kmalloc_large_node(size, flags, node);
-
s = get_slab(size, flags);
if (unlikely(ZERO_OR_NULL_PTR(s)))
@@ -2746,11 +2733,8 @@ void kfree(const void *x)
return;
page = virt_to_head_page(x);
- if (unlikely(!PageSlab(page))) {
- BUG_ON(!PageCompound(page));
- put_page(page);
+ if (unlikely(WARN_ON(!PageSlab(page)))) /* XXX */
return;
- }
slab_free(page->slab, page, object, _RET_IP_);
}
EXPORT_SYMBOL(kfree);
@@ -2985,7 +2969,7 @@ void __init kmem_cache_init(void)
caches++;
}
- for (i = KMALLOC_SHIFT_LOW; i <= PAGE_SHIFT; i++) {
+ for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
create_kmalloc_cache(&kmalloc_caches[i],
"kmalloc", 1 << i, GFP_KERNEL);
caches++;
@@ -3022,7 +3006,7 @@ void __init kmem_cache_init(void)
slab_state = UP;
/* Provide the correct kmalloc names now that the caches are up */
- for (i = KMALLOC_SHIFT_LOW; i <= PAGE_SHIFT; i++)
+ for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++)
kmalloc_caches[i]. name =
kasprintf(GFP_KERNEL, "kmalloc-%d", 1 << i);
@@ -3222,9 +3206,6 @@ void *__kmalloc_track_caller(size_t size, gfp_t gfpflags, unsigned long caller)
{
struct kmem_cache *s;
- if (unlikely(size > PAGE_SIZE))
- return kmalloc_large(size, gfpflags);
-
s = get_slab(size, gfpflags);
if (unlikely(ZERO_OR_NULL_PTR(s)))
@@ -3238,9 +3219,6 @@ void *__kmalloc_node_track_caller(size_t size, gfp_t gfpflags,
{
struct kmem_cache *s;
- if (unlikely(size > PAGE_SIZE))
- return kmalloc_large_node(size, gfpflags, node);
-
s = get_slab(size, gfpflags);
if (unlikely(ZERO_OR_NULL_PTR(s)))
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2009-01-23 14:30 UTC|newest]
Thread overview: 99+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-01-21 14:30 Nick Piggin
2009-01-21 14:59 ` Ingo Molnar
2009-01-21 15:17 ` Nick Piggin
2009-01-21 16:56 ` Nick Piggin
2009-01-21 17:40 ` Ingo Molnar
2009-01-23 3:31 ` Nick Piggin
2009-01-23 6:14 ` Nick Piggin
2009-01-23 12:56 ` Ingo Molnar
2009-01-21 17:59 ` Joe Perches
2009-01-23 3:35 ` Nick Piggin
2009-01-23 4:00 ` Joe Perches
2009-01-21 18:10 ` Hugh Dickins
2009-01-22 10:01 ` Pekka Enberg
2009-01-22 12:47 ` Hugh Dickins
2009-01-23 14:23 ` Hugh Dickins
2009-01-23 14:30 ` Pekka Enberg [this message]
2009-02-02 3:38 ` Zhang, Yanmin
2009-02-02 9:00 ` Pekka Enberg
2009-02-02 15:00 ` Christoph Lameter
2009-02-03 1:34 ` Zhang, Yanmin
2009-02-03 7:29 ` Zhang, Yanmin
2009-02-03 12:18 ` Hugh Dickins
2009-02-04 2:21 ` Zhang, Yanmin
2009-02-05 19:04 ` Hugh Dickins
2009-02-06 0:47 ` Zhang, Yanmin
2009-02-06 8:57 ` Pekka Enberg
2009-02-06 12:33 ` Hugh Dickins
2009-02-10 8:56 ` Zhang, Yanmin
2009-02-02 11:50 ` Hugh Dickins
2009-01-23 3:55 ` Nick Piggin
2009-01-23 13:57 ` Hugh Dickins
2009-01-22 8:45 ` Zhang, Yanmin
2009-01-23 3:57 ` Nick Piggin
2009-01-23 9:00 ` Nick Piggin
2009-01-23 13:34 ` Hugh Dickins
2009-01-23 13:44 ` Nick Piggin
2009-01-23 9:55 ` Andi Kleen
2009-01-23 10:13 ` Pekka Enberg
2009-01-23 11:25 ` Nick Piggin
2009-01-23 11:57 ` Andi Kleen
2009-01-23 13:18 ` Nick Piggin
2009-01-23 14:04 ` Andi Kleen
2009-01-23 14:27 ` Nick Piggin
2009-01-23 15:06 ` Andi Kleen
2009-01-23 15:15 ` Nick Piggin
2009-01-23 12:55 ` Nick Piggin
-- strict thread matches above, loose matches on Subject: below --
2009-01-14 9:04 Nick Piggin
2009-01-14 10:53 ` Pekka Enberg
2009-01-14 11:47 ` Nick Piggin
2009-01-14 13:44 ` Pekka Enberg
2009-01-14 14:22 ` Nick Piggin
2009-01-14 14:45 ` Pekka Enberg
2009-01-14 15:09 ` Nick Piggin
2009-01-14 15:22 ` Nick Piggin
2009-01-14 15:30 ` Pekka Enberg
2009-01-14 15:59 ` Nick Piggin
2009-01-14 18:40 ` Christoph Lameter
2009-01-15 6:19 ` Nick Piggin
2009-01-15 20:47 ` Christoph Lameter
2009-01-16 3:43 ` Nick Piggin
2009-01-16 21:25 ` Christoph Lameter
2009-01-19 6:18 ` Nick Piggin
2009-01-22 0:13 ` Christoph Lameter
2009-01-22 9:27 ` Pekka Enberg
2009-01-22 9:30 ` Zhang, Yanmin
2009-01-22 9:33 ` Pekka Enberg
2009-01-23 15:32 ` Christoph Lameter
2009-01-23 15:37 ` Pekka Enberg
2009-01-23 15:42 ` Christoph Lameter
2009-01-23 15:32 ` Christoph Lameter
2009-01-23 4:09 ` Nick Piggin
2009-01-23 15:41 ` Christoph Lameter
2009-01-23 15:53 ` Nick Piggin
2009-01-26 17:28 ` Christoph Lameter
2009-02-03 1:53 ` Nick Piggin
2009-02-03 17:33 ` Christoph Lameter
2009-02-03 18:42 ` Pekka Enberg
2009-02-03 18:47 ` Pekka Enberg
2009-02-04 4:22 ` Nick Piggin
2009-02-04 20:09 ` Christoph Lameter
2009-02-05 3:18 ` Nick Piggin
2009-02-04 20:10 ` Christoph Lameter
2009-02-05 3:14 ` Nick Piggin
2009-02-04 4:07 ` Nick Piggin
2009-01-14 18:01 ` Christoph Lameter
2009-01-15 6:03 ` Nick Piggin
2009-01-15 20:05 ` Christoph Lameter
2009-01-16 3:19 ` Nick Piggin
2009-01-16 21:07 ` Christoph Lameter
2009-01-19 5:47 ` Nick Piggin
2009-01-22 0:19 ` Christoph Lameter
2009-01-23 4:17 ` Nick Piggin
2009-01-23 15:52 ` Christoph Lameter
2009-01-23 16:10 ` Nick Piggin
2009-01-23 17:09 ` Nick Piggin
2009-01-26 17:46 ` Christoph Lameter
2009-02-03 1:42 ` Nick Piggin
2009-01-26 17:34 ` Christoph Lameter
2009-02-03 1:48 ` Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1232721011.6094.70.camel@penberg-laptop \
--to=penberg@cs.helsinki.fi \
--cc=akpm@linux-foundation.org \
--cc=cl@linux-foundation.org \
--cc=hugh@veritas.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ming.m.lin@intel.com \
--cc=npiggin@suse.de \
--cc=yanmin_zhang@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox