* [patch 0/3] Cpu alloc slub support V2: Replace percpu allocator in slub.c
@ 2008-10-03 15:24 Christoph Lameter
2008-10-03 15:24 ` [patch 1/3] Increase default reserve percpu area Christoph Lameter
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Christoph Lameter @ 2008-10-03 15:24 UTC (permalink / raw)
To: akpm
Cc: linux-kernel, linux-mm, jeremy, ebiederm, travis, herbert, xemul,
penberg
SLUB also has its own per cpu allocator. Get rid of it and use cpu_alloc().
V1->V2
- Fix 2 bugs in patches (node[] array problem, inverted check)
- Change reservation of percpu area defaults.
* [patch 1/3] Increase default reserve percpu area
2008-10-03 15:24 [patch 0/3] Cpu alloc slub support V2: Replace percpu allocator in slub.c Christoph Lameter
@ 2008-10-03 15:24 ` Christoph Lameter
2008-10-03 15:24 ` [patch 2/3] cpu alloc: Use in slub Christoph Lameter
2008-10-03 15:24 ` [patch 3/3] cpu alloc: Remove slub fields Christoph Lameter
2 siblings, 0 replies; 6+ messages in thread
From: Christoph Lameter @ 2008-10-03 15:24 UTC (permalink / raw)
To: akpm
Cc: linux-kernel, Christoph Lameter, linux-mm, jeremy, ebiederm,
travis, herbert, xemul, penberg
[-- Attachment #1: cpu_alloc_increase_percpu_default --]
[-- Type: text/plain, Size: 1712 bytes --]
SLUB now requires a portion of the per cpu reserve. There are on average
about 70 real slabs on a system (aliases do not count) and each needs 12 bytes
of per cpu space. That's 840 bytes. In debug mode all slabs are real slabs,
which brings us to about 150 slabs, i.e. roughly 1800 bytes.
Things work fine without this patch, but SLUB will then reduce the percpu reserve
available for modules.
Percpu data must be available regardless of whether modules are in use or not,
so get rid of the #ifdef CONFIG_MODULES.
Make the size of the percpu area dependent on the size of a machine word. That
way we get larger sizes for 64 bit machines. 64 bit machines need more percpu
memory since pointers and counters may be twice the size. Plus there is
plenty of memory available on 64 bit.
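For illustration, the new default works out roughly as follows (a sketch for
comparison only, not part of the patch):

#define PERCPU_RESERVE_SIZE (sizeof(unsigned long) * 2500)

/*
 * 32 bit: 4 * 2500 = 10000 bytes of percpu reserve
 * 64 bit: 8 * 2500 = 20000 bytes of percpu reserve
 *
 * The old defaults were a fixed 8192 bytes with CONFIG_MODULES and 0 without.
 * The ~1800 bytes SLUB may need in debug mode fit comfortably either way.
 */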
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Index: linux-2.6/include/linux/percpu.h
===================================================================
--- linux-2.6.orig/include/linux/percpu.h 2008-09-29 13:10:33.000000000 -0500
+++ linux-2.6/include/linux/percpu.h 2008-09-29 13:13:21.000000000 -0500
@@ -37,11 +37,7 @@
extern unsigned int percpu_reserve;
/* Enough to cover all DEFINE_PER_CPUs in kernel, including modules. */
#ifndef PERCPU_AREA_SIZE
-#ifdef CONFIG_MODULES
-#define PERCPU_RESERVE_SIZE 8192
-#else
-#define PERCPU_RESERVE_SIZE 0
-#endif
+#define PERCPU_RESERVE_SIZE (sizeof(unsigned long) * 2500)
#define PERCPU_AREA_SIZE \
(__per_cpu_end - __per_cpu_start + percpu_reserve)
* [patch 2/3] cpu alloc: Use in slub
2008-10-03 15:24 [patch 0/3] Cpu alloc slub support V2: Replace percpu allocator in slub.c Christoph Lameter
2008-10-03 15:24 ` [patch 1/3] Increase default reserve percpu area Christoph Lameter
@ 2008-10-03 15:24 ` Christoph Lameter
2008-10-03 15:47 ` Eric Dumazet
2008-10-03 15:24 ` [patch 3/3] cpu alloc: Remove slub fields Christoph Lameter
2 siblings, 1 reply; 6+ messages in thread
From: Christoph Lameter @ 2008-10-03 15:24 UTC (permalink / raw)
To: akpm
Cc: linux-kernel, Christoph Lameter, linux-mm, jeremy, ebiederm,
travis, herbert, xemul, penberg
[-- Attachment #1: cpu_alloc_slub_conversion --]
[-- Type: text/plain, Size: 10597 bytes --]
Using cpu alloc removes the need for the per cpu arrays in the kmem_cache struct.
These could get quite big if we have to support systems with up to thousands of cpus.
The use of cpu_alloc means that:
1. The size of kmem_cache for SMP configuration shrinks since we will only
need 1 pointer instead of NR_CPUS. The same pointer can be used by all
processors. Reduces cache footprint of the allocator.
2. We can dynamically size kmem_cache according to the actual number of nodes in
the system, meaning less memory overhead for configurations that may potentially
support up to 1k NUMA nodes / 4k cpus.
3. We can remove the diddle widdle with allocating and releasing of
kmem_cache_cpu structures when bringing up and shutting down cpus. The cpu
alloc logic will do it all for us. Removes some portions of the cpu hotplug
functionality.
4. Fastpath performance increases.
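In outline the conversion looks like this; a sketch only, using the
CPU_ALLOC/CPU_PTR/THIS_CPU/CPU_FREE interface of the cpu_alloc patches as it
appears in the hunks below:

/* Before: one kmem_cache_cpu pointer per possible cpu embedded in each cache */
struct kmem_cache_cpu *c = s->cpu_slab[smp_processor_id()];

/* After: a single per cpu object per cache */
s->cpu_slab = CPU_ALLOC(struct kmem_cache_cpu, GFP_KERNEL); /* cache creation */
c = THIS_CPU(s->cpu_slab);		/* this cpu, e.g. in the fastpath */
c = CPU_PTR(s->cpu_slab, cpu);		/* a specific cpu, e.g. for flushing */
CPU_FREE(s->cpu_slab);			/* cache destruction */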
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Index: linux-2.6/include/linux/slub_def.h
===================================================================
--- linux-2.6.orig/include/linux/slub_def.h 2008-10-03 09:05:11.000000000 -0500
+++ linux-2.6/include/linux/slub_def.h 2008-10-03 10:17:32.000000000 -0500
@@ -68,6 +68,7 @@
* Slab cache management.
*/
struct kmem_cache {
+ struct kmem_cache_cpu *cpu_slab;
/* Used for retriving partial slabs etc */
unsigned long flags;
int size; /* The size of an object including meta data */
@@ -102,11 +103,6 @@
int remote_node_defrag_ratio;
struct kmem_cache_node *node[MAX_NUMNODES];
#endif
-#ifdef CONFIG_SMP
- struct kmem_cache_cpu *cpu_slab[NR_CPUS];
-#else
- struct kmem_cache_cpu cpu_slab;
-#endif
};
/*
Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c 2008-10-03 09:05:16.000000000 -0500
+++ linux-2.6/mm/slub.c 2008-10-03 10:17:32.000000000 -0500
@@ -226,15 +226,6 @@
#endif
}
-static inline struct kmem_cache_cpu *get_cpu_slab(struct kmem_cache *s, int cpu)
-{
-#ifdef CONFIG_SMP
- return s->cpu_slab[cpu];
-#else
- return &s->cpu_slab;
-#endif
-}
-
/* Verify that a pointer has an address that is valid within a slab page */
static inline int check_valid_pointer(struct kmem_cache *s,
struct page *page, const void *object)
@@ -1087,7 +1078,7 @@
if (!page)
return NULL;
- stat(get_cpu_slab(s, raw_smp_processor_id()), ORDER_FALLBACK);
+ stat(THIS_CPU(s->cpu_slab), ORDER_FALLBACK);
}
page->objects = oo_objects(oo);
mod_zone_page_state(page_zone(page),
@@ -1364,7 +1355,7 @@
static void unfreeze_slab(struct kmem_cache *s, struct page *page, int tail)
{
struct kmem_cache_node *n = get_node(s, page_to_nid(page));
- struct kmem_cache_cpu *c = get_cpu_slab(s, smp_processor_id());
+ struct kmem_cache_cpu *c = THIS_CPU(s->cpu_slab);
__ClearPageSlubFrozen(page);
if (page->inuse) {
@@ -1396,7 +1387,7 @@
slab_unlock(page);
} else {
slab_unlock(page);
- stat(get_cpu_slab(s, raw_smp_processor_id()), FREE_SLAB);
+ stat(__THIS_CPU(s->cpu_slab), FREE_SLAB);
discard_slab(s, page);
}
}
@@ -1449,7 +1440,7 @@
*/
static inline void __flush_cpu_slab(struct kmem_cache *s, int cpu)
{
- struct kmem_cache_cpu *c = get_cpu_slab(s, cpu);
+ struct kmem_cache_cpu *c = CPU_PTR(s->cpu_slab, cpu);
if (likely(c && c->page))
flush_slab(s, c);
@@ -1552,7 +1543,7 @@
local_irq_disable();
if (new) {
- c = get_cpu_slab(s, smp_processor_id());
+ c = __THIS_CPU(s->cpu_slab);
stat(c, ALLOC_SLAB);
if (c->page)
flush_slab(s, c);
@@ -1588,24 +1579,22 @@
void **object;
struct kmem_cache_cpu *c;
unsigned long flags;
- unsigned int objsize;
local_irq_save(flags);
- c = get_cpu_slab(s, smp_processor_id());
- objsize = c->objsize;
- if (unlikely(!c->freelist || !node_match(c, node)))
+ c = __THIS_CPU(s->cpu_slab);
+ object = c->freelist;
+ if (unlikely(!object || !node_match(c, node)))
object = __slab_alloc(s, gfpflags, node, addr, c);
else {
- object = c->freelist;
c->freelist = object[c->offset];
stat(c, ALLOC_FASTPATH);
}
local_irq_restore(flags);
if (unlikely((gfpflags & __GFP_ZERO) && object))
- memset(object, 0, objsize);
+ memset(object, 0, s->objsize);
return object;
}
@@ -1639,7 +1628,7 @@
void **object = (void *)x;
struct kmem_cache_cpu *c;
- c = get_cpu_slab(s, raw_smp_processor_id());
+ c = __THIS_CPU(s->cpu_slab);
stat(c, FREE_SLOWPATH);
slab_lock(page);
@@ -1710,7 +1699,7 @@
unsigned long flags;
local_irq_save(flags);
- c = get_cpu_slab(s, smp_processor_id());
+ c = __THIS_CPU(s->cpu_slab);
debug_check_no_locks_freed(object, c->objsize);
if (!(s->flags & SLAB_DEBUG_OBJECTS))
debug_check_no_obj_freed(object, s->objsize);
@@ -1937,130 +1926,19 @@
#endif
}
-#ifdef CONFIG_SMP
-/*
- * Per cpu array for per cpu structures.
- *
- * The per cpu array places all kmem_cache_cpu structures from one processor
- * close together meaning that it becomes possible that multiple per cpu
- * structures are contained in one cacheline. This may be particularly
- * beneficial for the kmalloc caches.
- *
- * A desktop system typically has around 60-80 slabs. With 100 here we are
- * likely able to get per cpu structures for all caches from the array defined
- * here. We must be able to cover all kmalloc caches during bootstrap.
- *
- * If the per cpu array is exhausted then fall back to kmalloc
- * of individual cachelines. No sharing is possible then.
- */
-#define NR_KMEM_CACHE_CPU 100
-
-static DEFINE_PER_CPU(struct kmem_cache_cpu,
- kmem_cache_cpu)[NR_KMEM_CACHE_CPU];
-
-static DEFINE_PER_CPU(struct kmem_cache_cpu *, kmem_cache_cpu_free);
-static cpumask_t kmem_cach_cpu_free_init_once = CPU_MASK_NONE;
-
-static struct kmem_cache_cpu *alloc_kmem_cache_cpu(struct kmem_cache *s,
- int cpu, gfp_t flags)
-{
- struct kmem_cache_cpu *c = per_cpu(kmem_cache_cpu_free, cpu);
-
- if (c)
- per_cpu(kmem_cache_cpu_free, cpu) =
- (void *)c->freelist;
- else {
- /* Table overflow: So allocate ourselves */
- c = kmalloc_node(
- ALIGN(sizeof(struct kmem_cache_cpu), cache_line_size()),
- flags, cpu_to_node(cpu));
- if (!c)
- return NULL;
- }
-
- init_kmem_cache_cpu(s, c);
- return c;
-}
-
-static void free_kmem_cache_cpu(struct kmem_cache_cpu *c, int cpu)
-{
- if (c < per_cpu(kmem_cache_cpu, cpu) ||
- c > per_cpu(kmem_cache_cpu, cpu) + NR_KMEM_CACHE_CPU) {
- kfree(c);
- return;
- }
- c->freelist = (void *)per_cpu(kmem_cache_cpu_free, cpu);
- per_cpu(kmem_cache_cpu_free, cpu) = c;
-}
-
-static void free_kmem_cache_cpus(struct kmem_cache *s)
-{
- int cpu;
-
- for_each_online_cpu(cpu) {
- struct kmem_cache_cpu *c = get_cpu_slab(s, cpu);
-
- if (c) {
- s->cpu_slab[cpu] = NULL;
- free_kmem_cache_cpu(c, cpu);
- }
- }
-}
-
static int alloc_kmem_cache_cpus(struct kmem_cache *s, gfp_t flags)
{
int cpu;
- for_each_online_cpu(cpu) {
- struct kmem_cache_cpu *c = get_cpu_slab(s, cpu);
-
- if (c)
- continue;
-
- c = alloc_kmem_cache_cpu(s, cpu, flags);
- if (!c) {
- free_kmem_cache_cpus(s);
- return 0;
- }
- s->cpu_slab[cpu] = c;
- }
- return 1;
-}
-
-/*
- * Initialize the per cpu array.
- */
-static void init_alloc_cpu_cpu(int cpu)
-{
- int i;
-
- if (cpu_isset(cpu, kmem_cach_cpu_free_init_once))
- return;
+ s->cpu_slab = CPU_ALLOC(struct kmem_cache_cpu, flags);
- for (i = NR_KMEM_CACHE_CPU - 1; i >= 0; i--)
- free_kmem_cache_cpu(&per_cpu(kmem_cache_cpu, cpu)[i], cpu);
-
- cpu_set(cpu, kmem_cach_cpu_free_init_once);
-}
-
-static void __init init_alloc_cpu(void)
-{
- int cpu;
-
- for_each_online_cpu(cpu)
- init_alloc_cpu_cpu(cpu);
- }
-
-#else
-static inline void free_kmem_cache_cpus(struct kmem_cache *s) {}
-static inline void init_alloc_cpu(void) {}
+ if (!s->cpu_slab)
+ return 0;
-static inline int alloc_kmem_cache_cpus(struct kmem_cache *s, gfp_t flags)
-{
- init_kmem_cache_cpu(s, &s->cpu_slab);
+ for_each_possible_cpu(cpu)
+ init_kmem_cache_cpu(s, CPU_PTR(s->cpu_slab, cpu));
return 1;
}
-#endif
#ifdef CONFIG_NUMA
/*
@@ -2427,9 +2305,8 @@
int node;
flush_all(s);
-
+ CPU_FREE(s->cpu_slab);
/* Attempt to free all objects */
- free_kmem_cache_cpus(s);
for_each_node_state(node, N_NORMAL_MEMORY) {
struct kmem_cache_node *n = get_node(s, node);
@@ -2946,8 +2823,6 @@
int i;
int caches = 0;
- init_alloc_cpu();
-
#ifdef CONFIG_NUMA
/*
* Must first have the slab cache available for the allocations of the
@@ -3015,11 +2890,12 @@
for (i = KMALLOC_SHIFT_LOW; i <= PAGE_SHIFT; i++)
kmalloc_caches[i]. name =
kasprintf(GFP_KERNEL, "kmalloc-%d", 1 << i);
-
#ifdef CONFIG_SMP
register_cpu_notifier(&slab_notifier);
- kmem_size = offsetof(struct kmem_cache, cpu_slab) +
- nr_cpu_ids * sizeof(struct kmem_cache_cpu *);
+#endif
+#ifdef CONFIG_NUMA
+ kmem_size = offsetof(struct kmem_cache, node) +
+ nr_node_ids * sizeof(struct kmem_cache_node *);
#else
kmem_size = sizeof(struct kmem_cache);
#endif
@@ -3115,7 +2991,7 @@
* per cpu structures
*/
for_each_online_cpu(cpu)
- get_cpu_slab(s, cpu)->objsize = s->objsize;
+ CPU_PTR(s->cpu_slab, cpu)->objsize = s->objsize;
s->inuse = max_t(int, s->inuse, ALIGN(size, sizeof(void *)));
up_write(&slub_lock);
@@ -3163,11 +3039,9 @@
switch (action) {
case CPU_UP_PREPARE:
case CPU_UP_PREPARE_FROZEN:
- init_alloc_cpu_cpu(cpu);
down_read(&slub_lock);
list_for_each_entry(s, &slab_caches, list)
- s->cpu_slab[cpu] = alloc_kmem_cache_cpu(s, cpu,
- GFP_KERNEL);
+ init_kmem_cache_cpu(s, CPU_PTR(s->cpu_slab, cpu));
up_read(&slub_lock);
break;
@@ -3177,13 +3051,9 @@
case CPU_DEAD_FROZEN:
down_read(&slub_lock);
list_for_each_entry(s, &slab_caches, list) {
- struct kmem_cache_cpu *c = get_cpu_slab(s, cpu);
-
local_irq_save(flags);
__flush_cpu_slab(s, cpu);
local_irq_restore(flags);
- free_kmem_cache_cpu(c, cpu);
- s->cpu_slab[cpu] = NULL;
}
up_read(&slub_lock);
break;
@@ -3674,7 +3544,7 @@
int cpu;
for_each_possible_cpu(cpu) {
- struct kmem_cache_cpu *c = get_cpu_slab(s, cpu);
+ struct kmem_cache_cpu *c = CPU_PTR(s->cpu_slab, cpu);
if (!c || c->node < 0)
continue;
@@ -4079,7 +3949,7 @@
return -ENOMEM;
for_each_online_cpu(cpu) {
- unsigned x = get_cpu_slab(s, cpu)->stat[si];
+ unsigned x = CPU_PTR(s->cpu_slab, cpu)->stat[si];
data[cpu] = x;
sum += x;
* [patch 3/3] cpu alloc: Remove slub fields
2008-10-03 15:24 [patch 0/3] Cpu alloc slub support V2: Replace percpu allocator in slub.c Christoph Lameter
2008-10-03 15:24 ` [patch 1/3] Increase default reserve percpu area Christoph Lameter
2008-10-03 15:24 ` [patch 2/3] cpu alloc: Use in slub Christoph Lameter
@ 2008-10-03 15:24 ` Christoph Lameter
2 siblings, 0 replies; 6+ messages in thread
From: Christoph Lameter @ 2008-10-03 15:24 UTC (permalink / raw)
To: akpm
Cc: linux-kernel, Christoph Lameter, linux-mm, jeremy, ebiederm,
travis, herbert, xemul, penberg
[-- Attachment #1: cpu_alloc_remove_slub_fields --]
[-- Type: text/plain, Size: 6741 bytes --]
Remove the fields in kmem_cache_cpu that were used to cache data from
kmem_cache when the two were in different cachelines. The cacheline that holds
the per cpu pointer now also holds these values. We can cut the size of
struct kmem_cache_cpu almost in half.
The get_freepointer() and set_freepointer() functions, which used to be
intended only for the slow path, are now also useful for the hot path since
accessing the field no longer touches an additional cacheline. This results in
the free pointer being set consistently through set_freepointer() throughout
SLUB.
Also, all possible kmem_cache_cpu structures are now initialized when a slab
cache is created, so there is no need to initialize them when a processor or
node comes online. Since all fields are simply set to zero, just pass
__GFP_ZERO to the cpu alloc.
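For reference, here is a sketch of the resulting per cpu structure and the
free pointer helpers, reconstructed from the hunks below (illustrative, not a
verbatim copy of the post-patch source):

struct kmem_cache_cpu {
	void **freelist;	/* Pointer to first free per cpu object */
	struct page *page;	/* The slab from which we are allocating */
	int node;		/* The node of the page (or -1 for debug) */
#ifdef CONFIG_SLUB_STATS
	unsigned stat[NR_SLUB_STAT_ITEMS];
#endif
};

/* The free pointer lives inside the object at offset s->offset */
static inline void *get_freepointer(struct kmem_cache *s, void *object)
{
	return *(void **)(object + s->offset);
}

static inline void set_freepointer(struct kmem_cache *s, void *object, void *fp)
{
	*(void **)(object + s->offset) = fp;
}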
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
---
include/linux/slub_def.h | 2 --
mm/slub.c | 39 +++++++++++----------------------------
2 files changed, 11 insertions(+), 30 deletions(-)
Index: linux-2.6/include/linux/slub_def.h
===================================================================
--- linux-2.6.orig/include/linux/slub_def.h 2008-10-03 10:17:32.000000000 -0500
+++ linux-2.6/include/linux/slub_def.h 2008-10-03 10:18:15.000000000 -0500
@@ -36,8 +36,6 @@
void **freelist; /* Pointer to first free per cpu object */
struct page *page; /* The slab from which we are allocating */
int node; /* The node of the page (or -1 for debug) */
- unsigned int offset; /* Freepointer offset (in word units) */
- unsigned int objsize; /* Size of an object (from kmem_cache) */
#ifdef CONFIG_SLUB_STATS
unsigned stat[NR_SLUB_STAT_ITEMS];
#endif
Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c 2008-10-03 10:17:32.000000000 -0500
+++ linux-2.6/mm/slub.c 2008-10-03 10:18:15.000000000 -0500
@@ -244,13 +244,6 @@
return 1;
}
-/*
- * Slow version of get and set free pointer.
- *
- * This version requires touching the cache lines of kmem_cache which
- * we avoid to do in the fast alloc free paths. There we obtain the offset
- * from the page struct.
- */
static inline void *get_freepointer(struct kmem_cache *s, void *object)
{
return *(void **)(object + s->offset);
@@ -1415,10 +1408,10 @@
/* Retrieve object from cpu_freelist */
object = c->freelist;
- c->freelist = c->freelist[c->offset];
+ c->freelist = get_freepointer(s, c->freelist);
/* And put onto the regular freelist */
- object[c->offset] = page->freelist;
+ set_freepointer(s, object, page->freelist);
page->freelist = object;
page->inuse--;
}
@@ -1514,7 +1507,7 @@
if (unlikely(SLABDEBUG && PageSlubDebug(c->page)))
goto debug;
- c->freelist = object[c->offset];
+ c->freelist = get_freepointer(s, object);
c->page->inuse = c->page->objects;
c->page->freelist = NULL;
c->node = page_to_nid(c->page);
@@ -1558,7 +1551,7 @@
goto another_slab;
c->page->inuse++;
- c->page->freelist = object[c->offset];
+ c->page->freelist = get_freepointer(s, object);
c->node = -1;
goto unlock_out;
}
@@ -1588,7 +1581,7 @@
object = __slab_alloc(s, gfpflags, node, addr, c);
else {
- c->freelist = object[c->offset];
+ c->freelist = get_freepointer(s, object);
stat(c, ALLOC_FASTPATH);
}
local_irq_restore(flags);
@@ -1622,7 +1615,7 @@
* handling required then we can return immediately.
*/
static void __slab_free(struct kmem_cache *s, struct page *page,
- void *x, void *addr, unsigned int offset)
+ void *x, void *addr)
{
void *prior;
void **object = (void *)x;
@@ -1636,7 +1629,8 @@
goto debug;
checks_ok:
- prior = object[offset] = page->freelist;
+ prior = page->freelist;
+ set_freepointer(s, object, prior);
page->freelist = object;
page->inuse--;
@@ -1700,15 +1694,15 @@
local_irq_save(flags);
c = __THIS_CPU(s->cpu_slab);
- debug_check_no_locks_freed(object, c->objsize);
+ debug_check_no_locks_freed(object, s->objsize);
if (!(s->flags & SLAB_DEBUG_OBJECTS))
debug_check_no_obj_freed(object, s->objsize);
if (likely(page == c->page && c->node >= 0)) {
- object[c->offset] = c->freelist;
+ set_freepointer(s, object, c->freelist);
c->freelist = object;
stat(c, FREE_FASTPATH);
} else
- __slab_free(s, page, x, addr, c->offset);
+ __slab_free(s, page, x, addr);
local_irq_restore(flags);
}
@@ -1889,19 +1883,6 @@
return ALIGN(align, sizeof(void *));
}
-static void init_kmem_cache_cpu(struct kmem_cache *s,
- struct kmem_cache_cpu *c)
-{
- c->page = NULL;
- c->freelist = NULL;
- c->node = 0;
- c->offset = s->offset / sizeof(void *);
- c->objsize = s->objsize;
-#ifdef CONFIG_SLUB_STATS
- memset(c->stat, 0, NR_SLUB_STAT_ITEMS * sizeof(unsigned));
-#endif
-}
-
static void
init_kmem_cache_node(struct kmem_cache_node *n, struct kmem_cache *s)
{
@@ -1926,20 +1907,6 @@
#endif
}
-static int alloc_kmem_cache_cpus(struct kmem_cache *s, gfp_t flags)
-{
- int cpu;
-
- s->cpu_slab = CPU_ALLOC(struct kmem_cache_cpu, flags);
-
- if (!s->cpu_slab)
- return 0;
-
- for_each_possible_cpu(cpu)
- init_kmem_cache_cpu(s, CPU_PTR(s->cpu_slab, cpu));
- return 1;
-}
-
#ifdef CONFIG_NUMA
/*
* No kmalloc_node yet so do it by hand. We know that this is the first
@@ -2196,8 +2163,11 @@
if (!init_kmem_cache_nodes(s, gfpflags & ~SLUB_DMA))
goto error;
- if (alloc_kmem_cache_cpus(s, gfpflags & ~SLUB_DMA))
+ s->cpu_slab = CPU_ALLOC(struct kmem_cache_cpu,
+ (flags & ~SLUB_DMA) | __GFP_ZERO);
+ if (s->cpu_slab)
return 1;
+
free_kmem_cache_nodes(s);
error:
if (flags & SLAB_PANIC)
@@ -2977,8 +2947,6 @@
down_write(&slub_lock);
s = find_mergeable(size, align, flags, name, ctor);
if (s) {
- int cpu;
-
s->refcount++;
/*
* Adjust the object sizes so that we clear
@@ -2986,13 +2954,6 @@
*/
s->objsize = max(s->objsize, (int)size);
- /*
- * And then we need to update the object size in the
- * per cpu structures
- */
- for_each_online_cpu(cpu)
- CPU_PTR(s->cpu_slab, cpu)->objsize = s->objsize;
-
s->inuse = max_t(int, s->inuse, ALIGN(size, sizeof(void *)));
up_write(&slub_lock);
@@ -3037,14 +2998,6 @@
unsigned long flags;
switch (action) {
- case CPU_UP_PREPARE:
- case CPU_UP_PREPARE_FROZEN:
- down_read(&slub_lock);
- list_for_each_entry(s, &slab_caches, list)
- init_kmem_cache_cpu(s, CPU_PTR(s->cpu_slab, cpu));
- up_read(&slub_lock);
- break;
-
case CPU_UP_CANCELED:
case CPU_UP_CANCELED_FROZEN:
case CPU_DEAD:
* Re: [patch 2/3] cpu alloc: Use in slub
2008-10-03 15:24 ` [patch 2/3] cpu alloc: Use in slub Christoph Lameter
@ 2008-10-03 15:47 ` Eric Dumazet
2008-10-03 16:05 ` Christoph Lameter
0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2008-10-03 15:47 UTC (permalink / raw)
To: Christoph Lameter
Cc: akpm, linux-kernel, linux-mm, jeremy, ebiederm, travis, herbert,
xemul, penberg
Christoph Lameter wrote:
> Using cpu alloc removes the need for the per cpu arrays in the kmem_cache struct.
> These could get quite big if we have to support systems with up to thousands of cpus.
> The use of cpu_alloc means that:
>
> 1. The size of kmem_cache for SMP configuration shrinks since we will only
> need 1 pointer instead of NR_CPUS. The same pointer can be used by all
> processors. Reduces cache footprint of the allocator.
>
> 2. We can dynamically size kmem_cache according to the actual number of nodes in
> the system, meaning less memory overhead for configurations that may potentially
> support up to 1k NUMA nodes / 4k cpus.
>
> 3. We can remove the diddle widdle with allocating and releasing of
> kmem_cache_cpu structures when bringing up and shutting down cpus. The cpu
> alloc logic will do it all for us. Removes some portions of the cpu hotplug
> functionality.
>
> 4. Fastpath performance increases.
>
> Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
>
> Index: linux-2.6/include/linux/slub_def.h
> ===================================================================
> --- linux-2.6.orig/include/linux/slub_def.h 2008-10-03 09:05:11.000000000 -0500
> +++ linux-2.6/include/linux/slub_def.h 2008-10-03 10:17:32.000000000 -0500
> @@ -68,6 +68,7 @@
> * Slab cache management.
> */
> struct kmem_cache {
> + struct kmem_cache_cpu *cpu_slab;
> /* Used for retriving partial slabs etc */
> unsigned long flags;
> int size; /* The size of an object including meta data */
> @@ -102,11 +103,6 @@
> int remote_node_defrag_ratio;
> struct kmem_cache_node *node[MAX_NUMNODES];
Then maybe change MAX_NUMNODES to 0 or 1 to reflect
that node[] is dynamically sized?
> #endif
> -#ifdef CONFIG_SMP
> - struct kmem_cache_cpu *cpu_slab[NR_CPUS];
> -#else
> - struct kmem_cache_cpu cpu_slab;
> -#endif
> };
* Re: [patch 2/3] cpu alloc: Use in slub
2008-10-03 15:47 ` Eric Dumazet
@ 2008-10-03 16:05 ` Christoph Lameter
0 siblings, 0 replies; 6+ messages in thread
From: Christoph Lameter @ 2008-10-03 16:05 UTC (permalink / raw)
To: Eric Dumazet
Cc: akpm, linux-kernel, linux-mm, jeremy, ebiederm, travis, herbert,
xemul, penberg
Eric Dumazet wrote:
>
> Then maybe change MAX_NUMNODES to 0 or 1 to reflect
> that node[] is dynamically sized?
See kmem_cache_init(). It only allocates the used bytes. If you only have a
single node then only 1 pointer will be allocated.
#ifdef CONFIG_NUMA
kmem_size = offsetof(struct kmem_cache, node) +
nr_node_ids * sizeof(struct kmem_cache_node *);
#else
That does not work for the statically allocated kmalloc array though.
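In other words, something like this (an illustrative sketch, not a quote from
the tree):

kmem_size = offsetof(struct kmem_cache, node) +
		nr_node_ids * sizeof(struct kmem_cache_node *);

/*
 * Dynamically created caches then only occupy kmem_size bytes, roughly:
 *
 *	s = kmalloc(kmem_size, GFP_KERNEL);
 *
 * The statically allocated kmalloc_caches[] array, however, still has to use
 * the full struct kmem_cache, so MAX_NUMNODES stays in the declaration.
 */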