linux-mm.kvack.org archive mirror
* [PATCH] Slab infrastructure for array operations
@ 2014-12-18 16:33 Christoph Lameter
  2014-12-18 22:06 ` Andrew Morton
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Lameter @ 2014-12-18 16:33 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: akpm, Steven Rostedt, LKML, Thomas Gleixner,
	Linux Memory Management List, Pekka Enberg, Joonsoo Kim

This patch adds the basic infrastructure for alloc / free operations
on pointer arrays. It includes a fallback function.

Allocators must define _HAVE_SLAB_ALLOCATOR_OPERATIONS in their
header files in order to implement their own fast version for
these array operations.
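
For illustration only (not part of this patch), a caller of the
fallback could look roughly like this; the cache, the array size of 16
and the error handling are made-up placeholders:

	static int example_fill(struct kmem_cache *cache, void **objs)
	{
		int allocated;

		allocated = kmem_cache_alloc_array(cache, GFP_KERNEL, 16, objs);
		if (allocated < 16) {
			/* Partial result: free only the objects we actually got */
			kmem_cache_free_array(cache, allocated, objs);
			return -ENOMEM;
		}
		/* ... use objs[0] .. objs[15] ... */
		kmem_cache_free_array(cache, 16, objs);
		return 0;
	}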

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/include/linux/slab.h
===================================================================
--- linux.orig/include/linux/slab.h	2014-12-16 09:27:26.369447763 -0600
+++ linux/include/linux/slab.h	2014-12-18 10:30:33.394927526 -0600
@@ -123,6 +123,7 @@ struct kmem_cache *memcg_create_kmem_cac
 void kmem_cache_destroy(struct kmem_cache *);
 int kmem_cache_shrink(struct kmem_cache *);
 void kmem_cache_free(struct kmem_cache *, void *);
+void kmem_cache_free_array(struct kmem_cache *, int, void **);

 /*
  * Please use this macro to create slab caches. Simply specify the
@@ -289,6 +290,7 @@ static __always_inline int kmalloc_index

 void *__kmalloc(size_t size, gfp_t flags);
 void *kmem_cache_alloc(struct kmem_cache *, gfp_t flags);
+int kmem_cache_alloc_array(struct kmem_cache *, gfp_t, int, void **);

 #ifdef CONFIG_NUMA
 void *__kmalloc_node(size_t size, gfp_t flags, int node);
Index: linux/mm/slab_common.c
===================================================================
--- linux.orig/mm/slab_common.c	2014-12-12 10:27:49.360799479 -0600
+++ linux/mm/slab_common.c	2014-12-18 10:25:41.695889129 -0600
@@ -105,6 +105,31 @@ static inline int kmem_cache_sanity_chec
 }
 #endif

+#ifndef _HAVE_SLAB_ALLOCATOR_ARRAY_OPERATIONS
+int kmem_cache_alloc_array(struct kmem_cache *s, gfp_t flags, int nr, void **p)
+{
+	int i;
+
+	for (i=0; i < nr; i++) {
+		void *x = p[i] = kmem_cache_alloc(s, flags);
+		if (!x)
+			return i;
+	}
+	return nr;
+}
+EXPORT_SYMBOL(kmem_cache_alloc_array);
+
+void kmem_cache_free_array(struct kmem_cache *s, int nr, void **p)
+{
+	int i;
+
+	for (i=0; i < nr; i++)
+		kmem_cache_free(s, p[i]);
+}
+EXPORT_SYMBOL(kmem_cache_free_array);
+
+#endif
+
 #ifdef CONFIG_MEMCG_KMEM
 static int memcg_alloc_cache_params(struct mem_cgroup *memcg,
 		struct kmem_cache *s, struct kmem_cache *root_cache)


* Re: [PATCH] Slab infrastructure for array operations
  2014-12-18 16:33 [PATCH] Slab infrastructure for array operations Christoph Lameter
@ 2014-12-18 22:06 ` Andrew Morton
  2014-12-19 10:31   ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2014-12-18 22:06 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Jesper Dangaard Brouer, akpm, Steven Rostedt, LKML,
	Thomas Gleixner, Linux Memory Management List, Pekka Enberg,
	Joonsoo Kim

On Thu, 18 Dec 2014 10:33:23 -0600 (CST) Christoph Lameter <cl@linux.com> wrote:

> This patch adds the basic infrastructure for alloc / free operations
> on pointer arrays.

Please provide the justification/reason for making this change.

> It includes a fallback function.

I don't know what this means.  Something to do with
_HAVE_SLAB_ALLOCATOR_OPERATIONS perhaps.

> Allocators must define _HAVE_SLAB_ALLOCATOR_OPERATIONS in their
> header files in order to implement their own fast version for
> these array operations.

Why?  What's driving this?

The changelog is far too skimpy, sorry.  It makes the patch
unreviewable.

> --- linux.orig/include/linux/slab.h	2014-12-16 09:27:26.369447763 -0600
> +++ linux/include/linux/slab.h	2014-12-18 10:30:33.394927526 -0600
> @@ -123,6 +123,7 @@ struct kmem_cache *memcg_create_kmem_cac
>  void kmem_cache_destroy(struct kmem_cache *);
>  int kmem_cache_shrink(struct kmem_cache *);
>  void kmem_cache_free(struct kmem_cache *, void *);
> +void kmem_cache_free_array(struct kmem_cache *, int, void **);

These declarations are much more useful if they include the argument
names.

> --- linux.orig/mm/slab_common.c	2014-12-12 10:27:49.360799479 -0600
> +++ linux/mm/slab_common.c	2014-12-18 10:25:41.695889129 -0600
> @@ -105,6 +105,31 @@ static inline int kmem_cache_sanity_chec
>  }
>  #endif
> 
> +#ifndef _HAVE_SLAB_ALLOCATOR_ARRAY_OPERATIONS
> +int kmem_cache_alloc_array(struct kmem_cache *s, gfp_t flags, int nr, void **p)
> +{
> +	int i;
> +
> +	for (i=0; i < nr; i++) {
> +		void *x = p[i] = kmem_cache_alloc(s, flags);
> +		if (!x)
> +			return i;
> +	}
> +	return nr;
> +}
> +EXPORT_SYMBOL(kmem_cache_alloc_array);

Please use checkpatch.

This function very much needs documentation.  Particularly concerning
the return value, and the caller's responsibility at cleanup time.

And that return value is weird.  What's the point in returning a
partial result?

Why is the memory exhaustion handling implemented this way rather than
zeroing out the rest of the array, so the caller doesn't have to
remember the return value for kmem_cache_free_array()?
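
I.e. something like this (sketch only; kmem_cache_free_array() would
then need to skip NULL entries):

	int kmem_cache_alloc_array(struct kmem_cache *s, gfp_t flags, int nr, void **p)
	{
		int i;

		for (i = 0; i < nr; i++) {
			p[i] = kmem_cache_alloc(s, flags);
			if (!p[i]) {
				/* Zero the rest so the caller can always free nr */
				memset(&p[i], 0, (nr - i) * sizeof(void *));
				return i;
			}
		}
		return nr;
	}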

> +void kmem_cache_free_array(struct kmem_cache *s, int nr, void **p)
> +{
> +	int i;
> +
> +	for (i=0; i < nr; i++)
> +		kmem_cache_free(s, p[i]);
> +}
> +EXPORT_SYMBOL(kmem_cache_free_array);

Possibly `nr' and `i' should be size_t, dunno.  They certainly don't
need to be signed.
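
I.e. something like this (sketch, with argument names included as
suggested above):

	int kmem_cache_alloc_array(struct kmem_cache *s, gfp_t flags,
				   size_t nr, void **p);
	void kmem_cache_free_array(struct kmem_cache *s, size_t nr, void **p);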


* Re: [PATCH] Slab infrastructure for array operations
  2014-12-18 22:06 ` Andrew Morton
@ 2014-12-19 10:31   ` Jesper Dangaard Brouer
  2014-12-19 15:01     ` Christoph Lameter
  0 siblings, 1 reply; 4+ messages in thread
From: Jesper Dangaard Brouer @ 2014-12-19 10:31 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, akpm, Steven Rostedt, LKML, Thomas Gleixner,
	Linux Memory Management List, Pekka Enberg, Joonsoo Kim, brouer


On Thu, 18 Dec 2014 14:06:29 -0800 Andrew Morton <akpm@linux-foundation.org> wrote:

> On Thu, 18 Dec 2014 10:33:23 -0600 (CST) Christoph Lameter <cl@linux.com> wrote:
> 
> > This patch adds the basic infrastructure for alloc / free operations
> > on pointer arrays.
> 
> Please provide the justification/reason for making this change.

I agree that this needs more justification.

I think the reason behind this is that it is a first step towards
"bulk" alloc and free, and the reason behind that is to save/amortize
the cost of the locking/CAS operations.


> > Allocators must define _HAVE_SLAB_ALLOCATOR_OPERATIONS in their
> > header files in order to implement their own fast version for
> > these array operations.

I would like to see an implementation of a fast version.  Otherwise it
is difficult to evaluate whether the API is the right one.  E.g. if it
would be beneficial for the MM system, we could likely restrict the API
to power-of-two sizes from the beginning.


> Why?  What's driving this?

The network stack has a pattern of allocating 64 SKBs while pulling
packets out of the NIC's RX-ring.  Packets are placed into the TX-ring,
and later, at TX-completion time, we free up to 256 SKBs (depending on
the driver).
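
Roughly, the pattern (pseudo driver code; cache and variable names are
made up) would be:

	/* RX refill: one bulk alloc instead of 64 separate allocs */
	void *rx_bufs[64];
	int n = kmem_cache_alloc_array(rx_cache, GFP_ATOMIC, 64, rx_bufs);
	/* ... attach the n buffers to the RX-ring ... */

	/* TX completion: one bulk free of whatever the NIC completed */
	kmem_cache_free_array(tx_cache, n_completed, completed_bufs);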

Another use-case, which needs smaller bulks, could be tree structures
that need to expand; allocating two elements in one shot should cut the
alloc overhead in half.

I've implemented a proof-of-concept[1] lockless bulk alloc and free
scheme that demonstrates this can benefit the network stack.  Now
Christoph and I are trying to integrate some of the ideas into the slub
allocator.


[1] http://thread.gmane.org/gmane.linux.network/342347/focus=126138 
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [PATCH] Slab infrastructure for array operations
  2014-12-19 10:31   ` Jesper Dangaard Brouer
@ 2014-12-19 15:01     ` Christoph Lameter
  0 siblings, 0 replies; 4+ messages in thread
From: Christoph Lameter @ 2014-12-19 15:01 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Andrew Morton, akpm, Steven Rostedt, LKML, Thomas Gleixner,
	Linux Memory Management List, Pekka Enberg, Joonsoo Kim

On Fri, 19 Dec 2014, Jesper Dangaard Brouer wrote:

> > > Allocators must define _HAVE_SLAB_ALLOCATOR_OPERATIONS in their
> > > header files in order to implement their own fast version for
> > > these array operations.
>
> I would like to see an implementation of a fast-version.  Else it is
> difficult to evaluate if the API is the right one.  E.g. if it would be
> beneficial for the MM system, we could likely restrict the API to only
> work with power-of-two, from the beginning.

I have a half-way-done patch mess here implementing fast functions for
SLUB.  This does not even compile; it is just for illustration of my
thinking.

Index: linux/include/linux/slub_def.h
===================================================================
--- linux.orig/include/linux/slub_def.h	2014-12-18 14:34:02.726408665 -0600
+++ linux/include/linux/slub_def.h	2014-12-18 14:34:38.977288122 -0600
@@ -110,4 +110,5 @@ static inline void sysfs_slab_remove(str
 }
 #endif

+#define _HAVE_SLAB_ARRAY_ALLOCATION
 #endif /* _LINUX_SLUB_DEF_H */
Index: linux/mm/slub.c
===================================================================
--- linux.orig/mm/slub.c	2014-12-18 14:34:02.730408541 -0600
+++ linux/mm/slub.c	2014-12-18 15:44:18.812347165 -0600
@@ -1374,13 +1374,9 @@ static void setup_object(struct kmem_cac
 		s->ctor(object);
 }

-static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
+static struct page *__new_slab(struct kmem_cache *s, gfp_t flags, int node)
 {
 	struct page *page;
-	void *start;
-	void *p;
-	int order;
-	int idx;

 	if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
 		pr_emerg("gfp: %u\n", flags & GFP_SLAB_BUG_MASK);
@@ -1389,33 +1385,42 @@ static struct page *new_slab(struct kmem

 	page = allocate_slab(s,
 		flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
-	if (!page)
-		goto out;
+	if (page) {
+		inc_slabs_node(s, page_to_nid(page), page->objects);
+		page->slab_cache = s;
+		__SetPageSlab(page);
+		if (page->pfmemalloc)
+			SetPageSlabPfmemalloc(page);
+	}

-	order = compound_order(page);
-	inc_slabs_node(s, page_to_nid(page), page->objects);
-	page->slab_cache = s;
-	__SetPageSlab(page);
-	if (page->pfmemalloc)
-		SetPageSlabPfmemalloc(page);
-
-	start = page_address(page);
-
-	if (unlikely(s->flags & SLAB_POISON))
-		memset(start, POISON_INUSE, PAGE_SIZE << order);
-
-	for_each_object_idx(p, idx, s, start, page->objects) {
-		setup_object(s, page, p);
-		if (likely(idx < page->objects))
-			set_freepointer(s, p, p + s->size);
-		else
-			set_freepointer(s, p, NULL);
-	}
-
-	page->freelist = start;
-	page->inuse = page->objects;
-	page->frozen = 1;
-out:
+	return page;
+}
+
+static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
+{
+	struct page *page = __new_slab(s, flags, node);
+
+	if (page) {
+		void *p;
+		int idx;
+		void *start = page_address(page);
+
+		if (unlikely(s->flags & SLAB_POISON))
+			memset(start, POISON_INUSE,
+				PAGE_SIZE << compound_order(page));
+
+		for_each_object_idx(p, idx, s, start, page->objects) {
+			setup_object(s, page, p);
+			if (likely(idx < page->objects))
+				set_freepointer(s, p, p + s->size);
+			else
+				set_freepointer(s, p, NULL);
+		}
+
+		page->freelist = start;
+		page->inuse = page->objects;
+		page->frozen = 1;
+	}
 	return page;
 }

@@ -2511,6 +2516,62 @@ EXPORT_SYMBOL(kmem_cache_alloc_node_trac
 #endif
 #endif

+int kmem_cache_alloc_array(struct kmem_cache *s,
+	gfp_t gfpflags, int nr, void **p)
+{
+	void **end = p + nr;
+	struct kmem_cache_node *n = getnode(numa_mem_id());
+
+	/* See if we can use some of the partial slabs in our per node list */
+	if (n->nr_partial) {
+		spin_lock_irqsave(&n->list_lock, flags);
+		while (n->nr_partial) {
+			void *freelist;
+
+			page = n->partial.next;
+			if (page->objects - page->inuse > end - p)
+				/* More objects free in page than we want */
+				break;
+			list_del(page->list);
+			slab_lock(page);
+			freelist = page->freelist;
+			page->inuse = page->objects;
+			page->freelist = NULL;
+			slab_unlock(page);
+			/* Grab all available objects */
+			while (freelist) {
+				*p++ = freelist;
+				freelist = get_freepointer(s, freelist);
+			}
+		}
+		spin_lock_irqrestore(&n->list_lock, flags);
+	}
+
+	/* If we still need lots more objects get some new slabs allocated */
+	while (end - p >= oo_objects(s->oo)) {
+		struct page *page = __new_slab(s, gfpflags, NUMA_NO_NODE);
+		void *q = page_address(page);
+		int i;
+
+		/* Use all the objects */
+		for (i = 0; i < page->objects; i++) {
+			setup_object(s, page, q);
+			*p++ = q;
+			q += s->size;
+		}
+
+		page->inuse = page->objects;
+		page->freelist = NULL;
+	}
+
+	/* Drain per cpu partials */
+	/* Drain per cpu slab */
+
+	/* If we are missing some objects get them the regular way */
+	while (p < end)
+		*p++ = kmem_cache_alloc(s, gfpflags);
+}
+
 /*
  * Slow patch handling. This may still be called frequently since objects
  * have a longer lifetime than the cpu slabs in most processing loads.
@@ -2632,6 +2693,57 @@ slab_empty:
 	discard_slab(s, page);
 }

+void kmem_cache_free_array(struct kmem_cache *s, int nr, void **p)
+{
+	struct kmem_cache_node *n = NULL;
+	int last_node = NUMA_NO_NODE;
+	struct kmem_cache_cpu *c;
+	struct page *page;
+
+	local_irq_save(flags);
+	c = this_cpu_ptr(s->cpu_slab);
+	for (i = 0; i < nr; i++) {
+		object = p[i];
+		if (!object)
+			continue;
+		page = virt_to_head_page(object);
+		/* Check if valid slab page */
+		if (s != page->slab_cache)
+			BUG
+		if (c->page == page) {
+			set_freepointer(object, c->freelist);
+			c->freelist = object;
+		} else {
+			node = page_to_nid(page);
+			if (page->frozen) {
+				if (page is from this cpu) {
+					lock_page(page);
+					set_freepointer(object, page->freelist);
+					page->freelist = object;
+					unlock_page(page);
+					/* Can free without locking */
+				} else {
+					/* We need to wait */
+				}
+			} else {
+				if (node != last_node) {
+					if (n)
+						spin_unlock(n->list_lock);
+					n = s->node[node];
+					last_node = node;
+					spin_lock(n->list_lock);
+				}
+				/* General free case with locking */
+			}
+		}
+
+	}
+	if (n)
+		spin_unlock(n->list_lock);
+	local_irq_restore(flags);
+
+}
+
 /*
  * Fastpath with forced inlining to produce a kfree and kmem_cache_free that
  * can perform fastpath freeing without additional function calls.

