* Re: [patch 3/3] mm: NUMA slab -- minor optimizations
@ 2006-01-02 15:27 Alok Kataria
0 siblings, 0 replies; 5+ messages in thread
From: Alok Kataria @ 2006-01-02 15:27 UTC (permalink / raw)
To: clameter; +Cc: manfred, kiran, akpm, linux-mm, alokk
On Wed, 2005-12-28 at 02:05, Christoph Lameter wrote:
> On Tue, 27 Dec 2005, Manfred Spraul wrote:
>
> > Isn't that a bug? What prevents an interrupt from occurring after the
> > spin_lock() and then causing a deadlock on cachep->spinlock?
>
> Right. cache_grow() may be called when doing slab allocations in an
> interrupt and it takes the lock in order to modify colour_next.
>
Yes, you are right.
Looking at the cache_grow() code again, I think we can do away with
cachep->spinlock in this code path. The colour_next variable can be
made per node to give a better cache colouring effect; with that
change, this minor-optimizations patch should be alright.
Comments?
Thanks & Regards,
Alok.
--
colour_next, which is used to calculate the colour offset of the objects
in a slab, is incremented whenever a slab is added to any of the list3s
of a cache. It is now maintained per node (in each list3) rather than
per cache, giving a better per-node cache colouring effect and also
reducing thrashing on the cache_cache structure.
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Index: linux-2.6.15-rc7/mm/slab.c
===================================================================
--- linux-2.6.15-rc7.orig/mm/slab.c 2005-12-24 15:47:48.000000000 -0800
+++ linux-2.6.15-rc7/mm/slab.c 2006-01-02 07:00:36.000000000 -0800
@@ -293,6 +293,7 @@ struct kmem_list3 {
unsigned long next_reap;
int free_touched;
unsigned int free_limit;
+ unsigned int colour_next; /* cache colouring */
spinlock_t list_lock;
struct array_cache *shared; /* shared per node */
struct array_cache **alien; /* on other nodes */
@@ -344,6 +345,7 @@ static inline void kmem_list3_init(struc
INIT_LIST_HEAD(&parent->slabs_free);
parent->shared = NULL;
parent->alien = NULL;
+ parent->colour_next = 0;
spin_lock_init(&parent->list_lock);
parent->free_objects = 0;
parent->free_touched = 0;
@@ -390,7 +392,6 @@ struct kmem_cache {
size_t colour; /* cache colouring range */
unsigned int colour_off; /* colour offset */
- unsigned int colour_next; /* cache colouring */
kmem_cache_t *slabp_cache;
unsigned int slab_size;
unsigned int dflags; /* dynamic flags */
@@ -1060,7 +1061,6 @@ void __init kmem_cache_init(void)
BUG();
cache_cache.colour = left_over/cache_cache.colour_off;
- cache_cache.colour_next = 0;
cache_cache.slab_size = ALIGN(cache_cache.num*sizeof(kmem_bufctl_t) +
sizeof(struct slab), cache_line_size());
@@ -2187,16 +2187,17 @@ static int cache_grow(kmem_cache_t *cach
/* About to mess with non-constant members - lock. */
check_irq_off();
- spin_lock(&cachep->spinlock);
+ l3 = cachep->nodelists[nodeid];
+ spin_lock(&l3->list_lock);
/* Get colour for the slab, and cal the next value. */
- offset = cachep->colour_next;
- cachep->colour_next++;
- if (cachep->colour_next >= cachep->colour)
- cachep->colour_next = 0;
- offset *= cachep->colour_off;
+ offset = l3->colour_next;
+ l3->colour_next++;
+ if (l3->colour_next >= cachep->colour)
+ l3->colour_next = 0;
+ spin_unlock(&l3->list_lock);
- spin_unlock(&cachep->spinlock);
+ offset *= cachep->colour_off;
check_irq_off();
if (local_flags & __GFP_WAIT)
@@ -2228,7 +2229,6 @@ static int cache_grow(kmem_cache_t *cach
if (local_flags & __GFP_WAIT)
local_irq_disable();
check_irq_off();
- l3 = cachep->nodelists[nodeid];
spin_lock(&l3->list_lock);
/* Make slab active. */
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <dont@kvack.org>
* [patch 1/3] mm: NUMA slab -- add alien cache drain statistics
@ 2005-11-29 8:50 Ravikiran G Thirumalai
2005-11-29 8:54 ` [patch 3/3] mm: NUMA slab -- minor optimizations Ravikiran G Thirumalai
0 siblings, 1 reply; 5+ messages in thread
From: Ravikiran G Thirumalai @ 2005-11-29 8:50 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, manfred, clameter, Alok Kataria
The NUMA slab allocator frees remote objects into a local alien cache.
If the alien cache is full, it is drained directly to the remote node.
This patch adds a statistics counter that is incremented every time the
local alien cache is full and has to be drained to the remote node's
list3. This will be useful once the alien cache limit can be tuned
dynamically; currently the limit is fixed at 12.
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Index: linux-2.6.15-rc1/mm/slab.c
===================================================================
--- linux-2.6.15-rc1.orig/mm/slab.c 2005-11-17 15:37:26.000000000 -0800
+++ linux-2.6.15-rc1/mm/slab.c 2005-11-17 21:32:37.000000000 -0800
@@ -416,9 +416,11 @@
unsigned long max_freeable;
unsigned long node_allocs;
unsigned long node_frees;
+ unsigned long node_overflow;
atomic_t allochit;
atomic_t allocmiss;
atomic_t freehit;
+
atomic_t freemiss;
#endif
#if DEBUG
@@ -452,6 +454,7 @@
#define STATS_INC_ERR(x) ((x)->errors++)
#define STATS_INC_NODEALLOCS(x) ((x)->node_allocs++)
#define STATS_INC_NODEFREES(x) ((x)->node_frees++)
+#define STATS_INC_OVERFLOW(x) ((x)->node_overflow++)
#define STATS_SET_FREEABLE(x, i) \
do { if ((x)->max_freeable < i) \
(x)->max_freeable = i; \
@@ -471,6 +474,7 @@
#define STATS_INC_ERR(x) do { } while (0)
#define STATS_INC_NODEALLOCS(x) do { } while (0)
#define STATS_INC_NODEFREES(x) do { } while (0)
+#define STATS_INC_OVERFLOW(x) do { } while (0)
#define STATS_SET_FREEABLE(x, i) \
do { } while (0)
@@ -2765,9 +2769,11 @@
if (l3->alien && l3->alien[nodeid]) {
alien = l3->alien[nodeid];
spin_lock(&alien->lock);
- if (unlikely(alien->avail == alien->limit))
+ if (unlikely(alien->avail == alien->limit)) {
+ STATS_INC_OVERFLOW(cachep);
__drain_alien_cache(cachep,
alien, nodeid);
+ }
alien->entry[alien->avail++] = objp;
spin_unlock(&alien->lock);
} else {
@@ -3386,7 +3392,7 @@
seq_puts(m, " : slabdata <active_slabs> <num_slabs> <sharedavail>");
#if STATS
seq_puts(m, " : globalstat <listallocs> <maxobjs> <grown> <reaped>"
- " <error> <maxfreeable> <nodeallocs> <remotefrees>");
+ " <error> <maxfreeable> <nodeallocs> <remotefrees> <overflow>");
seq_puts(m, " : cpustat <allochit> <allocmiss> <freehit> <freemiss>");
#endif
seq_putc(m, '\n');
@@ -3492,11 +3498,13 @@
unsigned long max_freeable = cachep->max_freeable;
unsigned long node_allocs = cachep->node_allocs;
unsigned long node_frees = cachep->node_frees;
+ unsigned long overflows = cachep->node_overflow;
seq_printf(m, " : globalstat %7lu %6lu %5lu %4lu \
- %4lu %4lu %4lu %4lu",
+ %4lu %4lu %4lu %4lu %4lu",
allocs, high, grown, reaped, errors,
- max_freeable, node_allocs, node_frees);
+ max_freeable, node_allocs, node_frees,
+ overflows);
}
/* cpu stats */
{
--
* Re: [patch 3/3] mm: NUMA slab -- minor optimizations
2005-11-29 8:50 [patch 1/3] mm: NUMA slab -- add alien cache drain statistics Ravikiran G Thirumalai
@ 2005-11-29 8:54 ` Ravikiran G Thirumalai
2005-11-29 17:53 ` Christoph Lameter
2005-12-26 23:42 ` Manfred Spraul
0 siblings, 2 replies; 5+ messages in thread
From: Ravikiran G Thirumalai @ 2005-11-29 8:54 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, manfred, clameter, Alok Kataria
The patch adds some minor optimizations:
1. Keeps interrupts enabled for a bit longer while draining cpu caches.
2. Calls numa_node_id() only once in cache_reap().
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Index: linux-2.6.15-rc1/mm/slab.c
===================================================================
--- linux-2.6.15-rc1.orig/mm/slab.c 2005-11-17 21:32:43.000000000 -0800
+++ linux-2.6.15-rc1/mm/slab.c 2005-11-17 21:32:50.000000000 -0800
@@ -1914,18 +1914,18 @@
smp_call_function_all_cpus(do_drain, cachep);
check_irq_on();
- spin_lock_irq(&cachep->spinlock);
+ spin_lock(&cachep->spinlock);
for_each_online_node(node) {
l3 = cachep->nodelists[node];
if (l3) {
- spin_lock(&l3->list_lock);
+ spin_lock_irq(&l3->list_lock);
drain_array_locked(cachep, l3->shared, 1, node);
- spin_unlock(&l3->list_lock);
+ spin_unlock_irq(&l3->list_lock);
if (l3->alien)
drain_alien_cache(cachep, l3);
}
}
- spin_unlock_irq(&cachep->spinlock);
+ spin_unlock(&cachep->spinlock);
}
static int __node_shrink(kmem_cache_t *cachep, int node)
@@ -3304,7 +3304,7 @@
list_for_each(walk, &cache_chain) {
kmem_cache_t *searchp;
struct list_head* p;
- int tofree;
+ int tofree, nodeid;
struct slab *slabp;
searchp = list_entry(walk, kmem_cache_t, next);
@@ -3314,13 +3314,14 @@
check_irq_on();
- l3 = searchp->nodelists[numa_node_id()];
+ nodeid = numa_node_id();
+ l3 = searchp->nodelists[nodeid];
if (l3->alien)
drain_alien_cache(searchp, l3);
spin_lock_irq(&l3->list_lock);
drain_array_locked(searchp, ac_data(searchp), 0,
- numa_node_id());
+ nodeid);
if (time_after(l3->next_reap, jiffies))
goto next_unlock;
@@ -3329,7 +3330,7 @@
if (l3->shared)
drain_array_locked(searchp, l3->shared, 0,
- numa_node_id());
+ nodeid);
if (l3->free_touched) {
l3->free_touched = 0;
--
* Re: [patch 3/3] mm: NUMA slab -- minor optimizations
2005-11-29 8:54 ` [patch 3/3] mm: NUMA slab -- minor optimizations Ravikiran G Thirumalai
@ 2005-11-29 17:53 ` Christoph Lameter
2005-12-26 23:42 ` Manfred Spraul
1 sibling, 0 replies; 5+ messages in thread
From: Christoph Lameter @ 2005-11-29 17:53 UTC (permalink / raw)
To: Ravikiran G Thirumalai; +Cc: Andrew Morton, linux-mm, manfred, Alok Kataria
On Tue, 29 Nov 2005, Ravikiran G Thirumalai wrote:
> The patch adds some minor optimizations:
> 1. Keeps interrupts enabled for a bit longer while draining cpu caches.
> 2. Calls numa_node_id() only once in cache_reap().
>
> Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
> Signed-off-by: Shai Fultheim <shai@scalex86.org>
Acked-by: Christoph Lameter <clameter@sgi.com>
--
* Re: [patch 3/3] mm: NUMA slab -- minor optimizations
2005-11-29 8:54 ` [patch 3/3] mm: NUMA slab -- minor optimizations Ravikiran G Thirumalai
2005-11-29 17:53 ` Christoph Lameter
@ 2005-12-26 23:42 ` Manfred Spraul
2005-12-27 20:35 ` Christoph Lameter
1 sibling, 1 reply; 5+ messages in thread
From: Manfred Spraul @ 2005-12-26 23:42 UTC (permalink / raw)
To: Ravikiran G Thirumalai; +Cc: Andrew Morton, linux-mm, clameter, Alok Kataria
Ravikiran G Thirumalai wrote:
> The patch adds some minor optimizations:
> 1. Keeps interrupts enabled for a bit longer while draining cpu caches.
> 2. Calls numa_node_id() only once in cache_reap().
>
>Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
>Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
>Signed-off-by: Shai Fultheim <shai@scalex86.org>
>
>Index: linux-2.6.15-rc1/mm/slab.c
>===================================================================
>--- linux-2.6.15-rc1.orig/mm/slab.c 2005-11-17 21:32:43.000000000 -0800
>+++ linux-2.6.15-rc1/mm/slab.c 2005-11-17 21:32:50.000000000 -0800
>@@ -1914,18 +1914,18 @@
>
> smp_call_function_all_cpus(do_drain, cachep);
> check_irq_on();
>- spin_lock_irq(&cachep->spinlock);
>+ spin_lock(&cachep->spinlock);
>
>
Isn't that a bug? What prevents an interrupt from occurring after the
spin_lock() and then causing a deadlock on cachep->spinlock?
--
Manfred
* Re: [patch 3/3] mm: NUMA slab -- minor optimizations
2005-12-26 23:42 ` Manfred Spraul
@ 2005-12-27 20:35 ` Christoph Lameter
0 siblings, 0 replies; 5+ messages in thread
From: Christoph Lameter @ 2005-12-27 20:35 UTC (permalink / raw)
To: Manfred Spraul
Cc: Ravikiran G Thirumalai, Andrew Morton, linux-mm, Alok Kataria
On Tue, 27 Dec 2005, Manfred Spraul wrote:
> Isn't that a bug? What prevents an interrupt from occurring after the
> spin_lock() and then causing a deadlock on cachep->spinlock?
Right. cache_grow() may be called when doing slab allocations in an
interrupt and it takes the lock in order to modify colour_next.