linux-mm.kvack.org archive mirror
* Slab cache reap and CPU availability
@ 2004-05-21 15:41 Dimitri Sivanich
  2004-05-22  2:16 ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: Dimitri Sivanich @ 2004-05-21 15:41 UTC (permalink / raw)
  To: linux-kernel, linux-mm

Hi all,

I have a fairly general question about the slab cache reap code.

In running realtime noise tests on the 2.6 kernels (spinning to detect periods
of CPU unavailability to RT threads) on an IA/64 Altix system, I have found the
cache_reap code to be the source of a number of longer holdoffs (periods of
CPU unavailability).  These can last into the hundreds of microseconds on
1300 MHz CPUs.  Since this code runs every few seconds as a timer softirq on
all CPUs, holdoffs can occur frequently.
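
A minimal sketch of the kind of spin loop such a noise test uses is below
(illustrative only; the clock source, the 30 usec reporting threshold, and the
omitted SCHED_FIFO/CPU-affinity setup are assumptions, not the actual test
code):

/*
 * A pinned thread spins and timestamps every pass; any gap between
 * consecutive timestamps much larger than the loop cost is time the
 * CPU was unavailable to the thread.  Scheduling setup (SCHED_FIFO,
 * CPU affinity) is omitted for brevity.
 */
#include <stdio.h>
#include <time.h>

#define THRESHOLD_NS 30000LL		/* report holdoffs > 30 usec */

static long long now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

int main(void)
{
	long long prev = now_ns();

	for (;;) {
		long long cur = now_ns();

		if (cur - prev > THRESHOLD_NS)
			printf("holdoff of %lld usec\n", (cur - prev) / 1000);
		prev = cur;
	}
	return 0;
}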

Has anyone looked into less interruptive alternatives to running cache_reap
this way (for the 2.6 kernel), or maybe looked into potential optimizations
to the routine itself?


Thanks in advance,

Dimitri Sivanich <sivanich@sgi.com>


* Re: Slab cache reap and CPU availability
  2004-05-21 15:41 Slab cache reap and CPU availability Dimitri Sivanich
@ 2004-05-22  2:16 ` Andrew Morton
  2004-05-24 15:39   ` Dimitri Sivanich
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2004-05-22  2:16 UTC (permalink / raw)
  To: Dimitri Sivanich; +Cc: linux-kernel, linux-mm

Dimitri Sivanich <sivanich@sgi.com> wrote:
>
> Hi all,
> 
> I have a fairly general question about the slab cache reap code.
> 
> In running realtime noise tests on the 2.6 kernels (spinning to detect periods
> of CPU unavailability to RT threads) on an IA/64 Altix system, I have found the
> cache_reap code to be the source of a number of longer holdoffs (periods of
> CPU unavailability).  These can last into the hundreds of microseconds on
> 1300 MHz CPUs.  Since this code runs every few seconds as a timer softirq on
> all CPUs, holdoffs can occur frequently.
> 
> Has anyone looked into less interruptive alternatives to running cache_reap
> this way (for the 2.6 kernel), or maybe looked into potential optimizations
> to the routine itself?
> 

Do you have stack backtraces?  I thought the problem was via the RCU
softirq callbacks, not via the timer interrupt.  Dipankar spent some time
looking at the RCU-related problem but solutions are not comfortable.

What workload is triggering this?

* Re: Slab cache reap and CPU availability
  2004-05-22  2:16 ` Andrew Morton
@ 2004-05-24 15:39   ` Dimitri Sivanich
  2004-05-24 21:53     ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: Dimitri Sivanich @ 2004-05-24 15:39 UTC (permalink / raw)
  To: Andrew Morton, linux-kernel, linux-mm

> 
> Dimitri Sivanich <sivanich@sgi.com> wrote:
> >
> > Hi all,
> > 
> > I have a fairly general question about the slab cache reap code.
> > 
> > In running realtime noise tests on the 2.6 kernels (spinning to detect periods
> > of CPU unavailability to RT threads) on an IA/64 Altix system, I have found the
> > cache_reap code to be the source of a number of longer holdoffs (periods of
> > CPU unavailability).  These can last into the hundreds of microseconds on
> > 1300 MHz CPUs.  Since this code runs every few seconds as a timer softirq on
> > all CPUs, holdoffs can occur frequently.
> > 
> > Has anyone looked into less interruptive alternatives to running cache_reap
> > this way (for the 2.6 kernel), or maybe looked into potential optimizations
> > to the routine itself?
> > 
> 
> Do you have stack backtraces?  I thought the problem was via the RCU
> softirq callbacks, not via the timer interrupt.  Dipankar spent some time
> looking at the RCU-related problem but solutions are not comfortable.
> 
> What workload is triggering this?
> 

The IA/64 backtrace with all the cruft removed looks as follows:

0xa000000100149ac0 reap_timer_fnc+0x100
0xa0000001000f4d70 run_timer_softirq+0x2d0
0xa0000001000e9440 __do_softirq+0x200
0xa0000001000e94e0 do_softirq+0x80
0xa000000100017f50 ia64_handle_irq+0x190
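
For reference, the reap path behind reap_timer_fnc in 2.6-era mm/slab.c has
roughly the shape sketched below (a paraphrased, condensed sketch rather than
the exact source; the per-cache work is elided and the timer bookkeeping is
simplified, with details varying between 2.6 releases):

/*
 * Everything below runs in timer-softirq context, so a thread on this
 * CPU cannot run until the walk over the global cache list finishes.
 */
static void cache_reap(void)
{
	struct list_head *walk;

	/* Skip this round if another CPU currently holds the chain. */
	if (down_trylock(&cache_chain_sem))
		return;

	list_for_each(walk, &cache_chain) {
		kmem_cache_t *searchp = list_entry(walk, kmem_cache_t, next);

		/*
		 * With local interrupts disabled: drain searchp's array
		 * cache for this CPU if it has not been touched
		 * recently, then return a bounded number of searchp's
		 * completely free slabs to the page allocator.  (Elided.)
		 */
	}
	up(&cache_chain_sem);
}

static void reap_timer_fnc(unsigned long data)
{
	cache_reap();
	/* Re-arm this CPU's reap timer to fire again a few seconds out. */
	mod_timer(&__get_cpu_var(reap_timers), jiffies + REAPTIMEOUT_CPUC);
}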

The system is running mostly AIM7, but I've seen holdoffs > 30 usec with
virtually no load on the system.

Which of the uncomfortable solutions (those that could relate to this case) have
been investigated?


Dimitri Sivanich <sivanich@sgi.com>

* Re: Slab cache reap and CPU availability
  2004-05-24 15:39   ` Dimitri Sivanich
@ 2004-05-24 21:53     ` Andrew Morton
  2004-06-01 21:40       ` Dimitri Sivanich
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2004-05-24 21:53 UTC (permalink / raw)
  To: Dimitri Sivanich; +Cc: linux-kernel, linux-mm

Dimitri Sivanich <sivanich@sgi.com> wrote:
>
> > Do you have stack backtraces?  I thought the problem was via the RCU
> > softirq callbacks, not via the timer interrupt.  Dipankar spent some time
> > looking at the RCU-related problem but solutions are not comfortable.
> > 
> > What workload is triggering this?
> > 
> 
> The IA/64 backtrace with all the cruft removed looks as follows:
> 
> 0xa000000100149ac0 reap_timer_fnc+0x100
> 0xa0000001000f4d70 run_timer_softirq+0x2d0
> 0xa0000001000e9440 __do_softirq+0x200
> 0xa0000001000e94e0 do_softirq+0x80
> 0xa000000100017f50 ia64_handle_irq+0x190
> 
> The system is running mostly AIM7, but I've seen holdoffs > 30 usec with
> virtually no load on the system.

They're pretty low latencies you're talking about there.

You should be able to reduce the amount of work in that timer handler by
limiting the size of the per-cpu caches in the slab allocator.  You can do
that by writing a magic incantation to /proc/slabinfo or:

--- 25/mm/slab.c~a	Mon May 24 14:51:32 2004
+++ 25-akpm/mm/slab.c	Mon May 24 14:51:37 2004
@@ -2642,6 +2642,7 @@ static void enable_cpucache (kmem_cache_
 	if (limit > 32)
 		limit = 32;
 #endif
+	limit = 8;
 	err = do_tune_cpucache(cachep, limit, (limit+1)/2, shared);
 	if (err)
 		printk(KERN_ERR "enable_cpucache failed for %s, error %d.\n",

_
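
The /proc/slabinfo incantation is a single line of the form
"<cache-name> <limit> <batchcount> <shared>" written to the file; a minimal
sketch of doing that from C (the cache name and values are only examples, and
whether /proc/slabinfo is writable depends on the kernel configuration):

#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/slabinfo", "w");	/* needs root */

	if (!f) {
		perror("/proc/slabinfo");
		return 1;
	}
	/* e.g. shrink the per-cpu array of the generic 4K cache */
	fprintf(f, "size-4096 8 4 1\n");
	return fclose(f) ? 1 : 0;
}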


> Which of the uncomfortable solutions (those that could relate to this case)
> have been investigated?

That work was focussed on the amount of work which is performed in a single
RCU callback, not in the slab timer handler.

* Re: Slab cache reap and CPU availability
  2004-05-24 21:53     ` Andrew Morton
@ 2004-06-01 21:40       ` Dimitri Sivanich
  0 siblings, 0 replies; 5+ messages in thread
From: Dimitri Sivanich @ 2004-06-01 21:40 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-mm

> 
> Dimitri Sivanich <sivanich@sgi.com> wrote:
> >
> > The IA/64 backtrace with all the cruft removed looks as follows:
> > 
> > 0xa000000100149ac0 reap_timer_fnc+0x100
> > 0xa0000001000f4d70 run_timer_softirq+0x2d0
> > 0xa0000001000e9440 __do_softirq+0x200
> > 0xa0000001000e94e0 do_softirq+0x80
> > 0xa000000100017f50 ia64_handle_irq+0x190
> > 
> > The system is running mostly AIM7, but I've seen holdoffs > 30 usec with
> > virtually no load on the system.
> 
> They're pretty low latencies you're talking about there.
> 
> You should be able to reduce the amount of work in that timer handler by
> limiting the size of the per-cpu caches in the slab allocator.  You can do
> that by writing a magic incantation to /proc/slabinfo or:
> 
> --- 25/mm/slab.c~a	Mon May 24 14:51:32 2004
> +++ 25-akpm/mm/slab.c	Mon May 24 14:51:37 2004
> @@ -2642,6 +2642,7 @@ static void enable_cpucache (kmem_cache_
>  	if (limit > 32)
>  		limit = 32;
>  #endif
> +	limit = 8;

I tried several values for this limit, but these had little effect.

Thread overview: 5+ messages
2004-05-21 15:41 Slab cache reap and CPU availability Dimitri Sivanich
2004-05-22  2:16 ` Andrew Morton
2004-05-24 15:39   ` Dimitri Sivanich
2004-05-24 21:53     ` Andrew Morton
2004-06-01 21:40       ` Dimitri Sivanich