Hi Andrew, Thanks for reviewing the patch! On Wed, Jul 27, 2011 at 4:19 PM, Andrew Morton wrote: > On Tue, 19 Jul 2011 12:31:22 -0700 > Andrew Bresticker wrote: > > > The problem with small dmesg ring buffer like 512k is that only limited > number > > of task traces will be logged. Sometimes we lose important information > only > > because of too many duplicated stack traces. This problem occurs when > dumping > > lots of stacks in a single operation, such as sysrq-T. > > > > This patch tries to reduce the duplication of task stack trace in the > dump > > message by hashing the task stack. The hashtable is a 32k pre-allocated > buffer > > during bootup. Then we hash the task stack with stack_depth 32 for each > stack > > entry. Each time if we find the identical task trace in the task stack, > we dump > > only the pid of the task which has the task trace dumped. So it is easy > to back > > track to the full stack with the pid. > > > > [ 58.469730] kworker/0:0 S 0000000000000000 0 4 2 > 0x00000000 > > [ 58.469735] ffff88082fcfde80 0000000000000046 ffff88082e9d8000 > ffff88082fcfc010 > > [ 58.469739] ffff88082fce9860 0000000000011440 ffff88082fcfdfd8 > ffff88082fcfdfd8 > > [ 58.469743] 0000000000011440 0000000000000000 ffff88082fcee180 > ffff88082fce9860 > > [ 58.469747] Call Trace: > > [ 58.469751] [] worker_thread+0x24b/0x250 > > [ 58.469754] [] ? manage_workers+0x192/0x192 > > [ 58.469757] [] kthread+0x82/0x8a > > [ 58.469760] [] kernel_thread_helper+0x4/0x10 > > [ 58.469763] [] ? kthread_worker_fn+0x112/0x112 > > [ 58.469765] [] ? gs_change+0xb/0xb > > [ 58.469768] kworker/u:0 S 0000000000000004 0 5 2 > 0x00000000 > > [ 58.469773] ffff88082fcffe80 0000000000000046 ffff880800000000 > ffff88082fcfe010 > > [ 58.469777] ffff88082fcea080 0000000000011440 ffff88082fcfffd8 > ffff88082fcfffd8 > > [ 58.469781] 0000000000011440 0000000000000000 ffff88082fd4e9a0 > ffff88082fcea080 > > [ 58.469785] Call Trace: > > [ 58.469786] > > [ 58.470235] kworker/0:1 S 0000000000000000 0 13 2 > 0x00000000 > > [ 58.470255] ffff88082fd3fe80 0000000000000046 ffff880800000000 > ffff88082fd3e010 > > [ 58.470279] ffff88082fcee180 0000000000011440 ffff88082fd3ffd8 > ffff88082fd3ffd8 > > [ 58.470301] 0000000000011440 0000000000000000 ffffffff8180b020 > ffff88082fcee180 > > [ 58.470325] Call Trace: > > [ 58.470332] > > That looks nice. > > > ... > > > > Signed-off-by: Ying Han > > Signed-off-by: Andrew Bresticker > > --- > > arch/x86/Kconfig | 3 + > > arch/x86/include/asm/stacktrace.h | 6 ++- > > arch/x86/kernel/dumpstack.c | 24 ++++++-- > > arch/x86/kernel/dumpstack_32.c | 7 ++- > > arch/x86/kernel/dumpstack_64.c | 11 +++- > > arch/x86/kernel/stacktrace.c | 106 > +++++++++++++++++++++++++++++++++++++ > > drivers/tty/sysrq.c | 2 +- > > include/linux/sched.h | 3 +- > > include/linux/stacktrace.h | 2 + > > kernel/debug/kdb/kdb_bt.c | 8 ++-- > > kernel/rtmutex-debug.c | 2 +- > > kernel/sched.c | 20 ++++++- > > kernel/stacktrace.c | 10 ++++ > > 13 files changed, 180 insertions(+), 24 deletions(-) > > This is all pretty x86-centric. I wonder if the code could/should be > implemented in a fashion whcih would permit other architectures to use > it? > With this interface we would need to modify show_stack() on each architecture since we added the dup_stack_pid argument. I'll look into changing the interface so that we don't have to do this. Do you have any suggestions? > > --- a/arch/x86/Kconfig > > +++ b/arch/x86/Kconfig > > @@ -103,6 +103,9 @@ config LOCKDEP_SUPPORT > > config STACKTRACE_SUPPORT > > def_bool y > > > > +config STACKTRACE > > + def_bool y > > + > > What's this change for? > We don't need this any more. I'll get rid of it. > > > config HAVE_LATENCYTOP_SUPPORT > > def_bool y > > > > > > ... > > > > +static unsigned int stack_trace_lookup(int len) > > +{ > > + int j; > > + int index = 0; > > + unsigned int ret = 0; > > + struct task_stack *stack; > > + > > + index = task_stack_hash(cur_stack, len) % DEDUP_STACK_LAST_ENTRY; > > + > > + for (j = 0; j < DEDUP_HASH_MAX_ITERATIONS; j++) { > > + stack = stack_hash_table + (index + (1 << j)) % > > + DEDUP_STACK_LAST_ENTRY; > > + if (stack->entries[0] == 0x0) { > > + memcpy(stack, cur_stack, sizeof(*cur_stack)); > > + ret = 0; > > + break; > > + } else { > > + if (memcmp(stack->entries, cur_stack->entries, > > + sizeof(stack->entries)) == > 0) { > > + ret = stack->pid; > > + break; > > + } > > + } > > + } > > + memset(cur_stack, 0, sizeof(struct task_stack)); > > + > > + return ret; > > +} > > I can kinda see what this function is doing - maintaining an LRU ring > of task stacks. Or something. I didn't look very hard because I > shouldn't have to ;) Please comment this function: tell us what it's > doing and why it's doing it? > > What surprises me about this patch is that it appears to be maintaining > an array of entire stack traces. Why not just generate a good hash of > the stack contents and assume that if one task's hash is equal to > another tasks's hash, then the two tasks have the same stack trace? > > That way, > > struct task_stack { > pid_t pid; > unsigned long entries[DEDUP_MAX_STACK_DEPTH]; > }; > > becomes > > struct task_stack { > pid_t pid; > unsigned long stack_hash; > }; > I'll clean this up for the next version. > > > > > ... > > > Thanks, Andrew