From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx152.postini.com [74.125.245.152])
	by kanga.kvack.org (Postfix) with SMTP id 993936B0069
	for ; Mon, 29 Oct 2012 18:36:39 -0400 (EDT)
Received: by mail-qc0-f169.google.com with SMTP id t2so3925484qcq.14
	for ; Mon, 29 Oct 2012 15:36:38 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To:
References: <20121015144412.GA2173@barrios>
	<20121016061854.GB3934@barrios>
	<20121022235321.GK13817@bbox>
Date: Mon, 29 Oct 2012 15:36:38 -0700
Message-ID:
Subject: Re: zram OOM behavior
From: Luigi Semenzato
Content-Type: text/plain; charset=ISO-8859-1
Sender: owner-linux-mm@kvack.org
List-ID:
To: David Rientjes
Cc: Minchan Kim, linux-mm@kvack.org, Dan Magenheimer, KOSAKI Motohiro

On Mon, Oct 29, 2012 at 12:00 PM, David Rientjes wrote:
> On Mon, 29 Oct 2012, Luigi Semenzato wrote:
>
>> I managed to get the stack trace for the process that refuses to die.
>> I am not sure it's due to the deadlock described in earlier messages.
>> I will investigate further.
>>
>> [96283.704390] chrome x 815ecd20 0 16573 1112 0x00100104
>> [96283.704405] c107fe34 00200046 f57ae000 815ecd20 815ecd20 ec0b645a
>> 0000578f f67cfd20
>> [96283.704427] d0a9a9a0 c107fdf8 81037be5 f5bdf1e8 f6021800 00000000
>> c107fe04 00200202
>> [96283.704449] c107fe0c 00200202 f5bdf1b0 c107fe24 8117ddb1 00200202
>> f5bdf1b0 f5bdf1b8
>> [96283.704471] Call Trace:
>> [96283.704484] [<81037be5>] ? queue_work_on+0x2d/0x39
>> [96283.704497] [<8117ddb1>] ? put_io_context+0x52/0x6a
>> [96283.704510] [<813b68f6>] schedule+0x56/0x58
>> [96283.704520] [<81028525>] do_exit+0x63e/0x640
>
> Could you find out where this happens to be in the function? If you
> enable CONFIG_DEBUG_INFO, you should be able to use gdb on vmlinux and
> find out with l *do_exit+0x63e.

It looks like it's the final call to schedule() in do_exit():

   0x81028520 <+1593>:	call   0x813b68a0
   0x81028525 <+1598>:	ud2a

(gdb) l *do_exit+0x63e
0x81028525 is in do_exit
(/home/semenzato/trunk/src/third_party/kernel/files/kernel/exit.c:1069).
1064
1065		/* causes final put_task_struct in finish_task_switch(). */
1066		tsk->state = TASK_DEAD;
1067		tsk->flags |= PF_NOFREEZE;	/* tell freezer to ignore us */
1068		schedule();
1069		BUG();
1070		/* Avoid "noreturn function does return". */
1071		for (;;)
1072			cpu_relax();	/* For when BUG is null */
1073	}

Here's a theory: the thread exits fine, but the next scheduled thread
tries to allocate memory before or during finish_task_switch(), so the
dead thread is never cleaned up completely and is still considered
alive by the OOM killer.

Unfortunately I haven't found a code path that supports this theory...
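
To make the theory concrete, here is a toy user-space model of it. This is only a sketch of the hypothesis, not kernel code and not a claim about what the kernel actually does: struct model_task and every model_* function are made-up names, and the "stall in allocation" is reduced to a boolean. The one thing taken from the listing is the ordering: the exiting task marks itself dead and leaves its final put to whoever runs next.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a task; NOT the kernel's task_struct. */
struct model_task {
	bool dead;	/* set where do_exit() sets tsk->state = TASK_DEAD */
	bool released;	/* set once the final put has happened */
};

/*
 * Models the tail of do_exit(): mark the task dead and switch away.
 * The final put is left to the task that runs next.
 */
static void model_do_exit(struct model_task *t)
{
	assert(t != NULL);
	t->dead = true;
}

/*
 * Models the next task coming on CPU. The assumption under test is that
 * it may stall in a memory allocation before it finishes the switch, in
 * which case the cleanup of the previous task is never reached.
 */
static void model_next_task_runs(struct model_task *prev, bool stalls_in_alloc)
{
	assert(prev != NULL);
	if (stalls_in_alloc)
		return;			/* final put of prev never happens */
	if (prev->dead)
		prev->released = true;	/* stands in for the final put_task_struct */
}

/*
 * Models an OOM-killer-like scan that treats every task that has not
 * been released as still alive.
 */
static bool model_oom_sees_task(const struct model_task *t)
{
	assert(t != NULL);
	return !t->released;
}
```

In this model, calling model_do_exit() and then model_next_task_runs() with stalls_in_alloc set to true leaves the task dead but still visible to model_oom_sees_task(); with false, the task is released and no longer seen. Whether a real code path can produce the first case is exactly what is still unproven.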