From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from psmtp.com (na3sys010amx123.postini.com [74.125.245.123])
	by kanga.kvack.org (Postfix) with SMTP id C72D76B006C
	for ; Mon, 29 Oct 2012 19:34:53 -0400 (EDT)
Received: by mail-qa0-f41.google.com with SMTP id c4so2203301qae.14
	for ; Mon, 29 Oct 2012 16:34:52 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To:
References: <20121015144412.GA2173@barrios> <20121016061854.GB3934@barrios> <20121022235321.GK13817@bbox>
Date: Mon, 29 Oct 2012 16:34:52 -0700
Message-ID:
Subject: Re: zram OOM behavior
From: Luigi Semenzato
Content-Type: text/plain; charset=ISO-8859-1
Sender: owner-linux-mm@kvack.org
List-ID:
To: David Rientjes
Cc: Minchan Kim , linux-mm@kvack.org, Dan Magenheimer , KOSAKI Motohiro

On Mon, Oct 29, 2012 at 4:23 PM, Luigi Semenzato wrote:
> On Mon, Oct 29, 2012 at 3:52 PM, David Rientjes wrote:
>> On Mon, 29 Oct 2012, Luigi Semenzato wrote:
>>
>>> It looks like it's the final call to schedule() in do_exit():
>>>
>>>    0x81028520 <+1593>: call   0x813b68a0
>>>    0x81028525 <+1598>: ud2a
>>>
>>> (gdb) l *do_exit+0x63e
>>> 0x81028525 is in do_exit
>>> (/home/semenzato/trunk/src/third_party/kernel/files/kernel/exit.c:1069).
>>> 1064
>>> 1065		/* causes final put_task_struct in finish_task_switch(). */
>>> 1066		tsk->state = TASK_DEAD;
>>> 1067		tsk->flags |= PF_NOFREEZE;	/* tell freezer to ignore us */
>>> 1068		schedule();
>>> 1069		BUG();
>>> 1070		/* Avoid "noreturn function does return".  */
>>> 1071		for (;;)
>>> 1072			cpu_relax();	/* For when BUG is null */
>>> 1073	}
>>>
>>
>> You're using an older kernel since the code you quoted from the oom killer
>> hasn't had the per-memcg oom kill rewrite.  There's logic that is called
>> from select_bad_process() that should exclude this thread from being
>> considered and deferred since it has a non-zero task->exit_thread, i.e. in
>> oom_scan_process_thread():
>>
>>	if (task->exit_state)
>>		return OOM_SCAN_CONTINUE;
>>
>> And that's called from both the global oom killer and memcg oom killer.
>> So I'm thinking you're either running on an older kernel or there is no
>> oom condition at the time this is captured.
>
> Very sorry, I never said that we're on kernel 3.4.0.
>
> We are in a OOM-kill situation:
>
> ./arch/x86/include/asm/thread_info.h:91:#define TIF_MEMDIE 20
>
> Bit 20 in the threadinfo flags is set:
>
>> [96283.704390] chrome x 815ecd20 0 16573 1112 0x00100104
>
> So your suggestion would be to apply OOM-related patches from a later kernel?
>
> Thanks!

Actually, I am not sure that the 3.6 OOM code is sufficiently
different to avoid this situation.  3.4 already has a test for
task->exit_state, which in my case must be failing even though
TIF_MEMDIE is set and the process has finished do_exit:

	do_each_thread(g, p) {
		unsigned int points;

		if (p->exit_state)
			continue;
		...

In fact, those changes look mostly cosmetic.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
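
For anyone who wants to check the flag arithmetic in the message above,
here is a small userspace sketch (not kernel code) that tests bit 20 of
the flags word 0x00100104 from the task dump.  The TIF_MEMDIE value is
the one quoted from thread_info.h; the helper names flag_bit_is_set()
and print_set_bits() are made up for illustration and do not exist in
the kernel.

```c
#include <assert.h>
#include <stdio.h>

/* Bit number of TIF_MEMDIE, as quoted in the message from
 * arch/x86/include/asm/thread_info.h (kernel 3.4, x86). */
#define TIF_MEMDIE 20

/* Hypothetical helper, not a kernel function: returns 1 if bit `bit`
 * of `flags` is set, 0 otherwise.  Only the low 32 bits are handled. */
int flag_bit_is_set(unsigned long flags, unsigned int bit)
{
	assert(bit < 32);
	return (int)((flags >> bit) & 1UL);
}

/* Hypothetical helper: prints each set bit in the low 32 bits of
 * `flags` and returns the number of set bits found. */
int print_set_bits(unsigned long flags)
{
	int count = 0;
	unsigned int bit;

	for (bit = 0; bit < 32; bit++) {
		if (flag_bit_is_set(flags, bit)) {
			printf("bit %2u set (mask 0x%08lx)\n", bit, 1UL << bit);
			count++;
		}
	}
	return count;
}
```

Calling flag_bit_is_set(0x00100104UL, TIF_MEMDIE) returns 1, which
agrees with the statement that TIF_MEMDIE is set for this task.  The
sketch deliberately does not name the other set bits, since the message
only identifies bit 20.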