Date: Wed, 11 Feb 2015 19:59:45 +0100
From: Oleg Nesterov
Subject: Re: How to handle TIF_MEMDIE stalls?
Message-ID: <20150211185945.GA3578@redhat.com>
References: <20141230112158.GA15546@dhcp22.suse.cz>
 <201502092044.JDG39081.LVFOOtFHQFOMSJ@I-love.SAKURA.ne.jp>
 <201502102258.IFE09888.OVQFJOMSFtOLFH@I-love.SAKURA.ne.jp>
 <20150210151934.GA11212@phnom.home.cmpxchg.org>
 <201502111123.ICD65197.FMLOHSQJFVOtFO@I-love.SAKURA.ne.jp>
 <201502112237.CDD87547.tJOFFVHLOOQSMF@I-love.SAKURA.ne.jp>
 <20150211185015.GA2792@redhat.com>
In-Reply-To: <20150211185015.GA2792@redhat.com>
To: Tetsuo Handa
Cc: mhocko@suse.cz, hannes@cmpxchg.org, david@fromorbit.com,
 dchinner@redhat.com, linux-mm@kvack.org, rientjes@google.com,
 akpm@linux-foundation.org, mgorman@suse.de, torvalds@linux-foundation.org

On 02/11, Oleg Nesterov wrote:
>
> On 02/11, Tetsuo Handa wrote:
> >
> > (Asking Oleg this time.)
>
> Well, sorry, I ignored the previous discussion, not sure I understand you
> correctly.
>
> > > Though, more serious behavior with this reproducer is (B) where the system
> > > stalls forever without kernel messages being saved to /var/log/messages .
> > > out_of_memory() does not select victims until the coredump to pipe can make
> > > progress whereas the coredump to pipe can't make progress until memory
> > > allocation succeeds or fails.
> >
> > This behavior is related to commit d003f371b2701635 ("oom: don't assume
> > that a coredumping thread will exit soon"). That commit tried to take
> > SIGNAL_GROUP_COREDUMP into account, but actually it is failing to do so.
>
> Heh. Please see the changelog. This "fix" is obviously very limited, it does
> not even try to solve all problems (even with coredump in particular).
>
> Note also that SIGNAL_GROUP_COREDUMP is not even set if the process (not a
> sub-thread) shares the memory with the coredumping task. It would be better
> to check mm->core_state != NULL instead, but this needs the locking. Plus
> that process likely sleeps in D state in exit_mm(), so this can't help.
>
> And that is why we set SIGNAL_GROUP_COREDUMP in zap_threads(), not in
> zap_process(). We probably want to make that "wait for coredump_finish()"
> sleep in exit_mm() killable, but this is not simple.

On second thought, perhaps it makes sense to set SIGNAL_GROUP_COREDUMP
anyway, even if a CLONE_VM process participating in the coredump is not
killable. I'll recheck tomorrow.

> Sorry for noise if the above is not relevant.
>
> Oleg.