From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f70.google.com (mail-wm0-f70.google.com [74.125.82.70]) by kanga.kvack.org (Postfix) with ESMTP id 8C7AC28025E for ; Thu, 22 Dec 2016 14:24:11 -0500 (EST) Received: by mail-wm0-f70.google.com with SMTP id c85so6060462wmi.6 for ; Thu, 22 Dec 2016 11:24:11 -0800 (PST) Received: from mx2.suse.de (mx2.suse.de. [195.135.220.15]) by mx.google.com with ESMTPS id iq8si3063356wjb.259.2016.12.22.11.24.09 for (version=TLS1 cipher=AES128-SHA bits=128/128); Thu, 22 Dec 2016 11:24:10 -0800 (PST) Date: Thu, 22 Dec 2016 20:24:07 +0100 From: Michal Hocko Subject: Re: [PATCH] mm/page_alloc: Wait for oom_lock before retrying. Message-ID: <20161222192406.GB19898@dhcp22.suse.cz> References: <201612151921.CBE43202.SFLtOFJMOFOQVH@I-love.SAKURA.ne.jp> <201612192025.IFF13034.HJSFLtOFFMQOOV@I-love.SAKURA.ne.jp> <20161219122738.GB427@tigerII.localdomain> <20161220153948.GA575@tigerII.localdomain> <201612221927.BGE30207.OSFJMFLFOHQtOV@I-love.SAKURA.ne.jp> <201612222233.CBC56295.LFOtMOVQSJOFHF@I-love.SAKURA.ne.jp> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201612222233.CBC56295.LFOtMOVQSJOFHF@I-love.SAKURA.ne.jp> Sender: owner-linux-mm@kvack.org List-ID: To: Tetsuo Handa Cc: sergey.senozhatsky@gmail.com, linux-mm@kvack.org, pmladek@suse.cz On Thu 22-12-16 22:33:40, Tetsuo Handa wrote: > Tetsuo Handa wrote: > > Now, what options are left other than replacing !mutex_trylock(&oom_lock) > > with mutex_lock_killable(&oom_lock) which also stops wasting CPU time? > > Are we waiting for offloading sending to consoles? > > From http://lkml.kernel.org/r/20161222115057.GH6048@dhcp22.suse.cz : > > > Although I don't know whether we agree with mutex_lock_killable(&oom_lock) > > > change, I think this patch alone can go as a cleanup. > > > > No, we don't agree on that part. As this is a printk issue I do not want > > to workaround it in the oom related code. That is just ridiculous. The > > very same issue would be possible due to other continous source of log > > messages. > > I don't think so. Lockup caused by printk() is printk's problem. But printk > is not the only source of lockup. If CONFIG_PREEMPT=y, it is possible that > a thread which held oom_lock can sleep for unbounded period depending on > scheduling priority. Unless there is some runaway realtime process then the holder of the oom lock shouldn't be preempted for the _unbounded_ amount of time. It might take quite some time, though. But that is not reduced to the OOM killer. Any important part of the system (IO flushers and what not) would suffer from the same issue. > Then, you call such latency as scheduler's problem? > mutex_lock_killable(&oom_lock) change helps coping with whatever delays > OOM killer/reaper might encounter. It helps _your_ particular insane workload. I believe you can construct many others which which would cause a similar problem and the above suggestion wouldn't help a bit. Until I can see this is easily triggerable on a reasonably configured system then I am not convinced we should add more non trivial changes to the oom killer path. -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org