Subject: Re: [RFC PATCH] mm, oom: oom ratelimit auto tuning
To: Yafang Shao, Michal Hocko
Cc: Andrew Morton, Linux MM, Petr Mladek, Sergey Senozhatsky
References: <1586597774-6831-1-git-send-email-laoar.shao@gmail.com> <20200414073911.GC4629@dhcp22.suse.cz> <20200414143229.GN4629@dhcp22.suse.cz>
From: Tetsuo Handa
Message-ID: <634bab6a-fee1-45b8-62af-be03062ae2bf@I-love.SAKURA.ne.jp>
Date: Wed, 15 Apr 2020 14:58:22 +0900

On 2020/04/14 23:58, Yafang Shao wrote:
>>>>> The OOM ratelimit starts with a slow rate, and it will increase slowly
>>>>> if the speed of the console is rapid and decrease rapidly if the speed
>>>>> of the console is slow. oom_rs.burst will be in [1, 10] and
>>>>> oom_rs.interval will always greater than 5 * HZ.
>>>>
>>>> I am not against increasing the ratelimit timeout.
>>>> But this patch seems
>>>> to be trying to be too clever. Why cannot we simply increase the
>>>> parameters of the ratelimit?
>>>
>>> I justed worried that the user may complain it if too many
>>> oom_kill_process callbacks are suppressed.
>>
>> This can be a real concern indeed.

I'm proposing automated ratelimiting of dump_tasks() at
http://lkml.kernel.org/r/1563360901-8277-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp .
I believe that automated ratelimiting of dump_tasks() remains necessary
even after printk() becomes asynchronous.

>>
>>> But considering that OOM burst at the same time are always because of
>>> the same reason,
>>
>> This is not really the case. Please note that many parallel OOM killers
>> might happen in memory cgroup setups.
>>
>>> so I think one snapshot of the OOM may be enough.
>>> Simply setting oom_rs with {20 * HZ, 1} can resolve this issue.
>>
>> Does it really though? The ratelimit doesn't stop the long taking
>> output. It simply cannot because the work is already done.
>>
>> That being said, making the ratelimiting more aggressive sounds more
>> like a workaround than an actual fix. So I would go that route only if
>> there is no other option. I believe the real problem here is in printk
>> being too synchronous here. This is a general problem and something
>> printk maintainers are already working on.
>>
>
> Yes, printk being too sync is the real issue. If the printk an be
> async, then we don't need to worry about it at all.

I strongly disagree. Even with asynchronous printk(), dump_tasks() will
needlessly fill the printk() log buffer, and other kernel messages may
be lost because the buffer or the disk fills up.

By the way, Petr and Sergey, how is the work on making printk()
asynchronous progressing? When can we expect it to be merged?