References: <1586597774-6831-1-git-send-email-laoar.shao@gmail.com> <20200414073911.GC4629@dhcp22.suse.cz> <20200414143229.GN4629@dhcp22.suse.cz>
In-Reply-To: <20200414143229.GN4629@dhcp22.suse.cz>
From: Yafang Shao
Date: Tue, 14 Apr 2020 22:58:12 +0800
Subject: Re: [RFC PATCH] mm, oom: oom ratelimit auto tuning
To: Michal Hocko
Cc: Andrew Morton, Linux MM

On Tue, Apr 14, 2020 at 10:32 PM Michal Hocko wrote:
>
> On Tue 14-04-20 20:32:54, Yafang Shao wrote:
> > On Tue, Apr 14, 2020 at 3:39 PM Michal Hocko wrote:
> [...]
> > > Besides that I strongly suspect that you would be much better off
> > > by disabling /proc/sys/vm/oom_dump_tasks which would reduce the amount
> > > of output a lot. Or do you really require this information when
> > > debugging oom reports?
> > >
> >
> > Yes, disabling /proc/sys/vm/oom_dump_tasks can save lots of time.
> > But I'm not sure whether we can disable it totally, because disabling
> > it would prevent the tasks log from being written into /var/log/messages
> > either.
>
> Yes, eligible tasks would be really missing. The real question is
> whether you are really going to miss that information. From my
> experience of looking into oom reports for years I can tell that the
> list might be useful but in a vast majority of cases I simply do not
> really need it because the state of memory and chosen victims are much
> more important. The list of tasks is usually interesting only when you
> want to double check whether the victim selection was reasonable or
> in cases where the list of tasks itself can tell whether something went
> wild in userspace.
>

Agreed.
From my experience, the list of tasks is mainly used to double check
the oom score.

> > > > The OOM ratelimit starts with a slow rate, and it will increase slowly
> > > > if the speed of the console is rapid and decrease rapidly if the speed
> > > > of the console is slow. oom_rs.burst will be in [1, 10] and
> > > > oom_rs.interval will always be greater than 5 * HZ.
> > > >
> > > I am not against increasing the ratelimit timeout. But this patch seems
> > > to be trying to be too clever. Why cannot we simply increase the
> > > parameters of the ratelimit?
> >
> > I was just worried that the user may complain if too many
> > oom_kill_process callbacks are suppressed.
>
> This can be a real concern indeed.
>
> > But considering that OOM bursts at the same time are always due to
> > the same reason,
>
> This is not really the case. Please note that many parallel OOM killers
> might happen in memory cgroup setups.
>
> > so I think one snapshot of the OOM may be enough.
> > Simply setting oom_rs with {20 * HZ, 1} can resolve this issue.
>
> Does it really though? The ratelimit doesn't stop the long-running
> output. It simply cannot because the work is already done.
>
> That being said, making the ratelimiting more aggressive sounds more
> like a workaround than an actual fix. So I would go that route only if
> there is no other option. I believe the real problem here is in printk
> being too synchronous. This is a general problem and something
> printk maintainers are already working on.
>

Yes, printk being too sync is the real issue.
If printk can be async, then we don't need to worry about it at all.

> For now I would recommend working around this problem by reducing the log
> level or disabling dump_tasks.
>

Reducing the log level is what we have been doing.

Many thanks for your patient explanation.

Thanks
Yafang
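
A note on the {20 * HZ, 1} suggestion discussed above: what follows is a
minimal userspace sketch in Python, not kernel code, of an interval/burst
rate limiter in the style of oom_rs. The RateLimit class, the count_allowed
helper, HZ = 250, and the 2-second spacing between OOM events are all
assumptions made up for illustration, and the kernel's ratelimit code
differs in detail. The model only decides whether a report may start; it
says nothing about how long a report takes once printing has begun.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

HZ = 250  # assumed tick rate, chosen only for illustration


@dataclass
class RateLimit:
    """Simplified userspace model of an interval/burst limiter (jiffies)."""
    interval: int                 # window length in jiffies
    burst: int                    # events allowed per window
    begin: Optional[int] = None   # start of the current window
    printed: int = 0              # events allowed in the current window
    missed: int = 0               # events suppressed so far (never reset here)

    def allow(self, now: int) -> bool:
        # Open a new window on first use or once the interval has elapsed.
        if self.begin is None or now - self.begin >= self.interval:
            self.begin = now
            self.printed = 0
        if self.printed < self.burst:
            self.printed += 1
            return True
        self.missed += 1
        return False


def count_allowed(rs: RateLimit, event_times: Iterable[int]) -> int:
    """Return how many of the events at the given times pass the limiter."""
    return sum(1 for t in event_times if rs.allow(t))


if __name__ == "__main__":
    # Hypothetical workload: one OOM event every 2 seconds for a minute.
    events = [s * HZ for s in range(0, 60, 2)]
    print("{20 * HZ, 1}:", count_allowed(RateLimit(20 * HZ, 1), events))
    print("{5 * HZ, 10}:", count_allowed(RateLimit(5 * HZ, 10), events))
```

In this model, {20 * HZ, 1} lets 3 of the 30 events through, while
{5 * HZ, 10} lets all 30 through. That illustrates both the concern about
suppressing too many reports and the point that a stricter limit reduces how
often a dump starts without making any single dump cheaper.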