Date: Sun, 1 Aug 2021 13:01:15 -0700
From: Andrew Morton
To: Aaron Tomlin
Cc: linux-mm@kvack.org, mhocko@suse.com,
	penguin-kernel@i-love.sakura.ne.jp, rientjes@google.com,
	llong@redhat.com, neelx@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] mm/oom_kill: show oom eligibility when displaying the current memory state of all tasks
Message-Id: <20210801130115.da6d5cd1d635b21315bcd995@linux-foundation.org>
In-Reply-To: <20210730162002.279678-1-atomlin@redhat.com>
References: <20210730162002.279678-1-atomlin@redhat.com>

On Fri, 30 Jul 2021 17:20:02 +0100 Aaron Tomlin wrote:

> Changes since v2:
>  - Use single character (e.g. 'R' for MMF_OOM_SKIP) as suggested
>    by Tetsuo Handa
>  - Add new header to oom_dump_tasks documentation
>  - Provide further justification
>
>
> The output generated by dump_tasks() can be helpful to determine why
> there was an OOM condition and which rogue task potentially caused it.
> Please note that this is only provided when sysctl oom_dump_tasks is
> enabled.
>
> At the present time, when showing potential OOM victims, we do not
> exclude any task that are not OOM eligible e.g. those that have
> MMF_OOM_SKIP set; it is possible that the last OOM killable victim was
> already OOM killed, yet the OOM reaper failed to reclaim memory and set
> MMF_OOM_SKIP. This can be confusing (or perhaps even be misleading) to the
> viewer. Now, we already unconditionally display a task's oom_score_adj_min
> value that can be set to OOM_SCORE_ADJ_MIN which is indicative of an
> "unkillable" task.
>
> This patch provides a clear indication with regard to the OOM ineligibility
> (and why) of each displayed task with the addition of a new column namely
> "oom_skipped". An example is provided below:
>
> [ 5084.524970] [ pid  ]        uid   tgid total_vm   rss pgtables_bytes swapents oom_score_adj oom_skipped name
> [ 5084.526397] [660417]          0 660417    35869   683         167936        0         -1000           M conmon
> [ 5084.526400] [660452]          0 660452   175834   472          86016        0          -998             pod
> [ 5084.527460] [752415]          0 752415    35869   650         172032        0         -1000           M conmon
> [ 5084.527462] [752575] 1001050000 752575   184205 11158         700416        0           999             npm
> [ 5084.527467] [753606] 1001050000 753606   183380 46843        2134016        0           999             node
> [ 5084.527581] Memory cgroup out of memory: Killed process 753606 (node) total-vm:733520kB, anon-rss:161228kB, file-rss:26144kB, shmem-rss:0kB, UID:1001050000
>
> So, a single character 'M' is for OOM_SCORE_ADJ_MIN, 'R' MMF_OOM_SKIP and
> 'V' for in_vfork().
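For anyone reading such a dump after the fact, the column could be decoded along these lines. This is only a userspace sketch and not part of the patch: the name oom_skipped_reason() and the wording of the strings are made up for illustration. Only the three characters and what they stand for come from the changelog above.

```c
#include <stddef.h>

/*
 * Hypothetical userspace helper (not part of the patch): translate the
 * single-character "oom_skipped" flag into a human-readable reason.
 * The characters 'M', 'R' and 'V' and their meanings are taken from
 * the changelog; the function name and the strings are assumptions.
 */
static const char *oom_skipped_reason(char flag)
{
	switch (flag) {
	case 'M':
		return "oom_score_adj is OOM_SCORE_ADJ_MIN";
	case 'R':
		return "MMF_OOM_SKIP is set on the task's mm";
	case 'V':
		return "task is in vfork()";
	default:
		/* Empty column (task is eligible) or an unknown character. */
		return NULL;
	}
}
```

With the example dump above, the two conmon rows would decode to the OOM_SCORE_ADJ_MIN reason, and the pod, npm and node rows (empty column) would yield NULL.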
>
> index 003d5cc3751b..4c79fa00ddb3 100644
> --- a/Documentation/admin-guide/sysctl/vm.rst
> +++ b/Documentation/admin-guide/sysctl/vm.rst
> @@ -650,8 +650,9 @@ oom_dump_tasks
>  Enables a system-wide task dump (excluding kernel threads) to be produced
>  when the kernel performs an OOM-killing and includes such information as
>  pid, uid, tgid, vm size, rss, pgtables_bytes, swapents, oom_score_adj
> -score, and name. This is helpful to determine why the OOM killer was
> -invoked, to identify the rogue task that caused it, and to determine why
> +score, oom eligibility status and name. This is helpful to determine why
> +the OOM killer was invoked, to identify the rogue task that caused it, and
> +to determine why

It would be better if the meaning of 'M', 'R' and 'V' were described here.

>  the OOM killer chose the task it did to kill.
>
> +/**
> + * is_task_eligible_oom - determine if and why a task cannot be OOM killed
> + * @tsk: task to check
> + *
> + * Needs to be called with task_lock().
> + */
> +static const char * const is_task_oom_eligible(struct task_struct *p)

Name seems inappropriate. task_oom_eligibility()?

> +{
> +	long adj;
> +
> +	adj = (long)p->signal->oom_score_adj;
> +	if (adj == OOM_SCORE_ADJ_MIN)
> +		return "M";
> +	else if (test_bit(MMF_OOM_SKIP, &p->mm->flags)
> +		return "R";
> +	else if (in_vfork(p))
> +		return "V";
> +	else
> +		return "";
> +}
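A few smaller things in the hunk as quoted: the test_bit() line is missing a closing parenthesis, the kernel-doc comment names is_task_eligible_oom and @tsk while the function is is_task_oom_eligible() and its parameter is p, and the second const in the return type has no effect. Something like the following shows the shape I have in mind. It is a userspace sketch only: struct mock_task and its fields are stand-ins I invented so that it builds outside the kernel, and they are not kernel API.

```c
#include <stdbool.h>

/* Same value as the kernel's OOM_SCORE_ADJ_MIN; redefined for userspace. */
#define OOM_SCORE_ADJ_MIN	(-1000)

/*
 * Hypothetical stand-in for the pieces of struct task_struct that the
 * helper looks at. Each field mirrors one check in the quoted patch.
 */
struct mock_task {
	short oom_score_adj;	/* p->signal->oom_score_adj */
	bool mm_oom_skip;	/* test_bit(MMF_OOM_SKIP, &p->mm->flags) */
	bool in_vfork;		/* in_vfork(p) */
};

/*
 * task_oom_eligibility - report why a task cannot be OOM killed
 * @p: task to check
 *
 * Returns "M", "R" or "V" for an ineligible task, "" otherwise.
 * The checks are made in the same order as in the quoted patch.
 */
static const char *task_oom_eligibility(const struct mock_task *p)
{
	if (p->oom_score_adj == OOM_SCORE_ADJ_MIN)
		return "M";
	if (p->mm_oom_skip)
		return "R";
	if (p->in_vfork)
		return "V";
	return "";
}
```

Note that the ordering means a task with several reasons set reports only the first one, so "M" hides "R" and "V".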