From: 彭志刚 (zgpeng)
Date: Fri, 20 Dec 2019 17:56:20 +0800
Subject: Re: [PATCH] oom: choose a more suitable process to kill while all processes are not killable
To: Michal Hocko
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, vdavydov.dev@gmail.com, Shakeel Butt, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, zgpeng

Certainly.

Steps to reproduce:
(1) Create a memory cgroup and set memory.limit_in_bytes.
(2) Move the bash process into the newly created cgroup, and set the oom_score_adj of the bash process to -998.
(3) From that bash, start several processes, each consuming a different amount of memory, until a cgroup OOM is triggered.

The triggered behavior is shown below. When the cgroup OOM happened, process 23777 was killed, even though process 23772 consumes more memory:

[  591.000970] Tasks state (memory values in pages):
[  591.000970] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[  591.000973] [  23344]     0 23344     2863      923    61440        0          -998 bash
[  591.000975] [  23714]     0 23714    27522    25935   258048        0          -998 test
[  591.000976] [  23772]     0 23772   104622   103032   876544        0          -998 test
[  591.000978] [  23777]     0 23777    78922    77335   667648        0          -998 test
[  591.000980] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0-1,oom_memcg=/test,task_memcg=/test,task=test,pid=23777,uid=0
[  591.000986] Memory cgroup out of memory: Killed process 23777 (test) total-vm:315688kB, anon-rss:308420kB, file-rss:920kB, shmem-rss:0kB, UID:0 pgtables:667648kB oom_score_adj:-998

The verification process is the same.
After applying this patch, when a cgroup OOM occurs, the process that consumes the most memory is killed first. The effect is shown below:

[195118.961767] Tasks state (memory values in pages):
[195118.961768] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[195118.961770] [  22283]     0 22283     2862      911    69632        0          -998 bash
[195118.961771] [  79244]     0 79244    27522    25922   262144        0          -998 test
[195118.961773] [  79247]     0 79247    53222    51596   462848        0          -998 test
[195118.961776] [  79263]     0 79263    58362    56744   507904        0          -998 test
[195118.961777] [  79267]     0 79267    45769    44005   409600        0          -998 test
[195118.961779] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0-1,oom_memcg=/test,task_memcg=/test,task=test,pid=79263,uid=0
[195118.961786] Memory cgroup out of memory: Killed process 79263 (test) total-vm:233448kB, anon-rss:226048kB, file-rss:928kB, shmem-rss:0kB, UID:0 pgtables:507904kB oom_score_adj:-998

Michal Hocko <mhocko@kernel.org> wrote on Fri, 20 Dec 2019 at 15:13:

> On Fri 20-12-19 14:26:12, zgpeng.linux@gmail.com wrote:
> > From: zgpeng
> >
> > It has been found in multiple business scenarios that when an OOM occurs
> > in a cgroup, the process that consumes the most memory in the cgroup is
> > not killed first. Analysis found that each process in the cgroup has
> > oom_score_adj set to -998, and when oom_badness calculates points,
> > any negative result is uniformly set to 1.
>
> Can you provide an example of the oom report?
> --
> Michal Hocko
> SUSE Labs