Subject: Re: [PATCH v3] mm, memcg: fix the stupid OOM killer when shrinking memcg hard limit
From: Qian Cai
To: Yafang Shao
Cc: Michal Hocko, hannes@cmpxchg.org, vdavydov.dev@gmail.com, Andrew Morton, linux-mm@kvack.org, David Hildenbrand
Date: Wed, 27 Nov 2019 23:28:39 -0500
Message-Id: <4807E3C5-990F-4A5E-B7D9-22357A4B2845@lca.pw>
References: <1574914633-2020-1-git-send-email-laoar.shao@gmail.com>
In-Reply-To: <1574914633-2020-1-git-send-email-laoar.shao@gmail.com>

> On Nov 27, 2019, at 11:17 PM, Yafang Shao wrote:
>
> When there are no more processes in a memcg (e.g., due to OOM
> group), we can still have file pages in the page cache.
>
> If these pages are protected by memory.min, they can't be reclaimed.
> Especially if there won't be another process in this memcg and the
> memcg is kept online, we do want to drop these pages from the page
> cache.
>
> By dropping these page caches we can keep reclaimers (e.g., kswapd or
> direct reclaim) from scanning and reclaiming pages from all memcgs in
> the system - under memory pressure, the reclaimers will try to reclaim
> fairly from every memcg in the system.
>
> By setting the hard limit of such a memcg to 0, we allow dropping the
> page cache of such memcgs. Unfortunately, this may invoke the OOM
> killer and generate a lot of output. The OOM output is not expected by
> an admin who wants to drop these caches and knows that there are no
> processes in this memcg anymore.
>
> Therefore, if a memcg is not populated, we should not invoke the OOM
> killer - there is nothing to kill. The next time a new process is
> started in the memcg and "max" is still below usage, the OOM killer
> will be invoked and the new process will be killed.
>
> [ The commit log above was contributed by David ]
>
> What makes this issue worse: when there are killable tasks and the OOM
> killer has killed the last of them, what happens then? nr_reclaims is
> already 0 and drained is already true, so the OOM killer will try to
> kill nothing (it knows it has killed the last task), which is pointless
> behavior.
>
> Some may worry that admins will no longer see that the memcg went OOM
> due to the limit change. That is not an issue: an admin who changes the
> limit must check the result of the change, and by reading
> memory.{max, current, stat} he can get everything he needs.
>
> Cc: David Hildenbrand
> Nacked-by: Michal Hocko
> Signed-off-by: Yafang Shao

Surely too big a turkey to swallow? The wording is unprofessional, and the patch carries a NACK from one of the maintainers.

>
> ---
> Changes since v2: Refresh the subject and commit log. The original
> subject was "mm, memcg: avoid oom if cgroup is not populated"
> ---
> mm/memcontrol.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 1c4c08b..e936f1b 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6139,9 +6139,20 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
> 			continue;
> 		}
>
> -		memcg_memory_event(memcg, MEMCG_OOM);
> -		if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> +		/* If there are no processes, we don't need to invoke the OOM
> +		 * killer. The next time you try to start a process in this
> +		 * memcg, "max" may still be below usage, and then the OOM
> +		 * killer will be invoked. This can be considered lazy OOM,
> +		 * which is what we have always been doing in the kernel.
> +		 * Please, Michal, that really is consistency.
> +		 */
> +		if (cgroup_is_populated(memcg->css.cgroup)) {
> +			memcg_memory_event(memcg, MEMCG_OOM);
> +			if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> +				break;
> +		} else {
> 			break;
> +		}
> 	}
>
> 	memcg_wb_domain_size_changed(memcg);
> --
> 1.8.3.1
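For reference, the control flow the hunk changes can be sketched as a user-space simulation. This is a toy model, not kernel code: `struct memcg_sim`, `reclaim()`, and `shrink_to_max()` are made-up stand-ins for struct mem_cgroup, try_to_free_mem_cgroup_pages(), and the retry loop in memory_max_write(); the real function also handles pending signals and per-cpu charge draining, which the sketch omits.

```c
#include <stdbool.h>

/* Toy model of a memcg, standing in for struct mem_cgroup. */
struct memcg_sim {
	long usage;      /* pages currently charged */
	long max;        /* the new hard limit being written */
	bool populated;  /* cgroup_is_populated(): any tasks left? */
	int oom_kills;   /* times the OOM killer was invoked */
};

/* Stand-in for try_to_free_mem_cgroup_pages(): frees up to 8 pages. */
static long reclaim(struct memcg_sim *mc)
{
	long over = mc->usage - mc->max;
	return over > 8 ? 8 : over;
}

/* Mirrors memory_max_write()'s loop after the patch: reclaim first, and
 * once the reclaim retries are exhausted, invoke the OOM killer only
 * while the cgroup still contains tasks. An unpopulated cgroup simply
 * stops retrying instead of OOM-killing nothing. */
static void shrink_to_max(struct memcg_sim *mc)
{
	int nr_reclaims = 5;  /* MAX_RECLAIM_RETRIES in the real code */

	while (mc->usage > mc->max) {
		if (nr_reclaims) {
			mc->usage -= reclaim(mc);
			nr_reclaims--;
			continue;
		}
		if (!mc->populated)
			break;  /* nothing to kill: skip the OOM killer */
		/* memcg_memory_event(memcg, MEMCG_OOM) plus
		 * mem_cgroup_out_of_memory() in the real code */
		mc->oom_kills++;
		mc->populated = false;  /* toy: the kill empties the cgroup */
	}
}
```

With the check in place, writing a `max` below `usage` into an empty cgroup reclaims what it can and returns quietly, while a populated cgroup still sees the OOM killer once its reclaim retries run out.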