References: <1574818117-2885-1-git-send-email-laoar.shao@gmail.com>
From: Yafang Shao
Date: Wed, 27 Nov 2019 19:35:03 +0800
Subject: Re:
 [PATCH v2] mm, memcg: avoid oom if cgroup is not populated
To: David Hildenbrand
Cc: Michal Hocko, Johannes Weiner, Vladimir Davydov, Andrew Morton, Linux MM

On Wed, Nov 27, 2019 at 7:11 PM David Hildenbrand wrote:
>
> On 27.11.19 02:28, Yafang Shao wrote:
>
> Let me give this patch description an overhaul:
>

Well done! Thanks for your work.

> > There's one case that the processes in a memcg are all exit (due to OOM
> > group or some other reasons), but the file page caches are still exist.
>
> "When there are no more processes in a memcg (e.g., due to OOM
> group), we can still have file pages in the page cache."
>
> > These file page caches may be protected by memory.min so can't be
> > reclaimed. If we can't success to restart the processes in this memcg or
> > don't want to make this memcg offline, then we want to drop the file page
> > caches.
>
> "If these pages are protected by memory.min, they can't be reclaimed.
> Especially if there won't be another process in this memcg and the memcg
> is kept online, we do want to drop these pages from the page cache."
>
> > The advantage of droping this file caches is it can avoid the reclaimer
> > (either kswapd or direct) scanning and reclaiming pages from all memcgs
> > exist in this system, because currently the reclaimer will fairly reclaim
> > pages from all memcgs if the system is under memory pressure.
>
> "By dropping these page caches we can avoid reclaimers (e.g., kswapd or
> direct) to scan and reclaim pages from all memcgs in the system -
> because the reclaimers will try to fairly reclaim pages from all memcgs
> in the system when under memory pressure."
>
> > The possible method to drop these file page caches is setting the
> > hard limit of this memcg to 0. Unfortunately this may invoke the OOM killer
> > and generates lots of outputs, that should not happen.
> > The OOM output is not expected by the admin if he or she wants to drop
> > the cahes and knows there're no processes in this memcg.
>
> "By setting the hard limit of such a memcg to 0, we allow to drop the
> page cache of such memcgs. Unfortunately, this may invoke the OOM killer
> and generate a lot of output. The OOM output is not expected by an admin
> who wants to drop these caches and knows that there are no processes in
> this memcg anymore."
> >
> > If memcg is not populated, we should not invoke the OOM killer because
> > there's nothing to kill. Next time when you start a new process and if the
> > max is still bellow usage, the OOM killer will be invoked and your new
> > process is killed, so we can cosider it as lazy OOM, that is we have been
> > always doing in the kernel.
>
> "Therefore, if a memcg is not populated, we should not invoke the OOM
> killer - there is nothing to kill. The next time a new process is
> started in the memcg and the "max" is still below usage, the OOM killer
> will be invoked and the new process will be killed."
>
> 1. I don't think the "lazy OOM" part is relevant.
>

That isn't important.

> 2. Where is the part that modifies the limits? or did you drop that? is
> it part of another patch?
>

No, it is not part of another patch.
Modifying the limits is a workaround that Michal [1] suggested for my
problem, but it doesn't actually work; that is why I submitted this
patch.

[1] https://lore.kernel.org/linux-mm/20191126073129.GA20912@dhcp22.suse.cz/

> 3. I think I agree with Michal that modifying the limits smells more
> like a configuration thingy to be handled by an admin (especially, adapt
> min/max properly). But again, not sure where that change is located :)
>

I agree with you all, but that is what Michal told me to do.
See above and the discussion in this thread.

> 4. This patch on its own (if there are no processes, there is nothing to
> kill) does not sound too wrong to me. Instead of an endless loop
> (besides signals) where we can't make any progress, we exit right away.
>

Thanks for your feedback.

> (I am not yet too familiar with memgc, Michal is clearly the expert :) )
>

I agree with you that Michal is an expert, but clearly not on this
issue.

> >
> > Fixes: b6e6edcf ("mm: memcontrol: reclaim and OOM kill when shrinking memory.max below usage")
> > Signed-off-by: Yafang Shao
> > Cc: Johannes Weiner
> > Cc: Michal Hocko
> > ---
> >  mm/memcontrol.c | 15 +++++++++++++--
> >  1 file changed, 13 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 1c4c08b..e936f1b 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -6139,9 +6139,20 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
> >  			continue;
> >  		}
> >
> > -		memcg_memory_event(memcg, MEMCG_OOM);
> > -		if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> > +		/* If there's no procesess, we don't need to invoke the OOM
> > +		 * killer. Then next time when you try to start a process
> > +		 * in this memcg, the max may still bellow usage, and then
> > +		 * this OOM killer will be invoked. This can be considered
> > +		 * as lazy OOM, that is we have been always doing in the
> > +		 * kernel. Pls. Michal, that is really consistency.
> > +		 */
> > +		if (cgroup_is_populated(memcg->css.cgroup)) {
> > +			memcg_memory_event(memcg, MEMCG_OOM);
> > +			if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> > +				break;
> > +		} else {
> >  			break;
> > +		}
> >  	}
> >
> >  	memcg_wb_domain_size_changed(memcg);
> >
>
>
> --
> Thanks,
>
> David / dhildenb
>

Thanks
Yafang