Date: Mon, 25 Nov 2019 13:31:23 +0100
From: Michal Hocko
To: Yafang Shao
Cc: Johannes Weiner, Vladimir Davydov, Andrew Morton, Linux MM
Subject: Re: [PATCH] mm, memcg: clear page protection when memcg oom group happens
Message-ID: <20191125123123.GL31714@dhcp22.suse.cz>
References: <1574676893-1571-1-git-send-email-laoar.shao@gmail.com> <20191125110848.GH31714@dhcp22.suse.cz> <20191125115409.GJ31714@dhcp22.suse.cz>

On Mon 25-11-19 20:17:15, Yafang Shao wrote:
> On Mon, Nov 25, 2019 at 7:54 PM Michal Hocko wrote:
> >
> > On Mon 25-11-19 19:37:59, Yafang Shao wrote:
> > > On Mon, Nov 25, 2019 at 7:08 PM Michal Hocko wrote:
> > > >
> > > > On Mon 25-11-19 05:14:53, Yafang Shao wrote:
> > > > > We set memory.oom.group so that all processes in this memcg are
> > > > > killed by the OOM killer to free more pages. In this case, it doesn't
> > > > > make sense to protect the pages with memory.{min, low} again if they
> > > > > are set.
> > > >
> > > > I do not see why? What does group OOM killing have to do with
> > > > the reclaim protection? What is the actual problem you are trying to
> > > > solve?
> > > >
> > >
> > > The cgroup is treated as an indivisible workload when memory.oom.group
> > > is set and the OOM killer is trying to kill a process in this cgroup.
> >
> > Yes this is true.
> >
> > > We set memory.oom.group to guarantee the workload integrity; now
> > > that the processes are all killed, why keep the page cache here?
> >
> > Because an administrator has configured the reclaim protection in a
> > certain way and hopefully had a good reason to do that. We are not going
> > to override that configuration just because the OOM killer was invoked
> > and killed tasks in that memcg. The workload might get restarted and it
> > would run under different constraints all of a sudden, which is not
> > expected.
> >
> > In short, the kernel should never silently change the configuration made
> > by an administrator.
>
> Understood.
>
> So what about the change below? We don't override the admin setting,
> but we reclaim the page cache from it if this memcg is oom killed.
> Something like,
>
> mem_cgroup_protected
> {
>         ...
> +       if (!cgroup_is_populated(memcg->css.cgroup) &&
> +           mem_cgroup_under_oom_group_kill(memcg))
> +               return MEMCG_PROT_NONE;
> +
>         usage = page_counter_read(&memcg->memory);
>         if (!usage)
>                 return MEMCG_PROT_NONE;
> }

I assume that mem_cgroup_under_oom_group_kill is essentially
	memcg->under_oom && memcg->oom_group

But that doesn't really help much because all the reclaim attempts have
already been made and have failed. I do not remember the exact details
about under_oom, but I have a recollection that it wouldn't really work
for cgroup v2 because the oom_control is not in place, and so the state
would be set for only a very short period of time.

Again, what is the problem that you are trying to fix?
-- 
Michal Hocko
SUSE Labs
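
For readers following the thread, the check under discussion can be
modelled in userspace roughly as below. This is only a sketch under
stated assumptions: the model_* types and functions are hypothetical
stand-ins, not the kernel's struct mem_cgroup or mem_cgroup_protected();
the helper body is the guess from the reply above (under_oom &&
oom_group); and the min/low comparison is a deliberate simplification
that ignores the hierarchy.

```c
#include <stdbool.h>

/* Simplified userspace stand-ins; NOT the kernel's definitions. */
enum model_prot { MODEL_PROT_NONE, MODEL_PROT_LOW, MODEL_PROT_MIN };

struct model_memcg {
	bool oom_group;       /* models memory.oom.group being set */
	int under_oom;        /* nonzero only while an OOM is being handled */
	bool populated;       /* cgroup still has live tasks */
	unsigned long usage;  /* charged pages */
	unsigned long min;    /* models memory.min, in pages */
	unsigned long low;    /* models memory.low, in pages */
};

/* Hypothetical helper from the quoted proposal; body is an assumption. */
static bool model_under_oom_group_kill(const struct model_memcg *memcg)
{
	return memcg->under_oom && memcg->oom_group;
}

/* Models the proposed early exit, followed by a simplified protection check. */
static enum model_prot model_protected(const struct model_memcg *memcg)
{
	/* Proposed: drop protection for an emptied, group-OOM-killed memcg. */
	if (!memcg->populated && model_under_oom_group_kill(memcg))
		return MODEL_PROT_NONE;

	if (!memcg->usage)
		return MODEL_PROT_NONE;

	/* Simplification: compare usage directly against the configured values. */
	if (memcg->usage <= memcg->min)
		return MODEL_PROT_MIN;
	if (memcg->usage <= memcg->low)
		return MODEL_PROT_LOW;
	return MODEL_PROT_NONE;
}
```

In this model the early exit only fires while under_oom is nonzero; once
that state is cleared, the configured protection applies again, which is
the window concern raised in the reply.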