From: Michal Hocko
To: Yafang Shao
Cc: Johannes Weiner, Vladimir Davydov, Andrew Morton, Linux MM
Date: Tue, 26 Nov 2019 10:50:33 +0100
Subject: Re: [PATCH] mm, memcg: clear page protection when memcg oom group happens
Message-ID: <20191126095033.GC20912@dhcp22.suse.cz>
References: <20191125123123.GL31714@dhcp22.suse.cz> <20191125124553.GM31714@dhcp22.suse.cz> <20191125142150.GP31714@dhcp22.suse.cz> <20191125144213.GB602168@cmpxchg.org> <20191126073129.GA20912@dhcp22.suse.cz>

On Tue 26-11-19 17:35:59, Yafang Shao wrote:
> On Tue, Nov 26, 2019 at 3:31 PM Michal Hocko wrote:
> >
> > On Tue 26-11-19 11:52:19, Yafang Shao wrote:
> > > On Mon, Nov 25, 2019 at 10:42 PM Johannes Weiner wrote:
> > > >
> > > > On Mon, Nov 25, 2019 at 03:21:50PM +0100, Michal Hocko wrote:
> > > > > On Mon 25-11-19 22:11:15, Yafang Shao wrote:
> > > > > > When there're no processes, we don't need to protect the pages. You
> > > > > > can consider it as 'fault tolerance'.
> > > > >
> > > > > I have already tried to explain why this is a bold statement that
> > > > > doesn't really hold universally and that the kernel doesn't really have
> > > > > enough information to make an educated guess.
> > > >
> > > > I agree, this is not obviously true. And the kernel shouldn't try to
> > > > guess whether the explicit userspace configuration is still desirable
> > > > to userspace or not. Should we also delete the cgroup when it becomes
> > > > empty for example?
> > > >
> > > > It's better to implement these kinds of policy decisions from
> > > > userspace.
> > > >
> > > > There is a cgroup.events file that can be polled, and its "populated"
> > > > field shows conveniently whether there are tasks in a subtree or
> > > > not. You can use that to clear protection settings.
> > >
> > > Why isn't force_empty supported in cgroup2?
> >
> > There wasn't any sound usecase AFAIR.
> >
> > > In this case we can free the protected file pages immediately with force_empty.
> >
> > You can do the same thing by setting the hard limit to 0.
>
> I looked through the code, and the difference between setting the hard
> limit to 0 and force_empty is that setting the hard limit to 0 will
> generate some OOM reports, which should not happen in this case.
> I think we should make a small improvement as below:

Yes, if you are not able to reclaim all of the memory then the OOM
killer is triggered. And that was not the case with force_empty. I
didn't mean that the two are equivalent, sorry if I misled you. I merely
wanted to point out that you have the means to clean up the memcg with
the existing API.

> @@ -6137,9 +6137,11 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
> 			continue;
> 		}
>
> -		memcg_memory_event(memcg, MEMCG_OOM);
> -		if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> -			break;
> +		if (cgroup_is_populated(memcg->css.cgroup)) {
> +			memcg_memory_event(memcg, MEMCG_OOM);
> +			if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> +				break;
> +		}
> 	}

If there are no killable tasks then "Out of memory and no killable
processes..." is printed. That message accurately reflects the situation
and printing it is the right thing to do. Your patch above would suppress
that information, which might be important.

> Well, if someone doesn't want to kill processes but only wants to drop
> page caches, setting the hard limit to 0 won't work.

Could you be more specific about a real-world example where somebody
wants to drop per-memcg pagecache?
-- 
Michal Hocko
SUSE Labs
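
The userspace approach suggested in the quoted discussion (check the
"populated" field of cgroup.events and clear the protection settings once
the group is empty) might look roughly like the sketch below. It is only an
illustration under stated assumptions: "protection" is taken to mean the
cgroup v2 memory.min and memory.low files, the function names
parse_populated(), write_knob() and clear_protection_if_empty() are
hypothetical, and the check is one-shot. A real monitor would presumably
wait for a file-modified notification on cgroup.events before re-checking.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if the cgroup.events text contains "populated 1", 0 for
 * "populated 0", and -1 if there is no "populated" line at all.
 */
int parse_populated(const char *events_text)
{
	const char *line = events_text;

	while (line && *line) {
		int value;

		if (sscanf(line, "populated %d", &value) == 1)
			return value != 0;
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next line */
	}
	return -1;
}

/* Hypothetical helper: write a short string to <dir>/<file>. */
int write_knob(const char *dir, const char *file, const char *val)
{
	char path[4096];
	FILE *f;
	int len, ok;

	len = snprintf(path, sizeof(path), "%s/%s", dir, file);
	if (len < 0 || (size_t)len >= sizeof(path))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(val, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Assumption: cgroup_dir is a cgroup v2 directory. Returns 1 if the
 * protection was cleared, 0 if the cgroup still has tasks, -1 on error.
 */
int clear_protection_if_empty(const char *cgroup_dir)
{
	char path[4096], buf[512];
	size_t n;
	FILE *f;
	int len;

	assert(cgroup_dir != NULL);
	len = snprintf(path, sizeof(path), "%s/cgroup.events", cgroup_dir);
	if (len < 0 || (size_t)len >= sizeof(path))
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	switch (parse_populated(buf)) {
	case 0:
		/* No tasks in the subtree: drop both protection levels. */
		if (write_knob(cgroup_dir, "memory.min", "0") ||
		    write_knob(cgroup_dir, "memory.low", "0"))
			return -1;
		return 1;
	case 1:
		return 0;
	default:
		return -1;
	}
}
```

A caller would pass the cgroup's directory, for example a path under the
cgroup2 mount point, and repeat the call whenever cgroup.events changes.
Keeping this in userspace leaves the policy decision with whoever
configured the protection in the first place, which is the point made in
the thread.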