From: Michal Hocko
To: Yafang Shao
Cc: Johannes Weiner, Vladimir Davydov, Andrew Morton, Linux MM
Subject: Re: [PATCH] mm, memcg: clear page protection when memcg oom group happens
Date: Mon, 25 Nov 2019 15:21:50 +0100
Message-ID: <20191125142150.GP31714@dhcp22.suse.cz>
References: <1574676893-1571-1-git-send-email-laoar.shao@gmail.com> <20191125110848.GH31714@dhcp22.suse.cz> <20191125115409.GJ31714@dhcp22.suse.cz> <20191125123123.GL31714@dhcp22.suse.cz> <20191125124553.GM31714@dhcp22.suse.cz>

On Mon 25-11-19 22:11:15, Yafang Shao wrote:
> On Mon, Nov 25, 2019 at 8:45 PM Michal Hocko wrote:
> >
> > On Mon 25-11-19 20:37:52, Yafang Shao wrote:
> > > On Mon, Nov 25, 2019 at 8:31 PM Michal Hocko wrote:
[...]
> > > > Again, what is the problem that you are trying to fix?
> > >
> > > When there are no processes running in a memcg, for example if they are
> > > killed by the OOM killer, we can't reclaim the file page cache protected
> > > by memory.min of this memcg. These file page caches are useless in
> > > this case.
> > > That's what I'm trying to fix.
> >
> > Could you be more specific please? I would assume that the group oom
> > configured memcg would either restart its workload when killed (that is
> > why you want to kill the whole workload to restart it cleanly in many
> > cases) or simply tear down the memcg altogether.
> >
>
> Yes, we always restart it automatically if these processes exit
> (whether because of OOM or for some other reason).
> It is safe to do that if OOM happens, because OOM is always caused by
> leaked anon pages and the restart can free these anon pages.

No, this is an incorrect assumption. The OOM might happen for many
different reasons.

> But there may be some cases where we fail to restart it, and if that
> happens the protected pages will never be reclaimed until the admin
> resets the protection or takes this memcg offline.

If the workload cannot be restarted for whatever reason then you need an
admin intervention and a proper cleanup. That would include resetting
reclaim protection when in use.

> When there are no processes, we don't need to protect the pages. You
> can consider it as 'fault tolerance'.

I have already tried to explain why this is a bold statement that
doesn't really hold universally and that the kernel doesn't really have
enough information to make an educated guess.

> > In other words why do you care about the oom killer case so much? It is
> > no different from handling a lingering memcg with the workload already
> > finished. You simply have no way to know whether the reclaim protection
> > is still required. Admin is supposed to either offline the memcg that is
> > no longer used or drop the reclaim protection once it is not needed
> > because that has some visible consequences on the overall system
> > operation.
>
> Actually, what I am concerned about is the case where there's no process
> running but memory protection continues protecting the file pages.
> OOM is just one such case.

This sounds like a misconfiguration which should be handled by an admin.
-- 
Michal Hocko
SUSE Labs
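
For illustration only, the admin-side cleanup mentioned above (dropping the
reclaim protection of a memcg that no longer runs anything) could look
roughly like the userspace sketch below. It is not part of the patch or of
this thread. It assumes a cgroup v2 hierarchy, where cgroup.events carries a
"populated" key and memory.min holds the protection value. The function
names and the example path are hypothetical, and whether an empty memcg
should really lose its protection remains a policy decision for the admin.

```python
from pathlib import Path


def is_populated(cgroup_dir: Path) -> bool:
    """Return True if cgroup.events reports live processes in this subtree."""
    for line in (cgroup_dir / "cgroup.events").read_text().splitlines():
        key, _, value = line.partition(" ")
        if key == "populated":
            return value.strip() == "1"
    # Key not found: stay conservative and treat the cgroup as still in use.
    return True


def drop_protection_if_empty(cgroup_dir: Path) -> bool:
    """Reset memory.min to 0 when no process is left in the cgroup.

    Returns True if the protection was reset, False if it was left alone.
    This encodes one possible admin policy, not kernel behaviour.
    """
    if is_populated(cgroup_dir):
        return False
    (cgroup_dir / "memory.min").write_text("0")
    return True


# Hypothetical usage (path is an assumption, requires suitable privileges):
#   drop_protection_if_empty(Path("/sys/fs/cgroup/workload"))
```

Keeping this decision in userspace matches the argument above: only the
admin or the management software knows whether the protected page cache is
still wanted once the workload is gone.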