From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 30 Nov 2022 15:29:11 -0800
From: Roman Gushchin <roman.gushchin@linux.dev>
To: chengkaitao <pilgrimtao@gmail.com>
Cc: tj@kernel.org, lizefan.x@bytedance.com, hannes@cmpxchg.org,
	corbet@lwn.net, mhocko@kernel.org, shakeelb@google.com,
	akpm@linux-foundation.org, songmuchun@bytedance.com,
	cgel.zte@gmail.com, ran.xiaokai@zte.com.cn, viro@zeniv.linux.org.uk,
	zhengqi.arch@bytedance.com, ebiederm@xmission.com,
	Liam.Howlett@Oracle.com, chengzhihao1@huawei.com,
	haolee.swjtu@gmail.com, yuzhao@google.com, willy@infradead.org,
	vasily.averin@linux.dev, vbabka@suse.cz, surenb@google.com,
	sfr@canb.auug.org.au, mcgrof@kernel.org, sujiaxun@uniontech.com,
	feng.tang@intel.com, cgroups@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm: memcontrol: protect the memory in cgroup from being
 oom killed
Message-ID: <Y4fnRyIp17NXpti9@P9FQF9L96D.corp.robot.car>
References: <20221130070158.44221-1-chengkaitao@didiglobal.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20221130070158.44221-1-chengkaitao@didiglobal.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>

On Wed, Nov 30, 2022 at 03:01:58PM +0800, chengkaitao wrote:
> From: chengkaitao <pilgrimtao@gmail.com>
> 
> We created a new memory interface, <memory.oom.protect>. If the OOM
> killer is invoked under a parent memory cgroup and the memory usage of a
> child cgroup is within its effective oom.protect boundary, the cgroup's
> tasks won't be OOM killed unless there are no unprotected tasks in the
> other child cgroups. Its inheritance logic is modeled on that of
> <memory.min/low>.
> 
> It has the following advantages:
> 1. We can protect more important processes when a memcg's OOM killer is
> invoked. oom.protect only takes effect within the local memcg and does
> not affect the host's OOM killer.
> 2. Historically, oom_score_adj has often been used to control a group of
> processes. That requires all processes in the cgroup to have a common
> parent process, whose oom_score_adj must be set before it forks its
> children, so it is very difficult to apply in other situations.
> oom.protect has no such restriction, so a cgroup of processes can be
> protected more easily. The cgroup can keep some memory even if the OOM
> killer has to be invoked.

This reminds me of our attempts to provide a more sophisticated cgroup-aware OOM
killer. The problem is that the decision about which process(es) to kill or
preserve is specific to each workload (and can even be time-dependent
for a given workload). So it's really hard to come up with an in-kernel
mechanism which is at the same time flexible enough to work for the majority
of users and reliable enough to serve as the last-resort OOM measure (which
is the basic goal of the kernel OOM killer).

Previously the consensus was to keep the in-kernel OOM killer dumb and reliable
and to implement complex policies in userspace (e.g. systemd-oomd).
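
To make the userspace option concrete, here is a rough sketch of how a
protection-style policy might look outside the kernel. It is only an
illustration under assumptions: pick_victim(), read_usage() and the
"protect" mapping are hypothetical names, the only kernel interface used
is the cgroup v2 memory.current file, and this is not how systemd-oomd
actually makes its decisions.

```python
from pathlib import Path
from typing import Dict, Optional


def read_usage(parent: Path) -> Dict[str, int]:
    """Read memory.current (bytes) for each child cgroup of `parent` (cgroup v2)."""
    usage = {}
    for child in parent.iterdir():
        current = child / "memory.current"
        if current.is_file():
            usage[child.name] = int(current.read_text())
    return usage


def pick_victim(usage: Dict[str, int], protect: Dict[str, int]) -> Optional[str]:
    """Pick the child cgroup to kill first (hypothetical policy).

    A child counts as protected while its usage is within its configured
    protection (in bytes, default 0). The largest unprotected child is
    chosen; if every child is protected, fall back to the largest child
    overall, so that a victim is always found when any child exists.
    """
    if not usage:
        return None
    unprotected = {name: used for name, used in usage.items()
                   if used > protect.get(name, 0)}
    candidates = unprotected or usage
    return max(candidates, key=candidates.get)
```

A daemon could call read_usage() on the parent cgroup when memory gets
tight and pass the result to pick_victim() together with a per-cgroup
protection table taken from its own configuration. How the victim is then
killed, and when the policy is triggered, is left out on purpose.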

Is there a reason why such an approach can't work in your case?

Thanks!