Date: Mon, 4 Apr 2022 11:25:35 -0700
From: Roman Gushchin
To: Michal Hocko
Cc: Yosry Ahmed, Johannes Weiner, Shakeel Butt, Andrew Morton, David Rientjes, Tejun Heo, Zefan Li, cgroups@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Jonathan Corbet, Yu Zhao, Dave Hansen, Wei Xu, Greg Thelen
Subject: Re: [PATCH resend] memcg: introduce per-memcg reclaim interface
References: <20220331084151.2600229-1-yosryahmed@google.com>

On Mon, Apr 04, 2022 at 10:44:04AM +0200, Michal Hocko wrote:
> On Fri 01-04-22 09:58:59, Roman Gushchin wrote:
> > On Fri, Apr 01, 2022 at 03:49:19PM +0200, Michal Hocko wrote:
> > > On Thu 31-03-22 10:25:23, Roman Gushchin wrote:
> > > > On Thu, Mar 31, 2022 at 08:41:51AM +0000, Yosry Ahmed wrote:
> > > [...]
> > > > > - A similar per-node interface can also be added to support proactive
> > > > >   reclaim and reclaim-based demotion in systems without memcg.
> > > >
> > > > Maybe an option to specify a timeout? That might simplify the userspace part.
> > >
> > > What do you mean by timeout here? Isn't
> > > timeout $N echo $RECLAIM > ....
> > >
> > > enough?
> >
> > It's nice and simple when it's a bash script, but when it's a complex
> > application trying to do the same, it quickly becomes less simple and
> > likely will require a dedicated thread to avoid blocking the main app
> > for too long and a mechanism to unblock it by timer/when the need arises.
> >
> > In my experience using correctly such semi-blocking interfaces (semi- because
> > it's not clearly defined how much time the syscall can take and whether it
> > makes sense to wait longer) is tricky.
>
> We have the same approach to setting other limits which need to perform
> the reclaim. Have we ever hit that as a limitation that would make
> userspace unnecessarily too complex?

The difference here is that some limits are most likely set once and never
adjusted, e.g. memory.max or memory.low. I do definitely remember some issues
around memory.high, but as I recall, we've fixed them on the kernel side.
We've even had a private memory.high.tmp interface with a value and a timeout,
which was later replaced with a memory.reclaim interface similar to what we're
discussing here.

But with memory.high we set the limit first, so if a user tries to reclaim a
lot of hot memory, it will soon put all processes in the cgroup into
sleep/direct reclaim. So it's not expected to block for too long.

In general it all comes down to the question of how hard the kernel should try
to reclaim the memory before giving up. Userspace might have different needs
in different cases. But if the interface is defined very vaguely, e.g. it
tries for an undefined amount of time and then gives up, it's hard to use it
in a predictable manner.

Thanks!
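
To illustrate the userspace side of the argument above, here is a rough sketch of what an application might have to do to get the effect of `timeout $N echo $RECLAIM > ....` without blocking indefinitely. It is not taken from the patch. The helper name reclaim_with_timeout, its return values, and the control file path passed to it are all assumptions made for illustration; the blocking write is pushed into a child process so it can be killed when the timer expires.

```python
import subprocess
import sys

# Child program: perform the single, possibly long-blocking write and exit.
_WRITER = (
    "import sys\n"
    "with open(sys.argv[1], 'w') as f:\n"
    "    f.write(sys.argv[2])\n"
)


def reclaim_with_timeout(control_file, amount, timeout_s):
    """Write `amount` to `control_file` in a child process.

    Returns "done" if the write finished in time, "timeout" if the child
    had to be killed, and "error" if the write itself failed.
    `control_file` is a hypothetical path such as a cgroup's reclaim file.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", _WRITER, control_file, amount],
            timeout=timeout_s,          # child is killed when this expires
            stderr=subprocess.DEVNULL,  # hide the child's traceback on failure
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "done" if result.returncode == 0 else "error"
```

Even this version blocks the caller for up to timeout_s, and on "timeout" the caller cannot tell how much was actually reclaimed, so a real application would still need a separate thread or an asynchronous wait on top of it.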