From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3DEDFCEBF69 for ; Fri, 27 Sep 2024 02:28:23 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 92A6C6B00A4; Thu, 26 Sep 2024 22:28:22 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8B2816B00A6; Thu, 26 Sep 2024 22:28:22 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 779E46B00A7; Thu, 26 Sep 2024 22:28:22 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 559336B00A4 for ; Thu, 26 Sep 2024 22:28:22 -0400 (EDT) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 0373E1C5E9D for ; Fri, 27 Sep 2024 02:28:21 +0000 (UTC) X-FDA: 82608934044.22.8EB78A3 Received: from mail-qv1-f42.google.com (mail-qv1-f42.google.com [209.85.219.42]) by imf28.hostedemail.com (Postfix) with ESMTP id 46F82C000B for ; Fri, 27 Sep 2024 02:28:20 +0000 (UTC) Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=DxTO0GMV; spf=pass (imf28.hostedemail.com: domain of nphamcs@gmail.com designates 209.85.219.42 as permitted sender) smtp.mailfrom=nphamcs@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1727403938; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=C9O13VW6xIAn56yUlrhZA8RDOmDsOKO5QdexkzqK17I=; b=08bEf/svqBXCg0zqRkUAhGDQHcUZiW2FDkzaE5j+wZS/FW/R5CR+k9kOksV0AnSoFrUSuF 8RZX2batBniNcogrtcl9MPdmotJ7Vwdww0jCoip59QkeGW7bG+0y1VdLcB3Skca7uih09D t9HkBdUXCcCKgTsy2vcDBv8IZy5D3O4= ARC-Authentication-Results: i=1; imf28.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=DxTO0GMV; spf=pass (imf28.hostedemail.com: domain of nphamcs@gmail.com designates 209.85.219.42 as permitted sender) smtp.mailfrom=nphamcs@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1727403938; a=rsa-sha256; cv=none; b=PowrMxh/YtopW2WUWEcZIztisB72yVcD5e/rgV+eP9SdCzEiM56MplU1RBqaaH+N7csjno BBhxxVOGOxzucrpITYU9xFNycka3uYXFLaLZhSbVQeBSpdcnB/gVx7XBDa3tmuH6sTwvg2 b+KNM1BeIJ4ciB3qxYpAErwGxMtMq8E= Received: by mail-qv1-f42.google.com with SMTP id 6a1803df08f44-6cb35f2aec1so8619076d6.2 for ; Thu, 26 Sep 2024 19:28:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1727404099; x=1728008899; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=C9O13VW6xIAn56yUlrhZA8RDOmDsOKO5QdexkzqK17I=; b=DxTO0GMVZEY2CX9I5vgDJB1aENLnvtl1rfX8o2zbZDMaApiibDALbzxkSsC1sAWeI0 eJIImH3cHJ3ZcWBOiQ0HcgJHSEdy6gnS/SaxYO9gJQHrH4RfHL0NgNg80Xit5NQRwUVC ZFl8leeJiz9aL5Opp+y6vdtekSOD/RlWcEmHZ2BgQrienegOvfEE8mATwroF44u4UXqe FUCnNl7XNw9o3IVItOGFtlwnbHGqMzruFqJW9DRgZu2ntYQkpP/hvcyV9qm/9w1ueOy9 04zQ24nEi24P2OKL1VTLbbRdk7A7Ci6U656Ovbb/8akHRP1Xl1uRNVf+GEjByLN8z36I 7/zA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727404099; x=1728008899; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=C9O13VW6xIAn56yUlrhZA8RDOmDsOKO5QdexkzqK17I=; b=vvd2FiKWrvHkV/vRBZkCCzxOPhUU335Jf7LtXg5OH0+Im3oWkDZgpAIl0CogkXtyan 4WghHhjsbHdILKva87Yp7zdyMa46Fi1Hz4pG40VKt5F+vGIcMfNAtPRlslf3Bybs+r66 pztufh4bWWt+Mr5Y8/+LTJQZCCXcxy31+EASBSyiUJX0IpSxVHO4/B0PjNEhvaG98YdY m9P2uVJzGNEirt0K0zwKq8mY3OqliidbVHdNFVJimEkhTt1ZFmJItPPddSvvGxMo2TXk 2UUMdr2J33XuOCwTFYRpewZqzIByuS7vugA4SiSiif3M4yxkMYz08cg0r1FKkIivHDtH ob2A== X-Forwarded-Encrypted: i=1; AJvYcCU0LccsJT8dGw32fW0M96YhWpMbcv9AbWAmCBTgFBVgpQy/HZkS1LLfVKspvF1rwU4D/QpxJqYbkQ==@kvack.org X-Gm-Message-State: AOJu0Yz53W3WncSV3IOt3744+ZjQ88g7kdlQO+oggvmxwIry0LriKFQe MVnCZ7SeBRTALWeYjlIAW6tjI4iz53W6YmDTCZTkYMY5HsGDJIHgh7HIsq+TklLsNB9g1jJLzmf MnjlicqcLnOX6zk0AUDHZX3tGBqObsLYC+V8= X-Google-Smtp-Source: AGHT+IGZQ4YR8rkdKsmI3FP4EUN0n7EV1ZNRkvjTMTTlL/e4iLP5LMwlmfhZW/1Gy6JYibHBAEdJQTwSXxsfxY+lr+U= X-Received: by 2002:a05:6214:570c:b0:6c7:c650:969b with SMTP id 6a1803df08f44-6cb3b63daecmr26693046d6.27.1727404099230; Thu, 26 Sep 2024 19:28:19 -0700 (PDT) MIME-Version: 1.0 References: <20240926225531.700742-1-intelfx@intelfx.name> In-Reply-To: <20240926225531.700742-1-intelfx@intelfx.name> From: Nhat Pham Date: Thu, 26 Sep 2024 19:28:08 -0700 Message-ID: Subject: Re: [PATCH] zswap: improve memory.zswap.writeback inheritance To: Ivan Shapovalov Cc: linux-kernel@vger.kernel.org, Mike Yuan , Tejun Heo , Zefan Li , Johannes Weiner , =?UTF-8?Q?Michal_Koutn=C3=BD?= , Jonathan Corbet , Yosry Ahmed , Chengming Zhou , Michal Hocko , Roman Gushchin , Shakeel Butt , Muchun Song , Andrew Morton , Chris Li , cgroups@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 46F82C000B X-Stat-Signature: sohgqfmeos4jebzuta7w71i77cntb7a4 X-Rspam-User: X-HE-Tag: 1727404100-789256 X-HE-Meta: U2FsdGVkX19Uq+hh2DVzraVERxrTcQqyOs59JOIbiaBYs9YJolWt6DZhjYuKmr2ycAPnLBNdLsKoyfPejJy6vW4PjHPosBW5iy04F+3qzS3loCwxFOMbcj6Ne1fxiG639cLEm1AWtcGkTI/+roCFpQxZYQHr8xcRHgBilnMe0R00t8Te6W4jNtnyo3NRMJ/zoNQt+WYW3kcJ3rRrkesmQCY3WQF8d3NX1pzbpUqRam3/IwRFcpIPbyyGnggbmQuEkhoMttjIyJs+5Gh4+wrmLERP/uWyBO9LEqNXtUSYxVzAy6brMyEXn3fCOXNxtOQDT1QTapqBZLdOTAu2gCidBV5cLzL7XFwQ/87q1MdsL5l/ZIkT9RkJlf31/7XCv+qHLogpNOY1WmSfKCuXMc09jexF2NYvvvAdNmoWWN4H3csu434eCkK9OVdAJu+Vpux/aV9xx7zIiIBN/cwh9r3QGjE9M2xrzmxwOKNlhk85QzdwutxjFQai8vUN4pbXk9CNAWv+AjS82TTrgTkVS0nFKDAiSTrc1m+IWTXmG0TdKwUw+tlIP5g66JR5pZrmQcJUgiLwoZCA/iDijHu/1Vuewjq3iSsrh8Zm0uTptKdhMY/heGq1YtHgy/8h2mdmP9LcmzS12scvwF/kFPQXF4m0H5WmyifHZ5qXANOdJFuStMC96LLATDhFq/90wKot2+CUHHzcylweQM4jH8TrGmMBv4WUafu8acmfWJ4UDV0Yo/ysPLFW8LaOij2y+1nSllx4C1DKZGVoR1vxB8XGFLZf4VwLeB0qsPHkvurN/eJMNQY+FZtmUf+1nZ5XCjdCIIM74rYqP1ESiKkym3qF+rs7z9Wv5owzJGgsHvXrLLlt65EL2MPhyOVVIaY5WhguUE/p/XzOcnrTox/5SuKl8t03o2DYrii9zTe++cyM2po6NH6d0lM7u6gUgmb5lLuZZyEi8EqsvpaShH+OnDp7AUX Zldy2u7j TueDFMsRp+8WLuWC0eH3Ru5IF99OH4HPaazBHSzFfVB5AMjp7Ra2H7B7BlZEQvZOC8jjiv4LWldESfHCiQX69InGtV3eEyxxc5FiXoZXFIsHbB9U38AcLQPwoA8i+ayrGJuyXcrtL0a5G+C819CH9xX/EMLDErCk6eiAdvqYx9uSc8cjCYGU/LyGdNI0yoROooksfU07mOcTkDjIUzTjwShjhxmHMfKWNvIaelpqvQL00au98nNWx8sOMaJONzu19d46z9s9LMIuNZSAGX6yLNCiI+f87Jwp/bONMaxkcXqUcl9XAxN+h4BVBJY++VccQgpZU X-Bogosity: Ham, tests=bogofilter, spamicity=0.006765, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Sep 26, 2024 at 3:55=E2=80=AFPM Ivan Shapovalov wrote: > > Improve the inheritance behavior of the `memory.zswap.writeback` cgroup > attribute introduced during the 6.11 cycle. Specifically, in 6.11 we > walk the parent cgroups until we find a _disabled_ writeback, which does > not allow the user to selectively enable zswap writeback while having it > disabled by default. > > Instead, introduce a third value for the `memory.zswap.writeback` cgroup > attribute meaning "follow the parent", and use it as the default value > for all cgroups. Additionally, introduce a `zswap.writeback_disable` > module parameter which is used as the "parent" of the root cgroup, > thus making it the global default. > > Reads from `memory.zswap.writeback` cgroup attribute will be coerced to > 0 or 1 to maintain backwards compatilibity. However, it is possible to > write -1 to that attribute to make the cgroup follow the parent again. > > Fixes: e39925734909 ("mm/memcontrol: respect zswap.writeback setting from= parent cg too") > Fixes: 501a06fe8e4c ("zswap: memcontrol: implement zswap writeback disabl= ing") > Signed-off-by: Ivan Shapovalov > --- > Documentation/admin-guide/cgroup-v2.rst | 17 +++++++++++------ > Documentation/admin-guide/mm/zswap.rst | 9 ++++++++- > include/linux/memcontrol.h | 3 ++- > include/linux/zswap.h | 6 ++++++ > mm/memcontrol.c | 24 +++++++++++++++++------- > mm/zswap.c | 9 +++++++++ > 6 files changed, 53 insertions(+), 15 deletions(-) > > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admi= n-guide/cgroup-v2.rst > index 95c18bc17083..eea580490679 100644 > --- a/Documentation/admin-guide/cgroup-v2.rst > +++ b/Documentation/admin-guide/cgroup-v2.rst > @@ -1717,10 +1717,12 @@ The following nested keys are defined. > entries fault back in or are written out to disk. > > memory.zswap.writeback > - A read-write single value file. The default value is "1". > - Note that this setting is hierarchical, i.e. the writeback would = be > - implicitly disabled for child cgroups if the upper hierarchy > - does so. > + A read-write single value file. The default is to follow the pare= nt > + cgroup configuration, and the root cgroup follows the global > + ``zswap.writeback_enabled`` module parameter (which is 1 by defau= lt). > + Thus, this setting is hierarchical, i.e. the writeback setting fo= r > + a child cgroup can be implicitly controlled at runtime by changin= g > + any parent value or the global module parameter. > > When this is set to 0, all swapping attempts to swapping devices > are disabled. This included both zswap writebacks, and swapping d= ue > @@ -1729,8 +1731,11 @@ The following nested keys are defined. > reclaim inefficiency after disabling writeback (because the same > pages might be rejected again and again). > > - Note that this is subtly different from setting memory.swap.max t= o > - 0, as it still allows for pages to be written to the zswap pool. > + Note that this is different from setting memory.swap.max to 0, > + as it still allows for pages to be written to the zswap pool. > + > + This can also be set to -1, which would make the cgroup (and its > + future children) follow the parent/global value again. > API-design-wise, this seems a bit confusing... Using the value -1 to indicate the cgroup should follow ancestor is not quite semantically meaningful. What does "follow the parent" here mean? Reading your code, it seems to be "find the first non negative from this memcg to root and just use it", but it's neither intuitive, nor deducable from the documentation. There are also ill-defined/poorly-defined behavior. What if we set -1 to all cgroups? Should writeback be enabled or disabled? The implementation also contradicts the "writeback setting for a child cgroup can be implicitly controlled at runtime by chaging any parent value or the global module parameter" part. What happens if we have the following sequence of zswap.writeback values, from root to current memcg: root.memcg.zswap.writeback =3D 0, 1, -1, -1, -1 =3D cur_memcg.zswap.writeba= ck Should we disable or enable cur_memcg's zswap.writeback? It's disabled at root, but the first non-negative ancestral value is 1, so this cgroup will follow it (even though that ancestor itself has zswap writeback disabled!!!) I think we should separate the values itself from the decision to check ancestors. The former should be indicated per-memcg, and the latter decision should only come into play when the memcg itself does prevent zswap.writeback. It might be less confusing to make the latter decision globally - perhaps through a mount option, analogous to memory_recursiveprot?