From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id F1ED5C531DC for ; Tue, 20 Aug 2024 15:27:12 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7C1BF6B0083; Tue, 20 Aug 2024 11:27:12 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 74B0E6B0085; Tue, 20 Aug 2024 11:27:12 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5EBED6B0088; Tue, 20 Aug 2024 11:27:12 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 3C1916B0083 for ; Tue, 20 Aug 2024 11:27:12 -0400 (EDT) Received: from smtpin26.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id EE9381C2420 for ; Tue, 20 Aug 2024 15:27:11 +0000 (UTC) X-FDA: 82473002262.26.4BFAE81 Received: from mail-oo1-f51.google.com (mail-oo1-f51.google.com [209.85.161.51]) by imf18.hostedemail.com (Postfix) with ESMTP id 3566A1C000F for ; Tue, 20 Aug 2024 15:27:09 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=SfpuxZSU; spf=pass (imf18.hostedemail.com: domain of nphamcs@gmail.com designates 209.85.161.51 as permitted sender) smtp.mailfrom=nphamcs@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1724167524; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=+x1BJW8t+s9H7pLyi6ggUxbHvpTGKY4XdDmJLQ/HTPg=; b=V77saZEA2Dmn1lo+rh+K8kUL2+5Nh2dww9Qnsfxk7qCqy4gFIpgJlYhNDv2IRwz1o39p7l KFlm1DBm+IxOC9dAnzHL9DRr5+Mn/rVfIcU83dNk2BxV33HuxMz4pgnorclQhSp0TEOFwl W/tajSMkX6wryt+n9l2Nk/h1+Y1GB64= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=SfpuxZSU; spf=pass (imf18.hostedemail.com: domain of nphamcs@gmail.com designates 209.85.161.51 as permitted sender) smtp.mailfrom=nphamcs@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1724167524; a=rsa-sha256; cv=none; b=KeVtSFaESTK2st7sho1hQSo8s2Sm2v1TXh14MmKj53dU5nh3qZkPq3auDNZveakKsx45vT JHYQq5bcncVzCsR/902rZ+XYBMQTJlcIZoQhpdMggUrNloO0MzgR8/3ke74UmJOsu/0ohK 28EDb01/sWNOyTi77ikSBKBTEqpxx5I= Received: by mail-oo1-f51.google.com with SMTP id 006d021491bc7-5d5e97b8a22so3360342eaf.2 for ; Tue, 20 Aug 2024 08:27:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1724167628; x=1724772428; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=+x1BJW8t+s9H7pLyi6ggUxbHvpTGKY4XdDmJLQ/HTPg=; b=SfpuxZSUFYy9ySPvD++yw1r0rRtCTxbLY7XmB5mbJGXDxwgznxK7Edp8R7ZwXGdeFK epiwNkqrMcOihKwR6C+AuLuP+df1bxiv82fvQAUX89Okryy4QQfwTCHwgak9t5pBDy2I XLVNp5DE6GqVVPe1X2s0P8EdoRNsEHnMcZd5ApQrAUDQ53q31QK6WPwLTAfnXNuhSfZC PTH4UkU80f61ZbYsmQ7u15OHWCNogXHClV8lr2d4Cj1eonOq0iUUO+/puPwFRAM8fllM MdWBMiICRu09ZX+Md0p/gy30ZPCvumRdBFbPvnlubsyC1gwnQXswIaktrmFkIDU8yLhc QIlA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724167628; x=1724772428; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+x1BJW8t+s9H7pLyi6ggUxbHvpTGKY4XdDmJLQ/HTPg=; b=v9gbaW7eqPHBJI6ygcWN9/QFZuVrqlSOYuF6IMJfTlwj4HXGUoxI2hvOyqEJfXMS1u I3C9u7kKiJT9nKlFZH6Wu6aY23iMtwo9gSkfAFbxNskCIYUcZhQbvRLlwnIfrNZG1YZO +zYdzNMYZ/dppzj/7EWk3qOPELl76ntXQy++hh5qP4yTVw2bYczlxdZRCDDOXY+cl5UD wXajV8KVpgze1U+U0KjxHnFm+CpHp1iOXqDUbDtJpI3gXPZv7qnZJ2LF6Egg7Mo0x09T QLVdbCbj4XcfZOCHhHvEJNhP93AMEYjE+qdr3iNOa89q5RE2nP3cPfXF+r7edsvFRAec kmNA== X-Forwarded-Encrypted: i=1; AJvYcCWUCT0Vqsq0N81qhv2xX1+OR64q5ff2Q/5/hQmty7+JnPAVkbW+Hv3FMPmhRjCAoXQdBu2L4+EbvFWfIgZeL6n6cts= X-Gm-Message-State: AOJu0Yzda/JnfKPH7vsgF3lBgUj104RYGg8vzxqN9fcJNX0DT6ouJawC j5NPlkViJpGx7iwZwcJE4+o0QKneygOub4csthaL0uRAM6abpDEQhIWvuD9ZM+fY2vrTusvMTcO vJhdpz38IIEu3PD1MsjWzgJbmG0Q= X-Google-Smtp-Source: AGHT+IFksjA3k9jsKzRmzSa22c9HN/Mi5jzk0E6JpH+6wVtL0SHwf/SVqbuloWKkvFiLGvAZzft3WOUDryAw+ze2w0c= X-Received: by 2002:a05:6358:7f13:b0:1ac:f058:f714 with SMTP id e5c5f4694b2df-1b393291125mr2086011455d.21.1724167627860; Tue, 20 Aug 2024 08:27:07 -0700 (PDT) MIME-Version: 1.0 References: <20240816144344.18135-1-me@yhndnzj.com> <45e2c372f59748262b6e4390dc5548f8ebf6c41a.camel@yhndnzj.com> In-Reply-To: <45e2c372f59748262b6e4390dc5548f8ebf6c41a.camel@yhndnzj.com> From: Nhat Pham Date: Tue, 20 Aug 2024 11:26:56 -0400 Message-ID: Subject: Re: [PATCH v2 1/2] mm/memcontrol: respect zswap.writeback setting from parent cg too To: Mike Yuan Cc: Yosry Ahmed , linux-kernel@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, Johannes Weiner , Andrew Morton , Muchun Song , Shakeel Butt , Roman Gushchin , Michal Hocko Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 3566A1C000F X-Stat-Signature: mux3anzfwj6b9p75g16xfgg7ycqotqwz X-Rspam-User: X-HE-Tag: 1724167629-958212 X-HE-Meta: U2FsdGVkX19WdMmLVm6lqcUqy0Nab1feMEspsul6tOy3m0jDQ5nySHDYJDLfEucaTieaU5Du1xfIecC/3RjWwMArXzWXLde9rLzzmEMnshXqLX8iCtHw/nqnA/7ezWmnY33Xj6oabuaFBflNojo+0vH7LmKx6Zb8Xc1NLJgD+uzLcGYI/pg2P416/88lHBsdMgJqeTATHXJeOWIA0o37ojFF+zmWL7UN2s97waH6DuVdLd8C4SlJSvZHm1yYTkqGhSNhnMbzV+waqomr9pbADxFaeDYxRxt7B1Df2xeCEYsWDXrc6rXmA+o3u3YZNoCP9yq2xStQsb2xjro71GTwuxijwyEWHryWNjpKPa6sXBZW0OAXqwh5ga9qCDEbzPzX9AjKMepthLvj3mbZdse7WzFqVqVbTTj7gdodLc37HZ3RCylskGBqRKATOkOka78pGlsNcokxSh1+lFLdDQ96f7dFGQrTR3gL/sAaDBe2cESKlaw362otBFJlioGD9WmGR2i9QIK5OpL6nhXai+4C9Ck4XDeAX8s7V16tIedBH8Bxow7XysJzLpqGKTv/KlpWGuvFSXZk0jZtz9+zDsbXgr3siFMwBjO4CWotzVPgrej7w8ql4KzhR5wXZyHu9xJoDd/pzihrmcGXGhNKewbOV9DxKbgjY9YqwvjmkXcypfWWNDeez4dbGQie7t6YJCcHEKabHUlf2vomnSrumtu6aOHmBeiTKsm64/QCno8YdphgJ2B6gBV+GH6GyQzvHdszwxu8MQTcQjqBxTGejB0qZNoyxU7qFgx45Brf3XS9V4yyQQ8lCfj7xd7FTQijqwb9wkXFbuXY9v32ncNU4IpDV4IBgTzIrCMG0Zarmur7ZHGdFuaLRMX+o2xYKK1hGehHy6vFHIFsueXnZ2MCUAT7FSX00KAzkO2mS//Tj2+zd6PIEdK3PlW50hGjbPWVK/7zeX5E11/AQH5bjTvBzBW Dq8npmdi o7MFIA7GtcHBaU8M+wFLY1DHbQys08kOEbICtkUa/OkHJrg87MuXUR6MF0LGB7cQF2m2knOy4a3hm2vAQtRDNt9359vECGuiGs1jBCluwGZH59rrYM+HO02jM4od2HhX8EhPdTKiFhjVqGNRIWHxqN/fBc8G9p+ZPWlaiWhzTEKReKtz6SKmCdQ9NnVxkmSulyd9DBOEjdo4RxR6qhCiaIZ3Vm0f7ekq1neqNwuGtZl8NYv7NXYWqMKEgL49D7pRNGrKoGs+GnHhLQHDq9/pnfijanncgzTiRV64clPiMSjUVgB7+kxmgnvkzYK7UnFkXuX0Ma5EwDGA5+uUEDA4ec3ScjBtveBj1FapLzNWyPYgrntvKSwN0w1KB/FfGZ+oespD6rFFWupaomBPnnHJQs0C/wo3s1Oaxuz9IiJcAbNBGX+th4fN/pu206/av9cqsyGL7lRdm74RHKdOVmp4pDttUcOO72oxxQ7qY5X8gvvMQsnqLvwgRb2k0RWSk8JbQK0PJrswpRzCwG7hmNng7O3joPtHEGqtRRufqqiu2gbF2soSH9EeL9x3/WWs57bxcCyTsxqfEvqbaLgGdN1qy9X0jXtJR+NLsLEHYJPdExcK/roLJDK0Bc7c+ig== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, Aug 20, 2024 at 5:38=E2=80=AFAM Mike Yuan wrote: > > On 2024-08-19 at 12:09 -0700, Yosry Ahmed wrote: > > On Fri, Aug 16, 2024 at 7:44=E2=80=AFAM Mike Yuan wrot= e: > > > > > > Currently, the behavior of zswap.writeback wrt. > > > the cgroup hierarchy seems a bit odd. Unlike zswap.max, > > > it doesn't honor the value from parent cgroups. This > > > surfaced when people tried to globally disable zswap writeback, > > > i.e. reserve physical swap space only for hibernation [1] - > > > disabling zswap.writeback only for the root cgroup results > > > in subcgroups with zswap.writeback=3D1 still performing writeback. > > > > > > The inconsistency became more noticeable after I introduced > > > the MemoryZSwapWriteback=3D systemd unit setting [2] for > > > controlling the knob. The patch assumed that the kernel would > > > enforce the value of parent cgroups. It could probably be > > > workarounded from systemd's side, by going up the slice unit > > > tree and inheriting the value. Yet I think it's more sensible > > > to make it behave consistently with zswap.max and friends. > > > > > > [1] > > > https://wiki.archlinux.org/title/Power_management/Suspend_and_hiberna= te#Disable_zswap_writeback_to_use_the_swap_space_only_for_hibernation > > > [2] https://github.com/systemd/systemd/pull/31734 > > > > > > Changes in v2: > > > - Actually base on latest tree (is_zswap_enabled() -> > > > zswap_is_enabled()) > > > - Updated Documentation/admin-guide/cgroup-v2.rst to reflect the > > > change > > > > > > Link to v1: > > > https://lore.kernel.org/linux-kernel/20240814171800.23558-1-me@yhndnz= j.com/ > > > > > > Cc: Nhat Pham > > > Cc: Yosry Ahmed > > > Cc: Johannes Weiner > > > Cc: Andrew Morton > > > > > > Signed-off-by: Mike Yuan > > > Reviewed-by: Nhat Pham > > > > LGTM, > > Acked-by: Yosry Ahmed > > > > > --- > > > Documentation/admin-guide/cgroup-v2.rst | 5 ++++- > > > mm/memcontrol.c | 9 ++++++++- > > > 2 files changed, 12 insertions(+), 2 deletions(-) > > > > > > diff --git a/Documentation/admin-guide/cgroup-v2.rst > > > b/Documentation/admin-guide/cgroup-v2.rst > > > index 86311c2907cd..80906cea4264 100644 > > > --- a/Documentation/admin-guide/cgroup-v2.rst > > > +++ b/Documentation/admin-guide/cgroup-v2.rst > > > @@ -1719,7 +1719,10 @@ The following nested keys are defined. > > > memory.zswap.writeback > > > A read-write single value file. The default value is "1". > > > The > > > initial value of the root cgroup is 1, and when a new > > > cgroup is > > > - created, it inherits the current value of its parent. > > > + created, it inherits the current value of its parent. Note > > > that > > > + this setting is hierarchical, i.e. the writeback would be > > > + implicitly disabled for child cgroups if the upper > > > hierarchy > > > + does so. > > > > > > When this is set to 0, all swapping attempts to swapping > > > devices > > > are disabled. This included both zswap writebacks, and > > > swapping due > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > > > index f29157288b7d..327b2b030639 100644 > > > --- a/mm/memcontrol.c > > > +++ b/mm/memcontrol.c > > > @@ -5320,7 +5320,14 @@ void obj_cgroup_uncharge_zswap(struct > > > obj_cgroup *objcg, size_t size) > > > bool mem_cgroup_zswap_writeback_enabled(struct mem_cgroup *memcg) > > > { > > > /* if zswap is disabled, do not block pages going to the > > > swapping device */ > > > - return !zswap_is_enabled() || !memcg || READ_ONCE(memcg- > > > >zswap_writeback); > > > + if (!zswap_is_enabled()) > > > + return true; > > > > This is orthogonal to this patch, but I just realized that we > > completely ignore memory.zswap_writeback if zswap is disabled. This > > means that if a cgroup has disabled writeback, then zswap is globally > > disabled for some reason, we stop respecting the cgroup knob. I guess > > the rationale could be that we want to help get pages out of zswap as > > much as possible to honor zswap's disablement? Nhat, did I get that > > right? > > Hmm, I think the current behavior makes more sense. If zswap is > completely > disabled, it seems intuitive that zswap-related knobs lose their > effect. Mike is right here. It's less of a behavioral decision, but more of a this-can-confuse-users kind of thing :) At least that's my rationale when I wrote this. If users want to disable swap still, they can still do that with memory.swap.max =3D 0, which is probably better because it would fail earlier at the swap slot allocation step. > > > I feel like it's a little bit odd to be honest, but I don't have a > > strong opinion on it. Maybe we should document this behavior better. > > But clarify this in the documentation certainly sounds good :) But yes, better documentation =3D=3D happy Nhat :)