From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4B480C52D7C for ; Mon, 19 Aug 2024 19:09:50 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A1C5F6B007B; Mon, 19 Aug 2024 15:09:49 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9CC546B0082; Mon, 19 Aug 2024 15:09:49 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 894736B0083; Mon, 19 Aug 2024 15:09:49 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 6E65C6B007B for ; Mon, 19 Aug 2024 15:09:49 -0400 (EDT) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 2049CC10F9 for ; Mon, 19 Aug 2024 19:09:49 +0000 (UTC) X-FDA: 82469934498.22.BA25695 Received: from mail-ed1-f42.google.com (mail-ed1-f42.google.com [209.85.208.42]) by imf18.hostedemail.com (Postfix) with ESMTP id 460761C0012 for ; Mon, 19 Aug 2024 19:09:47 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=fFebbH3y; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf18.hostedemail.com: domain of yosryahmed@google.com designates 209.85.208.42 as permitted sender) smtp.mailfrom=yosryahmed@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1724094572; a=rsa-sha256; cv=none; b=B44r/AngLJYX7YZ226QRAvu36v2EVoPWcZ19R/4lzTpvMxWMfXuRqE7TbvaZflQJJa+YGi 7A6RYyT5/ZaKQGMq18PXfuOUhGCINxSb62enLfsVjJFEQl3seN5rlxQZ0vQ4ep3Ec+QTlX lUCWBV+cNveNRoAPVACWd2y2hApq5OA= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=fFebbH3y; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf18.hostedemail.com: domain of yosryahmed@google.com designates 209.85.208.42 as permitted sender) smtp.mailfrom=yosryahmed@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1724094572; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=B8keNgyq5Y1CiFPJpkWT/ZTIazYWWYL5qY5ay8DSsuU=; b=vp/3/jNpVoAn5UrN4eymO0LwxecRWXuKpg+aLTWRA1o1bQD0ghgwfX/YRX4bJUDU3JWjxd 2mBKQSMZolG0vn/FGYfdstuw9gw5Cq24Ob1T05dXPKQzK/vqgIM1oqvBF8V2orEzo1nMpQ 4bsXkFYFQIZu58RKsABGCn8KnwNQNvs= Received: by mail-ed1-f42.google.com with SMTP id 4fb4d7f45d1cf-5bed05c0a2fso3797937a12.3 for ; Mon, 19 Aug 2024 12:09:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1724094586; x=1724699386; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=B8keNgyq5Y1CiFPJpkWT/ZTIazYWWYL5qY5ay8DSsuU=; b=fFebbH3yGD1nO1g4MbK8ZKoRZp9whpjtMmdLJ/Bu77J6ovulxWlT4VtEGbQ4ngAsrg xQvlbtFefkAewUr42N+Hxq364TL7dALrCmr75ozEmGdfcH57vtE3DAQ2WaRZVSLIv8sk e7FA+gzYGGwpQAOZ29CaaMUOBdYJoWF6MBngcC8DycOmi4pH1lq0BU2x2hkUQfBUjldN mAVMEziNGBnU9yDjQjZNt9TshqubYE2d7/saTKQBaBRkJawVHvDJIDgf3ZEgADYFcndc piuPuzq+WOwezvn3yciCEDl3/4cB9SKeToo/WxL0//GcyuGCbPhqLAu7Mae4d+ZkQEwJ 744A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724094586; x=1724699386; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=B8keNgyq5Y1CiFPJpkWT/ZTIazYWWYL5qY5ay8DSsuU=; b=Nxb6lWhZCW2fQs+PbT6WclQcf89nh41qfRElmrFAv/MstOXQy41csABDpovGjnULxW hKTMxCxoujhkbB91YDYaeGzOTePCsuYaV2S4at5TnJ70M2P7zf0j9MT3OPlsWtn/fgqt hygOVtKrISk7yR4Dlb4v5KaZhmQOkmd5eSB8Nrv4+qm5MFnI1Daj78rlzqtzLHXiwTX3 ey5i8hobJaG6OcxmxUt/AoHp4Q45y+uw5ueqjSGn5hMILBcsT58iIHsRUdHnwt2HmGP3 3Iwdka+lf1EewvdQCnWrwQhbQ8qI5fOkZCbF3qilyU+DIUaVzyL9voxZr+2w4sTgubjJ tsHg== X-Forwarded-Encrypted: i=1; AJvYcCWs23TRvDuDUc/8i1DqBj/0XMOGYpo/E0pWXa81skLsWwIzUymJGQ3cvKiXCnttatwQFgN1belUlkuDCNnvE5rHmJI= X-Gm-Message-State: AOJu0YwkmxlElfFvQUdoVOaU9Is/j2rQMxOL8jfd2Nmn2nv4kH+pJXhC b0n/vLlpZSjiWKNPpK78o2StOQprRAAbGgN95lhSMKJJ7UgDx2tc/uqvJi4y0/AaTmIU/iZYSIS Zp+nCIZiwC92jzMcAPeNFNW9fgRanVx6KGLYS X-Google-Smtp-Source: AGHT+IGuKI69R41BZuJTwuz2InUwf8+whd8Hm0hr6/vouCAUsHezmuQBN4on3koeWHj1yT93eqQMGmKyg9R2P+SWPgw= X-Received: by 2002:a17:907:97c3:b0:a7a:bae8:f2a1 with SMTP id a640c23a62f3a-a8392a03c21mr988519966b.42.1724094585132; Mon, 19 Aug 2024 12:09:45 -0700 (PDT) MIME-Version: 1.0 References: <20240816144344.18135-1-me@yhndnzj.com> In-Reply-To: <20240816144344.18135-1-me@yhndnzj.com> From: Yosry Ahmed Date: Mon, 19 Aug 2024 12:09:09 -0700 Message-ID: Subject: Re: [PATCH v2 1/2] mm/memcontrol: respect zswap.writeback setting from parent cg too To: Nhat Pham , Mike Yuan Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, Johannes Weiner , Andrew Morton , Muchun Song , Shakeel Butt , Roman Gushchin , Michal Hocko Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Queue-Id: 460761C0012 X-Rspamd-Server: rspam01 X-Stat-Signature: b95driwb5wk7sww75raybbq77thr9f3j X-HE-Tag: 1724094587-6417 X-HE-Meta: U2FsdGVkX181gkm3oFZwck7KfY7kLFwn2I2yEH3BeUShglfh0DZybNf47L7fFgObowKkmtPpp9P2vu0g1UY5ltjdhSoGw3aBWvvInlLgvs58vKMzDr0SQr9idOvpkqMEPxfmM+QIqCcYOJJySbaQSRS93rGT9t1sOtzLJ5eVhXqDWMIr1DXI2XVoHrD9Kxi2Oqyg/RnuGTKF9MfxMmxuBeMrkTw7jkd/GutuNgVqGy/MigPlnyQypBqMJW4BQg8d4t8mvMjCJZsjDhXa2LKZJiegTHWfpyQCEEc3Hx9LAPx+NVHIl8mlIKpEPCscFeTsyjXsGDaBAyOqz0w3uCDtQKAvhyZ4EoBqa990p0v4usxop1b5DqxZ5j6096eId6B9o1ThL5VA/Ij1qvDHNpFtITYZlYcCu7N/I2GU6xuT7omKi5ng2mpeFi4MVZMLHYsqViNMYDidYuOb1e5phsiY9A6aYKxszdxKT1nbWO+0yh67vLcYCZjSEV+eas+reCxI473cpu+1uhLJze92r26y299O4Y5lVbkYZhRZ23em8dLBuYTbZLJswBfAYvJpQ7+7zSQc87dMSIIYJuBFWLmN9YUQxv9CKdXTHXYeM6qXOCdXBLT97zaMJsec141fxwqzBa8sPQBxluT0ljsT9EmhazsVkmrSxgqa2pV1CmEFEITJbPRQcMU1pO1JuOlVEjS7m3UKUJyJuxduugQOxGTJa0azYx94gse3dV2p5kqc45v+7iTc/FHwn3oBMD6RJyRCvdKF5vOZzMo17azSlXpJVbnXav7+CPXEaIX+VbGuJOa02IHF/wB9mz+Up0dYPL99kXiqPMc8ZMF+dmC8G38FVh9fXenHjr9ejr+2EXzqlkbUezvrEc0BxpbGMIFQJ0TBqUGd639RNzFwaUNtzch2W3sBvftHzbNsnDJWWT2cnN05JfZAWCdW3c56L7cDLXtZfr3q08liBAPC3E8RbMu VV83uIsi JjucpnzFZI9euAQZIDXLogu9F3fY/oA8RH9xZvTfdI5kgpRkhkdw7qC5pUJ9OjEcsacsN2Uh8HN3xku0/ckBqFcyodCgCDVLwppCXsw+Lii3U90DXv6GkK/hhXu6Xv85rY2K/eBl3HdxLKWTm4+mGkZCcCU0JD+U7UjV68Uaz2cJHaH96G38CVrflYyD2ulgXWWvFXxTDGgfh4CGZxEiJ5mElQn1Ha9JjLcA5vPjQpXAWdY4kdAIz4uNaV8mekqmH59L8YZGaP1zX73XVEcvVIaun9CICr8HpdCKUYB0tKHa4NXM+7j2XZQIYbRzQZIAUo/gNFEMhBAw6Y4P+Dlcq+GHeWVOtvhykOTm2hbgtbI3Iq8FOGo8BBEXotoxTmqFzXkZUnOOGgj9blOH9AyyXzB1lwCdCF84bX9H9r6gPplIfJD3S4R/2ZBjTjk+7THDjUAT8nTipzQITWTB57CqMhUJLhvTdUtK0wv16aaKtC0H3yxRMHRYlkSLkCMCzIhgzVb3Hco9i8kay68wsBVIlxbNFgYkEvp1RxPJ4 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Aug 16, 2024 at 7:44=E2=80=AFAM Mike Yuan wrote: > > Currently, the behavior of zswap.writeback wrt. > the cgroup hierarchy seems a bit odd. Unlike zswap.max, > it doesn't honor the value from parent cgroups. This > surfaced when people tried to globally disable zswap writeback, > i.e. reserve physical swap space only for hibernation [1] - > disabling zswap.writeback only for the root cgroup results > in subcgroups with zswap.writeback=3D1 still performing writeback. > > The inconsistency became more noticeable after I introduced > the MemoryZSwapWriteback=3D systemd unit setting [2] for > controlling the knob. The patch assumed that the kernel would > enforce the value of parent cgroups. It could probably be > workarounded from systemd's side, by going up the slice unit > tree and inheriting the value. Yet I think it's more sensible > to make it behave consistently with zswap.max and friends. > > [1] https://wiki.archlinux.org/title/Power_management/Suspend_and_hiberna= te#Disable_zswap_writeback_to_use_the_swap_space_only_for_hibernation > [2] https://github.com/systemd/systemd/pull/31734 > > Changes in v2: > - Actually base on latest tree (is_zswap_enabled() -> zswap_is_enabled()) > - Updated Documentation/admin-guide/cgroup-v2.rst to reflect the change > > Link to v1: https://lore.kernel.org/linux-kernel/20240814171800.23558-1-m= e@yhndnzj.com/ > > Cc: Nhat Pham > Cc: Yosry Ahmed > Cc: Johannes Weiner > Cc: Andrew Morton > > Signed-off-by: Mike Yuan > Reviewed-by: Nhat Pham LGTM, Acked-by: Yosry Ahmed > --- > Documentation/admin-guide/cgroup-v2.rst | 5 ++++- > mm/memcontrol.c | 9 ++++++++- > 2 files changed, 12 insertions(+), 2 deletions(-) > > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admi= n-guide/cgroup-v2.rst > index 86311c2907cd..80906cea4264 100644 > --- a/Documentation/admin-guide/cgroup-v2.rst > +++ b/Documentation/admin-guide/cgroup-v2.rst > @@ -1719,7 +1719,10 @@ The following nested keys are defined. > memory.zswap.writeback > A read-write single value file. The default value is "1". The > initial value of the root cgroup is 1, and when a new cgroup is > - created, it inherits the current value of its parent. > + created, it inherits the current value of its parent. Note that > + this setting is hierarchical, i.e. the writeback would be > + implicitly disabled for child cgroups if the upper hierarchy > + does so. > > When this is set to 0, all swapping attempts to swapping devices > are disabled. This included both zswap writebacks, and swapping d= ue > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index f29157288b7d..327b2b030639 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -5320,7 +5320,14 @@ void obj_cgroup_uncharge_zswap(struct obj_cgroup *= objcg, size_t size) > bool mem_cgroup_zswap_writeback_enabled(struct mem_cgroup *memcg) > { > /* if zswap is disabled, do not block pages going to the swapping= device */ > - return !zswap_is_enabled() || !memcg || READ_ONCE(memcg->zswap_wr= iteback); > + if (!zswap_is_enabled()) > + return true; This is orthogonal to this patch, but I just realized that we completely ignore memory.zswap_writeback if zswap is disabled. This means that if a cgroup has disabled writeback, then zswap is globally disabled for some reason, we stop respecting the cgroup knob. I guess the rationale could be that we want to help get pages out of zswap as much as possible to honor zswap's disablement? Nhat, did I get that right? I feel like it's a little bit odd to be honest, but I don't have a strong opinion on it. Maybe we should document this behavior better. > + > + for (; memcg; memcg =3D parent_mem_cgroup(memcg)) > + if (!READ_ONCE(memcg->zswap_writeback)) > + return false; > + > + return true; > } > > static u64 zswap_current_read(struct cgroup_subsys_state *css, > > base-commit: d07b43284ab356daf7ec5ae1858a16c1c7b6adab > -- > 2.46.0 > >