From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6E017C3DA7F for ; Thu, 15 Aug 2024 19:12:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DDBAE8D0009; Thu, 15 Aug 2024 15:12:40 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D8C038D0005; Thu, 15 Aug 2024 15:12:40 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C52A48D0009; Thu, 15 Aug 2024 15:12:40 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id A7EEC8D0005 for ; Thu, 15 Aug 2024 15:12:40 -0400 (EDT) Received: from smtpin08.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 68B9116155A for ; Thu, 15 Aug 2024 19:12:40 +0000 (UTC) X-FDA: 82455426480.08.97DF20C Received: from mail-qk1-f177.google.com (mail-qk1-f177.google.com [209.85.222.177]) by imf18.hostedemail.com (Postfix) with ESMTP id 98C2C1C001B for ; Thu, 15 Aug 2024 19:12:38 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b="Sa/DD6PJ"; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf18.hostedemail.com: domain of nphamcs@gmail.com designates 209.85.222.177 as permitted sender) smtp.mailfrom=nphamcs@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1723749104; a=rsa-sha256; cv=none; b=7KjGZ1VM7O8Xw+Vzsqi5EaGXUiPLKO48oroHoJIFUqopxx7hEyZIPRSpYs7BI5LSC6kdq+ tPFbmH2F4W1PV8cipLBtDjB8uzeJY1Rv3iXSsR04LIeQVUJ9g/rXs/6vAjxWOncYTiSKhm R/+R/u2/4WpQfgNyfTpf6fjoHSFSN98= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b="Sa/DD6PJ"; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf18.hostedemail.com: domain of nphamcs@gmail.com designates 209.85.222.177 as permitted sender) smtp.mailfrom=nphamcs@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1723749104; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=mFfzZ3pFgqDb8mIZqmgPfNFtwYForYW8jYw7gaAy3c4=; b=YX8XJU5wdtyjovG5yxP4I8qCnVuuYUgCOtCzhKm8hYjb2eRZPfRCCLxmjUu0/5Z259m/WV E7joL0erBJgQLDPFU2h/XPsclaewMOUjjCDY6DXwAWv2HxCxyMvxCuBnEWclaXV80js1Yq 0ueHK4/RJHfQ03QwPpUxQTQ8DUuTXKA= Received: by mail-qk1-f177.google.com with SMTP id af79cd13be357-7a1e0ff6871so84070185a.2 for ; Thu, 15 Aug 2024 12:12:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1723749157; x=1724353957; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=mFfzZ3pFgqDb8mIZqmgPfNFtwYForYW8jYw7gaAy3c4=; b=Sa/DD6PJq3CLK4qbWIDU/evr/DLpV4ZZkJ6H/Sbf7bYUaOIRPvwaJJqEQMnNYUxG8R gzIVKPyo3ocHpKjzcdlMEX1mcPAV3UgWA0gBRsVxvhx0TJrr0JxernPQhRGew34idrbc 2/4aqkVzoUuCa3VE/vN2f0Ig+D5eiMXrmQR9z5lw18yPSgqQa2RexAhR3anCTqDx6HOU ck0lgjADU+cTgaAOf7kbEZRsCCGJ/8aQtoT/mFJT31GRmeHqSRWHimW5XFvyEAa3QFuK 1Lxc+QKx9imIxeD5ITrGNyqdWvqyBIUes0eLPJBq8PPKbw0M2VJrOXq/LXIPQ+H4SFOj KhCQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723749157; x=1724353957; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=mFfzZ3pFgqDb8mIZqmgPfNFtwYForYW8jYw7gaAy3c4=; b=S80iVVy0CP3yRVeR1ihKqaOvY8mKqu2hwIgpfeGkdS7m1Jpp+3piX9iOpNHpY6g9Cj DWj5iMlb6M8D9q96S2UXroUcSaVsG0PjgabVfN/oWpIl+PDg9Q9mzNaUpEzA010/oNCZ QJGamOjwsPXPuqtiqPhevt6SR9Dw69P0MeDIGOvhEpRQc0wjU4KoWvvNkLTeWlNvWQ+J NbLKJhlxGY+QVz3ZAztDWNlFjIBRaYfyJGzPTqEZxRq7Tecx+TUUKAu5Iaoc9I5ArQ3n rzAf8K3kOLSZgEzyq4pb+7je0DW2O3I77k62cw86WaVnwzYJYQryrgGs1SzqbSRNMils iEPg== X-Forwarded-Encrypted: i=1; AJvYcCWuwD9N+scTrngtg43pR/ZhHHo/R6W14sYFKFuiA/qfUfbTihJCmsxV9gS4x7cf6MCfUI5NJD4D3zcZuVIu7+m3mgA= X-Gm-Message-State: AOJu0YwMmYjvi6YCwd/H+VK4zfnR37USF0J9BXBwfNyK9dlSEFzmdZDl Qi7kkWStuElWrVvFZvJjpd4r16JQ/wxdsDc92xt1XN2jvzVS3/B+qCoQvssiLyrFN1Y4THZW7NK OfUo9k421qA6l3n1CzKTKbkKeWWs= X-Google-Smtp-Source: AGHT+IEn/1oERfpX0KiSjifFU04Jgclm2hv8aboG2vr2BmvzqniXqnbcBMH4kvGFVH3Y1dIl6cPxXpob+kp2eSbtPzg= X-Received: by 2002:a05:6214:5542:b0:6b9:5fb9:113f with SMTP id 6a1803df08f44-6bf7ce04b10mr7208596d6.27.1723749157464; Thu, 15 Aug 2024 12:12:37 -0700 (PDT) MIME-Version: 1.0 References: <20240814171800.23558-1-me@yhndnzj.com> In-Reply-To: From: Nhat Pham Date: Thu, 15 Aug 2024 12:12:26 -0700 Message-ID: Subject: Re: [PATCH] mm/memcontrol: respect zswap.writeback setting from parent cg too To: Mike Yuan Cc: Yosry Ahmed , linux-kernel@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, Andrew Morton , Muchun Song , Shakeel Butt , Roman Gushchin , Michal Hocko , Johannes Weiner Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 98C2C1C001B X-Rspam-User: X-Rspamd-Server: rspam05 X-Stat-Signature: do33uagxgmt8pgqsrghrd9qbi3k5han8 X-HE-Tag: 1723749158-667819 X-HE-Meta: U2FsdGVkX18pivN48+0t7TKl8AeEZfJ1J2XyCkbII5aR7YEo4LZygUu2xWyjceYV8CvI1repi0I+xxvRCTJU2nu2omWRTzh9eepElJlwRw3N39e7A3cw1Im2aZ4rf8OE1O3lRL/o6LnXvn1OFNVivR0P7ipUjiDS3R/CX3pGwubR8BBoK+QH2cG7eh15oBCW1czYjpdhp9bjJNZWrB0pT0gULqkETELmfmDDbI/F+Z01P1yXP+aTbVjaJOdTLY7mWkalMVZTxvKsbGipC3cWODLyUyPpYKpDJVEL6sOPmv1I51XoNND28HcIqP2iciWxqVY+XYqR7v8IYsVxuEVSj00vUHAG1Scq97mvAoJBeXO8CLkSTMjQU4K0Yvwftl+ZN/lRRSn+h3vMZrGZ0dYYeSsjutEAidr96Mipv4q6k5rQqW6YW7apaA414yb/fK7HlclyXT6xxbOdP7+9fHVTMx5G5zfN3rc8pblWotv/PFTzJK85Cilz5xoDfsHQ85Hu7kzTUxl6tQFmPS4W4Fj5ujjaXQfe9ZFz9mMg/Sw6lnYOaJ3QwaXKu/d/AOQAJX2Gt2vDQqyS7wldpfYe+53RrmkJhLB9oY/0alWzX7AnyC8qUhx7eCNwHO3KTVoZ5N6ba1IkNtcpmk0Mdm4UVWrx6M7M21Q7mO9eg+/n0m5jQmLuME9glONcfDmUj9qKk6P4rlMVlDtoH+7625y33ryolyWpkF4LqXgCS/J932Z/jrd/uRlp2pMuDx723LjPVSNVCA9Tz2jQ2z/qyncfNQoXSOasWKAhZexXn2tynPyj8+GF50k6hSJLFxNdk6HK/BBZLDH00l+RA3pdg5pCuRV7D7egwlH6Y3cfzh8+EpxJI9lkodI/4xX+iqiY1tfvKx1Nkgzx6KO9qZyqQCT1UNkpOo150NIiTHM2yZ7yvmuAPR2swp3BDl7kyufqTm5pzKVmeOEtESPudv4TlzSOqpl 3IZKxPza NQCvkLLEaM5eQ3DGFctM0FOjNWSJUiGlZ6Lu5UspCZRs2UNjO95MZgiNDEJ3SVF6WDZ61OWfug0NeOuxdZhQdk+7e1TfxHMo1G8HYxLAa2TsvgO+osxHT79B3jmQFp/EZyKOnaiNiws0e/dBcN5CVWIrUXPCrhKf7VJc/EF7nSaXxsvEhDRNwwtnec0omguHK/SsIz3X6LPI6PzXbKg+KwAUo5ECqWWB4Gi6DqMgH6Sj0tLAOGAjfdFLcDenN/bBKiZnUFt02YCzuxBcD56svsYHn7yL6XOus9r9fqYvEpnP8ODKtSfPpEJ+Nl39JnW2uttJpW0SmID2IsckBvbXGB40jFjdU5HB/Xk59MCI33LYNl2YRxc5eh9HvdLSOxC2CjuJWzFJj5ncd6n6tH5shcKSGgmt70TQPJcqV9YzGZCr8+rE= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000273, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, Aug 14, 2024 at 1:43=E2=80=AFPM Mike Yuan wrote: > > On 2024-08-14 at 13:22 -0700, Yosry Ahmed wrote: > > On Wed, Aug 14, 2024 at 12:52=E2=80=AFPM Nhat Pham = wrote: > > > > > > On Wed, Aug 14, 2024 at 10:20=E2=80=AFAM Mike Yuan w= rote: > > > > > > > > Currently, the behavior of zswap.writeback wrt. > > > > the cgroup hierarchy seems a bit odd. Unlike zswap.max, > > > > it doesn't honor the value from parent cgroups. This > > > > surfaced when people tried to globally disable zswap writeback, > > > > i.e. reserve physical swap space only for hibernation [1] - > > > > disabling zswap.writeback only for the root cgroup results > > > > in subcgroups with zswap.writeback=3D1 still performing writeback. > > > > > > > > The consistency became more noticeable after I introduced > > > > the MemoryZSwapWriteback=3D systemd unit setting [2] for > > > > controlling the knob. The patch assumed that the kernel would > > > > enforce the value of parent cgroups. It could probably be > > > > workarounded from systemd's side, by going up the slice unit > > > > tree and inherit the value. Yet I think it's more sensible > > > > to make it behave consistently with zswap.max and friends. > > > > > > May I ask you to add/clarify this new expected behavior in > > > Documentation/admin-guide/cgroup-v2.rst? > > > > > > > > > > > [1] > > > > https://wiki.archlinux.org/title/Power_management/Suspend_and_hiber= nate#Disable_zswap_writeback_to_use_the_swap_space_only_for_hibernation > > > > > > This is an interesting use case. Never envisioned this when I > > > developed this feature :) > > > > > > > [2] https://github.com/systemd/systemd/pull/31734 > > > > > > > > Signed-off-by: Mike Yuan > > > > --- > > > > > > Personally, I don't feel too strongly about this one way or > > > another. I > > > guess you can make a case that people want to disable zswap > > > writeback > > > by default, and only selectively enable it for certain descendant > > > workloads - for convenience, they would set memory.zswap.writeback > > > =3D=3D > > > 0 at root, then enable it on selected descendants? > > > > > > It's not super expensive IMHO - we already perform upward traversal > > > on > > > every zswap store. This wouldn't be the end of the world. > > > > > > Yosry, Johannes - how do you two feel about this? > > > > I wasn't CC'd on this, but found it by chance :) I think there is a > > way for the zswap maintainers entry to match any patch that mentions > > "zswap", not just based on files, right? > > > > Anyway, both use cases make sense to me, disabling writeback > > system-wide or in an entire subtree, and disabling writeback on the > > root and then selectively enabling it. I am slightly inclined to the > > first one (what this patch does). > > > > Considering the hierarchical cgroup knobs work, we usually use the > > most restrictive limit among the ancestors. I guess it ultimately > > depends on how we define "most restrictive". Disabling writeback is > > restrictive in the sense that you don't have access to free some > > zswap > > space to reclaim more memory. OTOH, disabling writeback also means > > that your zswapped memory won't go to disk under memory pressure, so > > in that sense it would be restrictive to force writeback :) > > > > Usually, the "default" is the non-restrictive thing, and then you can > > set restrictions that apply to all children (e.g. no limits are set > > by > > default). Since writeback is enabled by default, it seems like the > > restriction would be disabling writeback. Hence, it would make sense > > to inherit zswap disabling (i.e. only writeback if all ancestors > > allow > > it, like this patch does). > > > > Yeah, I thought about the other way around and reached the same > conclusion. > And there's permission boundary in the mix too - if root disables zswap > writeback for its cgroup, the subcgroups, which could possibly be owned > by other users, should not be able to reenable this. Hmm yeah, I think I agree with your and Yosry's reasonings :) It doesn't affect our use case AFAICS, and the code looks solid to me, so: Reviewed-by: Nhat Pham