Date: Wed, 14 Aug 2024 20:43:07 +0000
To: Yosry Ahmed, Nhat Pham
From: Mike Yuan
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, Andrew Morton, Muchun Song, Shakeel Butt, Roman Gushchin, Michal Hocko, Johannes Weiner
Subject: Re: [PATCH] mm/memcontrol: respect zswap.writeback setting from parent cg too
References: <20240814171800.23558-1-me@yhndnzj.com>

On 2024-08-14 at 13:22 -0700, Yosry Ahmed wrote:
> On Wed, Aug 14, 2024 at 12:52 PM Nhat Pham wrote:
> >
> > On Wed, Aug 14, 2024 at 10:20 AM Mike Yuan wrote:
> > >
> > > Currently, the behavior of zswap.writeback wrt.
> > > the cgroup hierarchy seems a bit odd. Unlike zswap.max,
> > > it doesn't honor the value from parent cgroups. This
> > > surfaced when people tried to globally disable zswap writeback,
> > > i.e. reserve physical swap space only for hibernation [1] -
> > > disabling zswap.writeback only for the root cgroup results
> > > in subcgroups with zswap.writeback=1 still performing writeback.
> > >
> > > The consistency became more noticeable after I introduced
> > > the MemoryZSwapWriteback= systemd unit setting [2] for
> > > controlling the knob. The patch assumed that the kernel would
> > > enforce the value of parent cgroups. It could probably be
> > > workarounded from systemd's side, by going up the slice unit
> > > tree and inherit the value. Yet I think it's more sensible
> > > to make it behave consistently with zswap.max and friends.
> >
> > May I ask you to add/clarify this new expected behavior in
> > Documentation/admin-guide/cgroup-v2.rst?
> >
> > >
> > > [1]
> > > https://wiki.archlinux.org/title/Power_management/Suspend_and_hibernate#Disable_zswap_writeback_to_use_the_swap_space_only_for_hibernation
> >
> > This is an interesting use case. Never envisioned this when I
> > developed this feature :)
> >
> > > [2] https://github.com/systemd/systemd/pull/31734
> > >
> > > Signed-off-by: Mike Yuan
> > > ---
> >
> > Personally, I don't feel too strongly about this one way or another. I
> > guess you can make a case that people want to disable zswap writeback
> > by default, and only selectively enable it for certain descendant
> > workloads - for convenience, they would set memory.zswap.writeback == 0
> > at root, then enable it on selected descendants?
> >
> > It's not super expensive IMHO - we already perform upward traversal on
> > every zswap store. This wouldn't be the end of the world.
> >
> > Yosry, Johannes - how do you two feel about this?
>
> I wasn't CC'd on this, but found it by chance :) I think there is a
> way for the zswap maintainers entry to match any patch that mentions
> "zswap", not just based on files, right?
>
> Anyway, both use cases make sense to me, disabling writeback
> system-wide or in an entire subtree, and disabling writeback on the
> root and then selectively enabling it. I am slightly inclined to the
> first one (what this patch does).
>
> Considering the hierarchical cgroup knobs work, we usually use the
> most restrictive limit among the ancestors. I guess it ultimately
> depends on how we define "most restrictive". Disabling writeback is
> restrictive in the sense that you don't have access to free some zswap
> space to reclaim more memory. OTOH, disabling writeback also means
> that your zswapped memory won't go to disk under memory pressure, so
> in that sense it would be restrictive to force writeback :)
>
> Usually, the "default" is the non-restrictive thing, and then you can
> set restrictions that apply to all children (e.g. no limits are set by
> default). Since writeback is enabled by default, it seems like the
> restriction would be disabling writeback. Hence, it would make sense
> to inherit zswap disabling (i.e. only writeback if all ancestors allow
> it, like this patch does).
>

Yeah, I thought about the other way around too and reached the same
conclusion. There's also a permission boundary in the mix - if root
disables zswap writeback for its cgroup, the subcgroups, which could
be owned by other users, should not be able to re-enable it.

> What we do today dismisses inheritance completely, so it seems to me
> like it should be changed anyway.
>
> I am thinking out loud here, let me know if my reasoning makes sense
> to you.
>
> >
> > Code looks solid to me - I think the upward tree traversal should be
> > safe, as long as memcg is valid (since memcg holds reference to its
> > parent IIRC).
> >
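
For reference, the semantics discussed above (writeback happens only if
the cgroup and every one of its ancestors allow it) can be modelled in a
few lines of user-space Python. This is only a sketch of the idea, not
the kernel code: the Cgroup class, its zswap_writeback field and the
writeback_allowed() helper are hypothetical names made up for
illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cgroup:
    """Hypothetical stand-in for a memory cgroup (not a kernel structure)."""
    name: str
    zswap_writeback: bool = True          # writeback is enabled by default
    parent: Optional["Cgroup"] = None     # None marks the root of the tree


def writeback_allowed(cg: Cgroup) -> bool:
    """Return True only if cg and all of its ancestors allow writeback."""
    node: Optional[Cgroup] = cg
    while node is not None:
        if not node.zswap_writeback:
            # One disabling ancestor is enough to forbid writeback.
            return False
        node = node.parent
    return True


# Root disables writeback; the child keeps the default (enabled).
root = Cgroup("root", zswap_writeback=False)
child = Cgroup("child", parent=root)

print(writeback_allowed(child))  # the child's own setting no longer matters
```

With this model a subcgroup cannot re-enable writeback that an ancestor
has turned off, which is the permission-boundary property mentioned in
the reply. The walk costs one step per level of the hierarchy.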