From: Nhat Pham
Date: Thu, 26 Sep 2024 18:52:39 -0700
Subject: Re: [PATCH] zswap: improve memory.zswap.writeback inheritance
To: Ivan Shapovalov
Cc: linux-kernel@vger.kernel.org, Mike Yuan, Tejun Heo, Zefan Li,
 Johannes Weiner, Michal Koutný, Jonathan Corbet, Yosry Ahmed,
 Chengming Zhou, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song,
 Andrew Morton, Chris Li, cgroups@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-mm@kvack.org
References: <20240926225531.700742-1-intelfx@intelfx.name>

On Thu, Sep 26, 2024 at 4:40 PM Ivan Shapovalov wrote:
>
> On 2024-09-26 at 16:12 -0700, Nhat Pham wrote:
> > On Thu, Sep 26, 2024 at 3:55 PM Ivan Shapovalov wrote:
> > >
> > > Improve the inheritance behavior of the `memory.zswap.writeback` cgroup
> > > attribute introduced during the 6.11 cycle.
> > > Specifically, in 6.11 we walk the parent cgroups until we find a
> > > _disabled_ writeback, which does not allow the user to selectively
> > > enable zswap writeback while having it disabled by default.
> >
> > Is there an actual need for this? This is a theoretical use case I
> > thought of (and raised), but I don't think anybody actually wants
> > this...?
>
> This is of course anecdata, but yes, it does solve a real use-case that
> I'm having right now, as well as a bunch of my colleagues who recently
> complained to me (in private) about pretty much the same use-case.
>
> The use-case is following: it turns out that it could be beneficial for
> desktop systems to run with a pretty high swappiness and zswap
> writeback globally disabled, to nudge the system to compress cold pages
> but not actually write them back to the disk (which would happen pretty
> aggressively if it was not disabled) to reduce I/O and latencies.
> However, under this setup it is sometimes needed to re-enable zswap
> writeback for specific memory-heavy applications that allocate a lot of
> cold pages, to "allow" the kernel to actually swap those programs out.

Out of pure curiosity (and to make sure I fully grasp the problem at
hand), is this the capacity-based zswap writeback (i.e. the one triggered
by limits), or the memory-pressure-based dynamic shrinker?

If you disable the latter and only rely on the former, it should not
"write pages aggressively". Limits are rarely reached (IIUC, zswap.max is
frequently used as a binary knob, and global limits are hard to reach),
so usually the pages that go to disk swap are just the pages zswap
rejects (i.e. mostly pages that fail to compress). This should be a very
small category.

You will still see "swap" usage due to a quirk in the zswap architecture
(which I'm working to fix), but there should rarely be any I/O. So the
setup itself is not super necessary.

If it's the latter, then yeah, I can kinda see the need for the setup.
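
For reference, the 6.11 rule quoted above ("walk the parent cgroups until
we find a _disabled_ writeback") can be modeled with a short Python
sketch. This is only a toy model, not kernel code: the `Cgroup` class and
the `writeback_enabled` helper are hypothetical names made up for
illustration, and the enabled-by-default value is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cgroup:
    """Toy stand-in for a memory cgroup; not a kernel structure."""
    name: str
    writeback: bool = True  # assumed default: writeback enabled
    parent: Optional["Cgroup"] = None


def writeback_enabled(cg: Optional[Cgroup]) -> bool:
    """Walk from cg up to the root; one disabled ancestor disables writeback."""
    while cg is not None:
        if not cg.writeback:
            return False
        cg = cg.parent
    return True


# Writeback disabled at the top, with an attempted opt-in further down.
root = Cgroup("root", writeback=False)
app = Cgroup("app", writeback=True, parent=root)

print(writeback_enabled(app))  # prints False
```

Under this model the opt-in on `app` has no effect while `root` is
disabled, which is the limitation the patch description refers to.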
>
> >
> > Besides, most people who want this can just:
> >
> > 1. Enable zswap writeback on root cgroup (and all non-leaf cgroups).
> >
> > 2. Disable zswap writeback on leaf cgroups on creation by default.
> >
> > 3. Selectively enable zswap writeback for the leaf cgroups.
> >
> > All of this is quite doable in userspace. It's not even _that_ racy -
> > just do this before adding tasks etc. to the cgroup?
>
> Well, yes, it is technically doable from userspace, just like it was
> technically doable prior to commit e39925734909 to have userspace
> explicitly control the entire hierarchy of writeback settings.
> However it _is_ pretty painful, and the flow you described would
> essentially negate any benefits of that patch (it would require
> userspace to, once again, manage the entire hierarchy explicitly
> without any help from the kernel).

I think it's a tad different. In the case of commit e39925734909, the
hierarchical behavior of the zswap.writeback knob was quite semantically
confusing, almost counter-intuitive (and did not conform to the
convention of other cgroup knobs, which use the most restrictive limit -
check out the zswap.max limit checking, for example). That commit
arguably fixes it for the "common" case (i.e. you want the hierarchical
enforcement to hold for the most part). That's why there were even
conversations about cc-ing the stable mailing list to backport it to
older kernels.

This is more of a "new use case" patch. It complicates the API for
something readily doable in userspace - the kernel does not do anything
that userspace cannot achieve. So it should undergo stricter scrutiny. :)

Anyway, Yosry, Johannes, how do you feel about this?
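
To make the three userspace steps quoted above concrete, here is a rough
Python sketch. It assumes a cgroup v2 hierarchy in which each cgroup
directory exposes a `memory.zswap.writeback` file; the helper names
(`set_writeback`, `create_leaf`, `allow_writeback`) and the example path
are hypothetical, and a real setup would also need sufficient privileges
and the memory controller enabled for the child cgroups.

```python
from pathlib import Path

KNOB = "memory.zswap.writeback"


def set_writeback(cgroup_dir: Path, enabled: bool) -> None:
    """Write 1 or 0 to the cgroup's memory.zswap.writeback file."""
    (cgroup_dir / KNOB).write_text("1" if enabled else "0")


def create_leaf(parent_dir: Path, name: str) -> Path:
    """Steps 1 and 2: keep the parent enabled, create the leaf disabled."""
    set_writeback(parent_dir, True)
    leaf = parent_dir / name
    leaf.mkdir(exist_ok=True)
    set_writeback(leaf, False)
    return leaf


def allow_writeback(leaf_dir: Path) -> None:
    """Step 3: opt one leaf back in, ideally before tasks are added to it."""
    set_writeback(leaf_dir, True)
```

A caller might then do something like
`leaf = create_leaf(Path("/sys/fs/cgroup/example.slice"), "heavy-app")`
followed by `allow_writeback(leaf)`, where the path is purely
illustrative. Every non-leaf cgroup between the root and the leaf would
need the same "enabled" treatment for the leaf setting to take effect.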