From: Nhat Pham
Date: Tue, 19 Sep 2023 12:31:44 -0700
Subject: Re: [PATCH v2 0/2] workload-specific and memory pressure-driven zswap writeback
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, cerasuolodomenico@gmail.com, yosryahmed@google.com,
 sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com,
 mhocko@kernel.org, roman.gushchin@linux.dev, shakeelb@google.com,
 muchun.song@linux.dev, linux-mm@kvack.org, kernel-team@meta.com,
 linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
In-Reply-To: <20230919171447.2712746-1-nphamcs@gmail.com>

On Tue, Sep 19, 2023 at 10:14 AM Nhat Pham wrote:
>
> Changelog:
> v2:
>    * Fix loongarch compiler errors
>    * Use pool stats instead of memcg stats when !CONFIG_MEMCG_KMEM
>    * Rebase the patch on top of the new shrinker API.
>
> There are currently several issues with zswap writeback:
>
> 1. There is only a single global LRU for zswap.
>    This makes it impossible to perform workload-specific shrinking: a
>    memcg under memory pressure cannot determine which pages in the pool
>    it owns, and often ends up writing back pages from other memcgs. This
>    issue has been previously observed in practice and mitigated by simply
>    disabling memcg-initiated shrinking:
>
>    https://lore.kernel.org/all/20230530232435.3097106-1-nphamcs@gmail.com/T/#u
>
>    But this solution leaves a lot to be desired, as we still do not have
>    an avenue for a memcg to free up its own memory locked up in zswap.
>
> 2. We only shrink the zswap pool when the user-defined limit is hit.
>    This means that if we set the limit too high, cold data that are
>    unlikely to be used again will reside in the pool, wasting precious
>    memory. It is hard to predict how much zswap space will be needed
>    ahead of time, as this depends on the workload (specifically, on
>    factors such as memory access patterns and compressibility of the
>    memory pages).
>
> This patch series solves these issues by separating the global zswap
> LRU into per-memcg and per-NUMA LRUs, and performs workload-specific
> (i.e. memcg- and NUMA-aware) zswap writeback under memory pressure. The
> new shrinker does not have any parameter that must be tuned by the
> user, and can be opted in or out on a per-memcg basis.
>
> On a benchmark that we have run:
>
> (without the shrinker)
> real -- mean: 153.27s, median: 153.199s
> sys -- mean: 541.652s, median: 541.903s
> user -- mean: 4384.9674s, median: 4385.471s
>
> (with the shrinker)
> real -- mean: 151.4956s, median: 151.456s
> sys -- mean: 461.1464s, median: 465.656s
> user -- mean: 4384.7118s, median: 4384.675s
>
> We observed a 14-15% reduction in kernel CPU time, which translated to
> over 1% reduction in real time.
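
As a sanity check on the percentages quoted above, the relative reduction
is (before - after) / before. A minimal C sketch of that arithmetic
follows; pct_reduction() is a hypothetical helper written for this note
and is not part of the series:

```c
#include <assert.h>

/*
 * Relative reduction, in percent, when a measurement drops from
 * `before` to `after`. Hypothetical helper for illustration only.
 */
double pct_reduction(double before, double after)
{
	assert(before > 0.0);	/* avoid dividing by zero */
	return (before - after) / before * 100.0;
}
```

With the sys times above, pct_reduction(541.652, 461.1464) comes out at
about 14.9 (means) and pct_reduction(541.903, 465.656) at about 14.1
(medians). For real time, pct_reduction(153.27, 151.4956) is about 1.2.
Both agree with the ranges stated in the cover letter.
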
>
> On another benchmark, where there was a lot more cold memory residing in
> zswap, we observed even more pronounced gains:
>
> (without the shrinker)
> real -- mean: 157.5252s, median: 157.281s
> sys -- mean: 769.3082s, median: 780.545s
> user -- mean: 4378.1622s, median: 4378.286s
>
> (with the shrinker)
> real -- mean: 152.9608s, median: 152.845s
> sys -- mean: 517.4446s, median: 506.749s
> user -- mean: 4387.694s, median: 4387.935s
>
> Here, we saw around 32-35% reduction in kernel CPU time, which
> translated to 2.8% reduction in real time. These results confirm our
> hypothesis that the shrinker is more helpful the more cold memory we
> have.
>
> Domenico Cerasuolo (1):
>   zswap: make shrinking memcg-aware
>
> Nhat Pham (1):
>   zswap: shrinks zswap pool based on memory pressure
>
>  Documentation/admin-guide/mm/zswap.rst |  12 +
>  include/linux/list_lru.h               |  39 +++
>  include/linux/memcontrol.h             |   6 +
>  include/linux/mmzone.h                 |  14 +
>  include/linux/zswap.h                  |   9 +
>  mm/list_lru.c                          |  46 ++-
>  mm/memcontrol.c                        |  33 ++
>  mm/swap_state.c                        |  50 +++-
>  mm/zswap.c                             | 397 ++++++++++++++++++++++---
>  9 files changed, 548 insertions(+), 58 deletions(-)
>
> --
> 2.34.1
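
To illustrate the per-memcg, per-NUMA LRU split described in the cover
letter, here is a toy userspace model in C. It is only a sketch under
stated assumptions: the toy_* names, the small integer memcg ids, and the
fixed array bounds are all made up for this note, and it does not use or
mirror the kernel's list_lru API. The point it shows is that with one list
per (memcg, node) pair, shrinking on behalf of one memcg can only ever
pick that memcg's own entries.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_MEMCGS 4	/* illustrative bounds, not kernel values */
#define TOY_MAX_NODES  2

struct toy_entry {
	struct toy_entry *prev, *next;
	int memcg;	/* owning cgroup, as a hypothetical small integer id */
	int node;	/* NUMA node the entry is accounted to */
};

/* One circular doubly linked list per (memcg, node); head.next is coldest. */
struct toy_lru {
	struct toy_entry head;	/* sentinel, never returned to callers */
	size_t nr;
};

struct toy_lrus {
	struct toy_lru lru[TOY_MAX_MEMCGS][TOY_MAX_NODES];
};

void toy_lrus_init(struct toy_lrus *z)
{
	for (int m = 0; m < TOY_MAX_MEMCGS; m++) {
		for (int n = 0; n < TOY_MAX_NODES; n++) {
			struct toy_lru *l = &z->lru[m][n];

			l->head.prev = l->head.next = &l->head;
			l->nr = 0;
		}
	}
}

/* Append at the tail: the entry becomes the most recently stored one. */
void toy_lrus_add(struct toy_lrus *z, struct toy_entry *e)
{
	struct toy_lru *l;

	assert(e->memcg >= 0 && e->memcg < TOY_MAX_MEMCGS);
	assert(e->node >= 0 && e->node < TOY_MAX_NODES);

	l = &z->lru[e->memcg][e->node];
	e->prev = l->head.prev;
	e->next = &l->head;
	l->head.prev->next = e;
	l->head.prev = e;
	l->nr++;
}

/*
 * Detach and return the coldest entry owned by `memcg` on `node`,
 * or NULL if that list is empty. Other memcgs' lists are not touched.
 */
struct toy_entry *toy_lrus_shrink_one(struct toy_lrus *z, int memcg, int node)
{
	struct toy_lru *l = &z->lru[memcg][node];
	struct toy_entry *e = l->head.next;

	if (e == &l->head)
		return NULL;

	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->prev = e->next = NULL;
	l->nr--;
	return e;
}
```

With a single global list, the coldest entry could belong to any memcg,
which is the first problem the cover letter describes. In this model, a
call such as toy_lrus_shrink_one(&z, 1, 0) returns NULL once memcg 1 has
nothing left on node 0, rather than evicting another memcg's entry.
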