From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2F402C07545 for ; Tue, 24 Oct 2023 17:25:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A38086B02CC; Tue, 24 Oct 2023 13:25:50 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9E7FE6B02CD; Tue, 24 Oct 2023 13:25:50 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8D68E6B02CE; Tue, 24 Oct 2023 13:25:50 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 7DE806B02CC for ; Tue, 24 Oct 2023 13:25:50 -0400 (EDT) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 5784B12041B for ; Tue, 24 Oct 2023 17:25:50 +0000 (UTC) X-FDA: 81381032460.18.E962E14 Received: from mail-il1-f177.google.com (mail-il1-f177.google.com [209.85.166.177]) by imf13.hostedemail.com (Postfix) with ESMTP id 9DFA520010 for ; Tue, 24 Oct 2023 17:25:48 +0000 (UTC) Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=khHGte40; spf=pass (imf13.hostedemail.com: domain of nphamcs@gmail.com designates 209.85.166.177 as permitted sender) smtp.mailfrom=nphamcs@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1698168348; a=rsa-sha256; cv=none; b=UV1P8AD1a0zrVs72dqsxsnSNygUv9JfkAup0tnSsO/a6HE8/o8A4U3/8ICQInr0XvUCHD7 YYFIMIjlMkOnlRw5AH3T/CWMmFa33Yqeh7fSkEl1T1HZMeJZ2pc/1gFhfXVBFw7cIE+3jZ 9LEphsmex0l0UMdApj2YgpQQndQH6os= ARC-Authentication-Results: i=1; imf13.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=khHGte40; spf=pass (imf13.hostedemail.com: domain of nphamcs@gmail.com designates 209.85.166.177 as permitted sender) smtp.mailfrom=nphamcs@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1698168348; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=KyAmYbhNKRNC1zfPKtixIDHsRNfxwdhaIa0uB8FXo00=; b=EkbzYVEILNO9r7PipphGKn/Xt29FM5wqdCgjkWDqSNUdDNLR93fMDmbdAEyIAU4H9FN5ii rfrbYLCij9PrO4niU4Qk1hyQuzFJRH+VGkPTA1oGM2KnyQinuWuXWX2YSo5EPbqc+/kjBZ j0WNR3FpDQTZdNlKN62wME+oOpNZeHY= Received: by mail-il1-f177.google.com with SMTP id e9e14a558f8ab-3579c8f3054so10692555ab.1 for ; Tue, 24 Oct 2023 10:25:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1698168348; x=1698773148; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=KyAmYbhNKRNC1zfPKtixIDHsRNfxwdhaIa0uB8FXo00=; b=khHGte40GYjMSJYouPDIzJej758lNK2GQGfCk/TQ6aWwKQgJvTmLTivewS3aaQsILx ykOhWX5TVOvDMRK2lN0aozm1xSSsKvBS8ZXTtlfOr9zefrICEv+a00QkwiWDcilEqKzu MCzdSOh8cYFUqqdDuPEcQLA6pEdHtkJVMqRMH2WZWrYYSH7RqcZp4yVNk4jKopOqSGwB s/8O2uWmF9Gbl6flSTzEGEwOpdfsrnyW6RdOcPj++xdK2VmdAnAb6vrDAlxewNkPDa7z YUosJKeMEYwhya/935P5+f1kyQBjUjsR0h1i8BPTzcUoAkskGExYgrVt+Lcx/83qLhJ/ +rjA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698168348; x=1698773148; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=KyAmYbhNKRNC1zfPKtixIDHsRNfxwdhaIa0uB8FXo00=; b=p/+lpaLrw14vDHjEAi16+leHXvO6Ak4oxQBg8GSNMURsFqyYGBaT7cffXLtlm34gxJ SPb6weQqcq8mb7lAJZW5FZ3gwgNU6ncturWBJuFFDMiQSQnYgeKw/nDi0rdtM5/lFEdL N7gIAfAzZwFgzEci7ZSrKg/A1WQ3ZviztMffqXxXBhatwUYlAQBXEF067fmxA1kM8H00 4Icpt0LTdUrpnM8cnvHlYdciNxuQ16iq6Et3BNdwi9dr98GkfvKr1q+AzSM/cfF8/lQn BxPsC69ivOjSJ8cPk8PDfNH4uqbu0viR3CULz0I1yRpEUO6jzVq6l9uGR2ZC6UavZ98C 43Xg== X-Gm-Message-State: AOJu0YxudI8DDr/xzOvlDKud+cg3Zyb6KTUBHTOaYaJzznE/MRE5TIM+ fHriNfuevUh+NWAQaB1YeLHQnoJwg6xAgwKi0B0= X-Google-Smtp-Source: AGHT+IF+vxgskghZCA0NCHWlB1Q/r7kYFUjVNizx9+ZIXXSVzrK87LmX/ZotrdYY5apYuQ4vhLBKRTVkLEEBq2bzI6E= X-Received: by 2002:a05:6e02:20e2:b0:34f:20d9:74a9 with SMTP id q2-20020a056e0220e200b0034f20d974a9mr17350245ilv.11.1698168347662; Tue, 24 Oct 2023 10:25:47 -0700 (PDT) MIME-Version: 1.0 References: <20231024000702.1387130-1-nphamcs@gmail.com> <20231024160904.GA1971738@cmpxchg.org> In-Reply-To: <20231024160904.GA1971738@cmpxchg.org> From: Nhat Pham Date: Tue, 24 Oct 2023 10:25:36 -0700 Message-ID: Subject: Re: [PATCH] zswap: export more zswap store failure stats To: Johannes Weiner Cc: akpm@linux-foundation.org, cerasuolodomenico@gmail.com, yosryahmed@google.com, sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com, mhocko@kernel.org, roman.gushchin@linux.dev, shakeelb@google.com, muchun.song@linux.dev, linux-mm@kvack.org, kernel-team@meta.com, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam08 X-Rspamd-Queue-Id: 9DFA520010 X-Stat-Signature: g7giqrc5xwz4w4kmdqrcoemp56iy4yab X-Rspam-User: X-HE-Tag: 1698168348-27288 X-HE-Meta: U2FsdGVkX18CFGsiSMBCfrTWZ4ZMunBmp5zpK1bqzkF9NqtSuY/CPGl3LdDCzZQ7y8+vokAo3xGfbqQ6uYLaoOZCZFK8rJBo5kvx6eysfDcI2aPguZK4A4Sy8Qgh6u1bVQlgjtsZPXcWxSxESOPHIxTzr/0czR0WHqItWu/3U7VssvJ112/Wi6uYFkFddxrFfCSLt+B0KnWO+rvFBLyv4ZWV8UoxV8WRZaTx3BD3pbxcvavxXxCEx8YlbCUqiqjF7J3uCxpxV/lKbV0iUgtBz+a7hlxznAiJlO6OrtXUU4GKKqgMqDaiol7fCr1JPSFo5n1aV1MUuRE2jwJpoaoT6y3xfRZvkBTmQBfjuDT6R88G1dl1C/t05TpyHfLKR0xeKiixr9BHbIECD5mawsIc3nhl61HlAP1HA8AdlRrjxOVS/OEryicK/BBMvAukQIdJbPRnq8Qwy6Ciec78ddPQenUaroL+l8BT5xcotLL5T+y/kdbZEtWh9Y6bQwdm9d2ij5YjAFYz4W26+DG8B5ERllXajDuKmWcfz8q/M+r3XFwlRq1HV7UDH5lPnBp/HiykziJUII4kHPyhc3fCwQ25XTr2KNkP3vE4cq+vAY8Fj9GNvOrRXwCmSys+eaB+wtqKjl4Ct3lUfYke336R7xDc32vi2dblQ1wiR2Gz5bMSuPz+MZATJF/5XXNzusBZvLAKua1h3Zi117JrVfVs5Y5lMxn0Jz8UmB0wiSHgQxdygRZ43hX2J/uUZOkn5A6mSy+JWGxCYjhKxFvU6qs1Qpi0KzYWfm28qmvciiFSCkyIp1xYDx9U2dBBS49fVUTu2iy3FdtbenV47VOaRHaO+wGxZgE908MJLtIIaeGrf81gZxoj0u7LjnwzUECbMUPdA6dQeVH7J7PmyGBnyB2I+7iZBOO/T3UXKfbWsI7jQzk3sjBlzTkahNKNjg8Z83LUzLPPq10y04sf2V7IOfqnbVE /AgAd+S+ PjXeQf3bU+Vg5hEilfZrtCgFXi9e+DQSg2h2Jom4LkGpKNZCWEYpj+/XgfcwF5DFlLLerBoGhJSfFJTfAX3qC6tKsNaYYQmP7dkIpNyVjAHBzy1ZI15FnNRMN5hzgRG+wVOlmV5o8ajTB0dmzwEzWiDR5r+S5NbwqLOSalhtzZCCf8Rp2FhSTM2K+ob/HWZu2sC0fgLgbsVIFxgABS6WQUkyDia6AbNob8yVSHWv/nfELOBbSvQBuCwkQNxaK/ZoKrJKCU95wXA3JjIpeJ7iD+y7EOZEYfTavjvaLp3dsgBYex/hI9RPxvOm+l6CuDYux8fxIojH1D5wwMKAdkjd79cMFM6tExZPnSDzY7ZId3n++ykT58rvagdpqfBJyRyFy82DM9QhvFJCTK36DAUjJGH3pOnBNO9WMFKSZMM8Avpn/CAdT7SfLo5+5KQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, Oct 24, 2023 at 9:09=E2=80=AFAM Johannes Weiner wrote: > > On Mon, Oct 23, 2023 at 05:07:02PM -0700, Nhat Pham wrote: > > Since: > > > > "42c06a0e8ebe mm: kill frontswap" > > > > we no longer have a counter to tracks the number of zswap store > > failures. This makes it hard to investigate and monitor for zswap > > issues. > > > > This patch adds a global and a per-cgroup zswap store failure counter, > > as well as a dedicated debugfs counter for compression algorithm failur= e > > (which can happen for e.g when random data are passed to zswap). > > > > Signed-off-by: Nhat Pham > > I agree this is an issue. > > > --- > > include/linux/vm_event_item.h | 1 + > > mm/memcontrol.c | 1 + > > mm/vmstat.c | 1 + > > mm/zswap.c | 18 ++++++++++++++---- > > 4 files changed, 17 insertions(+), 4 deletions(-) > > > > diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_ite= m.h > > index 8abfa1240040..7b2b117b193d 100644 > > --- a/include/linux/vm_event_item.h > > +++ b/include/linux/vm_event_item.h > > @@ -145,6 +145,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPO= UT, > > #ifdef CONFIG_ZSWAP > > ZSWPIN, > > ZSWPOUT, > > + ZSWPOUT_FAIL, > > Would the writeback stat be sufficient to determine this? > > Hear me out. We already have pswpout that shows when we're hitting > disk swap. Right now we can't tell if this is because of a rejection > or because of writeback. With a writeback counter we could. Oh I see! It's a bit of an extra step, but I supposed (pswpout - writeback) could give us the number of zswap store failures. > > And I think we want the writeback counter anyway going forward in > order to monitor and understand the dynamic shrinker's performance. Domenico and I were talking about this, and we both agree the writeback counter is absolutely necessary - if anything, to make sure that the shrinker is not a) completely not working or b) going overboard. So it is coming as part of the shrinker regardless of this. I just didn't realize that it also solves this issue we're having too! > > Either way we go, one of the metrics needs to be derived from the > other(s). But I think subtle and not so subtle shrinker issues are > more concerning than outright configuration problems where zswap > doesn't work at all. The latter is easier to catch before or during > early deployment with simple functionality tests. > > Plus, rejections should be rare. They are now, and they should become > even more rare or cease to exist going forward. Because every time > they happen at scale, they represent problematic LRU inversions. We > have patched, have pending patches, or discussed changes to reduce > every single one of them: > > /* Store failed due to a reclaim failure after pool limit was rea= ched */ > static u64 zswap_reject_reclaim_fail; > > With the shrinker this becomes less relevant. There was also the > proposal to disable the bypass to swap and just keep the page. The shrinker and that proposal sound like good ideas ;) > > /* Compressed page was too big for the allocator to (optimally) s= tore */ > static u64 zswap_reject_compress_poor; > > You were working on eradicating this (with zsmalloc at least). > > /* Store failed because underlying allocator could not get memory= */ > static u64 zswap_reject_alloc_fail; > /* Store failed because the entry metadata could not be allocated= (rare) */ > static u64 zswap_reject_kmemcache_fail; > > These shouldn't happen at all due to PF_MEMALLOC. > > IOW, the fail counter is expected to stay zero in healthy, > well-configured systems. Rather than an MM event that needs counting, > this strikes me as something that could be a WARN down the line... > Yup, I agree that it should (mostly) be at 0. It being non-zero (especially at a higher ratio w.r.t total number of zswap store counts) is an indicatio= n of something wrong - either a bug, misconfiguration, or a very ill-compressible workload (or again a bug with the compression algorithm). A WARN might be good too, but if it's just an ill-compressible workload that might be too many WARNS :) But we can always just monitor pswpout - writeback (both globally, and on a cgroup-basis, I assume?). > I agree with adding the debugfs counter though. Then I'll send a new patch that focuses on the debugfs counter (for the compression failure). Thanks for the feedback, Johannes.