From: Yosry Ahmed
Date: Thu, 1 Aug 2024 13:02:13 -0700
Subject: Re: [PATCH v2 2/2] zswap: increment swapin count for non-pivot swapped in pages
To: Nhat Pham
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, shakeel.butt@linux.dev, linux-mm@kvack.org, kernel-team@meta.com, linux-kernel@vger.kernel.org, flintglass@gmail.com, chengming.zhou@linux.dev
References: <20240730222707.2324536-1-nphamcs@gmail.com> <20240730222707.2324536-3-nphamcs@gmail.com>
In-Reply-To: <20240730222707.2324536-3-nphamcs@gmail.com>

On Tue, Jul 30, 2024 at 3:27 PM Nhat Pham wrote:
>
> Currently, we only increment the swapin counter on pivot pages. This
> means we are not taking into account pages that also need to be swapped
> in, but are already taken care of as part of the readahead window. We

Hmm, but there is a chance that these pages are not actually needed, in
which case we will unnecessarily increase the zswap protection. Does the
readahead window self-correct if the pages were not used?

> are also incrementing when the pages are read from the zswap pool, which
> is inaccurate.

I feel like this is the more important part. It should be the focus of
the commit log, with more details (i.e. why it is wrong to increment the
zswap protection in this case).

Do we need a Fixes tag and Cc: stable for this one? Maybe it can be moved
first in the series to make backports easier.

>
> This patch rectifies this issue by incrementing whenever we need to
> perform a non-zswap read.
>
> To test this change, I built the kernel under a cgroup with its
> memory.max set to 2 GB:
>
> real: 236.66s
> user: 4286.06s
> sys: 652.86s
> swapins: 81552
>
> For comparison, with just the new second chance algorithm, the build
> time is as follows:
>
> real: 244.85s
> user: 4327.22s
> sys: 664.39s
> swapins: 94663
>
> With neither:
>
> real: 263.89s
> user: 4318.11s
> sys: 673.29s
> swapins: 227300.5
>
> (average over 5 runs)
>
> With this change, the kernel CPU time reduces by a further 1.7%, and
> the real time is reduced by another 3.3%, compared to just the second
> chance algorithm by itself. The swapins count also reduces by another
> 13.85%.
>
> Combining the two changes, we reduce the real time by 10.32%, kernel CPU
> time by 3%, and number of swapins by 64.12%.
>
> To gauge the new scheme's ability to offload cold data, I ran another
> benchmark, in which the kernel was built under a cgroup with memory.max
> set to 3 GB, but with 0.5 GB worth of cold data allocated before each
> build (in a shmem file).
>
> Under the old scheme:
>
> real: 197.18s
> user: 4365.08s
> sys: 289.02s
> zswpwb: 72115.2
>
> Under the new scheme:
>
> real: 195.8s
> user: 4362.25s
> sys: 290.14s
> zswpwb: 87277.8
>
> (average over 5 runs)
>
> Notice that we actually observe a 21% increase in the number of written
> back pages - so the new scheme is just as good, if not better at
> offloading pages from the zswap pool when they are cold. Build time
> reduces by around 0.7% as a result.
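
As a sanity check on the percentages quoted above, the deltas can be
recomputed from the raw numbers. This is only an illustrative userspace
sketch: pct_reduction() and print_quoted_deltas() are made-up names for
this example, not anything in the kernel tree.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: percentage by which 'after' is lower than 'before'. */
double pct_reduction(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}

/* Recompute the deltas from the raw figures quoted in the commit log. */
void print_quoted_deltas(void)
{
	/* This patch on top of the second chance algorithm alone. */
	printf("sys:     %.2f%%\n", pct_reduction(664.39, 652.86));
	printf("real:    %.2f%%\n", pct_reduction(244.85, 236.66));
	printf("swapins: %.2f%%\n", pct_reduction(94663.0, 81552.0));

	/* Both changes combined, against the run with neither. */
	printf("real:    %.2f%%\n", pct_reduction(263.89, 236.66));
	printf("sys:     %.2f%%\n", pct_reduction(673.29, 652.86));
	printf("swapins: %.2f%%\n", pct_reduction(227300.5, 81552.0));

	/* Cold-data benchmark: build time, old scheme vs new scheme. */
	printf("real:    %.2f%%\n", pct_reduction(197.18, 195.8));
}
```

Calling print_quoted_deltas() from a small main() should give values that
round to the figures in the commit log.
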
>
> Suggested-by: Johannes Weiner
> Signed-off-by: Nhat Pham
> ---
>  mm/page_io.c    | 11 ++++++++++-
>  mm/swap_state.c |  8 ++------
>  2 files changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/mm/page_io.c b/mm/page_io.c
> index ff8c99ee3af7..0004c9fbf7e8 100644
> --- a/mm/page_io.c
> +++ b/mm/page_io.c
> @@ -521,7 +521,15 @@ void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
>
>          if (zswap_load(folio)) {
>                  folio_unlock(folio);
> -        } else if (data_race(sis->flags & SWP_FS_OPS)) {
> +                goto finish;
> +        }
> +
> +        /*
> +         * We have to read the page from slower devices. Increase zswap protection.
> +         */
> +        zswap_folio_swapin(folio);
> +
> +        if (data_race(sis->flags & SWP_FS_OPS)) {
>                  swap_read_folio_fs(folio, plug);
>          } else if (synchronous) {
>                  swap_read_folio_bdev_sync(folio, sis);
> @@ -529,6 +537,7 @@ void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
>                  swap_read_folio_bdev_async(folio, sis);
>          }
>
> +finish:
>          if (workingset) {
>                  delayacct_thrashing_end(&in_thrashing);
>                  psi_memstall_leave(&pflags);
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index a1726e49a5eb..3a0cf965f32b 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -698,10 +698,8 @@ struct folio *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
>          /* The page was likely read above, so no need for plugging here */
>          folio = __read_swap_cache_async(entry, gfp_mask, mpol, ilx,
>                                          &page_allocated, false);
> -        if (unlikely(page_allocated)) {
> -                zswap_folio_swapin(folio);
> +        if (unlikely(page_allocated))
>                  swap_read_folio(folio, NULL);
> -        }
>          return folio;
>  }
>
> @@ -850,10 +848,8 @@ static struct folio *swap_vma_readahead(swp_entry_t targ_entry, gfp_t gfp_mask,
>          /* The folio was likely read above, so no need for plugging here */
>          folio = __read_swap_cache_async(targ_entry, gfp_mask, mpol, targ_ilx,
>                                          &page_allocated, false);
> -        if (unlikely(page_allocated)) {
> -                zswap_folio_swapin(folio);
> +        if (unlikely(page_allocated))
>                  swap_read_folio(folio, NULL);
> -        }
>          return folio;
>  }
>
> --
> 2.43.0
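
To make the behavioural change in the quoted diff easier to see, here is a
userspace model of the swap_read_folio() flow after the patch. Everything
prefixed with mock_ is a stand-in invented for illustration. It is not the
kernel implementation, and it assumes that zswap_folio_swapin() amounts to
bumping a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for struct folio; it only tracks where the data lives. */
struct mock_folio {
	bool in_zswap;	/* true if the zswap pool still holds a copy */
	bool uptodate;	/* set once the data has been read in */
};

/* Assumption: stands in for whatever zswap_folio_swapin() increments. */
unsigned long mock_swapin_count;

/* Mock of zswap_load(): succeeds only if the folio is in the zswap pool. */
bool mock_zswap_load(struct mock_folio *folio)
{
	if (!folio->in_zswap)
		return false;
	folio->uptodate = true;
	return true;
}

/* Mock of a read from the slower backing swap device. */
void mock_backing_read(struct mock_folio *folio)
{
	folio->uptodate = true;
}

/*
 * Models the patched flow: a zswap hit returns early (the "goto finish"
 * in the diff), and only a non-zswap read bumps the swapin counter.
 */
void mock_swap_read_folio(struct mock_folio *folio)
{
	assert(folio);

	if (mock_zswap_load(folio))
		return;

	mock_swapin_count++;
	mock_backing_read(folio);
}
```

In this model the counter no longer depends on whether the caller was
handling the pivot page or a readahead page, because the accounting sits in
the common read path rather than in the two readahead callers.
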