From: Yosry Ahmed <yosryahmed@google.com>
To: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
"nphamcs@gmail.com" <nphamcs@gmail.com>,
"chengming.zhou@linux.dev" <chengming.zhou@linux.dev>,
"usamaarif642@gmail.com" <usamaarif642@gmail.com>,
"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
"Huang, Ying" <ying.huang@intel.com>,
"21cnbao@gmail.com" <21cnbao@gmail.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"linux-crypto@vger.kernel.org" <linux-crypto@vger.kernel.org>,
"herbert@gondor.apana.org.au" <herbert@gondor.apana.org.au>,
"davem@davemloft.net" <davem@davemloft.net>,
"clabbe@baylibre.com" <clabbe@baylibre.com>,
"ardb@kernel.org" <ardb@kernel.org>,
"ebiggers@google.com" <ebiggers@google.com>,
"surenb@google.com" <surenb@google.com>,
"Accardi, Kristen C" <kristen.c.accardi@intel.com>,
"zanussi@kernel.org" <zanussi@kernel.org>,
"viro@zeniv.linux.org.uk" <viro@zeniv.linux.org.uk>,
"brauner@kernel.org" <brauner@kernel.org>,
"jack@suse.cz" <jack@suse.cz>,
"mcgrof@kernel.org" <mcgrof@kernel.org>,
"kees@kernel.org" <kees@kernel.org>,
"joel.granados@kernel.org" <joel.granados@kernel.org>,
"bfoster@redhat.com" <bfoster@redhat.com>,
"willy@infradead.org" <willy@infradead.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"Feghali, Wajdi K" <wajdi.k.feghali@intel.com>,
"Gopal, Vinodh" <vinodh.gopal@intel.com>
Subject: Re: [RFC PATCH v1 00/13] zswap IAA compress batching
Date: Wed, 23 Oct 2024 11:15:50 -0700
Message-ID: <CAJD7tkZ9VLNrwyeRQf0AXdQAG8vW_ZL_y0rfU77p5HMZnch=mw@mail.gmail.com>
In-Reply-To: <SJ0PR11MB56784C5C542E84014525BA8CC94D2@SJ0PR11MB5678.namprd11.prod.outlook.com>
On Tue, Oct 22, 2024 at 7:53 PM Sridhar, Kanchana P
<kanchana.p.sridhar@intel.com> wrote:
>
> Hi Yosry,
>
> > -----Original Message-----
> > From: Yosry Ahmed <yosryahmed@google.com>
> > Sent: Tuesday, October 22, 2024 5:57 PM
> > To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
> > Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> > hannes@cmpxchg.org; nphamcs@gmail.com; chengming.zhou@linux.dev;
> > usamaarif642@gmail.com; ryan.roberts@arm.com; Huang, Ying
> > <ying.huang@intel.com>; 21cnbao@gmail.com; akpm@linux-foundation.org;
> > linux-crypto@vger.kernel.org; herbert@gondor.apana.org.au;
> > davem@davemloft.net; clabbe@baylibre.com; ardb@kernel.org;
> > ebiggers@google.com; surenb@google.com; Accardi, Kristen C
> > <kristen.c.accardi@intel.com>; zanussi@kernel.org; viro@zeniv.linux.org.uk;
> > brauner@kernel.org; jack@suse.cz; mcgrof@kernel.org; kees@kernel.org;
> > joel.granados@kernel.org; bfoster@redhat.com; willy@infradead.org; linux-
> > fsdevel@vger.kernel.org; Feghali, Wajdi K <wajdi.k.feghali@intel.com>; Gopal,
> > Vinodh <vinodh.gopal@intel.com>
> > Subject: Re: [RFC PATCH v1 00/13] zswap IAA compress batching
> >
> > On Thu, Oct 17, 2024 at 11:41 PM Kanchana P Sridhar
> > <kanchana.p.sridhar@intel.com> wrote:
> > >
> > >
> > > IAA Compression Batching:
> > > =========================
> > >
> > > This RFC patch-series introduces the use of the Intel Analytics Accelerator
> > > (IAA) for parallel compression of pages in a folio, and for batched reclaim
> > > of hybrid any-order batches of folios in shrink_folio_list().
> > >
> > > The patch-series is organized as follows:
> > >
> > > 1) iaa_crypto driver enablers for batching: Relevant patches are tagged
> > > with "crypto:" in the subject:
> > >
> > > a) async poll crypto_acomp interface without interrupts.
> > > b) crypto testmgr acomp poll support.
> > > c) Modifying the default sync_mode to "async" and disabling
> > > verify_compress by default, to facilitate users to run IAA easily for
> > > comparison with software compressors.
> > > d) Changing the cpu-to-iaa mappings to more evenly balance cores to IAA
> > > devices.
> > > e) Addition of a "global_wq" per IAA, which can be used as a global
> > > resource for the socket. If the user configures 2WQs per IAA device,
> > > the driver will distribute compress jobs from all cores on the
> > > socket to the "global_wqs" of all the IAA devices on that socket, in
> > > a round-robin manner. This can be used to improve compression
> > > throughput for workloads that see a lot of swapout activity.
> > >
> > > 2) Migrating zswap to use async poll in zswap_compress()/decompress().
> > > 3) A centralized batch compression API that can be used by swap modules.
> > > 4) IAA compress batching within large folio zswap stores.
> > > 5) IAA compress batching of any-order hybrid folios in
> > > shrink_folio_list(). The newly added "sysctl vm.compress-batchsize"
> > > parameter can be used to configure the number of folios in [1, 32] to
> > > be reclaimed using compress batching.
> >
> > I am still digesting this series but I have some high level questions
> > that I left on some patches. My intuition though is that we should
> > drop (5) from the initial proposal as it's most controversial.
> > Batching reclaim of unrelated folios through zswap *might* make sense,
> > but it needs a broader conversation and it needs justification on its
> > own merit, without the rest of the series.
>
> Thanks for these suggestions! Sure, I can drop (5) from the initial patch-set.
> Agree also, this needs a broader discussion.
>
> I believe the 4K folios usemem30 data in this patchset does show the
> benefits of batching reclaim, and so justifies it on its own merit. I added
> the data on batching reclaim with kernel compilation as part of the 4K folios
> experiments in the IAA decompression batching patch-series [1].
> Listing it here as well. I will make sure to add this data in subsequent revs.
>
> --------------------------------------------------------------------------
> Kernel compilation in tmpfs/allmodconfig, 2G max memory:
>
> No large folios               mm-unstable-10-16-2024   shrink_folio_list()
>                                                         batching of folios
> --------------------------------------------------------------------------
> zswap compressor                zstd     deflate-iaa           deflate-iaa
> vm.compress-batchsize            n/a             n/a                    32
> vm.page-cluster                    3               3                     3
> --------------------------------------------------------------------------
> real_sec                      783.87          761.69                747.32
> user_sec                   15,750.07       15,716.69             15,728.39
> sys_sec                     6,522.32        5,725.28              5,399.44
> Max_RSS_KB                 1,872,640       1,870,848             1,874,432
>
> zswpout                   82,364,991      97,739,600           102,780,612
> zswpin                    21,303,393      27,684,166            29,016,252
> pswpout                           13             222                   213
> pswpin                            12             209                   202
> pgmajfault                17,114,339      22,421,211            23,378,161
> swap_ra                    4,596,035       5,840,082             6,231,646
> swap_ra_hit                2,903,249       3,682,444             3,940,420
> --------------------------------------------------------------------------
>
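
For readers who want the relative deltas rather than the raw numbers, here is
a minimal sketch that computes them from the real_sec and sys_sec rows quoted
above. The figures are copied from the table; the names RESULTS, pct_change()
and compare() are made up for this illustration and are not part of the
patch-series.

```python
# Figures copied from the quoted kernel-compilation table (4K folios).
# The dictionary keys are illustrative labels, not names from the series.
RESULTS = {
    "zstd": {"real_sec": 783.87, "sys_sec": 6522.32},
    "deflate-iaa": {"real_sec": 761.69, "sys_sec": 5725.28},
    "deflate-iaa, batchsize=32": {"real_sec": 747.32, "sys_sec": 5399.44},
}


def pct_change(baseline: float, value: float) -> float:
    """Relative change of value versus baseline, in percent (negative = lower)."""
    return (value - baseline) / baseline * 100.0


def compare(baseline_name: str, other_name: str) -> dict:
    """Per-metric percent change of other_name relative to baseline_name."""
    base, other = RESULTS[baseline_name], RESULTS[other_name]
    return {m: round(pct_change(base[m], other[m]), 2) for m in base}


if __name__ == "__main__":
    for baseline in ("zstd", "deflate-iaa"):
        print(baseline, "->", compare(baseline, "deflate-iaa, batchsize=32"))
```

On these numbers, reclaim batching lowers sys time by roughly 5.7% and
wall-clock time by roughly 1.9% relative to unbatched deflate-iaa.
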
> The performance improvements seen do depend on compression batching in
> the swap modules (zswap). The implementation in patch 12 in the compress
> batching series sets up this zswap compression pipeline, which takes an array
> of folios and processes them in batches of 8 pages compressed in parallel in
> hardware.
> That being said, we do see latency improvements even with reclaim batching
> combined with zswap compress batching with zstd/lzo-rle/etc. I haven't done a
> lot of analysis of this, but I am guessing fewer calls from the swap layer
> (swap_writepage()) into zswap could have something to do with this. If we believe
> that batching can be the right thing to do even for the software compressors,
> I can gather batching data with zstd for v2.
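
To make the pipeline described above concrete, here is a rough user-space
sketch of the control flow only: flatten an array of folios into pages, group
the pages into batches of 8, and compress each batch concurrently. The batch
size of 8 comes from the quoted text; everything else is an assumption made
for illustration. zlib and a thread pool stand in for IAA hardware, and
pages_of(), batches() and compress_folios() are hypothetical names, not
functions from the series.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

PAGE_SIZE = 4096   # assumed 4K base page
BATCH_SIZE = 8     # pages compressed in parallel, per the quoted description


def pages_of(folios):
    """Yield every PAGE_SIZE chunk of every folio, in order."""
    for folio in folios:
        for off in range(0, len(folio), PAGE_SIZE):
            yield folio[off:off + PAGE_SIZE]


def batches(pages, size=BATCH_SIZE):
    """Group an iterable of pages into lists of at most `size` pages."""
    it = iter(pages)
    while batch := list(islice(it, size)):
        yield batch


def compress_folios(folios):
    """Compress all pages, one batch at a time, each batch concurrently."""
    out = []
    with ThreadPoolExecutor(max_workers=BATCH_SIZE) as pool:
        for batch in batches(pages_of(folios)):
            # pool.map() preserves input order, so out[i] matches page i.
            out.extend(pool.map(zlib.compress, batch))
    return out
```

For example, folios of 16, 1 and 4 pages give 21 pages in total, which this
sketch processes as batches of 8, 8 and 5. A batch may therefore span unrelated
folios, which is the case that needs the broader discussion.
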

Thanks for sharing the data. What I meant is, I think we should focus
on supporting large folio compression batching for this series, and
only present figures for this support to avoid confusion.

Once this lands, we can discuss support for batching the compression
of different unrelated folios separately, as it spans areas beyond
just zswap and will need broader discussion.