From: Nhat Pham <nphamcs@gmail.com>
To: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
"yosry.ahmed@linux.dev" <yosry.ahmed@linux.dev>,
"chengming.zhou@linux.dev" <chengming.zhou@linux.dev>,
"usamaarif642@gmail.com" <usamaarif642@gmail.com>,
"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
"21cnbao@gmail.com" <21cnbao@gmail.com>,
"ying.huang@linux.alibaba.com" <ying.huang@linux.alibaba.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"senozhatsky@chromium.org" <senozhatsky@chromium.org>,
"sj@kernel.org" <sj@kernel.org>,
"kasong@tencent.com" <kasong@tencent.com>,
"linux-crypto@vger.kernel.org" <linux-crypto@vger.kernel.org>,
"herbert@gondor.apana.org.au" <herbert@gondor.apana.org.au>,
"davem@davemloft.net" <davem@davemloft.net>,
"clabbe@baylibre.com" <clabbe@baylibre.com>,
"ardb@kernel.org" <ardb@kernel.org>,
"ebiggers@google.com" <ebiggers@google.com>,
"surenb@google.com" <surenb@google.com>,
"Accardi, Kristen C" <kristen.c.accardi@intel.com>,
"Gomes, Vinicius" <vinicius.gomes@intel.com>,
"Cabiddu, Giovanni" <giovanni.cabiddu@intel.com>,
"Feghali, Wajdi K" <wajdi.k.feghali@intel.com>
Subject: Re: [PATCH v14 26/26] mm: zswap: Batched zswap_compress() for compress batching of large folios.
Date: Sat, 31 Jan 2026 16:48:43 -0800 [thread overview]
Message-ID: <CAKEwX=PqWQ_39BuApc_bT1WKQMJyNPDs+Gv0JAU5cTa1KNDj9g@mail.gmail.com> (raw)
In-Reply-To: <SJ2PR11MB8472E5DC16759FFE6E23EED0C99CA@SJ2PR11MB8472.namprd11.prod.outlook.com>
On Sat, Jan 31, 2026 at 12:32 PM Sridhar, Kanchana P
<kanchana.p.sridhar@intel.com> wrote:
>
>
> > -----Original Message-----
> > From: Nhat Pham <nphamcs@gmail.com>
> > Sent: Friday, January 30, 2026 5:13 PM
> > To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
> > Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> > hannes@cmpxchg.org; yosry.ahmed@linux.dev; chengming.zhou@linux.dev;
> > usamaarif642@gmail.com; ryan.roberts@arm.com; 21cnbao@gmail.com;
> > ying.huang@linux.alibaba.com; akpm@linux-foundation.org;
> > senozhatsky@chromium.org; sj@kernel.org; kasong@tencent.com; linux-
> > crypto@vger.kernel.org; herbert@gondor.apana.org.au;
> > davem@davemloft.net; clabbe@baylibre.com; ardb@kernel.org;
> > ebiggers@google.com; surenb@google.com; Accardi, Kristen C
> > <kristen.c.accardi@intel.com>; Gomes, Vinicius <vinicius.gomes@intel.com>;
> > Cabiddu, Giovanni <giovanni.cabiddu@intel.com>; Feghali, Wajdi K
> > <wajdi.k.feghali@intel.com>
> > Subject: Re: [PATCH v14 26/26] mm: zswap: Batched zswap_compress() for
> > compress batching of large folios.
> >
> > On Sat, Jan 24, 2026 at 7:36 PM Kanchana P Sridhar
> > <kanchana.p.sridhar@intel.com> wrote:
> > >
> > > We introduce a new batching implementation of zswap_compress() for
> > > compressors that do and do not support batching. This eliminates code
> > > duplication and facilitates code maintainability with the introduction
> > > of compress batching.
> > >
> > > The vectorized implementation of calling the earlier zswap_compress()
> > > sequentially, one page at a time in zswap_store_pages(), is replaced
> > > with this new version of zswap_compress() that accepts multiple pages to
> > > compress as a batch.
> > >
> > > If the compressor does not support batching, each page in the batch is
> > > compressed and stored sequentially. If the compressor supports batching,
> > > for e.g., 'deflate-iaa', the Intel IAA hardware accelerator, the batch
> > > is compressed in parallel in hardware.
> > >
> > > If the batch is compressed without errors, the compressed buffers for
> > > the batch are stored in zsmalloc. In case of compression errors, the
> > > current behavior based on whether the folio is enabled for zswap
> > > writeback, is preserved.
> > >
> > > The batched zswap_compress() incorporates Herbert's suggestion for
> > > SG lists to represent the batch's inputs/outputs to interface with the
> > > crypto API [1].
> > >
> > > Performance data:
> > > =================
> > > As suggested by Barry, this is the performance data gathered on Intel
> > > Sapphire Rapids with two workloads:
> > >
> > > 1) 30 usemem processes in a 150 GB memory limited cgroup, each
> > > allocates 10G, i.e, effectively running at 50% memory pressure.
> > > 2) kernel_compilation "defconfig", 32 threads, cgroup memory limit set
> > > to 1.7 GiB (50% memory pressure, since baseline memory usage is 3.4
> > > GiB): data averaged across 10 runs.
> > >
> > > To keep comparisons simple, all testing was done without the
> > > zswap shrinker.
> > >
> > >
> > ==============================================================
> > ===========
> > > IAA mm-unstable-1-23-2026 v14
> > >
> > ==============================================================
> > ===========
> > > zswap compressor deflate-iaa deflate-iaa IAA Batching
> > > vs.
> > > IAA Sequential
> > >
> > ==============================================================
> > ===========
> > > usemem30, 64K folios:
> > >
> > > Total throughput (KB/s) 6,226,967 10,551,714 69%
> > > Average throughput (KB/s) 207,565 351,723 69%
> > > elapsed time (sec) 99.19 67.45 -32%
> > > sys time (sec) 2,356.19 1,580.47 -33%
> > >
> > > usemem30, PMD folios:
> > >
> > > Total throughput (KB/s) 6,347,201 11,315,500 78%
> > > Average throughput (KB/s) 211,573 377,183 78%
> > > elapsed time (sec) 88.14 63.37 -28%
> > > sys time (sec) 2,025.53 1,455.23 -28%
> > >
> > > kernel_compilation, 64K folios:
> > >
> > > elapsed time (sec) 100.10 98.74 -1.4%
> > > sys time (sec) 308.72 301.23 -2%
> > >
> > > kernel_compilation, PMD folios:
> > >
> > > elapsed time (sec) 95.29 93.44 -1.9%
> > > sys time (sec) 346.21 344.48 -0.5%
> > >
> > ==============================================================
> > ===========
> > >
> > >
> > ==============================================================
> > ===========
> > > ZSTD mm-unstable-1-23-2026 v14
> > >
> > ==============================================================
> > ===========
> > > zswap compressor zstd zstd v14 ZSTD
> > > Improvement
> > >
> > ==============================================================
> > ===========
> > > usemem30, 64K folios:
> > >
> > > Total throughput (KB/s) 6,032,326 6,047,448 0.3%
> > > Average throughput (KB/s) 201,077 201,581 0.3%
> > > elapsed time (sec) 97.52 95.33 -2.2%
> > > sys time (sec) 2,415.40 2,328.38 -4%
> > >
> > > usemem30, PMD folios:
> > >
> > > Total throughput (KB/s) 6,570,404 6,623,962 0.8%
> > > Average throughput (KB/s) 219,013 220,798 0.8%
> > > elapsed time (sec) 89.17 88.25 -1%
> > > sys time (sec) 2,126.69 2,043.08 -4%
> > >
> > > kernel_compilation, 64K folios:
> > >
> > > elapsed time (sec) 100.89 99.98 -0.9%
> > > sys time (sec) 417.49 414.62 -0.7%
> > >
> > > kernel_compilation, PMD folios:
> > >
> > > elapsed time (sec) 98.26 97.38 -0.9%
> > > sys time (sec) 487.14 473.16 -2.9%
> > >
> > ==============================================================
> > ===========
> >
> > The rest of the patch changelog (architectural and future
> > considerations) can stay in the cover letter. Let's not duplicate
> > information :)
> >
> > Keep the patch changelog limited to only the changes in the patch
> > itself (unless we need some clarifications imminently relevant).
>
> Hi Nhat,
>
> Thanks for this comment. Yosry had also pointed this out in [1]. I have
> been including the architectural and future considerations in this change log
> since Andrew had asked me to do so. I hope this is Ok?
Ah hmmmmm. For some reasons I was under the assumption that usually
Andrew would concatenate the patch cover letter and the patch
changelog before merging. Oh well.
If Andrew prefers including that here then I'm fine with it.
>
> [1]: https://patchwork.kernel.org/comment/26706240/
>
> >
> > I'll review the remainder of the patch later :)
>
> Sure.
>
> Thanks,
> Kanchana
next prev parent reply other threads:[~2026-02-01 0:49 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-25 3:35 [PATCH v14 00/26] zswap compression batching with optimized iaa_crypto driver Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 01/26] crypto: iaa - Reorganize the iaa_crypto driver code Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 02/26] crypto: iaa - Replace sprintf with sysfs_emit in sysfs show functions Kanchana P Sridhar
2026-02-06 10:47 ` Herbert Xu
2026-01-25 3:35 ` [PATCH v14 03/26] crypto: iaa - New architecture for IAA device WQ [de]comp usage & core mapping Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 04/26] crypto: iaa - Simplify, consistency of function parameters, minor stats bug fix Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 05/26] crypto: iaa - Descriptor allocation timeouts with mitigations Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 06/26] crypto: iaa - iaa_wq uses percpu_refs for get/put reference counting Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 07/26] crypto: iaa - Simplify the code flow in iaa_compress() and iaa_decompress() Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 08/26] crypto: iaa - Refactor hardware descriptor setup into separate procedures Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 09/26] crypto: iaa - Simplified, efficient job submissions for non-irq mode Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 10/26] crypto: iaa - Deprecate exporting add/remove IAA compression modes Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 11/26] crypto: iaa - Expect a single scatterlist for a [de]compress request's src/dst Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 12/26] crypto: iaa - Rearchitect iaa_crypto to have clean interfaces with crypto_acomp Kanchana P Sridhar
2026-02-06 10:49 ` Herbert Xu
2026-01-25 3:35 ` [PATCH v14 13/26] crypto: acomp - Define a unit_size in struct acomp_req to enable batching Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 14/26] crypto: acomp - Add bit to indicate segmentation support Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 15/26] crypto: acomp - Add trivial segmentation wrapper Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 16/26] crypto: iaa - IAA Batching for parallel compressions/decompressions Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 17/26] crypto: iaa - Submit the two largest source buffers first in batch decompress Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 18/26] crypto: acomp, iaa - crypto_acomp integration of IAA Batching Kanchana P Sridhar
2026-02-05 4:14 ` Herbert Xu
2026-01-25 3:35 ` [PATCH v14 19/26] crypto: iaa - Enable async mode and make it the default Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 20/26] crypto: iaa - Disable iaa_verify_compress by default Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 21/26] crypto: iaa - Add deflate-iaa-dynamic compression mode Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 22/26] crypto: acomp - Add crypto_acomp_batch_size() to get an algorithm's batch-size Kanchana P Sridhar
2026-01-25 3:35 ` [PATCH v14 23/26] mm: zswap: Tie per-CPU acomp_ctx lifetime to the pool Kanchana P Sridhar
2026-02-04 16:29 ` Yosry Ahmed
2026-01-25 3:35 ` [PATCH v14 24/26] mm: zswap: Consistently use IS_ERR_OR_NULL() to check acomp_ctx resources Kanchana P Sridhar
2026-01-30 23:53 ` Nhat Pham
2026-01-31 1:15 ` Sridhar, Kanchana P
2026-01-25 3:35 ` [PATCH v14 25/26] mm: zswap: Store large folios in batches Kanchana P Sridhar
2026-01-31 0:33 ` Nhat Pham
2026-01-31 20:22 ` Sridhar, Kanchana P
2026-02-04 16:57 ` Yosry Ahmed
2026-01-25 3:35 ` [PATCH v14 26/26] mm: zswap: Batched zswap_compress() for compress batching of large folios Kanchana P Sridhar
2026-01-31 1:12 ` Nhat Pham
2026-01-31 20:31 ` Sridhar, Kanchana P
2026-02-01 0:48 ` Nhat Pham [this message]
2026-02-01 2:53 ` Sridhar, Kanchana P
2026-02-04 0:30 ` Nhat Pham
2026-02-04 18:10 ` Yosry Ahmed
2026-02-04 18:17 ` Yosry Ahmed
2026-02-04 18:17 ` Yosry Ahmed
2026-02-04 18:21 ` [PATCH v14 00/26] zswap compression batching with optimized iaa_crypto driver Yosry Ahmed
2026-02-04 18:39 ` Andrew Morton
2026-02-04 18:49 ` Yosry Ahmed
2026-02-05 4:16 ` Herbert Xu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAKEwX=PqWQ_39BuApc_bT1WKQMJyNPDs+Gv0JAU5cTa1KNDj9g@mail.gmail.com' \
--to=nphamcs@gmail.com \
--cc=21cnbao@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=ardb@kernel.org \
--cc=chengming.zhou@linux.dev \
--cc=clabbe@baylibre.com \
--cc=davem@davemloft.net \
--cc=ebiggers@google.com \
--cc=giovanni.cabiddu@intel.com \
--cc=hannes@cmpxchg.org \
--cc=herbert@gondor.apana.org.au \
--cc=kanchana.p.sridhar@intel.com \
--cc=kasong@tencent.com \
--cc=kristen.c.accardi@intel.com \
--cc=linux-crypto@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ryan.roberts@arm.com \
--cc=senozhatsky@chromium.org \
--cc=sj@kernel.org \
--cc=surenb@google.com \
--cc=usamaarif642@gmail.com \
--cc=vinicius.gomes@intel.com \
--cc=wajdi.k.feghali@intel.com \
--cc=ying.huang@linux.alibaba.com \
--cc=yosry.ahmed@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox