RE: [PATCH v11 00/24] zswap compression batching with optimized iaa_crypto driver


From: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Nhat Pham <nphamcs@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
	"yosry.ahmed@linux.dev" <yosry.ahmed@linux.dev>,
	"chengming.zhou@linux.dev" <chengming.zhou@linux.dev>,
	"usamaarif642@gmail.com" <usamaarif642@gmail.com>,
	"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
	"21cnbao@gmail.com" <21cnbao@gmail.com>,
	"ying.huang@linux.alibaba.com" <ying.huang@linux.alibaba.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"senozhatsky@chromium.org" <senozhatsky@chromium.org>,
	"linux-crypto@vger.kernel.org" <linux-crypto@vger.kernel.org>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"clabbe@baylibre.com" <clabbe@baylibre.com>,
	"ardb@kernel.org" <ardb@kernel.org>,
	"ebiggers@google.com" <ebiggers@google.com>,
	"surenb@google.com" <surenb@google.com>,
	"Accardi, Kristen C" <kristen.c.accardi@intel.com>,
	"Gomes, Vinicius" <vinicius.gomes@intel.com>,
	"Feghali, Wajdi K" <wajdi.k.feghali@intel.com>,
	"Gopal, Vinodh" <vinodh.gopal@intel.com>,
	"Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
Subject: RE: [PATCH v11 00/24] zswap compression batching with optimized iaa_crypto driver
Date: Tue, 26 Aug 2025 04:09:45 +0000	[thread overview]
Message-ID: <PH7PR11MB8121473792ACC3AD50D5F129C939A@PH7PR11MB8121.namprd11.prod.outlook.com> (raw)
In-Reply-To: <aK0KNAmQh_JVgnML@gondor.apana.org.au>


> -----Original Message-----
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Sent: Monday, August 25, 2025 6:13 PM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
> Cc: Nhat Pham <nphamcs@gmail.com>; linux-kernel@vger.kernel.org; linux-
> mm@kvack.org; hannes@cmpxchg.org; yosry.ahmed@linux.dev;
> chengming.zhou@linux.dev; usamaarif642@gmail.com;
> ryan.roberts@arm.com; 21cnbao@gmail.com;
> ying.huang@linux.alibaba.com; akpm@linux-foundation.org;
> senozhatsky@chromium.org; linux-crypto@vger.kernel.org;
> davem@davemloft.net; clabbe@baylibre.com; ardb@kernel.org;
> ebiggers@google.com; surenb@google.com; Accardi, Kristen C
> <kristen.c.accardi@intel.com>; Gomes, Vinicius <vinicius.gomes@intel.com>;
> Feghali, Wajdi K <wajdi.k.feghali@intel.com>; Gopal, Vinodh
> <vinodh.gopal@intel.com>
> Subject: Re: [PATCH v11 00/24] zswap compression batching with optimized
> iaa_crypto driver
> 
> On Mon, Aug 25, 2025 at 06:12:19PM +0000, Sridhar, Kanchana P wrote:
> >
> > Thanks Herbert, for reviewing the approach. IIUC, we should follow
> > these constraints:
> >
> > 1) The folio should be submitted as the source.
> >
> > 2) For the destination, construct an SG list for them and pass that in.
> >     The rule should be that the SG list must contain a sufficient number
> >     of pages for the compression output based on the given unit size
> >     (PAGE_SIZE for zswap).
> >
> > For PMD folios, there would be 512 compression outputs. In this case,
> > would we need to pass in an SG list that can contain 512 compression
> > outputs after calling the acompress API once?
> 
> Eventually yes :)
> 
> But for now we're just replicating your current patch-set, so
> the folio should come with an offset and a length restriction,
> and correspondingly the destination SG list should contain the
> same number of pages as there are in your current patch-set.

Thanks Herbert. Just want to make sure I understand this. Are you
referring to replacing sg_set_page() for the input with sg_set_folio()?
We have to pass in a scatterlist for the acomp_req->src.

This is how the converged zswap_compress() code would look for
batch compression of "nr_pages" in "folio", starting at index "start".
The input SG list will contain "nr_comps" pages: nr_comps is
1 for software and 8 for IAA.

The destination SG list will contain an equivalent number of
buffers (each is PAGE_SIZE * 2).

Based on your suggestions, I was able to come up with a unified
implementation for software and hardware compressors: the SG list
for the input is a key aspect of this (lines 24-25 from the start of the
procedure):

static bool zswap_compress(struct folio *folio, long start, unsigned int nr_pages,
                           struct zswap_entry *entries[], struct zswap_pool *pool,
                           int node_id)
{
        unsigned int nr_comps = min(nr_pages, pool->compr_batch_size);
        unsigned int dlens[ZSWAP_MAX_BATCH_SIZE];
        struct crypto_acomp_ctx *acomp_ctx;
        struct zpool *zpool = pool->zpool;
        struct scatterlist *sg;
        unsigned int i, j, k;
        gfp_t gfp;
        int err;

        gfp = GFP_NOWAIT | __GFP_NORETRY | __GFP_HIGHMEM | __GFP_MOVABLE;

        acomp_ctx = raw_cpu_ptr(pool->acomp_ctx);

        mutex_lock(&acomp_ctx->mutex);

        prefetchw(acomp_ctx->sg_inputs->sgl);
        prefetchw(acomp_ctx->sg_outputs->sgl);

        /*
         * Note:
         * [i] refers to the incoming batch space and is used to
         *     index into the folio pages and @entries.
         *
         * [k] refers to the @acomp_ctx space, as determined by
         *     @pool->compr_batch_size, and is used to index into
         *     @acomp_ctx->buffers and @dlens.
         */
        for (i = 0; i < nr_pages; i += nr_comps) {
                for_each_sg(acomp_ctx->sg_inputs->sgl, sg, nr_comps, k)
                        sg_set_folio(sg, folio, PAGE_SIZE, (start + k + i) * PAGE_SIZE);

                /*
                 * We need PAGE_SIZE * 2 here since there may be an over-compression
                 * case, and hardware accelerators may not check the dst buffer size,
                 * so give the dst buffer enough length to avoid buffer overflow.
                 */
                for_each_sg(acomp_ctx->sg_outputs->sgl, sg, nr_comps, k)
                        sg_set_buf(sg, acomp_ctx->buffers[k], PAGE_SIZE * 2);

                acomp_request_set_params(acomp_ctx->req,
                                         acomp_ctx->sg_inputs->sgl,
                                         acomp_ctx->sg_outputs->sgl,
                                         nr_comps * PAGE_SIZE,
                                         nr_comps * PAGE_SIZE);

                err = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req),
                                      &acomp_ctx->wait);

                if (unlikely(err)) {
                        if (nr_comps == 1)
                                dlens[0] = err;
                        goto compress_error;
                }

                if (nr_comps == 1)
                        dlens[0] = acomp_ctx->req->dlen;
                else
                        for_each_sg(acomp_ctx->sg_outputs->sgl, sg, nr_comps, k)
                                dlens[k] = sg->length;

[ store each compressed page in zpool]
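
For readers following the indexing in the loop above, here is a small
userspace sketch (not part of the patch) of how [i] and [k] map to byte
offsets in the folio and how many acomp requests one call issues. The names
model_batch_offsets(), model_nr_requests() and MODEL_PAGE_SIZE are
hypothetical, and a 4 KiB page size is assumed:

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/*
 * Mirror the sg_set_folio() offsets for one batch: entry k of the input
 * SG list covers the page at folio offset (start + k + i) * PAGE_SIZE.
 * Returns the number of offsets written.
 */
static unsigned int model_batch_offsets(long start, unsigned int i,
					unsigned int nr_comps,
					size_t *offsets)
{
	unsigned int k;

	for (k = 0; k < nr_comps; k++)
		offsets[k] = (size_t)(start + k + i) * MODEL_PAGE_SIZE;

	return nr_comps;
}

/*
 * Count the iterations of the outer loop, i.e. the number of compress
 * requests issued for nr_pages with a given compressor batch size.
 */
static unsigned int model_nr_requests(unsigned int nr_pages,
				      unsigned int batch_size)
{
	unsigned int nr_comps = nr_pages < batch_size ? nr_pages : batch_size;
	unsigned int i, n = 0;

	assert(nr_comps > 0);

	for (i = 0; i < nr_pages; i += nr_comps)
		n++;

	return n;
}
```

Under these assumptions, a 64K folio (16 pages) takes 2 requests with a
batch size of 8 and 16 requests with a batch size of 1; the second batch of
8 covers folio offsets 32768 through 61440.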

I quickly tested this with usemem running 30 processes. Switching from sg_set_page()
to sg_set_folio() does cause a 15% throughput regression for IAA and 2%
regression for zstd:

usemem30/64K folios/deflate-iaa/Avg throughput (KB/s):
sg_set_page(): 357,141
sg_set_folio(): 304,696

usemem30/64K folios/zstd/Avg throughput (KB/s):
sg_set_page(): 230,760
sg_set_folio(): 226,246
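
The quoted percentages follow directly from the throughput numbers above. A
minimal sketch of the arithmetic, where regression_pct() is a hypothetical
helper used only for illustration:

```c
#include <assert.h>

/*
 * Relative throughput loss of @candidate versus @baseline, in percent.
 * Positive values mean the candidate is slower.
 */
static double regression_pct(double baseline, double candidate)
{
	assert(baseline > 0.0);

	return (baseline - candidate) / baseline * 100.0;
}
```

With the figures above, regression_pct(357141, 304696) is about 14.7 for
deflate-iaa and regression_pct(230760, 226246) is about 1.96 for zstd,
consistent with the rounded 15% and 2%.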

In my experience, zswap_compress() is highly performance-critical code,
and even small compute additions can significantly impact workload
performance and sys time.

Given the code simplification and unification that your SG list suggestions
have enabled, may I understand better why sg_set_folio() is preferred?
Again, my apologies if I have misunderstood your suggestion, but I think
it is worth getting this clarified so we are all in agreement.

Thanks and best regards,
Kanchana


> 
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
