linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
To: Yosry Ahmed <yosryahmed@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
	"nphamcs@gmail.com" <nphamcs@gmail.com>,
	"chengming.zhou@linux.dev" <chengming.zhou@linux.dev>,
	"usamaarif642@gmail.com" <usamaarif642@gmail.com>,
	"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
	"21cnbao@gmail.com" <21cnbao@gmail.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linux-crypto@vger.kernel.org" <linux-crypto@vger.kernel.org>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"clabbe@baylibre.com" <clabbe@baylibre.com>,
	"ardb@kernel.org" <ardb@kernel.org>,
	"ebiggers@google.com" <ebiggers@google.com>,
	"surenb@google.com" <surenb@google.com>,
	"Accardi, Kristen C" <kristen.c.accardi@intel.com>,
	"Feghali, Wajdi K" <wajdi.k.feghali@intel.com>,
	"Gopal, Vinodh" <vinodh.gopal@intel.com>,
	"Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
Subject: RE: [PATCH v5 02/12] crypto: acomp - Define new interfaces for compress/decompress batching.
Date: Tue, 7 Jan 2025 01:36:47 +0000	[thread overview]
Message-ID: <SJ0PR11MB5678034533E3FAD7B16E2758C9112@SJ0PR11MB5678.namprd11.prod.outlook.com> (raw)
In-Reply-To: <CAJD7tkatpOaortT8Si5GfxprvgPR+bzxwTSOR0rsaRUstdqNMQ@mail.gmail.com>

Hi Yosry,

> -----Original Message-----
> From: Yosry Ahmed <yosryahmed@google.com>
> Sent: Monday, January 6, 2025 3:24 PM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>; linux-
> kernel@vger.kernel.org; linux-mm@kvack.org; hannes@cmpxchg.org;
> nphamcs@gmail.com; chengming.zhou@linux.dev;
> usamaarif642@gmail.com; ryan.roberts@arm.com; 21cnbao@gmail.com;
> akpm@linux-foundation.org; linux-crypto@vger.kernel.org;
> davem@davemloft.net; clabbe@baylibre.com; ardb@kernel.org;
> ebiggers@google.com; surenb@google.com; Accardi, Kristen C
> <kristen.c.accardi@intel.com>; Feghali, Wajdi K <wajdi.k.feghali@intel.com>;
> Gopal, Vinodh <vinodh.gopal@intel.com>
> Subject: Re: [PATCH v5 02/12] crypto: acomp - Define new interfaces for
> compress/decompress batching.
> 
> On Mon, Jan 6, 2025 at 9:37 AM Sridhar, Kanchana P
> <kanchana.p.sridhar@intel.com> wrote:
> >
> > Hi Herbert,
> >
> > > -----Original Message-----
> > > From: Herbert Xu <herbert@gondor.apana.org.au>
> > > Sent: Saturday, December 28, 2024 3:46 AM
> > > To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
> > > Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> > > hannes@cmpxchg.org; yosryahmed@google.com; nphamcs@gmail.com;
> > > chengming.zhou@linux.dev; usamaarif642@gmail.com;
> > > ryan.roberts@arm.com; 21cnbao@gmail.com; akpm@linux-
> foundation.org;
> > > linux-crypto@vger.kernel.org; davem@davemloft.net;
> clabbe@baylibre.com;
> > > ardb@kernel.org; ebiggers@google.com; surenb@google.com; Accardi,
> > > Kristen C <kristen.c.accardi@intel.com>; Feghali, Wajdi K
> > > <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com>
> > > Subject: Re: [PATCH v5 02/12] crypto: acomp - Define new interfaces for
> > > compress/decompress batching.
> > >
> > > On Fri, Dec 20, 2024 at 10:31:09PM -0800, Kanchana P Sridhar wrote:
> > > > This commit adds get_batch_size(), batch_compress() and
> > > batch_decompress()
> > > > interfaces to:
> > >
> > > First of all we don't need a batch compress/decompress interface
> > > because the whole point of request chaining is to supply the data
> > > in batches.
> > >
> > > I'm also against having a get_batch_size because the user should
> > > be supplying as much data as they're comfortable with.  In other
> > > words if the user is happy to give us 8 requests for iaa then it
> > > should be happy to give us 8 requests for every implementation.
> > >
> > > The request chaining interface should be such that processing
> > > 8 requests is always better than doing 1 request at a time as
> > > the cost is amortised.
> >
> > Thanks for your comments. Can you please elaborate on how
> > request chaining would enable cost amortization for software
> > compressors? With the current implementation, a module like
> > zswap would need to do the following to invoke request chaining
> > for software compressors (in addition to pushing the chaining
> > to the user layer for IAA, as per your suggestion on not needing a
> > batch compress/decompress interface):
> >
> > zswap_batch_compress():
> >    for (i = 0; i < nr_pages_in_batch; ++i) {
> >       /* set up the acomp_req "reqs[i]". */
> >       [ ... ]
> >       if (i)
> >         acomp_request_chain(reqs[i], reqs[0]);
> >       else
> >         acomp_reqchain_init(reqs[0], 0, crypto_req_done, crypto_wait);
> >    }
> >
> >    /* Process the request chain in series. */
> >    err = crypto_wait_req(acomp_do_req_chain(reqs[0],
> crypto_acomp_compress), crypto_wait);
> >
> > Internally, acomp_do_req_chain() would sequentially process the
> > request chain by:
> > 1) adding all requests to a list "state"
> > 2) call "crypto_acomp_compress()" for the next list element
> > 3) when this request completes, dequeue it from the list "state"
> > 4) repeat for all requests in "state"
> > 5) When the last request in "state" completes, call "reqs[0]-
> >base.complete()",
> >     which notifies crypto_wait.
> >
> > From what I can understand, the latency cost should be the same for
> > processing a request chain in series vs. processing each request as it is
> > done today in zswap, by calling:
> >
> >   comp_ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx-
> >reqs[0]), &acomp_ctx->wait);
> >
> > It is not clear to me if there is a cost amortization benefit for software
> > compressors. One of the requirements from Yosry was that there should
> > be no change for the software compressors, which is what I have
> > attempted to do in v5.
> >
> > Can you please help us understand if there is a room for optimizing
> > the implementation of the synchronous "acomp_do_req_chain()" API?
> > I would also like to get inputs from the zswap maintainers on using
> > request chaining for a batching implementation for software compressors.
> 
> Is there a functional change in doing so, or just using different
> interfaces to accomplish the same thing we do today?

The code paths for software compressors are considerably different between
these two scenarios:

1) Given a batch of 8 pages: for each page, call zswap_compress() that does this:

    	comp_ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->reqs[0]), &acomp_ctx->wait);

2) Given a batch of 8 pages:
     a) Create a request chain of 8 acomp_reqs, starting with reqs[0], as
         described earlier.
     b) Process the request chain by calling:

              err = crypto_wait_req(acomp_do_req_chain(reqs[0], crypto_acomp_compress), &acomp_ctx->wait);
	/* Get each req's error status. */
	for (i = 0; i < nr_pages; ++i) {
		errors[i] = acomp_request_err(reqs[i]);
		if (errors[i]) {
			pr_debug("Request chaining req %d compress error %d\n", i, errors[i]);
		} else {
			dlens[i] = reqs[i]->dlen;
		}
	}

What I mean by considerably different code paths is that request chaining
internally overwrites the req's base.complete and base.data (after saving the
original values) to implement the algorithm described earlier. Basically, the
chain is processed in series by getting the next req in the chain, setting it's
completion function to "acomp_reqchain_done()", which gets called when
the "op" (crypto_acomp_compress()) is completed for that req.
acomp_reqchain_done() will cause the next req to be processed in the
same manner. If this next req happens to be the last req to be processed,
it will notify the original completion function of reqs[0], with the crypto_wait
that zswap sets up in zswap_cpu_comp_prepare():

	acomp_request_set_callback(acomp_ctx->reqs[0], CRYPTO_TFM_REQ_MAY_BACKLOG,
				   crypto_req_done, &acomp_ctx->wait);

Patch [1] in v5 of this series has the full implementation of acomp_do_req_chain()
in case you want to understand this in more detail.

The "functional change" wrt request chaining is limited to the above.

[1]: https://patchwork.kernel.org/project/linux-mm/patch/20241221063119.29140-2-kanchana.p.sridhar@intel.com/

Thanks,
Kanchana


  reply	other threads:[~2025-01-07  1:38 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-21  6:31 [PATCH v5 00/12] zswap IAA compress batching Kanchana P Sridhar
2024-12-21  6:31 ` [PATCH v5 01/12] crypto: acomp - Add synchronous/asynchronous acomp request chaining Kanchana P Sridhar
2024-12-21  6:31 ` [PATCH v5 02/12] crypto: acomp - Define new interfaces for compress/decompress batching Kanchana P Sridhar
2024-12-28 11:46   ` Herbert Xu
2025-01-06 17:37     ` Sridhar, Kanchana P
2025-01-06 23:24       ` Yosry Ahmed
2025-01-07  1:36         ` Sridhar, Kanchana P [this message]
2025-01-07  1:46           ` Yosry Ahmed
2025-01-07  2:06             ` Herbert Xu
2025-01-07  3:10               ` Yosry Ahmed
2025-01-08  1:38                 ` Herbert Xu
2025-01-08  1:43                   ` Yosry Ahmed
2025-02-16  5:17                 ` Herbert Xu
2025-02-20 17:32                   ` Yosry Ahmed
2025-02-22  6:26                     ` Barry Song
2025-02-22  6:34                       ` Herbert Xu
2025-02-22  6:41                         ` Barry Song
2025-02-22  6:52                           ` Herbert Xu
2025-02-22  7:13                             ` Barry Song
2025-02-22  7:22                               ` Herbert Xu
2025-02-22  8:21                                 ` Barry Song
2025-02-24 21:49                               ` Yosry Ahmed
2025-02-27  3:05                                 ` Barry Song
2025-02-22 12:31                       ` Sergey Senozhatsky
2025-02-22 14:27                         ` Sergey Senozhatsky
2025-02-23  0:14                           ` Herbert Xu
2025-02-23  2:09                             ` Sergey Senozhatsky
2025-02-23  2:52                               ` Herbert Xu
2025-02-23  3:12                                 ` Sergey Senozhatsky
2025-02-23  3:38                                   ` Herbert Xu
2025-02-23  4:02                                     ` Sergey Senozhatsky
2025-02-23  6:04                                       ` Herbert Xu
2025-02-22 16:24                         ` Barry Song
2025-02-23  0:24                         ` Herbert Xu
2025-02-23  1:57                           ` Sergey Senozhatsky
2025-01-07  2:04       ` Herbert Xu
2024-12-21  6:31 ` [PATCH v5 03/12] crypto: iaa - Add an acomp_req flag CRYPTO_ACOMP_REQ_POLL to enable async mode Kanchana P Sridhar
2024-12-21  6:31 ` [PATCH v5 04/12] crypto: iaa - Implement batch_compress(), batch_decompress() API in iaa_crypto Kanchana P Sridhar
2024-12-22  4:07   ` kernel test robot
2024-12-21  6:31 ` [PATCH v5 05/12] crypto: iaa - Make async mode the default Kanchana P Sridhar
2024-12-21  6:31 ` [PATCH v5 06/12] crypto: iaa - Disable iaa_verify_compress by default Kanchana P Sridhar
2024-12-21  6:31 ` [PATCH v5 07/12] crypto: iaa - Re-organize the iaa_crypto driver code Kanchana P Sridhar
2024-12-21  6:31 ` [PATCH v5 08/12] crypto: iaa - Map IAA devices/wqs to cores based on packages instead of NUMA Kanchana P Sridhar
2024-12-21  6:31 ` [PATCH v5 09/12] crypto: iaa - Distribute compress jobs from all cores to all IAAs on a package Kanchana P Sridhar
2024-12-21  6:31 ` [PATCH v5 10/12] mm: zswap: Allocate pool batching resources if the crypto_alg supports batching Kanchana P Sridhar
2025-01-07  0:58   ` Yosry Ahmed
2025-01-08  3:26     ` Sridhar, Kanchana P
2025-01-08  4:16       ` Yosry Ahmed
2024-12-21  6:31 ` [PATCH v5 11/12] mm: zswap: Restructure & simplify zswap_store() to make it amenable for batching Kanchana P Sridhar
2025-01-07  1:16   ` Yosry Ahmed
2025-01-08  3:57     ` Sridhar, Kanchana P
2025-01-08  4:22       ` Yosry Ahmed
2024-12-21  6:31 ` [PATCH v5 12/12] mm: zswap: Compress batching with Intel IAA in zswap_store() of large folios Kanchana P Sridhar
2025-01-07  1:19   ` Yosry Ahmed
2025-01-07  1:44 ` [PATCH v5 00/12] zswap IAA compress batching Yosry Ahmed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=SJ0PR11MB5678034533E3FAD7B16E2758C9112@SJ0PR11MB5678.namprd11.prod.outlook.com \
    --to=kanchana.p.sridhar@intel.com \
    --cc=21cnbao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=ardb@kernel.org \
    --cc=chengming.zhou@linux.dev \
    --cc=clabbe@baylibre.com \
    --cc=davem@davemloft.net \
    --cc=ebiggers@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=kristen.c.accardi@intel.com \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nphamcs@gmail.com \
    --cc=ryan.roberts@arm.com \
    --cc=surenb@google.com \
    --cc=usamaarif642@gmail.com \
    --cc=vinodh.gopal@intel.com \
    --cc=wajdi.k.feghali@intel.com \
    --cc=yosryahmed@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox