From: Yosry Ahmed
Date: Tue, 22 Oct 2024 17:53:11 -0700
Subject: Re: [RFC PATCH v1 11/13] mm: swap: Add IAA batch compression API swap_crypto_acomp_compress_batch().
In-Reply-To: <20241018064101.336232-12-kanchana.p.sridhar@intel.com>
To: Kanchana P Sridhar
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org,
 nphamcs@gmail.com, chengming.zhou@linux.dev, usamaarif642@gmail.com,
 ryan.roberts@arm.com, ying.huang@intel.com, 21cnbao@gmail.com,
 akpm@linux-foundation.org, linux-crypto@vger.kernel.org,
 herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com,
 ardb@kernel.org, ebiggers@google.com, surenb@google.com,
 kristen.c.accardi@intel.com, zanussi@kernel.org, viro@zeniv.linux.org.uk,
 brauner@kernel.org, jack@suse.cz, mcgrof@kernel.org, kees@kernel.org,
 joel.granados@kernel.org, bfoster@redhat.com, willy@infradead.org,
 linux-fsdevel@vger.kernel.org, wajdi.k.feghali@intel.com,
 vinodh.gopal@intel.com

On Thu, Oct 17, 2024 at 11:41 PM Kanchana P Sridhar wrote:
>
> Added a new API swap_crypto_acomp_compress_batch() that does batch
> compression. A system that has Intel IAA can avail of this API to submit a
> batch of compress jobs for parallel compression in the hardware, to improve
> performance. On a system without IAA, this API will process each compress
> job sequentially.
>
> The purpose of this API is to be invocable from any swap module that needs
> to compress large folios, or a batch of pages in the general case. For
> instance, zswap would batch compress up to SWAP_CRYPTO_SUB_BATCH_SIZE
> (i.e. 8 if the system has IAA) pages in the large folio in parallel to
> improve zswap_store() performance.
>
> Towards this eventual goal:
>
> 1) The definition of "struct crypto_acomp_ctx" is moved to mm/swap.h
>    so that mm modules like swap_state.c and zswap.c can reference it.
> 2) The swap_crypto_acomp_compress_batch() interface is implemented in
>    swap_state.c.
>
> It would be preferable for "struct crypto_acomp_ctx" to be defined in,
> and for swap_crypto_acomp_compress_batch() to be exported via
> include/linux/swap.h so that modules outside mm (for e.g. zram) can
> potentially use the API for batch compressions with IAA. I would
> appreciate RFC comments on this.

Same question as the last patch, why does this need to be in the swap
code? Why can't zswap just submit a single request to compress a large
folio or a range of contiguous subpages at once?

>
> Signed-off-by: Kanchana P Sridhar
> ---
>  mm/swap.h       |  45 +++++++++++++++++++
>  mm/swap_state.c | 115 ++++++++++++++++++++++++++++++++++++++++++++++++
>  mm/zswap.c      |   9 ----
>  3 files changed, 160 insertions(+), 9 deletions(-)
>
> diff --git a/mm/swap.h b/mm/swap.h
> index 566616c971d4..4dcb67e2cc33 100644
> --- a/mm/swap.h
> +++ b/mm/swap.h
> @@ -7,6 +7,7 @@ struct mempolicy;
>  #ifdef CONFIG_SWAP
>  #include /* for swp_offset */
>  #include /* for bio_end_io_t */
> +#include
>
>  /*
>   * For IAA compression batching:
> @@ -19,6 +20,39 @@ struct mempolicy;
>  #define SWAP_CRYPTO_SUB_BATCH_SIZE 1UL
>  #endif
>
> +/* linux/mm/swap_state.c, zswap.c */
> +struct crypto_acomp_ctx {
> +	struct crypto_acomp *acomp;
> +	struct acomp_req *req[SWAP_CRYPTO_SUB_BATCH_SIZE];
> +	u8 *buffer[SWAP_CRYPTO_SUB_BATCH_SIZE];
> +	struct crypto_wait wait;
> +	struct mutex mutex;
> +	bool is_sleepable;
> +};
> +
> +/**
> + * This API provides IAA compress batching functionality for use by swap
> + * modules.
> + * The acomp_ctx mutex should be locked/unlocked before/after calling this
> + * procedure.
> + *
> + * @pages: Pages to be compressed.
> + * @dsts: Pre-allocated destination buffers to store results of IAA compression.
> + * @dlens: Will contain the compressed lengths.
> + * @errors: Will contain a 0 if the page was successfully compressed, or a
> + *          non-0 error value to be processed by the calling function.
> + * @nr_pages: The number of pages, up to SWAP_CRYPTO_SUB_BATCH_SIZE,
> + *            to be compressed.
> + * @acomp_ctx: The acomp context for iaa_crypto/other compressor.
> + */
> +void swap_crypto_acomp_compress_batch(
> +	struct page *pages[],
> +	u8 *dsts[],
> +	unsigned int dlens[],
> +	int errors[],
> +	int nr_pages,
> +	struct crypto_acomp_ctx *acomp_ctx);
> +
>  /* linux/mm/page_io.c */
>  int sio_pool_init(void);
>  struct swap_iocb;
> @@ -119,6 +153,17 @@ static inline int swap_zeromap_batch(swp_entry_t entry, int max_nr,
>
>  #else /* CONFIG_SWAP */
>  struct swap_iocb;
> +struct crypto_acomp_ctx {};
> +static inline void swap_crypto_acomp_compress_batch(
> +	struct page *pages[],
> +	u8 *dsts[],
> +	unsigned int dlens[],
> +	int errors[],
> +	int nr_pages,
> +	struct crypto_acomp_ctx *acomp_ctx)
> +{
> +}
> +
>  static inline void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
>  {
>  }
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index 4669f29cf555..117c3caa5679 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -23,6 +23,8 @@
>  #include
>  #include
>  #include
> +#include
> +#include
>  #include "internal.h"
>  #include "swap.h"
>
> @@ -742,6 +744,119 @@ void exit_swap_address_space(unsigned int type)
>  	swapper_spaces[type] = NULL;
>  }
>
> +#ifdef CONFIG_SWAP
> +
> +/**
> + * This API provides IAA compress batching functionality for use by swap
> + * modules.
> + * The acomp_ctx mutex should be locked/unlocked before/after calling this
> + * procedure.
> + *
> + * @pages: Pages to be compressed.
> + * @dsts: Pre-allocated destination buffers to store results of IAA compression.
> + * @dlens: Will contain the compressed lengths.
> + * @errors: Will contain a 0 if the page was successfully compressed, or a
> + *          non-0 error value to be processed by the calling function.
> + * @nr_pages: The number of pages, up to SWAP_CRYPTO_SUB_BATCH_SIZE,
> + *            to be compressed.
> + * @acomp_ctx: The acomp context for iaa_crypto/other compressor.
> + */
> +void swap_crypto_acomp_compress_batch(
> +	struct page *pages[],
> +	u8 *dsts[],
> +	unsigned int dlens[],
> +	int errors[],
> +	int nr_pages,
> +	struct crypto_acomp_ctx *acomp_ctx)
> +{
> +	struct scatterlist inputs[SWAP_CRYPTO_SUB_BATCH_SIZE];
> +	struct scatterlist outputs[SWAP_CRYPTO_SUB_BATCH_SIZE];
> +	bool compressions_done = false;
> +	int i, j;
> +
> +	BUG_ON(nr_pages > SWAP_CRYPTO_SUB_BATCH_SIZE);
> +
> +	/*
> +	 * Prepare and submit acomp_reqs to IAA.
> +	 * IAA will process these compress jobs in parallel in async mode.
> +	 * If the compressor does not support a poll() method, or if IAA is
> +	 * used in sync mode, the jobs will be processed sequentially using
> +	 * acomp_ctx->req[0] and acomp_ctx->wait.
> +	 */
> +	for (i = 0; i < nr_pages; ++i) {
> +		j = acomp_ctx->acomp->poll ? i : 0;
> +		sg_init_table(&inputs[i], 1);
> +		sg_set_page(&inputs[i], pages[i], PAGE_SIZE, 0);
> +
> +		/*
> +		 * Each acomp_ctx->buffer[] is of size (PAGE_SIZE * 2).
> +		 * Reflect same in sg_list.
> +		 */
> +		sg_init_one(&outputs[i], dsts[i], PAGE_SIZE * 2);
> +		acomp_request_set_params(acomp_ctx->req[j], &inputs[i],
> +					 &outputs[i], PAGE_SIZE, dlens[i]);
> +
> +		/*
> +		 * If the crypto_acomp provides an asynchronous poll()
> +		 * interface, submit the request to the driver now, and poll for
> +		 * a completion status later, after all descriptors have been
> +		 * submitted. If the crypto_acomp does not provide a poll()
> +		 * interface, submit the request and wait for it to complete,
> +		 * i.e., synchronously, before moving on to the next request.
> +		 */
> +		if (acomp_ctx->acomp->poll) {
> +			errors[i] = crypto_acomp_compress(acomp_ctx->req[j]);
> +
> +			if (errors[i] != -EINPROGRESS)
> +				errors[i] = -EINVAL;
> +			else
> +				errors[i] = -EAGAIN;
> +		} else {
> +			errors[i] = crypto_wait_req(
> +					crypto_acomp_compress(acomp_ctx->req[j]),
> +					&acomp_ctx->wait);
> +			if (!errors[i])
> +				dlens[i] = acomp_ctx->req[j]->dlen;
> +		}
> +	}
> +
> +	/*
> +	 * If not doing async compressions, the batch has been processed at
> +	 * this point and we can return.
> +	 */
> +	if (!acomp_ctx->acomp->poll)
> +		return;
> +
> +	/*
> +	 * Poll for and process IAA compress job completions
> +	 * in out-of-order manner.
> +	 */
> +	while (!compressions_done) {
> +		compressions_done = true;
> +
> +		for (i = 0; i < nr_pages; ++i) {
> +			/*
> +			 * Skip, if the compression has already completed
> +			 * successfully or with an error.
> +			 */
> +			if (errors[i] != -EAGAIN)
> +				continue;
> +
> +			errors[i] = crypto_acomp_poll(acomp_ctx->req[i]);
> +
> +			if (errors[i]) {
> +				if (errors[i] == -EAGAIN)
> +					compressions_done = false;
> +			} else {
> +				dlens[i] = acomp_ctx->req[i]->dlen;
> +			}
> +		}
> +	}
> +}
> +EXPORT_SYMBOL_GPL(swap_crypto_acomp_compress_batch);
> +
> +#endif /* CONFIG_SWAP */
> +
>  static int swap_vma_ra_win(struct vm_fault *vmf, unsigned long *start,
>  			   unsigned long *end)
>  {
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 579869d1bdf6..cab3114321f9 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -150,15 +150,6 @@ bool zswap_never_enabled(void)
>  * data structures
>  **********************************/
>
> -struct crypto_acomp_ctx {
> -	struct crypto_acomp *acomp;
> -	struct acomp_req *req[SWAP_CRYPTO_SUB_BATCH_SIZE];
> -	u8 *buffer[SWAP_CRYPTO_SUB_BATCH_SIZE];
> -	struct crypto_wait wait;
> -	struct mutex mutex;
> -	bool is_sleepable;
> -};
> -
>  /*
>   * The lock ordering is zswap_tree.lock -> zswap_pool.lru_lock.
>   * The only case where lru_lock is not acquired while holding tree.lock is
> --
> 2.27.0
>
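
For anyone following the control flow of the quoted patch, the
"submit everything, then poll until nothing is pending" shape of its
asynchronous branch can be modelled in plain userspace C. This is only a
rough sketch under loud assumptions: mock_job, mock_poll(),
mock_compress_batch() and BATCH_MAX are hypothetical names invented for
illustration, nothing here calls the kernel crypto API, and the real
submission step (crypto_acomp_compress() returning -EINPROGRESS) is
reduced to simply marking each job as pending.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for SWAP_CRYPTO_SUB_BATCH_SIZE on an IAA system. */
#define BATCH_MAX 8

/* Hypothetical mock of one in-flight compress job; not a kernel type. */
struct mock_job {
	int polls_left;		/* polls that will still report "busy" */
	unsigned int out_len;	/* length reported once the job completes */
};

/* Mock poll: -EAGAIN while the job is busy, 0 once it has completed. */
static int mock_poll(struct mock_job *job)
{
	if (job->polls_left > 0) {
		job->polls_left--;
		return -EAGAIN;
	}
	return 0;
}

/*
 * Mark every job as pending (-EAGAIN), then sweep the batch repeatedly
 * until no job reports -EAGAIN, recording each output length as its job
 * completes. Completions may therefore be observed out of order.
 * Returns the number of sweeps, or -EINVAL for an oversized batch.
 */
static int mock_compress_batch(struct mock_job jobs[], unsigned int dlens[],
			       int errors[], int nr)
{
	bool done = false;
	int sweeps = 0;
	int i;

	assert(jobs && dlens && errors);
	if (nr > BATCH_MAX)
		return -EINVAL;

	for (i = 0; i < nr; i++)
		errors[i] = -EAGAIN;	/* "submitted, completion pending" */

	while (!done) {
		done = true;
		sweeps++;
		for (i = 0; i < nr; i++) {
			if (errors[i] != -EAGAIN)
				continue;	/* already finished */
			errors[i] = mock_poll(&jobs[i]);
			if (errors[i] == -EAGAIN)
				done = false;
			else if (!errors[i])
				dlens[i] = jobs[i].out_len;
		}
	}
	return sweeps;
}
```

In this model the caller owns the jobs, dlens and errors arrays and reads
errors[i] and dlens[i] only after the function returns, much as the
quoted kernel-doc describes for the real interface. The loop spins until
every job has left the -EAGAIN state, so the number of sweeps is set by
the slowest job in the batch.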