From: Yosry Ahmed
Date: Mon, 6 Jan 2025 17:46:01 -0800
Subject: Re: [PATCH v5 02/12] crypto: acomp - Define new interfaces for compress/decompress batching.
References: <20241221063119.29140-1-kanchana.p.sridhar@intel.com>
 <20241221063119.29140-3-kanchana.p.sridhar@intel.com>
To: "Sridhar, Kanchana P"
Cc: Herbert Xu, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 hannes@cmpxchg.org, nphamcs@gmail.com, chengming.zhou@linux.dev,
 usamaarif642@gmail.com, ryan.roberts@arm.com, 21cnbao@gmail.com,
 akpm@linux-foundation.org, linux-crypto@vger.kernel.org,
 davem@davemloft.net, clabbe@baylibre.com, ardb@kernel.org,
 ebiggers@google.com, surenb@google.com, "Accardi, Kristen C",
 "Feghali, Wajdi K", "Gopal, Vinodh"

On Mon, Jan 6, 2025 at 5:38 PM Sridhar, Kanchana P wrote:
>
> Hi Yosry,
>
> > -----Original Message-----
> > From: Yosry Ahmed
> > Sent: Monday, January 6, 2025 3:24 PM
> > To: Sridhar, Kanchana P
> > Cc: Herbert Xu; linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> > hannes@cmpxchg.org; nphamcs@gmail.com; chengming.zhou@linux.dev;
> > usamaarif642@gmail.com; ryan.roberts@arm.com; 21cnbao@gmail.com;
> > akpm@linux-foundation.org; linux-crypto@vger.kernel.org;
> > davem@davemloft.net; clabbe@baylibre.com; ardb@kernel.org;
> > ebiggers@google.com; surenb@google.com; Accardi, Kristen C;
> > Feghali, Wajdi K; Gopal, Vinodh
> > Subject: Re: [PATCH v5 02/12] crypto: acomp - Define new interfaces for
> > compress/decompress batching.
> >
> > On Mon, Jan 6, 2025 at 9:37 AM Sridhar, Kanchana P
> > wrote:
> > >
> > > Hi Herbert,
> > >
> > > > -----Original Message-----
> > > > From: Herbert Xu
> > > > Sent: Saturday, December 28, 2024 3:46 AM
> > > > To: Sridhar, Kanchana P
> > > > Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> > > > hannes@cmpxchg.org; yosryahmed@google.com; nphamcs@gmail.com;
> > > > chengming.zhou@linux.dev; usamaarif642@gmail.com;
> > > > ryan.roberts@arm.com; 21cnbao@gmail.com; akpm@linux-foundation.org;
> > > > linux-crypto@vger.kernel.org; davem@davemloft.net;
> > > > clabbe@baylibre.com; ardb@kernel.org; ebiggers@google.com;
> > > > surenb@google.com; Accardi, Kristen C; Feghali, Wajdi K;
> > > > Gopal, Vinodh
> > > > Subject: Re: [PATCH v5 02/12] crypto: acomp - Define new interfaces for
> > > > compress/decompress batching.
> > > >
> > > > On Fri, Dec 20, 2024 at 10:31:09PM -0800, Kanchana P Sridhar wrote:
> > > > > This commit adds get_batch_size(), batch_compress() and
> > > > > batch_decompress() interfaces to:
> > > >
> > > > First of all we don't need a batch compress/decompress interface
> > > > because the whole point of request chaining is to supply the data
> > > > in batches.
> > > >
> > > > I'm also against having a get_batch_size because the user should
> > > > be supplying as much data as they're comfortable with. In other
> > > > words if the user is happy to give us 8 requests for iaa then it
> > > > should be happy to give us 8 requests for every implementation.
> > > >
> > > > The request chaining interface should be such that processing
> > > > 8 requests is always better than doing 1 request at a time as
> > > > the cost is amortised.
> > >
> > > Thanks for your comments. Can you please elaborate on how
> > > request chaining would enable cost amortization for software
> > > compressors?
> > > With the current implementation, a module like
> > > zswap would need to do the following to invoke request chaining
> > > for software compressors (in addition to pushing the chaining
> > > to the user layer for IAA, as per your suggestion on not needing a
> > > batch compress/decompress interface):
> > >
> > > zswap_batch_compress():
> > >    for (i = 0; i < nr_pages_in_batch; ++i) {
> > >       /* set up the acomp_req "reqs[i]". */
> > >       [ ... ]
> > >       if (i)
> > >         acomp_request_chain(reqs[i], reqs[0]);
> > >       else
> > >         acomp_reqchain_init(reqs[0], 0, crypto_req_done, crypto_wait);
> > >    }
> > >
> > >    /* Process the request chain in series. */
> > >    err = crypto_wait_req(acomp_do_req_chain(reqs[0], crypto_acomp_compress), crypto_wait);
> > >
> > > Internally, acomp_do_req_chain() would sequentially process the
> > > request chain by:
> > > 1) adding all requests to a list "state"
> > > 2) call "crypto_acomp_compress()" for the next list element
> > > 3) when this request completes, dequeue it from the list "state"
> > > 4) repeat for all requests in "state"
> > > 5) When the last request in "state" completes, call
> > >    "reqs[0]->base.complete()", which notifies crypto_wait.
> > >
> > > From what I can understand, the latency cost should be the same for
> > > processing a request chain in series vs. processing each request as it is
> > > done today in zswap, by calling:
> > >
> > >   comp_ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->reqs[0]), &acomp_ctx->wait);
> > >
> > > It is not clear to me if there is a cost amortization benefit for software
> > > compressors. One of the requirements from Yosry was that there should
> > > be no change for the software compressors, which is what I have
> > > attempted to do in v5.
> > >
> > > Can you please help us understand if there is a room for optimizing
> > > the implementation of the synchronous "acomp_do_req_chain()" API?
> > > I would also like to get inputs from the zswap maintainers on using
> > > request chaining for a batching implementation for software compressors.
> >
> > Is there a functional change in doing so, or just using different
> > interfaces to accomplish the same thing we do today?
>
> The code paths for software compressors are considerably different between
> these two scenarios:
>
> 1) Given a batch of 8 pages: for each page, call zswap_compress() that does this:
>
>       comp_ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->reqs[0]), &acomp_ctx->wait);
>
> 2) Given a batch of 8 pages:
>    a) Create a request chain of 8 acomp_reqs, starting with reqs[0], as
>       described earlier.
>    b) Process the request chain by calling:
>
>       err = crypto_wait_req(acomp_do_req_chain(reqs[0], crypto_acomp_compress), &acomp_ctx->wait);
>       /* Get each req's error status. */
>       for (i = 0; i < nr_pages; ++i) {
>               errors[i] = acomp_request_err(reqs[i]);
>               if (errors[i]) {
>                       pr_debug("Request chaining req %d compress error %d\n", i, errors[i]);
>               } else {
>                       dlens[i] = reqs[i]->dlen;
>               }
>       }
>
> What I mean by considerably different code paths is that request chaining
> internally overwrites the req's base.complete and base.data (after saving the
> original values) to implement the algorithm described earlier. Basically, the
> chain is processed in series by getting the next req in the chain, setting it's
> completion function to "acomp_reqchain_done()", which gets called when
> the "op" (crypto_acomp_compress()) is completed for that req.
> acomp_reqchain_done() will cause the next req to be processed in the
> same manner.
> If this next req happens to be the last req to be processed,
> it will notify the original completion function of reqs[0], with the crypto_wait
> that zswap sets up in zswap_cpu_comp_prepare():
>
>       acomp_request_set_callback(acomp_ctx->reqs[0], CRYPTO_TFM_REQ_MAY_BACKLOG,
>                                  crypto_req_done, &acomp_ctx->wait);
>
> Patch [1] in v5 of this series has the full implementation of acomp_do_req_chain()
> in case you want to understand this in more detail.
>
> The "functional change" wrt request chaining is limited to the above.

For software compressors, the batch size should be 1. In that
scenario, from a zswap perspective (without going into the acomp
implementation details please), is there a functional difference? If
not, we can just use the request chaining API regardless of batching
if that is what Herbert means.

>
> [1]: https://patchwork.kernel.org/project/linux-mm/patch/20241221063119.29140-2-kanchana.p.sridhar@intel.com/
>
> Thanks,
> Kanchana
>
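
To make the batch-size-1 question concrete, here is a minimal user-space
sketch of the serial chain processing described in the quoted text. It is
only a toy model under stated assumptions: struct toy_req, toy_compress(),
toy_chain(), toy_do_req_chain() and toy_count_done() are hypothetical names
invented for illustration, not the kernel's acomp_req or
acomp_do_req_chain(), and the "first error wins" return value is my
assumption rather than something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a request; NOT the kernel's struct acomp_req. */
struct toy_req {
	struct toy_req *next;	/* next request in the chain, NULL at the end */
	int input;		/* stand-in for the source page length */
	int err;		/* per-request status, read back after the chain runs */
	int dlen;		/* per-request output length */
};

typedef int (*toy_op_t)(struct toy_req *req);
typedef void (*toy_done_t)(void *data, int err);

/* Hypothetical "compress" op: fails on negative input, else halves the length. */
static int toy_compress(struct toy_req *req)
{
	if (req->input < 0)
		return -1;
	req->dlen = req->input / 2;
	return 0;
}

/* Append req to the end of the chain that starts at head. */
static void toy_chain(struct toy_req *req, struct toy_req *head)
{
	struct toy_req *tail = head;

	while (tail->next)
		tail = tail->next;
	tail->next = req;
	req->next = NULL;
}

/*
 * Run op on every request in series, record each request's status, and
 * notify the caller exactly once after the last request has finished.
 * Returns the first error seen, or 0 (an assumption of this sketch).
 */
static int toy_do_req_chain(struct toy_req *head, toy_op_t op,
			    toy_done_t done, void *data)
{
	int first_err = 0;

	assert(head != NULL);
	for (struct toy_req *r = head; r; r = r->next) {
		r->err = op(r);
		if (r->err && !first_err)
			first_err = r->err;
	}
	done(data, first_err);
	return first_err;
}

/* Completion callback that counts how many times the caller was notified. */
static void toy_count_done(void *data, int err)
{
	(void)err;
	++*(int *)data;
}
```

In this model a chain of one request results in one call to the op and one
completion notification, which is what a caller sees from a single direct
call today; a longer chain adds only the per-request status readback. Whether
the real request chaining code behaves the same way from zswap's point of
view is exactly the open question above.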