From: Yosry Ahmed
Date: Tue, 22 Oct 2024 17:48:44 -0700
Subject: Re: [RFC PATCH v1 04/13] mm: zswap: zswap_compress()/decompress() can submit, then poll an acomp_req.
To: Kanchana P Sridhar
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org,
 nphamcs@gmail.com, chengming.zhou@linux.dev, usamaarif642@gmail.com,
 ryan.roberts@arm.com, ying.huang@intel.com, 21cnbao@gmail.com,
 akpm@linux-foundation.org, linux-crypto@vger.kernel.org,
 herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com,
 ardb@kernel.org, ebiggers@google.com, surenb@google.com,
 kristen.c.accardi@intel.com, zanussi@kernel.org, viro@zeniv.linux.org.uk,
 brauner@kernel.org, jack@suse.cz, mcgrof@kernel.org, kees@kernel.org,
 joel.granados@kernel.org, bfoster@redhat.com, willy@infradead.org,
 linux-fsdevel@vger.kernel.org, wajdi.k.feghali@intel.com,
 vinodh.gopal@intel.com
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
 <20241018064101.336232-5-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-5-kanchana.p.sridhar@intel.com>

On Thu, Oct 17, 2024 at 11:41 PM Kanchana P Sridhar wrote:
>
> If the crypto_acomp has a poll interface registered, zswap_compress()
> and zswap_decompress() will submit the acomp_req, and then poll() for a
> successful completion/error status in a busy-wait loop. This allows an
> asynchronous way to manage (potentially multiple) acomp_reqs without
> the use of interrupts, which is supported in the iaa_crypto driver.
>
> This enables us to implement batch submission of multiple
> compression/decompression jobs to the Intel IAA hardware accelerator,
> which will process them in parallel; followed by polling the batch's
> acomp_reqs for completion status.
>
> Signed-off-by: Kanchana P Sridhar
> ---
>  mm/zswap.c | 51 +++++++++++++++++++++++++++++++++++++++------------
>  1 file changed, 39 insertions(+), 12 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index f6316b66fb23..948c9745ee57 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -910,18 +910,34 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
>         acomp_request_set_params(acomp_ctx->req, &input, &output, PAGE_SIZE, dlen);
>
>         /*
> -        * it maybe looks a little bit silly that we send an asynchronous request,
> -        * then wait for its completion synchronously. This makes the process look
> -        * synchronous in fact.
> -        * Theoretically, acomp supports users send multiple acomp requests in one
> -        * acomp instance, then get those requests done simultaneously. but in this
> -        * case, zswap actually does store and load page by page, there is no
> -        * existing method to send the second page before the first page is done
> -        * in one thread doing zwap.
> -        * but in different threads running on different cpu, we have different
> -        * acomp instance, so multiple threads can do (de)compression in parallel.
> +        * If the crypto_acomp provides an asynchronous poll() interface,
> +        * submit the descriptor and poll for a completion status.
> +        *
> +        * It maybe looks a little bit silly that we send an asynchronous
> +        * request, then wait for its completion in a busy-wait poll loop, or,
> +        * synchronously. This makes the process look synchronous in fact.
> +        * Theoretically, acomp supports users send multiple acomp requests in
> +        * one acomp instance, then get those requests done simultaneously.
> +        * But in this case, zswap actually does store and load page by page,
> +        * there is no existing method to send the second page before the
> +        * first page is done in one thread doing zswap.
> +        * But in different threads running on different cpu, we have different
> +        * acomp instance, so multiple threads can do (de)compression in
> +        * parallel.
>          */
> -       comp_ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req), &acomp_ctx->wait);
> +       if (acomp_ctx->acomp->poll) {
> +               comp_ret = crypto_acomp_compress(acomp_ctx->req);
> +               if (comp_ret == -EINPROGRESS) {
> +                       do {
> +                               comp_ret = crypto_acomp_poll(acomp_ctx->req);
> +                               if (comp_ret && comp_ret != -EAGAIN)
> +                                       break;
> +                       } while (comp_ret);
> +               }
> +       } else {
> +               comp_ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req), &acomp_ctx->wait);
> +       }
> +

Is Herbert suggesting that crypto_wait_req(crypto_acomp_compress(..))
essentially do the poll internally for IAA, and hence this change can
be dropped?

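To make the question concrete, here is a rough user-space sketch of what "do the poll internally" could look like. This is only an illustration under stated assumptions: fake_req, fake_submit(), fake_poll() and fake_wait() are hypothetical names, not the crypto API and not iaa_crypto code. The point is just that a wait helper can absorb the -EINPROGRESS/-EAGAIN loop, so the caller keeps a single call site.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for an acomp_req; NOT the kernel structure. */
struct fake_req {
	int polls_left;	/* polls remaining before the "hardware" finishes */
	int result;	/* final status reported on completion */
};

/* Hypothetical submit: returns at once, -EINPROGRESS if work is pending. */
static int fake_submit(struct fake_req *req)
{
	return req->polls_left > 0 ? -EINPROGRESS : req->result;
}

/* Hypothetical poll: -EAGAIN while pending, the final status once done. */
static int fake_poll(struct fake_req *req)
{
	if (req->polls_left > 0 && --req->polls_left > 0)
		return -EAGAIN;
	return req->result;
}

/*
 * Hypothetical wait helper: hides the busy-wait loop from the caller.
 * Any status other than -EINPROGRESS/-EAGAIN (success or a hard error)
 * ends the loop and is returned as-is.
 */
static int fake_wait(struct fake_req *req, int status)
{
	while (status == -EINPROGRESS || status == -EAGAIN)
		status = fake_poll(req);
	return status;
}
```

With something of this shape, a caller would write fake_wait(&req, fake_submit(&req)) and get back 0 or a hard error, mirroring the existing crypto_wait_req(crypto_acomp_compress(...)) call site without an open-coded poll loop in zswap.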
>         dlen = acomp_ctx->req->dlen;
>         if (comp_ret)
>                 goto unlock;
> @@ -959,6 +975,7 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
>         struct scatterlist input, output;
>         struct crypto_acomp_ctx *acomp_ctx;
>         u8 *src;
> +       int ret;
>
>         acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
>         mutex_lock(&acomp_ctx->mutex);
> @@ -984,7 +1001,17 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
>         sg_init_table(&output, 1);
>         sg_set_folio(&output, folio, PAGE_SIZE, 0);
>         acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, PAGE_SIZE);
> -       BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait));
> +       if (acomp_ctx->acomp->poll) {
> +               ret = crypto_acomp_decompress(acomp_ctx->req);
> +               if (ret == -EINPROGRESS) {
> +                       do {
> +                               ret = crypto_acomp_poll(acomp_ctx->req);
> +                               BUG_ON(ret && ret != -EAGAIN);
> +                       } while (ret);
> +               }
> +       } else {
> +               BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait));
> +       }
>         BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
>         mutex_unlock(&acomp_ctx->mutex);
>
> --
> 2.27.0
>