References: <20241221063119.29140-1-kanchana.p.sridhar@intel.com> <20241221063119.29140-13-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241221063119.29140-13-kanchana.p.sridhar@intel.com>
From: Yosry Ahmed
Date: Mon, 6 Jan 2025 17:19:58 -0800
Subject: Re: [PATCH v5 12/12] mm: zswap: Compress batching with Intel IAA in zswap_store() of large folios.
To: Kanchana P Sridhar
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org, nphamcs@gmail.com, chengming.zhou@linux.dev, usamaarif642@gmail.com, ryan.roberts@arm.com, 21cnbao@gmail.com, akpm@linux-foundation.org, linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com, ardb@kernel.org, ebiggers@google.com, surenb@google.com, kristen.c.accardi@intel.com, wajdi.k.feghali@intel.com, vinodh.gopal@intel.com

On Fri, Dec 20, 2024 at 10:31 PM Kanchana P Sridhar wrote:
>
> zswap_compress_folio() is modified to detect if the pool's acomp_ctx has
> more than one "nr_reqs", which will be the case if the cpu onlining code
> has allocated batching resources in the acomp_ctx based on the queries to
> acomp_has_async_batching() and crypto_acomp_batch_size(). If multiple
> "nr_reqs" are available in the acomp_ctx, it means compress batching can be
> used with a batch-size of "acomp_ctx->nr_reqs".
>
> If compress batching can be used with the given zswap pool,
> zswap_compress_folio() will invoke the newly added zswap_batch_compress()
> procedure to compress and store the folio in batches of
> "acomp_ctx->nr_reqs" pages. The batch size is effectively
> "acomp_ctx->nr_reqs".
>
> zswap_batch_compress() calls crypto_acomp_batch_compress() to compress each
> batch of (up to) "acomp_ctx->nr_reqs" pages. The iaa_crypto driver
> will compress each batch of pages in parallel in the Intel IAA hardware
> with 'async' mode and request chaining.
>
> Hence, zswap_batch_compress() does the same computes for a batch, as
> zswap_compress() does for a page; and returns true if the batch was
> successfully compressed/stored, and false otherwise.
>
> If the pool does not support compress batching, zswap_compress_folio()
> calls zswap_compress() for each individual page in the folio, as before.
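
For readers following the batching arithmetic in the quoted description, here is a minimal userspace sketch of how a folio of nr_pages pages splits into batches of at most batch_size pages. It is plain C, not kernel code: batch_len() and nr_batches() are hypothetical helper names introduced only for illustration, and batch_size stands in for "acomp_ctx->nr_reqs". batch_len() mirrors the min() expression the patch uses to size each batch.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of the patch): number of pages in the
 * batch that starts at @index, i.e. min(nr_pages - index, batch_size).
 */
static unsigned int batch_len(long nr_pages, long index, unsigned int batch_size)
{
	long remaining = nr_pages - index;

	assert(batch_size > 0);
	assert(index >= 0 && index < nr_pages);

	return remaining < (long)batch_size ? (unsigned int)remaining : batch_size;
}

/*
 * Hypothetical helper (not part of the patch): number of batches needed
 * to cover @nr_pages, rounding up so the final short batch is counted.
 */
static long nr_batches(long nr_pages, unsigned int batch_size)
{
	assert(batch_size > 0);
	assert(nr_pages >= 0);

	return (nr_pages + (long)batch_size - 1) / (long)batch_size;
}
```

With an illustrative 19-page folio and a batch size of 8, stepping index by batch_size gives batches of 8, 8 and 3 pages, so only the last batch is short.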
>
> Signed-off-by: Kanchana P Sridhar
> ---
>  mm/zswap.c | 109 +++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 105 insertions(+), 4 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 1be0f1807bfc..f336fafe24c4 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1467,17 +1467,118 @@ static void shrink_worker(struct work_struct *w)
>   * main API
>   **********************************/
>
> +static bool zswap_batch_compress(struct folio *folio,
> +                                 long index,
> +                                 unsigned int batch_size,
> +                                 struct zswap_entry *entries[],
> +                                 struct zswap_pool *pool,
> +                                 struct crypto_acomp_ctx *acomp_ctx)
> +{
> +        int comp_errors[ZSWAP_MAX_BATCH_SIZE] = { 0 };
> +        unsigned int dlens[ZSWAP_MAX_BATCH_SIZE];
> +        struct page *pages[ZSWAP_MAX_BATCH_SIZE];
> +        unsigned int i, nr_batch_pages;
> +        bool ret = true;
> +
> +        nr_batch_pages = min((unsigned int)(folio_nr_pages(folio) - index), batch_size);
> +
> +        for (i = 0; i < nr_batch_pages; ++i) {
> +                pages[i] = folio_page(folio, index + i);
> +                dlens[i] = PAGE_SIZE;
> +        }
> +
> +        mutex_lock(&acomp_ctx->mutex);
> +
> +        /*
> +         * Batch compress @nr_batch_pages. If IAA is the compressor, the
> +         * hardware will compress @nr_batch_pages in parallel.
> +         */
> +        ret = crypto_acomp_batch_compress(
> +                acomp_ctx->reqs,
> +                &acomp_ctx->wait,
> +                pages,
> +                acomp_ctx->buffers,
> +                dlens,
> +                comp_errors,
> +                nr_batch_pages);

I will hold off on reviewing this patch until the acomp interface is
settled, but I am wondering if this can be a vectorization of
zswap_compress() instead, since there's a lot of common code.

> +
> +        if (ret) {
> +                /*
> +                 * All batch pages were successfully compressed.
> +                 * Store the pages in zpool.
> +                 */
> +                struct zpool *zpool = pool->zpool;
> +                gfp_t gfp = __GFP_NORETRY | __GFP_NOWARN | __GFP_KSWAPD_RECLAIM;
> +
> +                if (zpool_malloc_support_movable(zpool))
> +                        gfp |= __GFP_HIGHMEM | __GFP_MOVABLE;
> +
> +                for (i = 0; i < nr_batch_pages; ++i) {
> +                        unsigned long handle;
> +                        char *buf;
> +                        int err;
> +
> +                        err = zpool_malloc(zpool, dlens[i], gfp, &handle);
> +
> +                        if (err) {
> +                                if (err == -ENOSPC)
> +                                        zswap_reject_compress_poor++;
> +                                else
> +                                        zswap_reject_alloc_fail++;
> +
> +                                ret = false;
> +                                break;
> +                        }
> +
> +                        buf = zpool_map_handle(zpool, handle, ZPOOL_MM_WO);
> +                        memcpy(buf, acomp_ctx->buffers[i], dlens[i]);
> +                        zpool_unmap_handle(zpool, handle);
> +
> +                        entries[i]->handle = handle;
> +                        entries[i]->length = dlens[i];
> +                }
> +        } else {
> +                /* Some batch pages had compression errors. */
> +                for (i = 0; i < nr_batch_pages; ++i) {
> +                        if (comp_errors[i]) {
> +                                if (comp_errors[i] == -ENOSPC)
> +                                        zswap_reject_compress_poor++;
> +                                else
> +                                        zswap_reject_compress_fail++;
> +                        }
> +                }
> +        }
> +
> +        mutex_unlock(&acomp_ctx->mutex);
> +
> +        return ret;
> +}
> +
>  static bool zswap_compress_folio(struct folio *folio,
>                                   struct zswap_entry *entries[],
>                                   struct zswap_pool *pool)
>  {
>          long index, nr_pages = folio_nr_pages(folio);
> +        struct crypto_acomp_ctx *acomp_ctx;
> +        unsigned int batch_size;
>
> -        for (index = 0; index < nr_pages; ++index) {
> -                struct page *page = folio_page(folio, index);
> +        acomp_ctx = raw_cpu_ptr(pool->acomp_ctx);
> +        batch_size = acomp_ctx->nr_reqs;
>
> -                if (!zswap_compress(page, entries[index], pool))
> -                        return false;
> +        if ((batch_size > 1) && (nr_pages > 1)) {
> +                for (index = 0; index < nr_pages; index += batch_size) {
> +
> +                        if (!zswap_batch_compress(folio, index, batch_size,
> +                                                  &entries[index], pool, acomp_ctx))
> +                                return false;
> +                }
> +        } else {
> +                for (index = 0; index < nr_pages; ++index) {
> +                        struct page *page = folio_page(folio, index);
> +
> +                        if (!zswap_compress(page, entries[index], pool))
> +                                return false;
> +                }
>          }
>
>          return true;
> --
> 2.27.0
>
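
One possible reading of the "vectorization" idea raised in the review comment, as a rough userspace sketch only. Since the acomp interface is still unsettled, nothing here uses kernel or crypto API calls: toy_rle() is a made-up run-length encoder standing in for the real compressor, and TOY_PAGE_SIZE, TOY_MAX_BATCH, toy_compress_pages() and toy_compress_page() are hypothetical names. The point is only the shape: one routine handles a batch of nr pages, and the single-page path becomes a batch of one, so the common code is written once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 64	/* illustrative only, not the kernel's PAGE_SIZE */
#define TOY_MAX_BATCH 8		/* stands in for ZSWAP_MAX_BATCH_SIZE */

/*
 * Toy run-length encoder standing in for the real compressor (assumption:
 * any per-page compressor with this shape would do). Emits (run, byte)
 * pairs. Returns the compressed length, or 0 if the output would not fit
 * in @cap bytes. @len is assumed to be non-zero.
 */
static size_t toy_rle(const unsigned char *src, size_t len,
		      unsigned char *dst, size_t cap)
{
	size_t in = 0, out = 0;

	while (in < len) {
		unsigned char byte = src[in];
		size_t run = 1;

		while (in + run < len && src[in + run] == byte && run < 255)
			run++;
		if (out + 2 > cap)
			return 0;
		dst[out++] = (unsigned char)run;
		dst[out++] = byte;
		in += run;
	}
	return out;
}

/*
 * Single code path for any batch size: compress @nr pages into @bufs and
 * record each compressed length in @dlens. Returns false at the first
 * page that does not fit in TOY_PAGE_SIZE bytes.
 */
static bool toy_compress_pages(unsigned char *pages[], unsigned int nr,
			       unsigned char *bufs[], size_t dlens[])
{
	unsigned int i;

	assert(nr >= 1 && nr <= TOY_MAX_BATCH);

	for (i = 0; i < nr; i++) {
		dlens[i] = toy_rle(pages[i], TOY_PAGE_SIZE, bufs[i], TOY_PAGE_SIZE);
		if (!dlens[i])
			return false;
	}
	return true;
}

/* The single-page case is just a batch of one. */
static bool toy_compress_page(unsigned char *page, unsigned char *buf,
			      size_t *dlen)
{
	return toy_compress_pages(&page, 1, &buf, dlen);
}
```

In this shape the per-page bookkeeping (length tracking, failure handling) lives only in toy_compress_pages(). Whether the real zswap_compress() can be restructured this way depends on how the acomp batching interface ends up looking.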