From: Kanchana P Sridhar
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org,
	yosry.ahmed@linux.dev, nphamcs@gmail.com, chengming.zhou@linux.dev,
	usamaarif642@gmail.com, ryan.roberts@arm.com, 21cnbao@gmail.com,
	ying.huang@linux.alibaba.com, akpm@linux-foundation.org,
	senozhatsky@chromium.org, linux-crypto@vger.kernel.org,
	herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com,
	ardb@kernel.org, ebiggers@google.com, surenb@google.com,
	kristen.c.accardi@intel.com, vinicius.gomes@intel.com
Cc: wajdi.k.feghali@intel.com, vinodh.gopal@intel.com,
	kanchana.p.sridhar@intel.com
Subject: [PATCH v10 16/25] crypto: iaa - Submit the two largest source buffers first in decompress batching.
Date: Thu, 3 Jul 2025 21:23:14 -0700
Message-Id: <20250704042323.10318-17-kanchana.p.sridhar@intel.com>
In-Reply-To: <20250704042323.10318-1-kanchana.p.sridhar@intel.com>
References: <20250704042323.10318-1-kanchana.p.sridhar@intel.com>

This patch finds the two largest source buffers in a given decompression
batch, and submits them first to the IAA decompress engines.

This improves decompress batching latency because the hardware has a
head start on decompressing the highest latency source buffers in the
batch. Workload performance is also significantly improved as a result of
this optimization.

Signed-off-by: Kanchana P Sridhar
---
 drivers/crypto/intel/iaa/iaa_crypto_main.c | 60 +++++++++++++++++++++-
 1 file changed, 58 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index 09d786e85ab66..4ed56a69112a9 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -2375,6 +2375,35 @@ static int iaa_comp_acompress_batch(
 	return err;
 }
 
+/*
+ * Find the two largest source buffers in @slens for a decompress batch,
+ * and pass their indices back in @idx_max and @idx_next_max.
+ *
+ * Returns true if there is no second largest source buffer, only a max buffer.
+ */
+static __always_inline bool decomp_batch_get_max_slens_idx(
+	unsigned int slens[],
+	int nr_pages,
+	int *idx_max,
+	int *idx_next_max)
+{
+	int i, max_i = 0, next_max_i = 0;
+
+	for (i = 0; i < nr_pages; ++i) {
+		if (slens[i] >= slens[max_i]) {
+			next_max_i = max_i;
+			max_i = i;
+		} else if ((next_max_i == max_i) || (slens[i] > slens[next_max_i])) {
+			next_max_i = i;
+		}
+	}
+
+	*idx_max = max_i;
+	*idx_next_max = next_max_i;
+
+	return (next_max_i == max_i);
+}
+
 /**
  * This API provides IAA decompress batching functionality for use by swap
  * modules.
@@ -2407,18 +2436,36 @@ static int iaa_comp_adecompress_batch(
 {
 	struct scatterlist inputs[IAA_CRYPTO_MAX_BATCH_SIZE];
 	struct scatterlist outputs[IAA_CRYPTO_MAX_BATCH_SIZE];
+	bool max_processed = false, next_max_processed = false;
 	bool decompressions_done = false;
-	int i, err = 0;
+	int i, max_i, next_max_i, err = 0;
 
 	BUG_ON(nr_reqs > IAA_CRYPTO_MAX_BATCH_SIZE);
 
 	iaa_set_req_poll(reqs, nr_reqs, true);
 
+	/*
+	 * Get the indices of the two largest decomp buffers in the batch.
+	 * Submit them first. This improves latency of the batch.
+	 */
+	next_max_processed = decomp_batch_get_max_slens_idx(slens, nr_reqs,
+							    &max_i, &next_max_i);
+
+	i = max_i;
+
 	/*
 	 * Prepare and submit the batch of iaa_reqs to IAA. IAA will process
 	 * these decompress jobs in parallel.
 	 */
-	for (i = 0; i < nr_reqs; ++i) {
+	for (; i < nr_reqs; ++i) {
+		if ((i == max_i) && max_processed)
+			continue;
+		if ((i == next_max_i) && max_processed && next_max_processed)
+			continue;
+
+		if (max_processed && !next_max_processed)
+			i = next_max_i;
+
 		reqs[i]->src = &inputs[i];
 		reqs[i]->dst = &outputs[i];
 		sg_init_one(reqs[i]->src, srcs[i], slens[i]);
@@ -2437,6 +2484,15 @@ static int iaa_comp_adecompress_batch(
 			errors[i] = -EAGAIN;
 		else if (errors[i])
 			err = -EINVAL;
+
+		if (i == max_i) {
+			max_processed = true;
+			i = -1;
+		}
+		if (i == next_max_i) {
+			next_max_processed = true;
+			i = -1;
+		}
 	}
 
 	/*
-- 
2.27.0
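
For readers following the index juggling in the submit loop, the ordering
it appears to aim for (largest buffer, then second largest, then the
remaining requests in their original order) can be sketched in user space
as below. This is only an illustration under stated assumptions, not driver
code: two_largest_idx() and submit_order() are hypothetical names, and the
sketch writes the order into an array instead of resetting the loop index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical user-space helper (not part of the driver): find the
 * indices of the two largest entries in slens[0..nr-1].
 * Returns true when there is no distinct second index (nr == 1).
 */
static bool two_largest_idx(const unsigned int slens[], int nr,
			    int *idx_max, int *idx_next_max)
{
	int i, max_i = 0, next_max_i = 0;

	assert(nr >= 1);

	for (i = 1; i < nr; ++i) {
		if (slens[i] >= slens[max_i]) {
			/* New maximum: the old maximum becomes second. */
			next_max_i = max_i;
			max_i = i;
		} else if (next_max_i == max_i ||
			   slens[i] > slens[next_max_i]) {
			/* No second candidate yet, or a larger one found. */
			next_max_i = i;
		}
	}

	*idx_max = max_i;
	*idx_next_max = next_max_i;
	return next_max_i == max_i;
}

/*
 * Hypothetical helper: fill order[0..nr-1] with the sequence in which
 * requests would be submitted: largest, second largest, then the rest
 * in their original order.
 */
static void submit_order(const unsigned int slens[], int nr, int order[])
{
	int max_i, next_max_i, i, n = 0;
	bool only_max = two_largest_idx(slens, nr, &max_i, &next_max_i);

	order[n++] = max_i;
	if (!only_max)
		order[n++] = next_max_i;

	for (i = 0; i < nr; ++i) {
		if (i == max_i || (!only_max && i == next_max_i))
			continue;
		order[n++] = i;
	}
}
```

With source lengths {100, 4000, 250, 3000}, submit_order() yields the index
sequence 1, 3, 0, 2. Building the order up front is a different design from
the patch, which instead rewinds the loop index after each of the two
largest buffers has been submitted.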