From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8DD96D64085 for ; Fri, 8 Nov 2024 20:15:10 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 192F66B00D1; Fri, 8 Nov 2024 15:15:10 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 11E526B00D3; Fri, 8 Nov 2024 15:15:10 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EB23E6B00D4; Fri, 8 Nov 2024 15:15:09 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id C4F986B00D1 for ; Fri, 8 Nov 2024 15:15:09 -0500 (EST) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 767C01A025D for ; Fri, 8 Nov 2024 20:15:09 +0000 (UTC) X-FDA: 82764031182.18.6656374 Received: from mail-qk1-f173.google.com (mail-qk1-f173.google.com [209.85.222.173]) by imf21.hostedemail.com (Postfix) with ESMTP id 2F8521C000A for ; Fri, 8 Nov 2024 20:13:56 +0000 (UTC) Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=n9PXCJK6; spf=pass (imf21.hostedemail.com: domain of yosryahmed@google.com designates 209.85.222.173 as permitted sender) smtp.mailfrom=yosryahmed@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1731096680; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=A8ZaVYoPzYCPAMjKNaGUfWAblmU5p0Gp8/YjaHtNL8w=; b=aTo5eo1rntQ2tMIPtaPr1N7gxb1t/iuAcKFq313dCpoEgOIdFLyQVaO6Bdw9F9CV7EIUf+ kaNh0Akobe781aE3l6hiXGQL6+SH5GGj8M19ZhCGwsGSc+ixR9IEmp9itp6/IWW/W+a4sv crb4lhcNNR8+qW1XfQVanFk/cQlAPIc= ARC-Authentication-Results: i=1; imf21.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=n9PXCJK6; spf=pass (imf21.hostedemail.com: domain of yosryahmed@google.com designates 209.85.222.173 as permitted sender) smtp.mailfrom=yosryahmed@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1731096680; a=rsa-sha256; cv=none; b=c5xdVd9eq7GyT/7Fgnl+glK4LT5FS7JWc517fV1VuMtq/LoZif2aIIATdCHy7xxYb7X7/9 ANpfcBP48JgZCDeu1gOvGOCr4Cbyi6+3t7Nurgw4V8TGGsL8OVk1DWJ0ahov5RP9JnaPWR c+8R9npgQcy9yZmMalGmJaRme2q3dM4= Received: by mail-qk1-f173.google.com with SMTP id af79cd13be357-7b14554468fso151936585a.1 for ; Fri, 08 Nov 2024 12:15:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1731096901; x=1731701701; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=A8ZaVYoPzYCPAMjKNaGUfWAblmU5p0Gp8/YjaHtNL8w=; b=n9PXCJK6tvN51b06vq5EGAE+cDwg75wg5XegCyWya1/fA9ScDzjaZxefmzcJJqXa5E HFf2UiXr5iT1JSwJsy6L9DBq0DIRDvzjo0mSiL6zRm/S0FGN90uT16eDEWAxXGYKEG9h P2dVzaFwrAXSRUh8i1vWdKVO86ZTeDmbwSacLdAUeNT96mlu9wiDAYUlTS8N8bxvDODd 7Y2FPR6NT5eW/csIuMYzYXvO6GiyTPhmD2RaZIuM58zklx0mueRrkJ/x+X/oOD/nl9Xi X+4qHT2TtH+cgb4AGLfyZ1yEHXbV8x+JIHVK7G+L69Jf+sK4W9oc2OwEHEgi5N/X22TL Qgqg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731096901; x=1731701701; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=A8ZaVYoPzYCPAMjKNaGUfWAblmU5p0Gp8/YjaHtNL8w=; b=kCX0evq1CDCNN9UK4UU3/Ud3LszODDWgSVdOB0Iv5Cz/lq4vj/HN/LqISEp/kE3CDc v0lykPgd351ty0a8AYnK1s8fzpNeg0PDWnYuHXSGvOKwYxWQlzWTsmZ6nLmVigh6TtEW FhQHT5rZ/o6/t46TEheKf0myiZyW+FJJtCsGaaCFu3A9SCd6bq6CZlC0Dl8Kf7c7uR+l sv+x4SDWYhSx8+y8g3gYRusgiRSaLmgoM8mIxEP/vgr77vC3ct0Q13pK8cF+Zue5hVE2 f/SXGgHBczUVlGyDB2bOy+ObVhh/K8jMSR5Hv/b73LRv1dWL6ldKRbY7tQvtc+yStM8X hUeA== X-Forwarded-Encrypted: i=1; AJvYcCUlnPJ/jkMUzHJD9khBbYGj4uJqmFvanwjGTRNzK5alsZ+BK+W+HaFGNRG+9heejStiLSQODhWHrA==@kvack.org X-Gm-Message-State: AOJu0YwjywrzJ9T1H3i01D8ELb/EAMp7ZBgOLl5Tty0czD+Bs6Kd5ElL QvOfySBZmyppnaXnXq4OYqJcpl0eE1mP5Nr3CKsIUsov3yZAbqBrKdtLkKiXctKG/+hAERHvuCa u/XQTf0KYVr3umrqB4M/7nRCSJu6DxAlxa92M X-Google-Smtp-Source: AGHT+IHwxkiHhBh0v4JSU3A9T/d6Om7fmIZTYA02EaBXG8KwzaA2lJaR/4b4dutDSDaMNve2nSBjqFxaXZCgHlAe5U8= X-Received: by 2002:a05:6214:3bc7:b0:6bf:6ef6:22d5 with SMTP id 6a1803df08f44-6d39e16b233mr61618156d6.17.1731096901466; Fri, 08 Nov 2024 12:15:01 -0800 (PST) MIME-Version: 1.0 References: <20241106192105.6731-1-kanchana.p.sridhar@intel.com> <20241106192105.6731-9-kanchana.p.sridhar@intel.com> In-Reply-To: <20241106192105.6731-9-kanchana.p.sridhar@intel.com> From: Yosry Ahmed Date: Fri, 8 Nov 2024 12:14:25 -0800 Message-ID: Subject: Re: [PATCH v3 08/13] mm: zswap: acomp_ctx mutex lock/unlock optimizations. To: Kanchana P Sridhar Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org, nphamcs@gmail.com, chengming.zhou@linux.dev, usamaarif642@gmail.com, ryan.roberts@arm.com, ying.huang@intel.com, 21cnbao@gmail.com, akpm@linux-foundation.org, linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com, ardb@kernel.org, ebiggers@google.com, surenb@google.com, kristen.c.accardi@intel.com, zanussi@kernel.org, wajdi.k.feghali@intel.com, vinodh.gopal@intel.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 2F8521C000A X-Stat-Signature: nbm638gcftjgwazmbm1tt5bc47xpnuib X-Rspam-User: X-HE-Tag: 1731096836-794868 X-HE-Meta: U2FsdGVkX1/VYjR0FnIgLYqkMcnHd3kIkpqjKkUxfW1ANXa5m7xaFnzefAE5NjPT415CGbdjYsP2P5rnox4Mzi0LG+/tft/ci6AVvnAD4Lr0IBH/A5+XaTFVd/rjuUHPrzaxB8Lceq2xSul1kylYOndCqTuGcJmhd5Om1c/x8KYLzeQqD1aJT0uaapUWRWKPcYU3UuC69RwNX3Jy27hLdzwXTTCd7/Gh/N/rA8qGsMrQ3om7xklTQAHHNvesGh0R3tNYfRMYCYIh67+yBPDUX/mdQowWhwdlZcqVKIgoxs+nfwtuv09HVpC8hTPO7oUXmM7IphMBj9lwWyn1b/k3I/sJDhaPRs957GgxX/2cQ6bf4yEJxudZtZMT97vhqQbomX0l+KLLy5mo9tADqSwoj/K8ikbyKLfJqSPj8LGFoL+LCCukhQVr1VHLBUiSKs84D3eBRojO7xoGQFswGWLnmBerZt3hshWGxseodBfGRsUjYz17ytOjRDl0uoj2l+L3xbVMetNtaEQRJycup90g5D2QYeiMIpeFFeRTe9gUtl4G+dPBkc/z9/O8H2BPN0+zFigPdUBmKRYwoSATpHxCRO3cChH2QngzkcRVTRwDKekN5kQdfABqijbiooSVOwiTe+oWSCf55BxUe909iXvzD61CA3TegAKJ31N6K3lxwIFzk2p55QZ0j454b7iyt2qIWhDEIsbcMGlsFv7FyLtfXJdmuFuELvsZaPfvnbdYkHbhfbP0TOMQdXllIJcRxoX44Lv2HWLdkxWyP9oiSuW7RWTRNGj2VNh/NhFG+Iy4qNaNRVKt4GbTvsnaVaoLdzh/ntsN11ft8p8Rc2Xf/O+NC+hgs45Sh2NlRQmWcCQCpn71ACi1rk4iKc9NGxyAK/EX7NNxBk8oWOOoIGDMH0lFoPCSw/ZVb/fuh4D+fCQLlj2kVSegPwV2Fj9nZqfhyTfRXYaHZmgMWa87PW8Z4Hu mxYNDcca 7vRrivI2sBX3hQbzQs23hDPN8sSUE1zHbgkDCXjxq4rljOExVOcHCiJKCk7sb9v7lB0hc44e6tk0oRnAa1yDZglAy7VG8lUQzNKMXw552YZH055oJH/TYkCcDudhKOqTvdImW8tNgGIy/o5/qya2u2i3bb2vXpC5j2y9EdVUasaLh/Mtt5Oo926QmSDSIjCLwPsmaJitrR+RrWek5OpfMBgEzWtfxwX6Al12Shvc9XaYljh8NhAkgG2Nr6xeCrLFwdvjEmUqjbh4VtK9tM1HO7jSCu0voEY+/rViZHTcgpnmaS00= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, Nov 6, 2024 at 11:21=E2=80=AFAM Kanchana P Sridhar wrote: > > This patch implements two changes with respect to the acomp_ctx mutex loc= k: The commit subject is misleading, one of these is definitely not an optimization. Also, if we are doing two unrelated things we should do them in two separate commits. > > 1) The mutex lock is not acquired/released in zswap_compress(). Instead, > zswap_store() acquires the mutex lock once before compressing each pag= e > in a large folio, and releases the lock once all pages in the folio ha= ve > been compressed. This should reduce some compute cycles in case of lar= ge > folio stores. I understand how bouncing the mutex around can regress performance, but I expect this to be more due to things like cacheline bouncing and allowing reclaim to make meaningful progress before giving up the mutex, rather than the actual cycles spent acquiring the mutex. Do you have any numbers to support that this is a net improvement? We usually base optimizations on data. > 2) In zswap_decompress(), the mutex lock is released after the conditiona= l > zpool_unmap_handle() based on "src !=3D acomp_ctx->buffer" rather than > before. This ensures that the value of "src" obtained earlier does not > change. If the mutex lock is released before the comparison of "src" i= t > is possible that another call to reclaim by the same process could > obtain the mutex lock and over-write the value of "src". This seems like a bug fix for 9c500835f279 ("mm: zswap: fix kernel BUG in sg_init_one"). That commit changed checking acomp_ctx->is_sleepable outside the mutex, which seems to be safe, to checking acomp_ctx->buffer. If my understanding is correct, this needs to be sent separately as a hotfix, with a proper Fixes tag and CC stable. The side effect would be that we never unmap the zpool handle and essentially leak the memory, right? > > Signed-off-by: Kanchana P Sridhar > --- > mm/zswap.c | 19 +++++++++++++++---- > 1 file changed, 15 insertions(+), 4 deletions(-) > > diff --git a/mm/zswap.c b/mm/zswap.c > index f6316b66fb23..3e899fa61445 100644 > --- a/mm/zswap.c > +++ b/mm/zswap.c > @@ -880,6 +880,9 @@ static int zswap_cpu_comp_dead(unsigned int cpu, stru= ct hlist_node *node) > return 0; > } > > +/* > + * The acomp_ctx->mutex must be locked/unlocked in the calling procedure= . > + */ > static bool zswap_compress(struct page *page, struct zswap_entry *entry, > struct zswap_pool *pool) > { > @@ -895,8 +898,6 @@ static bool zswap_compress(struct page *page, struct = zswap_entry *entry, > > acomp_ctx =3D raw_cpu_ptr(pool->acomp_ctx); > > - mutex_lock(&acomp_ctx->mutex); > - > dst =3D acomp_ctx->buffer; > sg_init_table(&input, 1); > sg_set_page(&input, page, PAGE_SIZE, 0); > @@ -949,7 +950,6 @@ static bool zswap_compress(struct page *page, struct = zswap_entry *entry, > else if (alloc_ret) > zswap_reject_alloc_fail++; > > - mutex_unlock(&acomp_ctx->mutex); > return comp_ret =3D=3D 0 && alloc_ret =3D=3D 0; > } > > @@ -986,10 +986,16 @@ static void zswap_decompress(struct zswap_entry *en= try, struct folio *folio) > acomp_request_set_params(acomp_ctx->req, &input, &output, entry->= length, PAGE_SIZE); > BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &= acomp_ctx->wait)); > BUG_ON(acomp_ctx->req->dlen !=3D PAGE_SIZE); > - mutex_unlock(&acomp_ctx->mutex); > > if (src !=3D acomp_ctx->buffer) > zpool_unmap_handle(zpool, entry->handle); > + > + /* > + * It is safer to unlock the mutex after the check for > + * "src !=3D acomp_ctx->buffer" so that the value of "src" > + * does not change. > + */ This comment is unnecessary, we should only release the lock after we are done accessing protected fields. > + mutex_unlock(&acomp_ctx->mutex); > } > > /********************************* > @@ -1487,6 +1493,7 @@ bool zswap_store(struct folio *folio) > { > long nr_pages =3D folio_nr_pages(folio); > swp_entry_t swp =3D folio->swap; > + struct crypto_acomp_ctx *acomp_ctx; > struct obj_cgroup *objcg =3D NULL; > struct mem_cgroup *memcg =3D NULL; > struct zswap_pool *pool; > @@ -1526,6 +1533,9 @@ bool zswap_store(struct folio *folio) > mem_cgroup_put(memcg); > } > > + acomp_ctx =3D raw_cpu_ptr(pool->acomp_ctx); > + mutex_lock(&acomp_ctx->mutex); > + > for (index =3D 0; index < nr_pages; ++index) { > struct page *page =3D folio_page(folio, index); > ssize_t bytes; > @@ -1547,6 +1557,7 @@ bool zswap_store(struct folio *folio) > ret =3D true; > > put_pool: > + mutex_unlock(&acomp_ctx->mutex); > zswap_pool_put(pool); > put_objcg: > obj_cgroup_put(objcg); > -- > 2.27.0 >