From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E29D1E6C609 for ; Tue, 3 Dec 2024 05:50:12 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4D7766B007B; Tue, 3 Dec 2024 00:50:12 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 487F06B0083; Tue, 3 Dec 2024 00:50:12 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 327C26B0085; Tue, 3 Dec 2024 00:50:12 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 147B46B007B for ; Tue, 3 Dec 2024 00:50:12 -0500 (EST) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id B4183120687 for ; Tue, 3 Dec 2024 05:50:11 +0000 (UTC) X-FDA: 82852571298.20.D945BD2 Received: from mail-yw1-f173.google.com (mail-yw1-f173.google.com [209.85.128.173]) by imf16.hostedemail.com (Postfix) with ESMTP id DB60E180017 for ; Tue, 3 Dec 2024 05:49:56 +0000 (UTC) Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=v7YNIuav; spf=pass (imf16.hostedemail.com: domain of yosryahmed@google.com designates 209.85.128.173 as permitted sender) smtp.mailfrom=yosryahmed@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1733205001; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=yCNIMTjnW7Pm1TEcHWt8TE+HaVex4OhWuGsdWimbvkM=; b=Hh3ZRFsAz/QEVCzK+njD9Cv1tNE/G5m7DuVhHMPozdf9jnwYyOGqdYl+UmJxrW6KQCPDCl 9u1sNNkofILAVS6S8F9L/uzjcVw8cPfxD/6UJeLaR/1BaN6dDBZmFS+X++d/uKpiRF2HfL jeIAmjPnn2u9kGKpEbS/7kcjnNm0Kak= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=v7YNIuav; spf=pass (imf16.hostedemail.com: domain of yosryahmed@google.com designates 209.85.128.173 as permitted sender) smtp.mailfrom=yosryahmed@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1733205001; a=rsa-sha256; cv=none; b=fHkpz3++5kGuPLjl2NYwhRYi/W15m7YN+nQRy1GPlymIDwvwnBedRSsrEZUpCRVU61kLnE GaknlnJ8CK6cteHhjGtP38qCpz52VvgQeYndWF7fNnmvZMqQdiHtup4I3ffNaE+kUlZ0VA Ue8ZuggLHrDBeoABJf/ugZBrXc1sMbQ= Received: by mail-yw1-f173.google.com with SMTP id 00721157ae682-6ef81222be7so24258967b3.3 for ; Mon, 02 Dec 2024 21:50:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1733205009; x=1733809809; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=yCNIMTjnW7Pm1TEcHWt8TE+HaVex4OhWuGsdWimbvkM=; b=v7YNIuavrhZOm+Ybq3x2RmUGKAilrsMEqV3t2J5zSlMfpt9OAkQQSrZhD9QdkcboA5 2rtmbTodPvpZl1juNdgRFJsxPeLdGqrd2Eo7KBte39/WdP5yJK0BWLA4NFPml8zx4D9v xGVDgQxr8gs9K9NeQusnijnm6OwqfH2iTuY9ryugSDM+IyI9IU2diilOOvnS8BKjLYcB 1ddiPAEJhcOeoStifOyAC2jG6kEsBllVn2FwVtltZjFAKlOIaqZeyCSz+Sks57p7TvHU TtKNfRGBY8fXjpjLJIKbedK6inIc3biO/PEUndsSuefkKkl8qaa7tuM8soyuCG/rPDE+ x10A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733205009; x=1733809809; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=yCNIMTjnW7Pm1TEcHWt8TE+HaVex4OhWuGsdWimbvkM=; b=WIdk0Buiav1SqJiNe4DCrAzzanLIxX0gEMXzkhswG2LY0Ta3FKj+WmEjIsNedfp6NK lwQSEK+B3GagDRUqQRIdk4sRJCv6KZOdcciP6MLLIiCNDVx3iD5hxqt6afNyXuay8PbC 0UBdh2JwYycdglntx2k70LJdHd+1tOxDLJY1PbyqefPmTWDQBY5iR/2W3FZu3tvi9Lho R/ao5208vl9d8G4j6eOM4f3hQCTAxii0B1XkrTtC2AtqEoL+uV/8g6xqZlLaOaBFswqv L7DJPa+Y0KcXHMmU2Ii5tKoLStWehNL4jG/Gwt3CAUAjae1ONbu7fW1xNbX/pY3tIelt RGwA== X-Forwarded-Encrypted: i=1; AJvYcCWvDhmKxsqwzgpkpHejsEGxg0nYkuOpx4aarmXnenNoEa+WhAxetFhPf0UKWkHh3ysI9FYruGv3bQ==@kvack.org X-Gm-Message-State: AOJu0Yx5szd9x5r5eR2pNiX7EVM/dTYhCj3jQogs5bb79n6CC6dsxSVU gVSFUSYgYMR3o9Q+2KZAVvl4Bi5AygkttADlj4eWa1OD59oR/MD7eDjGy0Q9KkafBwaaOwbhKW5 BgVo400u8lD5IqXQr19QR28tfMUcZwb5J2Q+r X-Gm-Gg: ASbGncv6nZEEqq1+xNcFUcRcE9Xk5oJjfMUVW2XO8IfvUK3frvoZa3ZuJ7ued8nItoi VfQGQxSEodEbJlbnMR2ibJtGJVZxO X-Google-Smtp-Source: AGHT+IEq1Q7mg2nIVMC3+8erBp1HUADTRiSvUDUhWnHUsl3QRlIpAyS2mgoMhULUC+gMf0vav1/SIz/BYf6AfpCRQLs= X-Received: by 2002:a05:690c:6108:b0:6ef:4ba4:ecb0 with SMTP id 00721157ae682-6eface01549mr20286037b3.6.1733205008633; Mon, 02 Dec 2024 21:50:08 -0800 (PST) MIME-Version: 1.0 References: <20241127225324.6770-1-kanchana.p.sridhar@intel.com> <20241127225324.6770-3-kanchana.p.sridhar@intel.com> <045a786d-7b13-4127-82ce-57510565bd15@linux.dev> In-Reply-To: From: Yosry Ahmed Date: Mon, 2 Dec 2024 21:49:32 -0800 Message-ID: Subject: Re: [PATCH v1 2/2] mm: zswap: zswap_store_pages() simplifications for batching. To: "Sridhar, Kanchana P" Cc: Chengming Zhou , "linux-kernel@vger.kernel.org" , "linux-mm@kvack.org" , "hannes@cmpxchg.org" , "nphamcs@gmail.com" , "usamaarif642@gmail.com" , "ryan.roberts@arm.com" , "21cnbao@gmail.com" <21cnbao@gmail.com>, "akpm@linux-foundation.org" , "Feghali, Wajdi K" , "Gopal, Vinodh" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam05 X-Stat-Signature: 3cf813xxpgyhkbwcruoq4dojefwth4jj X-Rspamd-Queue-Id: DB60E180017 X-Rspam-User: X-HE-Tag: 1733204996-90825 X-HE-Meta: U2FsdGVkX1/s04bHQliChFOe30bjVmhAGa20TFkXKVlEVdUztUn+cPgtdvX2e8A7WxwwJqMhz5/c3SdHC+Vox9017tQLWSB6NsSpEVxZIuUqPAJNIwxDroql5lGWbysZICbznL9ism3PGl4/s3A4C1D5Japk3tLyjMwSitGnu2Kekl0tlhuMIRhvO28xveJbjQK9M6J21DIirW/dgnsAduH2eB1Z9d7mhE2Hdq7liLQS5tpqReqAFPrTOmFd6VR4DDMunkBIjcqTL1OwkxsA0wUjZatVOpXXs1yFK5QbaCZGDsPkHXcCcrkAdYde0iOVT6c8ngCeKFFYWen7lZoYzWWyU+O4evHb7Bmxe+UeaGo8y/GuLQykxQKRljbTa/Ol0f2gfyqDMutr7AOnVjRj5re0S1BdPlrelThCKXspYl50IrigMgsknRVkirfZwOFSb3N9XWs+PH6Zj3C77NDU9GnAN14lA6+mqEmOfqiqyvejXNaZCXTU/SgQMYe79eBwGD7QnmbUVBstc2/m72qFK3oZoyLubSMOTxYRq3wertz5BQw5Fn1TmgxZi9RrxWQ5FQR36e7A5ub3q7wB7EgXltzLB9K7ag9hc68Suy86ZJAPW86JVHJdcpgd/qiWpeFKSh4iQbFNnHRtsYwPJMhQqxtsJVeRA3sjo7x9yQoZ5hiesYWkEZVdo632yAnoxQBkJEQFcmjl3wEWtov6oD4V7+SQUakn0Qo4FFSskUo4J71ur5DuFG+MYgMFj4mj/ehYgAQsOR3Uva9EQ4VusTj2LN+YRWR5fVpTTz7TyI3/dd4eSexO7XG5OfjpWjTC0x+hoedkAHkZ7IaQfy7zbNAliujhXlUDxl05qW2tg8PWl4yxnBkZ5FsEr6IUigXMwx6IBxGMQEht6N0cAKUoNlQqriuL/abUYRKDt/5nd1J9bOIPIupr+UlmK6KrlFLg4jTQsG5baZUZD7EOQHKsngL aBG6u22+ L/ujTd2lfULGvms+etqIQ8cF1WP/Cn4zAvgtmKgbI1aQwszvQnh7I+g/ymGNwBGT3vqtDfiW6Xt+PkR30sIMhQEul0zuYIdthtA0Opc+jxJq0ujxjpvJQFL0QqGqqFeI3MYBQnC0pg2Y+szswg4rtql2t26KDJ+VtggP6wUlS0lnLLuREhujDVm0IeiQosaggETW9VqBf2XQGEEFvd5exTxxTP7a8IUUPytmbNBY7cpNuriF7diwwRz2Zt17LH0HZIe78dxP7XC7At7wktCVyEGiDB6tac08bJOyUgk+OmR7lmG2Walr4vm2WyA7RrTZ7Q/xUpH+u9WDgSgkQsONjsAuazRPh6PXKabyFX8AfRvzau/twUlTwZ7/JiikANSEhoU5Rv9bYpWxo7vWiavEbHd50jXdY+q82hnz29+FNgpR6TArsCbGkNXKfYj4TJ44lPySUMrevQyl0xIFiikdFX5sxRIHy3v0/GNhwUVPXTiWcKw8GjB8xkd0LXcrq0iTrZRW0+7aWsuOlphIXBLeWIbe/jVTP9EXQj2EnDpqDOcCNS/9svXWcLfx8R7hi6D29nOts9PY1/V948Ao6WwjIelL8truQQjS1iuuxwGh1KGVTC/Q= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Dec 2, 2024 at 8:19=E2=80=AFPM Sridhar, Kanchana P wrote: > > > > -----Original Message----- > > From: Chengming Zhou > > Sent: Monday, December 2, 2024 7:06 PM > > To: Sridhar, Kanchana P ; Yosry Ahmed > > > > Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; > > hannes@cmpxchg.org; nphamcs@gmail.com; usamaarif642@gmail.com; > > ryan.roberts@arm.com; 21cnbao@gmail.com; akpm@linux-foundation.org; > > Feghali, Wajdi K ; Gopal, Vinodh > > > > Subject: Re: [PATCH v1 2/2] mm: zswap: zswap_store_pages() simplificati= ons > > for batching. > > > > On 2024/12/3 09:01, Sridhar, Kanchana P wrote: > > > Hi Chengming, Yosry, > > > > > >> -----Original Message----- > > >> From: Yosry Ahmed > > >> Sent: Monday, December 2, 2024 11:33 AM > > >> To: Chengming Zhou > > >> Cc: Sridhar, Kanchana P ; linux- > > >> kernel@vger.kernel.org; linux-mm@kvack.org; hannes@cmpxchg.org; > > >> nphamcs@gmail.com; usamaarif642@gmail.com; ryan.roberts@arm.com; > > >> 21cnbao@gmail.com; akpm@linux-foundation.org; Feghali, Wajdi K > > >> ; Gopal, Vinodh > > >> Subject: Re: [PATCH v1 2/2] mm: zswap: zswap_store_pages() > > simplifications > > >> for batching. > > >> > > >> On Wed, Nov 27, 2024 at 11:00=E2=80=AFPM Chengming Zhou > > >> wrote: > > >>> > > >>> On 2024/11/28 06:53, Kanchana P Sridhar wrote: > > >>>> In order to set up zswap_store_pages() to enable a clean batching > > >>>> implementation in [1], this patch implements the following changes= : > > >>>> > > >>>> 1) Addition of zswap_alloc_entries() which will allocate zswap ent= ries for > > >>>> all pages in the specified range for the folio, upfront. If t= his fails, > > >>>> we return an error status to zswap_store(). > > >>>> > > >>>> 2) Addition of zswap_compress_pages() that calls zswap_compress() = for > > >> each > > >>>> page, and returns false if any zswap_compress() fails, so > > >>>> zswap_store_page() can cleanup resources allocated and return= an > > >> error > > >>>> status to zswap_store(). > > >>>> > > >>>> 3) A "store_pages_failed" label that is a catch-all for all failur= e points > > >>>> in zswap_store_pages(). This facilitates cleaner error handli= ng within > > >>>> zswap_store_pages(), which will become important for IAA comp= ress > > >>>> batching in [1]. > > >>>> > > >>>> [1]: https://patchwork.kernel.org/project/linux- > > mm/list/?series=3D911935 > > >>>> > > >>>> Signed-off-by: Kanchana P Sridhar > > >>>> --- > > >>>> mm/zswap.c | 93 +++++++++++++++++++++++++++++++++++++++++- > > --- > > >> --------- > > >>>> 1 file changed, 71 insertions(+), 22 deletions(-) > > >>>> > > >>>> diff --git a/mm/zswap.c b/mm/zswap.c > > >>>> index b09d1023e775..db80c66e2205 100644 > > >>>> --- a/mm/zswap.c > > >>>> +++ b/mm/zswap.c > > >>>> @@ -1409,9 +1409,56 @@ static void shrink_worker(struct work_struc= t > > >> *w) > > >>>> * main API > > >>>> **********************************/ > > >>>> > > >>>> +static bool zswap_compress_pages(struct page *pages[], > > >>>> + struct zswap_entry *entries[], > > >>>> + u8 nr_pages, > > >>>> + struct zswap_pool *pool) > > >>>> +{ > > >>>> + u8 i; > > >>>> + > > >>>> + for (i =3D 0; i < nr_pages; ++i) { > > >>>> + if (!zswap_compress(pages[i], entries[i], pool)) > > >>>> + return false; > > >>>> + } > > >>>> + > > >>>> + return true; > > >>>> +} > > >>> > > >>> How about introducing a `zswap_compress_folio()` interface which > > >>> can be used by `zswap_store()`? > > >>> ``` > > >>> zswap_store() > > >>> nr_pages =3D folio_nr_pages(folio) > > >>> > > >>> entries =3D zswap_alloc_entries(nr_pages) > > >>> > > >>> ret =3D zswap_compress_folio(folio, entries, pool) > > >>> > > >>> // store entries into xarray and LRU list > > >>> ``` > > >>> > > >>> And this version `zswap_compress_folio()` is very simple for now: > > >>> ``` > > >>> zswap_compress_folio() > > >>> nr_pages =3D folio_nr_pages(folio) > > >>> > > >>> for (index =3D 0; index < nr_pages; ++index) { > > >>> struct page *page =3D folio_page(folio, index); > > >>> > > >>> if (!zswap_compress(page, &entries[index], pool)) > > >>> return false; > > >>> } > > >>> > > >>> return true; > > >>> ``` > > >>> This can be easily extended to support your "batched" version. > > >>> > > >>> Then the old `zswap_store_page()` could be removed. > > >>> > > >>> The good point is simplicity, that we don't need to slice folio > > >>> into multiple batches, then repeat the common operations for each > > >>> batch, like preparing entries, storing into xarray and LRU list... > > >> > > >> +1 > > > > > > Thanks for the code review comments. One question though: would > > > it make sense to trade-off the memory footprint cost with the code > > > simplification? For instance, lets say we want to store a 64k folio. > > > We would allocate memory for 16 zswap entries, and lets say one of > > > the compressions fails, we would deallocate memory for all 16 zswap > > > entries. Could this lead to zswap_entry kmem_cache starvation and > > > subsequent zswap_store() failures in multiple processes scenarios? > > > > Ah, I get your consideration. But it's the unlikely case, right? > > > > If the case you mentioned above happens a lot, I think yes, we should > > optimize its memory footprint to avoid allocation and deallocation. > > Thanks Chengming. I see your point. Let me gather performance data > for the two options, and share. Yeah I think we shouldn't optimize for the uncommon case, not until there's a real problem that needs fixing. > > > > > On the other hand, we should consider a folio would be compressed > > successfully in most cases. So we have to allocate all entries > > eventually. > > > > Based on your consideration, I think your way is ok too, although > > I think the patch 2/2 should be dropped, since it hides pages loop > > in smaller functions, as Yosry mentioned too. > > My main intent with patch 2/2 was to set up the error handling > path to be common and simpler, whether errors were encountered > during compression/zpool_malloc/xarray store. Hence, I initialize the > allocated zswap_entry's handle in zswap_alloc_entries() to ERR_PTR(-EINVA= L), > so it is easy for the common error handling code in patch 2 to determine > if the handle was allocated (and hence needs to be freed). This benefits > the batching code by eliminating the need to maintain state as to which > stage of zswap_store_pages() sees an error, based on which resources > would need to be deleted. > > My key consideration is to keep the batching error handling code simple, > hence these changes in patch 2. The changes described above would > help batching, and should not impact the non-batching case, as indicated > by the regression testing data I've included in the cover letter. > > I don't mind inlining the implementation of the helper functions, as I > mentioned in my response to Yosry. I am hoping the error handling > simplifications are acceptable, since they will help the batching code. I think having the loops open-coded should still be better than separate helpers. But I understand not wanting to have the loops directly in zswap_store(), as the error handling would be simpler if we do it in a separate function like zswap_store_pages(). How about we just move the loop from zswap_store() to zswap_store_page() and call it zswap_store_folio()? When batching is added I imagine we may need to split the loop into two loops before and after zswap_compress_folio(), which isn't very neat but is probably fine.