From: Yosry Ahmed
Date: Mon, 21 Oct 2024 14:11:29 -0700
Subject: Re: [RFC 1/4] mm/zswap: skip swapcache for swapping in zswap pages
To: Usama Arif
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, hannes@cmpxchg.org,
 david@redhat.com, willy@infradead.org, kanchana.p.sridhar@intel.com,
 nphamcs@gmail.com, chengming.zhou@linux.dev, ryan.roberts@arm.com,
 ying.huang@intel.com, 21cnbao@gmail.com, riel@surriel.com,
 shakeel.butt@linux.dev, kernel-team@meta.com,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
References: <20241018105026.2521366-1-usamaarif642@gmail.com>
 <20241018105026.2521366-2-usamaarif642@gmail.com>
In-Reply-To: <20241018105026.2521366-2-usamaarif642@gmail.com>

On Fri, Oct 18, 2024 at 3:50 AM Usama Arif wrote:
>
> As mentioned in [1], there is a significant improvement in no
> readahead swapin performance for super fast devices when skipping
> swapcache.
>
> With large folio zswapin support added in later patches, this will also
> mean this path will also act as "readahead" by swapping in multiple
> pages into large folios. further improving performance.
>
> [1] https://lore.kernel.org/all/1505886205-9671-5-git-send-email-minchan@kernel.org/T/#m5a792a04dfea20eb7af4c355d00503efe1c86a93
>
> Signed-off-by: Usama Arif
> ---
>  include/linux/zswap.h |  6 ++++++
>  mm/memory.c           |  3 ++-
>  mm/page_io.c          |  1 -
>  mm/zswap.c            | 46 +++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 54 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/zswap.h b/include/linux/zswap.h
> index d961ead91bf1..e418d75db738 100644
> --- a/include/linux/zswap.h
> +++ b/include/linux/zswap.h
> @@ -27,6 +27,7 @@ struct zswap_lruvec_state {
>  unsigned long zswap_total_pages(void);
>  bool zswap_store(struct folio *folio);
>  bool zswap_load(struct folio *folio);
> +bool zswap_present_test(swp_entry_t swp, int nr_pages);
>  void zswap_invalidate(swp_entry_t swp);
>  int zswap_swapon(int type, unsigned long nr_pages);
>  void zswap_swapoff(int type);
> @@ -49,6 +50,11 @@ static inline bool zswap_load(struct folio *folio)
>  	return false;
>  }
>
> +static inline bool zswap_present_test(swp_entry_t swp, int nr_pages)
> +{
> +	return false;
> +}
> +
>  static inline void zswap_invalidate(swp_entry_t swp) {}
>  static inline int zswap_swapon(int type, unsigned long nr_pages)
>  {
> diff --git a/mm/memory.c b/mm/memory.c
> index 03e5452dd0c0..49d243131169 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4289,7 +4289,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  	swapcache = folio;
>
>  	if (!folio) {
> -		if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
> +		if ((data_race(si->flags & SWP_SYNCHRONOUS_IO) ||
> +		    zswap_present_test(entry, 1)) &&
>  		    __swap_count(entry) == 1) {
>  			/* skip swapcache */
>  			folio = alloc_swap_folio(vmf);
> diff --git a/mm/page_io.c b/mm/page_io.c
> index 4aa34862676f..2a15b197968a 100644
> --- a/mm/page_io.c
> +++ b/mm/page_io.c
> @@ -602,7 +602,6 @@ void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
>  	unsigned long pflags;
>  	bool in_thrashing;
>
> -	VM_BUG_ON_FOLIO(!folio_test_swapcache(folio) && !synchronous, folio);
>  	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
>  	VM_BUG_ON_FOLIO(folio_test_uptodate(folio), folio);
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 7f00cc918e7c..f4b03071b2fb 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1576,6 +1576,52 @@ bool zswap_store(struct folio *folio)
>  	return ret;
>  }
>
> +static bool swp_offset_in_zswap(unsigned int type, pgoff_t offset)
> +{
> +	return (offset >> SWAP_ADDRESS_SPACE_SHIFT) < nr_zswap_trees[type];
> +}
> +
> +/* Returns true if the entire folio is in zswap */
> +bool zswap_present_test(swp_entry_t swp, int nr_pages)

Also, did you check how the performance changes if we bring back the
bitmap of present entries (i.e. what used to be frontswap's bitmap)
instead of the tree lookups here?

> +{
> +	pgoff_t offset = swp_offset(swp), tree_max_idx;
> +	int max_idx = 0, i = 0, tree_offset = 0;
> +	unsigned int type = swp_type(swp);
> +	struct zswap_entry *entry = NULL;
> +	struct xarray *tree;
> +
> +	while (i < nr_pages) {
> +		tree_offset = offset + i;
> +		/* Check if the tree exists. */
> +		if (!swp_offset_in_zswap(type, tree_offset))
> +			return false;
> +
> +		tree = swap_zswap_tree(swp_entry(type, tree_offset));
> +		XA_STATE(xas, tree, tree_offset);
> +
> +		tree_max_idx = tree_offset % SWAP_ADDRESS_SPACE_PAGES ?
> +			ALIGN(tree_offset, SWAP_ADDRESS_SPACE_PAGES) :
> +			ALIGN(tree_offset + 1, SWAP_ADDRESS_SPACE_PAGES);
> +		max_idx = min(offset + nr_pages, tree_max_idx) - 1;
> +		rcu_read_lock();
> +		xas_for_each(&xas, entry, max_idx) {
> +			if (xas_retry(&xas, entry))
> +				continue;
> +			i++;
> +		}
> +		rcu_read_unlock();
> +		/*
> +		 * If xas_for_each exits because entry is NULL and
> +		 * the number of entries checked are less then max idx,
> +		 * then zswap does not contain the entire folio.
> +		 */
> +		if (!entry && offset + i <= max_idx)
> +			return false;
> +	}
> +
> +	return true;
> +}
> +
>  bool zswap_load(struct folio *folio)
>  {
>  	swp_entry_t swp = folio->swap;
> --
> 2.43.5
>
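
To make the bitmap question concrete, here is a rough userspace sketch
of a per-device presence map with a range test. It is only an
illustration under assumptions: every name in it (struct present_map,
present_map_set(), present_map_test_range(), ...) is hypothetical, it is
not the old frontswap code, and a kernel version would presumably use
the kernel's own bitmap helpers and need to think about concurrency.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical per-swap-device presence map: one bit per swap slot. */
struct present_map {
	unsigned long *words;
	size_t nr_slots;
};

/* Allocate a zeroed map covering nr_slots swap slots. */
static inline bool present_map_init(struct present_map *map, size_t nr_slots)
{
	size_t nr_words = (nr_slots + BITS_PER_WORD - 1) / BITS_PER_WORD;

	map->words = calloc(nr_words ? nr_words : 1, sizeof(unsigned long));
	map->nr_slots = nr_slots;
	return map->words != NULL;
}

static inline void present_map_destroy(struct present_map *map)
{
	free(map->words);
	map->words = NULL;
	map->nr_slots = 0;
}

/* Assumed to be called once a store for this slot has succeeded. */
static inline void present_map_set(struct present_map *map, size_t offset)
{
	assert(offset < map->nr_slots);
	map->words[offset / BITS_PER_WORD] |= 1UL << (offset % BITS_PER_WORD);
}

/* Assumed to be called when the slot is invalidated or loaded. */
static inline void present_map_clear(struct present_map *map, size_t offset)
{
	assert(offset < map->nr_slots);
	map->words[offset / BITS_PER_WORD] &= ~(1UL << (offset % BITS_PER_WORD));
}

/*
 * Returns true only if all nr_pages slots starting at offset are
 * marked present. An empty or out-of-range request returns false.
 */
static inline bool present_map_test_range(const struct present_map *map,
					  size_t offset, size_t nr_pages)
{
	size_t i;

	if (nr_pages == 0 || offset >= map->nr_slots ||
	    nr_pages > map->nr_slots - offset)
		return false;

	for (i = offset; i < offset + nr_pages; i++) {
		if (!(map->words[i / BITS_PER_WORD] &
		      (1UL << (i % BITS_PER_WORD))))
			return false;
	}
	return true;
}
```

The range test here is a plain bit-by-bit loop with no tree walk and no
RCU section, which is the trade-off being asked about: the cost is one
bit of memory per swap slot, kept in sync on store and invalidate.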