From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6C59DC7EE29 for ; Fri, 9 Jun 2023 11:02:28 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id BEF5F8E0002; Fri, 9 Jun 2023 07:02:27 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B9FA18E0001; Fri, 9 Jun 2023 07:02:27 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A3F918E0002; Fri, 9 Jun 2023 07:02:27 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 91E808E0001 for ; Fri, 9 Jun 2023 07:02:27 -0400 (EDT) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 4A40C801E8 for ; Fri, 9 Jun 2023 11:02:27 +0000 (UTC) X-FDA: 80882920734.30.551F740 Received: from mail-ej1-f48.google.com (mail-ej1-f48.google.com [209.85.218.48]) by imf21.hostedemail.com (Postfix) with ESMTP id 514761C001F for ; Fri, 9 Jun 2023 11:02:24 +0000 (UTC) Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b=Dq7n6qrn; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf21.hostedemail.com: domain of yosryahmed@google.com designates 209.85.218.48 as permitted sender) smtp.mailfrom=yosryahmed@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1686308545; a=rsa-sha256; cv=none; b=7hJldicK3rSl3TUhbNpVmgGqOnHVh5UkfyOWc/pcvWL9ohXbrgIbnk9sI67+HofHURaPpp muHvEDUsSdVb6WwJoOitynFFqvKYqysLMAgblpdoKppC58DKwvtMVk1GK0EHHNebXhIMUq Qi1SGO8QYb0ZRTWIBdJjA4QYcpXm0u4= ARC-Authentication-Results: i=1; imf21.hostedemail.com; dkim=pass header.d=google.com header.s=20221208 header.b=Dq7n6qrn; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf21.hostedemail.com: domain of yosryahmed@google.com designates 209.85.218.48 as permitted sender) smtp.mailfrom=yosryahmed@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1686308545; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=xyEH71FGKoeq2QeMyLJVASNNz7vp30E/6Q71Nq3qif8=; b=bh56bUbbIW+u4/8Oom0Ef0LiVO+X3SCHhwG7pviZWGDcKQVBn6lKWzLW9Gkp+XijHORjUq 691+tDiLuPt2svfqGTk6EZM/wLDs80m7DnqcepL2NoHATuZvQ46M2SRWmLohdF+oDEcLWT ayAMfNCbyr1Qjs6StdciawKkqsCpbsU= Received: by mail-ej1-f48.google.com with SMTP id a640c23a62f3a-974638ed5c5so357538766b.1 for ; Fri, 09 Jun 2023 04:02:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1686308543; x=1688900543; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=xyEH71FGKoeq2QeMyLJVASNNz7vp30E/6Q71Nq3qif8=; b=Dq7n6qrnyonq0tToLe7fF5XfEJrj3QSmMpa/pAJqLUxWoJgE5T3RKCg8S9aQBU7v26 mYq9uMmpPdLz9GH8CN5dAmrGTuTZ/5wsps62F/H3+bxTAn5QAlz3lixeQN2DsKlqLzEs EdNCYQZgrI0Mv3y9SwtDIFukBVD8VE8F2HkNB8yZpE6GGI1JPJw6ptQuvrG7E3Mw8IQg C4N4trb7MwunZK6vWCEwqPTff1pfuHlu1yoWqN1u+JIrGS2INR2h1Grn8CgVpMHLRG0R 0cjS0cEVM2OIVFSf4ILyvsHGIeQS/lHhGypri5ua915kaltgD9ZPqlh9Z0ZO+9jl6PGz kdvQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1686308543; x=1688900543; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=xyEH71FGKoeq2QeMyLJVASNNz7vp30E/6Q71Nq3qif8=; b=OqfJFvhfOZB0ST1RXt19wPuMHagc2EjyIQJxfAFr0ZaEzf89RlxjWyapQP8EOMtKp2 sS47rQBQOiers6QknNZxPNcf7zc7rwIL8Lqg3bOpz42ep5XVj2U5V7H314eKTQmbGGmI erkeuV9zYbidGmDZ3FhK4oTnUcuBbl+mi25rTNhagcoxRDrhBHMxVvhedL8nGJxIajKC ixFEKw8+GFnkFSg8N+NL9ee5K3y9YsAz9N4ArgA3E0ubhYj27YsuLKLivlnvwNPxUDzC P2ms5Euj0RBw6fQMFdRiL1lqnBRhHZRhNVPHKuCfYtGlpb+gHrv0+FQP7GbXONvq4Yih CYsQ== X-Gm-Message-State: AC+VfDxNTYc1BEIBcAUZLxijIv2BhGGDz8kaPQxTNxrrli3SLi0y8bJJ jh/cewyl/xC5DDs3zthgI3IkjywNbpJgqI3ntrfLyw== X-Google-Smtp-Source: ACHHUZ64rbGj3qg3jaP564y8zxkwh/GmmvZP+ZFR4ihQjXm0UMUkgir6gninXHxyVvjOEhSgMntb0nBctvqnPOEidTo= X-Received: by 2002:a17:907:25ca:b0:971:55e2:82c3 with SMTP id ae10-20020a17090725ca00b0097155e282c3mr1078353ejc.20.1686308543224; Fri, 09 Jun 2023 04:02:23 -0700 (PDT) MIME-Version: 1.0 References: <20230606145611.704392-1-cerasuolodomenico@gmail.com> <20230606145611.704392-7-cerasuolodomenico@gmail.com> In-Reply-To: From: Yosry Ahmed Date: Fri, 9 Jun 2023 04:01:46 -0700 Message-ID: Subject: Re: [RFC PATCH v2 6/7] mm: zswap: simplify writeback function To: Domenico Cerasuolo Cc: vitaly.wool@konsulko.com, minchan@kernel.org, senozhatsky@chromium.org, linux-mm@kvack.org, ddstreet@ieee.org, sjenning@redhat.com, nphamcs@gmail.com, hannes@cmpxchg.org, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, kernel-team@meta.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 514761C001F X-Stat-Signature: yd9as1injrqpcrmtx5p9jw83rha5p5f4 X-HE-Tag: 1686308544-711797 X-HE-Meta: U2FsdGVkX18rTHAcFp8poCJhYiC+5oTiGyK7EFxj4amylYQVInz42pVWGlmfhhy290ZBN+wCZZuOEZ/GlL4LDz/51XyPBg8BLgbFAuC9UtBSlit7ZSDXWLq5tpvBLPYwEmyHQ75rFFwCDut78rbSSAMJxjWyiY/qyMSFC6W5dymCP7WB6i36Oe/nFkgv65Dh+zilx2bi1PpBvzMHSa7/9BAJ64BGIjEqfNOwxAIOeeq6iXO/xGdsk2XLPgmJwlT52V0VnjMQ6OteCyqvAeEJaKqYIy9XjXZC4N3C9NmWElQqOG2kDVRNqMYLKqQmvA4h7Sv/ZiuRh1ySEojd7pCiJmAH1pWLwY59tXb03/An2vHkgiJ8a8V14rpCdPuAXJRrNrJ8alQevjegUKXInEvVWCENuXxezMf4nvftR98Jyw5l6pwPYczi5qab3EUnzgLw481B7UwA+fF+ytv5qb2C6E6a4mCMrRwOMKW3TtmH9UpLNH6b+XsPYbuUQF07c9Xta6vMlbjcpcZSdzxbs/dG7kBrs0sMCkDyDUMGtBF4tyY7Z7oiltoCgz5aDUaY2/xrJpLiXijQjVeQWu7BQUelmQkzS7gPQBNd7DONy06WPrtXlfTvnwk0IkOwXKuvNYIBTSTnW6tPN3vdswORfEs4U5tdjngYCb2XT1loP9MuOZY4TCfV/L5STGJ4Lwumgie5jjEThJBlfHwGJ66hAY36Ur6OhdOi0A6/RRNuFAjYW1lUqBCLoMsVNCaGVojDAtbGXP6/hztR3mhX0M0DV3vVt7x1X7yPt/fvSgZJxr7sYUjPwn+eEU2gGMx7tAtsOBmnZ87dxyOjnkQfZOQOunZjpCFGwZlgYUmHMWjZxIQPMNp3+ISKF1Hiz6a3pn5MTT7IxP2D7329d9iH1mJcI6FEf9Oj0AxEWMhTdzJ4n/qZTWbPOne0hEn8tM3o/uFCdkZYyhCrrF0rn3qJqeghjY+ r/zzT2rb 1kgTQSiNklHXdTOpb4j0zShxCv0Q9WkDrLxl6aQfvfWdY+6OWFEcFo6BQx0hmpQCPoT6vvsKf2klLnki+QA9T8xMi/fvRtnXL6w6M5Uz3lFjoBL+Fcbhl/4yLyeNfueNcO5PNG7+5/9JL15U0RyF2CPNj8F5uDQp3eIymkcSnywscz8ECeHI5pFEoZoVnmb47afqIui4U60Y1bdGerUJu0c+EqOlv3QuK90VbAjF8oSNA+Xhq7XkJmQ3/krjYpCj0RVxWUYm2UzP6HMa/gXI4odR/Pg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Jun 9, 2023 at 3:23=E2=80=AFAM Domenico Cerasuolo wrote: > > On Wed, Jun 7, 2023 at 11:27=E2=80=AFAM Yosry Ahmed wrote: > > > > On Tue, Jun 6, 2023 at 7:56=E2=80=AFAM Domenico Cerasuolo > > wrote: > > > > > > Previously, in zswap, the writeback function was passed to zpool driv= ers > > > for their usage in calling the writeback operation. However, since th= e > > > drivers did not possess references to entries and the function was > > > specifically designed to work with handles, the writeback process has > > > been modified to occur directly within zswap. Consequently, this chan= ge > > > allows for some simplification of the writeback function, taking into > > > account the updated workflow. > > > > > > Signed-off-by: Domenico Cerasuolo > > > --- > > > mm/zswap.c | 69 ++++++++++++++--------------------------------------= -- > > > 1 file changed, 17 insertions(+), 52 deletions(-) > > > > > > diff --git a/mm/zswap.c b/mm/zswap.c > > > index 2831bf56b168..ef8604812352 100644 > > > --- a/mm/zswap.c > > > +++ b/mm/zswap.c > > > @@ -250,7 +250,8 @@ static bool zswap_has_pool; > > > pr_debug("%s pool %s/%s\n", msg, (p)->tfm_name, \ > > > zpool_get_type((p)->zpool)) > > > > > > -static int zswap_writeback_entry(struct zpool *pool, unsigned long h= andle); > > > +static int zswap_writeback_entry(struct zswap_entry *entry, struct z= swap_header *zhdr, > > > + struct zswap_tree *tree); > > > static int zswap_pool_get(struct zswap_pool *pool); > > > static void zswap_pool_put(struct zswap_pool *pool); > > > > > > @@ -632,15 +633,21 @@ static int zswap_shrink(struct zswap_pool *pool= ) > > > } > > > spin_unlock(&tree->lock); > > > > > > - ret =3D zswap_writeback_entry(pool->zpool, lru_entry->handle)= ; > > > + ret =3D zswap_writeback_entry(lru_entry, zhdr, tree); > > > > > > spin_lock(&tree->lock); > > > if (ret) { > > > spin_lock(&pool->lru_lock); > > > list_move(&lru_entry->lru, &pool->lru); > > > spin_unlock(&pool->lru_lock); > > > + zswap_entry_put(tree, tree_entry); > > > + } else { > > > + /* free the local reference */ > > > + zswap_entry_put(tree, tree_entry); > > > + /* free the entry if it's not been invalidated*/ > > > + if (lru_entry =3D=3D zswap_rb_search(&tree->rbroot, s= wpoffset)) > > > + zswap_entry_put(tree, tree_entry); > > > > The comment that was here about the 2 possible cases was useful imo, > > maybe keep it? > > I honestly found it brittle in that we're not interested in the refcount = there, > but rather in releasing the base reference that keeps the entry valid. > There's not way to know which refcount value it should be though, conside= r > that throughout this series the values can be 1 or 0, but also 2 or 1, > depending on where this function is called (zpool_shrink or zswap_shrink)= . Yeah looking at it with fresh eyes makes sense, we want to invalidate the entry, not really caring about the refcount (see below). > > > > > Also, I am not sure why we need to do a tree search vs. just reading > > the refcount here before the first put. We can even make > > zswap_entry_put() return the refcount after the put to know if we need > > the additional put or not. > > > > Can anyone think of any reason why we need to explicitly search the tre= e here? > > I think that the reasoning here is that refcount > 0 doesn't necessarily = mean > that the entry is on the tree. I'm not sure if reading refcount directly = here > would cause an issue now, probably not, but I wonder if bugs could be > introduced if the context in which this function is called changes. Yeah I agree. As Johannes suggested, using zswap_invalidate_entry() (from my exclusive loads patch in mm-unstable) would be best here imo. > > > > > > } > > > - zswap_entry_put(tree, tree_entry); > > > spin_unlock(&tree->lock); > > > > > > return ret ? -EAGAIN : 0; > > > @@ -1039,16 +1046,14 @@ static int zswap_get_swap_cache_page(swp_entr= y_t entry, > > > * the swap cache, the compressed version stored by zswap can be > > > * freed. > > > */ > > > -static int zswap_writeback_entry(struct zpool *pool, unsigned long h= andle) > > > +static int zswap_writeback_entry(struct zswap_entry *entry, struct z= swap_header *zhdr, > > > + struct zswap_tree *tree) > > > { > > > - struct zswap_header *zhdr; > > > - swp_entry_t swpentry; > > > - struct zswap_tree *tree; > > > - pgoff_t offset; > > > - struct zswap_entry *entry; > > > + swp_entry_t swpentry =3D zhdr->swpentry; > > > struct page *page; > > > struct scatterlist input, output; > > > struct crypto_acomp_ctx *acomp_ctx; > > > + struct zpool *pool =3D entry->pool->zpool; > > > > > > u8 *src, *tmp =3D NULL; > > > unsigned int dlen; > > > @@ -1063,25 +1068,6 @@ static int zswap_writeback_entry(struct zpool = *pool, unsigned long handle) > > > return -ENOMEM; > > > } > > > > > > - /* extract swpentry from data */ > > > - zhdr =3D zpool_map_handle(pool, handle, ZPOOL_MM_RO); > > > - swpentry =3D zhdr->swpentry; /* here */ > > > - tree =3D zswap_trees[swp_type(swpentry)]; > > > - offset =3D swp_offset(swpentry); > > > - zpool_unmap_handle(pool, handle); > > > - > > > - /* find and ref zswap entry */ > > > - spin_lock(&tree->lock); > > > - entry =3D zswap_entry_find_get(&tree->rbroot, offset); > > > - if (!entry) { > > > - /* entry was invalidated */ > > > - spin_unlock(&tree->lock); > > > - kfree(tmp); > > > - return 0; > > > - } > > > - spin_unlock(&tree->lock); > > > - BUG_ON(offset !=3D entry->offset); > > > - > > > /* try to allocate swap cache page */ > > > switch (zswap_get_swap_cache_page(swpentry, &page)) { > > > case ZSWAP_SWAPCACHE_FAIL: /* no memory or invalidate happene= d */ > > > @@ -1115,12 +1101,12 @@ static int zswap_writeback_entry(struct zpool= *pool, unsigned long handle) > > > acomp_ctx =3D raw_cpu_ptr(entry->pool->acomp_ctx); > > > dlen =3D PAGE_SIZE; > > > > > > - zhdr =3D zpool_map_handle(pool, handle, ZPOOL_MM_RO); > > > + zhdr =3D zpool_map_handle(pool, entry->handle, ZPOOL_= MM_RO); > > > src =3D (u8 *)zhdr + sizeof(struct zswap_header); > > > if (!zpool_can_sleep_mapped(pool)) { > > > memcpy(tmp, src, entry->length); > > > src =3D tmp; > > > - zpool_unmap_handle(pool, handle); > > > + zpool_unmap_handle(pool, entry->handle); > > > } > > > > > > mutex_lock(acomp_ctx->mutex); > > > @@ -1135,7 +1121,7 @@ static int zswap_writeback_entry(struct zpool *= pool, unsigned long handle) > > > if (!zpool_can_sleep_mapped(pool)) > > > kfree(tmp); > > > else > > > - zpool_unmap_handle(pool, handle); > > > + zpool_unmap_handle(pool, entry->handle); > > > > > > BUG_ON(ret); > > > BUG_ON(dlen !=3D PAGE_SIZE); > > > @@ -1152,23 +1138,7 @@ static int zswap_writeback_entry(struct zpool = *pool, unsigned long handle) > > > put_page(page); > > > zswap_written_back_pages++; > > > > > > - spin_lock(&tree->lock); > > > - /* drop local reference */ > > > - zswap_entry_put(tree, entry); > > > - > > > - /* > > > - * There are two possible situations for entry here: > > > - * (1) refcount is 1(normal case), entry is valid and on the = tree > > > - * (2) refcount is 0, entry is freed and not on the tree > > > - * because invalidate happened during writeback > > > - * search the tree and free the entry if find entry > > > - */ > > > - if (entry =3D=3D zswap_rb_search(&tree->rbroot, offset)) > > > - zswap_entry_put(tree, entry); > > > - spin_unlock(&tree->lock); > > > - > > > return ret; > > > - > > > fail: > > > if (!zpool_can_sleep_mapped(pool)) > > > kfree(tmp); > > > @@ -1177,13 +1147,8 @@ static int zswap_writeback_entry(struct zpool = *pool, unsigned long handle) > > > * if we get here due to ZSWAP_SWAPCACHE_EXIST > > > * a load may be happening concurrently. > > > * it is safe and okay to not free the entry. > > > - * if we free the entry in the following put > > > * it is also okay to return !0 > > > */ > > > - spin_lock(&tree->lock); > > > - zswap_entry_put(tree, entry); > > > - spin_unlock(&tree->lock); > > > - > > > return ret; > > > } > > > > > > -- > > > 2.34.1 > > >