From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B76C6C4725D for ; Fri, 19 Jan 2024 19:37:04 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 32F596B0071; Fri, 19 Jan 2024 14:37:04 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 2DEC86B0078; Fri, 19 Jan 2024 14:37:04 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1CDFA6B007D; Fri, 19 Jan 2024 14:37:04 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 0E7C06B0071 for ; Fri, 19 Jan 2024 14:37:04 -0500 (EST) Received: from smtpin25.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id B3A73140AF6 for ; Fri, 19 Jan 2024 19:37:03 +0000 (UTC) X-FDA: 81697068726.25.5138315 Received: from mail-ej1-f42.google.com (mail-ej1-f42.google.com [209.85.218.42]) by imf20.hostedemail.com (Postfix) with ESMTP id BB4761C0019 for ; Fri, 19 Jan 2024 19:37:01 +0000 (UTC) Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=fHoeOoqF; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf20.hostedemail.com: domain of yosryahmed@google.com designates 209.85.218.42 as permitted sender) smtp.mailfrom=yosryahmed@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1705693021; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Fjm7bz8eyCbh/wR7R5WfPWiMGpqtxISCBuFNemwrIvk=; b=S1qksBv12YFMi/iZljoFTJnlQiTHa9dG5lB08E4J+1wWLN95rQ9zJcM1ZKKnSQMcgC6/6o Gms+2l9++vPs57J7dQWFYVha4STTIrDkMdjg775nTUJLXaT8i9ejNEU0g+mgzY/2MZuTtu bUYRaayaUjRJhtNXU900ujLfkuAuI+4= ARC-Authentication-Results: i=1; imf20.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=fHoeOoqF; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf20.hostedemail.com: domain of yosryahmed@google.com designates 209.85.218.42 as permitted sender) smtp.mailfrom=yosryahmed@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1705693021; a=rsa-sha256; cv=none; b=KRP4gssI0NYX/UCH8DaLoTI89dyaP8X3Mun9QO8ymB/CN5yXlJJDdj3iFCO8TfM6m4GAQY RAMV5YySVN1ven6A6c7IB+5lWcUHAcPzFDmhS3j+8jankuIqMW1oN8fuyFMXq7GRDUQYPi O4xft6EVeMftPDQUuug+Am6cpagAauM= Received: by mail-ej1-f42.google.com with SMTP id a640c23a62f3a-a28cc85e6b5so135473166b.1 for ; Fri, 19 Jan 2024 11:37:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1705693020; x=1706297820; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Fjm7bz8eyCbh/wR7R5WfPWiMGpqtxISCBuFNemwrIvk=; b=fHoeOoqFawKABH1kvnPNda4GgYAlfFGGf+nLGy6rKGYW/HLOYXUsdpwJ/IKh7/jR2b VNpXtbyQ/avb+cRNJ6ICmiIrRPQGhv3E9s1DBNMCum/vrOhvsMwliIR/HgtnZzLqml2C tX+N4HjVcQxqqcz0GwCjgnCgpMduWg4G37iTcHMwhs1GXipq8c/eI7I8UQ54Z46Y4ao2 2nBI/bFkfbcdVMvk6fmYaqeHMwW/Ymm18Krt+3jltSbljXgnvNKdPw6UpoAHyIQ8qt6K BnMdT75qFMpNtMtgSFLj7BNVoeTmLiIdpxmodBCTmp2kyTKSRce5TkCa492lKzUdcym3 rNZw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1705693020; x=1706297820; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Fjm7bz8eyCbh/wR7R5WfPWiMGpqtxISCBuFNemwrIvk=; b=twBKK5R2Mk81eU8wAzlRrhS5dzMzTXV9jIxzRXpmch5bE5ZDWd3Yv1eqWfRXqeZ/vj IORk4pPiXuzpQPmbetPn8SDUHnCAAp8xwxSLh54ipYbN8HwFjz1uBYRykVQEdZMMVSpC CJDO7qOk3cUwXv6AJQMTh5u9b10sMEq1iTNm/qyy62a1AQmC8kMRNnpg0Ec7Skw1ChpQ ipaV6jIbdr5HamuK28WJetJqYg1soMYt4uJOXfKfEp0EOc134o0z8jlPCfqbL5mZWEmv 70rMbyASZSRP3Li5LlMBMxlpMYnWtlvG5UaMIgaM3xF0mSxMMcQh9OHQQpUvhy7rO6lO DYxg== X-Gm-Message-State: AOJu0YzWxtQLZfj10fofIBBzImcztwAwgJlSxOJITUhzriLcu0xaC0Yv GgrkK+4fw4JX73fgYY192UJFRixHdvWqrA8FMpyUUYaVRpgN22ggadO6iIM98dRAFtlfLVCDt2F 4MVCLBqgj+J/4RKIOEltYHgW5ltQd4XJjPwSy X-Google-Smtp-Source: AGHT+IHlA5zVgUxC+rQm8QL6dfNwRIpkh+yS0I576RXhxjY0UJ1CeyWSKEOjLYFF863tFrTGrT/FV1yClx7LbPAOJkg= X-Received: by 2002:a17:906:b741:b0:a2e:893f:7333 with SMTP id fx1-20020a170906b74100b00a2e893f7333mr111511ejb.5.1705693020064; Fri, 19 Jan 2024 11:37:00 -0800 (PST) MIME-Version: 1.0 References: <20240117-zswap-xarray-v1-0-6daa86c08fae@kernel.org> <20240117-zswap-xarray-v1-2-6daa86c08fae@kernel.org> In-Reply-To: From: Yosry Ahmed Date: Fri, 19 Jan 2024 11:36:23 -0800 Message-ID: Subject: Re: [PATCH 2/2] mm: zswap.c: remove RB tree To: Chris Li Cc: Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, =?UTF-8?B?V2VpIFh177+8?= , Yu Zhao , Greg Thelen , Chun-Tse Shao , =?UTF-8?Q?Suren_Baghdasaryan=EF=BF=BC?= , Brain Geffon , Minchan Kim , Michal Hocko , Mel Gorman , Huang Ying , Nhat Pham , Johannes Weiner , Kairui Song , Zhongkun He , Kemeng Shi , Barry Song , "Matthew Wilcox (Oracle)" , "Liam R. Howlett" , Joel Fernandes , Chengming Zhou Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: BB4761C0019 X-Rspam-User: X-Rspamd-Server: rspam04 X-Stat-Signature: oyhfyr1exqdnjusxt9wsofe64ozug8m7 X-HE-Tag: 1705693021-384505 X-HE-Meta: U2FsdGVkX18cRaF+EhPe1pr+zpQkTErmbsJJ9ARbWUDZ3ZeZ3SUvfwOW3JuXQh7nddueA3y57eIY9W2Nk6hkrR6dxAL9hlKhB6DPO3QJqTaTCAv33MmqL3ERxu6b/7+twzEvxscofnjryVRkIRcD7W3F7Xv0ASAes0aSJRdGfjM+OQ+ywFak01prGG3Y8k/TDmyto+I5fiTtRID1w98gTXWOnOFGDnrVn75FUWRrASLeqHGIPMzcaGVAvCS1icw3/gQqq8aWuajvIIYGkweRUfFwaN7RGZeUdXGv9DJ4icQTq2ol4sNxyJl/Ft8YSdlWp/JDlkFVyKjohnccqPJq97bZIiwATdOAzfgMSNO/d0z7JPvkvuXzOUYlRJahBTEPHDR4dfZm3v7IHaAJvuAUhIl2lmtJOJ13ryrZj/URc5X8uaSeri+7zM9HcwANdcifnbRkkztF0dbNdJjXsoO9wd1HDUS0gJv52Fn4iDW5teJMOqFaulYxzfUXEdh1fI0j1TKzitxDPqSRhxy9JphLgLFl9VcC0eROw0lBY5esu220JgsN87ZGyKp1NgX1RXDS8m1JXWhKXXEx5I+U1FocSf8quHvGCG9ulX3d2wvzc4SzMoQCJqO66MxBnBcBF8fl0QUGe9y5v7cAF+MiP6k3dUzQInGyywr4WaxXP6JRpLcgZB0LntAMiZ4W12xC8mTSWD9XUFlbcxPikHlhGEOYXS2UimNGBgfB13FTOV8JKxW2c9856r96ki5IFN3+NB9AFrU7sBRV7Uey2pbT5HlogOwnctlnchfG2LvLw7Abw7n/Qk6hWdyqUQA732A5cxMweTqvk0lCB2uXm8bHYCAXOsxfN5Wh6jfMFU2AJPS+hWP0iacl7PQHU9fRJ8xZBPZuqHCTiOswEtAOgYflM00YXaYppGbtmt9TTbn3pt+tLaf7Q8aASrgEddJdpOdex5Uzt4fr3nZcTrHF2/SJWai 6AIpZI8l Ur+SxQzuZdCMVwICdp6g9BrfXxgSo3w5z8tusJt2Xx1SogYu/BunJ7uKndKcCO1Jw1SqtQD6G4HRcHi088soHrb66wIFb+SCGPTNObX9NWzcGSI1YXkMN0fArDoIWUICRDMZJJKv1U2iyaeGxZMlOS3Y9X7bw4R8NuRDtphKlU1bNMveSEv3TufBIkDqsVVZ+SjhFjVp8fCVwPn0Fy7LrUWSyOijCXd/lH0cLcJN6nARUhDGoUBNmwQKUgcCeFiT/bsd6AZny0yMNBNPT3TCI6sC8bwp0osOQWbDMlRxKGtDP9pLDkrsP2p9h+g== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Jan 18, 2024 at 9:43=E2=80=AFPM Chris Li wrote: > > Hi Yosry, > > On Wed, Jan 17, 2024 at 10:35=E2=80=AFPM Yosry Ahmed wrote: > > > > > @@ -493,45 +471,47 @@ static struct zswap_entry *zswap_search(struct = zswap_tree *tree, pgoff_t offset) > > > static int zswap_insert(struct zswap_tree *tree, struct zswap_entry = *entry, > > > struct zswap_entry **dupentry) > > > { > > > - struct rb_root *root =3D &tree->rbroot; > > > - struct rb_node **link =3D &root->rb_node, *parent =3D NULL; > > > - struct zswap_entry *myentry, *old; > > > - pgoff_t myentry_offset, entry_offset =3D swp_offset(entry->sw= pentry); > > > - > > > - > > > - while (*link) { > > > - parent =3D *link; > > > - myentry =3D rb_entry(parent, struct zswap_entry, rbno= de); > > > - myentry_offset =3D swp_offset(myentry->swpentry); > > > - if (myentry_offset > entry_offset) > > > - link =3D &(*link)->rb_left; > > > - else if (myentry_offset < entry_offset) > > > - link =3D &(*link)->rb_right; > > > - else { > > > - old =3D xa_load(&tree->xarray, entry_offset); > > > - BUG_ON(old !=3D myentry); > > > - *dupentry =3D myentry; > > > + struct zswap_entry *e; > > > + pgoff_t offset =3D swp_offset(entry->swpentry); > > > + XA_STATE(xas, &tree->xarray, offset); > > > + > > > + do { > > > + xas_lock_irq(&xas); > > > + do { > > > + e =3D xas_load(&xas); > > > + if (xa_is_zero(e)) > > > + e =3D NULL; > > > + } while (xas_retry(&xas, e)); > > > + if (xas_valid(&xas) && e) { > > > + xas_unlock_irq(&xas); > > > + *dupentry =3D e; > > > return -EEXIST; > > > } > > > - } > > > - rb_link_node(&entry->rbnode, parent, link); > > > - rb_insert_color(&entry->rbnode, root); > > > - old =3D xa_store(&tree->xarray, entry_offset, entry, GFP_KERN= EL); > > > - return 0; > > > + xas_store(&xas, entry); > > > + xas_unlock_irq(&xas); > > > + } while (xas_nomem(&xas, GFP_KERNEL)); > > > + return xas_error(&xas); > > > > I think using the xas_* APIs can be avoided here. The only reason we > > need it is that we want to check if there's an existing entry first, > > and return -EEXIST. However, in that case, the caller will replace it > > anyway (and do some operations on the dupentry): > > We might be able to for the insert case if we don't mind changing the > code behavior a bit. My original intent is to keep close to the > original zswap code and not stir the pot too much for the xarray > replacement. We can always make more adjustment once the RB tree is > gone. I don't see how this changes code behavior though. The current code in zswap_store() will do the following: - Hold the tree lock to make sure no one else modifies it. - Try to insert, check if there is already a dupentry at the index and return -EEXIST. - Warn, increment zswap_duplicate_entry, and invalidate the dupentry. - Try to insert again (this should always succeed since we are holding the lock). What I am proposing is: - zswap_xa_insert() is a thin wrapper around xa_store() (or we can remove it completely). - zswap_store() does the following: - Use zswap_xa_insert() and check if there is a returned dupentry. - Warn, increment zswap_duplicate_entry, and invalidate the dupentry. Either way, we always place the entry we have in the tree, and if there is a dupentry we warn and invalidate it. If anything, the latter is more straightforward. Am I missing something? > > > > > } > > > > > > static bool zswap_erase(struct zswap_tree *tree, struct zswap_entry = *entry) > > > { > > > + struct zswap_entry *e; > > > pgoff_t offset =3D swp_offset(entry->swpentry); > > > - if (!RB_EMPTY_NODE(&entry->rbnode)) { > > > - struct zswap_entry *old; > > > - old =3D xa_erase(&tree->xarray, offset); > > > - BUG_ON(old !=3D entry); > > > - rb_erase(&entry->rbnode, &tree->rbroot); > > > - RB_CLEAR_NODE(&entry->rbnode); > > > - return true; > > > - } > > > - return false; > > > + XA_STATE(xas, &tree->xarray, offset); > > > + > > > + do { > > > + xas_lock_irq(&xas); > > > + do { > > > + e =3D xas_load(&xas); > > > + } while (xas_retry(&xas, e)); > > > + if (xas_valid(&xas) && e !=3D entry) { > > > + xas_unlock_irq(&xas); > > > + return false; > > > + } > > > + xas_store(&xas, NULL); > > > + xas_unlock_irq(&xas); > > > + } while (xas_nomem(&xas, GFP_KERNEL)); > > > + return !xas_error(&xas); > > > } > > > > Same here, I think we just want: > > > > return !!xa_erase(..); > > For the erase case it is tricky. > The current zswap code does not erase an entry if the tree entry at > the same offset has been changed. It should be fine if the new entry > is NULL. Basically some race to remove the entry already. However, if > the entry is not NULL, then force resetting it to NULL will change > behavior compared to the current. I see, very good point. I think we can use xa_cmpxchg() and pass in NULL?