From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A2DFAC4725D for ; Fri, 19 Jan 2024 05:43:56 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 09E546B0081; Fri, 19 Jan 2024 00:43:56 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 04E816B0082; Fri, 19 Jan 2024 00:43:55 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E59016B0085; Fri, 19 Jan 2024 00:43:55 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id D04106B0081 for ; Fri, 19 Jan 2024 00:43:55 -0500 (EST) Received: from smtpin17.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id A632B1406DA for ; Fri, 19 Jan 2024 05:43:55 +0000 (UTC) X-FDA: 81694969230.17.3F3BA96 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf17.hostedemail.com (Postfix) with ESMTP id B5C9C40006 for ; Fri, 19 Jan 2024 05:43:53 +0000 (UTC) Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=odXMvX6w; dmarc=pass (policy=none) header.from=kernel.org; spf=pass (imf17.hostedemail.com: domain of chrisl@kernel.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=chrisl@kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1705643033; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=UNKfzHKuYwhIdeUTXKHH67vvVoK95bfgatyBXjqZrxI=; b=DBKRKOqXfWTsAFD20dOYl4ozX2kmxTiurDGup9D+4x+P26pHrec1kl1cugV9YzphFDfWhr F2A3pdLRfXwIrYuY/8e8sY5Vn37HMGAu01zBPm2You008hc11TD3R8PX9y6ehD9g2JsUJU YPC0/Y/a79RDFLm8TBTIkvhwtpmDghc= ARC-Authentication-Results: i=1; imf17.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=odXMvX6w; dmarc=pass (policy=none) header.from=kernel.org; spf=pass (imf17.hostedemail.com: domain of chrisl@kernel.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=chrisl@kernel.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1705643033; a=rsa-sha256; cv=none; b=cme37dAy2a+324bGgEbfu142Yyy2nQ4OOaz9J+i/mxMS7QnAOWD+tNBFbzbOtemJcrKVUN WB1Up5EGliZ2OfWYn83Wwou04r+xbaJYP2BykJxqXPerFI+dxdTGsBxqvYk+dov0snUpvz sutO8MbbXu/9fVtgV7qtYnfZsF8peeM= Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by dfw.source.kernel.org (Postfix) with ESMTP id CE656618F9 for ; Fri, 19 Jan 2024 05:43:52 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E5BF7C43143 for ; Fri, 19 Jan 2024 05:43:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1705643031; bh=dF/UBBPC6CKheSkANeKfd36eVsULSzpHsfBlOvbrzjk=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=odXMvX6w1T/92K37nvRY6QuX7ljVA4Uc5cddTFzxyUva5TsDxwHDxozSwz1nwdwVM 09AeqhEwLIg8D4DFqRWx8ouiQMfegqHdCE97FWKNI7FdEiv93XvFwxusmdssfmBKq7 6i7rBzH5B4652Cmj8OBa2sJ+27t6z8i/zJz5hq0a7O+yx6YYRtgmrAwcCKVwf3NU5P nWnA2nq5eAVHDfAe6Ha91FKaVYsTF7dT1/h625tfzOuOSuviTtkVBaNdS4T/H9/6Sy G/80uo0B6zjN8GQnrBh+fQ/Wp8AHGm7b8yB/bwKkSrdUD65hcTw3Syiejh8u8Od9hP LmnYCwRFquk0A== Received: by mail-pg1-f179.google.com with SMTP id 41be03b00d2f7-5ceb3fe708eso218333a12.3 for ; Thu, 18 Jan 2024 21:43:51 -0800 (PST) X-Gm-Message-State: AOJu0Yzbf8HcKDkUslvJOM+15dcK66iNxB9Oixq7xS8vkw04MUZtJwsw w5mFBwfvzHVXTmLMgpxeIVLPRVQLFQp58ZOSY0c8v4jDS/xbiye4trta8XlZ/qhBW35GTte5oss 2WZQNr51uoNWqOr1rCCk4vswA8sCM19PLiVHF X-Google-Smtp-Source: AGHT+IE0f5aXeUkUrkPUKRQL0dlK//zmcRdutVfIYgYmI9ZABAcDjpp68Y3KSElT9U/8HY2rLfvDIKKvt1EgwIEO+94= X-Received: by 2002:a17:90a:eb01:b0:28f:f2b5:3f23 with SMTP id j1-20020a17090aeb0100b0028ff2b53f23mr1742916pjz.35.1705643031256; Thu, 18 Jan 2024 21:43:51 -0800 (PST) MIME-Version: 1.0 References: <20240117-zswap-xarray-v1-0-6daa86c08fae@kernel.org> <20240117-zswap-xarray-v1-2-6daa86c08fae@kernel.org> In-Reply-To: From: Chris Li Date: Thu, 18 Jan 2024 21:43:39 -0800 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH 2/2] mm: zswap.c: remove RB tree To: Yosry Ahmed Cc: Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, =?UTF-8?B?V2VpIFh177+8?= , Yu Zhao , Greg Thelen , Chun-Tse Shao , =?UTF-8?Q?Suren_Baghdasaryan=EF=BF=BC?= , Brain Geffon , Minchan Kim , Michal Hocko , Mel Gorman , Huang Ying , Nhat Pham , Johannes Weiner , Kairui Song , Zhongkun He , Kemeng Shi , Barry Song , "Matthew Wilcox (Oracle)" , "Liam R. Howlett" , Joel Fernandes , Chengming Zhou Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: B5C9C40006 X-Rspam-User: X-Rspamd-Server: rspam04 X-Stat-Signature: 9z1rc6sqdcqp9y6jjn3ckejb4a9zny48 X-HE-Tag: 1705643033-188461 X-HE-Meta: U2FsdGVkX1+kDTEAxkQsZyl84Hd39dtzKLzlWxIUL6Kj3FxSjwiPkmFxms/VQlUGW087u/FzvTx3fDnqePYu+Wm+QHpfTcyiMOryIpBeaRwZDz//NrHksFXhlmsnKPTF4XYz1RM8ruiMwmqpNBdWYMw2AAl6xfMspM/elOlrMo/oqwSErKQ/OiqwjeJb4H8V5Gn780/ioQIQViVs1qse+4kD+joC9YIH6PaKnEsKJjdsvKbXvb5RoaF0qcjwRECKqaYvDwg2zwaRfcifDwOLR2R01tmD4SoS6TRjTozkf5nWvZ8Y1vyZVp0MzccZRPMNElNzwIdPPd6MKIJbr09GRImAeqd4RusxVjRt2do+HvGv3Zdn246iBZdcZrjTZLEswkcst7mKhVcJCMdrLjzCyQWJ9wp/xfmxHJGjra4xC/ObUhAhoAldaQe4TTF+DKrzr/3C1KAg1rKebLM6wmUPwf9BPOcUpLRls/iE1itdCllSSP8lflaf3NOjP5klzrsTo52otbKnNL0LSisQGqr8v8VZ9T/BWQ8DdNsKob7KZsDHGyzLK1EGf1GhZmbg2kEpMCH2Kzr8K/kJkSeyN80kTEnE1lN+MJ2txLm7yR0/7L3TltSbIDx4IkhVNC0S/aTOUbMLFd4Nf6MW4sXjH085SkdVC1n/cAM5eqVtJje5vfjfEERFMYyrptFV8aIMzUdTQTgz+gy1MgEJHb9g5+rx7gl5mg3BY9d+OjoYgCsCboAM3nSd3d7bFNeqT/gj+13pwyLK23qo9H5VH58ryFg0mXFG8ePDmdBDTojy4hp1FORRqrl/OcnCk8BEDNcib9pNy3CeIG5r11SCmL+ugsv0nfVMEJ3sDp7Kw3dZ72S8xmwGuntTW0DB3i7Gm+GIrcbYI0Eer+9AXPpZ0l1aIDQmXV1uH3Pmq8QK66Hrtt/lUb6+m2Hn4M74sEyeqkb63r2kragIcr0SAPM99VC84mj tR66xRHW RNIK7GTSfqTlKpBMjvo/7B+vOP85r2GgmK+GZKKYxt6jOwYjpMRh+F7SUz21lfzdAadwajZs0aW4ep9XAAzP9usFCYLv1NbnYfeh2OG/Is4vAAO/BP0HwLp3xCyQU2KhvHNGljX1d1wucU1ugfQldaqmfcvLK+jZgnZMw46yBb3Wa4rcX8IkobUdcIsq+hMf9jqspQDp4+Ey2Ft6Gl9drLZHL5sxgzMU4rrQAkUbPluHNSKIJ3YqRvcU+tQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Hi Yosry, On Wed, Jan 17, 2024 at 10:35=E2=80=AFPM Yosry Ahmed wrote: > > > @@ -493,45 +471,47 @@ static struct zswap_entry *zswap_search(struct zs= wap_tree *tree, pgoff_t offset) > > static int zswap_insert(struct zswap_tree *tree, struct zswap_entry *e= ntry, > > struct zswap_entry **dupentry) > > { > > - struct rb_root *root =3D &tree->rbroot; > > - struct rb_node **link =3D &root->rb_node, *parent =3D NULL; > > - struct zswap_entry *myentry, *old; > > - pgoff_t myentry_offset, entry_offset =3D swp_offset(entry->swpe= ntry); > > - > > - > > - while (*link) { > > - parent =3D *link; > > - myentry =3D rb_entry(parent, struct zswap_entry, rbnode= ); > > - myentry_offset =3D swp_offset(myentry->swpentry); > > - if (myentry_offset > entry_offset) > > - link =3D &(*link)->rb_left; > > - else if (myentry_offset < entry_offset) > > - link =3D &(*link)->rb_right; > > - else { > > - old =3D xa_load(&tree->xarray, entry_offset); > > - BUG_ON(old !=3D myentry); > > - *dupentry =3D myentry; > > + struct zswap_entry *e; > > + pgoff_t offset =3D swp_offset(entry->swpentry); > > + XA_STATE(xas, &tree->xarray, offset); > > + > > + do { > > + xas_lock_irq(&xas); > > + do { > > + e =3D xas_load(&xas); > > + if (xa_is_zero(e)) > > + e =3D NULL; > > + } while (xas_retry(&xas, e)); > > + if (xas_valid(&xas) && e) { > > + xas_unlock_irq(&xas); > > + *dupentry =3D e; > > return -EEXIST; > > } > > - } > > - rb_link_node(&entry->rbnode, parent, link); > > - rb_insert_color(&entry->rbnode, root); > > - old =3D xa_store(&tree->xarray, entry_offset, entry, GFP_KERNEL= ); > > - return 0; > > + xas_store(&xas, entry); > > + xas_unlock_irq(&xas); > > + } while (xas_nomem(&xas, GFP_KERNEL)); > > + return xas_error(&xas); > > I think using the xas_* APIs can be avoided here. The only reason we > need it is that we want to check if there's an existing entry first, > and return -EEXIST. However, in that case, the caller will replace it > anyway (and do some operations on the dupentry): We might be able to for the insert case if we don't mind changing the code behavior a bit. My original intent is to keep close to the original zswap code and not stir the pot too much for the xarray replacement. We can always make more adjustment once the RB tree is gone. > while (zswap_rb_insert(&tree->rbroot, entry, &dupentry) =3D=3D -EEXIST) { > WARN_ON(1); > zswap_duplicate_entry++; > zswap_invalidate_entry(tree, dupentry); > } > > So I think we can do something like this in zswap_insert() instead: > > dupentry =3D xa_store(..); > if (WARN_ON(dupentry)) { > zswap_duplicate_entry++; > zswap_invalidate_entry(tree, dupentry); > } > > WDYT? > > > } > > > > static bool zswap_erase(struct zswap_tree *tree, struct zswap_entry *e= ntry) > > { > > + struct zswap_entry *e; > > pgoff_t offset =3D swp_offset(entry->swpentry); > > - if (!RB_EMPTY_NODE(&entry->rbnode)) { > > - struct zswap_entry *old; > > - old =3D xa_erase(&tree->xarray, offset); > > - BUG_ON(old !=3D entry); > > - rb_erase(&entry->rbnode, &tree->rbroot); > > - RB_CLEAR_NODE(&entry->rbnode); > > - return true; > > - } > > - return false; > > + XA_STATE(xas, &tree->xarray, offset); > > + > > + do { > > + xas_lock_irq(&xas); > > + do { > > + e =3D xas_load(&xas); > > + } while (xas_retry(&xas, e)); > > + if (xas_valid(&xas) && e !=3D entry) { > > + xas_unlock_irq(&xas); > > + return false; > > + } > > + xas_store(&xas, NULL); > > + xas_unlock_irq(&xas); > > + } while (xas_nomem(&xas, GFP_KERNEL)); > > + return !xas_error(&xas); > > } > > Same here, I think we just want: > > return !!xa_erase(..); For the erase case it is tricky. The current zswap code does not erase an entry if the tree entry at the same offset has been changed. It should be fine if the new entry is NULL. Basically some race to remove the entry already. However, if the entry is not NULL, then force resetting it to NULL will change behavior compared to the current. >From reading the zswap code, I am not sure this is in theory impossible to happen: Some one races to remove the zswap entry then reclaim converts that page back to the zswap, with a new entry. By the time the zswap_erase() tries to erase the current entry, the entry has changed to a new entry. It seems the right thing to do in this unlikely case is that we should skip the force erase and drop the current entry, assuming someone else has already invalidated it. Don't touch the entry in the tree, we assume it is a newer generation. If we can reason the above is impossible to happen, then the force erase and reset the entry to NULL should be fine(Famous last word). Noted that this is a behavior change, I would like to seperate it out from the drop in replacement patch(keep the original behavior) Chris > > > > > static struct zpool *zswap_find_zpool(struct zswap_entry *entry) > > @@ -583,7 +563,6 @@ static void zswap_entry_put(struct zswap_tree *tree= , > > > > WARN_ON_ONCE(refcount < 0); > > if (refcount =3D=3D 0) { > > - WARN_ON_ONCE(!RB_EMPTY_NODE(&entry->rbnode)); > > zswap_free_entry(entry); > > } > > nit: the braces are no longer needed here >