From: Yosry Ahmed
Date: Wed, 21 Jun 2023 02:26:03 -0700
Subject: Re: [BUG mm-unstable] "kernel BUG at mm/swap.c:393!" on commit b9c91c43412f2e
To: Domenico Cerasuolo
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>, Andrew Morton, Konrad Rzeszutek Wilk, Seth Jennings, Dan Streetman, Vitaly Wool, Johannes Weiner, Nhat Pham, Yu Zhao, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20230607195143.1473802-1-yosryahmed@google.com>

On Wed, Jun 21, 2023 at 2:19 AM Domenico Cerasuolo wrote:
>
> On Wed, Jun 21, 2023 at 10:06 AM Yosry Ahmed wrote:
> >
> > On Wed, Jun 21, 2023 at 12:01 AM Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
> > >
> > > On Wed, Jun 07, 2023 at 07:51:43PM +0000, Yosry Ahmed wrote:
> > > > Commit 71024cb4a0bf ("frontswap: remove frontswap_tmem_exclusive_gets")
> > > > removed support for exclusive loads from frontswap as it was not used.
> > > > Bring back exclusive loads support to frontswap by adding an "exclusive"
> > > > output parameter to frontswap_ops->load.
> > > >
> > > > On the zswap side, add a module parameter to enable/disable exclusive
> > > > loads, and a config option to control the boot default value.
> > > > Refactor zswap entry invalidation in zswap_frontswap_invalidate_page()
> > > > into zswap_invalidate_entry() to reuse it in zswap_frontswap_load() if
> > > > exclusive loads are enabled.
> > > >
> > > > With exclusive loads, we avoid having two copies of the same page in
> > > > memory (compressed & uncompressed) after faulting it in from zswap. On
> > > > the other hand, if the page is to be reclaimed again without being
> > > > dirtied, it will be re-compressed. Compression is not usually slow, and
> > > > a page that was just faulted in is less likely to be reclaimed again
> > > > soon.
> > > >
> > > > Suggested-by: Yu Zhao
> > > > Signed-off-by: Yosry Ahmed
> > > > ---
> > > >
> > > > v1 -> v2:
> > > > - Add a module parameter to control whether exclusive loads are enabled
> > > >   or not, the config option now controls the default boot value instead.
> > > >   Replaced frontswap_ops->exclusive_loads by an output parameter to
> > > >   frontswap_ops->load() (Johannes Weiner).
> > > > ---
> > >
> > > Hi Yosry, I was testing the latest mm-unstable and encountered a bug.
> > > It was bisectable and this is the first bad commit.
> > >
> > >
> > > Attached config file and bisect log.
> > > The oops message is available at:
> > >
> > > https://social.kernel.org/media/eace06d71655b3cc76411366573e4a8ce240ad65b8fd20977d7c73eec9dc2253.jpg
> > >
> > > (the head commit is b9c91c43412f2e07 "mm: zswap: support exclusive loads")
> > > (it's an image because I tested it on real machine)
> > >
> > >
> > > This is what I have as swap space:
> > >
> > > $ cat /proc/swaps
> > > Filename        Type            Size            Used    Priority
> > > /var/swap       file            134217724       0       -2
> > > /dev/zram0      partition       8388604         0       100
> >
> >
> > Hi Hyeonggon,
> >
> > Thanks for reporting this! I think I know what went wrong. Could you
> > please verify if the below fix works if possible?
> >
> > Domenico, I believe the below fix would also fix a problem with the
> > recent writeback series. If the entry is invalidated before we grab the
> > lock to put the local ref in zswap_frontswap_load(), then the entry
> > will be freed once we call zswap_entry_put(), and the movement to the
> > beginning of the LRU will be operating on a freed entry. It also modifies
> > your recently added commit 418fd29d9de5 ("mm: zswap: invaldiate entry
> > after writeback"). I would appreciate it if you also take a look.
>
> Hi Yosry,
>
> Thanks, this makes sense indeed. I've been running a stress test too for
> an hour now and it seems fine.

Thanks! I will send the patch to Andrew then!

>
> >
> > If this works as intended, I can send a formal patch (applies on top
> > of fd247f029cd0 ("mm/gup: do not return 0 from pin_user_pages_fast()
> > for bad args")):
> >
> > From 4b7f949b3ffb42d969d525d5b576fad474f55276 Mon Sep 17 00:00:00 2001
> > From: Yosry Ahmed
> > Date: Wed, 21 Jun 2023 07:43:51 +0000
> > Subject: [PATCH] mm: zswap: fix double invalidate with exclusive loads
> >
> > If exclusive loads are enabled for zswap, we invalidate the entry before
> > returning from zswap_frontswap_load(), after dropping the local
> > reference. However, the tree lock is dropped during decompression after
> > the local reference is acquired, so the entry could be invalidated
> > before we drop the local ref. If this happens, the entry is freed once
> > we drop the local ref, and zswap_invalidate_entry() tries to invalidate
> > an already freed entry.
> >
> > Fix this by:
> > (a) Making sure zswap_invalidate_entry() is always called with a local
> >     ref held, to avoid being called on a freed entry.
> > (b) Making sure zswap_invalidate_entry() only drops the ref if the entry
> >     was actually on the rbtree. Otherwise, another invalidation could
> >     have already happened, and the initial ref is already dropped.
> >
> > With these changes, there is no need to make sure the entry still
> > exists in the tree in zswap_reclaim_entry() before invalidating it, as
> > zswap_invalidate_entry() will make this check internally.
> >
> > Fixes: b9c91c43412f ("mm: zswap: support exclusive loads")
> > Reported-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> > Signed-off-by: Yosry Ahmed
> > ---
> >  mm/zswap.c | 21 ++++++++++++---------
> >  1 file changed, 12 insertions(+), 9 deletions(-)
> >
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index 87b204233115..62195f72bf56 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -355,12 +355,14 @@ static int zswap_rb_insert(struct rb_root *root,
> > struct zswap_entry *entry,
> >  	return 0;
> >  }
> >
> > -static void zswap_rb_erase(struct rb_root *root, struct zswap_entry *entry)
> > +static bool zswap_rb_erase(struct rb_root *root, struct zswap_entry *entry)
> >  {
> >  	if (!RB_EMPTY_NODE(&entry->rbnode)) {
> >  		rb_erase(&entry->rbnode, root);
> >  		RB_CLEAR_NODE(&entry->rbnode);
> > +		return true;
> >  	}
> > +	return false;
> >  }
> >
> >  /*
> > @@ -599,14 +601,16 @@ static struct zswap_pool
> > *zswap_pool_find_get(char *type, char *compressor)
> >  	return NULL;
> >  }
> >
> > +/*
> > + * If the entry is still valid in the tree, drop the initial ref and remove it
> > + * from the tree. This function must be called with an additional ref held,
> > + * otherwise it may race with another invalidation freeing the entry.
> > + */
> >  static void zswap_invalidate_entry(struct zswap_tree *tree,
> >  				   struct zswap_entry *entry)
> >  {
> > -	/* remove from rbtree */
> > -	zswap_rb_erase(&tree->rbroot, entry);
> > -
> > -	/* drop the initial reference from entry creation */
> > -	zswap_entry_put(tree, entry);
> > +	if (zswap_rb_erase(&tree->rbroot, entry))
> > +		zswap_entry_put(tree, entry);
> >  }
> >
> >  static int zswap_reclaim_entry(struct zswap_pool *pool)
> > @@ -659,8 +663,7 @@ static int zswap_reclaim_entry(struct zswap_pool *pool)
> >  	 * swapcache. Drop the entry from zswap - unless invalidate already
> >  	 * took it out while we had the tree->lock released for IO.
> >  	 */
> > -	if (entry == zswap_rb_search(&tree->rbroot, swpoffset))
> > -		zswap_invalidate_entry(tree, entry);
> > +	zswap_invalidate_entry(tree, entry);
> >
> >  put_unlock:
> >  	/* Drop local reference */
> > @@ -1466,7 +1469,6 @@ static int zswap_frontswap_load(unsigned type,
> > pgoff_t offset,
> >  		count_objcg_event(entry->objcg, ZSWPIN);
> >  freeentry:
> >  	spin_lock(&tree->lock);
> > -	zswap_entry_put(tree, entry);
> >  	if (!ret && zswap_exclusive_loads_enabled) {
> >  		zswap_invalidate_entry(tree, entry);
> >  		*exclusive = true;
> > @@ -1475,6 +1477,7 @@ static int zswap_frontswap_load(unsigned type,
> > pgoff_t offset,
> >  		list_move(&entry->lru, &entry->pool->lru);
> >  		spin_unlock(&entry->pool->lru_lock);
> >  	}
> > +	zswap_entry_put(tree, entry);
> >  	spin_unlock(&tree->lock);
> >
> >  	return ret;
> > --
> > 2.41.0.162.gfafddb0af9-goog
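
To illustrate the two rules in the quoted fix outside the kernel, here is a minimal single-threaded userspace sketch in C. It is not the mm/zswap.c code: struct entry, entry_get(), entry_put(), tree_erase(), invalidate_entry() and simulate_exclusive_load_race() are hypothetical stand-ins, the rbtree is reduced to an in_tree flag, and the race is replayed as a fixed sequence of calls rather than with real locking. It only shows why invalidation must be called with a local ref held and must drop the initial ref only when it actually erased the entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a zswap entry (not the kernel struct). */
struct entry {
	int refcount;	/* initial ref from creation plus any local refs */
	bool in_tree;	/* stands in for !RB_EMPTY_NODE(&entry->rbnode) */
	bool freed;	/* set instead of freeing, so callers can inspect it */
};

static void entry_get(struct entry *e)
{
	e->refcount++;
}

static void entry_put(struct entry *e)
{
	/* Touching a freed entry is the bug class the fix avoids. */
	assert(!e->freed && e->refcount > 0);
	if (--e->refcount == 0)
		e->freed = true;
}

/* Returns true only if this call actually removed the entry from the tree. */
static bool tree_erase(struct entry *e)
{
	if (e->in_tree) {
		e->in_tree = false;
		return true;
	}
	return false;
}

/*
 * Rule (a): the caller must hold its own ref.
 * Rule (b): drop the initial ref only if the entry was still in the tree.
 */
static void invalidate_entry(struct entry *e)
{
	if (tree_erase(e))
		entry_put(e);
}

/*
 * Replays the racy ordering as a fixed sequence and returns the final
 * refcount. The entry must survive until the local ref is dropped.
 */
static int simulate_exclusive_load_race(struct entry *e)
{
	e->refcount = 1;	/* initial ref from entry creation */
	e->in_tree = true;
	e->freed = false;

	entry_get(e);		/* load path takes a local ref */
	invalidate_entry(e);	/* another invalidation wins while the lock is dropped */
	assert(!e->freed);	/* the local ref keeps the entry alive */
	invalidate_entry(e);	/* exclusive-load invalidation is now a no-op */
	assert(!e->freed);
	entry_put(e);		/* local ref is dropped last */
	return e->refcount;
}
```

If the final entry_put() were moved before the second invalidate_entry() call, as in the pre-fix ordering, the entry would already be marked freed when invalidation ran, which is the use-after-free described in the quoted commit message.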