References: <20230607195143.1473802-1-yosryahmed@google.com>
From: Domenico Cerasuolo
Date: Wed, 21 Jun 2023 11:18:50 +0200
Subject: Re: [BUG mm-unstable] "kernel BUG at mm/swap.c:393!" on commit b9c91c43412f2e
To: Yosry Ahmed
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>, Andrew Morton, Konrad Rzeszutek Wilk, Seth Jennings, Dan Streetman, Vitaly Wool, Johannes Weiner, Nhat Pham, Yu Zhao, linux-mm@kvack.org, linux-kernel@vger.kernel.org

On Wed, Jun 21, 2023 at 10:06 AM Yosry Ahmed wrote:
>
> On Wed, Jun 21, 2023 at 12:01 AM Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
> >
> > On Wed, Jun 07, 2023 at 07:51:43PM +0000, Yosry Ahmed wrote:
> > > Commit 71024cb4a0bf ("frontswap: remove frontswap_tmem_exclusive_gets")
> > > removed support for exclusive loads from frontswap as it was not used.
> > > Bring back exclusive loads support to frontswap by adding an "exclusive"
> > > output parameter to frontswap_ops->load.
> > >
> > > On the zswap side, add a module parameter to enable/disable exclusive
> > > loads, and a config option to control the boot default value.
> > > Refactor zswap entry invalidation in zswap_frontswap_invalidate_page()
> > > into zswap_invalidate_entry() to reuse it in zswap_frontswap_load() if
> > > exclusive loads are enabled.
> > >
> > > With exclusive loads, we avoid having two copies of the same page in
> > > memory (compressed & uncompressed) after faulting it in from zswap. On
> > > the other hand, if the page is to be reclaimed again without being
> > > dirtied, it will be re-compressed. Compression is not usually slow, and
> > > a page that was just faulted in is less likely to be reclaimed again
> > > soon.
> > >
> > > Suggested-by: Yu Zhao
> > > Signed-off-by: Yosry Ahmed
> > > ---
> > >
> > > v1 -> v2:
> > > - Add a module parameter to control whether exclusive loads are enabled
> > >   or not, the config option now controls the default boot value instead.
> > >   Replaced frontswap_ops->exclusive_loads by an output parameter to
> > >   frontswap_ops->load() (Johannes Weiner).
> > > ---
> >
> > Hi Yosry, I was testing the latest mm-unstable and encountered a bug.
> > It was bisectable and this is the first bad commit.
> >
> >
> > Attached config file and bisect log.
> > The oops message is available at:
> >
> > https://social.kernel.org/media/eace06d71655b3cc76411366573e4a8ce240ad65b8fd20977d7c73eec9dc2253.jpg
> >
> > (the head commit is b9c91c43412f2e07 "mm: zswap: support exclusive loads")
> > (it's an image because I tested it on a real machine)
> >
> >
> > This is what I have as swap space:
> >
> > $ cat /proc/swaps
> > Filename                Type            Size            Used    Priority
> > /var/swap               file            134217724       0       -2
> > /dev/zram0              partition       8388604         0       100
>
>
> Hi Hyeonggon,
>
> Thanks for reporting this! I think I know what went wrong. Could you
> please verify if the below fix works if possible?
>
> Domenico, I believe the below fix would also fix a problem with the
> recent writeback series. If the entry is invalidated before we grab the
> lock to put the local ref in zswap_frontswap_load(), then the entry
> will be freed once we call zswap_entry_put(), and the movement to the
> beginning of the LRU will be operating on a freed entry. It also modifies
> your recently added commit 418fd29d9de5 ("mm: zswap: invalidate entry
> after writeback"). I would appreciate it if you also take a look.

Hi Yosry,

Thanks, this makes sense indeed. I've been running a stress test too for an
hour now and it seems fine.

>
> If this works as intended, I can send a formal patch (applies on top
> of fd247f029cd0 ("mm/gup: do not return 0 from pin_user_pages_fast()
> for bad args")):
>
> From 4b7f949b3ffb42d969d525d5b576fad474f55276 Mon Sep 17 00:00:00 2001
> From: Yosry Ahmed
> Date: Wed, 21 Jun 2023 07:43:51 +0000
> Subject: [PATCH] mm: zswap: fix double invalidate with exclusive loads
>
> If exclusive loads are enabled for zswap, we invalidate the entry before
> returning from zswap_frontswap_load(), after dropping the local
> reference. However, the tree lock is dropped during decompression after
> the local reference is acquired, so the entry could be invalidated
> before we drop the local ref. If this happens, the entry is freed once
> we drop the local ref, and zswap_invalidate_entry() tries to invalidate
> an already freed entry.
>
> Fix this by:
> (a) Making sure zswap_invalidate_entry() is always called with a local
>     ref held, to avoid being called on a freed entry.
> (b) Making sure zswap_invalidate_entry() only drops the ref if the entry
>     was actually on the rbtree. Otherwise, another invalidation could
>     have already happened, and the initial ref is already dropped.
>
> With these changes, there is no need to make sure the entry still
> exists in the tree in zswap_reclaim_entry() before invalidating it, as
> zswap_invalidate_entry() will make this check internally.
>
> Fixes: b9c91c43412f ("mm: zswap: support exclusive loads")
> Reported-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> Signed-off-by: Yosry Ahmed
> ---
>  mm/zswap.c | 21 ++++++++++++---------
>  1 file changed, 12 insertions(+), 9 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 87b204233115..62195f72bf56 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -355,12 +355,14 @@ static int zswap_rb_insert(struct rb_root *root, struct zswap_entry *entry,
>  	return 0;
>  }
>
> -static void zswap_rb_erase(struct rb_root *root, struct zswap_entry *entry)
> +static bool zswap_rb_erase(struct rb_root *root, struct zswap_entry *entry)
>  {
>  	if (!RB_EMPTY_NODE(&entry->rbnode)) {
>  		rb_erase(&entry->rbnode, root);
>  		RB_CLEAR_NODE(&entry->rbnode);
> +		return true;
>  	}
> +	return false;
>  }
>
>  /*
> @@ -599,14 +601,16 @@ static struct zswap_pool *zswap_pool_find_get(char *type, char *compressor)
>  	return NULL;
>  }
>
> +/*
> + * If the entry is still valid in the tree, drop the initial ref and remove it
> + * from the tree. This function must be called with an additional ref held,
> + * otherwise it may race with another invalidation freeing the entry.
> + */
>  static void zswap_invalidate_entry(struct zswap_tree *tree,
>  				   struct zswap_entry *entry)
>  {
> -	/* remove from rbtree */
> -	zswap_rb_erase(&tree->rbroot, entry);
> -
> -	/* drop the initial reference from entry creation */
> -	zswap_entry_put(tree, entry);
> +	if (zswap_rb_erase(&tree->rbroot, entry))
> +		zswap_entry_put(tree, entry);
>  }
>
>  static int zswap_reclaim_entry(struct zswap_pool *pool)
> @@ -659,8 +663,7 @@ static int zswap_reclaim_entry(struct zswap_pool *pool)
>  	 * swapcache. Drop the entry from zswap - unless invalidate already
>  	 * took it out while we had the tree->lock released for IO.
>  	 */
> -	if (entry == zswap_rb_search(&tree->rbroot, swpoffset))
> -		zswap_invalidate_entry(tree, entry);
> +	zswap_invalidate_entry(tree, entry);
>
>  put_unlock:
>  	/* Drop local reference */
> @@ -1466,7 +1469,6 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
>  	count_objcg_event(entry->objcg, ZSWPIN);
>  freeentry:
>  	spin_lock(&tree->lock);
> -	zswap_entry_put(tree, entry);
>  	if (!ret && zswap_exclusive_loads_enabled) {
>  		zswap_invalidate_entry(tree, entry);
>  		*exclusive = true;
> @@ -1475,6 +1477,7 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
>  		list_move(&entry->lru, &entry->pool->lru);
>  		spin_unlock(&entry->pool->lru_lock);
>  	}
> +	zswap_entry_put(tree, entry);
>  	spin_unlock(&tree->lock);
>
>  	return ret;
> --
> 2.41.0.162.gfafddb0af9-goog