From: Nhat Pham
Date: Fri, 23 Aug 2024 10:35:19 -0400
Subject: Re: [regression] oops on heavy compilations ("kernel BUG at mm/zswap.c:1005!" and "Oops: invalid opcode: 0000")
To: Matthew Wilcox
Cc: Linux regressions mailing list, Piotr Oniszczuk, LKML, Johannes Weiner, Yosry Ahmed, Linux-MM
References: <6f65e3a6-5f1a-4fda-b406-17598f4a72d5@leemhuis.info>

On Fri, Aug 23, 2024 at 9:13 AM Matthew Wilcox wrote:
>
> That said, zswap could handle this better. There's no need to panic the
> entire machine over being unable to read a page from swap. Killing just
> the process that needed this page is sufficient.

Agree 100%. It is silly to kill the entire host for a swap read error,
and extra silly to kill the process because we fail to writeback - for
all we know that page might never be needed by the process again!!!

>
> Suggested patch at end after the oops.
>
> @@ -1601,6 +1613,7 @@ bool zswap_load(struct folio *folio)
>         bool swapcache = folio_test_swapcache(folio);
>         struct xarray *tree = swap_zswap_tree(swp);
>         struct zswap_entry *entry;
> +       int err;
>
>         VM_WARN_ON_ONCE(!folio_test_locked(folio));
>
> @@ -1638,10 +1651,13 @@ bool zswap_load(struct folio *folio)
>         if (!entry)
>                 return false;
>
> -       if (entry->length)
> -               zswap_decompress(entry, folio);
> -       else
> +       if (entry->length) {
> +               err = zswap_decompress(entry, folio);
> +               if (err)
> +                       return false;

Here, if zswap decompression fails and zswap_load() returns false, the
page_io logic will proceed as if zswap does not have the page and
read garbage from the backing device instead. This could potentially
lead to silent data/memory corruption, right? Or am I missing something
:)

Maybe we could be extra careful here and treat it as if there is a bio
read error in the case where zswap owns the page but cannot decompress it?

The rest seems solid to me :)
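To make the concern concrete, here is a rough userspace model of the
distinction I have in mind. This is not kernel code and not a proposed
patch: every name in it (model_zswap_load, model_swap_read, the enums,
struct model_entry) is made up for illustration. The only point is that
"zswap never had the page" and "zswap has the page but cannot decompress
it" should be different outcomes, and only the first one may fall back
to the backing device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum load_result {
	LOAD_NOT_PRESENT,	/* zswap never had the page: device read is fine */
	LOAD_OK,		/* page was filled from zswap */
	LOAD_IO_ERROR,		/* zswap owns the page but cannot decompress it */
};

enum read_status {
	READ_FROM_ZSWAP,
	READ_FROM_DEVICE,
	READ_FAILED,		/* reported like a read error to the caller */
};

struct model_entry {
	bool present;		/* zswap holds an entry for this swap slot */
	bool corrupt;		/* decompression of the entry would fail */
	char data;		/* stand-in for the decompressed contents */
};

/* Tri-state load instead of a bool, so a failure is not mistaken for a miss. */
static enum load_result model_zswap_load(const struct model_entry *e, char *page)
{
	assert(page != NULL);

	if (!e->present)
		return LOAD_NOT_PRESENT;
	if (e->corrupt)
		return LOAD_IO_ERROR;

	*page = e->data;
	return LOAD_OK;
}

/* Only a genuine miss is allowed to read from the backing device. */
static enum read_status model_swap_read(const struct model_entry *e,
					char device_data, char *page)
{
	switch (model_zswap_load(e, page)) {
	case LOAD_OK:
		return READ_FROM_ZSWAP;
	case LOAD_IO_ERROR:
		/* Leave the page untouched; stale device data is never used. */
		return READ_FAILED;
	case LOAD_NOT_PRESENT:
	default:
		*page = device_data;
		return READ_FROM_DEVICE;
	}
}
```

With a plain bool return, the LOAD_IO_ERROR case collapses into
LOAD_NOT_PRESENT, which is exactly the path that would read stale data
from the device.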