Date: Fri, 23 Aug 2024 15:47:24 +0100
From: Matthew Wilcox
To: Nhat Pham
Cc: Linux regressions mailing list, Piotr Oniszczuk, LKML, Johannes Weiner, Yosry Ahmed, Linux-MM
Subject: Re: [regression] oops on heavy compilations ("kernel BUG at mm/zswap.c:1005!" and "Oops: invalid opcode: 0000")
References: <6f65e3a6-5f1a-4fda-b406-17598f4a72d5@leemhuis.info>

On Fri, Aug 23, 2024 at 10:35:19AM -0400, Nhat Pham wrote:
> On Fri, Aug 23, 2024 at 9:13 AM Matthew Wilcox wrote:
> >
> > That said, zswap could handle this better.  There's no need to panic the
> > entire machine over being unable to read a page from swap.  Killing just
> > the process that needed this page is sufficient.
>
> Agree 100%. It is silly to kill the entire host for a swap read error,
> and extra silly to kill the process because we fail to writeback - for
> all we know that page might never be needed by the process again!!!
>
> >
> > Suggested patch at end after the oops.
> >
> > @@ -1601,6 +1613,7 @@ bool zswap_load(struct folio *folio)
> >  	bool swapcache = folio_test_swapcache(folio);
> >  	struct xarray *tree = swap_zswap_tree(swp);
> >  	struct zswap_entry *entry;
> > +	int err;
> >
> >  	VM_WARN_ON_ONCE(!folio_test_locked(folio));
> >
> > @@ -1638,10 +1651,13 @@ bool zswap_load(struct folio *folio)
> >  	if (!entry)
> >  		return false;
> >
> > -	if (entry->length)
> > -		zswap_decompress(entry, folio);
> > -	else
> > +	if (entry->length) {
> > +		err = zswap_decompress(entry, folio);
> > +		if (err)
> > +			return false;
>
> Here, if zswap decompression fails and zswap load returns false, the
> page_io logic will proceed as if zswap does not have the page and
> reads garbage from the backing device instead. This could potentially
> lead to silent data/memory corruption right? Or am I missing something
> :) Maybe we could be extra careful here and treat it as if there is a
> bio read error in the case zswap owns the page, but cannot decompress
> it?

Ah; you know more about how zswap works than I do.  So it's not a
write-through cache?  I guess we need to go a bit further then and
return an errno from zswap_load -- EIO/ENOENT/0 and handle that
appropriately.

> The rest seems solid to me :)
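
To make the EIO/ENOENT/0 idea concrete, here is a rough userspace sketch of the three-way return and how a caller might act on it. It is not kernel code and not a proposed patch: every `sketch_*` name, the `corrupt` flag, and the `READ_*` enum are hypothetical stand-ins, and treating `length == 0` as "nothing to decompress" is an assumption of the sketch only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a zswap entry; NOT the kernel's struct. */
struct sketch_entry {
	size_t length;	/* assumption: 0 means nothing to decompress */
	bool corrupt;	/* set to simulate a decompression failure */
};

/* Stand-in for a zswap_decompress() that reports failure to its caller. */
int sketch_decompress(const struct sketch_entry *entry)
{
	return entry->corrupt ? -EIO : 0;
}

/*
 * Three-way result instead of a bool:
 *   0       - zswap had the page and filled it
 *   -ENOENT - zswap does not have the page; reading the backing device is fine
 *   -EIO    - zswap owns the page but could not decompress it
 */
int sketch_zswap_load(const struct sketch_entry *entry)
{
	if (!entry)
		return -ENOENT;
	if (entry->length && sketch_decompress(entry))
		return -EIO;
	return 0;
}

enum sketch_read_result { READ_FROM_ZSWAP, READ_FROM_DEVICE, READ_ERROR };

/* Caller side: only -ENOENT may fall through to the backing device. */
enum sketch_read_result sketch_swap_read(const struct sketch_entry *entry)
{
	switch (sketch_zswap_load(entry)) {
	case 0:
		return READ_FROM_ZSWAP;
	case -ENOENT:
		return READ_FROM_DEVICE;
	default:
		return READ_ERROR;	/* treat like a bio read error */
	}
}

/* Exercise the three outcomes; aborts via assert() if any is wrong. */
void sketch_selfcheck(void)
{
	struct sketch_entry good = { .length = 100, .corrupt = false };
	struct sketch_entry bad = { .length = 100, .corrupt = true };

	assert(sketch_swap_read(&good) == READ_FROM_ZSWAP);
	assert(sketch_swap_read(NULL) == READ_FROM_DEVICE);
	assert(sketch_swap_read(&bad) == READ_ERROR);
}
```

The point of the third state is that a failed decompress can no longer be mistaken for "zswap never had this page", which is the case where falling through to the backing device would read garbage.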