From: Yosry Ahmed
Date: Fri, 23 Aug 2024 09:07:44 -0700
Subject: Re: [regression] oops on heavy compilations ("kernel BUG at mm/zswap.c:1005!" and "Oops: invalid opcode: 0000")
To: Matthew Wilcox
Cc: Nhat Pham, Linux regressions mailing list, Piotr Oniszczuk, LKML, Johannes Weiner, Linux-MM
References: <6f65e3a6-5f1a-4fda-b406-17598f4a72d5@leemhuis.info>

On Fri, Aug 23, 2024 at 7:47 AM Matthew Wilcox wrote:
>
> On Fri, Aug 23, 2024 at 10:35:19AM -0400, Nhat Pham wrote:
> > On Fri, Aug 23, 2024 at 9:13 AM Matthew Wilcox wrote:
> > >
> > >
> > > That said, zswap could handle this better.  There's no need to panic the
> > > entire machine over being unable to read a page from swap.  Killing just
> > > the process that needed this page is sufficient.
> >
> > Agree 100%. It is silly to kill the entire host for a swap read error,
> > and extra silly to kill the process because we fail to writeback - for
> > all we know that page might never be needed by the process again!!!
> >
> > >
> > > Suggested patch at end after the oops.
> > >
> > > @@ -1601,6 +1613,7 @@ bool zswap_load(struct folio *folio)
> > >         bool swapcache = folio_test_swapcache(folio);
> > >         struct xarray *tree = swap_zswap_tree(swp);
> > >         struct zswap_entry *entry;
> > > +       int err;
> > >
> > >         VM_WARN_ON_ONCE(!folio_test_locked(folio));
> > >
> > > @@ -1638,10 +1651,13 @@ bool zswap_load(struct folio *folio)
> > >         if (!entry)
> > >                 return false;
> > >
> > > -       if (entry->length)
> > > -               zswap_decompress(entry, folio);
> > > -       else
> > > +       if (entry->length) {
> > > +               err = zswap_decompress(entry, folio);
> > > +               if (err)
> > > +                       return false;
> >
> > Here, if zswap decompression fails and zswap load returns false, the
> > page_io logic will proceed as if zswap does not have the page and
> > reads garbage from the backing device instead. This could potentially
> > lead to silent data/memory corruption right? Or am I missing something
> > :) Maybe we could be extra careful here and treat it as if there is a
> > bio read error in the case zswap owns the page, but cannot decompress
> > it?
>
> Ah; you know more about how zswap works than I do.  So it's not a
> write-through cache?  I guess we need to go a bit further then and
> return an errno from zswap_load -- EIO/ENOENT/0 and handle that
> appropriately.

It should work if we just return true without calling
folio_mark_uptodate(); this is what we do if we get a large folio in
zswap_load(). Returning true means that the page was found in zswap, so
we won't fall back to reading from the backing device. Not marking the
folio uptodate will cause an IO error IIUC.

>
> > The rest seems solid to me :)
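
To make the "return true without folio_mark_uptodate()" idea concrete,
here is a userspace toy model. It is not kernel code and not a patch:
toy_folio, toy_entry, toy_decompress(), toy_zswap_load() and
toy_swap_read() are hypothetical stand-ins invented for illustration,
and the real swap read path differs in detail. It only sketches the
three outcomes discussed in this thread, assuming the caller turns a
folio that is not uptodate into an I/O error:

```c
/*
 * Userspace toy model, NOT kernel code. Every type and function below
 * is a simplified, hypothetical stand-in for the real mm/ code.
 */
#include <errno.h>
#include <stdbool.h>

struct toy_folio {
	bool uptodate;     /* stands in for the folio uptodate flag */
	bool from_backing; /* set if we fell back to the backing device */
};

struct toy_entry {
	bool present;      /* zswap owns a copy of this page */
	bool corrupt;      /* decompression of that copy will fail */
};

/* Stand-in for zswap_decompress(): 0 on success, -EIO on failure. */
int toy_decompress(const struct toy_entry *e, struct toy_folio *f)
{
	(void)f;
	return e->corrupt ? -EIO : 0;
}

/* Stand-in for zswap_load(): true means "zswap owns this page". */
bool toy_zswap_load(const struct toy_entry *e, struct toy_folio *f)
{
	if (!e->present)
		return false;   /* not in zswap: caller may read the device */

	if (toy_decompress(e, f))
		return true;    /* owned but unreadable: leave it !uptodate */

	f->uptodate = true;     /* owned and decompressed successfully */
	return true;
}

/* Stand-in for the swap read path: returns 0 or -EIO. */
int toy_swap_read(const struct toy_entry *e, struct toy_folio *f)
{
	if (!toy_zswap_load(e, f)) {
		/* Only reached when zswap has no entry for the page. */
		f->from_backing = true;
		f->uptodate = true;
	}
	return f->uptodate ? 0 : -EIO;
}
```

In this model an entry that is present but corrupt makes
toy_swap_read() return -EIO without ever touching the backing device,
which is the behaviour we want instead of silently reading garbage. An
absent entry still falls back to the device, and a healthy entry is
served from zswap.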