From: Nhat Pham
Date: Sun, 30 Jun 2024 18:07:47 -0700
Subject: Re: zswap_writeback_entry crashes in 6.9.5
To: Pedro Falcato
Cc: mm, Linux Regressions, Johannes Weiner, Yosry Ahmed, Chengming Zhou, Christian Heusel

On Sun, Jun 30, 2024 at 10:58 AM Pedro Falcato wrote:
>
> Hi everyone,

Hi Pedro,

Thanks for the bug report! Taking a look now - some preliminary
questions to narrow down the suspects and aid the debugging process:

a) Do you observe this bug in 6.8? 6.10?
b) Have you run the faddr2line script to verify that the line that
triggers the crash is count_objcg_event(entry->objcg, ZSWPWB);?
c) Do you have a full dmesg log? Or maybe some other reproduction
instructions?

If entry->objcg is garbage, then this smells like a
lifetime/reference counting issue. Either:

a) The zswap entry itself is garbage. Not impossible, but seems
unlikely. In 6.9, we effectively isolate the entry first through the
swap cache, then check and remove it from the zswap tree (under the
tree's lock). The former locks out concurrent accessors, and the
latter should have taken care of invalidated entries (and prevents
future invalidation attempts).

Furthermore, after this, if the entry is somehow garbage (i.e. freed
and recycled), it should also be possible to blow up in the
decompression step first, by feeding a garbage handle to zsmalloc and
crashing the kernel at that point. IOW, we should also see zsmalloc
crashes in addition to this particular crash, no? I cannot think of
any protection mechanism that applies to the decompression step and
not to count_objcg_event().

b) entry->objcg has been freed/recycled under us. This is much
trickier, as the culprit could be any holder of the objcg reference
who accidentally double-released the reference it held. That said, if
it only happened on the zswap shrinker path, then maybe there is
something to this...

Let me muse on this a bit more. Please let us know if you have other
clues, traces, hints, or observations - it will help the investigation
a lot!
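
To make hypothesis (b) concrete, here is a toy userspace model of how one
holder's double release could leave the zswap entry with a dangling objcg
pointer. This is only a sketch and not kernel code: ToyObjcg, count_event()
and simulate() are hypothetical names invented for illustration, and the
real obj_cgroup refcounting is more involved than a plain integer.

```python
class ToyObjcg:
    """Hypothetical stand-in for an obj_cgroup: a refcount plus a 'freed' flag."""

    def __init__(self):
        self.refs = 1           # the creator's reference
        self.freed = False

    def get(self):
        if self.freed:
            raise RuntimeError("get() on a freed object")
        self.refs += 1

    def put(self):
        self.refs -= 1
        if self.refs == 0:
            self.freed = True   # memory could now be recycled by someone else


def count_event(objcg):
    """Models the dereference done at the count_objcg_event() call site."""
    if objcg.freed:
        raise RuntimeError("use-after-free: objcg already released")
    return "counted"


def simulate(buggy_holder):
    objcg = ToyObjcg()          # refs = 1 (creator)
    objcg.get()                 # refs = 2: reference owned by the zswap entry
    objcg.get()                 # refs = 3: some unrelated holder

    objcg.put()                 # unrelated holder drops its reference
    if buggy_holder:
        objcg.put()             # accidental second put eats the entry's reference
    objcg.put()                 # creator drops its reference

    # Writeback path: the entry still believes its reference is valid.
    return count_event(objcg)
```

With buggy_holder=False the entry's reference keeps the object alive and
count_event() succeeds. With buggy_holder=True the count reaches zero while
the entry still points at the object, which is the kind of failure that
would only show up later, at an unrelated call site such as the writeback
path.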