From: Yosry Ahmed
Date: Fri, 13 Sep 2024 10:39:55 -0700
Subject: Re: [regression] oops on heavy compilations ("kernel BUG at mm/zswap.c:1005!" and "Oops: invalid opcode: 0000")
To: Tomáš Trnka
Cc: hannes@cmpxchg.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, nphamcs@gmail.com, pedro.falcato@gmail.com, piotr.oniszczuk@gmail.com, regressions@lists.linux.dev, willy@infradead.org
In-Reply-To: <2272920.vFx2qVVIhK@electra>
References: <2272920.vFx2qVVIhK@electra>

On Fri, Sep 13, 2024 at 2:03 AM Tomáš Trnka wrote:
>
> > Well, it's possible that some zswap change was not fully compatible
> > with z3fold, or surfaced a dormant bug in z3fold. Either way, my
> > recommendation is to use zsmalloc. I have been trying to deprecate
> > z3fold, and honestly you are the only person I have seen use z3fold in
> > a while -- which is probably why no one else reported such a problem.
>
> FWIW, I have repeatedly hit this exact BUG (mm/zswap.c:1005) on two of my
> machines on 6.10.x (possibly 6.9.x as well, but I don't have the logs at hand
> to confirm). In both cases, this was also using z3fold under moderate memory
> pressure. I think this fairly conclusively rules out a HW issue.
>
> Additionally, I have hit the following BUG on 6.10.8, which is potentially
> related (note __z3fold_alloc in there):
>
> list_del corruption, ffff977c17128000->next is NULL
> ------------[ cut here ]------------
> kernel BUG at lib/list_debug.c:52!
> Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 3 PID: 248608 Comm: kworker/u32:3 Tainted: G W
> 6.10.8-100.fc39.x86_64 #1
> Hardware name: HP HP EliteBook 850 G6/8549, BIOS R70 Ver. 01.28.00 04/12/2024
> Workqueue: zswap12 compact_page_work
> RIP: 0010:__list_del_entry_valid_or_report+0x5d/0xc0
> Code: 48 8b 01 48 39 f8 75 5a 48 8b 72 08 48 39 f0 75 65 b8 01 00 00 00 c3 cc
> cc cc cc 48 89 fe 48 c7 c7 f0 89 ba ad e8 73 34 8f ff <0f> 0b 48 89 fe 48 c7
> c7 20 8a ba ad e8 62 34 8f ff 0f 0b 48 89 fe
> RSP: 0018:ffffac7299f5bdb0 EFLAGS: 00010246
> RAX: 0000000000000033 RBX: ffff977c0afd0b08 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffff977f2d5a18c0 RDI: ffff977f2d5a18c0
> RBP: ffff977c0afd0b00 R08: 0000000000000000 R09: 4e20736920747865
> R10: 7478656e3e2d3030 R11: 4c4c554e20736920 R12: ffff977c17128010
> R13: 000000000000000a R14: 00000000000000a0 R15: ffff977c17128000
> FS: 0000000000000000(0000) GS:ffff977f2d580000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f063638a000 CR3: 0000000179428002 CR4: 00000000003706f0
> Call Trace:
> <TASK>
> ? die+0x36/0x90
> ? do_trap+0xdd/0x100
> ? __list_del_entry_valid_or_report+0x5d/0xc0
> ? do_error_trap+0x6a/0x90
> ? __list_del_entry_valid_or_report+0x5d/0xc0
> ? exc_invalid_op+0x50/0x70
> ? __list_del_entry_valid_or_report+0x5d/0xc0
> ? asm_exc_invalid_op+0x1a/0x20
> ? __list_del_entry_valid_or_report+0x5d/0xc0
> __z3fold_alloc+0x4e/0x4b0
> do_compact_page+0x20e/0xa60
> process_one_work+0x17b/0x390
> worker_thread+0x265/0x380
> ? __pfx_worker_thread+0x10/0x10
> kthread+0xcf/0x100
> ? __pfx_kthread+0x10/0x10
> ret_from_fork+0x31/0x50
> ? __pfx_kthread+0x10/0x10
> ret_from_fork_asm+0x1a/0x30
> </TASK>
> Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast lp parport
> ti_usb_3410_5052 hid_logitech_hidpp snd_usb_audio snd_usbmidi_lib snd_ump
> snd_rawmidi hid_logitech_dj r8153_ecm cdc_ether usbnet r8152 mii ib_core
> dimlib tls
>
> snd_hda_codec_realtek snd_hda_codec_generic snd_hda_scodec_component
> snd_soc_dmic snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel
> soundwire_cadence snd_sof_intel_hda_common snd_sof_intel_hda_mlink
> snd_sof_intel_hda snd>
> processor_thermal_device_pci_legacy intel_cstate hp_wmi
> processor_thermal_device snd_timer sparse_keymap processor_thermal_wt_hint
> intel_uncore intel_wmi_thunderbolt thunderbolt wmi_bmof cfg80211 snd
> processor_thermal_rfim i2c_i801 sp>
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:__list_del_entry_valid_or_report+0x5d/0xc0
> Code: 48 8b 01 48 39 f8 75 5a 48 8b 72 08 48 39 f0 75 65 b8 01 00 00 00 c3 cc
> cc cc cc 48 89 fe 48 c7 c7 f0 89 ba ad e8 73 34 8f ff <0f> 0b 48 89 fe 48 c7
> c7 20 8a ba ad e8 62 34 8f ff 0f 0b 48 89 fe
> RSP: 0018:ffffac7299f5bdb0 EFLAGS: 00010246
> RAX: 0000000000000033 RBX: ffff977c0afd0b08 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffff977f2d5a18c0 RDI: ffff977f2d5a18c0
> RBP: ffff977c0afd0b00 R08: 0000000000000000 R09: 4e20736920747865
> R10: 7478656e3e2d3030 R11: 4c4c554e20736920 R12: ffff977c17128010
> R13: 000000000000000a R14: 00000000000000a0 R15: ffff977c17128000
> FS: 0000000000000000(0000) GS:ffff977f2d580000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f063638a000 CR3: 0000000179428002 CR4: 00000000003706f0
> note: kworker/u32:3[248608] exited with preempt_count 3
>
> > > Is there any possibility/way to avoid bisecting? (due limited time from my
> > > side)>
> > So unless you have a reason to specifically use z3fold or avoid
> > zsmalloc, please use zsmalloc. It should be better for you anyway. I
> > doubt that you (or anyone) wants to spend time debugging a z3fold
> > problem :)
>
> I could conceivably try to bisect this, but since I don't have a quick
> reproducer, it would likely take weeks to finish. I'm wondering whether it's
> worth trying or if z3fold is going out of the door anyway. I don't think it's
> hardware-related so it should be possible to test this in a VM, but that still
> takes some effort to set up.

z3fold is going out of the door anyway, I already sent a patch to deprecate it:
https://lore.kernel.org/lkml/20240904233343.933462-1-yosryahmed@google.com/

I will send a new version after the merge window, and I will include
your bug report in the list of problems in the commit log :)

Thanks for the report, please don't waste time debugging this and use zsmalloc!

>
> Best regards,
>
> Tomáš Trnka
>
>
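
For anyone following the zsmalloc recommendation above, here is a minimal sketch of checking which allocator zswap is using and switching it at runtime. It assumes the zswap module parameter file /sys/module/zswap/parameters/zpool is present, that zsmalloc is available in the running kernel, and that the script runs as root when it writes. The helper names current_zpool and switch_zpool are hypothetical and exist only for this illustration.

```python
from pathlib import Path
from typing import Optional

# Runtime zswap parameter (assumed present; it is absent when zswap is not built).
ZPOOL_PARAM = Path("/sys/module/zswap/parameters/zpool")


def current_zpool(param: Path = ZPOOL_PARAM) -> Optional[str]:
    """Return the allocator name zswap reports, or None if the file is missing."""
    try:
        return param.read_text().strip()
    except FileNotFoundError:
        return None


def switch_zpool(target: str = "zsmalloc", param: Path = ZPOOL_PARAM) -> bool:
    """Write `target` to the zpool parameter if it differs.

    Returns True if a write happened, False if the parameter is missing
    or already set to `target`.
    """
    if current_zpool(param) in (None, target):
        return False
    param.write_text(target + "\n")
    return True


if __name__ == "__main__":
    # Read-only by default; call switch_zpool() explicitly (as root) to change it.
    print("zswap zpool:", current_zpool())
```

To make the choice persistent across reboots, the same setting can be passed on the kernel command line as zswap.zpool=zsmalloc.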