linux-mm.kvack.org archive mirror
From: "Tomáš Trnka" <trnka@scm.com>
To: yosryahmed@google.com
Cc: hannes@cmpxchg.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, nphamcs@gmail.com, pedro.falcato@gmail.com,
	piotr.oniszczuk@gmail.com, regressions@lists.linux.dev,
	willy@infradead.org
Subject: Re: [regression] oops on heavy compilations ("kernel BUG at mm/zswap.c:1005!" and "Oops: invalid opcode: 0000")
Date: Fri, 13 Sep 2024 11:03:18 +0200
Message-ID: <2272920.vFx2qVVIhK@electra>
In-Reply-To: <CAJD7tkaTcnuCFW+dWTzSAuLKBqkkGv9s5uByYm9DaJC=Cp-Xqg@mail.gmail.com>

> Well, it's possible that some zswap change was not fully compatible
> with z3fold, or surfaced a dormant bug in z3fold. Either way, my
> recommendation is to use zsmalloc. I have been trying to deprecate
> z3fold, and honestly you are the only person I have seen use z3fold in
> a while -- which is probably why no one else reported such a problem.

FWIW, I have repeatedly hit this exact BUG (mm/zswap.c:1005) on two of my 
machines on 6.10.x (possibly 6.9.x as well, but I don't have the logs at hand 
to confirm). In both cases, this was also using z3fold under moderate memory 
pressure. I think this fairly conclusively rules out a HW issue.

Additionally, I have hit the following BUG on 6.10.8, which is potentially 
related (note __z3fold_alloc in there):

list_del corruption, ffff977c17128000->next is NULL
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:52!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 248608 Comm: kworker/u32:3 Tainted: G        W          6.10.8-100.fc39.x86_64 #1
Hardware name: HP HP EliteBook 850 G6/8549, BIOS R70 Ver. 01.28.00 04/12/2024
Workqueue: zswap12 compact_page_work
RIP: 0010:__list_del_entry_valid_or_report+0x5d/0xc0
Code: 48 8b 01 48 39 f8 75 5a 48 8b 72 08 48 39 f0 75 65 b8 01 00 00 00 c3 cc cc cc cc 48 89 fe 48 c7 c7 f0 89 ba ad e8 73 34 8f ff <0f> 0b 48 89 fe 48 c7 c7 20 8a ba ad e8 62 34 8f ff 0f 0b 48 89 fe
RSP: 0018:ffffac7299f5bdb0 EFLAGS: 00010246
RAX: 0000000000000033 RBX: ffff977c0afd0b08 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff977f2d5a18c0 RDI: ffff977f2d5a18c0
RBP: ffff977c0afd0b00 R08: 0000000000000000 R09: 4e20736920747865
R10: 7478656e3e2d3030 R11: 4c4c554e20736920 R12: ffff977c17128010
R13: 000000000000000a R14: 00000000000000a0 R15: ffff977c17128000
FS:  0000000000000000(0000) GS:ffff977f2d580000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f063638a000 CR3: 0000000179428002 CR4: 00000000003706f0
Call Trace:
 <TASK>
 ? die+0x36/0x90
 ? do_trap+0xdd/0x100
 ? __list_del_entry_valid_or_report+0x5d/0xc0
 ? do_error_trap+0x6a/0x90
 ? __list_del_entry_valid_or_report+0x5d/0xc0
 ? exc_invalid_op+0x50/0x70
 ? __list_del_entry_valid_or_report+0x5d/0xc0
 ? asm_exc_invalid_op+0x1a/0x20
 ? __list_del_entry_valid_or_report+0x5d/0xc0
 __z3fold_alloc+0x4e/0x4b0
 do_compact_page+0x20e/0xa60
 process_one_work+0x17b/0x390
 worker_thread+0x265/0x380
 ? __pfx_worker_thread+0x10/0x10
 kthread+0xcf/0x100
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x31/0x50
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1a/0x30
 </TASK>
Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast lp parport 
ti_usb_3410_5052 hid_logitech_hidpp snd_usb_audio snd_usbmidi_lib snd_ump 
snd_rawmidi hid_logitech_dj r8153_ecm cdc_ether usbnet r8152 mii ib_core 
dimlib tls >
 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_scodec_component 
snd_soc_dmic snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel 
soundwire_cadence snd_sof_intel_hda_common snd_sof_intel_hda_mlink 
snd_sof_intel_hda snd>
 processor_thermal_device_pci_legacy intel_cstate hp_wmi 
processor_thermal_device snd_timer sparse_keymap processor_thermal_wt_hint 
intel_uncore intel_wmi_thunderbolt thunderbolt wmi_bmof cfg80211 snd 
processor_thermal_rfim i2c_i801 sp>
---[ end trace 0000000000000000 ]---
RIP: 0010:__list_del_entry_valid_or_report+0x5d/0xc0
Code: 48 8b 01 48 39 f8 75 5a 48 8b 72 08 48 39 f0 75 65 b8 01 00 00 00 c3 cc cc cc cc 48 89 fe 48 c7 c7 f0 89 ba ad e8 73 34 8f ff <0f> 0b 48 89 fe 48 c7 c7 20 8a ba ad e8 62 34 8f ff 0f 0b 48 89 fe
RSP: 0018:ffffac7299f5bdb0 EFLAGS: 00010246
RAX: 0000000000000033 RBX: ffff977c0afd0b08 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff977f2d5a18c0 RDI: ffff977f2d5a18c0
RBP: ffff977c0afd0b00 R08: 0000000000000000 R09: 4e20736920747865
R10: 7478656e3e2d3030 R11: 4c4c554e20736920 R12: ffff977c17128010
R13: 000000000000000a R14: 00000000000000a0 R15: ffff977c17128000
FS:  0000000000000000(0000) GS:ffff977f2d580000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f063638a000 CR3: 0000000179428002 CR4: 00000000003706f0
note: kworker/u32:3[248608] exited with preempt_count 3
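
For readers unfamiliar with the "list_del corruption, ...->next is NULL"
line at the top of the trace: it comes from a sanity check that runs before
an entry is unlinked from a doubly linked list. Below is a minimal userspace
sketch of that kind of check, not the kernel's actual code. The names
list_node, list_del_status, check_list_del and checked_list_del are
hypothetical and exist only for this illustration. The real check in
lib/list_debug.c also tests for poison values, which are omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal doubly linked list node, shaped like the kernel's struct list_head. */
struct list_node {
    struct list_node *next;
    struct list_node *prev;
};

/* Hypothetical result codes for this sketch (not kernel identifiers). */
enum list_del_status {
    LIST_DEL_OK,
    LIST_DEL_NEXT_NULL,      /* the case reported in the trace above */
    LIST_DEL_PREV_NULL,
    LIST_DEL_PREV_MISMATCH,  /* entry->prev->next does not point back */
    LIST_DEL_NEXT_MISMATCH   /* entry->next->prev does not point back */
};

/* Initialise an empty circular list: the head points at itself. */
static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

/* Link entry in just before head, i.e. at the tail of the list. */
static void list_add_tail_node(struct list_node *entry, struct list_node *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Validate an entry before unlinking it; nothing is modified here. */
static enum list_del_status check_list_del(const struct list_node *entry)
{
    assert(entry != NULL);
    if (entry->next == NULL)
        return LIST_DEL_NEXT_NULL;
    if (entry->prev == NULL)
        return LIST_DEL_PREV_NULL;
    if (entry->prev->next != entry)
        return LIST_DEL_PREV_MISMATCH;
    if (entry->next->prev != entry)
        return LIST_DEL_NEXT_MISMATCH;
    return LIST_DEL_OK;
}

/* Unlink entry only if the check passes, then clear its pointers. */
static enum list_del_status checked_list_del(struct list_node *entry)
{
    enum list_del_status st = check_list_del(entry);

    if (st != LIST_DEL_OK)
        return st;
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = NULL;
    entry->prev = NULL;
    return LIST_DEL_OK;
}
```

In this sketch, calling checked_list_del() a second time on the same entry
returns LIST_DEL_NEXT_NULL, because the first call cleared the pointers. An
entry that was already unlinked, or never linked, is one way to end up with
a NULL next pointer. Whether that is what happens in z3fold here is not
established by the trace alone.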

> > Is there any possibility/way to avoid bisecting? (due limited time from my
> > side)
>
> So unless you have a reason to specifically use z3fold or avoid
> zsmalloc, please use zsmalloc. It should be better for you anyway. I
> doubt that you (or anyone) wants to spend time debugging a z3fold
> problem :)

I could conceivably try to bisect this, but since I don't have a quick 
reproducer, it would likely take weeks to finish. I'm wondering whether it's 
worth trying or whether z3fold is on its way out anyway. I don't think it's 
hardware-related, so it should be possible to test this in a VM, but that 
still takes some effort to set up.

Best regards,

Tomáš Trnka



