* [powerpc] Lockups seen during/just after boot (bisected)
@ 2023-11-23 11:27 Sachin Sant
  2023-11-23 14:35 ` Chengming Zhou
  0 siblings, 1 reply; 4+ messages in thread
From: Sachin Sant @ 2023-11-23 11:27 UTC (permalink / raw)
  To: linux-mm, zhouchengming; +Cc: linuxppc-dev, vbabka

While booting a recent -next kernel on an IBM Power server, I have observed
lockups either during boot or just after.

[ 3631.015775] watchdog: CPU 3 self-detected hard LOCKUP @ __update_freelist_slow+0x74/0x90
[ 3631.015783] watchdog: CPU 3 TB:7766577908812231, last heartbeat TB:7766572528409444 (10508ms ago)
[ 3631.015784] Modules linked in: rpadlpar_io(E) rpaphp(E) xsk_diag(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bonding(E) tls(E) rfkill(E) ip_set(E) nf_tables(E) nfnetlink(E) sunrpc(E) binfmt_misc(E) pseries_rng(E) aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) vmx_crypto(E) fuse(E)
[ 3631.015811] CPU: 3 PID: 167427 Comm: sed Kdump: loaded Tainted: G E 6.7.0-rc2-next-20231122 #1
[ 3631.015813] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
[ 3631.015814] NIP: c000000000561f34 LR: c00000000056b108 CTR: c0000000004f4c50
[ 3631.015816] REGS: c000000e87743d60 TRAP: 0900 Tainted: G E (6.7.0-rc2-next-20231122)
[ 3631.015817] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 42042222 XER: 20040000
[ 3631.015822] CFAR: 0000000000000000 IRQMASK: 1 
[ 3631.015822] GPR00: c000000000096968 c000000e87b43ca0 c000000001522100 c00c00000001d700 
[ 3631.015822] GPR04: c0000000075f2000 0000000000200009 c0000000075f0000 0000000000200008 
[ 3631.015822] GPR08: 0000000000001000 0000000000000001 003ffff800000a41 c0000000b189d000 
[ 3631.015822] GPR12: c0000000004f4c50 c000000effffcb00 0000000000000000 0000000000000000 
[ 3631.015822] GPR16: 0000000000000001 c000000002b82a80 0000000000000000 0000000000000000 
[ 3631.015822] GPR20: c000000003016c00 0000000000000000 0000000000210d00 c000000e81273978 
[ 3631.015822] GPR24: 0000000000000000 0000000000000001 c0000000075f0000 c0000000075f0000 
[ 3631.015822] GPR28: c000000e81273900 0000000000200008 c00c00000001d700 c0000000075f2000 
[ 3631.015840] NIP [c000000000561f34] __update_freelist_slow+0x74/0x90
[ 3631.015842] LR [c00000000056b108] __slab_free+0x138/0x4a0
[ 3631.015845] Call Trace:
[ 3631.015845] [c000000e87b43ca0] [c00c00000001d700] 0xc00c00000001d700 (unreliable)
[ 3631.015849] [c000000e87b43d80] [c000000000096968] __tlb_remove_table+0xe8/0x150
[ 3631.015853] [c000000e87b43db0] [c0000000004f4cac] tlb_remove_table_rcu+0x5c/0xa0
[ 3631.015856] [c000000e87b43de0] [c000000000243314] rcu_do_batch+0x234/0x680
[ 3631.015859] [c000000e87b43e90] [c000000000247a80] rcu_core+0x170/0x2d0
[ 3631.015862] [c000000e87b43ee0] [c00000000102054c] __do_softirq+0x15c/0x3c0
[ 3631.015866] [c000000e87b43fe0] [c0000000000182d0] do_softirq_own_stack+0x40/0x60
[ 3631.015869] [c0000000672f7610] [c000000000170668] __irq_exit_rcu+0x128/0x150
[ 3631.015872] [c0000000672f7640] [c0000000001711a0] irq_exit+0x20/0x40
[ 3631.015874] [c0000000672f7660] [c00000000002bb58] timer_interrupt+0x128/0x310
[ 3631.015876] [c0000000672f76c0] [c000000000009ffc] decrementer_common_virt+0x28c/0x290
[ 3631.015879] --- interrupt: 900 at smp_call_function_many_cond+0x1d4/0x6a0
[ 3631.015883] NIP: c000000000298a34 LR: c0000000002989e4 CTR: c0000000000c9f50
[ 3631.015884] REGS: c0000000672f76f0 TRAP: 0900 Tainted: G E (6.7.0-rc2-next-20231122)
[ 3631.015885] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 44042822 XER: 20040000
[ 3631.015888] CFAR: 0000000000000000 IRQMASK: 0 
[ 3631.015888] GPR00: c000000000298eb0 c0000000672f7990 c000000001522100 0000000000000012 
[ 3631.015888] GPR04: 0000000000000012 0000000000000000 0000000000000000 0000000000000012 
[ 3631.015888] GPR08: c000000002b92058 0000000000000001 c000000e81bdc060 c0000000070d0c88 
[ 3631.015888] GPR12: c0000000000c9f50 c000000effffcb00 0000000000000000 0000000000000000 
[ 3631.015888] GPR16: 0000000000000000 0000000000000003 0000000000000000 c000000008270ff0 
[ 3631.015888] GPR20: c000000002b961f8 c0000000000a8270 60000000000000e0 c0000000070d0680 
[ 3631.015888] GPR24: 0000000000000000 0000000000000090 c000000e81273d88 c0000000070d0680 
[ 3631.015888] GPR28: c000000e81273d80 c0000000000a9470 c000000e81273d88 c000000002b968b0 
[ 3631.015906] NIP [c000000000298a34] smp_call_function_many_cond+0x1d4/0x6a0
[ 3631.015908] LR [c0000000002989e4] smp_call_function_many_cond+0x184/0x6a0
[ 3631.015911] --- interrupt: 900
[ 3631.015912] [c0000000672f7990] [c000000000298eb0] smp_call_function_many_cond+0x650/0x6a0 (unreliable)
[ 3631.015915] [c0000000672f7a50] [c0000000000a8270] flush_type_needed+0x1d0/0x260
[ 3631.015917] [c0000000672f7a90] [c0000000000a94ec] radix__flush_tlb_page_psize+0x5c/0x300
[ 3631.015919] [c0000000672f7b00] [c0000000004fd7f4] ptep_clear_flush+0xa4/0x160
[ 3631.015921] [c0000000672f7b50] [c0000000004d9218] wp_page_copy+0x348/0xa40
[ 3631.015924] [c0000000672f7c00] [c0000000004e55b0] __handle_mm_fault+0x470/0x8a0
[ 3631.015927] [c0000000672f7d10] [c0000000004e5af4] handle_mm_fault+0x114/0x3b0
[ 3631.015929] [c0000000672f7d60] [c0000000000900ac] ___do_page_fault+0x3ec/0x8c0
[ 3631.015931] [c0000000672f7e20] [c000000000090670] do_page_fault+0x30/0xc0
[ 3631.015933] [c0000000672f7e50] [c000000000008be0] data_access_common_virt+0x210/0x220
[ 3631.015935] --- interrupt: 300 at 0x7fff98f81c18
[ 3631.015936] NIP: 00007fff98f81c18 LR: 00007fff98f81c08 CTR: 00007fff98c53380
[ 3631.015937] REGS: c0000000672f7e80 TRAP: 0300 Tainted: G E (6.7.0-rc2-next-20231122)
[ 3631.015938] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 24002422 XER: 20040000
[ 3631.015943] CFAR: 00007fff98f81b0c DAR: 00007fff98fa0000 DSISR: 0a000000 IRQMASK: 0 
[ 3631.015943] GPR00: 00007fff98ff7d44 00007fffe2816a10 00007fff98fa7f00 00007fff98fa0000 
[ 3631.015943] GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[ 3631.015943] GPR08: 0000000000000000 0000000000000001 0000000000000001 0000000000002000 
[ 3631.015943] GPR12: 00007fff98c53380 00007fff9905bdc0 0000000000000000 0000000000000000 
[ 3631.015943] GPR16: 0000000000000000 0000000000000000 0000000000000000 00007fff98f9fb30 
[ 3631.015943] GPR20: 00007fffe2816aa8 00007fffe2816a80 00007fffe2816a50 00007fff99050000 
[ 3631.015943] GPR24: 0000000000000000 00007fff9904f668 0000000000000000 00007fff99050988 
[ 3631.015943] GPR28: 00007fff99052040 0000000000000000 00007fff99050000 00007fffe2816a50 
[ 3631.015958] NIP [00007fff98f81c18] 0x7fff98f81c18
[ 3631.015959] LR [00007fff98f81c08] 0x7fff98f81c08
[ 3631.015960] --- interrupt: 300
[ 3631.015960] Code: 60000000 60000000 60000000 e9230028 7c292800 4082ffd4 39400001 f8c30020 f8e30028 4bffffc4 60000000 7c40003c <60000000> e9230000 71290001 4082fff0
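
The "(10508ms ago)" figure in the watchdog line above can be cross-checked from the two timebase (TB) values it prints. This is only a sketch: the 512 MHz timebase frequency is an assumption (it is not stated in the report), and `tb_delta_ms` is a hypothetical helper name.

```python
# Assumption: the timebase ticks at 512 MHz. The real value for a given
# machine should be confirmed on that machine, not taken from this sketch.
ASSUMED_TB_HZ = 512_000_000


def tb_delta_ms(now_tb: int, last_tb: int, tb_hz: int = ASSUMED_TB_HZ) -> int:
    """Return the elapsed time between two timebase readings in whole ms."""
    ticks = now_tb - last_tb
    # Integer arithmetic, truncating toward zero like the printed value.
    return ticks * 1000 // tb_hz


# TB values copied from the "watchdog: CPU 3 TB:..." line.
elapsed = tb_delta_ms(7766577908812231, 7766572528409444)
print(elapsed)  # 10508
```

Under that assumption the difference of 5380402787 ticks comes out at 10508 ms, which agrees with the interval the watchdog reported since the last heartbeat.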

Git bisect points to the following patch:

commit c8d312e039030edab25836a326bcaeb2a3d4db14
    slub: Delay freezing of partial slabs

Bisect log:

git bisect start
# status: waiting for both good and bad commits
# bad: [288736c822de7fd3b69be317c11eaa8dfb78bf6f] Add linux-next specific files for 20231122
git bisect bad 288736c822de7fd3b69be317c11eaa8dfb78bf6f
# status: waiting for good commit(s), bad commit known
# good: [98b1cc82c4affc16f5598d4fa14b1858671b2263] Linux 6.7-rc2
git bisect good 98b1cc82c4affc16f5598d4fa14b1858671b2263
# good: [9540131d5721e24c00b118ce852c761285515b26] Merge branch 'main' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
git bisect good 9540131d5721e24c00b118ce852c761285515b26
# good: [99444230e9595fc7050292ce284003d7e7d4b53e] Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input.git
git bisect good 99444230e9595fc7050292ce284003d7e7d4b53e
# good: [a95502fad0ab45767b263f46719bba3885e9597c] Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt.git
git bisect good a95502fad0ab45767b263f46719bba3885e9597c
# good: [5317edbc82dbfac690f2ff720667291ecf6ccee0] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-gpio-intel.git
git bisect good 5317edbc82dbfac690f2ff720667291ecf6ccee0
# good: [602bf18307981f3bfd9ebf19921791a4256d3fd1] Merge branch 'for-6.7' into for-next
git bisect good 602bf18307981f3bfd9ebf19921791a4256d3fd1
# good: [d5bf2252dd17fa9fa87206862500884e9a342c9b] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/pinctrl/samsung.git
git bisect good d5bf2252dd17fa9fa87206862500884e9a342c9b
# good: [48b10dee6a7a26cc695bf4932f091d423b94b429] Merge branch 'zstd-next' of https://github.com/terrelln/linux.git
git bisect good 48b10dee6a7a26cc695bf4932f091d423b94b429
# good: [f6a4a72703c035882d7595198ce83d021b6b1c96] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/krisman/unicode.git
git bisect good f6a4a72703c035882d7595198ce83d021b6b1c96
# bad: [dd374e220ba492f95344a638b1efe5b2744fdd73] slub: Update frozen slabs documentations in the source
git bisect bad dd374e220ba492f95344a638b1efe5b2744fdd73
# good: [a3058965bb35490454953aa2c87ea51004839f2f] slub: Prepare __slab_free() for unfrozen partial slab out of node partial list
git bisect good a3058965bb35490454953aa2c87ea51004839f2f
# bad: [c8d312e039030edab25836a326bcaeb2a3d4db14] slub: Delay freezing of partial slabs
git bisect bad c8d312e039030edab25836a326bcaeb2a3d4db14
# good: [00b15a19ee543f0117cb217fcbab8b7b3fd50677] slub: Introduce freeze_slab()
git bisect good 00b15a19ee543f0117cb217fcbab8b7b3fd50677
# first bad commit: [c8d312e039030edab25836a326bcaeb2a3d4db14] slub: Delay freezing of partial slabs
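
A log in this format can be summarized mechanically. The sketch below is illustrative only: `summarize_bisect_log` is a hypothetical helper, and it assumes the comment-line layout seen above (`# good:`, `# bad:`, `# first bad commit:` followed by a 40-character hash in brackets and the subject).

```python
import re

# Matches the comment lines of a bisect log, e.g.
#   # good: [<40-hex sha>] subject
MARK = re.compile(r"^# (good|bad|first bad commit): \[([0-9a-f]{40})\] (.*)$")


def summarize_bisect_log(text: str) -> dict:
    """Collect good/bad verdicts and the first bad commit from a bisect log."""
    good, bad, first_bad = [], [], None
    for line in text.splitlines():
        match = MARK.match(line.strip())
        if not match:
            continue  # skip "git bisect ..." commands and "# status:" lines
        kind, sha, subject = match.groups()
        if kind == "good":
            good.append((sha, subject))
        elif kind == "bad":
            bad.append((sha, subject))
        else:
            first_bad = (sha, subject)
    return {"good": good, "bad": bad, "first_bad": first_bad}


# The last steps of the log above, used as sample input.
SAMPLE = """\
# bad: [c8d312e039030edab25836a326bcaeb2a3d4db14] slub: Delay freezing of partial slabs
git bisect bad c8d312e039030edab25836a326bcaeb2a3d4db14
# good: [00b15a19ee543f0117cb217fcbab8b7b3fd50677] slub: Introduce freeze_slab()
git bisect good 00b15a19ee543f0117cb217fcbab8b7b3fd50677
# first bad commit: [c8d312e039030edab25836a326bcaeb2a3d4db14] slub: Delay freezing of partial slabs
"""

summary = summarize_bisect_log(SAMPLE)
print(summary["first_bad"][1])  # slub: Delay freezing of partial slabs
```

To repeat the bisection itself rather than just read the log, the log can be saved to a file and passed to `git bisect replay` in a tree that contains these commits.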

- Sachin

* Re: [powerpc] Lockups seen during/just after boot (bisected)
  2023-11-23 11:27 [powerpc] Lockups seen during/just after boot (bisected) Sachin Sant
@ 2023-11-23 14:35 ` Chengming Zhou
  2023-11-24  8:55   ` Vlastimil Babka
  0 siblings, 1 reply; 4+ messages in thread
From: Chengming Zhou @ 2023-11-23 14:35 UTC (permalink / raw)
  To: Sachin Sant, linux-mm; +Cc: linuxppc-dev, vbabka

On 2023/11/23 19:27, Sachin Sant wrote:
> While booting recent -next kernel on IBM Power server, I have observed lockups
> either during boot or just after.
> 
> [ 3631.015775] watchdog: CPU 3 self-detected hard LOCKUP @ __update_freelist_slow+0x74/0x90

Sorry, the bug can be fixed by this patch from Vlastimil Babka:

https://lore.kernel.org/all/83ff4b9e-94f1-8b35-1233-3dd414ea4dfe@suse.cz/

Thanks!


* Re: [powerpc] Lockups seen during/just after boot (bisected)
  2023-11-23 14:35 ` Chengming Zhou
@ 2023-11-24  8:55   ` Vlastimil Babka
  2023-11-24  9:05     ` Sachin Sant
  0 siblings, 1 reply; 4+ messages in thread
From: Vlastimil Babka @ 2023-11-24  8:55 UTC (permalink / raw)
  To: Chengming Zhou, Sachin Sant, linux-mm; +Cc: linuxppc-dev

On 11/23/23 15:35, Chengming Zhou wrote:
> On 2023/11/23 19:27, Sachin Sant wrote:
>> While booting recent -next kernel on IBM Power server, I have observed lockups
>> either during boot or just after.
>> 
>> [ 3631.015775] watchdog: CPU 3 self-detected hard LOCKUP @ __update_freelist_slow+0x74/0x90
> 
> Sorry, the bug can be fixed by this patch from Vlastimil Babka:
> 
> https://lore.kernel.org/all/83ff4b9e-94f1-8b35-1233-3dd414ea4dfe@suse.cz/

The current -next should be fixed, the fix was folded to the preparatory
commit, which is now:

8a399e2f6003 ("slub: Keep track of whether slub is on the per-node partial
list")

> Thanks!

* Re: [powerpc] Lockups seen during/just after boot (bisected)
  2023-11-24  8:55   ` Vlastimil Babka
@ 2023-11-24  9:05     ` Sachin Sant
  0 siblings, 0 replies; 4+ messages in thread
From: Sachin Sant @ 2023-11-24  9:05 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: Chengming Zhou, linux-mm, linuxppc-dev


> On 24-Nov-2023, at 2:25 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
> 
> On 11/23/23 15:35, Chengming Zhou wrote:
>> On 2023/11/23 19:27, Sachin Sant wrote:
>>> While booting recent -next kernel on IBM Power server, I have observed lockups
>>> either during boot or just after.
>>> 
>>> [ 3631.015775] watchdog: CPU 3 self-detected hard LOCKUP @ __update_freelist_slow+0x74/0x90
>> 
>> Sorry, the bug can be fixed by this patch from Vlastimil Babka:
>> 
>> https://lore.kernel.org/all/83ff4b9e-94f1-8b35-1233-3dd414ea4dfe@suse.cz/
> 
> The current -next should be fixed, the fix was folded to the preparatory
> commit, which is now:
> 

Thanks. Yes, the problem is fixed with today's -next.

- Sachin