linux-mm.kvack.org archive mirror
* [powerpc] Kernel crash with THP tests (next-20220920)
@ 2022-09-21  6:30 Sachin Sant
  2022-09-21 23:41 ` Mike Kravetz
  0 siblings, 1 reply; 3+ messages in thread
From: Sachin Sant @ 2022-09-21  6:30 UTC (permalink / raw)
  To: linuxppc-dev, linux-mm; +Cc: mike.kravetz, open list

While running transparent huge page tests [1] against 6.0.0-rc6-next-20220920,
the following crash is seen on an IBM Power server.

Kernel attempted to read user page (34) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x00000034
Faulting instruction address: 0xc0000000004d2744
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: dm_mod(E) bonding(E) rfkill(E) tls(E) sunrpc(E) nd_pmem(E) nd_btt(E) dax_pmem(E) papr_scm(E) libnvdimm(E) pseries_rng(E) vmx_crypto(E) ext4(E) mbcache(E) jbd2(E) sd_mod(E) t10_pi(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) fuse(E)
CPU: 37 PID: 2219255 Comm: sysctl Tainted: G            E      6.0.0-rc6-next-20220920 #1
NIP:  c0000000004d2744 LR: c0000000004d2734 CTR: 0000000000000000
REGS: c0000012801bf660 TRAP: 0300   Tainted: G            E       (6.0.0-rc6-next-20220920)
MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24048222  XER: 20040000
CFAR: c0000000004b0eac DAR: 0000000000000034 DSISR: 40000000 IRQMASK: 0 
GPR00: c0000000004d2734 c0000012801bf900 c000000002a92300 0000000000000000 
GPR04: c000000002ac8ac0 c000000001209340 0000000000000005 c000001286714b80 
GPR08: 0000000000000034 0000000000000000 0000000000000000 0000000000000000 
GPR12: 0000000028048242 c00000167fff6b00 0000000000000000 0000000000000000 
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR20: c0000012801bfae8 0000000000000001 0000000000000100 0000000000000001 
GPR24: c0000012801bfae8 c000000002ac8ac0 0000000000000002 0000000000000005 
GPR28: 0000000000000000 0000000000000001 0000000000000000 0000000000346cca 
NIP [c0000000004d2744] alloc_buddy_huge_page+0xd4/0x240
LR [c0000000004d2734] alloc_buddy_huge_page+0xc4/0x240
Call Trace:
[c0000012801bf900] [c0000000004d2734] alloc_buddy_huge_page+0xc4/0x240 (unreliable)
[c0000012801bf9b0] [c0000000004d46a4] alloc_fresh_huge_page.part.72+0x214/0x2a0
[c0000012801bfa40] [c0000000004d7f88] alloc_pool_huge_page+0x118/0x190
[c0000012801bfa90] [c0000000004d84dc] __nr_hugepages_store_common+0x4dc/0x610
[c0000012801bfb70] [c0000000004d88bc] hugetlb_sysctl_handler_common+0x13c/0x180
[c0000012801bfc10] [c0000000006380e0] proc_sys_call_handler+0x210/0x350
[c0000012801bfc90] [c000000000551c00] vfs_write+0x2e0/0x460
[c0000012801bfd50] [c000000000551f5c] ksys_write+0x7c/0x140
[c0000012801bfda0] [c000000000033f58] system_call_exception+0x188/0x3f0
[c0000012801bfe10] [c00000000000c53c] system_call_common+0xec/0x270
--- interrupt: c00 at 0x7fffa9520c34
NIP:  00007fffa9520c34 LR: 00000001024754bc CTR: 0000000000000000
REGS: c0000012801bfe80 TRAP: 0c00   Tainted: G            E       (6.0.0-rc6-next-20220920)
MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 28002202  XER: 00000000
IRQMASK: 0 
GPR00: 0000000000000004 00007fffccd76cd0 00007fffa9607300 0000000000000003 
GPR04: 0000000138da6970 0000000000000006 fffffffffffffff6 0000000000000000 
GPR08: 0000000138da6970 0000000000000000 0000000000000000 0000000000000000 
GPR12: 0000000000000000 00007fffa9a40940 0000000000000000 0000000000000000 
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR24: 0000000000000001 0000000000000010 0000000000000006 0000000138da8aa0 
GPR28: 00007fffa95fc2c8 0000000138da8aa0 0000000000000006 0000000138da6930 
NIP [00007fffa9520c34] 0x7fffa9520c34
LR [00000001024754bc] 0x1024754bc
--- interrupt: c00
Instruction dump:
3b400002 3ba00001 3b800000 7f26cb78 7fc5f378 7f64db78 7fe3fb78 4bfde5b9 
60000000 7c691b78 39030034 7c0004ac <7d404028> 7c0ae800 40c20010 7f80412d 
---[ end trace 0000000000000000 ]---

Kernel panic - not syncing: Fatal exception

Bisect points to the following patch:
commit f2f3c25dea3acfb17aecb7273541e7266dfc8842
    hugetlb: freeze allocated pages before creating hugetlb pages

Reverting the patch allows the test to run successfully.
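
The instruction dump in the oops above can be read by hand. The sketch
below is a minimal, hypothetical decoder (the name decode_ppc is not from
any existing tool) that handles only the four encodings found just before
the faulting word <7d404028>. It suggests the fault is an atomic load
(lwarx) from r3 + 0x34 with r3 == 0, which matches DAR = 0x34 and GPR03 in
the register dump. That r3 holds a NULL page pointer returned by the
preceding call is an inference from the dump, not something stated in this
report.

```python
def decode_ppc(word: int) -> str:
    """Decode a few 32-bit PowerPC instruction words (illustrative subset only)."""
    opcode = word >> 26            # primary opcode, top 6 bits
    f1 = (word >> 21) & 0x1F       # RT (or RS for logical ops)
    f2 = (word >> 16) & 0x1F       # RA
    f3 = (word >> 11) & 0x1F       # RB
    if opcode == 14:               # addi RT,RA,SI (D-form)
        si = word & 0xFFFF
        if si & 0x8000:            # sign-extend the 16-bit immediate
            si -= 0x10000
        return f"addi r{f1},r{f2},{si:#x}"
    if opcode == 31:               # X-form, selected by extended opcode
        xo = (word >> 1) & 0x3FF
        if xo == 20:               # lwarx RT,RA,RB (RA field 0 means base 0)
            base = "0" if f2 == 0 else f"r{f2}"
            return f"lwarx r{f1},{base},r{f3}"
        if xo == 444:              # or RA,RS,RB (same RS and RB is "mr")
            return f"or r{f2},r{f1},r{f3}"
        if xo == 598:              # sync
            return "sync"
    return f".long {word:#010x}"   # anything else is left undecoded


# The four words leading up to and including the faulting instruction.
words = [0x7C691B78, 0x39030034, 0x7C0004AC, 0x7D404028]
decoded = [decode_ppc(w) for w in words]
for w, text in zip(words, decoded):
    print(f"{w:08x}  {text}")

# Values taken from the register dump in the report.
gpr3 = 0x0                         # GPR03
dar = 0x34                         # DAR, the faulting data address
effective_address = gpr3 + 0x34    # address formed by "addi r8,r3,0x34"
print(hex(effective_address), effective_address == dar)
```

The undecoded fallback is deliberate: a full disassembly is better done
with objdump against the matching vmlinux.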

Thanks
- Sachin

[1] https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/transparent_hugepages_defrag.py


* Re: [powerpc] Kernel crash with THP tests (next-20220920)
  2022-09-21  6:30 [powerpc] Kernel crash with THP tests (next-20220920) Sachin Sant
@ 2022-09-21 23:41 ` Mike Kravetz
  2022-09-22 12:53   ` Sachin Sant
  0 siblings, 1 reply; 3+ messages in thread
From: Mike Kravetz @ 2022-09-21 23:41 UTC (permalink / raw)
  To: Sachin Sant; +Cc: linuxppc-dev, linux-mm, open list

On 09/21/22 12:00, Sachin Sant wrote:
> While running transparent huge page tests [1] against 6.0.0-rc6-next-20220920
> following crash is seen on IBM Power server.

Thanks Sachin,

Naoya reported this, with my analysis here:
https://lore.kernel.org/linux-mm/YyqCS6+OXAgoqI8T@monkey/

An updated version of the patch was posted here,
https://lore.kernel.org/linux-mm/20220921202702.106069-1-mike.kravetz@oracle.com/

Sorry about that,
-- 
Mike Kravetz




* Re: [powerpc] Kernel crash with THP tests (next-20220920)
  2022-09-21 23:41 ` Mike Kravetz
@ 2022-09-22 12:53   ` Sachin Sant
  0 siblings, 0 replies; 3+ messages in thread
From: Sachin Sant @ 2022-09-22 12:53 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: linuxppc-dev, linux-mm, open list



> On 22-Sep-2022, at 5:11 AM, Mike Kravetz <mike.kravetz@oracle.com> wrote:
> 
> On 09/21/22 12:00, Sachin Sant wrote:
>> While running transparent huge page tests [1] against 6.0.0-rc6-next-20220920
>> following crash is seen on IBM Power server.
> 
> Thanks Sachin,
> 
> Naoya reported this, with my analysis here:
> https://lore.kernel.org/linux-mm/YyqCS6+OXAgoqI8T@monkey/
> 

Thanks Mike for the pointer.

> An updated version of the patch was posted here,
> https://lore.kernel.org/linux-mm/20220921202702.106069-1-mike.kravetz@oracle.com/
> 
This updated patch works for me. The test runs to completion without any
issues.

- Sachin

