linux-mm.kvack.org archive mirror
* "Shrink zones before removing memory" causes kernel panic with kpagecount
@ 2019-10-08 20:18 Qian Cai
  2019-10-08 21:45 ` David Hildenbrand
  0 siblings, 1 reply; 2+ messages in thread
From: Qian Cai @ 2019-10-08 20:18 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: Andrew Morton, linux-mm, linux-kernel

The linux-next series "mm/memory_hotplug: Shrink zones before removing memory"
[1] causes a kernel panic while reading /proc/kpagecount after offlining a
memory section. It was reproduced on both x86 and powerpc. Reverting the whole
series fixed the problem.

[1] https://lore.kernel.org/linux-mm/20191006085646.5768-1-david@redhat.com/

# echo offline > /sys/devices/system/memory/memory124/state 
# cat /proc/kpagecount

[  133.268032][ T8809] remove from free list 7c000 256 7d000
[  133.268134][ T8809] remove from free list 7c100 256 7d000
[  133.268153][ T8809] remove from free list 7c200 256 7d000
[  133.268182][ T8809] remove from free list 7c300 256 7d000
[  133.268212][ T8809] remove from free list 7c400 256 7d000
[  133.268241][ T8809] remove from free list 7c500 256 7d000
[  133.268260][ T8809] remove from free list 7c600 256 7d000
[  133.268289][ T8809] remove from free list 7c700 256 7d000
[  133.268329][ T8809] remove from free list 7c800 256 7d000
[  133.268359][ T8809] remove from free list 7c900 256 7d000
[  133.268399][ T8809] remove from free list 7ca00 256 7d000
[  133.268429][ T8809] remove from free list 7cb00 256 7d000
[  133.268458][ T8809] remove from free list 7cc00 256 7d000
[  133.268488][ T8809] remove from free list 7cd00 256 7d000
[  133.268517][ T8809] remove from free list 7ce00 256 7d000
[  133.268546][ T8809] remove from free list 7cf00 256 7d000
[  133.268580][ T8809] Offlined Pages 4096
[  144.038732][ T8944] BUG: Unable to handle kernel data access at 0xfffffffffffffffe
[  144.038769][ T8944] Faulting instruction address: 0xc000000000590c08
[  144.038794][ T8944] Oops: Kernel access of bad area, sig: 11 [#1]
[  144.038807][ T8944] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
[  144.038822][ T8944] Modules linked in: ip_tables x_tables xfs sd_mod bnx2x mdio ahci libahci tg3 libata libphy firmware_class dm_mirror dm_region_hash dm_log dm_mod
[  144.038864][ T8944] CPU: 116 PID: 8944 Comm: cat Not tainted 5.4.0-rc2+ #6
[  144.038898][ T8944] NIP:  c000000000590c08 LR: c000000000577330 CTR: c0000000005909d0
[  144.038945][ T8944] REGS: c00020196bd6fa30 TRAP: 0380   Not tainted  (5.4.0-rc2+)
[  144.038989][ T8944] MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 48022428  XER: 20040000
[  144.039028][ T8944] CFAR: c000000000590ad0 IRQMASK: 0
[  144.039028][ T8944] GPR00: c000000000577330 c00020196bd6fcc0 c000000001122d00 c0002009d3d4a880
[  144.039028][ T8944] GPR04: 00007fffb6870000 0000000000020000 fffffffffffffffe c00c000000000000
[  144.039028][ T8944] GPR08: 0000000001f00000 c00c000001f00000 0000000000000001 c0000000009413d0
[  144.039028][ T8944] GPR12: c0000000005909d0 c000201fff677000 0000000000000000 0000000000000000
[  144.039028][ T8944] GPR16: 0000000000000002 00007fffca34cfa8 ffffffffffffffff 0000000000000000
[  144.039028][ T8944] GPR20: 0000000000000000 0000000000000000 c000000000000000 c00020196bd6fdf0
[  144.039028][ T8944] GPR24: 00007fffb6870000 0000000007ffffff 0000000000000000 c000000000aa6c20
[  144.039028][ T8944] GPR28: 00007fffb6890000 0000000000000008 000000000007c000 00007fffb6870000
[  144.039240][ T8944] NIP [c000000000590c08] kpagecount_read+0x238/0x3f0
[  144.039263][ T8944] LR [c000000000577330] proc_reg_read+0x90/0x130
[  144.039274][ T8944] Call Trace:
[  144.039304][ T8944] [c00020196bd6fd30] [c000000000577330] proc_reg_read+0x90/0x130
[  144.039342][ T8944] [c00020196bd6fd60] [c0000000004978bc] __vfs_read+0x3c/0x70
[  144.039377][ T8944] [c00020196bd6fd80] [c00000000049799c] vfs_read+0xac/0x170
[  144.039423][ T8944] [c00020196bd6fdd0] [c000000000497dfc] ksys_read+0x7c/0x140
[  144.039472][ T8944] [c00020196bd6fe20] [c00000000000b378] system_call+0x5c/0x68
[  144.039495][ T8944] Instruction dump:
[  144.039513][ T8944] 4e800020 60000000 3d22000d 3929c098 7bc83664 e8e90000 7d274215 418200ac
[  144.039540][ T8944] e9490008 38caffff 714a0001 7cc9309e <e9460000> 2faaffff e9490008 419e00fc
[  144.039580][ T8944] ---[ end trace 96fb2ea2d503fda9 ]---
[  144.492072][ T8944] 
[  145.492172][ T8944] Kernel panic - not syncing: Fatal exception



* Re: "Shrink zones before removing memory" causes kernel panic with kpagecount
  2019-10-08 20:18 "Shrink zones before removing memory" causes kernel panic with kpagecount Qian Cai
@ 2019-10-08 21:45 ` David Hildenbrand
  0 siblings, 0 replies; 2+ messages in thread
From: David Hildenbrand @ 2019-10-08 21:45 UTC (permalink / raw)
  To: Qian Cai
  Cc: Andrew Morton, linux-mm, linux-kernel, Dan Williams, Michal Hocko

On 08.10.19 22:18, Qian Cai wrote:
> The linux-next series "mm/memory_hotplug: Shrink zones before removing memory"
> [1] causes a kernel panic while reading /proc/kpagecount after offlining a
> memory section. It was reproduced on both x86 and powerpc. Reverting the whole
> series fixed the problem.
> 
> [1] https://lore.kernel.org/linux-mm/20191006085646.5768-1-david@redhat.com/
> 
> # echo offline > /sys/devices/system/memory/memory124/state 
> # cat /proc/kpagecount
> 

Thanks, that's somewhat expected as I poison pages more aggressively.
It's a pre-existing issue. You can trigger the exact same BUG by:

1. Hotplugging a DIMM but not onlining it
2. cat /proc/kpagecount
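
For readers wondering where the faulting address 0xfffffffffffffffe comes
from: it is consistent with a compound_head()-style lookup being applied to a
struct page whose memory was filled with an all-ones poison pattern. Below is
a minimal userspace sketch of that arithmetic only. It assumes a 64-bit
machine and an all-ones poison value; `struct fake_page`, `fake_poison()` and
`fake_compound_head_addr()` are hypothetical stand-ins, not kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct page: only the two
 * words that matter for this illustration. */
struct fake_page {
    uintptr_t flags;
    uintptr_t compound_head; /* bit 0 set means "this is a tail page" */
};

/* Fill the structure with 0xff bytes, mimicking a poisoned memmap entry
 * (assumption: the poison pattern is all ones). */
static void fake_poison(struct fake_page *page)
{
    assert(page != NULL);
    memset(page, 0xff, sizeof(*page));
}

/* Same shape as a compound_head() lookup: if bit 0 of the word is set,
 * the word is treated as "pointer to head page, plus one". The address
 * that would be dereferenced next is returned as an integer and is
 * never actually dereferenced here. */
static uintptr_t fake_compound_head_addr(const struct fake_page *page)
{
    uintptr_t head = page->compound_head;

    if (head & 1)
        return head - 1;
    return (uintptr_t)page;
}
```

With a poisoned entry, the word read is all ones, bit 0 is set, and the
"head page" address becomes all ones minus one, which on a 64-bit machine is
the address shown in the oops.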

The right fix is to add a pfn_to_online_page() check to the PFN walker and
skip all PFNs that are not online. This was already discussed in the
context of ZONE_DEVICE and I am still waiting for a fix.

I can prepare and send a fix for that PFN walker tomorrow.
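
To illustrate the idea (this is not the actual patch), here is a small
userspace model of a PFN walker that consults an online check before touching
the memmap. Everything prefixed with `model_` is hypothetical: the section
size, the arrays, and `model_pfn_to_online_page()`, which only imitates the
"return NULL unless the PFN is online" contract of pfn_to_online_page().
Reporting 0 for skipped PFNs is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Tiny, made-up geometry so the model is easy to follow. */
#define MODEL_PAGES_PER_SECTION 4UL
#define MODEL_NR_SECTIONS       3UL
#define MODEL_NR_PAGES (MODEL_NR_SECTIONS * MODEL_PAGES_PER_SECTION)

struct model_page {
    uint64_t count;
};

static struct model_page model_memmap[MODEL_NR_PAGES];
static bool model_section_online[MODEL_NR_SECTIONS];

/* Stand-in for pfn_to_online_page(): returns NULL unless the PFN is in
 * range and the section containing it is marked online. */
static struct model_page *model_pfn_to_online_page(unsigned long pfn)
{
    if (pfn >= MODEL_NR_PAGES)
        return NULL;
    if (!model_section_online[pfn / MODEL_PAGES_PER_SECTION])
        return NULL;
    return &model_memmap[pfn];
}

/* Walk n PFNs starting at start and store one count per PFN in out.
 * PFNs that are not online are reported as 0 and their memmap entry
 * is never read. Returns how many PFNs were backed by an online page. */
static size_t model_walk_counts(unsigned long start, size_t n, uint64_t *out)
{
    size_t online = 0;

    assert(out != NULL);
    for (size_t i = 0; i < n; i++) {
        struct model_page *page = model_pfn_to_online_page(start + i);

        if (!page) {
            out[i] = 0;
            continue;
        }
        out[i] = page->count;
        online++;
    }
    return online;
}
```

The point of the design is that the decision to read a memmap entry is made
from section state alone, so stale or poisoned entries of offline sections
are never interpreted.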

-- 

Thanks,

David / dhildenb

