From: James Morse <james.morse@arm.com>
To: Michal Hocko <mhocko@kernel.org>, Mikulas Patocka <mpatocka@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will.deacon@arm.com>,
linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
Pavel Tatashin <Pavel.Tatashin@microsoft.com>
Subject: Re: A crash on ARM64 in move_freepages_block due to uninitialized pages in reserved memory
Date: Tue, 21 Aug 2018 13:58:38 +0100
Message-ID: <e35b7c14-c7ea-412d-2763-c961b74576f3@arm.com>
In-Reply-To: <20180821104418.GA16611@dhcp22.suse.cz>
Hi guys,
On 08/21/2018 11:44 AM, Michal Hocko wrote:
> On Fri 17-08-18 15:44:27, Mikulas Patocka wrote:
>> I report this crash on ARM64 on the kernel 4.17.11. The reason is that the
>> function move_freepages_block accesses contiguous runs of
>> pageblock_nr_pages. The ARM64 firmware sets holes of reserved memory there
>> and when move_freepages_block stumbles over this hole, it accesses
>> uninitialized page structures and crashes.
Any idea if this is nomap (so a hole in the linear map), or a missing struct page?
>> 00000000-03ffffff : System RAM
>> 00080000-007bffff : Kernel code
>> 00820000-00aa3fff : Kernel data
>> 04200000-bf80ffff : System RAM
>> bf810000-bfbeffff : reserved
>> bfbf0000-bfc8ffff : System RAM
>> bfc90000-bffdffff : reserved
>> bffe0000-bfffffff : System RAM
>> c0000000-dfffffff : MEM
>> c0000000-c00fffff : PCI Bus 0000:01
>> c0000000-c0003fff : 0000:01:00.0
>> c0000000-c0003fff : nvme
To test Laura's bounds-of-zone theory [0], could you put some empty space between the
nvme and the System RAM? (It sounds like this is a KVM guest). Reducing the amount of
memory is probably easiest.
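To see how the reserved ranges in the listing sit relative to pageblock boundaries, here is a minimal userspace sketch. It is not kernel code: the helper names are made up for illustration, and the 2 MiB pageblock size mentioned afterwards is an assumption (4kB pages with pageblock order 9), since the reporter's configuration isn't stated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: round an address down to the start of its
 * pageblock. block_bytes must be a power of two. */
static uint64_t pageblock_start(uint64_t addr, uint64_t block_bytes)
{
    return addr & ~(block_bytes - 1);
}

/* Hypothetical helper: true if the inclusive range [start, end] begins
 * or ends part-way through a pageblock, so that the pageblock also
 * covers addresses outside the range. */
static bool range_shares_pageblock(uint64_t start, uint64_t end,
                                   uint64_t block_bytes)
{
    bool head_unaligned = pageblock_start(start, block_bytes) != start;
    bool tail_unaligned = pageblock_start(end + 1, block_bytes) != end + 1;

    return head_unaligned || tail_unaligned;
}
```

Under the 2 MiB assumption, range_shares_pageblock(0xbf810000, 0xbfbeffff, 0x200000) is true: the reserved region starts 64kB into the pageblock at 0xbf800000, which also holds the tail of the System RAM range ending at bf80ffff.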
>> The bug was already reported here for x86:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1598462
>>
>> For x86, it was fixed in the kernel 4.17.7 - but I observed it in the
>> kernel 4.17.11 on ARM64. I also observed it on 4.18-rc kernels running in
>> KVM virtual machine on ARM when I compiled the guest kernel with 64kB page
>> size.
I'm not sure this is the same bug.
[1] reports hitting a VM_BUG, whereas this is a dereference of -ENOENT:
>> Unable to handle kernel paging request at virtual address fffffffffffffffe
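As a sanity check on the -ENOENT reading, a small userspace sketch. The function names are made up for illustration; the 4095 limit mirrors the kernel's MAX_ERRNO convention for error pointers, and ENOENT is assumed to be 2 as on Linux.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095   /* mirrors the kernel's MAX_ERRNO */

/* Hypothetical helper: true if addr falls in the top 4095 values of
 * the 64-bit address space, where error pointers live. */
static bool looks_like_err_ptr(uint64_t addr)
{
    return addr >= (uint64_t)0 - SKETCH_MAX_ERRNO;
}

/* Hypothetical helper: reinterpret a faulting address as a signed
 * 64-bit value without relying on implementation-defined casts. */
static int64_t fault_addr_as_errno(uint64_t addr)
{
    if (addr > (uint64_t)INT64_MAX)
        return -(int64_t)(~addr) - 1;
    return (int64_t)addr;
}
```

With the address from the oops, fault_addr_as_errno(0xfffffffffffffffeULL) returns -2, which equals -ENOENT, and looks_like_err_ptr() returns true for it.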
Does your kernel have HOLES_IN_ZONE enabled? (It looks like it depends on NUMA)
Could you reproduce this with CONFIG_DEBUG_VM enabled?
move_freepages() uses pfn_valid_within(), so it should handle missing struct pages in
this range.
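The reason HOLES_IN_ZONE matters: pfn_valid_within() is only a real check when CONFIG_HOLES_IN_ZONE is set, otherwise it is always true. A toy userspace model of that behaviour follows. Every name in it is hypothetical, and it only models pfns with no struct page at all, not pages whose struct page exists but was never initialized.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a memory map: valid[i] says whether pfn i has a
 * usable struct page. */
struct toy_memmap {
    const bool *valid;
    size_t nr_pfns;
    bool holes_in_zone;     /* models CONFIG_HOLES_IN_ZONE */
};

/* Models pfn_valid_within(): a real lookup only when holes_in_zone is
 * set, unconditionally true otherwise. */
static bool toy_pfn_valid_within(const struct toy_memmap *m, size_t pfn)
{
    if (!m->holes_in_zone)
        return true;
    return pfn < m->nr_pfns && m->valid[pfn];
}

/* Walk the inclusive pfn range [start, end] the way a pageblock scan
 * would, and count how many pfns without a usable struct page get
 * touched anyway. */
static size_t toy_count_bad_accesses(const struct toy_memmap *m,
                                     size_t start, size_t end)
{
    size_t bad = 0;

    for (size_t pfn = start; pfn <= end && pfn < m->nr_pfns; pfn++) {
        if (!toy_pfn_valid_within(m, pfn))
            continue;       /* hole detected and skipped */
        if (!m->valid[pfn])
            bad++;          /* hole not detected: bad access */
    }
    return bad;
}
```

In this model a range with two holes gives zero bad accesses when holes_in_zone is true and two when it is false.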
>> CPU: 3 PID: 14823 Comm: updatedb.mlocat Not tainted 4.17.11 #16
>> Hardware name: Marvell Armada 8040 MacchiatoBin/Armada 8040 MacchiatoBin, BIOS EDK II Jul 30 2018
>> pstate: 00000085 (nzcv daIf -PAN -UAO)
>> pc : move_freepages_block+0xb4/0x160
>> lr : steal_suitable_fallback+0xe4/0x188
Any chance you could addr2line these?
>> Call trace:
>> move_freepages_block+0xb4/0x160
>> get_page_from_freelist+0xad8/0xea8
>> __alloc_pages_nodemask+0xac/0x970
>> new_slab+0xc0/0x348
>> ___slab_alloc.constprop.32+0x2cc/0x350
>> __slab_alloc.isra.26.constprop.31+0x24/0x38
>> kmem_cache_alloc+0x168/0x198
>> spadfs_alloc_inode+0x2c/0x88
>> alloc_inode+0x20/0xa0
>> iget5_locked+0xf8/0x1c0
>> spadfs_iget+0x44/0x4c8
>> spadfs_lookup+0x70/0x108
Hmmm. What's this?
Thanks,
James
[0] https://www.spinics.net/lists/linux-mm/msg157223.html
[1] https://www.spinics.net/lists/linux-mm/msg156764.html