From: Vlastimil Babka <vbabka@suse.cz>
To: Florian Weimer <fw@deneb.enyo.de>
Cc: Dave Chinner <david@fromorbit.com>,
	linux-mm@kvack.org, Mel Gorman <mgorman@techsingularity.net>
Subject: Re: [bug, 5.2.16] kswapd/compaction null pointer crash [was Re: xfs_inode not reclaimed/memory leak on 5.2.16]
Date: Wed, 16 Oct 2019 22:03:39 +0200
Message-ID: <3560f07e-03a7-9291-6494-e0580eeaa6bd@suse.cz>
In-Reply-To: <87blugh452.fsf@mid.deneb.enyo.de>

On 10/16/19 9:38 PM, Florian Weimer wrote:
> This time, I've got a kernel with debugging information (still
> 5.2.18). The crash is at offset 0x39f:
>
> if (!mem_section[SECTION_NR_TO_ROOT(nr)])
> 384: 48 c1 ea 35 shr $0x35,%rdx
> 388: 48 8b 14 d7 mov (%rdi,%rdx,8),%rdx
> 38c: 48 c1 e8 2d shr $0x2d,%rax
> 390: 48 85 d2 test %rdx,%rdx
> 393: 74 0a je 39f <__reset_isolation_pfn+0x27f>
> return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
> 395: 0f b6 c0 movzbl %al,%eax
> 398: 48 c1 e0 04 shl $0x4,%rax
> 39c: 48 01 c2 add %rax,%rdx
> unsigned long map = section->section_mem_map;
> 39f: 48 8b 02 mov (%rdx),%rax
> clear_pageblock_skip(page);
> 3a2: 4c 89 f2 mov %r14,%rdx
> 3a5: 41 b8 01 00 00 00 mov $0x1,%r8d
> 3ab: 31 f6 xor %esi,%esi
> 3ad: b9 03 00 00 00 mov $0x3,%ecx
> 3b2: 4c 89 f7 mov %r14,%rdi
>
> Hmm, -l output is likely more helpful here:
>
> /home/fw/src/linux/linux/mm/compaction.c:293
> 37a: a8 10 test $0x10,%al
> 37c: 74 bc je 33a <__reset_isolation_pfn+0x21a>
> page_to_section():
> /home/fw/src/linux/linux/./include/linux/mm.h:1265
> 37e: 49 8b 16 mov (%r14),%rdx
> 381: 48 89 d0 mov %rdx,%rax
> __nr_to_section():
> /home/fw/src/linux/linux/./include/linux/mmzone.h:1218
> 384: 48 c1 ea 35 shr $0x35,%rdx
> 388: 48 8b 14 d7 mov (%rdi,%rdx,8),%rdx
> page_to_section():
> /home/fw/src/linux/linux/./include/linux/mm.h:1265
> 38c: 48 c1 e8 2d shr $0x2d,%rax
> __nr_to_section():
> /home/fw/src/linux/linux/./include/linux/mmzone.h:1218
> 390: 48 85 d2 test %rdx,%rdx
> 393: 74 0a je 39f <__reset_isolation_pfn+0x27f>
> /home/fw/src/linux/linux/./include/linux/mmzone.h:1220
> 395: 0f b6 c0 movzbl %al,%eax
> 398: 48 c1 e0 04 shl $0x4,%rax
> 39c: 48 01 c2 add %rax,%rdx
> __section_mem_map_addr():
> /home/fw/src/linux/linux/./include/linux/mmzone.h:1247
> 39f: 48 8b 02 mov (%rdx),%rax
> __reset_isolation_pfn():
> /home/fw/src/linux/linux/mm/compaction.c:294
> 3a2: 4c 89 f2 mov %r14,%rdx
> 3a5: 41 b8 01 00 00 00 mov $0x1,%r8d
> 3ab: 31 f6 xor %esi,%esi
>
> It's this loop:
>
> 286 /*
> 287 * Only clear the hint if a sample indicates there is either a
> 288 * free page or an LRU page in the block. One or other condition
> 289 * is necessary for the block to be a migration source/target.
> 290 */
> 291 do {
> 292 if (pfn_valid_within(pfn)) {
> 293 if (check_source && PageLRU(page)) {
> 294 clear_pageblock_skip(page);

Thanks. It looks like the crash is indeed here, in the page_to_pfn()
embedded in the clear_pageblock_skip() expansion. We've got a wrong
struct page pointer, so page_to_section() gives us a bogus value,
__nr_to_section() returns a null pointer, and __section_mem_map_addr()
then dereferences it.

Hopefully the commit [1] addresses the reason why we got a wrong page
pointer. You could try cherry-picking it to your stable tree, or wait
until it appears in a (hopefully near) future stable 5.3.y (5.2 is EOL,
so it won't appear there).

Thanks,
Vlastimil

> 295 return true;
> 296 }
> 297
> 298 if (check_target && PageBuddy(page)) {
> 299 clear_pageblock_skip(page);
> 300 return true;
> 301 }
> 302 }
> 303
> 304 page += (1 << PAGE_ALLOC_COSTLY_ORDER);
> 305 pfn += (1 << PAGE_ALLOC_COSTLY_ORDER);
> 306 } while (page < end_page);
>