From: Vlastimil Babka <vbabka@suse.cz>
To: Florian Weimer <fw@deneb.enyo.de>
Cc: Dave Chinner <david@fromorbit.com>,
	linux-mm@kvack.org, Mel Gorman <mgorman@techsingularity.net>
Subject: Re: [bug, 5.2.16] kswapd/compaction null pointer crash [was Re: xfs_inode not reclaimed/memory leak on 5.2.16]
Date: Wed, 16 Oct 2019 22:03:39 +0200
Message-ID: <3560f07e-03a7-9291-6494-e0580eeaa6bd@suse.cz>
In-Reply-To: <87blugh452.fsf@mid.deneb.enyo.de>

On 10/16/19 9:38 PM, Florian Weimer wrote:
> This time, I've got a kernel with debugging information (still
> 5.2.18).  The crash is at offset 0x39f:
> 
>         if (!mem_section[SECTION_NR_TO_ROOT(nr)])
>      384:       48 c1 ea 35             shr    $0x35,%rdx
>      388:       48 8b 14 d7             mov    (%rdi,%rdx,8),%rdx
>      38c:       48 c1 e8 2d             shr    $0x2d,%rax
>      390:       48 85 d2                test   %rdx,%rdx
>      393:       74 0a                   je     39f <__reset_isolation_pfn+0x27f>
>         return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
>      395:       0f b6 c0                movzbl %al,%eax
>      398:       48 c1 e0 04             shl    $0x4,%rax
>      39c:       48 01 c2                add    %rax,%rdx
>         unsigned long map = section->section_mem_map;
>      39f:       48 8b 02                mov    (%rdx),%rax
>                                 clear_pageblock_skip(page);
>      3a2:       4c 89 f2                mov    %r14,%rdx
>      3a5:       41 b8 01 00 00 00       mov    $0x1,%r8d
>      3ab:       31 f6                   xor    %esi,%esi
>      3ad:       b9 03 00 00 00          mov    $0x3,%ecx
>      3b2:       4c 89 f7                mov    %r14,%rdi
> 
> Hmm, -l output is likely more helpful here:
> 
> /home/fw/src/linux/linux/mm/compaction.c:293
>      37a:       a8 10                   test   $0x10,%al
>      37c:       74 bc                   je     33a <__reset_isolation_pfn+0x21a>
> page_to_section():
> /home/fw/src/linux/linux/./include/linux/mm.h:1265
>      37e:       49 8b 16                mov    (%r14),%rdx
>      381:       48 89 d0                mov    %rdx,%rax
> __nr_to_section():
> /home/fw/src/linux/linux/./include/linux/mmzone.h:1218
>      384:       48 c1 ea 35             shr    $0x35,%rdx
>      388:       48 8b 14 d7             mov    (%rdi,%rdx,8),%rdx
> page_to_section():
> /home/fw/src/linux/linux/./include/linux/mm.h:1265
>      38c:       48 c1 e8 2d             shr    $0x2d,%rax
> __nr_to_section():
> /home/fw/src/linux/linux/./include/linux/mmzone.h:1218
>      390:       48 85 d2                test   %rdx,%rdx
>      393:       74 0a                   je     39f <__reset_isolation_pfn+0x27f>
> /home/fw/src/linux/linux/./include/linux/mmzone.h:1220
>      395:       0f b6 c0                movzbl %al,%eax
>      398:       48 c1 e0 04             shl    $0x4,%rax
>      39c:       48 01 c2                add    %rax,%rdx
> __section_mem_map_addr():
> /home/fw/src/linux/linux/./include/linux/mmzone.h:1247
>      39f:       48 8b 02                mov    (%rdx),%rax
> __reset_isolation_pfn():
> /home/fw/src/linux/linux/mm/compaction.c:294
>      3a2:       4c 89 f2                mov    %r14,%rdx
>      3a5:       41 b8 01 00 00 00       mov    $0x1,%r8d
>      3ab:       31 f6                   xor    %esi,%esi
> 
> It's this loop:
> 
>   286         /*
>   287          * Only clear the hint if a sample indicates there is either a
>   288          * free page or an LRU page in the block. One or other condition
>   289          * is necessary for the block to be a migration source/target.
>   290          */
>   291         do {
>   292                 if (pfn_valid_within(pfn)) {
>   293                         if (check_source && PageLRU(page)) {
>   294                                 clear_pageblock_skip(page);

Thanks. Looks like it's indeed here, in the page_to_pfn() embedded in the
clear_pageblock_skip() expansion. We've got a wrong struct page pointer,
so page_to_section() gives us a bogus value, __nr_to_section() returns a
null pointer, and __section_mem_map_addr() then dereferences it.

Commit [1] should hopefully address the reason why we got a wrong page
pointer. You could try cherry-picking it to your stable tree, or wait
until it appears in a (hopefully near) future stable 5.3.y release (5.2
is EOL, so it won't appear there).

Thanks,
Vlastimil

>   295                                 return true;
>   296                         }
>   297 
>   298                         if (check_target && PageBuddy(page)) {
>   299                                 clear_pageblock_skip(page);
>   300                                 return true;
>   301                         }
>   302                 }
>   303 
>   304                 page += (1 << PAGE_ALLOC_COSTLY_ORDER);
>   305                 pfn += (1 << PAGE_ALLOC_COSTLY_ORDER);
>   306         } while (page < end_page);
> 


