From: Vlastimil Babka <vbabka@suse.cz>
To: Florian Weimer <fw@deneb.enyo.de>
Cc: Dave Chinner <david@fromorbit.com>,
linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org, Mel Gorman <mgorman@techsingularity.net>
Subject: Re: [bug, 5.2.16] kswapd/compaction null pointer crash [was Re: xfs_inode not reclaimed/memory leak on 5.2.16]
Date: Mon, 7 Oct 2019 15:28:17 +0200
Message-ID: <2af04718-d5cb-1bb1-a789-be017f2e2df0@suse.cz>
In-Reply-To: <87lfu4i79z.fsf@mid.deneb.enyo.de>

On 10/1/19 9:40 PM, Florian Weimer wrote:
> * Vlastimil Babka:
>
>> On 9/30/19 11:17 PM, Dave Chinner wrote:
>>> On Mon, Sep 30, 2019 at 09:07:53PM +0200, Florian Weimer wrote:
>>>> * Dave Chinner:
>>>>
>>>>> On Mon, Sep 30, 2019 at 09:28:27AM +0200, Florian Weimer wrote:
>>>>>> Simply running “du -hc” on a large directory tree causes du to be
>>>>>> killed because of kernel paging request failure in the XFS code.
>>>>>
>>>>> dmesg output? if the system was still running, then you might be
>>>>> able to pull the trace from syslog. But we can't do much without
>>>>> knowing what the actual failure was....
>>>>
>>>> Huh. I actually have something in syslog:
>>>>
>>>> [ 4001.238411] BUG: kernel NULL pointer dereference, address:
>>>> 0000000000000000
>>>> [ 4001.238415] #PF: supervisor read access in kernel mode
>>>> [ 4001.238417] #PF: error_code(0x0000) - not-present page
>>>> [ 4001.238418] PGD 0 P4D 0
>>>> [ 4001.238420] Oops: 0000 [#1] SMP PTI
>>>> [ 4001.238423] CPU: 3 PID: 143 Comm: kswapd0 Tainted: G I 5.2.16fw+
>>>> #1
>>>> [ 4001.238424] Hardware name: System manufacturer System Product
>>>> Name/P6X58D-E, BIOS 0701 05/10/2011
>>>> [ 4001.238430] RIP: 0010:__reset_isolation_pfn+0x27f/0x3c0
>>>
>>> That's memory compaction code it's crashed in.
>>>
>>>> [ 4001.238432] Code: 44 c6 48 8b 00 a8 10 74 bc 49 8b 16 48 89 d0
>>>> 48 c1 ea 35 48 8b 14 d7 48 c1 e8 2d 48 85 d2 74 0a 0f b6 c0 48 c1
>>>> e0 04 48 01 c2 <48> 8b 02 4c 89 f2 41 b8 01 00 00 00 31 f6 b9 03 00
>>>> 00 00 4c 89 f7
>>
>> Tried to decode it, but couldn't match it to source code, my version of
>> compiled code is too different. Would it be possible to either send
>> mm/compaction.o from the matching build, or output of 'objdump -d -l'
>> for the __reset_isolation_pfn function?
>
> See below. I don't have debuginfo for this build, and the binary does
> not reproduce for some reason. Due to the heavy inlining, it might be
> quite hard to figure out what's going on.
Thanks, but I'm still not able to "decompile" that in my head.
> I've switched to kernel builds with debuginfo from now on. I'm
> surprised that it's not the default.
Let's see if you can reproduce it with that.

However, I've noticed at least something weird:
>  37e:   49 8b 16       mov    (%r14),%rdx
>  381:   48 89 d0       mov    %rdx,%rax
>  384:   48 c1 ea 35    shr    $0x35,%rdx
>  388:   48 8b 14 d7    mov    (%rdi,%rdx,8),%rdx
>  38c:   48 c1 e8 2d    shr    $0x2d,%rax
>  390:   48 85 d2       test   %rdx,%rdx
>  393:   74 0a          je     39f <__reset_isolation_pfn+0x27f>
IIUC, this will jump to 39f when rdx is zero.
>  395:   0f b6 c0       movzbl %al,%eax
>  398:   48 c1 e0 04    shl    $0x4,%rax
>  39c:   48 01 c2       add    %rax,%rdx
>  39f:   48 8b 02       mov    (%rdx),%rax
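And this is where we crash because rdx is zero: 39f is
__reset_isolation_pfn+0x27f, the RIP from the oops, and <48> 8b 02 is the
marked instruction in its Code: line. So the test+branch might have sent us
directly here to crash. Sounds like an inverted condition somewhere? Or
possibly a result of optimizations.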
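
To make the suspicious path easier to see, here is a rough C model of what
the instructions from 37e to 39c appear to compute. It is only a reading of
the disassembly, not the kernel source: the names struct entry, lookup and
table are hypothetical, the 16-byte entry size is inferred from the
shl $0x4, and the shape (a two-level table lookup) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte entry; the "shl $0x4" implies a 16-byte stride. */
struct entry {
    uint64_t word0; /* the mov at 39f loads the first 8 bytes */
    uint64_t word1;
};

/*
 * Model of offsets 37e..39c. Returns the address that the mov at 39f
 * would load from. "table" stands in for the array %rdi points to and
 * "flags" for the 64-bit word loaded from (%r14).
 */
static struct entry *lookup(struct entry *const *table, uint64_t flags)
{
    assert(table != NULL);

    uint64_t root = flags >> 53;         /* shr $0x35,%rdx */
    uint64_t idx = (flags >> 45) & 0xff; /* shr $0x2d,%rax; movzbl %al,%eax */
    struct entry *row = table[root];     /* mov (%rdi,%rdx,8),%rdx */

    if (row == NULL)  /* test %rdx,%rdx; je 39f */
        return NULL;  /* rdx is still 0 when 39f dereferences it */

    return row + idx; /* shl $0x4,%rax; add %rax,%rdx */
}
```

In this model the NULL check only skips the index arithmetic. The caller
(the mov at 39f) dereferences the result unconditionally, so a zero first-level
slot leads straight to a read from address 0, which matches the oops.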