linux-mm.kvack.org archive mirror
From: Xishi Qiu <qiuxishi@huawei.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: "'Kirill A . Shutemov'" <kirill.shutemov@linux.intel.com>,
	zhong jiang <zhongjiang@huawei.com>,
	Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Tejun Heo <tj@kernel.org>, Michal Hocko <mhocko@kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Michal Hocko <mhocko@suse.com>, Minchan Kim <minchan@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	aarcange@redhat.com, sumeet.keswani@hpe.com,
	Rik van Riel <riel@redhat.com>, Linux MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: mm, something wrong in page_lock_anon_vma_read()?
Date: Tue, 18 Jul 2017 18:59:19 +0800
Message-ID: <596DEA07.5000009@huawei.com>
In-Reply-To: <e8dacd42-e5c5-998b-5f9a-a34dbfb986f1@suse.cz>

On 2017/6/8 21:59, Vlastimil Babka wrote:

> On 06/08/2017 03:44 PM, Xishi Qiu wrote:
>> On 2017/5/23 17:33, Vlastimil Babka wrote:
>>
>>> On 05/23/2017 11:21 AM, zhong jiang wrote:
>>>> On 2017/5/23 0:51, Vlastimil Babka wrote:
>>>>> On 05/20/2017 05:01 AM, zhong jiang wrote:
>>>>>> On 2017/5/20 10:40, Hugh Dickins wrote:
>>>>>>> On Sat, 20 May 2017, Xishi Qiu wrote:
>>>>>>>> Here is a bug report from Red Hat: https://bugzilla.redhat.com/show_bug.cgi?id=1305620
>>>>>>>> I hit the bug too. However, it is hard to reproduce, and
>>>>>>>> 624483f3ea82598 ("mm: rmap: fix use-after-free in __put_anon_vma") does not help.
>>>>>>>>
>>>>>>>> From the vmcore, it seems that the page is still mapped (_mapcount=0 and _count=2),
>>>>>>>> and the value of mapping is a valid address (mapping = 0xffff8801b3e2a101),
>>>>>>>> but anon_vma has been corrupted.
>>>>>>>>
>>>>>>>> Any ideas?
>>>>>>> Sorry, no.  I assume that _mapcount has been misaccounted, for example
>>>>>>> a pte mapped in on top of another pte; but cannot begin to tell you where
>>>>>>> in Red Hat's kernel-3.10.0-229.4.2.el7 that might happen.
>>>>>>>
>>>>>>> Hugh
>>>>>>>
>>>>>>> .
>>>>>>>
>>>>>> Hi, Hugh
>>>>>>
>>>>>> I found the following message in dmesg:
>>>>>>
>>>>>> [26068.316592] BUG: Bad rss-counter state mm:ffff8800a7de2d80 idx:1 val:1
>>>>>>
>>>>>> I can prove that _mapcount is misaccounted: when the task has exited, the rmap
>>>>>> still exists.
>>>>> Check if the kernel in question contains this commit: ad33bb04b2a6 ("mm:
>>>>> thp: fix SMP race condition between THP page fault and MADV_DONTNEED")
>>>>   Hi, Vlastimil
>>>>
>>>>   Our kernel is missing that patch.
>>>
>>> Try applying it then; there's a good chance the error and crash will go
>>> away, even if your workload doesn't actually run any madvise(MADV_DONTNEED).
>>>
>>
>> Hi Vlastimil,
>>
>> I found that this error was reported by Kirill in the following thread, right?
>> https://patchwork.kernel.org/patch/7550401/
> 
> That was reported by Minchan.
> 
>> The call trace is much the same as ours.
> 
> In that thread, the error seems to have just disappeared in the end.
> 
> So, did you apply the patch I suggested? Did it help?
> 

Hi,

Unfortunately, this patch ("mm: thp: fix SMP race condition between
THP page fault and MADV_DONTNEED") didn't help; I got the panic again.

I also found this error in the log before the panic:

[468229.996610] BUG: Bad rss-counter state mm:ffff8806aebc2580 idx:1 val:1

[468451.702807] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[468451.702861] IP: [<ffffffff810ac089>] down_read_trylock+0x9/0x30
[468451.702900] PGD 12445e067 PUD 11acaa067 PMD 0 
[468451.702931] Oops: 0000 [#1] SMP 
[468451.702953] kbox catch die event.
[468451.703003] collected_len = 1047419, LOG_BUF_LEN_LOCAL = 1048576
[468451.703003] kbox: notify die begin
[468451.703003] kbox: no notify die func register. no need to notify
[468451.703003] do nothing after die!
[468451.703003] Modules linked in: ipt_REJECT macvlan ip_set_hash_ipport vport_vxlan(OVE) xt_statistic xt_physdev xt_nat xt_recent xt_mark xt_comment veth ct_limit(OVE) bum_extract(OVE) policy(OVE) bum(OVE) ip_set nfnetlink openvswitch(OVE) nf_defrag_ipv6 gre ext3 jbd ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack bridge stp llc kboxdriver(O) kbox(O) dm_thin_pool dm_persistent_data crc32_pclmul dm_bio_prison dm_bufio ghash_clmulni_intel libcrc32c aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ppdev sg parport_pc cirrus virtio_console parport syscopyarea sysfillrect sysimgblt ttm drm_kms_helper drm i2c_piix4 i2c_core pcspkr ip_tables ext4 jbd2 mbcache sr_mod cdrom ata_generic pata_acpi
[468451.703003]  virtio_net virtio_blk crct10dif_pclmul crct10dif_common ata_piix virtio_pci libata serio_raw virtio_ring crc32c_intel virtio dm_mirror dm_region_hash dm_log dm_mod
[468451.703003] CPU: 6 PID: 21965 Comm: docker-containe Tainted: G           OE  ----V-------   3.10.0-327.53.58.73.x86_64 #1
[468451.703003] Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.8.1-0-g4adadbd-20170107_142945-9_64_246_229 04/01/2014
[468451.703003] task: ffff880692402e00 ti: ffff88018209c000 task.ti: ffff88018209c000
[468451.703003] RIP: 0010:[<ffffffff810ac089>]  [<ffffffff810ac089>] down_read_trylock+0x9/0x30
[468451.703003] RSP: 0018:ffff88018209f8f8  EFLAGS: 00010202
[468451.703003] RAX: 0000000000000000 RBX: ffff880720cd7740 RCX: ffff880720cd7740
[468451.703003] RDX: 0000000000000001 RSI: 0000000000000301 RDI: 0000000000000008
[468451.703003] RBP: ffff88018209f8f8 R08: 00000000c0e0f310 R09: ffff880720cd7740
[468451.703003] R10: ffff88083efd8000 R11: 0000000000000000 R12: ffff880720cd7741
[468451.703003] R13: ffffea000824d100 R14: 0000000000000008 R15: 0000000000000000
[468451.703003] FS:  00007fc0e2a85700(0000) GS:ffff88083ed80000(0000) knlGS:0000000000000000
[468451.703003] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[468451.703003] CR2: 0000000000000008 CR3: 0000000661906000 CR4: 00000000001407e0
[468451.703003] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[468451.703003] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[468451.703003] Stack:
[468451.703003]  ffff88018209f928 ffffffff811a7eb5 ffffea000824d100 ffff88018209fa90
[468451.703003]  ffffea00082f9680 0000000000000301 ffff88018209f978 ffffffff811a82e1
[468451.703003]  ffffea000824d100 ffff88018209fa00 0000000000000001 ffffea000824d100
[468451.703003] Call Trace:
[468451.703003]  [<ffffffff811a7eb5>] page_lock_anon_vma_read+0x55/0x110
[468451.703003]  [<ffffffff811a82e1>] try_to_unmap_anon+0x21/0x120
[468451.703003]  [<ffffffff811a842d>] try_to_unmap+0x4d/0x60
[468451.712006]  [<ffffffff811cc749>] migrate_pages+0x439/0x790
[468451.712006]  [<ffffffff81193280>] ? __reset_isolation_suitable+0xe0/0xe0
[468451.712006]  [<ffffffff811941f9>] compact_zone+0x299/0x400
[468451.712006]  [<ffffffff81059aff>] ? kvm_clock_get_cycles+0x1f/0x30
[468451.712006]  [<ffffffff811943fc>] compact_zone_order+0x9c/0xf0
[468451.712006]  [<ffffffff811947b1>] try_to_compact_pages+0x121/0x1a0
[468451.712006]  [<ffffffff8163ace6>] __alloc_pages_direct_compact+0xac/0x196
[468451.712006]  [<ffffffff811783e2>] __alloc_pages_nodemask+0xbc2/0xca0
[468451.712006]  [<ffffffff811bcb7a>] alloc_pages_vma+0x9a/0x150
[468451.712006]  [<ffffffff811d1573>] do_huge_pmd_anonymous_page+0x123/0x510
[468451.712006]  [<ffffffff8119bc58>] handle_mm_fault+0x1a8/0xf50
[468451.712006]  [<ffffffff8164b4d6>] __do_page_fault+0x166/0x470
[468451.712006]  [<ffffffff8164b8a3>] trace_do_page_fault+0x43/0x110
[468451.712006]  [<ffffffff8164af79>] do_async_page_fault+0x29/0xe0
[468451.712006]  [<ffffffff81647a38>] async_page_fault+0x28/0x30
[468451.712006] Code: 00 00 00 ba 01 00 00 00 48 89 de e8 12 fe ff ff eb ce 48 c7 c0 f2 ff ff ff eb c5 e8 42 ff fc ff 66 90 0f 1f 44 00 00 55 48 89 e5 <48> 8b 07 48 89 c2 48 83 c2 01 7e 07 f0 48 0f b1 17 75 f0 48 f7 
[468451.712006] RIP  [<ffffffff810ac089>] down_read_trylock+0x9/0x30
[468451.738667]  RSP <ffff88018209f8f8>
[468451.738667] CR2: 0000000000000008
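
A possible reading of the registers in the oops above, offered as a sketch
rather than a confirmed diagnosis: R12 (ffff880720cd7741) looks like
page->mapping with the anonymous-page bit set, RBX (ffff880720cd7740) is that
value with the bit cleared, and the faulting address 0x8 is what
&root->rwsem would evaluate to if anon_vma->root were NULL. The user-space C
below only reproduces that arithmetic. The struct layout (root first, rwsem
at offset 8) and the names mock_rwsem, mock_anon_vma, mapping_is_anon,
mapping_to_anon_vma and root_rwsem_addr are assumptions made for
illustration; the actual layout in the 3.10.0-327 kernel from the report has
not been checked here.

```c
#include <stddef.h>
#include <stdint.h>

/* Low bit of page->mapping marks an anonymous page in 3.10-era kernels. */
#define PAGE_MAPPING_ANON 0x1ULL

/* Hypothetical stand-in for struct rw_semaphore; only its offset matters. */
struct mock_rwsem {
	int64_t count;
};

/*
 * Assumed, simplified layout of struct anon_vma: the root pointer comes
 * first (8 bytes on x86_64) and the rwsem follows it directly.
 */
struct mock_anon_vma {
	uint64_t root;            /* would be a struct anon_vma * in the kernel */
	struct mock_rwsem rwsem;
};

/* Returns nonzero if the mapping value has the anonymous-page bit set. */
int mapping_is_anon(uint64_t mapping)
{
	return (mapping & PAGE_MAPPING_ANON) != 0;
}

/* Strips the anonymous-page bit to recover the anon_vma address. */
uint64_t mapping_to_anon_vma(uint64_t mapping)
{
	return mapping - PAGE_MAPPING_ANON;
}

/* Address that &root->rwsem evaluates to for a given root pointer value. */
uint64_t root_rwsem_addr(uint64_t root)
{
	return root + offsetof(struct mock_anon_vma, rwsem);
}
```

With the values from the oops, mapping_to_anon_vma(0xffff880720cd7741) gives
0xffff880720cd7740 (RBX) and root_rwsem_addr(0) gives 0x8 (RDI and CR2). If
the layout assumption holds, that would be consistent with page->mapping
pointing at an anon_vma whose root field was already zeroed, for example
because the anon_vma had been freed or reused.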




