From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ig0-f176.google.com (mail-ig0-f176.google.com [209.85.213.176]) by kanga.kvack.org (Postfix) with ESMTP id DF4256B0038 for ; Sun, 29 Mar 2015 23:10:14 -0400 (EDT) Received: by igbud6 with SMTP id ud6so62488511igb.1 for ; Sun, 29 Mar 2015 20:10:14 -0700 (PDT) Received: from szxga03-in.huawei.com (szxga03-in.huawei.com. [119.145.14.66]) by mx.google.com with ESMTPS id b16si7853547icg.101.2015.03.29.20.10.11 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Sun, 29 Mar 2015 20:10:14 -0700 (PDT) Message-ID: <5518BDC7.2090407@huawei.com> Date: Mon, 30 Mar 2015 11:06:47 +0800 From: Xishi Qiu MIME-Version: 1.0 Subject: Re: arm/ksm: Unable to handle kernel paging request in get_ksm_page() and ksm_scan_thread() References: <55140869.7060507@huawei.com> <55161D0E.9070604@huawei.com> <5518AAE5.8060308@huawei.com> In-Reply-To: <5518AAE5.8060308@huawei.com> Content-Type: text/plain; charset="ISO-8859-1" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Hugh Dickins Cc: Andrew Morton , Peter Zijlstra , neilb@suse.de, heiko.carstens@de.ibm.com, dhowells@redhat.com, izik.eidus@ravellosystems.com, aarcange@redhat.com, chrisw@sous-sol.org, Linux MM , LKML , weiyuan.wei@huawei.com On 2015/3/30 9:46, Xishi Qiu wrote: > On 2015/3/30 8:43, Hugh Dickins wrote: > >> On Sat, 28 Mar 2015, Xishi Qiu wrote: >>> On 2015/3/26 21:23, Xishi Qiu wrote: >>> >>>> Here are two panic logs from smart phone test, and the kernel version is v3.10. >>>> >>>> log1 is "Unable to handle kernel paging request at virtual address c0704da020", it should be ffffffc0704da020, right? >> >> That one was an oops at get_ksm_page+0x34/0x150: I'm pretty sure that >> comes from the "kpfn = ACCESS_ONCE(stable_node->kpfn)" line, that the >> stable_node pointer (in x21 or x22) has upper bits cleared; which >> suggests corruption of the rmap_item supposed to point to it. >> >> get_ksm_page() is tricky with ACCESS_ONCEs against page migration, >> and the structures tricky with unions; but pointers overlay pointers >> in those unions, I don't see any way we might pick up an address with >> the upper 24 or 32 bits cleared due to that. >> >>>> and log2 is "Unable to handle kernel paging request at virtual address 1e000796", it should be ffffffc01e000796, right? >> >> And this one was an oops at ksm_scan_thread+0x4ac/0xce0; as is the oops >> you posted below. Which contains lots of hex numbers, but very little >> info I can work from. >> >> Please make a CONFIG_DEBUG_INFO=y build of one of the kernels you're >> hitting this with, then use the disassembler (objdump -ld perhaps) to >> identify precisely which line of ksm.c that is oopsing on: the compiler >> will have inlined more interesting functions into ksm_scan_thread, so >> I haven't a clue where it's actually oopsing. >> >> Maybe we'll find that it's also oopsing on a kernel virtual address >> from an rmap_item, maybe we won't. >> >> And I don't read arm64 assembler at all, so I shall be rather limited >> in what I can tell you, I'm afraid. >> >>>> >>>> I cann't repeat the panic by test, so could anyone tell me this is the >>>> bug of ksm or other reason? >> >> I've not heard of any problem like this with KSM on other architectures. >> Maybe it is making some assumption which is invalid on arm64, but I'd >> have thought we'd have heard about that before now. My guess is that >> something in your kernel is stamping on KSM's structures. >> >> A relevant experiment (after identifying the oops line in your current >> kernel) might be to switch from CONFIG_SLAB=y to CONFIG_SLUB=y or vice >> versa. I doubt SLAB or SLUB is to blame, but changing allocator might >> shake things up in a way that either hides the problem, or shifts it >> elsewhere. >> >> Hugh >> > > Hi Hugh, > > Thanks for your reply. There are 3 cases as follows, at first I think maybe > something causes the oops, but all of the cases are relevant to "rmap_item", > so I have no idea. > > 1. ksm_scan_thread+0xa88/0xce0 -> unstable_tree_search_insert() -> tree_rmap_item = rb_entry(*new, struct rmap_item, node); > > 2. ksm_scan_thread+0x4ac/0xce0 -> get_next_rmap_item() -> if ((rmap_item->address & PAGE_MASK) == addr) > > 3. get_ksm_page+0x34/0x150 -> get_ksm_page() -> kpfn = ACCESS_ONCE(stable_node->kpfn); > > Thanks, > Xishi Qiu > The kernel is v3.10.59 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > > . > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org