From: Newsmails
Reply-To: newsmails@netcourrier.com
Subject: Re: [Bug 210031] New: unable to handle page fault for address - EIP: khugepaged
To: Vlastimil Babka, Andrew Morton
Cc: bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org, Song Liu,
 "Kirill A. Shutemov", Johannes Weiner, Oleg Nesterov, Pavel Machek
Date: Mon, 9 Nov 2020 12:47:26 +0100
Message-ID: <505ed020-6dba-46d3-9b49-4d97900c07dc@netcourrier.com>
In-Reply-To: <38cfc9be-e5ec-8ed2-87ce-4e877d5bf952@suse.cz>
References: <20201103161753.27a58af99ba35b94660e8fa3@linux-foundation.org>
 <86e33bed-eb81-3966-b195-431742f86746@suse.cz>
 <599fa8aa-9811-bee7-3aa8-ca522b362b77@netcourrier.com>
 <38cfc9be-e5ec-8ed2-87ce-4e877d5bf952@suse.cz>

On 11/4/20 1:45 PM, Vlastimil Babka wrote:
> On 11/4/20 1:23 PM, Newsmails wrote:
>>
>>
>> On 11/4/20 1:14 PM, Vlastimil Babka wrote:
>>> On 11/4/20 1:17 AM, Andrew Morton wrote:
>>>> (switched to email.  Please respond via emailed reply-to-all, not
>>>> via the
>>>> bugzilla web interface).
>>>>
>>>>
>>>> On Tue, 03 Nov 2020 20:00:58 +0000
>>>> bugzilla-daemon@bugzilla.kernel.org wrote:
>>>>
>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=210031
>>>>>
>>>>>             Bug ID: 210031
>>>>>            Summary: unable to handle page fault for address - EIP:
>>>>>                     khugepaged
>>>>>            Product: Memory Management
>>>>>            Version: 2.5
>>>>>     Kernel Version: 5.9.1
>>>>>           Hardware: All
>>>>>                 OS: Linux
>>>>>               Tree: Mainline
>>>>>             Status: NEW
>>>>>           Severity: normal
>>>>>           Priority: P1
>>>>>          Component: Other
>>>>>           Assignee: akpm@linux-foundation.org
>>>>>           Reporter: newsmails@netcourrier.com
>>>>>         Regression: No
>>>>
>>>> Thanks.  That's a strange looking trace.  I'll optimistically cc some
>>>> people who have been working in that area lately.
>>>>
>>>> What caused this kernel to be tainted?
>>>>
>>>>
>>>>> laptop Skylake i915 Distribution : slackware 14.2 32 bits
>>>>>
>>>>> Oct 23 17:38:22 linuxp kernel: [141330.499234] BUG: unable to
>>>>> handle page fault
>>>>> for address: 021d202d
>>>>> Oct 23 17:38:22 linuxp kernel: [141330.499245] #PF: supervisor
>>>>> read access in
>>>>> kernel mode
>>>>> Oct 23 17:38:22 linuxp kernel: [141330.499250] #PF:
>>>>> error_code(0x0000) -
>>>>> not-present page
>>>>> Oct 23 17:38:22 linuxp kernel: [141330.499265] Oops: 0000 [#2] SMP
>>>>> PTI
>>>
>>> #2 means this is not the first oops. Do you have the very first?
>>>
>> Yes sorry.
>> It was a resume too as you will see with the time.
>> For oct 23 17:38 it is a resume too i think : i think that I hibernated
>> and i forgot to look at something so i resumed.
>
> It's always 021d202d (3 times) and always where a vma might be
> accessed (a /proc file, khugepaged(), acct_collect()) so I would
> assume a struct vma was corrupted in the hibernate/resume process.
> Could be also firmware related AFAIK and there's I taint flag which
> means some buggy firmware workaround is in effect.

It seems that you are right concerning the buggy firmware. Here are the
lines at the start of syslog:

[Firmware Bug]: TSC_DEADLINE disabled due to Errata; please update
microcode to version: 0xb2 (or later)
MDS CPU bug present and SMT on, data leak possible.
TAA CPU bug present and SMT on, data leak possible.

>
>> Oct 23 13:22:10 linuxp dhcpcd[18199]: dhcpcd not running
>> Oct 23 15:55:49 linuxp dhcpcd[27045]: dhcpcd not running
>> Oct 23 15:55:49 linuxp dhcpcd[27053]: dhcpcd not running
>> Oct 23 15:55:50 linuxp dhcpcd[27061]: dhcpcd not running
>> Oct 23 15:55:50 linuxp dhcpcd[27067]: dhcpcd not running
>> Oct 23 17:31:13 linuxp kernel: [140897.356150] iwlwifi 0000:03:00.0:
>> RF_KILL bit toggled to enable radio.
>> Oct 23 17:31:16 linuxp kernel: [140900.724013] Bluetooth: hci0:
>> unexpected event for opcode 0xfc2f
>> Oct 23 17:31:39 linuxp kernel: [140928.245135] BUG: unable to handle
>> page fault for address: 021d2001
>> Oct 23 17:31:39 linuxp kernel: [140928.245147] #PF: supervisor read
>> access in kernel mode
>> Oct 23 17:31:39 linuxp kernel: [140928.245152] #PF: error_code(0x0000) -
>> not-present page
>> Oct 23 17:31:39 linuxp kernel: [140928.245169] Oops: 0000 [#1] SMP PTI
>> Oct 23 17:31:39 linuxp kernel: [140928.245179] CPU: 1 PID: 2302 Comm:
>> Breakpad Server Tainted: G          I       5.9.1 #1
>> Oct 23 17:31:39 linuxp kernel: [140928.245184] Hardware name:
>> Notebook                         W65_W67RZ/W65_W67RZ, BIOS 1.05.06
>> 02/22/2016
>> Oct 23 17:31:39 linuxp kernel: [140928.245197] EIP: m_next+0x1c/0x44
>> Oct 23 17:31:39 linuxp kernel: [140928.245205] Code: 24 08 d4 e6 c1 e8
>> 1a 77 e2 ff eb d6 cc cc 3e 8d 74 26 00 55 89 e5 57 56 53 8b 40 44 8b 58
>> 0c 39 da 74 24 8b 42 08 85 c0 74 0e <8b> 30 31 ff 89 31 89 79 04 5b 5e
>> 5f 5d c3 be ff ff ff ff 31 ff 85
>> Oct 23 17:31:39 linuxp kernel: [140928.245212] EAX: 021d2001 EBX:
>> 00000000 ECX: eeb9e56c EDX: eca0f000
>> Oct 23 17:31:39 linuxp kernel: [140928.245216] ESI: 00000000 EDI:
>> c128cad0 EBP: e59bff0c ESP: e59bff00
>> Oct 23 17:31:39 linuxp kernel: [140928.245221] DS: 007b ES: 007b FS:
>> 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010202
>> Oct 23 17:31:39 linuxp kernel: [140928.245224] CR0: 80050033 CR2:
>> 021d2001 CR3: 2606c000 CR4: 003506f0
>> Oct 23 17:31:39 linuxp kernel: [140928.245228] DR0: 00000000 DR1:
>> 00000000 DR2: 00000000 DR3: 00000000
>> Oct 23 17:31:39 linuxp kernel: [140928.245231] DR6: fffe0ff0 DR7:
>> 00000400
>> Oct 23 17:31:39 linuxp kernel: [140928.245233] Call Trace:
>> Oct 23 17:31:39 linuxp kernel: [140928.245242]  ?
>> quota_send_warning+0x220/0x220
>> Oct 23 17:31:39 linuxp kernel: [140928.245248] seq_read+0x2bc/0x3e1
>> Oct 23 17:31:39 linuxp kernel: [140928.245254]  ?
>> quota_send_warning+0x220/0x220
>> Oct 23 17:31:39 linuxp kernel: [140928.245260]  ?
>> seq_open_private+0x17/0x17
>> Oct 23 17:31:39 linuxp kernel: [140928.245266] vfs_read+0x85/0x17f
>> Oct 23 17:31:39 linuxp kernel: [140928.245272]  ? mutex_lock+0x10/0x33
>> Oct 23 17:31:39 linuxp kernel: [140928.245277] ksys_read+0x51/0xb6
>> Oct 23 17:31:39 linuxp kernel: [140928.245283] __ia32_sys_read+0x15/0x17
>> Oct 23 17:31:39 linuxp kernel: [140928.245289]
>> do_int80_syscall_32+0x2c/0x39
>> Oct 23 17:31:39 linuxp kernel: [140928.245295] entry_INT80_32+0xf7/0xf7
>> Oct 23 17:31:39 linuxp kernel: [140928.245299] EIP: 0xafc787c8
>> Oct 23 17:31:39 linuxp kernel: [140928.245304] Code: 00 00 c6 47 04 01
>> 8b 47 08 85 c0 75 b6 80 7f 04 00 75 5c 8b 37 ba 00 04 00 00 8d 4c 07 0c
>> 29 c2 b8 03 00 00 00 53 89 f3 cd 80 <5b> 89 c6 3d 01 f0 ff ff 73 32 85
>> f6 78 37 74 c8 01 77 08 8b 47 08
>> Oct 23 17:31:39 linuxp kernel: [140928.245308] EAX: ffffffda EBX:
>> 00000040 ECX: a3c194fc EDX: 000003d8
>> Oct 23 17:31:39 linuxp kernel: [140928.245311] ESI: 00000040 EDI:
>> a3c194c8 EBP: 995fece8 ESP: 995feccc
>> Oct 23 17:31:39 linuxp kernel: [140928.245316] DS: 007b ES: 007b FS:
>> 0000 GS: 0033 SS: 007b EFLAGS: 00000216
>> Oct 23 17:31:39 linuxp kernel: [140928.245320] Modules linked in:
>> appletalk psnap llc ipv6 fuse uvcvideo videobuf2_vmalloc
>> videobuf2_memops btusb videobuf2_v4l2 hid_generic btrtl btbcm
>> videobuf2_common btintel videodev bluetooth mc usbhid hid ecdh_generic
>> ecc rtsx_pci_sdmmc joydev mmc_core snd_hda_codec_hdmi
>> snd_hda_codec_realtek snd_hda_codec_generic i2c_dev ledtrig_audio
>> coretemp i915 hwmon iwlmvm r8169 i2c_algo_bit x86_pkg_temp_thermal
>> mac80211 drm_kms_helper intel_powerclamp rtsx_pci realtek drm kvm_intel
>> mdio_devres mfd_core libphy kvm intel_gtt irqbypass crc32_pclmul iwlwifi
>> snd_hda_intel agpgart psmouse evdev crc32c_intel serio_raw fb_sys_fops
>> cfg80211 snd_intel_dspcfg snd_hda_codec rfkill snd_hda_core syscopyarea
>> wmi thermal snd_hwdep battery snd_pcm sysfillrect sysimgblt snd_timer
>> xhci_pci i2c_i801 button xhci_hcd snd i2c_smbus intel_pch_thermal mei_me
>> soundcore video mei i2c_core acpi_pad ac loop
>> Oct 23 17:31:39 linuxp kernel: [140928.245396] CR2: 00000000021d2001
>> Oct 23 17:31:39 linuxp kernel: [140928.245402] ---[ end trace
>> c79bfd2669dd9a26 ]---
>> Oct 23 17:31:39 linuxp kernel: [140928.245408] EIP: m_next+0x1c/0x44
>> Oct 23 17:31:39 linuxp kernel: [140928.245412] Code: 24 08 d4 e6 c1 e8
>> 1a 77 e2 ff eb d6 cc cc 3e 8d 74 26 00 55 89 e5 57 56 53 8b 40 44 8b 58
>> 0c 39 da 74 24 8b 42 08 85 c0 74 0e <8b> 30 31 ff 89 31 89 79 04 5b 5e
>> 5f 5d c3 be ff ff ff ff 31 ff 85
>> Oct 23 17:31:39 linuxp kernel: [140928.245417] EAX: 021d2001 EBX:
>> 00000000 ECX: eeb9e56c EDX: eca0f000
>> Oct 23 17:31:39 linuxp kernel: [140928.245420] ESI: 00000000 EDI:
>> c128cad0 EBP: e59bff0c ESP: e59bff00
>> Oct 23 17:31:39 linuxp kernel: [140928.245424] DS: 007b ES: 007b FS:
>> 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010202
>> Oct 23 17:31:39 linuxp kernel: [140928.245427] CR0: 80050033 CR2:
>> 021d2001 CR3: 2606c000 CR4: 003506f0
>> Oct 23 17:31:39 linuxp kernel: [140928.245431] DR0: 00000000 DR1:
>> 00000000 DR2: 00000000 DR3: 00000000
>> Oct 23 17:31:39 linuxp kernel: [140928.245434] DR6: fffe0ff0 DR7:
>> 00000400
>> Oct 23 17:32:27 linuxp dhcpcd[27294]: dhcpcd not running
>> Oct 23 17:32:30 linuxp dhcpcd[27305]: dhcpcd not running
>> Oct 23 17:32:33 linuxp dhcpcd[27314]: dhcpcd not running
>> Oct 23 17:32:35 linuxp dhcpcd[27325]: dhcpcd not running
>> Oct 23 17:32:36 linuxp dhcpcd[27331]: dhcpcd not running
>> Oct 23 17:38:22 linuxp kernel: [141330.499234] BUG: unable to handle
>> page fault for address: 021d202d
>> Oct 23 17:38:22 linuxp kernel: [141330.499245] #PF: supervisor read
>> access in kernel mode
>> Oct 23 17:38:22 linuxp kernel: [141330.499250] #PF: error_code(0x0000) -
>> not-present page
>> Oct 23 17:38:22 linuxp kernel: [141330.499265] Oops: 0000 [#2] SMP PTI
>> Oct 23 17:38:22 linuxp kernel: [141330.499274] CPU: 0 PID: 37 Comm:
>> khugepaged Tainted: G      D   I       5.9.1 #1
>> Oct 23 17:38:22 linuxp kernel: [141330.499278] Hardware name:
>> Notebook                         W65_W67RZ/W65_W67RZ, BIOS 1.05.06
>> 02/22/2016
>> Oct 23 17:38:22 linuxp kernel: [141330.499289] EIP:
>> khugepaged+0x599/0x2226
>>
>>
>>
>
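
PS: since the taint flags came up in this thread, here is a rough way to
read the letters after "Tainted:" in the oops lines quoted above. It is
only a sketch: the letter table is a partial copy of the meanings listed
in the kernel's tainted-kernels documentation, and decode_taint and
TAINT_LETTERS are hypothetical names made up for this example, not an
existing tool.

```python
import re

# Partial table of taint letters, following the kernel's tainted-kernels
# documentation. Letters missing here are reported as unknown.
TAINT_LETTERS = {
    "G": "no proprietary module loaded (taint bit 0 is clear)",
    "P": "proprietary module was loaded",
    "F": "module was force loaded",
    "D": "kernel died recently (there was an earlier oops or BUG)",
    "W": "kernel issued a warning",
    "I": "workaround for a bug in platform firmware was applied",
    "O": "out-of-tree module was loaded",
    "E": "unsigned module was loaded",
}


def decode_taint(oops_line):
    """Return (letter, meaning) pairs for the 'Tainted:' field of an oops line."""
    # The flags sit between "Tainted:" and the kernel version (e.g. 5.9.1).
    match = re.search(r"Tainted:\s*(.*?)\s+\d+\.\d+", oops_line)
    if match is None:
        return []
    letters = [ch for ch in match.group(1) if ch.isalpha()]
    return [(ch, TAINT_LETTERS.get(ch, "unknown (not in this table)"))
            for ch in letters]


if __name__ == "__main__":
    line = "CPU: 0 PID: 37 Comm: khugepaged Tainted: G      D   I       5.9.1 #1"
    for letter, meaning in decode_taint(line):
        print(letter, "=", meaning)
```

For the second oops this lists G, D and I. The D is expected because of
the first oops at 17:31:39, and the I is the firmware workaround flag
mentioned above.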