Subject: Re: [Bug 210031] New: unable to handle page fault for address - EIP: khugepaged
From: Vlastimil Babka
To: newsmails@netcourrier.com, Andrew Morton
Cc: bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org, Song Liu, "Kirill A. Shutemov", Johannes Weiner, Oleg Nesterov, Pavel Machek
Date: Wed, 4 Nov 2020 13:45:35 +0100
Message-ID: <38cfc9be-e5ec-8ed2-87ce-4e877d5bf952@suse.cz>
In-Reply-To: <599fa8aa-9811-bee7-3aa8-ca522b362b77@netcourrier.com>
References: <20201103161753.27a58af99ba35b94660e8fa3@linux-foundation.org> <86e33bed-eb81-3966-b195-431742f86746@suse.cz> <599fa8aa-9811-bee7-3aa8-ca522b362b77@netcourrier.com>

On 11/4/20 1:23 PM, Newsmails wrote:
>
>
> On 11/4/20 1:14 PM, Vlastimil Babka wrote:
>> On 11/4/20 1:17 AM, Andrew Morton wrote:
>>> (switched to email.  Please respond via emailed reply-to-all, not via the
>>> bugzilla web interface).
>>>
>>>
>>> On Tue, 03 Nov 2020 20:00:58 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:
>>>
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=210031
>>>>
>>>>             Bug ID: 210031
>>>>            Summary: unable to handle page fault for address - EIP:
>>>>                     khugepaged
>>>>            Product: Memory Management
>>>>            Version: 2.5
>>>>     Kernel Version: 5.9.1
>>>>           Hardware: All
>>>>                 OS: Linux
>>>>               Tree: Mainline
>>>>             Status: NEW
>>>>           Severity: normal
>>>>           Priority: P1
>>>>          Component: Other
>>>>           Assignee: akpm@linux-foundation.org
>>>>           Reporter: newsmails@netcourrier.com
>>>>         Regression: No
>>>
>>> Thanks.  That's a strange looking trace.  I'll optimistically cc some
>>> people who have been working in that area lately.
>>>
>>> What caused this kernel to be tainted?
>>>
>>>
>>>> laptop Skylake i915 Distribution : slackware 14.2 32 bits
>>>>
>>>> Oct 23 17:38:22 linuxp kernel: [141330.499234] BUG: unable to handle page fault for address: 021d202d
>>>> Oct 23 17:38:22 linuxp kernel: [141330.499245] #PF: supervisor read access in kernel mode
>>>> Oct 23 17:38:22 linuxp kernel: [141330.499250] #PF: error_code(0x0000) - not-present page
>>>> Oct 23 17:38:22 linuxp kernel: [141330.499265] Oops: 0000 [#2] SMP PTI
>>
>> #2 means this is not the first oops. Do you have the very first?
>>
> Yes sorry.
> It was a resume too as you will see with the time.
> For oct 23 17:38 it is a resume too i think : i think that I hibernated
> and i forgot to look at something so i resumed.

It's always 021d202d (3 times) and always where a vma might be accessed (a /proc
file, khugepaged(), acct_collect()), so I would assume a struct vma was corrupted
in the hibernate/resume process.
It could also be firmware related AFAIK, and there's the I taint flag, which
means some buggy firmware workaround is in effect.

> Oct 23 13:22:10 linuxp dhcpcd[18199]: dhcpcd not running
> Oct 23 15:55:49 linuxp dhcpcd[27045]: dhcpcd not running
> Oct 23 15:55:49 linuxp dhcpcd[27053]: dhcpcd not running
> Oct 23 15:55:50 linuxp dhcpcd[27061]: dhcpcd not running
> Oct 23 15:55:50 linuxp dhcpcd[27067]: dhcpcd not running
> Oct 23 17:31:13 linuxp kernel: [140897.356150] iwlwifi 0000:03:00.0: RF_KILL bit toggled to enable radio.
> Oct 23 17:31:16 linuxp kernel: [140900.724013] Bluetooth: hci0: unexpected event for opcode 0xfc2f
> Oct 23 17:31:39 linuxp kernel: [140928.245135] BUG: unable to handle page fault for address: 021d2001
> Oct 23 17:31:39 linuxp kernel: [140928.245147] #PF: supervisor read access in kernel mode
> Oct 23 17:31:39 linuxp kernel: [140928.245152] #PF: error_code(0x0000) - not-present page
> Oct 23 17:31:39 linuxp kernel: [140928.245169] Oops: 0000 [#1] SMP PTI
> Oct 23 17:31:39 linuxp kernel: [140928.245179] CPU: 1 PID: 2302 Comm: Breakpad Server Tainted: G          I       5.9.1 #1
> Oct 23 17:31:39 linuxp kernel: [140928.245184] Hardware name: Notebook                         W65_W67RZ/W65_W67RZ, BIOS 1.05.06 02/22/2016
> Oct 23 17:31:39 linuxp kernel: [140928.245197] EIP: m_next+0x1c/0x44
> Oct 23 17:31:39 linuxp kernel: [140928.245205] Code: 24 08 d4 e6 c1 e8 1a 77 e2 ff eb d6 cc cc 3e 8d 74 26 00 55 89 e5 57 56 53 8b 40 44 8b 58 0c 39 da 74 24 8b 42 08 85 c0 74 0e <8b> 30 31 ff 89 31 89 79 04 5b 5e 5f 5d c3 be ff ff ff ff 31 ff 85
> Oct 23 17:31:39 linuxp kernel: [140928.245212] EAX: 021d2001 EBX: 00000000 ECX: eeb9e56c EDX: eca0f000
> Oct 23 17:31:39 linuxp kernel: [140928.245216] ESI: 00000000 EDI: c128cad0 EBP: e59bff0c ESP: e59bff00
> Oct 23 17:31:39 linuxp kernel: [140928.245221] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010202
> Oct 23 17:31:39 linuxp kernel: [140928.245224] CR0: 80050033 CR2: 021d2001 CR3: 2606c000 CR4: 003506f0
> Oct 23 17:31:39 linuxp kernel: [140928.245228] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> Oct 23 17:31:39 linuxp kernel: [140928.245231] DR6: fffe0ff0 DR7: 00000400
> Oct 23 17:31:39 linuxp kernel: [140928.245233] Call Trace:
> Oct 23 17:31:39 linuxp kernel: [140928.245242]  ? quota_send_warning+0x220/0x220
> Oct 23 17:31:39 linuxp kernel: [140928.245248]  seq_read+0x2bc/0x3e1
> Oct 23 17:31:39 linuxp kernel: [140928.245254]  ? quota_send_warning+0x220/0x220
> Oct 23 17:31:39 linuxp kernel: [140928.245260]  ? seq_open_private+0x17/0x17
> Oct 23 17:31:39 linuxp kernel: [140928.245266]  vfs_read+0x85/0x17f
> Oct 23 17:31:39 linuxp kernel: [140928.245272]  ? mutex_lock+0x10/0x33
> Oct 23 17:31:39 linuxp kernel: [140928.245277]  ksys_read+0x51/0xb6
> Oct 23 17:31:39 linuxp kernel: [140928.245283] __ia32_sys_read+0x15/0x17
> Oct 23 17:31:39 linuxp kernel: [140928.245289] do_int80_syscall_32+0x2c/0x39
> Oct 23 17:31:39 linuxp kernel: [140928.245295] entry_INT80_32+0xf7/0xf7
> Oct 23 17:31:39 linuxp kernel: [140928.245299] EIP: 0xafc787c8
> Oct 23 17:31:39 linuxp kernel: [140928.245304] Code: 00 00 c6 47 04 01 8b 47 08 85 c0 75 b6 80 7f 04 00 75 5c 8b 37 ba 00 04 00 00 8d 4c 07 0c 29 c2 b8 03 00 00 00 53 89 f3 cd 80 <5b> 89 c6 3d 01 f0 ff ff 73 32 85 f6 78 37 74 c8 01 77 08 8b 47 08
> Oct 23 17:31:39 linuxp kernel: [140928.245308] EAX: ffffffda EBX: 00000040 ECX: a3c194fc EDX: 000003d8
> Oct 23 17:31:39 linuxp kernel: [140928.245311] ESI: 00000040 EDI: a3c194c8 EBP: 995fece8 ESP: 995feccc
> Oct 23 17:31:39 linuxp kernel: [140928.245316] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000216
> Oct 23 17:31:39 linuxp kernel: [140928.245320] Modules linked in: appletalk psnap llc ipv6 fuse uvcvideo videobuf2_vmalloc videobuf2_memops btusb videobuf2_v4l2 hid_generic btrtl btbcm videobuf2_common btintel videodev bluetooth mc usbhid hid ecdh_generic ecc rtsx_pci_sdmmc joydev mmc_core snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i2c_dev ledtrig_audio coretemp i915 hwmon iwlmvm r8169 i2c_algo_bit x86_pkg_temp_thermal mac80211 drm_kms_helper intel_powerclamp rtsx_pci realtek drm kvm_intel mdio_devres mfd_core libphy kvm intel_gtt irqbypass crc32_pclmul iwlwifi snd_hda_intel agpgart psmouse evdev crc32c_intel serio_raw fb_sys_fops cfg80211 snd_intel_dspcfg snd_hda_codec rfkill snd_hda_core syscopyarea wmi thermal snd_hwdep battery snd_pcm sysfillrect sysimgblt snd_timer xhci_pci i2c_i801 button xhci_hcd snd i2c_smbus intel_pch_thermal mei_me soundcore video mei i2c_core acpi_pad ac loop
> Oct 23 17:31:39 linuxp kernel: [140928.245396] CR2: 00000000021d2001
> Oct 23 17:31:39 linuxp kernel: [140928.245402] ---[ end trace c79bfd2669dd9a26 ]---
> Oct 23 17:31:39 linuxp kernel: [140928.245408] EIP: m_next+0x1c/0x44
> Oct 23 17:31:39 linuxp kernel: [140928.245412] Code: 24 08 d4 e6 c1 e8 1a 77 e2 ff eb d6 cc cc 3e 8d 74 26 00 55 89 e5 57 56 53 8b 40 44 8b 58 0c 39 da 74 24 8b 42 08 85 c0 74 0e <8b> 30 31 ff 89 31 89 79 04 5b 5e 5f 5d c3 be ff ff ff ff 31 ff 85
> Oct 23 17:31:39 linuxp kernel: [140928.245417] EAX: 021d2001 EBX: 00000000 ECX: eeb9e56c EDX: eca0f000
> Oct 23 17:31:39 linuxp kernel: [140928.245420] ESI: 00000000 EDI: c128cad0 EBP: e59bff0c ESP: e59bff00
> Oct 23 17:31:39 linuxp kernel: [140928.245424] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010202
> Oct 23 17:31:39 linuxp kernel: [140928.245427] CR0: 80050033 CR2: 021d2001 CR3: 2606c000 CR4: 003506f0
> Oct 23 17:31:39 linuxp kernel: [140928.245431] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> Oct 23 17:31:39 linuxp kernel: [140928.245434] DR6: fffe0ff0 DR7: 00000400
> Oct 23 17:32:27 linuxp dhcpcd[27294]: dhcpcd not running
> Oct 23 17:32:30 linuxp dhcpcd[27305]: dhcpcd not running
> Oct 23 17:32:33 linuxp dhcpcd[27314]: dhcpcd not running
> Oct 23 17:32:35 linuxp dhcpcd[27325]: dhcpcd not running
> Oct 23 17:32:36 linuxp dhcpcd[27331]: dhcpcd not running
> Oct 23 17:38:22 linuxp kernel: [141330.499234] BUG: unable to handle page fault for address: 021d202d
> Oct 23 17:38:22 linuxp kernel: [141330.499245] #PF: supervisor read access in kernel mode
> Oct 23 17:38:22 linuxp kernel: [141330.499250] #PF: error_code(0x0000) - not-present page
> Oct 23 17:38:22 linuxp kernel: [141330.499265] Oops: 0000 [#2] SMP PTI
> Oct 23 17:38:22 linuxp kernel: [141330.499274] CPU: 0 PID: 37 Comm: khugepaged Tainted: G      D   I       5.9.1 #1
> Oct 23 17:38:22 linuxp kernel: [141330.499278] Hardware name: Notebook                         W65_W67RZ/W65_W67RZ, BIOS 1.05.06 02/22/2016
> Oct 23 17:38:22 linuxp kernel: [141330.499289] EIP: khugepaged+0x599/0x2226
>
>
>
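
For anyone reading the "Tainted:" fields in the two traces above, the letters can be looked up in a small table. What follows is only a minimal sketch in Python and is not something posted in the thread. It covers just the three letters that show up here, with meanings as given in the kernel's tainted-kernels documentation. The names TAINT_FLAGS and decode_taint are made up for illustration.

```python
# Partial lookup table: only the taint letters seen in the traces above.
# TAINT_FLAGS and decode_taint are hypothetical names used for this sketch.
TAINT_FLAGS = {
    "G": "no proprietary module loaded (counterpart of 'P')",
    "D": "kernel has died recently (an earlier oops or BUG)",
    "I": "workaround for a platform firmware bug applied",
}


def decode_taint(field):
    """Describe every non-blank letter in a 'Tainted:' field."""
    return [
        "{}: {}".format(ch, TAINT_FLAGS.get(ch, "not in this partial table"))
        for ch in field
        if not ch.isspace()
    ]


if __name__ == "__main__":
    # Taint field as it appears in the second oops (khugepaged).
    for entry in decode_taint("G      D   I"):
        print(entry)
```

In the first oops only G and I are set. D appears in the second one because the kernel had already oopsed once by then, which matches the [#2] counter.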