Subject: Re: BUG: Bad page map in process/Bad Swap file entry, RPI CM4 on clone syscall
From: Max Schulze
To: Will Deacon
Cc: linux-arm-kernel@lists.infradead.org, catalin.marinas@arm.com, naush@raspberrypi.com, linux-mm@kvack.org, akpm@linux-foundation.org
References: <20220815142213.GA10448@willie-the-truck>
Date: Thu, 18 Aug 2022 19:14:12 +0200

Hello,

> On 15.08.22 16:22, Will Deacon wrote:
>>
>>> [...]
>>>
>>>
>>> [20:47:09] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
>>> [20:48:46] BUG: Bad page map in process projecta  pte:1110111111111111 pmd:800000001c40003
>>> [20:48:46] addr:0000007fa1c00000 vm_flags:00100073 anon_vma:ffffff805bf80d08 mapping:0000000000000000 index:7fa1c00
>>> [20:48:46] file:(null) fault:0x0 mmap:0x0 read_folio:0x0
>
>> I hate to say it, but this all looks like memory corruption hitting the
>> page table and possibly the 'struct page' array to me :/
>
> Perhaps a note on the occurrence: across devices, the "bad page map" differs at pte, but is mostly consistent at pmd:800000001c40003 (though I have seen 800000001c02003 and 800000001c40003). Is this some "magic value"? If not, I think it would be highly unlikely to be the hardware.
>
> It is not only my program that has the problem; I have also seen
>
> [Sun Aug 14 17:30:38 2022] BUG: Bad page map in process llvmpipe-3 pte:262d2626292a2627 pmd:800000001c01003
>
> and
> [Sat Aug 13 11:53:43 2022] BUG: Bad page map in process Xorg:disk$1 pte:a098a09aa29ea8a4 pmd:800000001c01003
> [Sat Aug 13 11:53:43 2022] addr:00000055a961e000 vm_flags:200100073 anon_vma:ffffff804c07d8f8 mapping:0000000000000000 index:55a961e
> [Sat Aug 13 11:53:43 2022] file:(null) fault:0x0 mmap:0x0 read_folio:0x0
>
> [..]

I am able to reproduce this on 6.0.0-rc1. It looks like vm_normal_page() does not recognize the page as being "normal" (?).

(mm/memory.c)
>		if (likely(!pte_special(pte)))
>			goto check_pfn;
>		if (vma->vm_ops && vma->vm_ops->find_special_page)
>			return vma->vm_ops->find_special_page(vma, addr);
>		if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
>			return NULL;
>		if (is_zero_pfn(pfn))
>			return NULL;
>		if (pte_devmap(pte))
>[...]
>			return NULL;
>
>		print_bad_pte(vma, addr, pte, NULL);

What would be helpful to do next? Is the KASAN warning a consequence of the bad page map, or its cause?

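As a rough cross-check of the numbers in the log below, here is a minimal sketch that decodes one of the reported pmd values and recomputes the faulting address from the second bad pte. It rests on assumptions that are not stated anywhere in this mail: 4 KiB pages, descriptor address bits [47:12], sizeof(struct page) == 64 (suggested by the "lsl x19, x19, #6" in the code dump), and the vmemmap base being the value in x23. The helper names are made up for illustration.

```python
# Assumptions (not confirmed in this thread): 4 KiB pages, output address in
# descriptor bits [47:12], 64-byte struct page, vmemmap base taken from x23.
PAGE_SHIFT = 12
ADDR_MASK = ((1 << 48) - 1) & ~((1 << PAGE_SHIFT) - 1)
STRUCT_PAGE_SIZE = 64               # matches "lsl x19, x19, #6"
VMEMMAP_BASE = 0xFFFFFFFE00000000   # x23 in the register dump
U64_MASK = (1 << 64) - 1


def decode_table_descriptor(pmd):
    """Split a pmd value into the fields of an arm64 table descriptor."""
    return {
        "valid_table": (pmd & 0b11) == 0b11,   # bits [1:0] == 0b11
        "next_table_pa": pmd & ADDR_MASK,      # physical address of the pte page
        "pxn_table": bool((pmd >> 59) & 1),    # architectural PXNTable bit
    }


def struct_page_addr(pte):
    """Treat the pte as valid and compute where its struct page would live."""
    pfn = (pte & ADDR_MASK) >> PAGE_SHIFT
    # 64-bit wrap-around, as the CPU's address arithmetic would do
    addr = (VMEMMAP_BASE + pfn * STRUCT_PAGE_SIZE) & U64_MASK
    return pfn, addr


pmd = decode_table_descriptor(0x800000001C01003)
pfn, addr = struct_page_addr(0x2626262023222323)
print(pmd)
print(hex(pfn), hex(addr))
```

Under these assumptions the pmd decodes as an ordinary-looking table descriptor, and the computed struct page address equals the address in the KASAN report and the oops (00000096808c8880). That would point to the KASAN warning being a consequence of the bad pte rather than its cause, but this is only arithmetic on the logged values, not a confirmed diagnosis.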
[ 18:42:59]
[ 18:44:17] BUG: Bad page map in process projecta pte:212725231f242323 pmd:800000001c01003
[ 18:44:17] addr:0000007fa1000000 vm_flags:00100073 anon_vma:ffffff8054090c38 mapping:0000000000000000 index:7fa1000
[ 18:44:17] file:(null) fault:0x0 mmap:0x0 read_folio:0x0
[ 18:44:17] CPU: 3 PID: 1135 Comm: projecta Tainted: G C 6.0.0-rc1-v8-gc8f41281d1f4 #2
[ 18:44:17] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[ 18:44:17] Call trace:
[ 18:44:17] dump_backtrace.part.0 (arch/arm64/kernel/stacktrace.c:184)
[ 18:44:17] show_stack (arch/arm64/kernel/stacktrace.c:191)
[ 18:44:17] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4))
[ 18:44:17] dump_stack (lib/dump_stack.c:114)
[ 18:44:17] print_bad_pte (mm/memory.c:567 (discriminator 12))
[ 18:44:17] vm_normal_page (mm/memory.c:638)
[ 18:44:17] copy_page_range (mm/memory.c:951 mm/memory.c:1085 mm/memory.c:1171 mm/memory.c:1208 mm/memory.c:1232 mm/memory.c:1330)
[ 18:44:17] dup_mm (kernel/fork.c:699 kernel/fork.c:1524)
[ 18:44:17] copy_process (kernel/fork.c:1576 kernel/fork.c:2256)
[ 18:44:17] kernel_clone (kernel/fork.c:2673)
[ 18:44:17] __do_sys_clone (kernel/fork.c:2808)
[ 18:44:17] __arm64_sys_clone (kernel/fork.c:2775)
[ 18:44:17] invoke_syscall (arch/arm64/kernel/syscall.c:38 arch/arm64/kernel/syscall.c:52)
[ 18:44:17] el0_svc_common.constprop.0 (./arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/syscall.c:150)
[ 18:44:17] do_el0_svc (arch/arm64/kernel/syscall.c:207)
[ 18:44:17] el0_svc (arch/arm64/kernel/entry-common.c:133 arch/arm64/kernel/entry-common.c:142 arch/arm64/kernel/entry-common.c:625)
[ 18:44:17] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:643)
[ 18:44:17] el0t_64_sync (arch/arm64/kernel/entry.S:581)
[ 18:44:17] Disabling lock debugging due to kernel taint
[ 18:44:17] BUG: Bad page map in process projecta pte:2626262023222323 pmd:800000001c01003
[ 18:44:17] addr:0000007fa1001000 vm_flags:00100073 anon_vma:ffffff8054090c38 mapping:0000000000000000 index:7fa1001
[ 18:44:17] file:(null) fault:0x0 mmap:0x0 read_folio:0x0
[ 18:44:17] CPU: 3 PID: 1135 Comm: projecta Tainted: G B C 6.0.0-rc1-v8-gc8f41281d1f4 #2
[ 18:44:17] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[ 18:44:17] Call trace:
[ 18:44:17] dump_backtrace.part.0 (arch/arm64/kernel/stacktrace.c:184)
[ 18:44:17] show_stack (arch/arm64/kernel/stacktrace.c:191)
[ 18:44:17] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4))
[ 18:44:17] dump_stack (lib/dump_stack.c:114)
[ 18:44:17] print_bad_pte (mm/memory.c:567 (discriminator 12))
[ 18:44:17] vm_normal_page (mm/memory.c:638)
[ 18:44:17] copy_page_range (mm/memory.c:951 mm/memory.c:1085 mm/memory.c:1171 mm/memory.c:1208 mm/memory.c:1232 mm/memory.c:1330)
[ 18:44:17] dup_mm (kernel/fork.c:699 kernel/fork.c:1524)
[ 18:44:17] copy_process (kernel/fork.c:1576 kernel/fork.c:2256)
[ 18:44:17] kernel_clone (kernel/fork.c:2673)
[ 18:44:17] __do_sys_clone (kernel/fork.c:2808)
[ 18:44:17] __arm64_sys_clone (kernel/fork.c:2775)
[ 18:44:17] invoke_syscall (arch/arm64/kernel/syscall.c:38 arch/arm64/kernel/syscall.c:52)
[ 18:44:17] el0_svc_common.constprop.0 (./arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/syscall.c:150)
[ 18:44:17] do_el0_svc (arch/arm64/kernel/syscall.c:207)
[ 18:44:17] el0_svc (arch/arm64/kernel/entry-common.c:133 arch/arm64/kernel/entry-common.c:142 arch/arm64/kernel/entry-common.c:625)
[ 18:44:17] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:643)
[ 18:44:17] el0t_64_sync (arch/arm64/kernel/entry.S:581)
[ 18:44:17] ==================================================================
[ 18:44:17] BUG: KASAN: wild-memory-access in __sync_icache_dcache (./include/asm-generic/bitops/generic-non-atomic.h:127 arch/arm64/mm/flush.c:62)
[ 18:44:17] Read of size 8 at addr 00000096808c8880 by task projecta/1135
[ 18:44:17] CPU: 3 PID: 1135 Comm: projecta Tainted: G B C 6.0.0-rc1-v8-gc8f41281d1f4 #2
[ 18:44:17] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[ 18:44:17] Call trace:
[ 18:44:17] dump_backtrace.part.0 (arch/arm64/kernel/stacktrace.c:184)
[ 18:44:17] show_stack (arch/arm64/kernel/stacktrace.c:191)
[ 18:44:17] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4))
[ 18:44:17] print_report (mm/kasan/report.c:438)
[ 18:44:17] kasan_report (mm/kasan/report.c:162 mm/kasan/report.c:497)
[ 18:44:17] __asan_load8 (mm/kasan/generic.c:256)
[ 18:44:17] __sync_icache_dcache (./include/asm-generic/bitops/generic-non-atomic.h:127 arch/arm64/mm/flush.c:62)
[ 18:44:17] copy_page_range (./arch/arm64/include/asm/pgtable.h:327 ./arch/arm64/include/asm/pgtable.h:358 mm/memory.c:994 mm/memory.c:1085 mm/memory.c:1171 mm/memory.c:1208 mm/memory.c:1232 mm/memory.c:1330)
[ 18:44:17] dup_mm (kernel/fork.c:699 kernel/fork.c:1524)
[ 18:44:17] copy_process (kernel/fork.c:1576 kernel/fork.c:2256)
[ 18:44:17] kernel_clone (kernel/fork.c:2673)
[ 18:44:17] __do_sys_clone (kernel/fork.c:2808)
[ 18:44:17] __arm64_sys_clone (kernel/fork.c:2775)
[ 18:44:17] invoke_syscall (arch/arm64/kernel/syscall.c:38 arch/arm64/kernel/syscall.c:52)
[ 18:44:17] el0_svc_common.constprop.0 (./arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/syscall.c:150)
[ 18:44:17] do_el0_svc (arch/arm64/kernel/syscall.c:207)
[ 18:44:17] el0_svc (arch/arm64/kernel/entry-common.c:133 arch/arm64/kernel/entry-common.c:142 arch/arm64/kernel/entry-common.c:625)
[ 18:44:17] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:643)
[ 18:44:17] el0t_64_sync (arch/arm64/kernel/entry.S:581)
[ 18:44:17] ==================================================================
[ 18:44:17] Unable to handle kernel paging request at virtual address 00000096808c8880
[ 18:44:17] Mem abort info:
[ 18:44:17]   ESR = 0x0000000096000004
[ 18:44:17]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 18:44:17]   SET = 0, FnV = 0
[ 18:44:17]   EA = 0, S1PTW = 0
[ 18:44:17]   FSC = 0x04: level 0 translation fault
[ 18:44:17] Data abort info:
[ 18:44:17]   ISV = 0, ISS = 0x00000004
[ 18:44:17]   CM = 0, WnR = 0
[ 18:44:17] [00000096808c8880] address between user and kernel address ranges
[ 18:44:17] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 18:44:17] Modules linked in: rtc_pcf85063 regmap_i2c ov9281 rfkill bcm2835_unicam v4l2_dv_timings v4l2_fwnode v3d bcm2835_v4l2(C) v4l2_async bcm2835_codec(C) bcm2835_isp(C) videobuf2_vmalloc rpivid_hevc(C) v4l2_mem2mem drm_shmem_helper bcm2835_mmal_vchiq(C) gpu_sched videobuf2_dma_contig videobuf2_memops i2c_mux_pinctrl videobuf2_v4l2 videobuf2_common raspberrypi_hwmon i2c_mux videodev i2c_brcmstb i2c_bcm2835 vc_sm_cma(C) mc uio_pdrv_genirq nvmem_rmem uio drm fuse drm_panel_orientation_quirks backlight ipv6
[ 18:44:17] CPU: 3 PID: 1135 Comm: projecta Tainted: G B C 6.0.0-rc1-v8-gc8f41281d1f4 #2
[ 18:44:17] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[ 18:44:17] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 18:44:17] pc : __sync_icache_dcache (./include/asm-generic/bitops/generic-non-atomic.h:127 arch/arm64/mm/flush.c:62)
[ 18:44:17] lr : __sync_icache_dcache (./include/asm-generic/bitops/generic-non-atomic.h:127 arch/arm64/mm/flush.c:62)
[ 18:44:17] sp : ffffffc00d067630
[ 18:44:17] x29: ffffffc00d067630 x28: 0400000000000001 x27: 2626262023222323
[ 18:44:17] x26: 0000007fa1001000 x25: fffffffe010f2ce8 x24: 0000000000000000
[ 18:44:17] x23: fffffffe00000000 x22: 00000096808c8880 x21: 1ffffff801a0cece
[ 18:44:17] x20: 0000000000000000 x19: 00000098808c8880 x18: 0000000000000000
[ 18:44:17] x17: 3d3d3d3d3d3d3d3d x16: 3d3d3d3d3d3d3d3d x15: 3d3d3d3d3d3d3d3d
[ 18:44:17] x14: 3d3d3d3d3d3d3d3d x13: 3d3d3d3d3d3d3d3d x12: ffffffb8014cd81d
[ 18:44:17] x11: 1ffffff8014cd81c x10: ffffffb8014cd81c x9 : dfffffc000000000
[ 18:44:17] x8 : ffffffc00a66c0e7 x7 : 00000047feb327e4 x6 : 0000000000000001
[ 18:44:17] x5 : ffffffc00a66c0e0 x4 : ffffffb8014cd81d x3 : ffffffc0080b68e4
[ 18:44:17] x2 : 0000000000000000 x1 : ffffff804f3e0040 x0 : 0000000000000001
[ 18:44:17] Call trace:
[ 18:44:17] __sync_icache_dcache (./include/asm-generic/bitops/generic-non-atomic.h:127 arch/arm64/mm/flush.c:62)
[ 18:44:17] copy_page_range (./arch/arm64/include/asm/pgtable.h:327 ./arch/arm64/include/asm/pgtable.h:358 mm/memory.c:994 mm/memory.c:1085 mm/memory.c:1171 mm/memory.c:1208 mm/memory.c:1232 mm/memory.c:1330)
[ 18:44:17] dup_mm (kernel/fork.c:699 kernel/fork.c:1524)
[ 18:44:17] copy_process (kernel/fork.c:1576 kernel/fork.c:2256)
[ 18:44:17] kernel_clone (kernel/fork.c:2673)
[ 18:44:17] __do_sys_clone (kernel/fork.c:2808)
[ 18:44:17] __arm64_sys_clone (kernel/fork.c:2775)
[ 18:44:17] invoke_syscall (arch/arm64/kernel/syscall.c:38 arch/arm64/kernel/syscall.c:52)
[ 18:44:17] el0_svc_common.constprop.0 (./arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/syscall.c:150)
[ 18:44:17] do_el0_svc (arch/arm64/kernel/syscall.c:207)
[ 18:44:17] el0_svc (arch/arm64/kernel/entry-common.c:133 arch/arm64/kernel/entry-common.c:142 arch/arm64/kernel/entry-common.c:625)
[ 18:44:17] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:643)
[ 18:44:17] el0t_64_sync (arch/arm64/kernel/entry.S:581)
[ 18:44:17] Code: d37ae673 8b170276 aa1603e0 940f8ac1 (f8776a60)
All code
========
   0: d37ae673  lsl x19, x19, #6
   4: 8b170276  add x22, x19, x23
   8: aa1603e0  mov x0, x22
   c: 940f8ac1  bl 0x3e2b10
  10:* f8776a60  ldr x0, [x19, x23]  <-- trapping instruction

Code starting with the faulting instruction
===========================================
   0: f8776a60  ldr x0, [x19, x23]
[ 18:44:17] ---[ end trace 0000000000000000 ]---
[ 18:44:17] note: projecta[1135] exited with preempt_count 2

Thanks,
Max