From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <57879ac8-eaf5-48f1-b4ef-6619d9108440@yoseli.org>
Date: Thu, 27 Jun 2024 14:36:50 +0200
Subject: Re: m68k 54418 fails to execute user space
From: Jean-Michel Hautbois
To: Michael Schmitz, linux-m68k@lists.linux-m68k.org, linux-mm@kvack.org, linux-mtd@lists.infradead.org
Cc: Greg Ungerer, Geert Uytterhoeven, Christoph Hellwig, wbx@openadk.org
References: <735e19b6-3747-417f-ba5b-1a7da137a3a3@yoseli.org> <7fb2988d-ab89-405f-8cf1-edcdd2196376@gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed

Michael,

On 26/06/2024 21:36, Michael Schmitz wrote:
> Jean-Michel,
>
> On 27/06/24 01:28, Jean-Michel Hautbois wrote:
>> Hi Michael,
>>
>> On 26/06/2024 03:56, Michael Schmitz wrote:
>>> Jean-Michel,
>>>
>>> On 24/06/24 20:56, Jean-Michel Hautbois wrote:
>>>>
>>>> When I printk the do_page_fault first debug, I get for the first
>>>> call to ls:
>>>> bash-5.2# ls
>>>> [   14.700000] do page fault:
>>>> [   14.700000] regs->sr=0x0, regs->pc=0x70069ee6,
>>>> address=0x70069ee6, 0, (ptrval)
>>>
>>> Page not present, read fault. Please disable obfuscation of kernel
>>> pointer addresses by printk. Maybe also disable address space
>>> randomization while debugging this.
>>>
>>>> This call works almost fine (I still have the assert failed:
>>>> folio->private != NULL issue).
>>>>
>>>> And when I call it a second time, I get:
>>>> bash-5.2# ls
>>>> [   19.820000] do page fault:
>>>> [   19.820000] regs->sr=0x0, regs->pc=0x6011d65a,
>>>> address=0x700e2004, 2, (ptrval)
>>>
>>> Page not present, write fault.
>>>
>>> It would be helpful if you could get a dump of /proc/1/maps before
>>> the execve() syscall in your helloworld init replacement. That might
>>> confirm all these addresses are legit (assuming mappings survive
>>> across execve(), that is), and what they correspond to.
>>>
>>>>
>>>> The address corresponds to the defined zone ELF_ET_DYN_BASE as I set
>>>> it to 0x70000000.
>>>>
>>>> regs->pc is not the same as the address. It might be unrelevant, but
>>>> any help is appreciated to understand the process behind :-).
>>>>
>>>> I keep digging, and I am in the asm part which fears me a bit !
>>>
>>> I don't see that you'd need to look at any asm code here.
>>
>> I add a small test in do_page_fault, and in case of an error, it
>> panics. The result follows:
>
> Please take a look at the comments at the start of
> arch/m68k/mm/fault.c:do_page_fault(). The meaning of the bits in
> error_code are explained there.
>
> error_code != 0 is just one possible case out of the four that are
> handled by do_page_fault(). It does not signify 'no error' - if there
> hadn't been a page fault, do_page_fault() would not have been called.
>
> You just forced a panic each time a write fault and/or a protection
> fault happens. Write faults are absolutely expected to happen when
> loading a library - ld.so needs to perform relocation after loading a
> dynamic library, and that means writes to the GOT in the library's data
> segment (PIC assumed).
>
>
>>  ./scripts/decode_stacktrace.sh vmlinux < /tmp/trace.log
>> [    3.857000] Run /bin/bash as init process
>> [    3.858000]   with arguments:
>> [    3.861000]     /bin/bash
>> [    3.862000]   with environment:
>> [    3.863000]     HOME=/
>> [    3.864000]     TERM=linux
>> [    4.242000] do page fault:
>> [    4.242000] regs->sr=0x2000, regs->pc=0x41366924,
>> address=0x700b3364, 2, 41fb0000
>> [    4.242000] Kernel panic - not syncing: page fault error
>> [    4.242000] CPU: 0 PID: 1 Comm: bash Not tainted
>> 6.10.0-rc5-g927da6cf01fe-dirty #25
>> [    4.242000] Stack from 4186dda8:
>> [    4.242000]         4186dda8 41423aa4 41423aa4 700b3300 00000001
>> 00000000 4136ee10 41423aa4
>> [    4.242000]         41366d7a 700b3364 700b3364 00000000 0000000d
>> 4186de60 41fb0000 41d51a60
>> [    4.242000]         41005696 41416a90 41416a4d 00002000 41366924
>> 700b3364 00000002 41fb0000
>> [    4.242000]         0000000a 700b3364 00000000 0000000d 00000012
>> 41d51a00 4186de60 41d51a60
>> [    4.242000]         41fb81c0 41d51a60 410052fe 4100529a 4186de60
>> 700b3364 00000002 00000000
>> [    4.242000]         700bc414 00000003 00008000 700ac000 41003660
>> 4186de60 00000000 00000000
>> [    4.242000] Call Trace: dump_stack (lib/dump_stack.c:124)
>> [    4.242000] panic (kernel/panic.c:266 kernel/panic.c:368)
>> [    4.242000] do_page_fault (arch/m68k/mm/fault.c:88 (discriminator 1))
>> [    4.242000] __clear_user (arch/m68k/lib/uaccess.c:108)
>> [    4.242000] buserr_c (arch/m68k/kernel/traps.c:725
>> arch/m68k/kernel/traps.c:775)
>> [    4.242000] buserr_c (arch/m68k/kernel/traps.c:748
>> arch/m68k/kernel/traps.c:775)
>> [    4.242000] buserr (arch/m68k/kernel/entry.S:116)
>> [    4.242000] ma_slots (lib/maple_tree.c:759)
>> [    4.242000] __clear_user (arch/m68k/lib/uaccess.c:108)
>> [    4.242000] elf_load (fs/binfmt_elf.c:125 (discriminator 1)
>> fs/binfmt_elf.c:421 (discriminator 1))
>> [    4.242000] load_elf_binary (fs/binfmt_elf.c:1132)
>> [    4.242000] memset (arch/m68k/lib/memset.c:11)
>> [    4.242000] load_misc_binary (fs/binfmt_misc.c:97
>> fs/binfmt_misc.c:146 fs/binfmt_misc.c:213)
>> [    4.242000] memset (arch/m68k/lib/memset.c:11)
>> [    4.242000] bprm_execve (fs/exec.c:1797 fs/exec.c:1839
>> fs/exec.c:1891 fs/exec.c:1867)
>> [    4.242000] copy_strings_kernel (fs/exec.c:669)
>> [    4.242000] count_strings_kernel (fs/exec.c:473)
>> [    4.242000] kernel_execve (fs/exec.c:2058)
>> [    4.242000] __dynamic_pr_debug (lib/dynamic_debug.c:865)
>> [    4.242000] run_init_process (init/main.c:1389)
>> [    4.242000] _printk (kernel/printk/printk.c:2365)
>> [    4.242000] kernel_init (init/main.c:1508)
>> [    4.242000] kernel_init (init/main.c:1459)
>> [    4.242000] ret_from_kernel_thread (arch/m68k/kernel/entry.S:142)
>> [    4.242000]
>> [    4.242000] ---[ end Kernel panic - not syncing: page fault error ]---
>>
>> Looks like a memory mapping failure, but why ?
>> My JTAG at this point dumps a list of 0s at 0x41fb0000 and my SDRAM
>> starts at 0x40000000 and ends at 0x50000000 (256MB).
> 0x41fb0000 seems to be init's page directory. The fault address is in
> the range where I'd expect dynamic libraries to reside.
>>
>> It looks like a TLB write miss which is obscure to me :-).
>>
>> I tried to use the /proc but as expected it is not alive after
>> mounting it.
>
> The memory map ought to be accessible through sysrq - an alternative
> would be to modify the ELF binfmt handler and dump the map once ld.so
> has finished with relocations.
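
For reference, the two error_code bits documented in the comment above do_page_fault() (bit 0: 0 = no page found, 1 = protection fault; bit 1: 0 = read, 1 = write) can be decoded with a small userspace helper. This is only a sketch: FAULT_PROTECTION, FAULT_WRITE, struct fault_info and the function names are invented for illustration and are not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Bit meanings taken from the comment above do_page_fault() in
 * arch/m68k/mm/fault.c. The macro names are hypothetical, not kernel symbols.
 */
#define FAULT_PROTECTION 0x1UL	/* bit 0: 0 = no page found, 1 = protection fault */
#define FAULT_WRITE      0x2UL	/* bit 1: 0 = read access, 1 = write access */

struct fault_info {
	bool protection;	/* true: page present but access not permitted */
	bool write;		/* true: faulting access was a write */
};

/* Split an error_code value into its two documented flags. */
static struct fault_info decode_error_code(unsigned long error_code)
{
	struct fault_info info = {
		.protection = (error_code & FAULT_PROTECTION) != 0,
		.write = (error_code & FAULT_WRITE) != 0,
	};

	return info;
}

/* Print a one-line, human-readable description of an error_code value. */
static void print_fault(unsigned long error_code)
{
	struct fault_info info = decode_error_code(error_code);

	printf("error_code=%lu: %s, %s fault\n", error_code,
	       info.protection ? "protection violation" : "page not present",
	       info.write ? "write" : "read");
}

/* Check the two values seen in the logs quoted above (0 and 2). */
static void check_logged_values(void)
{
	assert(!decode_error_code(0).protection && !decode_error_code(0).write);
	assert(!decode_error_code(2).protection && decode_error_code(2).write);
}
```

Calling print_fault(2) from a main() prints "error_code=2: page not present, write fault", which matches the reading given in the quoted replies.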

I added a memory map dump to fs/binfmt_elf.c:

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index a43897b03ce9..395f556f3a90 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -816,6 +816,63 @@ static int parse_elf_properties(struct file *f, const struct elf_phdr *phdr,
 	return ret == -ENOENT ? 0 : ret;
 }
 
+static int dump_memory_map(struct task_struct *task)
+{
+	struct mm_struct *mm = task->mm;
+	struct vm_area_struct *vma;
+	MA_STATE(mas, &mm->mm_mt, 0, -1);
+	struct file *file;
+	struct path *path;
+	char *buf;
+	char *pathname;
+
+	// Acquire the read lock for mmap_lock
+	down_read(&mm->mmap_lock);
+	mas_lock(&mas);
+	for (vma = mas_find(&mas, ULONG_MAX); vma; vma = mas_find(&mas, ULONG_MAX)) {
+		if (vma->vm_file) {
+			buf = (char *)__get_free_page(GFP_KERNEL);
+			if (!buf) {
+				continue; // Handle memory allocation failure
+			}
+
+			file = vma->vm_file;
+			path = &file->f_path;
+			pathname = d_path(path, buf, PAGE_SIZE);
+			if (IS_ERR(pathname)) {
+				pathname = NULL;
+			}
+
+			pr_info("%lx-%lx %c%c%c%c %08lx %02x:%02x %lu %s\n",
+				vma->vm_start, vma->vm_end,
+				vma->vm_flags & VM_READ ? 'r' : '-',
+				vma->vm_flags & VM_WRITE ? 'w' : '-',
+				vma->vm_flags & VM_EXEC ? 'x' : '-',
+				vma->vm_flags & VM_MAYSHARE ? 's' : 'p',
+				vma->vm_pgoff << PAGE_SHIFT,
+				MAJOR(file->f_inode->i_rdev),
+				MINOR(file->f_inode->i_rdev),
+				file->f_inode->i_ino,
+				pathname ? pathname : "");
+
+			free_page((unsigned long)buf);
+		} else {
+			pr_info("%lx-%lx %c%c%c%c %08lx 00:00 0\n",
+				vma->vm_start, vma->vm_end,
+				vma->vm_flags & VM_READ ? 'r' : '-',
+				vma->vm_flags & VM_WRITE ? 'w' : '-',
+				vma->vm_flags & VM_EXEC ? 'x' : '-',
+				vma->vm_flags & VM_MAYSHARE ? 's' : 'p',
+				vma->vm_pgoff << PAGE_SHIFT);
+		}
+	}
+	mas_unlock(&mas);
+	// Release the read lock for mmap_lock
+	up_read(&mm->mmap_lock);
+
+	return 0;
+}
+
 static int load_elf_binary(struct linux_binprm *bprm)
 {
 	struct file *interpreter = NULL; /* to shut gcc up */
@@ -1299,6 +1356,9 @@ static int load_elf_binary(struct linux_binprm *bprm)
 
 	finalize_exec(bprm);
 	START_THREAD(elf_ex, regs, elf_entry, bprm->p);
+	if (current->pid == 1) { // Check if this is the init process
+		dump_memory_map(current);
+	}
 	retval = 0;
 out:
 	return retval;

It is quick and dirty, but it seems to do the trick. I then get this on
my console:

[    4.265000] 60000000-6001e000 r-xp 00000000 00:00 178 /lib/ld.so.1
[    4.266000] 6001e000-60022000 rw-p 0001c000 00:00 178 /lib/ld.so.1
[    4.267000] 70000000-700ac000 r-xp 00000000 00:00 27 /bin/bash
[    4.268000] 700ac000-700b4000 rw-p 000ac000 00:00 27 /bin/bash
[    4.269000] 700b4000-700be000 rwxp 700b4000 00:00 0
[    4.270000] bfe7a000-bfe9c000 rw-p bffde000 00:00 0

But nothing rings a bell for me at this level...

Thanks!
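
PS: one way to cross-check a fault address against the dump is a small userspace lookup like the one below. It is only a sketch: the table is copied by hand from the console output above, struct vma_range, init_map, find_range() and report() are hypothetical names, and it assumes the layout is the same as in the boot that produced the panic trace (address=0x700b3364).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct vma_range {
	unsigned long start;	/* first address in the mapping (inclusive) */
	unsigned long end;	/* first address past the mapping (exclusive) */
	const char *perms;
	const char *name;	/* "" marks an anonymous mapping */
};

/* Ranges copied by hand from the console dump above. */
static const struct vma_range init_map[] = {
	{ 0x60000000UL, 0x6001e000UL, "r-xp", "/lib/ld.so.1" },
	{ 0x6001e000UL, 0x60022000UL, "rw-p", "/lib/ld.so.1" },
	{ 0x70000000UL, 0x700ac000UL, "r-xp", "/bin/bash" },
	{ 0x700ac000UL, 0x700b4000UL, "rw-p", "/bin/bash" },
	{ 0x700b4000UL, 0x700be000UL, "rwxp", "" },
	{ 0xbfe7a000UL, 0xbfe9c000UL, "rw-p", "" },
};

/* Return the mapping containing addr, or NULL if addr is not mapped. */
static const struct vma_range *find_range(unsigned long addr)
{
	size_t i;

	for (i = 0; i < sizeof(init_map) / sizeof(init_map[0]); i++) {
		if (addr >= init_map[i].start && addr < init_map[i].end)
			return &init_map[i];
	}

	return NULL;
}

/* Print which mapping, if any, an address belongs to. */
static void report(unsigned long addr)
{
	const struct vma_range *r = find_range(addr);

	if (r)
		printf("%#lx: %lx-%lx %s %s\n", addr, r->start, r->end,
		       r->perms, r->name[0] ? r->name : "[anon]");
	else
		printf("%#lx: not mapped\n", addr);
}

/* The fault address from the panic trace should land in bash's rw-p mapping. */
static void check_fault_address(void)
{
	const struct vma_range *r = find_range(0x700b3364UL);

	assert(r != NULL && r->start == 0x700ac000UL);
}
```

With this table, 0x700b3364 falls inside 700ac000-700b4000 (rw-p, /bin/bash), i.e. the writable data mapping of bash. That would at least be consistent with the trace, where the fault comes from __clear_user() called from elf_load(), but I have not verified it beyond this lookup.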