From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <79f3bc0f-f54a-76ac-19a4-ea93104cd693@arm.com>
Date: Wed, 26 Jul 2023 22:28:13 +0100
Subject: Re: [PATCH v3 3/3] mm: Batch-zap large anonymous folio PTE mappings
To: Nathan Chancellor
Cc: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand,
 Yu Zhao, Yang Shi, "Huang, Ying", Zi Yan,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20230720112955.643283-1-ryan.roberts@arm.com>
 <20230720112955.643283-4-ryan.roberts@arm.com>
 <20230726161942.GA1123863@dev-arch.thelio-3990X>
 <44a91c46-08ba-9693-6c9c-a0a59921e9f1@arm.com>
 <20230726195029.GA123524@dev-arch.thelio-3990X>
 <20230726212354.GA811386@dev-arch.thelio-3990X>
From: Ryan Roberts
In-Reply-To: <20230726212354.GA811386@dev-arch.thelio-3990X>

On 26/07/2023 22:23, Nathan Chancellor wrote:
> On Wed, Jul 26, 2023 at 10:17:18PM +0100, Ryan Roberts wrote:
>> On 26/07/2023 20:50, Nathan Chancellor wrote:
>>> On Wed, Jul 26, 2023 at 08:38:25PM +0100, Ryan Roberts wrote:
>>>> On 26/07/2023 17:19, Nathan Chancellor wrote:
>>>>> Hi Ryan,
>>>>>
>>>>> On Thu, Jul 20, 2023 at 12:29:55PM +0100, Ryan Roberts wrote:
>>>>
>>>> ...
>>>>
>>>>>>
>>>>>
>>>>> After this change in -next as commit 904d9713b3b0 ("mm: batch-zap large
>>>>> anonymous folio PTE mappings"), I see the following splats several times
>>>>> when booting Debian's s390x configuration (which I have mirrored at [1])
>>>>> in QEMU (bisect log below):
>>>>>
>>>>> $ qemu-system-s390x \
>>>>>     -display none \
>>>>>     -nodefaults \
>>>>>     -M s390-ccw-virtio \
>>>>>     -kernel arch/s390/boot/bzImage \
>>>>>     -initrd rootfs.cpio \
>>>>>     -m 512m \
>>>>>     -serial mon:stdio
>>>>
>>>> I'm compiling the kernel for next-20230726 using the s390 cross compiler from kernel.org and the config you linked. Then booting with qemu-system-s390x (tried both distro's 4.2.1 and locally built 8.0.3) and the initrd you provided (tried passing it compressed and uncompressed), but I'm always getting a kernel panic due to not finding a rootfs:
>>>>
>>>> $ qemu-system-s390x \
>>>>     -display none \
>>>>     -nodefaults \
>>>>     -M s390-ccw-virtio \
>>>>     -kernel arch/s390/boot/bzImage
>>>>     -initrd ../s390-rootfs.cpio.zst \
>>>>     -m 512m \
>>>>     -serial mon:stdio
>>>> KASLR disabled: CPU has no PRNG
>>>> KASLR disabled: CPU has no PRNG
>>>> Linux version 6.5.0-rc3-next-20230726 (ryarob01@e125769) (s390-linux-gcc (GCC) 13.1.0, GNU ld (GNU Binutils) 2.40) #1 SMP Wed Jul 26 19:56:26 BST 2023
>>>> setup: Linux is running under KVM in 64-bit mode
>>>> setup: The maximum memory size is 512MB
>>>> setup: Relocating AMODE31 section of size 0x00003000
>>>> cpu: 1 configured CPUs, 0 standby CPUs
>>>> Write protected kernel read-only data: 4036k
>>>> Zone ranges:
>>>>   DMA      [mem 0x0000000000000000-0x000000007fffffff]
>>>>   Normal   empty
>>>> Movable zone start for each node
>>>> Early memory node ranges
>>>>   node   0: [mem 0x0000000000000000-0x000000001fffffff]
>>>> Initmem setup node 0 [mem 0x0000000000000000-0x000000001fffffff]
>>>> percpu: Embedded 14 pages/cpu s26368 r0 d30976 u57344
>>>> Kernel command line:
>>>> random: crng init done
>>>> Dentry cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
>>>> Inode-cache hash table entries: 32768 (order: 6, 262144 bytes, linear)
>>>> Built 1 zonelists, mobility grouping on. Total pages: 129024
>>>> mem auto-init: stack:all(zero), heap alloc:off, heap free:off
>>>> Memory: 507720K/524288K available (3464K kernel code, 788K rwdata, 572K rodata, 796K init, 400K bss, 16568K reserved, 0K cma-reserved)
>>>> SLUB: HWalign=256, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
>>>> rcu: Hierarchical RCU implementation.
>>>> rcu: RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=1.
>>>> rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
>>>> rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
>>>> NR_IRQS: 3, nr_irqs: 3, preallocated irqs: 3
>>>> rcu: srcu_init: Setting srcu_struct sizes based on contention.
>>>> clocksource: tod: mask: 0xffffffffffffffff max_cycles: 0x3b0a9be803b0a9, max_idle_ns: 1805497147909793 ns
>>>> Console: colour dummy device 80x25
>>>> printk: console [ttysclp0] enabled
>>>> pid_max: default: 32768 minimum: 301
>>>> Mount-cache hash table entries: 1024 (order: 1, 8192 bytes, linear)
>>>> Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes, linear)
>>>> rcu: Hierarchical SRCU implementation.
>>>> rcu: Max phase no-delay instances is 1000.
>>>> smp: Bringing up secondary CPUs ...
>>>> smp: Brought up 1 node, 1 CPU
>>>> clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
>>>> futex hash table entries: 256 (order: 4, 65536 bytes, linear)
>>>> kvm-s390: SIE is not available
>>>> hypfs: The hardware system does not support hypfs
>>>> workingset: timestamp_bits=62 max_order=17 bucket_order=0
>>>> io scheduler mq-deadline registered
>>>> io scheduler kyber registered
>>>> cio: Channel measurement facility initialized using format extended (mode autodetected)
>>>> Discipline DIAG cannot be used without z/VM
>>>> vmur: The z/VM virtual unit record device driver cannot be loaded without z/VM
>>>> sclp_sd: Store Data request failed (eq=2, di=3, response=0x40f0, flags=0x00, status=0, rc=-5)
>>>> List of all partitions:
>>>> No filesystem could mount root, tried:
>>>>
>>>> Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(1,0)
>>>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc3-next-20230726 #1
>>>> Hardware name: QEMU 8561 QEMU (KVM/Linux)
>>>> Call Trace:
>>>>  [<0000000000432b22>] dump_stack_lvl+0x62/0x80
>>>>  [<0000000000158898>] panic+0x2f8/0x310
>>>>  [<00000000005b7a56>] mount_root_generic+0x276/0x3b8
>>>>  [<00000000005b7e46>] prepare_namespace+0x56/0x220
>>>>  [<00000000004593e8>] kernel_init+0x28/0x1c8
>>>>  [<000000000010217e>] __ret_from_fork+0x36/0x50
>>>>  [<000000000046143a>] ret_from_fork+0xa/0x30
>>>>
>>>>
>>>> Any idea what I'm doing wrong here? Do I need to specify something on the kernel command line? (I've tried root=/dev/ram0, but get the same result).
>>>
>>> Hmmm, interesting. The rootfs does need to be used decompressed (I think
>>> the kernel does support zstd compressed initrds but we only compress
>>> them to save space, not for running). Does the sha256sum sum match the
>>> one I just tested?
>>>
>>> $ curl -LSsO https://github.com/ClangBuiltLinux/boot-utils/releases/download/20230707-182910/s390-rootfs.cpio.zst
>>> $ zstd -d s390-rootfs.cpio.zst
>>> $ sha256sum s390-rootfs.cpio
>>> 948fb3c2ad65e26aee8eb0a069f5c9e1ab2c59e4b4f62b63ead271e12a8479b4  s390-rootfs.cpio
>>
>> Yes, this matches.
>>
>>> $ qemu-system-s390x -display none -nodefaults -M s390-ccw-virtio -kernel arch/s390/boot/bzImage -initrd s390-rootfs.cpio -m 512m -serial mon:stdio
>>> ...
>>> [    7.890385] Trying to unpack rootfs image as initramfs...
>>> ...
>>>
>>> I suppose it could be something with Kconfig too, here is the actual one
>>> that olddefconfig produces for me:
>>>
>>> https://gist.github.com/nathanchance/3e4c1721ac204bbb969e2f288e1695c9
>>
>> Ahh, I think this was the problem, after downloading this, the kernel is taking
>> much longer to compile and eventually giving me an assembler error:
>>
>> arch/s390/kernel/mcount.S: Assembler messages:
>> arch/s390/kernel/mcount.S:140: Error: Unrecognized opcode: `aghik'
>> make[4]: *** [scripts/Makefile.build:360: arch/s390/kernel/mcount.o] Error 1
>> make[4]: *** Waiting for unfinished jobs....
>> make[3]: *** [scripts/Makefile.build:480: arch/s390/kernel] Error 2
>> make[2]: *** [scripts/Makefile.build:480: arch/s390] Error 2
>> make[2]: *** Waiting for unfinished jobs....
>> make[1]: *** [/data_nvme0n1/ryarob01/granule_perf/linux/Makefile:2035: .] Error 2
>> make: *** [Makefile:234: __sub-make] Error 2
>>
>> The assembler that comes with the kernel.org toolchain is from binutils 2.40.
>> Perhaps that's too old for this instruction? It's inside an "#ifdef
>> CONFIG_FUNCTION_GRAPH_TRACER" block, so I'm guessing whatever config I was
>> previously using didn't have that enabled. I'll try to disable some configs to
>> workaround. What assembler are you using?
>
> Ah sorry about that, I forgot about that issue because I handled it
> during my bisect and all the manual reproduction/verification I have
> been doing in this thread has been against Andrew's -mm tree, which does
> not have that problem. Apply this patch on top of next-20230726 and
> everything should work...
>
> https://lore.kernel.org/20230726061834.1300984-1-hca@linux.ibm.com/

No problem - I just got it working by disabling FUNCTION_TRACER, which avoids
aghik, and then disabling generation of BTF (which was complaining that pahole
wasn't available).

Anyway, I'm up and running now - can repro what you are seeing. Although it's
late in the UK now, so will have to look at this tomorrow. Hopefully I can get
to the bottom of it in reasonable time.

>
> Cheers,
> Nathan
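
For anyone else hitting the same build errors, the config workaround described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `disable_kconfig` is a hypothetical helper, the demo runs on a throwaway file rather than a real tree, and CONFIG_DEBUG_INFO_BTF is assumed to be the symbol behind "generation of BTF" (the mail does not name it).

```shell
#!/bin/sh
# Hypothetical helper (not part of the kernel tree): mark a Kconfig symbol
# as disabled in a .config-style file by rewriting
#   CONFIG_<SYM>=<value>   ->   # CONFIG_<SYM> is not set
disable_kconfig() {
    file=$1
    sym=$2
    sed "s/^CONFIG_${sym}=.*/# CONFIG_${sym} is not set/" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Demo on a throwaway file so the sketch runs without a kernel checkout.
demo=$(mktemp)
printf '%s\n' \
    'CONFIG_FUNCTION_TRACER=y' \
    'CONFIG_DEBUG_INFO_BTF=y' \
    'CONFIG_SMP=y' > "$demo"

# FUNCTION_TRACER is named in the mail; DEBUG_INFO_BTF is an assumption.
disable_kconfig "$demo" FUNCTION_TRACER
disable_kconfig "$demo" DEBUG_INFO_BTF

cat "$demo"
```

In a real tree, `scripts/config --disable <SYMBOL>` followed by `make olddefconfig` is likely the more robust route, since olddefconfig also resolves dependent options such as FUNCTION_GRAPH_TRACER.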