From: Jonas Vidra
To: erhard_f@mailbox.org
Subject: Re: BUG: Bad page map in process init pte:c0ab684c pmd:01182000 (on a PowerMac G4 DP)
Cc: christophe.leroy@csgroup.eu,linux-mm@kvack.org,linuxppc-dev@lists.ozlabs.org,npiggin@gmail.com,rmclure@linux.ibm.com
In-Reply-To: <20240620004237.2338f82b@yea>
Message-Id: <20240811165230.91DCFA0660@freki.localdomain>
Date: Sun, 11 Aug 2024 18:52:30 +0200 (CEST)

On Thu, 20 Jun 2024 00:42:37 +0200, Erhard Furtner wrote:
>> On 29/02/2024 at 02:09, Erhard Furtner wrote:
>> >
>> > Revisited the issue on kernel v6.8-rc6 and I can still reproduce it.
>> >
>> > Short summary as my last post was over a year ago:
>> > (x) I get this memory corruption only when CONFIG_VMAP_STACK=y and CONFIG_SMP=y is enabled.
>> > (x) I don't get this memory corruption when only one of the above is enabled. ^^
>> > (x) memtester says the 2 GiB RAM in my G4 DP are fine.
>> > (x) I don't get this issue on my G5 11,2 or Talos II.
>> > (x) "stress -m 2 --vm-bytes 965M" provokes the issue in < 10 secs. (https://salsa.debian.org/debian/stress)
>> >
> The "pagealloc: memory corruption" remains however as of kernel v6.10-rc4.

I've reproduced the bug on similar hardware, also a dual-processor
Power Mac G4 with 2 GiB RAM. With the 6.6.30 kernel without extra
debugging options, the system was stable and could e.g. compile GCC
or the kernel without an issue. That doesn't mean there wasn't silent
corruption going on, of course. :-) Running the `stress` program as
listed above did, however, put the system into an unstable state
where heavier workloads, such as compiling the kernel, would randomly
fail.

I updated the kernel to 6.10.3, enabled SLUB_DEBUG, PAGE_POISONING and
DEBUG_PAGEALLOC and turned them on at boot time with
slub_debug=FZ page_poison=on debug_pagealloc=on. The updated kernel
exhibits the same symptoms as described by Erhard: running
`stress -m 2 --vm-bytes 965M` almost immediately causes memory
corruption, with the following messages in dmesg:

```
pagealloc: memory corruption
fffcfff0: 00 00 00 00 ....
CPU: 1 PID: 1845 Comm: stress Tainted: G T 6.10.3-gentoo #1
Hardware name: PowerMac3,6 7455 0x80010303 PowerMac
Call Trace:
[f2d05ca0] [c08ff18c] dump_stack_lvl+0x60/0xbc (unreliable)
[f2d05cc0] [c01db7e0] __kernel_unpoison_pages+0x128/0x1f0
[f2d05d10] [c01bc6c4] get_page_from_freelist+0xeb0/0xf6c
[f2d05db0] [c01bcf7c] __alloc_pages_noprof+0x160/0xdf0
[f2d05e70] [c01be388] __folio_alloc_noprof+0x14/0x44
[f2d05e80] [c0199690] handle_mm_fault+0x99c/0xdac
[f2d05f00] [c00218c8] do_page_fault+0x264/0x73c
[f2d05f30] [c000433c] DataAccess_virt+0x124/0x17c
--- interrupt: 300 at 0x7c2db0
NIP: 007c2db0 LR: 007c2d90 CTR: 00000000
REGS: f2d05f40 TRAP: 0300 Tainted: G T (6.10.3-gentoo)
MSR: 0000d032 CR: 20882004 XER: 00000000
DAR: 8fe18020 DSISR: 42000000
GPR00: 007c2d90 afb6a160 a7a00100 6b416020 ffffffa0 00000000 a7916ffc 00000000
GPR08: 24a03000 24a02000 00000000 404347fa 404344c7 00000000 00000000 0000005a
GPR16: 6b416020 00000002 00000000 00000000 ffffffff 00000000 40882002 007e0004
GPR24: 00000001 ffffffff ffffffff 3c500000 00000000 66b7cd68 007e7cf8 00001000
NIP [007c2db0] 0x7c2db0
LR [007c2d90] 0x7c2d90
--- interrupt: 300
page: refcount:1 mapcount:0 mapping:00000000 index:0x0 pfn:0x31069
flags: 0x80000000(zone=2)
raw: 80000000 00000100 00000122 00000000 00000000 00000000 ffffffff 00000001
page dumped because: pagealloc: corrupted page details
```

Other activity can also trigger it: compiling larger programs with
`make -j2` does so within an hour, typically resulting in an ICE.
When booted with the `maxcpus=0` kernel parameter, the corruptions do
not occur.
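
For readers less familiar with page poisoning, the check behind the
"pagealloc: memory corruption" message can be sketched in userspace
roughly as follows. This is a simplified illustration, not kernel
code. It assumes the default poison byte 0xaa and 4 KiB pages, and
`find_corruption` is a hypothetical helper name. A freed page is
filled with the poison byte, and on the next allocation any bytes
that no longer hold it are reported as the corrupted span.

```python
PAGE_SIZE = 4096      # assumed page size
PAGE_POISON = 0xAA    # assumed: the kernel's default poison byte


def find_corruption(page):
    """Return (offset, length) of the span that lost its poison, or None."""
    bad = [i for i, b in enumerate(page) if b != PAGE_POISON]
    if not bad:
        return None
    # The span runs from the first to the last non-poison byte.
    return bad[0], bad[-1] - bad[0] + 1


# "Free" a page by filling it with the poison pattern.
page = bytearray([PAGE_POISON]) * PAGE_SIZE

# Simulate a stray 4-byte write of zeros while the page is on the free list.
page[0xFF0:0xFF4] = bytes(4)

# Check the page again, as the allocator would before handing it out.
print(find_corruption(page))
```

In the dmesg output above, the four zero bytes reported at fffcfff0
would correspond to such a span at offset 0xff0 within the page.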