From: Ryan Roberts
To: Mark Rutland
Cc: Luiz Capitulino, LKML, linux-mm@kvack.org, ardb@kernel.org, "linux-arm-kernel@lists.infradead.org", Catalin Marinas, Will Deacon
Date: Tue, 25 Feb 2025 11:19:20 +0000
Subject: Re: kernel BUG at arch/arm64/mm/mmu.c:185!
Message-ID: <929f0475-4801-4f30-869b-a15e93ca662b@arm.com>
References: <9f5600b3-6525-4045-ad1f-4408dfc9ce0f@redhat.com> <789c17e6-5ebc-4e37-93cd-19d24f148fd8@redhat.com>

Hi Mark,

On 25/02/2025 11:10, Mark Rutland wrote:
> On Tue, Feb 25, 2025 at 09:47:30AM +0000, Ryan Roberts wrote:
>> (Adding arm folks for visibility)
>>
>> See original report here for context:
>> https://lore.kernel.org/all/a3d9acbe-07c2-43b6-9ba9-a7585f770e83@redhat.com/
>>
>> TL;DR is that 6.14 doesn't boot on Ampere Altra when kaslr is enabled.
>>
>>
>> On 20/02/2025 20:08, Luiz Capitulino wrote:
>>> On 2025-02-19 09:40, Luiz Capitulino wrote:
>>>
>>>>>> Btw, I'll try to bisect again and will also try to update the system's firmware
>>>>>> just in case.
>>>
>>> I tried to bisect it and again, got nowhere.
>>>
>>> Git bisect says the first bad commit is 8883957b3c9de2087fb6cf9691c1188cccf1ac9c .
>>> But I'm able to boot that tree...
>>>
>>
>> OK, think I've found the dodgy commit:
>>
>> Commit 62cffa496aac ("arm64/mm: Override PARange for !LPA2 and use it consistently")
>>
>> Based on the changes it certainly looks like it could be the issue, but I
>> haven't spotted exactly what the problem is yet. Ard, could you take a look?
>>
>> I managed to hack multi ram bank support into kvmtool, so I can now repro the
>> issue in virtualization. Then was able to bisect to get to the above commit.
>
> If you're able to repro this, could you please say the configuration of
> memory banks you're using, and could you hack the BUG() to dump more
> info, e.g. something like the below, UNTESTED patch.

I believe the root cause is the above commit switching from read_cpuid() to
read_sanitised_ftr_reg() in arm64_memblock_init(). This function runs before
the registers have been sanitized, so the change means that parange is
calculated as 32 bit instead of 48 bit, and we screw up the randomization of
memstart_addr.

I'm just putting a patch together and will send it out for testing shortly.
Although, with defconfig, I get a slightly different panic to the originally
reported one.
I suspect they are both symptoms of the same root cause though. I'll send out my
fix, then hopefully Luiz will be able to test it to confirm his original problem
has gone away.

>
> Knowing the VA will tell us whether we're spilling out of the expected VA
> region or otherwise going wildly wrong with addressing, and the values in the PTEs
> will tell us what's specifically triggering the warning.
>
> Also, if you're able to test with CONFIG_DEBUG_VIRTUAL, that might spot if we
> have a dodgy VA->PA conversion somewhere, which can
>
> Mark.
>
> ---->8----
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index b4df5bc5b1b8b..d04719919de33 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -171,19 +171,22 @@ static void init_pte(pte_t *ptep, unsigned long addr, unsigned long end,
>  {
>  	do {
>  		pte_t old_pte = __ptep_get(ptep);
> +		pte_t new_pte = pfn_pte(__phys_to_pfn(phys), prot);
>  
>  		/*
> -		 * Required barriers to make this visible to the table walker
> -		 * are deferred to the end of alloc_init_cont_pte().
> +		 * After the PTE entry has been populated once, we
> +		 * only allow updates to the permission attributes.
>  		 */
> -		__set_pte_nosync(ptep, pfn_pte(__phys_to_pfn(phys), prot));
> +		if (!pgattr_change_is_safe(pte_val(old_pte), pte_val(new_pte))) {
> +			panic("Unsafe PTE change @ VA:0x%016lx PA:%pa::0x%016llx -> 0x%016llx\n",
> +			      addr, &phys, pte_val(old_pte), pte_val(new_pte));
> +		}
>  
>  		/*
> -		 * After the PTE entry has been populated once, we
> -		 * only allow updates to the permission attributes.
> +		 * Required barriers to make this visible to the table walker
> +		 * are deferred to the end of alloc_init_cont_pte().
>  		 */
> -		BUG_ON(!pgattr_change_is_safe(pte_val(old_pte),
> -					      pte_val(__ptep_get(ptep))));
> +		__set_pte_nosync(ptep, pfn_pte(__phys_to_pfn(phys), prot));
>  
>  		phys += PAGE_SIZE;
>  	} while (ptep++, addr += PAGE_SIZE, addr != end);
>
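
To make the 32 bit vs 48 bit point about parange concrete, here is a rough
userspace sketch of the arithmetic. It is not the kernel code: both function
names are hypothetical, the default case in the switch is my own choice, and I
am assuming that the not-yet-sanitised register read yields a PARange field of
0 and that the linear region is 2^47 bytes (48-bit VA). Only the PARange
encodings themselves are architectural.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Architectural ID_AA64MMFR0_EL1.PARange encodings -> physical address bits.
 * The default arm is an assumption of this sketch, not kernel behaviour.
 */
static unsigned int parange_to_pa_bits(unsigned int parange)
{
	switch (parange) {
	case 0: return 32;
	case 1: return 36;
	case 2: return 40;
	case 3: return 42;
	case 4: return 44;
	case 5: return 48;
	case 6: return 52;
	default: return 48;
	}
}

/*
 * Hypothetical helper: slack left in the linear map for randomizing the
 * base of RAM, i.e. linear region size minus the size of the physical
 * address space. A zero or negative result means there is no room to
 * randomize at all.
 */
static int64_t randomization_slack(unsigned int linear_region_bits,
				   unsigned int parange)
{
	unsigned int pa_bits = parange_to_pa_bits(parange);

	/* Keep both shifts well inside the range of a signed 64-bit value. */
	assert(linear_region_bits < 63 && pa_bits < 63);

	return (int64_t)(1ULL << linear_region_bits) -
	       (int64_t)(1ULL << pa_bits);
}
```

Under those assumptions, randomization_slack(47, 5) is negative, so nothing
should be randomized on a 48-bit PA system, whereas randomization_slack(47, 0)
is just under 2^47. If the real code sees the second value, it would think it
has a huge window in which to move memstart_addr, which is consistent with the
randomization going wrong on a machine that has RAM high in the physical
address space.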