Date: Wed, 21 May 2025 10:43:15 +0300
From: Mike Rapoport
To: Changyuan Lyu
Cc: akpm@linux-foundation.org, bhe@redhat.com, chrisl@kernel.org,
    graf@amazon.com, jasonmiu@google.com, kexec@lists.infradead.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    pasha.tatashin@soleen.com
Subject: Re: [PATCH 1/2] memblock: show a warning if allocation in KHO scratch fails
In-Reply-To: <20250521070310.2478491-1-changyuanl@google.com>

Hi Changyuan,

On Wed, May 21, 2025 at 12:03:10AM -0700, Changyuan Lyu wrote:
> Hi Mike,
>
> On Sun, May 18, 2025 at 19:07:02 +0300, Mike Rapoport wrote:
> >
> > This can be reproduced without KHO, just squeeze the RAM size, boot with a huge
> > kernel and initrd and you'll get the same panic.
> >
> > The issue is that sparse_init_nid() does not treat allocation failures as
> > fatal and just continues with some sections being unpopulated and then
> > subsection_map_init() presumes all the sections are valid.
> >
> > This should be fixed in mm/sparse.c regardless of KHO, maybe as simple as
> >
> > diff --git a/mm/sparse.c b/mm/sparse.c
> > index 3c012cf83cc2..64d071f9f037 100644
> > --- a/mm/sparse.c
> > +++ b/mm/sparse.c
> > @@ -197,6 +197,10 @@ void __init subsection_map_init(unsigned long pfn, unsigned long nr_pages)
> >  		pfns = min(nr_pages, PAGES_PER_SECTION
> >  				- (pfn & ~PAGE_SECTION_MASK));
> >  		ms = __nr_to_section(nr);
> > +
> > +		if (!ms->section_mem_map)
> > +			continue;
> > +
> >  		subsection_mask_set(ms->usage->subsection_map, pfn, pfns);
> >
> >  		pr_debug("%s: sec: %lu pfns: %lu set(%d, %d)\n", __func__, nr,
>
> I tried your patch and the kernel log now looks like
>
> [    0.027562] Faking a node at [mem 0x0000000000000000-0x000000057fffffff]
> [    0.028338] NODE_DATA(0) allocated [mem 0x562bd5a00-0x562bfffff]
> [    0.029201] Could not allocate 0x0000000014000000 bytes in KHO scratch
> [    0.030229] sparse_init_nid: node[0] memory map backing failed. Some memory will not be available.
> [    0.030232] Zone ranges:
> [    0.031539]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
> [    0.032242]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
> [    0.032952]   Normal   [mem 0x0000000100000000-0x000000057fffffff]
> [    0.033658]   Device   empty
> [    0.033987] Movable zone start for each node
> [    0.034473] Early memory node ranges
> [    0.034878]   node   0: [mem 0x0000000000001000-0x000000000007ffff]
> [    0.035591]   node   0: [mem 0x0000000000100000-0x00000000773fffff]
> [    0.036308]   node   0: [mem 0x0000000077400000-0x00000000775fffff]
> [    0.037030]   node   0: [mem 0x0000000077600000-0x000000007fffffff]
> [    0.037750]   node   0: [mem 0x0000000100000000-0x000000054abfffff]
> [    0.038463]   node   0: [mem 0x000000054ac00000-0x0000000562bfffff]
> [    0.039180]   node   0: [mem 0x0000000562c00000-0x000000057fffffff]
> [    0.039901] Initmem setup node 0 [mem 0x0000000000001000-0x000000057fffffff]
> [    0.040707] On node 0, zone DMA: 1 pages in unavailable ranges
> [    0.041401] On node 0, zone DMA: 128 pages in unavailable ranges
> [    0.221829] BUG: kernel NULL pointer dereference, address: 0000000000000018
> [    0.222675] #PF: supervisor read access in kernel mode
> [    0.223271] #PF: error_code(0x0000) - not-present page
> [    0.223859] PGD 0 P4D 0
> [    0.224152] Oops: Oops: 0000 [#1] SMP
> [    0.224575] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc4+ #279 NONE
> [    0.225439] RIP: 0010:set_pageblock_migratetype+0x97/0xd0
> [    0.226069] Code: b6 c9 c1 e1 04 48 01 c8 eb 02 31 c0 48 8b 70 08 89 f9 c1 e9 07 c1 ef 0d 83 e7 03 80 e1 3c 41 b8 07 00 00 00 49 d3 e0 48 d3 e2 <48> 8b 44 fe 18 49 f7 d0 48 89 c1 4c 21 c1 48 09 d1 f0 48 0f b1 4c
> [    0.228231] RSP: 0000:ffffffffa4203d58 EFLAGS: 00010046
> [    0.228834] RAX: ffff8e4722bd13b0 RBX: 0000000000000000 RCX: 0000000000009b00
> [    0.229664] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
> [    0.230487] RBP: ffffffffa4203d58 R08: 0000000000000007 R09: 0000000000000000
> [    0.231303] R10: ffffffffa4edc610 R11: 000000000000000c R12: 000000000054ac00
> [    0.232119] R13: 0017fff000000000 R14: 0000000000000002 R15: 00000000004d8000
> [    0.232937] FS:  0000000000000000(0000) GS:0000000000000000(0000) knlGS:0000000000000000
> [    0.233868] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.234529] CR2: 0000000000000018 CR3: 000000055923e000 CR4: 00000000000200b0
> [    0.235351] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    0.236171] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [    0.236990] Call Trace:
> [    0.237272]  <TASK>
> [    0.237514]  memmap_init_range+0x1d8/0x210
> [    0.237983]  memmap_init_zone_range+0x7f/0xb0
> [    0.238488]  memmap_init+0x9a/0x120
> [    0.238892]  free_area_init+0x369/0x3d0
> [    0.239331]  zone_sizes_init+0x5e/0x80
> [    0.239765]  paging_init+0x27/0x30
> [    0.240153]  setup_arch+0x307/0x3e0
> [    0.240556]  start_kernel+0x59/0x390
> [    0.240968]  x86_64_start_reservations+0x28/0x30
> [    0.241493]  x86_64_start_kernel+0x70/0x80
> [    0.241962]  common_startup_64+0x13b/0x140
> [    0.242433]  </TASK>
> [    0.242682] CR2: 0000000000000018
> [    0.243064] ---[ end trace 0000000000000000 ]---
>
> It seems we are just deferring the panic from subsection_map_init() to
> memmap_init(). To me it is still not obvious that the failure was
> caused by small KHO scratch.

A small KHO scratch only exposes the underlying issue: on one side,
sparse_init_nid() does not treat an OOM condition as fatal and tries to
continue after a hardly noticeable error message; on the other side, we
presume that all section data was properly allocated and access it.

> I think the error log in my original patch still makes sense since it
> indicates potential panics early.

This will add another barely noticeable message at the same place where
sparse_init_nid() reports an error. I don't see how it will be better.

I think we should just make sparse_init_nid() panic or at least change
"sparse_init_nid: node[0] memory map backing failed. Some memory will not
be available." to something more visible and clear.

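To illustrate the trade-off discussed above, here is a minimal userspace
sketch, not kernel code. Every name in it (struct section,
populate_sections(), mark_sections(), all_sections_populated(),
NR_SECTIONS) is hypothetical and only loosely mirrors struct mem_section
and the functions named in this thread. It models a populate step that
tolerates allocation failure, a later consumer that then needs its own
guard, and a single fail-fast check that would make such guards
unnecessary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define NR_SECTIONS 4	/* hypothetical, tiny on purpose */

/* Simplified stand-in for a memory section; not the kernel's layout. */
struct section {
	bool has_mem_map;	/* models a non-zero section_mem_map */
	unsigned long *usage;	/* NULL when the section was never populated */
};

static unsigned long usage_pool[NR_SECTIONS];

/*
 * Populate up to 'budget' sections and leave the rest empty, which
 * models an allocator that runs out of memory part-way through.
 * Like the tolerant behaviour described above, it only logs and
 * carries on. Returns the number of populated sections.
 */
static int populate_sections(struct section *secs, int n, int budget)
{
	int populated = 0;

	if (n > NR_SECTIONS)
		n = NR_SECTIONS;

	for (int i = 0; i < n; i++) {
		bool ok = i < budget;

		secs[i].has_mem_map = ok;
		secs[i].usage = ok ? &usage_pool[i] : NULL;
		if (ok) {
			usage_pool[i] = 0;
			populated++;
		}
	}

	if (populated < n)
		fprintf(stderr, "populate_sections: %d of %d sections have no backing\n",
			n - populated, n);
	return populated;
}

/*
 * A later consumer. The guard is what keeps it from dereferencing a
 * NULL 'usage' pointer; every other consumer would need the same guard.
 * Returns the number of sections it touched.
 */
static int mark_sections(struct section *secs, int n)
{
	int marked = 0;

	for (int i = 0; i < n; i++) {
		if (!secs[i].has_mem_map)
			continue;
		assert(secs[i].usage != NULL);
		*secs[i].usage |= 1UL;
		marked++;
	}
	return marked;
}

/* Fail-fast alternative: one check right after the populate step. */
static bool all_sections_populated(const struct section *secs, int n)
{
	for (int i = 0; i < n; i++)
		if (!secs[i].has_mem_map)
			return false;
	return true;
}
```

With a budget of 2 out of 4 sections, mark_sections() touches only the
two populated entries because of its guard, while
all_sections_populated() reports the problem once, right after the
populate step, where a caller could stop with a clear message instead
of guarding every later access.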
> Best,
> Changyuan

-- 
Sincerely yours,
Mike.