Subject: Re: [PATCH] mm/huge_memory: mark huge_zero_folio reserved
From: Miaohe Lin
To: David Hildenbrand
Date: Tue, 14 May 2024 11:07:18 +0800
Message-ID: <2b5e2b42-7fa6-ab51-494a-0414d1c75290@huawei.com>
In-Reply-To: <1ca64fc3-1b96-466e-aa25-a8f9f6805edc@redhat.com>
References: <20240511032801.1295023-1-linmiaohe@huawei.com> <1ca64fc3-1b96-466e-aa25-a8f9f6805edc@redhat.com>

On 2024/5/13 23:40, David Hildenbrand wrote:
> On 11.05.24 05:28, Miaohe Lin wrote:
>> When I ran memory failure tests recently, the panic below occurred:
>>
>>   kernel BUG at include/linux/mm.h:1135!
>>   invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
>>   CPU: 9 PID: 137 Comm: kswapd1 Not tainted 6.9.0-rc4-00491-gd5ce28f156fe-dirty #14
>>   RIP: 0010:shrink_huge_zero_page_scan+0x168/0x1a0
>>   RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246
>>   RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8
>>   RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0
>>   RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492
>>   R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000
>>   R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00
>>   FS:  0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000
>>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>   CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0
>>   Call Trace:
>>    <TASK>
>>    do_shrink_slab+0x14f/0x6a0
>>    shrink_slab+0xca/0x8c0
>>    shrink_node+0x2d0/0x7d0
>>    balance_pgdat+0x33a/0x720
>>    kswapd+0x1f3/0x410
>>    kthread+0xd5/0x100
>>    ret_from_fork+0x2f/0x50
>>    ret_from_fork_asm+0x1a/0x30
>>    </TASK>
>>   Modules linked in: mce_inject hwpoison_inject
>>   ---[ end trace 0000000000000000 ]---
>>   RIP: 0010:shrink_huge_zero_page_scan+0x168/0x1a0
>>   RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246
>>   RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8
>>   RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0
>>   RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492
>>   R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000
>>   R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00
>>   FS:  0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000
>>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>   CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0
>>
>> The root cause is that HWPoison flag will be set for huge_zero_folio
>> without increasing the folio refcnt.
>> But then unpoison_memory() will
>> decrease the folio refcnt unexpectedly, because it looks like a successfully
>> hwpoisoned folio, leading to VM_BUG_ON_PAGE(page_ref_count(page) == 0)
>> when releasing huge_zero_folio.
>>
>> Fix this issue by marking huge_zero_folio reserved, so that unpoison_memory()
>> will skip this page. This also makes it consistent with the ZERO_PAGE case.
>>
>> Fixes: 478d134e9506 ("mm/huge_memory: do not overkill when splitting huge_zero_page")
>> Signed-off-by: Miaohe Lin
>> Cc:
>> ---
>>   mm/huge_memory.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 317de2afd371..d508ff793145 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -212,6 +212,7 @@ static bool get_huge_zero_page(void)
>>           folio_put(zero_folio);
>>           goto retry;
>>       }
>> +    __folio_set_reserved(zero_folio);
>
> We want to limit/remove the use of PG_reserved. Please find a different way (e.g., simply checking for the huge zero page directly).

I see. I will drop this patch and find another way. Thanks.
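
For readers following the thread, the refcount imbalance described in the patch, and the "check for the huge zero page directly" idea from the review, can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every name below (model_folio, model_poison, model_unpoison_naive, model_unpoison_checked, model_release) is hypothetical and is not a kernel API, and the poison path is reduced to "set the flag, and take a reference only for ordinary folios", which is how the thread describes the behavior rather than how mm/memory-failure.c is actually structured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
struct model_folio {
    int refcount;   /* stands in for the folio reference count */
    bool hwpoison;  /* stands in for the HWPoison page flag */
};

/* The shared huge zero folio; its owner holds exactly one reference. */
static struct model_folio model_huge_zero = { .refcount = 1, .hwpoison = false };

static bool model_is_huge_zero(const struct model_folio *f)
{
    return f == &model_huge_zero;
}

/*
 * Poison path as the thread describes it: the flag is set, but no extra
 * reference is taken for the huge zero folio (assumed simplification).
 */
static void model_poison(struct model_folio *f)
{
    assert(f != NULL);
    f->hwpoison = true;
    if (!model_is_huge_zero(f))
        f->refcount++;  /* ordinary folios stay pinned while poisoned */
}

/*
 * Unpoison with no special case: it assumes every poisoned folio holds an
 * extra reference and drops one. Returns the refcount after the drop.
 */
static int model_unpoison_naive(struct model_folio *f)
{
    assert(f != NULL);
    if (!f->hwpoison)
        return f->refcount;
    f->hwpoison = false;
    f->refcount--;
    return f->refcount;
}

/*
 * Unpoison that checks for the huge zero folio directly and skips it,
 * leaving both its flag and its refcount untouched.
 */
static bool model_unpoison_checked(struct model_folio *f)
{
    assert(f != NULL);
    if (model_is_huge_zero(f) || !f->hwpoison)
        return false;
    f->hwpoison = false;
    f->refcount--;
    return true;
}

/*
 * The owner's final put. The kernel would hit
 * VM_BUG_ON_PAGE(page_ref_count(page) == 0) here; the model reports
 * failure instead of crashing.
 */
static bool model_release(struct model_folio *f)
{
    assert(f != NULL);
    if (f->refcount == 0)
        return false;
    f->refcount--;
    return true;
}
```

In this model, poisoning the huge zero folio and then calling model_unpoison_naive() leaves the count at zero, so the owner's later model_release() fails, which mirrors the reported BUG in the shrinker. With model_unpoison_checked() the folio is skipped and the owner's reference survives. Where such a check would belong in the real unpoison path is not settled in this thread.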