Message-ID: <3820329d-20e3-49ee-a329-aac7393c6df3@linux.alibaba.com>
Date: Wed, 12 Feb 2025 21:55:39 +0800
Subject: Re: [PATCH v1 4/4] mm/hwpoison: Fix incorrect "not recovered" report for recovered clean pages
To: Miaohe Lin, "nao.horiguchi@gmail.com"
Cc: tglx@linutronix.de, mingo@redhat.com, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, akpm@linux-foundation.org, linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, baolin.wang@linux.alibaba.com, tianruidong@linux.alibaba.com, tony.luck@intel.com, bp@alien8.de
References: <20250211060200.33845-1-xueshuai@linux.alibaba.com> <20250211060200.33845-5-xueshuai@linux.alibaba.com> <5f116840-60df-c6d9-d7ff-dcf1dce7773f@huawei.com>
From: Shuai Xue
In-Reply-To: <5f116840-60df-c6d9-d7ff-dcf1dce7773f@huawei.com>

On 2025/2/12 16:09, Miaohe Lin wrote:
> On 2025/2/11 14:02, Shuai Xue wrote:
>> When an uncorrected memory error is consumed there is a race between
>> the CMCI from the memory controller reporting an uncorrected error
>> with a UCNA signature, and the core reporting and SRAR signature
>> machine check when the data is about to be consumed.
>>
>> If the CMCI wins that race, the page is marked poisoned when
>> uc_decode_notifier() calls memory_failure(). For dirty pages,
>> memory_failure() invokes try_to_unmap() with the TTU_HWPOISON flag,
>> converting the PTE to a hwpoison entry. However, for clean pages, the
>> TTU_HWPOISON flag is cleared, leaving the PTE unchanged and not converted
>> to a hwpoison entry. Consequently, for an unmapped dirty page, the PTE is
>> marked as a hwpoison entry allowing kill_accessing_process() to:
>>
>> - call walk_page_range() and return 1
>> - call kill_proc() to make sure a SIGBUS is sent
>> - return -EHWPOISON to indicate that SIGBUS is already sent to the process
>>   and kill_me_maybe() doesn't have to send it again.
>>
>> Conversely, for clean pages where PTE entries are not marked as hwpoison,
>> kill_accessing_process() returns -EFAULT, causing kill_me_maybe() to send a
>> SIGBUS.
>>
>> Console log looks like this:
>>
>>     Memory failure: 0x827ca68: corrupted page was clean: dropped without side effects
>>     Memory failure: 0x827ca68: recovery action for clean LRU page: Recovered
>>     Memory failure: 0x827ca68: already hardware poisoned
>>     mce: Memory error not recovered
>>
>> To fix it, return -EHWPOISON if no hwpoison PTE entry is found, preventing
>> an unnecessary SIGBUS.
>
> Thanks for your patch.
>
>>
>> Fixes: 046545a661af ("mm/hwpoison: fix error page recovered but reported "not recovered"")
>> Signed-off-by: Shuai Xue
>> ---
>>  mm/memory-failure.c | 5 ++---
>>  1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index 995a15eb67e2..f9a6b136a6f0 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -883,10 +883,9 @@ static int kill_accessing_process(struct task_struct *p, unsigned long pfn,
>>                                (void *)&priv);
>>          if (ret == 1 && priv.tk.addr)
>>                  kill_proc(&priv.tk, pfn, flags);
>> -        else
>> -                ret = 0;
>>          mmap_read_unlock(p->mm);
>> -        return ret > 0 ? -EHWPOISON : -EFAULT;
>> +
>> +        return ret >= 0 ? -EHWPOISON : -EFAULT;
>
> IIUC, kill_accessing_process() is supposed to return -EHWPOISON to notify that SIGBUS is already
> sent to the process and kill_me_maybe() doesn't have to send it again. But with your change,
> kill_accessing_process() will return -EHWPOISON even if SIGBUS is not sent. Does this break
> the semantics of -EHWPOISON?

Yes. According to the comment in kill_me_maybe(),

     * -EHWPOISON from memory_failure() means that it already sent SIGBUS
     * to the current process with the proper error info,
     * -EOPNOTSUPP means hwpoison_filter() filtered the error event,

this patch breaks the semantics described there. But the definition of
EHWPOISON itself says something quite different from that comment:

    #define EHWPOISON 133 /* Memory page has hardware error */

As for this issue, returning either 0 or -EHWPOISON prevents kill_me_maybe()
from sending a SIGBUS. Which way do you prefer?

>
> BTW I scanned the code of walk_page_range(). It seems with implementation of hwpoison_walk_ops
> walk_page_range() will only return 0 or 1, i.e. always >= 0. So kill_accessing_process() will always
> return -EHWPOISON if this patch is applied.
>
> Correct me if I miss something.

Yes, you are right. Let's go through the cases one by one:

1. clean page: try_to_unmap(!TTU_HWPOISON), walk_page_range() will return 0
   and we should not send SIGBUS in kill_me_maybe().

2. dirty page:

2.1 MCE wins the race

    CMCI: w/o Action Required       MCE: w/ Action Required
                                    TestSetPageHWPoison
    TestSetPageHWPoison
    return -EHWPOISON
                                    try_to_unmap(TTU_HWPOISON)
                                    kill_proc in hwpoison_user_mappings()

If the MCE wins the race, the memory_failure() call made by the CMCI does not
have MF_ACTION_REQUIRED set, so everything goes well: kill_proc() will send
SIGBUS in hwpoison_user_mappings().

2.2 CMCI wins the race

    CMCI: w/o Action Required       MCE: w/ Action Required
    TestSetPageHWPoison
    try_to_unmap(TTU_HWPOISON)
                                    walk_page_range() returns 1 due to hwpoison PTE entry
                                    kill_proc in kill_accessing_process()

If the CMCI wins the race, we need to kill the process in
kill_accessing_process(). If try_to_unmap() succeeds, everything goes well:
kill_proc() will send SIGBUS in kill_accessing_process().

But if try_to_unmap() fails, the PTE entry will not be marked as hwpoison,
walk_page_range() returns 0 just as in case 1 (clean page), and no SIGBUS
will be sent.

In summary, hwpoison_walk_ops cannot distinguish between two situations: a
dirty page whose PTE was not converted to a hwpoison entry because
try_to_unmap() failed, and a clean page whose PTE was never converted to a
hwpoison entry in the first place.

+naoya for the intent of the original patch.

Thanks.

Best Regards,
Shuai
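
P.S. To make the return-value discussion concrete, here is a minimal user-space sketch of the mapping before and after the patch. It is not kernel code: ret_before_patch() and ret_after_patch() are hypothetical names, walk_ret stands in for the result of walk_page_range(), have_addr stands in for priv.tk.addr being non-zero, and the kill_proc() side effect is left out.

```c
#include <errno.h>

/* EHWPOISON is Linux-specific; fall back to the value quoted above. */
#ifndef EHWPOISON
#define EHWPOISON 133 /* Memory page has hardware error */
#endif

/*
 * Model of the return path before the patch.
 * walk_ret:  1 = hwpoison PTE entry found, 0 = not found, < 0 = error.
 * have_addr: non-zero if a user address was recorded (priv.tk.addr).
 */
int ret_before_patch(int walk_ret, int have_addr)
{
	int ret = walk_ret;

	if (!(ret == 1 && have_addr))
		ret = 0;	/* the "else ret = 0;" branch removed by the patch */
	/* the real code calls kill_proc() here when ret is still 1 */
	return ret > 0 ? -EHWPOISON : -EFAULT;
}

/* Model of the return path after the patch: any non-negative walk result. */
int ret_after_patch(int walk_ret, int have_addr)
{
	(void)have_addr;	/* no longer influences the return value */
	return walk_ret >= 0 ? -EHWPOISON : -EFAULT;
}
```

With walk_ret == 0 (no hwpoison PTE entry found), the first function yields -EFAULT and the second yields -EHWPOISON, which is exactly the behavior change under discussion. Since hwpoison_walk_ops only ever produces 0 or 1, the second function returns -EHWPOISON in every case that can actually occur.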