Date: Wed, 19 Feb 2025 16:54:14 +0800
From: Shuai Xue
Subject: Re: [PATCH v2 4/5] mm/hwpoison: Fix incorrect "not recovered" report for recovered clean pages
To: Miaohe Lin, "Luck, Tony"
Cc: tglx@linutronix.de, mingo@redhat.com, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, akpm@linux-foundation.org, peterz@infradead.org, jpoimboe@kernel.org, linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, baolin.wang@linux.alibaba.com, tianruidong@linux.alibaba.com, tony.luck@intel.com, bp@alien8.de, nao.horiguchi@gmail.com
References: <20250217063335.22257-1-xueshuai@linux.alibaba.com> <20250217063335.22257-5-xueshuai@linux.alibaba.com>

On 2025/2/19 14:34, Miaohe Lin wrote:
> On 2025/2/17 14:33, Shuai Xue wrote:
>> When an uncorrected memory error is consumed there is a race between
>> the CMCI from the memory controller reporting an uncorrected error
>> with a UCNA signature, and the core reporting an SRAR signature
>> machine check when the data is about to be consumed.
>>
>> If the CMCI wins that race, the page is marked poisoned when
>> uc_decode_notifier() calls memory_failure(). For dirty pages,
>> memory_failure() invokes try_to_unmap() with the TTU_HWPOISON flag,
>> converting the PTE to a hwpoison entry. As a result,
>> kill_accessing_process():
>>
>> - call walk_page_range() and return 1 regardless of whether
>>   try_to_unmap() succeeds or fails,
>> - call kill_proc() to make sure a SIGBUS is sent
>> - return -EHWPOISON to indicate that SIGBUS is already sent to the
>>   process and kill_me_maybe() doesn't have to send it again.
>>
>> However, for clean pages, the TTU_HWPOISON flag is cleared, leaving the
>> PTE unchanged and not converted to a hwpoison entry. Conversely, for
>> clean pages where PTE entries are not marked as hwpoison,
>> kill_accessing_process() returns -EFAULT, causing kill_me_maybe() to
>> send a SIGBUS.
>>
>> Console log looks like this:
>>
>>     Memory failure: 0x827ca68: corrupted page was clean: dropped without side effects
>>     Memory failure: 0x827ca68: recovery action for clean LRU page: Recovered
>>     Memory failure: 0x827ca68: already hardware poisoned
>>     mce: Memory error not recovered
>>
>> To fix it, return 0 for "corrupted page was clean", preventing an
>> unnecessary SIGBUS.
>>
>> Fixes: 046545a661af ("mm/hwpoison: fix error page recovered but reported "not recovered"")
>> Signed-off-by: Shuai Xue
>> Cc: stable@vger.kernel.org
>> ---
>>  mm/memory-failure.c | 11 ++++++++---
>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index 995a15eb67e2..b037952565be 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -881,12 +881,17 @@ static int kill_accessing_process(struct task_struct *p, unsigned long pfn,
>>      mmap_read_lock(p->mm);
>>      ret = walk_page_range(p->mm, 0, TASK_SIZE, &hwpoison_walk_ops,
>>                            (void *)&priv);
>> +    /*
>> +     * ret = 1 when CMCI wins, regardless of whether try_to_unmap()
>> +     * succeeds or fails, then kill the process with SIGBUS.
>> +     * ret = 0 when poison page is a clean page and it's dropped, no
>> +     * SIGBUS is needed.
>> +     */
>>      if (ret == 1 && priv.tk.addr)
>>          kill_proc(&priv.tk, pfn, flags);
>> -    else
>> -        ret = 0;
>>      mmap_read_unlock(p->mm);
>> -    return ret > 0 ? -EHWPOISON : -EFAULT;
>> +
>> +    return ret > 0 ? -EHWPOISON : 0;
>
> The caller kill_me_maybe will do set_mce_nospec + sync_core again.
>
> static void kill_me_maybe(struct callback_head *cb)
> {
>     struct task_struct *p = container_of(cb, struct task_struct, mce_kill_me);
>     int flags = MF_ACTION_REQUIRED;
>     ...
>     ret = memory_failure(pfn, flags);
>     if (!ret) {
>         set_mce_nospec(pfn);
>         sync_core();
>         return;
>     }
>
> Is this expected?
>

The second set_mce_nospec() call does nothing and has no side effects.

sync_core() was introduced by Tony [1]:

    Also moved sync_core(). The comments for this function say that it should
    only be called when instructions have been changed/re-mapped. Recovery for
    an instruction fetch may change the physical address. But that doesn't
    happen until the scheduled work runs (which could be on another CPU).

[1] https://lore.kernel.org/all/20200824221237.5397-1-tony.luck@intel.com/T/#u

IMHO, the extra sync_core() also has no side effects.

@Tony, could you help confirm this?
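
For reference, here is a minimal userspace sketch of the return-value flow discussed above. It is not kernel code: the model_* functions and the enum are hypothetical names made up for illustration, the page-table walk is reduced to a single flag (the priv.tk.addr check is ignored), and the "SIGBUS from caller" branch only reflects the behaviour described in the commit message.

```c
#include <assert.h>
#include <errno.h>

/* EHWPOISON is Linux-specific; provide a fallback so the sketch builds elsewhere. */
#ifndef EHWPOISON
#define EHWPOISON 133
#endif

/* Hypothetical outcome labels, used only in this sketch. */
enum model_outcome {
	MODEL_RECOVERED,           /* set_mce_nospec() + sync_core(), no SIGBUS */
	MODEL_SIGBUS_ALREADY_SENT, /* kill_proc() already signalled the task */
	MODEL_SIGBUS_FROM_CALLER   /* "Memory error not recovered" path */
};

/*
 * Simplified model of the tail of kill_accessing_process().
 * hwpoison_pte_found: 1 if the walk found a hwpoison PTE, 0 otherwise.
 * patched: 1 to model the behaviour after this patch, 0 for before.
 */
int model_kill_accessing_process(int hwpoison_pte_found, int patched)
{
	assert(hwpoison_pte_found == 0 || hwpoison_pte_found == 1);

	if (hwpoison_pte_found)
		return -EHWPOISON;          /* SIGBUS was sent via kill_proc() */
	return patched ? 0 : -EFAULT;   /* clean page: PTE was left unchanged */
}

/* Simplified model of how kill_me_maybe() reacts to the return value. */
enum model_outcome model_kill_me_maybe(int ret)
{
	if (ret == 0)
		return MODEL_RECOVERED;
	if (ret == -EHWPOISON)
		return MODEL_SIGBUS_ALREADY_SENT;
	return MODEL_SIGBUS_FROM_CALLER;
}
```

With the old -EFAULT return, the clean-page case ends in MODEL_SIGBUS_FROM_CALLER; with the patch it ends in MODEL_RECOVERED, which is the path that runs set_mce_nospec() and sync_core() a second time.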

Thanks.
Shuai