Date: Tue, 18 Feb 2025 13:24:17 +0100
From: Borislav Petkov
To: Shuai Xue, tony.luck@intel.com
Cc: nao.horiguchi@gmail.com, tglx@linutronix.de, mingo@redhat.com,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	linmiaohe@huawei.com, akpm@linux-foundation.org, peterz@infradead.org,
	jpoimboe@kernel.org, linux-edac@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	baolin.wang@linux.alibaba.com, tianruidong@linux.alibaba.com
Subject: Re: [PATCH v2 0/5] mm/hwpoison: Fix regressions in memory failure handling
Message-ID: <20250218122417.GHZ7R78fPm32jKYUlx@fat_crate.local>
References: <20250217063335.22257-1-xueshuai@linux.alibaba.com>
	<20250218082727.GCZ7REb7OG6NTAY-V-@fat_crate.local>
	<7393bcfb-fe94-4967-b664-f32da19ae5f9@linux.alibaba.com>
In-Reply-To: <7393bcfb-fe94-4967-b664-f32da19ae5f9@linux.alibaba.com>

On Tue, Feb 18, 2025 at 07:31:34PM +0800, Shuai Xue wrote:
> Kernel can recover from poison found while copying from user space.

Where was that poison found? On user pages? So reading them consumes the
poison?

So you're not really seeing real issues on real hw - you're using ras tools to
trigger those, correct?

If so, what guarantees ras tools are doing the right thing?

> MCE check the fixup handler type to decide whether an in kernel #MC can be
> recovered. When EX_TYPE_UACCESS is found,

Sounds like poison on user memory...

> the PC jumps to recovery code specified in _ASM_EXTABLE_FAULT() and return
> a -EFAULT to user space.
> For instr case:
>
> If a poison found while instruction fetching in user space, full recovery is
> possible. User process takes #PF, Linux allocates a new page and fills by
> reading from storage.
>
>
> 3. What actually happens and why
>
> For copyin case: kernel panic since v5.17
>
> Commit 4c132d1d844a ("x86/futex: Remove .fixup usage") introduced a new extable
> fixup type, EX_TYPE_EFAULT_REG, and later patches updated the extable fixup
> type for copy-from-user operations, changing it from EX_TYPE_UACCESS to
> EX_TYPE_EFAULT_REG.

What do futexes have to do with copying user memory?

> For instr case: user process is killed by a SIGBUS signal
>
> Commit 046545a661af ("mm/hwpoison: fix error page recovered but reported "not
> recovered"") introduced a bug that kill_accessing_process() return -EHWPOISON
> for instr case, as result, kill_me_maybe() send a SIGBUS to user process.

This makes my head hurt... a race between the CMCI reporting an uncorrected
error... why does the CMCI report uncorrected errors? This sounds like some
nasty confusion.

And you've basically reused the format and wording of 046545a661af for your
commit message and makes staring at those a PITA.

Tony, what's going on with that CMCI and SRAR race?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette