Date: Mon, 19 Feb 2024 10:25:28 +0100
From: Borislav Petkov
To: Shuai Xue
Cc: rafael@kernel.org, wangkefeng.wang@huawei.com, tanxiaofei@huawei.com,
	mawupeng1@huawei.com, tony.luck@intel.com, linmiaohe@huawei.com,
	naoya.horiguchi@nec.com, james.morse@arm.com,
	gregkh@linuxfoundation.org, will@kernel.org, jarkko@kernel.org,
	linux-acpi@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	linux-edac@vger.kernel.org, x86@kernel.org, justin.he@arm.com,
	ardb@kernel.org, ying.huang@intel.com, ashish.kalra@amd.com,
	baolin.wang@linux.alibaba.com, tglx@linutronix.de, mingo@redhat.com,
	dave.hansen@linux.intel.com, lenb@kernel.org, hpa@zytor.com,
	robert.moore@intel.com, lvying6@huawei.com, xiexiuqi@huawei.com,
	zhuo.song@linux.alibaba.com, Ira Weiny, Jonathan Cameron,
	Dan Williams
Subject: Re: [PATCH v11 1/3] ACPI: APEI: send SIGBUS to current task if synchronous memory error not recovered
Message-ID: <20240219092528.GTZdMeiDWIDz613VeT@fat_crate.local>
References: <20221027042445.60108-1-xueshuai@linux.alibaba.com>
	<20240204080144.7977-2-xueshuai@linux.alibaba.com>
In-Reply-To: <20240204080144.7977-2-xueshuai@linux.alibaba.com>

On Sun, Feb 04, 2024 at 04:01:42PM +0800, Shuai Xue wrote:
> Synchronous error was detected as a result of user-space process accessing
> a 2-bit uncorrected error. The CPU will take a synchronous error exception
> such as Synchronous External Abort (SEA) on Arm64. The kernel will queue a
> memory_failure() work which poisons the related page, unmaps the page, and
> then sends a SIGBUS to the process, so that a system wide panic can be
> avoided.
>
> However, no memory_failure() work will be queued when abnormal synchronous
> errors occur. These errors can include situations such as invalid PA,
> unexpected severity, no memory failure config support, invalid GUID
> section, etc. In such case, the user-space process will trigger SEA again.
> This loop can potentially exceed the platform firmware threshold or even
> trigger a kernel hard lockup, leading to a system reboot.
>
> Fix it by performing a force kill if no memory_failure() work is queued
> for synchronous errors.
>
> Signed-off-by: Shuai Xue
> ---
>  drivers/acpi/apei/ghes.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 7b7c605166e0..0892550732d4 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -806,6 +806,15 @@ static bool ghes_do_proc(struct ghes *ghes,
>  		}
>  	}
>
> +	/*
> +	 * If no memory failure work is queued for abnormal synchronous
> +	 * errors, do a force kill.
> +	 */
> +	if (sync && !queued) {
> +		pr_err("Sending SIGBUS to current task due to memory error not recovered");
> +		force_sig(SIGBUS);
> +	}

Except that there are a bunch of CXL GUIDs being handled there too and
this will sigbus those processes now automatically.

Lemme add the whole bunch from

  671a794c33c6 ("acpi/ghes: Process CXL Component Events")

for comment to Cc.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
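
Illustration of the objection above, as a toy userspace model rather than kernel code. It is not ghes_do_proc(): every identifier here (model_section, model_would_sigbus, model_self_check) is hypothetical, and the assumption that only a memory section with a usable physical address sets "queued" is a simplification made for the sketch. It only shows why a bare "sync && !queued" test also fires for section types that never queue memory_failure() work, which is the case the reply raises for the CXL GUIDs.

```c
/*
 * Toy userspace model of the control flow under discussion. This is NOT the
 * kernel's ghes_do_proc(); every name below is hypothetical.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the CPER section GUIDs a handler dispatches on. */
enum model_section {
	MODEL_SEC_PLATFORM_MEM,	/* memory error: may queue recovery work */
	MODEL_SEC_CXL_EVENT,	/* CXL component event: nothing is queued */
	MODEL_SEC_OTHER,	/* anything else: only logged in this model */
};

/*
 * Returns true if the proposed "sync && !queued" check would send SIGBUS.
 * Assumption: only a memory section with a usable physical address queues
 * recovery work in this model.
 */
static bool model_would_sigbus(bool sync, const enum model_section *secs,
			       size_t nr, bool mem_pa_valid)
{
	bool queued = false;
	size_t i;

	for (i = 0; i < nr; i++) {
		switch (secs[i]) {
		case MODEL_SEC_PLATFORM_MEM:
			if (mem_pa_valid)
				queued = true;
			break;
		case MODEL_SEC_CXL_EVENT:
		case MODEL_SEC_OTHER:
			/* Handled elsewhere; "queued" is left untouched. */
			break;
		}
	}

	return sync && !queued;
}

/* Usage example: the four cases worth distinguishing. */
void model_self_check(void)
{
	const enum model_section mem[] = { MODEL_SEC_PLATFORM_MEM };
	const enum model_section cxl[] = { MODEL_SEC_CXL_EVENT };

	/* Recoverable memory error: work queued, no forced SIGBUS. */
	assert(!model_would_sigbus(true, mem, 1, true));
	/* Invalid PA: nothing queued, forced SIGBUS (the patch's intent). */
	assert(model_would_sigbus(true, mem, 1, false));
	/* CXL event on a synchronous source: also SIGBUS (the objection). */
	assert(model_would_sigbus(true, cxl, 1, true));
	/* Asynchronous source: never forced. */
	assert(!model_would_sigbus(false, cxl, 1, true));
}
```

In this model the third case is the problematic one: the check cannot tell "a memory error that failed to queue recovery" apart from "a section type that never queues recovery at all".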