Date: Sat, 26 Oct 2024 14:46:46 +0800
Subject: Re: [PATCH v14 3/3] ACPI: APEI: handle synchronous exceptions in task work
To: Jarkko Sakkinen, mark.rutland@arm.com, catalin.marinas@arm.com,
 mingo@redhat.com, robin.murphy@arm.com, Jonathan.Cameron@Huawei.com,
 bp@alien8.de, rafael@kernel.org, wangkefeng.wang@huawei.com,
 tanxiaofei@huawei.com, mawupeng1@huawei.com, tony.luck@intel.com,
 linmiaohe@huawei.com, naoya.horiguchi@nec.com, james.morse@arm.com,
 tongtiangen@huawei.com, gregkh@linuxfoundation.org, will@kernel.org
Cc: linux-acpi@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
 linux-edac@vger.kernel.org, x86@kernel.org, justin.he@arm.com,
 ardb@kernel.org, ying.huang@intel.com, ashish.kalra@amd.com,
 baolin.wang@linux.alibaba.com, tglx@linutronix.de,
 dave.hansen@linux.intel.com, lenb@kernel.org, hpa@zytor.com,
 robert.moore@intel.com, lvying6@huawei.com, xiexiuqi@huawei.com,
 zhuo.song@linux.alibaba.com
References: <20221027042445.60108-1-xueshuai@linux.alibaba.com>
 <20241014084240.18614-4-xueshuai@linux.alibaba.com>
 <05a8d26b-b023-426f-879c-7d33be4a6406@linux.alibaba.com>
From: Shuai Xue

On 2024/10/25 22:40, Jarkko Sakkinen wrote:
> On Tue Oct 22, 2024 at 4:11 AM EEST, Shuai Xue wrote:
>> Hi, Jarkko,
>>
>>
>> On 2024/10/14 16:42, Shuai Xue wrote:
>>> An uncorrected memory error can be signaled by an asynchronous interrupt
>>> (specifically, an SPI on arm64 platforms), e.g. when an error is detected
>>> by a background scrubber, or by a synchronous exception (specifically, a
>>> data abort exception on arm64 platforms), e.g. when a CPU tries to access
>>> a poisoned cache line. Currently, both synchronous and asynchronous
>>> errors use memory_failure_queue() to schedule memory_failure() to
>>> execute in kworker context.
>>>
>>> As a result, when a user-space process accesses poisoned data, a data
>>> abort is taken and memory_failure() is executed in kworker context, which:
>>>
>>> - sends the wrong si_code with the SIGBUS signal in early_kill mode, and
>>> - cannot kill the user-space process in some cases, resulting in a
>>>   synchronous error infinite loop
>>>
>>> Issue 1: wrong si_code sent in early_kill mode
>>>
>>> Since commit a70297d22132 ("ACPI: APEI: set memory failure flags as
>>> MF_ACTION_REQUIRED on synchronous events"), the flag MF_ACTION_REQUIRED
>>> can be used to determine whether a synchronous exception occurred on
>>> the ARM64 platform. When a synchronous exception is detected, the kernel
>>> is expected to terminate the current process, which has accessed a
>>> poisoned page. This is done by sending a SIGBUS signal with the error
>>> code BUS_MCEERR_AR, indicating an action-required machine check error on
>>> read.
>>>
>>> However, when kill_proc() is called to terminate the processes that have
>>> the poisoned page mapped, it sends the incorrect SIGBUS error code
>>> BUS_MCEERR_AO because the context in which it operates is not the one
>>> where the error was triggered.
>>>
>>> To reproduce this problem:
>>>
>>>     # STEP1: enable early kill mode
>>>     #sysctl -w vm.memory_failure_early_kill=1
>>>     vm.memory_failure_early_kill = 1
>>>
>>>     # STEP2: inject a UCE error and consume it to trigger a synchronous error
>>>     #einj_mem_uc single
>>>     0: single vaddr = 0xffffb0d75400 paddr = 4092d55b400
>>>     injecting ...
>>>     triggering ...
>>>     signal 7 code 5 addr 0xffffb0d75000
>>>     page not present
>>>     Test passed
>>>
>>> The si_code (code 5) from einj_mem_uc indicates a BUS_MCEERR_AO error,
>>> which is not correct.
>>>
>>> After this patch:
>>>
>>>     # STEP1: enable early kill mode
>>>     #sysctl -w vm.memory_failure_early_kill=1
>>>     vm.memory_failure_early_kill = 1
>>>     # STEP2: inject a UCE error and consume it to trigger a synchronous error
>>>     #einj_mem_uc single
>>>     0: single vaddr = 0xffffb0d75400 paddr = 4092d55b400
>>>     injecting ...
>>>     triggering ...
>>>     signal 7 code 4 addr 0xffffb0d75000
>>>     page not present
>>>     Test passed
>>>
>>> The si_code (code 4) from einj_mem_uc indicates a BUS_MCEERR_AR error,
>>> as expected.
>>>
>>> Issue 2: a synchronous error infinite loop
>>>
>>> If a user-space process, e.g. devmem, accesses a poisoned page that
>>> already has the HWPoison flag set, kill_accessing_process() is called to
>>> send SIGBUS to the current process with error info. Because
>>> memory_failure() is executed in kworker context, it does nothing but
>>> return EFAULT. So devmem accesses the poisoned page and triggers an
>>> exception again, resulting in a synchronous error infinite loop. Such a
>>> loop may cause platform firmware to exceed some threshold and reboot
>>> when Linux could have recovered from this error.
>>>
>>> To reproduce this problem:
>>>
>>>     # STEP 1: inject a UCE error; the kernel will set the HWPoison flag for the related page
>>>     #einj_mem_uc single
>>>     0: single vaddr = 0xffffb0d75400 paddr = 4092d55b400
>>>     injecting ...
>>>     triggering ...
>>>     signal 7 code 4 addr 0xffffb0d75000
>>>     page not present
>>>     Test passed
>>>
>>>     # STEP 2: access the same page and it will trigger a synchronous error infinite loop
>>>     devmem 0x4092d55b400
>>>
>>> To fix the above two issues, queue memory_failure() as a task_work so
>>> that it runs in the context of the process that is actually consuming
>>> the poisoned data.
>>>
>>> Signed-off-by: Shuai Xue
>>> Tested-by: Ma Wupeng
>>> Reviewed-by: Kefeng Wang
>>> Reviewed-by: Xiaofei Tan
>>> Reviewed-by: Baolin Wang
>>> ---
>>>  drivers/acpi/apei/ghes.c | 78 +++++++++++++++++++++++-----------------
>>>  include/acpi/ghes.h      |  3 --
>>>  include/linux/mm.h       |  1 -
>>>  mm/memory-failure.c      | 13 -------
>>>  4 files changed, 45 insertions(+), 50 deletions(-)
>>>
>>> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
>>> index f2ee28c44d7a..95e9520eb803 100644
>>> --- a/drivers/acpi/apei/ghes.c
>>> +++ b/drivers/acpi/apei/ghes.c
>>> @@ -467,28 +467,42 @@ static void ghes_clear_estatus(struct ghes *ghes,
>>>  }
>>>
>>>  /*
>>> - * Called as task_work before returning to user-space.
>>> - * Ensure any queued work has been done before we return to the context that
>>> - * triggered the notification.
>>> + * struct ghes_task_work - for synchronous RAS event
>>> + *
>>> + * @twork: callback_head for task work
>>> + * @pfn: page frame number of corrupted page
>>> + * @flags: work control flags
>>> + *
>>> + * Structure to pass task work to be handled before
>>> + * returning to user-space via task_work_add().
>>>   */
>>
>>
>> Do you have any further comments about this patch? Any comments are
>> welcome. If not, are you happy to explicitly give the Reviewed-by tag?
>
> Sorry, I've been busy switching to a new job.
>
> I have now read this through, and both the commit messages and the code
> changes look sane to me, so I guess I don't have any problem with that:
>
> Reviewed-by: Jarkko Sakkinen
>
>>
>> Best Regards,
>> Shuai
>
> BR, Jarkko

Thank you.

BR,
Shuai
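
For readers following the si_code discussion in the quoted commit message, here is a minimal user-space sketch of how a test program might tell the two SIGBUS codes apart. The helper names sigbus_code_name() and install_sigbus_reporter() are hypothetical; they are not part of the patch and are not taken from einj_mem_uc. The numeric fallbacks 4 and 5 are assumptions that match the "code 4" (BUS_MCEERR_AR) and "code 5" (BUS_MCEERR_AO) values in the quoted test output.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Fallbacks for C libraries that do not expose the Linux si_code values. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif
#ifndef BUS_MCEERR_AO
#define BUS_MCEERR_AO 5
#endif

/* Hypothetical helper: map a SIGBUS si_code to a readable name. */
static const char *sigbus_code_name(int si_code)
{
	switch (si_code) {
	case BUS_MCEERR_AR:
		return "BUS_MCEERR_AR";	/* action required: poison was consumed */
	case BUS_MCEERR_AO:
		return "BUS_MCEERR_AO";	/* action optional: early-kill notification */
	default:
		return "other";
	}
}

/* Write a string to stderr using only async-signal-safe calls. */
static void put_str(const char *s)
{
	ssize_t ret = write(STDERR_FILENO, s, strlen(s));

	(void)ret;	/* nothing useful can be done on failure here */
}

/* Report which kind of memory error was delivered, then exit. */
static void sigbus_reporter(int sig, siginfo_t *info, void *ucontext)
{
	(void)sig;
	(void)ucontext;

	put_str("SIGBUS si_code: ");
	put_str(sigbus_code_name(info->si_code));
	put_str("\n");

	/* Returning would re-run the faulting access, so terminate instead. */
	_exit(EXIT_FAILURE);
}

/* Hypothetical helper: install the reporter; returns 0 on success. */
static int install_sigbus_reporter(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = sigbus_reporter;
	sa.sa_flags = SA_SIGINFO;	/* needed to receive siginfo_t */
	sigemptyset(&sa.sa_mask);

	return sigaction(SIGBUS, &sa, NULL);
}
```

A test program would call install_sigbus_reporter() early in main(), before touching memory that may be poisoned. With the behaviour described in the commit message, a process that consumes the poison itself should then see BUS_MCEERR_AR rather than BUS_MCEERR_AO.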