From: Kefeng Wang <wangkefeng.wang@huawei.com>
Date: Tue, 23 May 2023 09:34:49 +0800
Subject: Re: [PATCH] x86/mce: set MCE_IN_KERNEL_COPYIN for all MC-Safe Copy
To: "Luck, Tony", Borislav Petkov, Naoya Horiguchi
CC: Thomas Gleixner, Ingo Molnar, Dave Hansen, "x86@kernel.org", Andrew Morton, "linux-edac@vger.kernel.org", "linux-kernel@vger.kernel.org", "linux-mm@kvack.org", "chu, jane"
Message-ID: <0d3aeb09-5bc9-31bd-4f84-675ebddd9f03@huawei.com>
References: <20230508022233.13890-1-wangkefeng.wang@huawei.com> <75d8452c-695b-b22a-30d0-15302cd072ef@huawei.com>

On 2023/5/23 2:02, Luck, Tony wrote:
>>> Is this patch in addition to, or instead of, the earlier core dump patch?
>>
>> This is an addition. The previous coredump patch manually calls
>> memory_failure_queue() to cope with the corrupted page, which is
>> similar to your "Copy-on-write poison recovery"[1]. But after some
>> discussion, I think we could add MCE_IN_KERNEL_COPYIN to all MC-safe
>> copies, so that the corrupted page is handled in the core
>> do_machine_check() instead of one call site at a time.
>
> Thanks for the context. I see how this all fits together now.
>
> Your patch looks good.
>
> Reviewed-by: Tony Luck

Thanks for the confirmation.

>
> -Tony
>
> One small observation from testing. I injected to an application which consumed
> the poisoned data and was sent a SIGBUS.
>
> Kernel did not crash (hurrah!)

Yes, no crash is always great.

>
> Console log said:
>
> [ 417.610930] mce: [Hardware Error]: Machine check events logged
> [ 417.618372] Memory failure: 0x89167f: recovery action for dirty LRU page: Recovered
> ... EDAC messages
> [ 423.666918] MCE: Killing testprog:4770 due to hardware memory corruption fault at 7f8eccf35000
>
> A core file was generated and saved in /var/lib/systemd/coredump
>
> But my shell (/bin/bash) only said:
>
> Bus error
>
> not
>
> Bus error (core dumped)

Not sure what causes that, but since the kernel messages and mcelog
records are both there, the difference does not seem to be a big deal :)

>
> -Tony
>
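
For readers wondering where the "(core dumped)" suffix comes from: shells such as bash take it from the core-dump flag in the child's wait status, not from whether a core file exists on disk. The thread does not establish why that flag was clear in the test above, so the following is only a minimal userspace sketch of how the message is derived. The function names `describe` and `run_and_report` are hypothetical, and the child here raises SIGBUS on itself rather than consuming poisoned memory.

```python
import os
import resource
import signal


def describe(status: int) -> str:
    """Turn a wait status into a shell-style message."""
    if not os.WIFSIGNALED(status):
        return f"exited with status {os.WEXITSTATUS(status)}"
    signum = os.WTERMSIG(status)
    text = signal.strsignal(signum) or f"signal {signum}"
    # The suffix depends only on the core-dump flag in the wait status.
    if os.WCOREDUMP(status):
        text += " (core dumped)"
    return text


def run_and_report(sig: int = signal.SIGBUS, allow_core: bool = False) -> str:
    """Fork a child that kills itself with `sig` and describe how it died."""
    pid = os.fork()
    if pid == 0:
        if not allow_core:
            # Assumption for the demo: ask for no core file from the child.
            resource.setrlimit(resource.RLIMIT_CORE, (0, 0))
        signal.signal(sig, signal.SIG_DFL)
        os.kill(os.getpid(), sig)
        os._exit(1)  # only reached if the signal was not fatal
    _, status = os.waitpid(pid, 0)
    return describe(status)


if __name__ == "__main__":
    print(run_and_report())
```

Whether the flag ends up set depends on the system's core dump configuration, so the script's output can differ between machines.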