From: Kefeng Wang
Date: Wed, 24 May 2023 19:23:33 +0800
Subject: Re: [PATCH] x86/mce: set MCE_IN_KERNEL_COPYIN for all MC-Safe Copy
To: Tony Luck, Borislav Petkov, Naoya Horiguchi
CC: Thomas Gleixner, Ingo Molnar, Dave Hansen, Andrew Morton
In-Reply-To: <20230508022233.13890-1-wangkefeng.wang@huawei.com>

Hi x86/mm maintainers,

Could you pick this up, as it has been reviewed by Naoya and Tony? Many thanks.

On 2023/5/8 10:22, Kefeng Wang wrote:
> Both EX_TYPE_FAULT_MCE_SAFE and EX_TYPE_DEFAULT_MCE_SAFE exception
> fixup types are used to identify fixups which allow in kernel #MC
> recovery, that is the Machine Check Safe Copy.
>
> For now, the MCE_IN_KERNEL_COPYIN flag is only set for EX_TYPE_COPY
> and EX_TYPE_UACCESS when copying from user, and the corrupted page is
> isolated in that case. For MC-safe copy, memory_failure() is not
> always called. Some places, like __wp_page_copy_user, copy_subpage,
> copy_user_gigantic_page and ksm_might_need_to_copy, manually call
> memory_failure_queue() to cope with such unhandled error pages.
> Recently, coredump hwpoison recovery support[1] was asked to do the
> same thing, and there are other existing MC-safe copy scenarios,
> e.g. nvdimm, dm-writecache and dax, which have a similar issue.
>
> The best way to fix them is to set MCE_IN_KERNEL_COPYIN for the
> MCE_SAFE exception types, so that kill_me_never() is queued to call
> memory_failure() in do_machine_check() to isolate the corrupted page,
> which avoids calling memory_failure_queue() after every MC-safe copy
> returns.
>
> [1] https://lkml.kernel.org/r/20230417045323.11054-1-wangkefeng.wang@huawei.com
>
> Signed-off-by: Kefeng Wang
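
For readers following the thread, the flag handling described above can be modelled in a few lines of plain C. This is only a user-space sketch and not the patch itself: struct mce_model, classify_fixup() and queues_memory_failure() are hypothetical names, the enum and flag values are illustrative rather than the kernel's, and the is_copy_from_user parameter merely stands in for the "copy from user" check mentioned in the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins; the numeric values are NOT the kernel's. */
enum ex_type {
	EX_TYPE_NONE,
	EX_TYPE_UACCESS,
	EX_TYPE_COPY,
	EX_TYPE_FAULT_MCE_SAFE,
	EX_TYPE_DEFAULT_MCE_SAFE,
};

#define MCE_IN_KERNEL_COPYIN (1u << 0)

/* Hypothetical, reduced model of a machine-check record. */
struct mce_model {
	unsigned int kflags;
};

/*
 * Hypothetical helper: mark the record when the faulting instruction's
 * fixup type means the poisoned page should be isolated by the #MC
 * handler. Returns true if the fixup type allows in-kernel recovery.
 */
bool classify_fixup(struct mce_model *m, enum ex_type type,
		    bool is_copy_from_user)
{
	assert(m != NULL);

	switch (type) {
	case EX_TYPE_UACCESS:
	case EX_TYPE_COPY:
		/* Existing behaviour: only flagged when copying from user. */
		if (!is_copy_from_user)
			return false;
		m->kflags |= MCE_IN_KERNEL_COPYIN;
		return true;
	case EX_TYPE_FAULT_MCE_SAFE:
	case EX_TYPE_DEFAULT_MCE_SAFE:
		/* Proposed behaviour: MC-safe copies get the flag as well. */
		m->kflags |= MCE_IN_KERNEL_COPYIN;
		return true;
	default:
		return false;
	}
}

/* Hypothetical predicate: would the handler queue page isolation? */
bool queues_memory_failure(const struct mce_model *m)
{
	return (m->kflags & MCE_IN_KERNEL_COPYIN) != 0;
}
```

In this model, a caller of an MC-safe copy no longer needs its own follow-up call after the copy fails, because the record is already flagged at classification time for both MCE_SAFE fixup types.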