Message-ID: <5570c23a-3b12-6685-cb0b-29fc1d58f541@intel.com>
Date: Thu, 25 May 2023 10:18:24 -0700
Subject: Re: [PATCH] x86/mce: set MCE_IN_KERNEL_COPYIN for all MC-Safe Copy
To: Kefeng Wang, Tony Luck, Borislav Petkov, Naoya Horiguchi
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86@kernel.org, Andrew Morton, linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, jane.chu@oracle.com
References: <20230508022233.13890-1-wangkefeng.wang@huawei.com>
From: Dave Hansen
In-Reply-To: <20230508022233.13890-1-wangkefeng.wang@huawei.com>

On 5/7/23 19:22, Kefeng Wang wrote:
> Both EX_TYPE_FAULT_MCE_SAFE and EX_TYPE_DEFAULT_MCE_SAFE exception
> fixup types are used to identify fixups which allow in kernel #MC
> recovery, that is the Machine Check Safe Copy.
>
> For now, the MCE_IN_KERNEL_COPYIN flag is only set for EX_TYPE_COPY
> and EX_TYPE_UACCESS when copy from user, and corrupted page is
> isolated in this case, for MC-safe copy, memory_failure() is not
> always called, some places, like __wp_page_copy_user, copy_subpage,
> copy_user_gigantic_page and ksm_might_need_to_copy manually call
> memory_failure_queue() to cope with such unhandled error pages,
> recently coredump hwposion recovery support[1] is asked to do the
> same thing, and there are some other already existed MC-safe copy
> scenarios, eg, nvdimm, dm-writecache, dax, which has similar issue.

That has to set some kind of record for run-on sentences. Could you
please try to rewrite this coherently?

> The best way to fix them is set MCE_IN_KERNEL_COPYIN to MCE_SAFE
> exception, then kill_me_never() will be queued to call memory_failure()
> in do_machine_check() to isolate corrupted page, which avoid calling
> memory_failure_queue() after every MC-safe copy return.

Could you try to send a v2 of this with a clear problem statement?

What is the end user visible effect of the problem and of your solution?