From: "Luck, Tony"
To: Aili Yao
CC: HORIGUCHI NAOYA(堀口　直也), Oscar Salvador, david@redhat.com,
    akpm@linux-foundation.org, bp@alien8.de, tglx@linutronix.de,
    mingo@redhat.com, hpa@zytor.com, x86@kernel.org,
    linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, yangfeng1@kingsoft.com, sunhao2@kingsoft.com
Subject: RE: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned
Date: Fri, 12 Mar 2021 23:48:31 +0000
In-Reply-To: <3900f518d1324c388be52cf81f5220e4@intel.com>

>> will memory_failure() find it and unmap it? if succeed, then the current will be
>> signaled with correct vaddr and shift?
>
> That's a very good question. I didn't see a SIGBUS when I first wrote this code,
> hence all the p->mce_vaddr. But now I'm
> a) not sure why there wasn't a signal
> b) if we are to fix the problems noted by AndyL, need to make sure that there isn't a SIGBUS

Tests on upstream kernel today show that memory_failure() is both unmapping
the page and sending a SIGBUS.

My biggest issue with the KERNEL_COPYIN recovery path is that we don't have
code to mark the page not present while we are still in do_machine_check().
That's resulted in recovery working for simple cases where there is a single
get_user() call followed by an error return if that failed. But more complex
cases require more machine checks and a touching faith that the kernel will
eventually give up trying (spoiler: it sometimes doesn't).

Thanks to the decode of the instruction we do have the virtual address. So we
just need a safe walk of pgd->p4d->pud->pmd->pte (truncated if we hit a huge
page) with a write of a "not-present" value. Maybe a different poison type
from the one we get from memory_failure() so that the #PF code can recognize
this as a special case and do any other work that we avoided because we were
in #MC context.

-Tony
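
For readers following along, the walk described above can be sketched as a
userspace toy model. This is not kernel code and not the proposed patch:
struct entry, struct table, F_MC_POISON, map_page(), find_leaf() and
mark_not_present() are all hypothetical names made up for illustration, the
child pointer is kept in a separate field instead of being packed into the
entry, and nothing here deals with locking, TLB flushing, or what is actually
safe in #MC context.

```c
#include <stdint.h>
#include <stdlib.h>

#define ENTRIES     512
#define NLEVELS     5                 /* pgd, p4d, pud, pmd, pte */
#define F_PRESENT   0x1ULL
#define F_HUGE      0x80ULL           /* leaf above the pte level */
#define F_MC_POISON 0x200ULL          /* hypothetical "poisoned in #MC" marker */

struct table;
struct entry {
    uint64_t val;                     /* flag bits, as in a hardware entry */
    struct table *child;              /* toy only: real entries pack this into val */
};
struct table { struct entry e[ENTRIES]; };

/* x86-64 5-level layout: 9 index bits per level above a 12-bit page offset. */
static const int level_shift[NLEVELS] = { 48, 39, 30, 21, 12 };

static struct table *new_table(void)
{
    struct table *t = calloc(1, sizeof(*t));
    if (!t)
        abort();
    return t;
}

static unsigned int idx(uint64_t vaddr, int level)
{
    return (vaddr >> level_shift[level]) & (ENTRIES - 1);
}

/* Test helper: map vaddr with its leaf at leaf_level (4 = 4K pte, 3 = 2M pmd). */
static void map_page(struct table *pgd, uint64_t vaddr, int leaf_level)
{
    struct table *t = pgd;

    for (int level = 0; level < leaf_level; level++) {
        struct entry *e = &t->e[idx(vaddr, level)];
        if (!(e->val & F_PRESENT)) {
            e->child = new_table();
            e->val = F_PRESENT;
        }
        t = e->child;
    }
    t->e[idx(vaddr, leaf_level)].val =
        F_PRESENT | (leaf_level < NLEVELS - 1 ? F_HUGE : 0);
}

/* Walk down from the pgd; stop early at a huge leaf. NULL if nothing is mapped. */
static struct entry *find_leaf(struct table *pgd, uint64_t vaddr, int *level_out)
{
    struct table *t = pgd;

    for (int level = 0; level < NLEVELS; level++) {
        struct entry *e = &t->e[idx(vaddr, level)];
        if (!(e->val & F_PRESENT))
            return NULL;
        if (level == NLEVELS - 1 || (e->val & F_HUGE)) {
            *level_out = level;
            return e;
        }
        t = e->child;
    }
    return NULL;
}

/*
 * Replace the leaf with a not-present value carrying the marker bit.
 * Returns the level that was written, or -1 if the address was not mapped.
 */
static int mark_not_present(struct table *pgd, uint64_t vaddr)
{
    int level;
    struct entry *e = find_leaf(pgd, vaddr, &level);

    if (!e)
        return -1;
    e->val = F_MC_POISON;             /* F_PRESENT is clear */
    return level;
}
```

In this model, mapping one 4K page and one 2M page with map_page() and then
calling mark_not_present() on each stops at the pte and at the pmd
respectively, and a second call on the same address finds nothing present.
The distinct marker bit stands in for the "different poison type" idea: a
fault handler could test for it to tell this case apart from an ordinary
not-present entry.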