Date: Mon, 18 Jan 2021 17:38:52 +0800
From: Aili Yao
To: Oscar Salvador
CC: "HORIGUCHI NAOYA(堀口 直也)", "linux-mm@kvack.org", "yangfeng1@kingsoft.com"
Subject: Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
Message-ID: <20210118173852.0c784aa9.yaoaili@kingsoft.com>
In-Reply-To: <20210118092419.GA4234@linux>
References: <20210115155506.2d59fe83.yaoaili@kingsoft.com>
 <20210115084920.GA4092@linux>
 <20210115172622.699d68e5.yaoaili@kingsoft.com>
 <20210118051555.GA3585@hori.linux.bs1.fc.nec.co.jp>
 <20210118135744.7413cd06.yaoaili@kingsoft.com>
 <20210118065054.GA7447@hori.linux.bs1.fc.nec.co.jp>
 <20210118161512.701c94e7.yaoaili@kingsoft.com>
 <20210118085747.GA904@hori.linux.bs1.fc.nec.co.jp>
 <20210118092419.GA4234@linux>

On Mon, 18 Jan 2021 10:24:23 +0100 Oscar Salvador wrote:

>
> So, the scenario case is a multithread application with the same page mapped.
> And PF_MCE_KILL_EARLY flag was set.
>

The scenario is not about a multithreaded application; it is about a
multiprocess application whose processes share the same page.

> IIUC, Aili Yao concern is that when the MCE machinery calls memory_failure
> with MF_ACTION_REQUIRED, only the process that triggered the MCE exception
> will receive a SIGBUS, and not the other threads that might have PF_MCE_EARLY.
> Aili Yao would like memory_failure() to also signal those threads who might
> have the flag set, in case they want to do something with that information.
>

Processes that care about the memory and have the early-kill flag set may
designate one thread to handle the related signal (SIGBUS). With the flag
set, that thread wants to handle the error gracefully instead of having the
whole process killed, and it may do something more with the information.
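A rough userspace sketch of that setup, for illustration only and not taken
from the patch: the designated thread requests early kill through
prctl(PR_MCE_KILL) and installs a SIGBUS handler that looks at si_code. The
function names, the string labels and the two bookkeeping variables are
hypothetical; only the prctl/sigaction calls and the BUS_MCEERR_* codes are
the standard Linux interfaces.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/prctl.h>

/* Hypothetical bookkeeping: last report seen by the handler. The pointer
 * store is not strictly guaranteed atomic by the C standard; a real handler
 * might pass the address through a pipe or an atomic instead. */
static volatile sig_atomic_t last_mce_code;
static void *volatile last_mce_addr;

/* Classify a SIGBUS si_code. The returned labels are illustrative only. */
const char *mce_sigbus_kind(int si_code)
{
    switch (si_code) {
    case BUS_MCEERR_AO: /* error found, page not yet consumed by this task */
        return "action-optional";
    case BUS_MCEERR_AR: /* this task touched the corrupted page */
        return "action-required";
    default:
        return "other";
    }
}

/* Only records what was reported; recovery would happen outside the handler. */
static void sigbus_handler(int sig, siginfo_t *info, void *ucontext)
{
    (void)sig;
    (void)ucontext;
    last_mce_code = info->si_code;
    last_mce_addr = info->si_addr;
}

/* Request early kill for the calling thread. Returns 0 on success,
 * -1 with errno set otherwise. */
int request_early_kill(void)
{
    return prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0);
}

/* Install the SIGBUS handler with siginfo delivery. Returns 0 on success. */
int install_sigbus_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigbus_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```

The designated thread would call install_sigbus_handler() and
request_early_kill() once at startup. Whether a BUS_MCEERR_AO signal then
reaches it when another process hits the shared page is exactly the
behaviour under discussion in this thread, so the sketch shows only the
userspace side.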
> But reading the code, I do not think that is what the code expects.
> Looking at the comment above find_early_kill_thread:
>
> "/*
>   * Find a dedicated thread which is supposed to handle SIGBUS(BUS_MCEERR_AO)
>   * on behalf of the thread group. Return task_struct of the (first found)
>   * dedicated thread if found, and return NULL otherwise.
>   *
>   * We already hold read_lock(&tasklist_lock) in the caller, so we don't
>   * have to call rcu_read_lock/unlock() in this function.
>   */"
>
> What I understand from that is:
>
> "
> If memory_failure() was not triggered by any concrete process (aka: no one was
> trying to manipulate the corrupted area), we need to find the main thread who
> might have set the MCE policy by prctl and see if they want to be signaled
> __before__ they access the corrupted area.

Yes, and it does not want the whole process to simply be killed.

Thanks

-- 
Best Regards!
Aili Yao