From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, UNPARSEABLE_RELAY,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 86B88C433E1 for ; Fri, 29 May 2020 05:56:33 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 215B3207BC for ; Fri, 29 May 2020 05:56:33 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 215B3207BC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id A83C18001A; Fri, 29 May 2020 01:56:32 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A0E5B80010; Fri, 29 May 2020 01:56:32 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8D5508001A; Fri, 29 May 2020 01:56:32 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0179.hostedemail.com [216.40.44.179]) by kanga.kvack.org (Postfix) with ESMTP id 726FB80010 for ; Fri, 29 May 2020 01:56:32 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 2BCBD180AD81A for ; Fri, 29 May 2020 05:56:32 +0000 (UTC) X-FDA: 76868697024.22.ink71_732400046b10b Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin22.hostedemail.com (Postfix) with ESMTP id 0E6A218038E60 for ; Fri, 29 May 2020 05:56:32 +0000 (UTC) X-HE-Tag: ink71_732400046b10b X-Filterd-Recvd-Size: 7280 Received: from out4436.biz.mail.alibaba.com (out4436.biz.mail.alibaba.com [47.88.44.36]) by imf36.hostedemail.com (Postfix) with ESMTP for ; Fri, 29 May 2020 05:56:30 +0000 (UTC) X-Alimail-AntiSpam:AC=PASS;BC=-1|-1;BR=01201311R961e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e01355;MF=wetp.zy@linux.alibaba.com;NM=1;PH=DS;RN=5;SR=0;TI=SMTPD_---0TzxLNk._1590731785; Received: from wetpdeMacBook-Pro.local(mailfrom:wetp.zy@linux.alibaba.com fp:SMTPD_---0TzxLNk._1590731785) by smtp.aliyun-inc.com(127.0.0.1); Fri, 29 May 2020 13:56:26 +0800 Subject: Re: [PATCH] mm, memory_failure: only send BUS_MCEERR_AO to early-kill process To: =?UTF-8?B?SE9SSUdVQ0hJIE5BT1lBKOWggOWPoyDnm7TkuZ8p?= Cc: "n-horiguchi@ah.jp.nec.com" , "akpm@linux-foundation.org" , "linux-mm@kvack.org" , "linux-kernel@vger.kernel.org" References: <1590476801-19882-1-git-send-email-wetp.zy@linux.alibaba.com> <20200528022241.GA1401@hori.linux.bs1.fc.nec.co.jp> <881b990a-2198-8e80-036e-bfa6f88070ff@linux.alibaba.com> <20200529021224.GA345@hori.linux.bs1.fc.nec.co.jp> From: wetp Message-ID: Date: Fri, 29 May 2020 13:56:25 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:68.0) Gecko/20100101 Thunderbird/68.8.1 MIME-Version: 1.0 In-Reply-To: <20200529021224.GA345@hori.linux.bs1.fc.nec.co.jp> Content-Type: text/plain; charset=utf-8; format=flowed X-Rspamd-Queue-Id: 0E6A218038E60 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam04 Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 2020/5/29 =E4=B8=8A=E5=8D=8810:12, HORIGUCHI NAOYA(=E5=A0=80=E5=8F=A3 = =E7=9B=B4=E4=B9=9F) wrote: > On Thu, May 28, 2020 at 02:50:09PM +0800, wetp wrote: >> On 2020/5/28 =E4=B8=8A=E5=8D=8810:22, HORIGUCHI NAOYA(=E5=A0=80=E5=8F=A3= =E7=9B=B4=E4=B9=9F) wrote: >>> Hi Zhang, >>> >>> Sorry for my late response. >>> >>> On Tue, May 26, 2020 at 03:06:41PM +0800, Wetp Zhang wrote: >>>> From: Zhang Yi >>>> >>>> If a process don't need early-kill, it may not care the BUS_MCEERR_A= O. >>>> Let the process to be killed when it really access the corrupted mem= ory. >>>> >>>> Signed-off-by: Zhang Yi >>> Thank you for pointing this. This looks to me a bug (per-process flag >>> is ignored when system-wide flag is set). >> The flag is not problem for me. >> >> In my case, two processes share memory with no any flag setting, both = will >> be killed when only one >> >> access the fail memory. > Thanks, now your problem seems clearer. > > It seems that this happens because in "Action Required" case kill_proc(= ) > takes the first branch for current process, while it takes the else bra= nch > for other affected processes: > > static int kill_proc(struct to_kill *tk, unsigned long pfn, int fl= ags) > { > ... > =20 > if ((flags & MF_ACTION_REQUIRED) && t->mm =3D=3D current->= mm) { > ret =3D force_sig_mceerr(BUS_MCEERR_AR, (void __us= er *)tk->addr, > addr_lsb); > } else { > /* > * Don't use force here, it's convenient if the si= gnal > * can be temporarily blocked. > * This could cause a loop when the user sets SIGB= US > * to SIG_IGN, but hopefully no one will do that? > */ > ret =3D send_sig_mceerr(BUS_MCEERR_AO, (void __use= r *)tk->addr, > addr_lsb, t); /* synchronou= s? */ > } > > Sending SIGBUS with BUS_MCEERR_AO for action optional error is strange,= so > maybe this logic should be like this: > > > if (flags & MF_ACTION_REQUIRED) { > if (t->mm =3D=3D current->mm) > ret =3D force_sig_mceerr(BUS_MCEERR_AR, (v= oid __user *)tk->addr, > addr_lsb); > /* send no signal to non-current processes */ Ok, this can solve my problem. > } else { > /* > * Don't use force here, it's convenient if the si= gnal > * can be temporarily blocked. > * This could cause a loop when the user sets SIGB= US > * to SIG_IGN, but hopefully no one will do that? > */ > ret =3D send_sig_mceerr(BUS_MCEERR_AO, (void __use= r *)tk->addr, > addr_lsb, t); /* synchronou= s? */ > } > >>>> --- >>>> mm/memory-failure.c | 7 ++++--- >>>> 1 file changed, 4 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c >>>> index a96364be8ab4..2db13d48865c 100644 >>>> --- a/mm/memory-failure.c >>>> +++ b/mm/memory-failure.c >>>> @@ -210,7 +210,7 @@ static int kill_proc(struct to_kill *tk, unsigne= d long pfn, int flags) >>>> { >>>> struct task_struct *t =3D tk->tsk; >>>> short addr_lsb =3D tk->size_shift; >>>> - int ret; >>>> + int ret =3D 0; >>>> >>>> pr_err("Memory failure: %#lx: Sending SIGBUS to %s:%d due to har= dware memory corruption\n", >>>> pfn, t->comm, t->pid); >>>> @@ -225,8 +225,9 @@ static int kill_proc(struct to_kill *tk, unsigne= d long pfn, int flags) >>>> * This could cause a loop when the user sets SIGBUS >>>> * to SIG_IGN, but hopefully no one will do that? >>>> */ >>>> - ret =3D send_sig_mceerr(BUS_MCEERR_AO, (void __user *)tk->addr, >>>> - addr_lsb, t); /* synchronous? */ >>>> + if ((t->flags & PF_MCE_PROCESS) && (t->flags & PF_MCE_EARLY)) >>>> + ret =3D send_sig_mceerr(BUS_MCEERR_AO, >>>> + (void __user *)tk->addr, addr_lsb, t); >>> kill_proc() could be called only for processes that are selected by >>> collect_procs() with task_early_kill(). So I think that we should fi= x >>> task_early_kill(), maybe by reordering sysctl_memory_failure_early_ki= ll >>> check and find_early_kill_thread() check. >>> >>> static struct task_struct *task_early_kill(struct task_struct *= tsk, >>> int force_early) >>> { >>> struct task_struct *t; >>> if (!tsk->mm) >>> return NULL; >>> if (force_early) >>> return tsk; >> The force_early is rely the flag MF_ACTION_REQUIRED, so it is always t= rue >> when MCE occurs. >> >> This leads always sending SIGBUS to processes even if those are not cu= rrent >> or no flag setting. >> >> =C2=A0I think it could keep the non-current processes which has no fl= ag setting >> running. >> >> >> Besides, base on your recommendation I reorder the force_early check a= nd >> find_early_kill_thread() >> >> check, to send the signal to the right thread. > Sorry, my previous comment around task_early_kill() is for a separate p= roblem, > so I'll try some fix on this later. Thanks. Should me send the patch V2 for my problem alone?=C2=A0 Or you will fix i= t=20 with task_early_kill() together =EF=BC=9F > Thanks, > Naoya Horiguchi