From: Aili Yao <yaoaili@kingsoft.com>
To: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"yangfeng1@kingsoft.com" <yangfeng1@kingsoft.com>
Subject: Re: [PATCH] mm,hwpoison: non-current task should be checked early_kill for force_early
Date: Mon, 18 Jan 2021 17:09:00 +0800
Message-ID: <20210118170900.6fe9595a.yaoaili@kingsoft.com>
In-Reply-To: <20210118085747.GA904@hori.linux.bs1.fc.nec.co.jp>

On Mon, 18 Jan 2021 08:57:47 +0000
HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@nec.com> wrote:

> > > 
> > > For action optional cases, one error event kills *only one* process. If an
> > > error page is shared by multiple processes, these processes will be killed
> > > by separate error events, each of which is triggered when each process tries
> > > to access the error memory.  So these processes would be killed immediately
> > > when accessing the error, but you don't have to kill all at the same time
> > > (or actually you might not even have to kill it at all if the process exits
> > > finally without accessing the error later).
> > > 
> > > Maybe the function variable "force_early" is named confusingly (it sounds
> > > like it's related to the PF_MCE_KILL_EARLY flag, but that's incorrect).
> > > I'll submit a fix later.  (I'll add your "Reported-by" because you made me
> > > find it, thank you.)
> > >   
> > I think we should do more for the non-current process error case: we
> > should mark it AO for the processes to be signaled, or we may take the
> > wrong action.
> 
> I'm not sure what you mean by "non current process error case" and "we
> should mark it AO", so could you explain more specifically about your error
> scenario?  
  I will share my test code and submit another patch for this scenario.
  Please give me some time, thanks!
  And I think you are right: AR applies only to the current process.
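
For readers following the AO/AR distinction above, here is a minimal user-space
sketch of how a process can tell the two apart. It relies only on the
documented prctl(2) PR_MCE_KILL interface and the SIGBUS si_code values
BUS_MCEERR_AR and BUS_MCEERR_AO. The names classify_sigbus, on_sigbus,
install_sigbus_handler, request_early_kill and last_mce_kind are hypothetical
and are not taken from the patch under discussion.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/prctl.h>

enum mce_kind { MCE_NONE, MCE_ACTION_REQUIRED, MCE_ACTION_OPTIONAL };

/* Last memory-error kind seen by the handler (hypothetical bookkeeping). */
volatile sig_atomic_t last_mce_kind = MCE_NONE;

/* Map the si_code of a SIGBUS to the kind of memory error it reports. */
enum mce_kind classify_sigbus(int si_code)
{
    switch (si_code) {
    case BUS_MCEERR_AR:   /* the faulting task itself touched the bad page */
        return MCE_ACTION_REQUIRED;
    case BUS_MCEERR_AO:   /* error found asynchronously, sent to early-kill tasks */
        return MCE_ACTION_OPTIONAL;
    default:              /* ordinary SIGBUS, unrelated to hwpoison */
        return MCE_NONE;
    }
}

/*
 * Record what kind of error arrived. A real handler must not simply return
 * on an AR error, because the access would fault again; it should recover
 * the data or exit.
 */
static void on_sigbus(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    last_mce_kind = classify_sigbus(info->si_code);
}

/* Install the SIGBUS handler; returns 0 on success, -1 on failure. */
int install_sigbus_handler(void)
{
    struct sigaction sa = {0};

    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = on_sigbus;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}

/* Opt this thread in to early kill; returns 0 on success, -1 on failure. */
int request_early_kill(void)
{
    return prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0);
}
```

A process that calls request_early_kill() asks to be signaled with
BUS_MCEERR_AO as soon as an error is detected on a page it maps. Without that
opt-in, it would normally hear about the error only when it touches the page
itself, which is the BUS_MCEERR_AR case.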

> Especially I'd like to know about who triggers hard offline on
> what hardware events and what "wrong action" could happen.  Maybe just
> "calling memory_failure() with MF_ACTION_REQUIRED" is not enough, because
> it's not enough for us to see that your scenario is possible. Current
> implementation implicitly assumes some hardware behavior, and does not work
> for the case which never happens under the assumption.
> 
  This action comes from the mcelog daemon. Soft page offlining is the
  default, but we can configure hard page offlining for CE storms so that
  the related processes get signaled.
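
To make the trigger path concrete, the sketch below shows how a user-space
daemon could ask the kernel to offline a page through the sysfs files
soft_offline_page and hard_offline_page. Those files exist on kernels built
with memory-failure support. That mcelog uses exactly this path on the system
discussed here is an assumption, and the helper names offline_sysfs_path and
request_page_offline are hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define SOFT_OFFLINE_PATH "/sys/devices/system/memory/soft_offline_page"
#define HARD_OFFLINE_PATH "/sys/devices/system/memory/hard_offline_page"

enum offline_mode { OFFLINE_SOFT, OFFLINE_HARD };

/* Pick the sysfs file that matches the requested offline mode. */
const char *offline_sysfs_path(enum offline_mode mode)
{
    return mode == OFFLINE_HARD ? HARD_OFFLINE_PATH : SOFT_OFFLINE_PATH;
}

/*
 * Write a physical address, as hex, to the given file.
 * The path is a parameter so the helper can be exercised against an
 * ordinary file; writing to the real sysfs files requires root.
 * Returns 0 on success, -1 on failure.
 */
int request_page_offline(const char *path, uint64_t phys_addr)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (f == NULL)
        return -1;
    ok = fprintf(f, "0x%" PRIx64 "\n", phys_addr) > 0;
    if (fclose(f) != 0)   /* sysfs write errors can surface at close time */
        ok = 0;
    return ok ? 0 : -1;
}
```

With this sketch, request_page_offline(offline_sysfs_path(OFFLINE_HARD), addr)
would request a hard offline of the page containing addr. Because the request
comes from the daemon rather than from a task that touched the page, it is a
concrete instance of the "non-current process" case raised above.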

> Do you have some test cases to reproduce any specific issue (like data lost)
> on your system? (If yes, please share it.) Or your concern is from code review?
>
  I will clean it up and share it.

Thanks
-- 
Best Regards!

Aili Yao

