From: Mark Rutland <mark.rutland@arm.com>
To: Tong Tiangen <tongtiangen@huawei.com>
Cc: James Morse <james.morse@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Robin Murphy <robin.murphy@arm.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	x86@kernel.org, "H . Peter Anvin" <hpa@zytor.com>,
	linuxppc-dev@lists.ozlabs.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	Xie XiuQi <xiexiuqi@huawei.com>, Guohanjun <guohanjun@huawei.com>
Subject: Re: [PATCH -next v5 6/8] arm64: add support for machine check error safe
Date: Sat, 18 Jun 2022 13:52:24 +0100
Message-ID: <Yq3KiDN87pd6mg+m@FVFF77S0Q05N>
In-Reply-To: <4aa8b109-c79b-8da0-db89-85ca128f1049@huawei.com>

On Sat, Jun 18, 2022 at 05:18:55PM +0800, Tong Tiangen wrote:
> On 2022/6/17 16:55, Mark Rutland wrote:
> > On Sat, May 28, 2022 at 06:50:54AM +0000, Tong Tiangen wrote:
> > > +static bool arm64_do_kernel_sea(unsigned long addr, unsigned int esr,
> > > +				     struct pt_regs *regs, int sig, int code)
> > > +{
> > > +	if (!IS_ENABLED(CONFIG_ARCH_HAS_COPY_MC))
> > > +		return false;
> > > +
> > > +	if (user_mode(regs) || !current->mm)
> > > +		return false;
> > 
> > What's the `!current->mm` check for?
> 
> At first, I considered that only user processes have the opportunity to
> recover when they trigger a memory error.
> 
> But it seems that this restriction is unreasonable. When a kernel thread
> triggers a memory error, it can also be recovered. For instance:
> 
> https://lore.kernel.org/linux-mm/20220527190731.322722-1-jiaqiyan@google.com/
> 
> And I think if (!current->mm) should be added below:
> 
> if(!current->mm) {
> 	set_thread_esr(0, esr);
> 	arm64_force_sig_fault(...);
> }
> return true;

Why does 'current->mm' have anything to do with this, though?

There can be kernel threads with `current->mm` set in unusual circumstances
(and there's a lot of kernel code out there which handles that wrong), so if
you want to treat user tasks differently, we should be doing something like
checking PF_KTHREAD, or adding something like an is_user_task() helper.
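
As a rough illustration of that suggestion, here is a userspace sketch. The
struct definitions are cut-down mock-ups rather than the kernel's, the
PF_KTHREAD value is assumed to match include/linux/sched.h, and both
is_user_task() and has_mm() are hypothetical helpers, not existing kernel
functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace mock-ups of the kernel types; only the fields used here. */
struct mm_struct {
	int dummy;
};

struct task_struct {
	unsigned int flags;
	struct mm_struct *mm;
};

/* Assumed to mirror the kernel's PF_KTHREAD flag value. */
#define PF_KTHREAD 0x00200000

/* Hypothetical helper: classify by task flags rather than by ->mm. */
static bool is_user_task(const struct task_struct *tsk)
{
	return !(tsk->flags & PF_KTHREAD);
}

/* The check under discussion, kept for comparison. */
static bool has_mm(const struct task_struct *tsk)
{
	return tsk->mm != NULL;
}
```

For a kernel thread that has temporarily adopted an mm, has_mm() is true but
is_user_task() is still false, which is the case the `!current->mm` test
gets wrong.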

[...]

> > > +
> > > +	if (apei_claim_sea(regs) < 0)
> > > +		return false;
> > > +
> > > +	if (!fixup_exception_mc(regs))
> > > +		return false;
> > 
> > I thought we still wanted to signal the task in this case? Or do you expect to
> > add that into `fixup_exception_mc()` ?
> 
> Yeah, returning false here means the task is signalled in do_sea() ->
> arm64_notify_die().

I mean when we do the fixup.

I thought the idea was to apply the fixup (to stop the kernel from crashing),
but still to deliver a fatal signal to the user task since we can't do what the
user task asked us to.
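
To make the intended ordering concrete, here is a userspace sketch of that
control flow. Everything in it is hypothetical: struct sea_ctx, the
sea_outcome values and handle_kernel_sea() are stand-ins invented for
illustration, with has_fixup modelling whether fixup_exception_mc() would
succeed and signalled modelling a call to arm64_force_sig_fault().

```c
#include <stdbool.h>

/* Hypothetical outcomes of handling a kernel-mode SEA. */
enum sea_outcome {
	SEA_UNHANDLED,           /* no fixup: caller takes the existing path */
	SEA_FIXED_AND_SIGNALLED, /* fixup applied and signal queued */
};

/* Hypothetical stand-in for the state the real handler would consult. */
struct sea_ctx {
	bool has_fixup; /* models fixup_exception_mc() succeeding */
	bool signalled; /* models arm64_force_sig_fault() being called */
};

static enum sea_outcome handle_kernel_sea(struct sea_ctx *ctx)
{
	/* Without an extable fixup there is nothing to recover. */
	if (!ctx->has_fixup)
		return SEA_UNHANDLED;

	/*
	 * The fixup keeps the kernel alive, but the requested operation
	 * still failed, so the task is signalled as well.
	 */
	ctx->signalled = true;
	return SEA_FIXED_AND_SIGNALLED;
}
```

The point of the sketch is only that the signal is raised on the successful
fixup path, not solely on the path where the fixup is missing.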

> > > +
> > > +	set_thread_esr(0, esr);
> > 
> > Why are we not setting the address? Is that deliberate, or an oversight?
> 
> Here I set fault_address to 0, following the logic of arm64_notify_die().
> 
> void arm64_notify_die(...)
> {
>          if (user_mode(regs)) {
>                  WARN_ON(regs != current_pt_regs());
>                  current->thread.fault_address = 0;
>                  current->thread.fault_code = err;
> 
>                  arm64_force_sig_fault(signo, sicode, far, str);
>          } else {
>                  die(str, regs, err);
>          }
> }
> 
> I don't know exactly why. Do you know why arm64_notify_die() does this? :)

To be honest, I don't know, and that looks equally suspicious to me.

Looking at the git history, that was added in commit:

  9141300a5884b57c ("arm64: Provide read/write fault information in compat signal handlers")

... so maybe Catalin recalls why.

Perhaps the assumption is just that this will be fatal and so unimportant? ...
but in that case the same logic would apply to the ESR value, so it's not clear
to me.

Mark.


