linux-mm.kvack.org archive mirror
From: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
To: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: "Luck, Tony" <tony.luck@intel.com>,
	"chu, jane" <jane.chu@oracle.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Miaohe Lin <linmiaohe@huawei.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Tong Tiangen <tongtiangen@huawei.com>,
	Jens Axboe <axboe@kernel.dk>
Subject: Re: [PATCH v2] mm: hwpoison: coredump: support recovery from dump_user_range()
Date: Thu, 27 Apr 2023 02:31:08 +0000
Message-ID: <20230427023045.GA3499768@hori.linux.bs1.fc.nec.co.jp>
In-Reply-To: <f345b2b4-73e5-a88d-6cff-767827ab57d0@huawei.com>

On Thu, Apr 27, 2023 at 09:06:46AM +0800, Kefeng Wang wrote:
> 
> 
> On 2023/4/26 23:45, Luck, Tony wrote:
> > > > > Thanks for confirming. What is your opinion about adding
> > > > > MCE_IN_KERNEL_COPYIN to the EX_TYPE_DEFAULT_MCE_SAFE/FAULT_MCE_SAFE types
> > > > > so that do_machine_check() calls queue_task_work(&m, msg, kill_me_never)?
> > > > > That would remove every memory_failure_queue() call made after an MC-safe copy returns.
> > > > 
> > > > I haven't been following this thread closely. Can you give a link to the e-mail
> > > > where you posted a patch that does this? Or just repost that patch if easier.
> > > 
> > > The main diff is in [1]; I will post a formal patch once -rc1 is out,
> > > thanks.
> > > 
> > > [1]
> > > https://lore.kernel.org/linux-mm/6dc1b117-020e-be9e-7e5e-a349ffb7d00a@huawei.com/
> > 
> > There seem to be a few misconceptions in that message. Not sure if all of them
> > were resolved.  Here are some pertinent points:
> > 
> > > > > > In my understanding, an MCE should not be triggered when an MC-safe
> > > > > > copy tries to access a memory error.  So I feel that we might be
> > > > > > talking about different scenarios.
> > 
> > This is wrong. There is still a machine check when an MC-safe copy does a read
> > from a location that has a memory error.

Yes, that was my first impression, and it has since been proven wrong ;)

> > 
> > The recovery flow in this case does not involve queue_task_work(). That is only
> > useful for machine check exceptions taken in user context. The queued work will
> > be executed to call memory_failure() from the kernel, but in process context (not
> > from the machine check exception stack) to handle the error.
> > 
> > For machine checks taken by kernel code (MC-safe copy functions) the recovery
> > path is here:
> > 
> >                  if (m.kflags & MCE_IN_KERNEL_RECOV) {
> >                          if (!fixup_exception(regs, X86_TRAP_MC, 0, 0))
> >                                  mce_panic("Failed kernel mode recovery", &m, msg);
> >                  }
> > 
> >                  if (m.kflags & MCE_IN_KERNEL_COPYIN)
> >                          queue_task_work(&m, msg, kill_me_never);
> > 
> > The "fixup_exception()" ensures that on return from the machine check handler
> > code returns to the extable[] fixup location instead of the instruction that was
> > loading from the memory error location.
> > 
> > When the exception was from one of the copy_from_user() variants it makes
> > sense to also do the queue_task_work() because the kernel is going to return
> > to the user context (with an EFAULT error code from whatever system call was
> > attempting the copy_from_user()).
> > 
> > But in the core dump case there is no return to user. The process is being
> > terminated by the signal that leads to this core dump. So even though you
> > may consider the page being accessed to be a "user" page, you can't fix
> > it by queueing work to run on return to user.
> 
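
For anyone following along, the branch logic quoted above can be modeled in
userspace roughly as below. This is only a sketch under stated assumptions,
not kernel code: the MODEL_* flag values, struct model_outcome and
model_kernel_mce() are made-up names for illustration, and the "panic" is
just a flag so that the model can be exercised.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits; the values are arbitrary for this model. */
#define MODEL_IN_KERNEL_RECOV  (1u << 0)
#define MODEL_IN_KERNEL_COPYIN (1u << 1)

/* What the modeled handler decided to do (hypothetical type). */
struct model_outcome {
	bool resumed_at_fixup;	/* execution continues at the extable fixup */
	bool task_work_queued;	/* memory_failure() deferred to process context */
	bool panicked;		/* no fixup entry was found */
};

/*
 * Mirrors the two "if" blocks quoted above. has_fixup stands in for the
 * return value of fixup_exception(); a real mce_panic() never returns,
 * which is why nothing is queued after a modeled panic.
 */
static struct model_outcome model_kernel_mce(unsigned int kflags, bool has_fixup)
{
	struct model_outcome out = { false, false, false };

	if (kflags & MODEL_IN_KERNEL_RECOV) {
		if (has_fixup)
			out.resumed_at_fixup = true;
		else
			out.panicked = true;
	}

	if (!out.panicked && (kflags & MODEL_IN_KERNEL_COPYIN))
		out.task_work_queued = true;

	return out;
}
```

With only the RECOV bit set the model resumes at the fixup and queues
nothing; with both bits set it additionally queues the deferred work, which
is the behavior being proposed for the MC-safe exception types.
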
> For coredump, the task work will be called too; see the following call chain:
> 
> get_signal
> 	sig_kernel_coredump
> 		elf_core_dump
> 			dump_user_range
> 				_copy_from_iter // with MC-safe copy, return without panic
> 	do_group_exit(ksig->info.si_signo);
> 		do_exit
> 			exit_task_work
> 				task_work_run
> 					kill_me_never
> 						memory_failure
> 
> I also added a debug print to check the memory_failure() processing after
> adding MCE_IN_KERNEL_COPYIN to the MCE_SAFE exception types, and tested CoW
> of normal pages and huge pages; it works too.

Sounds nice to me.
Maybe this information is worth documenting in the patch description.
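
As a possible starting point for such a description, here is a minimal
userspace sketch of the ordering in the call chain above: work queued while
the dump is being written still runs when the task exits. It is an
illustration only, assuming a simple pending list; struct model_work,
struct model_task and the model_* functions are hypothetical stand-ins and
do not reflect the real task_work implementation.

```c
#include <assert.h>
#include <stddef.h>

/* One deferred callback (hypothetical stand-in for a queued task work). */
struct model_work {
	struct model_work *next;
	void (*func)(void *data);
	void *data;
};

/* Only the part of a task that matters here: its pending work list. */
struct model_task {
	struct model_work *pending;
};

static void model_task_work_add(struct model_task *t, struct model_work *w)
{
	w->next = t->pending;
	t->pending = w;
}

/* Runs and removes every pending callback; returns how many ran. */
static int model_task_work_run(struct model_task *t)
{
	int ran = 0;

	while (t->pending) {
		struct model_work *w = t->pending;

		t->pending = w->next;
		w->func(w->data);
		ran++;
	}
	return ran;
}

/* Stand-in for kill_me_never() -> memory_failure(): counts handled pages. */
static void model_kill_me_never(void *data)
{
	int *handled = data;

	(*handled)++;
}

/* Returns the number of poisoned pages handled by the time the task exits. */
static int model_coredump_then_exit(int copy_hits_poison)
{
	struct model_task task = { NULL };
	int handled = 0;
	struct model_work work = { NULL, model_kill_me_never, &handled };

	/* dump_user_range(): in the real flow the #MC handler queues the work. */
	if (copy_hits_poison)
		model_task_work_add(&task, &work);

	/* do_group_exit() -> do_exit() -> exit_task_work() -> task_work_run() */
	model_task_work_run(&task);
	return handled;
}
```

The point of the model is only that there is no "return to user" step
anywhere in it: the queued callback is drained on the exit path instead.
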

Thanks,
Naoya Horiguchi
