From: "Luck, Tony" <tony.luck@intel.com>
To: "Kefeng Wang" <wangkefeng.wang@huawei.com>,
	"HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
Cc: "chu, jane" <jane.chu@oracle.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Miaohe Lin <linmiaohe@huawei.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Tong Tiangen <tongtiangen@huawei.com>,
	Jens Axboe <axboe@kernel.dk>
Subject: RE: [PATCH v2] mm: hwpoison: coredump: support recovery from dump_user_range()
Date: Wed, 26 Apr 2023 15:45:42 +0000
Message-ID: <SJ1PR11MB60833517FCAA19AC5F20FC3CFC659@SJ1PR11MB6083.namprd11.prod.outlook.com>
In-Reply-To: <6b350187-a9a5-fb37-79b1-bf69068f0182@huawei.com>

> >> Thanks for confirming. What is your opinion on adding
> >> MCE_IN_KERNEL_COPYIN to the EX_TYPE_DEFAULT_MCE_SAFE/FAULT_MCE_SAFE types
> >> so that do_machine_check() calls queue_task_work(&m, msg, kill_me_never),
> >> which would eliminate every memory_failure_queue() call after an MC-safe
> >> copy returns?
> >
> > I haven't been following this thread closely. Can you give a link to the e-mail
> > where you posted a patch that does this? Or just repost that patch if easier.
>
> The main diff is in [1]. I will post a formal patch when -rc1 is out,
> thanks.
>
> [1]
> https://lore.kernel.org/linux-mm/6dc1b117-020e-be9e-7e5e-a349ffb7d00a@huawei.com/

There seem to be a few misconceptions in that message. Not sure if all of them
were resolved.  Here are some pertinent points:

>>> In my understanding, an MCE should not be triggered when MC-safe copy
>>> tries to access to a memory error.  So I feel that we might be talking
>>> about different scenarios.

This is wrong. There is still a machine check when an MC-safe copy does a read
from a location that has a memory error.

The recovery flow in this case does not involve queue_task_work(). That is only
useful for machine check exceptions taken in user context. The queued work will
be executed to call memory_failure() from the kernel, but in process context (not
from the machine check exception stack) to handle the error.
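
As an illustration only (a user-space toy model, not kernel code), the
"queue now, handle later in process context" pattern can be sketched as below.
Every name here (toy_task, toy_machine_check, toy_return_to_user, ...) is
hypothetical and does not correspond to a real kernel symbol.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are the kernel's real definitions. */
struct toy_task {
	void (*pending_work)(struct toy_task *t); /* callback queued by the handler */
	unsigned long poisoned_pfn;               /* page frame that had the error */
	bool page_handled;                        /* set once the deferred work ran */
};

/* Stand-in for the work that would call memory_failure() in process context. */
static void toy_handle_poison(struct toy_task *t)
{
	t->page_handled = true;
}

/* "Exception context": only record what happened and queue the callback. */
static void toy_machine_check(struct toy_task *t, unsigned long pfn)
{
	assert(t != NULL);
	t->poisoned_pfn = pfn;
	t->pending_work = toy_handle_poison;
}

/* "Process context": run any queued work on the way back to user mode. */
static void toy_return_to_user(struct toy_task *t)
{
	if (t->pending_work) {
		void (*work)(struct toy_task *) = t->pending_work;

		t->pending_work = NULL;
		work(t);
	}
}
```

In this model the page is only handled once toy_return_to_user() runs, which
is why the approach depends on the task actually taking that path.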

For machine checks taken by kernel code (MC-safe copy functions) the recovery
path is here:

                if (m.kflags & MCE_IN_KERNEL_RECOV) {
                        if (!fixup_exception(regs, X86_TRAP_MC, 0, 0))
                                mce_panic("Failed kernel mode recovery", &m, msg);
                }

                if (m.kflags & MCE_IN_KERNEL_COPYIN)
                        queue_task_work(&m, msg, kill_me_never);

The fixup_exception() call ensures that, on return from the machine check
handler, execution resumes at the extable[] fixup location instead of at the
instruction that was loading from the memory error location.
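
To make the redirect concrete, here is a simplified user-space sketch of an
exception-table lookup. It is an assumption-laden model: the real x86 table
stores relative offsets and handler types, and the names toy_extable,
toy_regs and toy_fixup_exception are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: plain absolute addresses are used here for clarity. */
struct toy_extable_entry {
	unsigned long insn;   /* address of an instruction that may fault */
	unsigned long fixup;  /* address to resume at if it does */
};

struct toy_regs {
	unsigned long ip;     /* saved instruction pointer */
};

/* Must stay sorted by insn so that it can be binary searched. */
static const struct toy_extable_entry toy_extable[] = {
	{ 0x1000, 0x9000 },
	{ 0x1040, 0x9010 },
	{ 0x2000, 0x9020 },
};

static const struct toy_extable_entry *toy_search_extable(unsigned long ip)
{
	size_t lo = 0, hi = sizeof(toy_extable) / sizeof(toy_extable[0]);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (toy_extable[mid].insn == ip)
			return &toy_extable[mid];
		if (toy_extable[mid].insn < ip)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/* Returns true and redirects regs->ip if the faulting instruction is listed. */
static bool toy_fixup_exception(struct toy_regs *regs)
{
	const struct toy_extable_entry *e;

	assert(regs != NULL);
	e = toy_search_extable(regs->ip);
	if (!e)
		return false;   /* no entry: the caller cannot recover */
	regs->ip = e->fixup;
	return true;
}
```

A false return in this model corresponds to the mce_panic() branch in the
excerpt above: there is no safe place to resume.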

When the exception was from one of the copy_from_user() variants it makes
sense to also do the queue_task_work() because the kernel is going to return
to the user context (with an EFAULT error code from whatever system call was
attempting the copy_from_user()).

But in the core dump case there is no return to user. The process is being
terminated by the signal that leads to this core dump. So even though you
may consider the page being accessed to be a "user" page, you can't fix
it by queueing work to run on return to user.

I don't have a well-thought-out suggestion on how to make sure that
memory_failure() is called for the page in this case. Maybe the core dump code
can check the return value from the MC-safe copy it is using and handle it in
the error path?
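
A rough user-space sketch of that idea follows. It is not a patch: it assumes
the MC-safe copy reports the number of bytes it could not copy, and
toy_copy_mc(), toy_report_poison() and toy_dump_page() are hypothetical
stand-ins for whatever the core dump code and the memory-failure code would
really use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u

/* Hypothetical bookkeeping standing in for reporting the page to the
 * memory-failure handling code. */
static unsigned long toy_last_reported_pfn;
static int toy_reports;

static void toy_report_poison(unsigned long pfn)
{
	toy_last_reported_pfn = pfn;
	toy_reports++;
}

/* Stand-in for an MC-safe copy: returns the number of bytes NOT copied.
 * 'poison_at' simulates the offset of the bad location (>= len means none). */
static size_t toy_copy_mc(void *dst, const void *src, size_t len,
			  size_t poison_at)
{
	size_t ok = poison_at < len ? poison_at : len;

	memcpy(dst, src, ok);
	return len - ok;
}

/* Dump one page; on a short copy, report the page and tell the caller. */
static int toy_dump_page(void *buf, const void *page, unsigned long pfn,
			 size_t poison_at)
{
	size_t left;

	assert(buf != NULL && page != NULL);
	left = toy_copy_mc(buf, page, TOY_PAGE_SIZE, poison_at);
	if (left) {
		toy_report_poison(pfn);  /* error path: hand off the bad page */
		return -1;
	}
	return 0;
}
```

The point of the sketch is only that the caller of the copy is the one place
that both knows which page was being read and sees the failure.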

-Tony
