From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Tony Luck <tony.luck@intel.com>, Borislav Petkov <bp@alien8.de>,
Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>,
Dave Hansen <dave.hansen@linux.intel.com>, <x86@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
<linux-edac@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<linux-mm@kvack.org>, <jane.chu@oracle.com>,
Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: [PATCH] x86/mce: set MCE_IN_KERNEL_COPYIN for all MC-Safe Copy
Date: Mon, 8 May 2023 10:22:33 +0800 [thread overview]
Message-ID: <20230508022233.13890-1-wangkefeng.wang@huawei.com> (raw)
Both EX_TYPE_FAULT_MCE_SAFE and EX_TYPE_DEFAULT_MCE_SAFE exception
fixup types are used to identify fixups which allow in kernel #MC
recovery, that is the Machine Check Safe Copy.
For now, the MCE_IN_KERNEL_COPYIN flag is only set for EX_TYPE_COPY
and EX_TYPE_UACCESS when copy from user, and corrupted page is
isolated in this case, for MC-safe copy, memory_failure() is not
always called, some places, like __wp_page_copy_user, copy_subpage,
copy_user_gigantic_page and ksm_might_need_to_copy manually call
memory_failure_queue() to cope with such unhandled error pages,
recently coredump hwposion recovery support[1] is asked to do the
same thing, and there are some other already existed MC-safe copy
scenarios, eg, nvdimm, dm-writecache, dax, which has similar issue.
The best way to fix them is set MCE_IN_KERNEL_COPYIN to MCE_SAFE
exception, then kill_me_never() will be queued to call memory_failure()
in do_machine_check() to isolate corrupted page, which avoid calling
memory_failure_queue() after every MC-safe copy return.
[1] https://lkml.kernel.org/r/20230417045323.11054-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
arch/x86/kernel/cpu/mce/severity.c | 3 +--
mm/ksm.c | 1 -
mm/memory.c | 12 +++---------
3 files changed, 4 insertions(+), 12 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index c4477162c07d..63e94484c5d6 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -293,12 +293,11 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs)
case EX_TYPE_COPY:
if (!copy_user)
return IN_KERNEL;
- m->kflags |= MCE_IN_KERNEL_COPYIN;
fallthrough;
case EX_TYPE_FAULT_MCE_SAFE:
case EX_TYPE_DEFAULT_MCE_SAFE:
- m->kflags |= MCE_IN_KERNEL_RECOV;
+ m->kflags |= MCE_IN_KERNEL_RECOV | MCE_IN_KERNEL_COPYIN;
return IN_KERNEL_RECOV;
default:
diff --git a/mm/ksm.c b/mm/ksm.c
index 0156bded3a66..7abdf4892387 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2794,7 +2794,6 @@ struct page *ksm_might_need_to_copy(struct page *page,
if (new_page) {
if (copy_mc_user_highpage(new_page, page, address, vma)) {
put_page(new_page);
- memory_failure_queue(page_to_pfn(page), 0);
return ERR_PTR(-EHWPOISON);
}
SetPageDirty(new_page);
diff --git a/mm/memory.c b/mm/memory.c
index 5e2c6b1fc00e..c0f586257017 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2814,10 +2814,8 @@ static inline int __wp_page_copy_user(struct page *dst, struct page *src,
unsigned long addr = vmf->address;
if (likely(src)) {
- if (copy_mc_user_highpage(dst, src, addr, vma)) {
- memory_failure_queue(page_to_pfn(src), 0);
+ if (copy_mc_user_highpage(dst, src, addr, vma))
return -EHWPOISON;
- }
return 0;
}
@@ -5852,10 +5850,8 @@ static int copy_user_gigantic_page(struct folio *dst, struct folio *src,
cond_resched();
if (copy_mc_user_highpage(dst_page, src_page,
- addr + i*PAGE_SIZE, vma)) {
- memory_failure_queue(page_to_pfn(src_page), 0);
+ addr + i*PAGE_SIZE, vma))
return -EHWPOISON;
- }
}
return 0;
}
@@ -5871,10 +5867,8 @@ static int copy_subpage(unsigned long addr, int idx, void *arg)
struct copy_subpage_arg *copy_arg = arg;
if (copy_mc_user_highpage(copy_arg->dst + idx, copy_arg->src + idx,
- addr, copy_arg->vma)) {
- memory_failure_queue(page_to_pfn(copy_arg->src + idx), 0);
+ addr, copy_arg->vma))
return -EHWPOISON;
- }
return 0;
}
--
2.35.3
next reply other threads:[~2023-05-08 2:09 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-08 2:22 Kefeng Wang [this message]
2023-05-08 5:47 ` HORIGUCHI NAOYA(堀口 直也)
2023-05-18 2:03 ` Kefeng Wang
2023-05-19 16:17 ` Luck, Tony
2023-05-22 1:26 ` Kefeng Wang
2023-05-22 18:02 ` Luck, Tony
2023-05-23 1:34 ` Kefeng Wang
2023-05-24 11:23 ` Kefeng Wang
2023-05-25 17:18 ` Dave Hansen
2023-05-26 1:57 ` Kefeng Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230508022233.13890-1-wangkefeng.wang@huawei.com \
--to=wangkefeng.wang@huawei.com \
--cc=akpm@linux-foundation.org \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=jane.chu@oracle.com \
--cc=linux-edac@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mingo@redhat.com \
--cc=naoya.horiguchi@nec.com \
--cc=tglx@linutronix.de \
--cc=tony.luck@intel.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox