From: Seiji Aguchi <seiji.aguchi@hds.com>
To: "hpa@zytor.com" <hpa@zytor.com>,
"andi@firstfloor.org" <andi@firstfloor.org>,
"ebiederm@xmission.com" <ebiederm@xmission.com>,
"bp@alien8.de" <bp@alien8.de>,
"seto.hidetoshi@jp.fujitsu.com" <seto.hidetoshi@jp.fujitsu.com>,
"gregkh@suse.de" <gregkh@suse.de>,
"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"x86@kernel.org" <x86@kernel.org>,
"dle-develop@lists.sourceforge.net"
<dle-develop@lists.sourceforge.net>,
"amwang@redhat.com" <amwang@redhat.com>
Cc: Satoru Moriya <satoru.moriya@hds.com>
Subject: [RFC][PATCH v2] Controlling kexec behaviour when hardware error happened.
Date: Wed, 9 Feb 2011 11:35:43 -0500
Message-ID: <5C4C569E8A4B9B42A84A977CF070A35B2C1494DBE0@USINDEVS01.corp.hds.com>
Hi,
I submitted a very similar patch last December:
http://www.spinics.net/lists/linux-mm/msg13157.html
I am resubmitting it with a revised description of its purpose.
[Changelog]
from v1:
- Rename the sysctl parameter from kexec_on_mce to kexec_on_hwerr.
- Move the variable declarations from <asm/mce.h> to <linux/kernel.h> (the definitions are in kernel/panic.c).
- Remove CONFIG_X86_MCE in *.c files.
- Revise the [Purpose] and [Patch Description] sections.
[Purpose]
Enterprise servers have firmware/hardware logging features such as SEL and BMC.
When an MCE occurs, we investigate the firmware/hardware logs first and replace the broken hardware.
A memory dump is therefore not needed to find the root cause of a machine check.
Skipping kdump also reduces downtime.
Of course, many servers do not have firmware/hardware logging features.
So I propose an option that controls kexec behaviour when a hardware error has occurred.
[Patch Description]
This patch adds a sysctl option, kernel.kexec_on_hwerr, that controls kexec behaviour when a hardware error has occurred.
- Permission
  - 0644
- Value (default is "1")
  - non-zero: Kexec is enabled regardless of hardware errors.
  - 0: Kexec is disabled when a hardware error (e.g. an MCE) has occurred.
Matrix of kernel.kexec_on_hwerr value, hardware error and kexec:
---------------------------------------------------
kernel.kexec_on_hwerr | hardware error | kexec
---------------------------------------------------
non-zero              | occurred       | enabled
                      |----------------------------
                      | not occurred   | enabled
---------------------------------------------------
0                     | occurred       | disabled
                      |----------------------------
                      | not occurred   | enabled
---------------------------------------------------
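The matrix above can be sketched as a small userspace C model. This is only an illustration, not part of the patch: in the patch, kexec_should_skip() takes no arguments and reads the kernel globals kexec_on_hwerr and hwerr_flag, whereas here both values are passed as parameters so the logic can be tried outside the kernel. kexec_enabled() is a hypothetical helper added for this sketch, and the printk() warning is omitted.

```c
#include <stdbool.h>

/*
 * Userspace model of the decision made by kexec_should_skip() in the patch.
 * Assumption: the two parameters stand in for the kernel globals of the
 * same names; the real function reads those globals directly.
 */
bool kexec_should_skip(int kexec_on_hwerr, int hwerr_flag)
{
	/* Skip only when the sysctl is 0 AND a hardware error was recorded. */
	return !kexec_on_hwerr && hwerr_flag;
}

/*
 * Hypothetical helper (not in the patch): true when panic() would go on
 * to call crash_kexec(), i.e. the "kexec" column of the matrix.
 */
bool kexec_enabled(int kexec_on_hwerr, int hwerr_flag)
{
	return !kexec_should_skip(kexec_on_hwerr, hwerr_flag);
}
```

Of the four rows, only kexec_enabled(0, 1) is false. Any non-zero sysctl value behaves like 1.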
Any comments and suggestions are welcome.
Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
---
Documentation/sysctl/kernel.txt | 11 +++++++++++
arch/x86/kernel/cpu/mcheck/mce.c | 2 ++
include/linux/kernel.h | 2 ++
include/linux/sysctl.h | 1 +
kernel/panic.c | 15 ++++++++++++++-
kernel/sysctl.c | 8 ++++++++
kernel/sysctl_binary.c | 1 +
mm/memory-failure.c | 2 ++
8 files changed, 41 insertions(+), 1 deletions(-)
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 11d5ced..3159111 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -34,6 +34,7 @@ show up in /proc/sys/kernel:
- hotplug
- java-appletviewer [ binfmt_java, obsolete ]
- java-interpreter [ binfmt_java, obsolete ]
+- kexec_on_hwerr [ X86 only ]
- kptr_restrict
- kstack_depth_to_print [ X86 only ]
- l2cr [ PPC only ]
@@ -261,6 +262,16 @@ This flag controls the L2 cache of G3 processor boards. If
0, the cache is disabled. Enabled if nonzero.
==============================================================
+kexec_on_hwerr: (X86 only)
+
+Controls the behaviour of kexec when a panic occurs due to a hardware
+error.
+Default value is 1.
+
+0: Kexec is disabled.
+non-zero: Kexec is enabled.
+
+==============================================================
kptr_restrict:
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index d916183..e76b47b 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -944,6 +944,8 @@ void do_machine_check(struct pt_regs *regs, long error_code)
percpu_inc(mce_exception_count);
+ hwerr_flag = 1;
+
if (notify_die(DIE_NMI, "machine check", regs, error_code,
18, SIGKILL) == NOTIFY_STOP)
goto out;
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 2fe6e84..c2fba7c 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -242,6 +242,8 @@ extern void add_taint(unsigned flag);
extern int test_taint(unsigned flag);
extern unsigned long get_taint(void);
extern int root_mountflags;
+extern int kexec_on_hwerr;
+extern int hwerr_flag;
extern bool early_boot_irqs_disabled;
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 7bb5cb6..8ae5bfe 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -153,6 +153,7 @@ enum
KERN_MAX_LOCK_DEPTH=74, /* int: rtmutex's maximum lock depth */
KERN_NMI_WATCHDOG=75, /* int: enable/disable nmi watchdog */
KERN_PANIC_ON_NMI=76, /* int: whether we will panic on an unrecovered */
+ KERN_KEXEC_ON_HWERR=77, /* int: behaviour of kexec for hardware error */
};
diff --git a/kernel/panic.c b/kernel/panic.c
index 991bb87..84c1d2e 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -28,6 +28,8 @@
#define PANIC_BLINK_SPD 18
int panic_on_oops;
+int kexec_on_hwerr = 1;
+int hwerr_flag;
static unsigned long tainted_mask;
static int pause_on_oops;
static int pause_on_oops_flag;
@@ -45,6 +47,16 @@ static long no_blink(int state)
return 0;
}
+static int kexec_should_skip(void)
+{
+ if (!kexec_on_hwerr && hwerr_flag) {
+ printk(KERN_WARNING "Kexec is skipped because hardware error "
+ "occurred.\n");
+ return 1;
+ }
+ return 0;
+}
+
/* Returns how long it waited in ms */
long (*panic_blink)(int state);
EXPORT_SYMBOL(panic_blink);
@@ -86,7 +98,8 @@ NORET_TYPE void panic(const char * fmt, ...)
* everything else.
* Do we want to call this before we try to display a message?
*/
- crash_kexec(NULL);
+ if (!kexec_should_skip())
+ crash_kexec(NULL);
kmsg_dump(KMSG_DUMP_PANIC);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 0f1bd83..f78edd8 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -811,6 +811,14 @@ static struct ctl_table kern_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec,
},
+ {
+ .procname = "kexec_on_hwerr",
+ .data = &kexec_on_hwerr,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec,
+ },
+
#endif
#if defined(CONFIG_MMU)
{
diff --git a/kernel/sysctl_binary.c b/kernel/sysctl_binary.c
index b875bed..8d572ca 100644
--- a/kernel/sysctl_binary.c
+++ b/kernel/sysctl_binary.c
@@ -137,6 +137,7 @@ static const struct bin_table bin_kern_table[] = {
{ CTL_INT, KERN_COMPAT_LOG, "compat-log" },
{ CTL_INT, KERN_MAX_LOCK_DEPTH, "max_lock_depth" },
{ CTL_INT, KERN_PANIC_ON_NMI, "panic_on_unrecovered_nmi" },
+ { CTL_INT, KERN_KEXEC_ON_HWERR, "kexec_on_hwerr" },
{}
};
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 0207c2f..0178f47 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -994,6 +994,8 @@ int __memory_failure(unsigned long pfn, int trapno, int flags)
int res;
unsigned int nr_pages;
+ hwerr_flag = 1;
+
if (!sysctl_memory_failure_recovery)
panic("Memory failure from trap %d on page %lx", trapno, pfn);
--
1.7.1