From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, gpiccoli@canonical.com,
	keescook@chromium.org, linux-mm@kvack.org, mcgrof@kernel.org,
	mm-commits@vger.kernel.org, rdunlap@infradead.org,
	tglx@linutronix.de, torvalds@linux-foundation.org,
	vbabka@suse.cz, willy@infradead.org, yzaikin@google.com
Subject: [patch 14/54] panic: add sysctl to dump all CPUs backtraces on oops event
Date: Sun, 07 Jun 2020 21:40:48 -0700
Message-ID: <20200608044048.aELgRjIO6%akpm@linux-foundation.org>
In-Reply-To: <20200607212615.b050e41fac139a1e16fe00bd@linux-foundation.org>

From: "Guilherme G. Piccoli" <gpiccoli@canonical.com>
Subject: panic: add sysctl to dump all CPUs backtraces on oops event

Usually when the kernel reaches an oops condition, it's a point of no
return. If the kernel splat does not carry enough debug information, one
of the last resorts is to collect a kernel crash dump and analyze it.  The
problem with this approach is that collecting the dump requires a panic
(to kexec into the crash kernel).  In an environment with multiple virtual
machines, users may prefer to try living with the oops, at least until
they can properly shut down their VMs or finish their important tasks.

This patch implements a way to collect a bit more debug detail when an
oops event is reached, by printing the backtraces of all CPUs using NMIs
(on architectures that support them).  The sysctl added (and documented)
here is called "oops_all_cpu_backtrace"; when set, it dumps the backtraces
of all CPUs, as the name suggests.
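
As a hedged illustration of how userspace might toggle the knob described
above, the sketch below writes 0 or 1 to a sysctl file and reads it back.
The path macro follows the procname added by this patch; the helper names
set_flag() and get_flag() are hypothetical and not part of the patch.
Writing the real file is assumed to need root and a CONFIG_SMP kernel that
carries this change.

```c
#include <assert.h>
#include <stdio.h>

/* Path implied by the patch's procname; absent on kernels without it. */
#define OOPS_ALL_CPU_BT_PATH "/proc/sys/kernel/oops_all_cpu_backtrace"

/*
 * Hypothetical helper: write 0 or 1 to the file at @path.
 * Returns 0 on success, -1 on any failure.
 */
int set_flag(const char *path, int enable)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", enable ? 1 : 0) > 0;
	/* Buffered data is flushed at fclose(), so a rejected write surfaces there. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Hypothetical helper: read the integer stored in the file at @path.
 * Returns 0 on success and fills *@value, -1 on any failure.
 */
int get_flag(const char *path, int *value)
{
	FILE *f;
	int ok;

	assert(path != NULL && value != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = fscanf(f, "%d", value) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

A caller would pass OOPS_ALL_CPU_BT_PATH to set_flag() to enable the dump
and check the return value, since the file may not exist on a given kernel.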

Although far from ideal, this may be the last option for users who for
some reason cannot panic on oops.  Most of the time oopses are clear enough
to indicate the kernel portion that must be investigated, but in virtual
environments it's possible to observe hypervisor/KVM issues that could
lead to oopses shown in other guests' CPUs (like virtual APIC crashes).
This patch hence aims to help debug such complex issues without resorting
to kdump.

Link: http://lkml.kernel.org/r/20200327224116.21030-1-gpiccoli@canonical.com
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/sysctl/kernel.rst |   16 ++++++++++++++++
 include/linux/kernel.h                      |    6 ++++++
 kernel/panic.c                              |   11 +++++++++++
 kernel/sysctl.c                             |   11 +++++++++++
 4 files changed, 44 insertions(+)

--- a/Documentation/admin-guide/sysctl/kernel.rst~panic-add-sysctl-to-dump-all-cpus-backtraces-on-oops-event
+++ a/Documentation/admin-guide/sysctl/kernel.rst
@@ -646,6 +646,22 @@ rate for each task.
 scanned for a given scan.
 
 
+oops_all_cpu_backtrace:
+=======================
+
+If this option is set, the kernel will send an NMI to all CPUs to dump
+their backtraces when an oops event occurs. It should be used as a last
+resort in case a panic cannot be triggered (to protect VMs running, for
+example) or kdump can't be collected. This file shows up if CONFIG_SMP
+is enabled.
+
+0: Won't show all CPUs backtraces when an oops is detected.
+This is the default behavior.
+
+1: Will non-maskably interrupt all CPUs and dump their backtraces when
+an oops event is detected.
+
+
 osrelease, ostype & version
 ===========================
 
--- a/include/linux/kernel.h~panic-add-sysctl-to-dump-all-cpus-backtraces-on-oops-event
+++ a/include/linux/kernel.h
@@ -520,6 +520,12 @@ static inline u32 int_sqrt64(u64 x)
 }
 #endif
 
+#ifdef CONFIG_SMP
+extern unsigned int sysctl_oops_all_cpu_backtrace;
+#else
+#define sysctl_oops_all_cpu_backtrace 0
+#endif /* CONFIG_SMP */
+
 extern void bust_spinlocks(int yes);
 extern int oops_in_progress;		/* If set, an oops, panic(), BUG() or die() is in progress */
 extern int panic_timeout;
--- a/kernel/panic.c~panic-add-sysctl-to-dump-all-cpus-backtraces-on-oops-event
+++ a/kernel/panic.c
@@ -36,6 +36,14 @@
 #define PANIC_TIMER_STEP 100
 #define PANIC_BLINK_SPD 18
 
+#ifdef CONFIG_SMP
+/*
+ * Should we dump all CPUs backtraces in an oops event?
+ * Defaults to 0, can be changed via sysctl.
+ */
+unsigned int __read_mostly sysctl_oops_all_cpu_backtrace;
+#endif /* CONFIG_SMP */
+
 int panic_on_oops = CONFIG_PANIC_ON_OOPS_VALUE;
 static unsigned long tainted_mask =
 	IS_ENABLED(CONFIG_GCC_PLUGIN_RANDSTRUCT) ? (1 << TAINT_RANDSTRUCT) : 0;
@@ -522,6 +530,9 @@ void oops_enter(void)
 	/* can't trust the integrity of the kernel anymore: */
 	debug_locks_off();
 	do_oops_enter_exit();
+
+	if (sysctl_oops_all_cpu_backtrace)
+		trigger_all_cpu_backtrace();
 }
 
 /*
--- a/kernel/sysctl.c~panic-add-sysctl-to-dump-all-cpus-backtraces-on-oops-event
+++ a/kernel/sysctl.c
@@ -2150,6 +2150,17 @@ static struct ctl_table kern_table[] = {
 		.proc_handler	= proc_dointvec,
 	},
 #endif
+#ifdef CONFIG_SMP
+	{
+		.procname	= "oops_all_cpu_backtrace",
+		.data		= &sysctl_oops_all_cpu_backtrace,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE,
+	},
+#endif /* CONFIG_SMP */
 	{
 		.procname	= "pid_max",
 		.data		= &pid_max,
_
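
The sysctl entry above uses proc_dointvec_minmax with SYSCTL_ZERO and
SYSCTL_ONE as bounds, so only 0 and 1 are accepted.  The userspace sketch
below approximates that range check for illustration; it is not the
kernel's parser, and store_bounded_int() is a hypothetical name.  It
assumes an out-of-range or malformed write is rejected with -EINVAL and
leaves the stored value untouched.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical model of a bounded integer store: parse @input as a
 * decimal integer and store it in *@dst only if it lies in [@min, @max].
 * Returns 0 on success, -EINVAL otherwise (and *@dst is not modified).
 */
int store_bounded_int(const char *input, int min, int max, int *dst)
{
	char *end;
	long val;

	assert(input != NULL && dst != NULL);
	errno = 0;
	val = strtol(input, &end, 10);
	if (end == input || errno == ERANGE)
		return -EINVAL;
	/* Allow trailing whitespace such as the newline from "echo 1". */
	while (*end == ' ' || *end == '\t' || *end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;
	if (val < min || val > max)
		return -EINVAL;
	*dst = (int)val;
	return 0;
}
```

With bounds 0 and 1, an input of "1\n" is stored, while "2" or "-1" is
refused and the previous setting stays in effect.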