* [PATCH v3 1/3] mm/memory-failure: report MF_MSG_KERNEL for reserved pages
2026-04-13 13:26 [PATCH v3 0/3] mm/memory-failure: add panic option for unrecoverable pages Breno Leitao
@ 2026-04-13 13:26 ` Breno Leitao
2026-04-13 13:26 ` [PATCH v3 2/3] mm/memory-failure: add CONFIG_BOOTPARAM_MEMORY_FAILURE_PANIC option Breno Leitao
2026-04-13 13:26 ` [PATCH v3 3/3] Documentation: document panic_on_unrecoverable_memory_failure sysctl Breno Leitao
2 siblings, 0 replies; 4+ messages in thread
From: Breno Leitao @ 2026-04-13 13:26 UTC (permalink / raw)
To: Miaohe Lin, Naoya Horiguchi, Andrew Morton, Jonathan Corbet,
Shuah Khan, David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko
Cc: linux-mm, linux-kernel, linux-doc, Breno Leitao, kernel-team
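When get_hwpoison_page() returns a negative value, distinguish
reserved pages from other failure cases by reporting MF_MSG_KERNEL
instead of MF_MSG_GET_HWPOISON. Reserved pages belong to the kernel
and should be classified accordingly so that the
panic_on_unrecoverable_memory_failure mechanism can act on them.

Also add the vm.panic_on_unrecoverable_memory_failure sysctl (default 0).
When it is set to 1, action_result() panics if the result is MF_IGNORED
and the page type is MF_MSG_KERNEL, MF_MSG_KERNEL_HIGH_ORDER or
MF_MSG_UNKNOWN.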
Signed-off-by: Breno Leitao <leitao@debian.org>
---
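For reviewers, the decision this patch adds can be modelled in userspace.
The sketch below is illustrative only: the enums are cut-down stand-ins
for the kernel definitions, and panic_sysctl, would_panic(),
classify_get_failure() and demo_classification() are hypothetical names
that do not exist in the kernel. It mirrors the condition in
panic_on_unrecoverable_mf() and the new PageReserved() check.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel enums; only the members used here. */
enum mf_result { MF_IGNORED, MF_FAILED, MF_DELAYED, MF_RECOVERED };

enum mf_action_page_type {
	MF_MSG_KERNEL,
	MF_MSG_KERNEL_HIGH_ORDER,
	MF_MSG_GET_HWPOISON,
	MF_MSG_UNKNOWN,
};

/* Models sysctl_panic_on_unrecoverable_mf: 0 = continue, 1 = panic. */
static int panic_sysctl;

/* Same condition as panic_on_unrecoverable_mf() in the diff. */
static bool would_panic(enum mf_action_page_type type, enum mf_result result)
{
	return panic_sysctl &&
	       result == MF_IGNORED &&
	       (type == MF_MSG_KERNEL ||
		type == MF_MSG_KERNEL_HIGH_ORDER ||
		type == MF_MSG_UNKNOWN);
}

/*
 * Models the res < 0 branch of memory_failure(): a reserved page is
 * reported as MF_MSG_KERNEL, anything else as MF_MSG_GET_HWPOISON.
 */
static enum mf_action_page_type classify_get_failure(bool page_reserved)
{
	return page_reserved ? MF_MSG_KERNEL : MF_MSG_GET_HWPOISON;
}

/* Usage example: walk through the cases the commit message describes. */
static void demo_classification(void)
{
	panic_sysctl = 1;
	assert(would_panic(classify_get_failure(true), MF_IGNORED));
	assert(!would_panic(classify_get_failure(false), MF_IGNORED));
	assert(!would_panic(MF_MSG_KERNEL, MF_RECOVERED));

	panic_sysctl = 0;	/* default: never panic */
	assert(!would_panic(MF_MSG_UNKNOWN, MF_IGNORED));
}
```

With the sysctl set, only the reserved-page case reaches the panic
condition; a non-reserved get_hwpoison_page() failure is still reported
as MF_MSG_GET_HWPOISON and does not panic.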
mm/memory-failure.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 48 insertions(+), 1 deletion(-)
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index ee42d43613097..852c595aff108 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -74,6 +74,8 @@ static int sysctl_memory_failure_recovery __read_mostly = 1;
static int sysctl_enable_soft_offline __read_mostly = 1;
+static int sysctl_panic_on_unrecoverable_mf __read_mostly;
+
atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);
static bool hw_memory_failure __read_mostly = false;
@@ -155,6 +157,15 @@ static const struct ctl_table memory_failure_table[] = {
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_ONE,
+ },
+ {
+ .procname = "panic_on_unrecoverable_memory_failure",
+ .data = &sysctl_panic_on_unrecoverable_mf,
+ .maxlen = sizeof(sysctl_panic_on_unrecoverable_mf),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
}
};
@@ -1281,6 +1292,35 @@ static void update_per_node_mf_stats(unsigned long pfn,
++mf_stats->total;
}
+/*
+ * Determine whether to panic on an unrecoverable memory failure.
+ *
+ * Design rationale: panic immediately on unrecoverable kernel memory
+ * failures, so that the crash is clean instead of a random crash later
+ * on a page that was left in the MF_IGNORED state.
+ *
+ * Panic when the result is MF_IGNORED (the page is not being recovered)
+ * and the page type is one of:
+ * - MF_MSG_KERNEL: reserved pages that cannot be recovered
+ * - MF_MSG_KERNEL_HIGH_ORDER: high-order kernel pages that cannot be recovered
+ * - MF_MSG_UNKNOWN: pages whose state cannot be classified as recoverable
+ *
+ * Note: transient races are mitigated by memory_failure()'s retry mechanism.
+ * When a buddy allocator race is detected (take_page_off_buddy() fails), the
+ * code clears PageHWPoison and retries the entire memory_failure() flow,
+ * allowing pages to be reclassified with updated flags. This keeps
+ * transient states from being misclassified as unrecoverable.
+ */
+static bool panic_on_unrecoverable_mf(enum mf_action_page_type type,
+ enum mf_result result)
+{
+ return sysctl_panic_on_unrecoverable_mf &&
+ result == MF_IGNORED &&
+ (type == MF_MSG_KERNEL ||
+ type == MF_MSG_KERNEL_HIGH_ORDER ||
+ type == MF_MSG_UNKNOWN);
+}
+
/*
* "Dirty/Clean" indication is not 100% accurate due to the possibility of
* setting PG_dirty outside page lock. See also comment above set_page_dirty().
@@ -1298,6 +1338,9 @@ static int action_result(unsigned long pfn, enum mf_action_page_type type,
pr_err("%#lx: recovery action for %s: %s\n",
pfn, action_page_types[type], action_name[result]);
+ if (panic_on_unrecoverable_mf(type, result))
+ panic("Memory failure: %#lx: unrecoverable page", pfn);
+
return (result == MF_RECOVERED || result == MF_DELAYED) ? 0 : -EBUSY;
}
@@ -2432,7 +2475,11 @@ int memory_failure(unsigned long pfn, int flags)
}
goto unlock_mutex;
} else if (res < 0) {
- res = action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED);
+ if (PageReserved(p))
+ res = action_result(pfn, MF_MSG_KERNEL, MF_IGNORED);
+ else
+ res = action_result(pfn, MF_MSG_GET_HWPOISON,
+ MF_IGNORED);
goto unlock_mutex;
}
--
2.52.0
* [PATCH v3 2/3] mm/memory-failure: add CONFIG_BOOTPARAM_MEMORY_FAILURE_PANIC option
From: Breno Leitao @ 2026-04-13 13:26 UTC (permalink / raw)
Add a kernel configuration option to enable panic on unrecoverable
memory failures at boot time, similar to CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC
and CONFIG_BOOTPARAM_HARDLOCKUP_PANIC.
This allows systems that prefer to fail fast, rather than keep running
on corrupted kernel memory, to panic automatically when an unrecoverable
kernel memory failure is encountered. The behavior can still be
controlled at runtime via the panic_on_unrecoverable_memory_failure
sysctl.
When enabled, the kernel will panic if:
* A memory failure affects kernel pages that cannot be recovered
* A memory failure affects high-order kernel pages
* A memory failure affects unknown page types that cannot be recovered
Examples of BOOTPARAM configuration usage:
1. Building with the panic option enabled by default:
CONFIG_BOOTPARAM_MEMORY_FAILURE_PANIC=y
2. Disabling at runtime even when compiled in:
echo 0 > /proc/sys/vm/panic_on_unrecoverable_memory_failure
3. Enabling at runtime when not compiled in by default:
echo 1 > /proc/sys/vm/panic_on_unrecoverable_memory_failure
Similar to other BOOTPARAM options, this provides a balance between:
- Safe defaults (disabled by default without CONFIG option)
- Production flexibility (can be enabled at build time)
- Runtime control (can be toggled via sysctl)
This is consistent with the kernel's approach to other panic-on-error
options that allow systems to choose between attempting recovery or
failing fast when critical errors are detected.
Signed-off-by: Breno Leitao <leitao@debian.org>
---
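A userspace sketch of how the build-time default and the runtime
override interact, for illustration only. It assumes the Kconfig symbol
is modelled by a plain preprocessor define (the kernel uses IS_ENABLED()
instead), and MF_PANIC_DEFAULT, mf_panic_value, write_mf_panic() and
demo_build_default() are hypothetical names. The range check stands in
for proc_dointvec_minmax with extra1 = SYSCTL_ZERO and
extra2 = SYSCTL_ONE.

```c
#include <assert.h>
#include <errno.h>

/*
 * Stand-in for the Kconfig symbol: compile with
 * -DCONFIG_BOOTPARAM_MEMORY_FAILURE_PANIC to model "=y".
 */
#ifdef CONFIG_BOOTPARAM_MEMORY_FAILURE_PANIC
#define MF_PANIC_DEFAULT 1
#else
#define MF_PANIC_DEFAULT 0
#endif

/* Models sysctl_panic_on_unrecoverable_mf with its build-time default. */
static int mf_panic_value = MF_PANIC_DEFAULT;

/* Models a sysctl write: only 0 and 1 are accepted. */
static int write_mf_panic(int value)
{
	if (value < 0 || value > 1)
		return -EINVAL;	/* out of range, value left unchanged */
	mf_panic_value = value;
	return 0;
}

/* Usage example: the runtime write wins over the build-time default. */
static void demo_build_default(void)
{
	assert(mf_panic_value == MF_PANIC_DEFAULT);
	assert(write_mf_panic(1) == 0 && mf_panic_value == 1);
	assert(write_mf_panic(2) == -EINVAL && mf_panic_value == 1);
	assert(write_mf_panic(0) == 0 && mf_panic_value == 0);
}
```

The config option only chooses the initial value; whatever is written
through the sysctl afterwards takes effect regardless of how the kernel
was built.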
mm/Kconfig | 9 +++++++++
mm/memory-failure.c | 3 ++-
2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/mm/Kconfig b/mm/Kconfig
index ebd8ea353687e..596f24a872ff6 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -733,6 +733,15 @@ config MEMORY_FAILURE
even when some of its memory has uncorrected errors. This requires
special hardware support and typically ECC memory.
+config BOOTPARAM_MEMORY_FAILURE_PANIC
+ bool "Panic on unrecoverable memory failure"
+ depends on MEMORY_FAILURE
+ help
+ Say Y here to panic when an unrecoverable memory failure is
+ detected. This covers kernel pages, high-order kernel pages,
+ and unknown page types that cannot be recovered. Can be disabled
+ at runtime via the panic_on_unrecoverable_memory_failure sysctl.
+
config HWPOISON_INJECT
tristate "HWPoison pages injector"
depends on MEMORY_FAILURE && DEBUG_KERNEL && PROC_FS
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 852c595aff108..cf06960b4d069 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -74,7 +74,8 @@ static int sysctl_memory_failure_recovery __read_mostly = 1;
static int sysctl_enable_soft_offline __read_mostly = 1;
-static int sysctl_panic_on_unrecoverable_mf __read_mostly;
+static int sysctl_panic_on_unrecoverable_mf __read_mostly =
+ IS_ENABLED(CONFIG_BOOTPARAM_MEMORY_FAILURE_PANIC);
atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);
--
2.52.0
* [PATCH v3 3/3] Documentation: document panic_on_unrecoverable_memory_failure sysctl
From: Breno Leitao @ 2026-04-13 13:26 UTC (permalink / raw)
Document the vm.panic_on_unrecoverable_memory_failure sysctl in the
admin guide, including the CONFIG_BOOTPARAM_MEMORY_FAILURE_PANIC kernel
configuration option that allows enabling this behavior at build time.
This follows the same format as panic_on_unrecovered_nmi and other
panic-on-error documentation, providing clear examples of:
- Enabling panic at build time via CONFIG option
- Disabling at runtime via sysctl
- Enabling at runtime via sysctl
Signed-off-by: Breno Leitao <leitao@debian.org>
---
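For monitoring tools that want to check the setting instead of shelling
out to echo/cat, a minimal sketch in C follows. read_int_sysctl() is a
hypothetical helper, not an existing API, and the caller is assumed to
pass the /proc/sys path documented below. On kernels without this series
the file does not exist and the helper returns -1.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read an integer sysctl value from its /proc/sys file.
 * Returns the value, or -1 if the file is missing or cannot be parsed.
 */
static int read_int_sysctl(const char *path)
{
	FILE *f;
	int value;

	assert(path != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}
```

A caller would pass
"/proc/sys/vm/panic_on_unrecoverable_memory_failure" and treat 1 as
"panic enabled", 0 as "continue", and -1 as "not supported or not
readable".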
Documentation/admin-guide/sysctl/vm.rst | 46 +++++++++++++++++++++++++++++++++
1 file changed, 46 insertions(+)
diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 97e12359775c9..af545869bc1b4 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -67,6 +67,7 @@ Currently, these files are in /proc/sys/vm:
- page-cluster
- page_lock_unfairness
- panic_on_oom
+- panic_on_unrecoverable_memory_failure
- percpu_pagelist_high_fraction
- stat_interval
- stat_refresh
@@ -925,6 +926,51 @@ panic_on_oom=2+kdump gives you very strong tool to investigate
why oom happens. You can get snapshot.
+panic_on_unrecoverable_memory_failure
+======================================
+
+When a hardware memory error (e.g. multi-bit ECC) hits an in-use kernel
+page that cannot be recovered by the memory failure handler, the default
+behaviour is to ignore the error and continue operation. This is
+dangerous because the corrupted data remains accessible to the kernel,
+risking silent data corruption or a delayed crash when the poisoned
+memory is next accessed.
+
+Pages that reach this path include slab objects (dentry cache, inode
+cache, etc.), page tables, kernel stacks, and other kernel allocations
+that lack the reverse mapping needed to isolate all references.
+
+For many environments it is preferable to panic immediately with a clean
+crash dump that captures the original error context, rather than to
+continue and face a random crash later whose cause is difficult to
+diagnose.
+
+= =====================================================================
+0 Try to continue operation (default).
+1 Panic immediately. If the ``panic`` sysctl is also non-zero then the
+ machine will be rebooted.
+= =====================================================================
+
+This sysctl defaults to 1 when the kernel is built with the
+``CONFIG_BOOTPARAM_MEMORY_FAILURE_PANIC`` configuration option.
+This lets a system enforce panic-on-error behavior from the kernel
+build, without requiring runtime sysctl configuration.
+
+Examples:
+
+1. Enable panic on unrecoverable memory failure at kernel build time::
+
+ CONFIG_BOOTPARAM_MEMORY_FAILURE_PANIC=y
+
+2. Disable at runtime even when compiled in::
+
+ echo 0 > /proc/sys/vm/panic_on_unrecoverable_memory_failure
+
+3. Enable at runtime when not enabled at build time::
+
+ echo 1 > /proc/sys/vm/panic_on_unrecoverable_memory_failure
+
+
percpu_pagelist_high_fraction
=============================
--
2.52.0