* [PATCH v2] mm/kfence: add reboot notifier to disable KFENCE on shutdown
@ 2025-11-27 14:51 Breno Leitao
2026-01-13 14:02 ` Chris Mason
0 siblings, 1 reply; 3+ messages in thread
From: Breno Leitao @ 2025-11-27 14:51 UTC (permalink / raw)
To: Alexander Potapenko, Marco Elver, Dmitry Vyukov, Andrew Morton
Cc: kasan-dev, linux-mm, linux-kernel, kernel-team, stable, Breno Leitao
During system shutdown, KFENCE can cause IPI synchronization issues if
it remains active through the reboot process. To prevent this, register
a reboot notifier that disables KFENCE and cancels any pending timer
work early in the shutdown sequence.
This is only necessary when CONFIG_KFENCE_STATIC_KEYS is enabled, as
this configuration sends IPIs that can interfere with shutdown. Without
static keys, no IPIs are generated and KFENCE can safely remain active.
The notifier uses maximum priority (INT_MAX) to ensure KFENCE shuts
down before other subsystems that might still depend on stable memory
allocation behavior.
This fixes a late kexec CSD lockup[1] when kfence is trying to IPI a CPU
that is busy in an IRQ-disabled context printing characters to the
console.
Link: https://lore.kernel.org/all/sqwajvt7utnt463tzxgwu2yctyn5m6bjwrslsnupfexeml6hkd@v6sqmpbu3vvu/ [1]
Cc: stable@vger.kernel.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Marco Elver <elver@google.com>
Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
---
Changes in v2:
- Adding Fixes: tag and CCing stable (akpm)
- Link to v1: https://patch.msgid.link/20251126-kfence-v1-1-5a6e1d7c681c@debian.org
---
mm/kfence/core.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 727c20c94ac5..162a026871ab 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -26,6 +26,7 @@
#include <linux/panic_notifier.h>
#include <linux/random.h>
#include <linux/rcupdate.h>
+#include <linux/reboot.h>
#include <linux/sched/clock.h>
#include <linux/seq_file.h>
#include <linux/slab.h>
@@ -820,6 +821,25 @@ static struct notifier_block kfence_check_canary_notifier = {
static struct delayed_work kfence_timer;
#ifdef CONFIG_KFENCE_STATIC_KEYS
+static int kfence_reboot_callback(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+ /*
+ * Disable kfence to avoid static keys IPI synchronization during
+ * late shutdown/kexec
+ */
+ WRITE_ONCE(kfence_enabled, false);
+ /* Cancel any pending timer work */
+ cancel_delayed_work_sync(&kfence_timer);
+
+ return NOTIFY_OK;
+}
+
+static struct notifier_block kfence_reboot_notifier = {
+ .notifier_call = kfence_reboot_callback,
+ .priority = INT_MAX, /* Run early to stop timers ASAP */
+};
+
/* Wait queue to wake up allocation-gate timer task. */
static DECLARE_WAIT_QUEUE_HEAD(allocation_wait);
@@ -901,6 +921,10 @@ static void kfence_init_enable(void)
if (kfence_check_on_panic)
atomic_notifier_chain_register(&panic_notifier_list, &kfence_check_canary_notifier);
+#ifdef CONFIG_KFENCE_STATIC_KEYS
+ register_reboot_notifier(&kfence_reboot_notifier);
+#endif
+
WRITE_ONCE(kfence_enabled, true);
queue_delayed_work(system_unbound_wq, &kfence_timer, 0);
---
base-commit: ab084f0b8d6d2ee4b1c6a28f39a2a7430bdfa7f0
change-id: 20251126-kfence-42c93f9b3979
Best regards,
--
Breno Leitao <leitao@debian.org>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH v2] mm/kfence: add reboot notifier to disable KFENCE on shutdown
2025-11-27 14:51 [PATCH v2] mm/kfence: add reboot notifier to disable KFENCE on shutdown Breno Leitao
@ 2026-01-13 14:02 ` Chris Mason
2026-01-14 15:21 ` Breno Leitao
0 siblings, 1 reply; 3+ messages in thread
From: Chris Mason @ 2026-01-13 14:02 UTC (permalink / raw)
To: Breno Leitao
Cc: Chris Mason, Alexander Potapenko, Marco Elver, Dmitry Vyukov,
Andrew Morton, kasan-dev, linux-mm, linux-kernel, kernel-team,
stable
On Thu, 27 Nov 2025 06:51:54 -0800 Breno Leitao <leitao@debian.org> wrote:
> During system shutdown, KFENCE can cause IPI synchronization issues if
> it remains active through the reboot process. To prevent this, register
> a reboot notifier that disables KFENCE and cancels any pending timer
> work early in the shutdown sequence.
>
Hi Breno,
I'm running v6.19 MM commits through the patch review automation and this
one was flagged. Is this deadlock possible?
Automated review output:
> @@ -820,6 +821,25 @@ static struct notifier_block kfence_check_canary_notifier = {
> static struct delayed_work kfence_timer;
>
> #ifdef CONFIG_KFENCE_STATIC_KEYS
> +static int kfence_reboot_callback(struct notifier_block *nb,
> + unsigned long action, void *data)
> +{
> + /*
> + * Disable kfence to avoid static keys IPI synchronization during
> + * late shutdown/kexec
> + */
> + WRITE_ONCE(kfence_enabled, false);
> + /* Cancel any pending timer work */
> + cancel_delayed_work_sync(&kfence_timer);
^^^^^^^^^^^^^^^
Can cancel_delayed_work_sync() deadlock here?
If toggle_allocation_gate() is currently executing and blocked inside
wait_event_idle() (waiting for kfence_allocation_gate > 0), then
cancel_delayed_work_sync() will block forever waiting for the work to
complete.
The wait_event_idle() condition depends only on allocations occurring
to increment kfence_allocation_gate - setting kfence_enabled to false
does not wake up this wait. During shutdown when allocations may have
stopped, the work item could remain blocked indefinitely, causing the
reboot notifier to hang.
The call chain is:
kfence_reboot_callback()
-> cancel_delayed_work_sync(&kfence_timer)
-> __flush_work()
-> wait_for_completion(&barr.done)
// waits forever because...
toggle_allocation_gate() [currently running]
-> wait_event_idle(allocation_wait, kfence_allocation_gate > 0)
// never wakes up if no allocations happen
Would it be safer to use cancel_delayed_work() (non-sync) here, or add
a mechanism to wake up the wait_event_idle() when kfence_enabled becomes
false?
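To make the concern concrete, here is a minimal userspace sketch (POSIX
threads, not kernel code). It assumes a mutex plus condition variable as a
stand-in for the allocation_wait queue, and a plain int for
kfence_allocation_gate; gate_wait_stays_blocked() and its bound_ms
parameter are hypothetical names used only for illustration. The wait
condition looks at the gate alone, so clearing kfence_enabled does not end
the wait; the timeout exists only so the demonstration terminates.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Userspace stand-ins (assumptions) for the kernel state in the thread. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t allocation_wait = PTHREAD_COND_INITIALIZER;
static bool kfence_enabled = true;
static int kfence_allocation_gate; /* stays 0: no allocation ever arrives */

/*
 * Hypothetical helper: returns true if the gate wait was still blocked
 * when the bound expired, i.e. it would have slept forever without one.
 */
bool gate_wait_stays_blocked(long bound_ms)
{
	struct timespec deadline;
	int rc = 0;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += bound_ms / 1000;
	deadline.tv_nsec += (bound_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&lock);
	/* What the notifier does: flip the flag, wake nobody. */
	kfence_enabled = false;
	/* The condition only checks the gate, never kfence_enabled. */
	while (kfence_allocation_gate <= 0 && rc != ETIMEDOUT)
		rc = pthread_cond_timedwait(&allocation_wait, &lock, &deadline);
	pthread_mutex_unlock(&lock);

	return rc == ETIMEDOUT;
}
```

With no allocation to raise the gate, the helper only returns because of
the artificial bound. A synchronous cancel waiting on such a worker has no
bound of its own.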
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH v2] mm/kfence: add reboot notifier to disable KFENCE on shutdown
2026-01-13 14:02 ` Chris Mason
@ 2026-01-14 15:21 ` Breno Leitao
0 siblings, 0 replies; 3+ messages in thread
From: Breno Leitao @ 2026-01-14 15:21 UTC (permalink / raw)
To: Chris Mason
Cc: Alexander Potapenko, Marco Elver, Dmitry Vyukov, Andrew Morton,
kasan-dev, linux-mm, linux-kernel, kernel-team, stable
Hello Chris,
On Tue, Jan 13, 2026 at 06:02:27AM -0800, Chris Mason wrote:
> On Thu, 27 Nov 2025 06:51:54 -0800 Breno Leitao <leitao@debian.org> wrote:
> > @@ -820,6 +821,25 @@ static struct notifier_block kfence_check_canary_notifier = {
> > static struct delayed_work kfence_timer;
> >
> > #ifdef CONFIG_KFENCE_STATIC_KEYS
> > +static int kfence_reboot_callback(struct notifier_block *nb,
> > + unsigned long action, void *data)
> > +{
> > + /*
> > + * Disable kfence to avoid static keys IPI synchronization during
> > + * late shutdown/kexec
> > + */
> > + WRITE_ONCE(kfence_enabled, false);
> > + /* Cancel any pending timer work */
> > + cancel_delayed_work_sync(&kfence_timer);
> ^^^^^^^^^^^^^^^
>
> Can cancel_delayed_work_sync() deadlock here?
>
> If toggle_allocation_gate() is currently executing and blocked inside
> wait_event_idle() (waiting for kfence_allocation_gate > 0), then
> cancel_delayed_work_sync() will block forever waiting for the work to
> complete.
>
> The wait_event_idle() condition depends only on allocations occurring
> to increment kfence_allocation_gate - setting kfence_enabled to false
> does not wake up this wait. During shutdown when allocations may have
> stopped, the work item could remain blocked indefinitely, causing the
> reboot notifier to hang.
>
> The call chain is:
> kfence_reboot_callback()
> -> cancel_delayed_work_sync(&kfence_timer)
> -> __flush_work()
> -> wait_for_completion(&barr.done)
> // waits forever because...
>
> toggle_allocation_gate() [currently running]
> -> wait_event_idle(allocation_wait, kfence_allocation_gate > 0)
> // never wakes up if no allocations happen
This is spot on. I think this is a real case if the following happens:
1) toggle_allocation_gate() has passed the kfence_enabled check and is
waiting for kfence_allocation_gate to be > 0.
a) kfence_allocation_gate is incremented at allocation time
2) There are no more kernel allocations, thus kfence_allocation_gate is
not incremented
3) cancel_delayed_work_sync() waits for the work item, which in turn
waits for kfence_allocation_gate > 0, but given there are no more
allocations, this will never happen.
> Would it be safer to use cancel_delayed_work() (non-sync) here.
In this case the toggle_allocation_gate() task will stay idle, waiting
forever for kfence_allocation_gate > 0 unless we wake it up, but it will
no longer block the notifiers.
Is this a problem?
Maybe a more robust solution would include:
1) s/cancel_delayed_work_sync()/cancel_delayed_work().
- This would unblock the notifier
and/or some of the following:
2) Return from wait_event_idle() if kfence_enabled gets disabled.
- Remove the waiters once kfence is disabled
- Cons: kfence_allocation_gate will continue to be negative
3) Wake up everyone on the allocation_wait queue
- This might not be necessary if we have 2, since the waiters will wake
themselves once kfence_enabled goes to 0
- Cons: kfence_allocation_gate will continue to be negative
4) Bump kfence_allocation_gate to >= 1 in the notifier
- Avoid kfence allocation completely after it is disabled.
- Cons: it is unclear if we can set kfence_allocation_gate = 1 from
the notifier.
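A rough userspace sketch of options 2 and 3 combined, using POSIX threads
rather than kernel primitives. Everything here is an assumption made for
illustration: the mutex and condition variable stand in for the
allocation_wait queue, pthread_join() stands in for the synchronous cancel,
and the *_model names and run_shutdown_model() are hypothetical. It is not
the actual fix, only a model of why checking kfence_enabled in the wait
condition and waking the waiters lets the notifier finish.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-ins (assumptions) for the kernel state in the thread. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t allocation_wait = PTHREAD_COND_INITIALIZER;
static bool kfence_enabled = true;
static int kfence_allocation_gate; /* stays 0: no allocation ever arrives */

/* Models the work item: sleep until an allocation happens or kfence is off. */
static void *toggle_allocation_gate_model(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	/* Option 2: the wait condition also gives up once kfence is disabled. */
	while (kfence_allocation_gate <= 0 && kfence_enabled)
		pthread_cond_wait(&allocation_wait, &lock);
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* Models the reboot notifier: disable, wake the waiters, then wait for them. */
static int kfence_reboot_callback_model(pthread_t worker)
{
	pthread_mutex_lock(&lock);
	kfence_enabled = false;
	/* Option 3: wake everyone sleeping on the wait queue stand-in. */
	pthread_cond_broadcast(&allocation_wait);
	pthread_mutex_unlock(&lock);

	/* Stand-in for the synchronous cancel: returns once the worker exits. */
	return pthread_join(worker, NULL);
}

/* Hypothetical driver: start the worker, then run the shutdown path. */
int run_shutdown_model(void)
{
	pthread_t worker;

	if (pthread_create(&worker, NULL, toggle_allocation_gate_model, NULL))
		return -1;
	return kfence_reboot_callback_model(worker);
}
```

In this model run_shutdown_model() returns even though the gate is never
raised, because the worker re-checks kfence_enabled after the broadcast.
Dropping either the extra condition or the wakeup brings the hang back, so
in this sketch options 2 and 3 only work as a pair.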
Thanks for the report,
--breno