linux-mm.kvack.org archive mirror
* [PATCH] mm/vmalloc: prevent RCU stalls in kasan_release_vmalloc_node
@ 2026-01-12  8:47 Deepanshu Kartikey
  2026-01-12 10:21 ` Uladzislau Rezki
  0 siblings, 1 reply; 2+ messages in thread
From: Deepanshu Kartikey @ 2026-01-12  8:47 UTC (permalink / raw)
  To: akpm, urezki
  Cc: linux-mm, linux-kernel, Deepanshu Kartikey, syzbot+d8d4c31d40f868eaea30

When CONFIG_PAGE_OWNER is enabled, freeing KASAN shadow pages during
vmalloc cleanup triggers expensive stack unwinding that acquires RCU
read locks. Processing a large purge_list without rescheduling can
cause the task to hold the CPU for extended periods (10+ seconds),
leading to RCU stalls and potential OOM conditions.

The issue manifests in purge_vmap_node() -> kasan_release_vmalloc_node()
where iterating through hundreds or thousands of vmap_area entries and
freeing their associated shadow pages causes:

  rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  rcu: Tasks blocked on level-0 rcu_node (CPUs 0-1): P6229/1:b..l
  ...
  task:kworker/0:17 state:R running task stack:28840 pid:6229
  ...
  kasan_release_vmalloc_node+0x1ba/0xad0 mm/vmalloc.c:2299
  purge_vmap_node+0x1ba/0xad0 mm/vmalloc.c:2299

Each call to kasan_release_vmalloc() can free many pages, and with
page_owner tracking, each free triggers save_stack(), which performs
stack unwinding under the RCU read lock. Without yielding, this creates
an unbounded RCU critical section.

Add periodic cond_resched() calls within the loop to allow:
- RCU grace periods to complete
- Other tasks to run
- Scheduler to preempt when needed

The loop calls cond_resched() as soon as need_resched() is set, and
otherwise after every 32 entries, which bounds the number of entries
processed between rescheduling points even under light load.

Reported-by: syzbot+d8d4c31d40f868eaea30@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d8d4c31d40f868eaea30
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/vmalloc.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 41dd01e8430c..a9161007cf02 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2273,6 +2273,7 @@ kasan_release_vmalloc_node(struct vmap_node *vn)
 {
 	struct vmap_area *va;
 	unsigned long start, end;
+	unsigned int batch_count = 0;
 
 	start = list_first_entry(&vn->purge_list, struct vmap_area, list)->va_start;
 	end = list_last_entry(&vn->purge_list, struct vmap_area, list)->va_end;
@@ -2282,6 +2283,11 @@ kasan_release_vmalloc_node(struct vmap_node *vn)
 			kasan_release_vmalloc(va->va_start, va->va_end,
 				va->va_start, va->va_end,
 				KASAN_VMALLOC_PAGE_RANGE);
+
+		if (need_resched() || ++batch_count >= 32) {
+			cond_resched();
+			batch_count = 0;
+		}
 	}
 
 	kasan_release_vmalloc(start, end, start, end, KASAN_VMALLOC_TLB_FLUSH);
-- 
2.43.0




* Re: [PATCH] mm/vmalloc: prevent RCU stalls in kasan_release_vmalloc_node
  2026-01-12  8:47 [PATCH] mm/vmalloc: prevent RCU stalls in kasan_release_vmalloc_node Deepanshu Kartikey
@ 2026-01-12 10:21 ` Uladzislau Rezki
  0 siblings, 0 replies; 2+ messages in thread
From: Uladzislau Rezki @ 2026-01-12 10:21 UTC (permalink / raw)
  To: Deepanshu Kartikey
  Cc: akpm, urezki, linux-mm, linux-kernel, syzbot+d8d4c31d40f868eaea30

On Mon, Jan 12, 2026 at 02:17:23PM +0530, Deepanshu Kartikey wrote:
>  			kasan_release_vmalloc(va->va_start, va->va_end,
>  				va->va_start, va->va_end,
>  				KASAN_VMALLOC_PAGE_RANGE);
> +
> +		if (need_resched() || ++batch_count >= 32) {
> +			cond_resched();
> +			batch_count = 0;
> +		}
>  	}
Could you introduce a macro for the upper bound instead of hard-coding 32?
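
One way the suggestion might look is sketched below as a small userspace
model, not as kernel code. The macro name KASAN_RELEASE_BATCH_SIZE is
hypothetical, and need_resched(), cond_resched(), and release_one() are
stand-in stubs so the loop can be compiled and exercised outside the
kernel; only the batching logic mirrors the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical name for the upper bound; the patch hard-codes 32. */
#define KASAN_RELEASE_BATCH_SIZE 32

/* Userspace stand-ins for the kernel helpers, for illustration only. */
static unsigned int resched_points;

static bool need_resched(void)
{
	return false;		/* model a lightly loaded system */
}

static void cond_resched(void)
{
	resched_points++;	/* count how often the loop offers to yield */
}

/* Stand-in for the per-entry kasan_release_vmalloc() call. */
static void release_one(size_t idx)
{
	(void)idx;
}

/*
 * Process nr_entries entries and return the number of rescheduling
 * points reached along the way.
 */
static unsigned int release_all(size_t nr_entries)
{
	unsigned int batch_count = 0;
	size_t i;

	resched_points = 0;

	for (i = 0; i < nr_entries; i++) {
		release_one(i);

		if (need_resched() ||
		    ++batch_count >= KASAN_RELEASE_BATCH_SIZE) {
			cond_resched();
			batch_count = 0;
		}
	}

	return resched_points;
}
```

With need_resched() stubbed to return false, the model reaches one
rescheduling point per KASAN_RELEASE_BATCH_SIZE entries, so the bound is
named in one place and can be tuned without touching the loop.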

Thanks!

--
Uladzislau Rezki

