* [stable-6.6.y] mm: khugepaged refuses to freeze
From: Sergey Senozhatsky @ 2026-02-06  2:47 UTC
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan
Cc: Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang, linux-mm, linux-kernel

Greetings,

I'm looking at a slightly unusual issue where khugepaged refuses to
freeze during system suspend:

...
PM: suspend entry (s2idle)
Filesystems sync: 0.003 seconds
Freezing user space processes
Freezing user space processes completed (elapsed 0.003 seconds)
OOM killer disabled.
Freezing remaining freezable tasks
Freezing remaining freezable tasks failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
task:khugepaged state:D stack:0 pid:1345 ppid:2 flags:0x00004000
Call Trace:
 <TASK>
 schedule+0x523/0x16a0
 ? sysvec_apic_timer_interrupt+0xf/0x90
 ? asm_sysvec_apic_timer_interrupt+0x16/0x20
 ? wait_for_completion_io_timeout+0xc5/0x170
 schedule_timeout+0x23b/0x6e0
 ? __pfx_process_timeout+0x10/0x10
 ? wait_for_completion_io_timeout+0xc5/0x170
 io_schedule_timeout+0x3f/0x80
 wait_for_completion_io_timeout+0xe4/0x170
 submit_bio_wait+0x79/0xc0
 swap_readpage+0x150/0x2d0
 ? __pfx_submit_bio_wait_endio+0x10/0x10
 swap_cluster_readahead+0x3be/0x750
 ? __pfx_workingset_update_node+0x10/0x10
 shmem_swapin+0xa7/0x100
 shmem_swapin_folio+0xcd/0x2e0
 shmem_get_folio+0x237/0x580
 collapse_file+0x247/0x1280
 hpage_collapse_scan_file+0x26e/0x380
 khugepaged+0x43b/0x810
 kthread+0xfb/0x120
 ? __pfx_khugepaged+0x10/0x10
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x38/0x50
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1b/0x30
 </TASK>
...

The system is using zram swap.  I wonder if khugepaged should
be suspend/freeze aware.  Does something like below make sense?
Or is the problem elsewhere?

---
 mm/khugepaged.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index eff9e3061925..fa6a018b20a8 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1894,6 +1894,9 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
 		xas_set(&xas, index);
 		folio = xas_load(&xas);
 
+		if (try_to_freeze())
+			goto xa_unlocked;
+
 		VM_BUG_ON(index != xas.xa_index);
 		if (is_shmem) {
 			if (!folio) {
-- 
2.53.0.rc2.204.g2597b5adb4-goog

* Re: [stable-6.6.y] mm: khugepaged refuses to freeze
From: Baolin Wang @ 2026-02-06  3:33 UTC
To: Sergey Senozhatsky, Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan
Cc: Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang, linux-mm, linux-kernel

On 2/6/26 10:47 AM, Sergey Senozhatsky wrote:
> Greetings,
>
> I'm looking at a slightly unusual issue where khugepaged refuses to
> freeze during system suspend:
>
> ...
> PM: suspend entry (s2idle)
> Filesystems sync: 0.003 seconds
> Freezing user space processes
> Freezing user space processes completed (elapsed 0.003 seconds)
> OOM killer disabled.
> Freezing remaining freezable tasks
> Freezing remaining freezable tasks failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
> task:khugepaged state:D stack:0 pid:1345 ppid:2 flags:0x00004000
> Call Trace:
>  <TASK>
>  schedule+0x523/0x16a0
>  ? sysvec_apic_timer_interrupt+0xf/0x90
>  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
>  ? wait_for_completion_io_timeout+0xc5/0x170
>  schedule_timeout+0x23b/0x6e0
>  ? __pfx_process_timeout+0x10/0x10
>  ? wait_for_completion_io_timeout+0xc5/0x170
>  io_schedule_timeout+0x3f/0x80
>  wait_for_completion_io_timeout+0xe4/0x170
>  submit_bio_wait+0x79/0xc0
>  swap_readpage+0x150/0x2d0
>  ? __pfx_submit_bio_wait_endio+0x10/0x10
>  swap_cluster_readahead+0x3be/0x750
>  ? __pfx_workingset_update_node+0x10/0x10
>  shmem_swapin+0xa7/0x100
>  shmem_swapin_folio+0xcd/0x2e0
>  shmem_get_folio+0x237/0x580
>  collapse_file+0x247/0x1280
>  hpage_collapse_scan_file+0x26e/0x380
>  khugepaged+0x43b/0x810
>  kthread+0xfb/0x120
>  ? __pfx_khugepaged+0x10/0x10
>  ? __pfx_kthread+0x10/0x10
>  ret_from_fork+0x38/0x50
>  ? __pfx_kthread+0x10/0x10
>  ret_from_fork_asm+0x1b/0x30
>  </TASK>
> ...
>
> The system is using zram swap.  I wonder if khugepaged should
> be suspend/freeze aware.  Does something like below make sense?
> Or is the problem elsewhere?
>
> ---
>  mm/khugepaged.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index eff9e3061925..fa6a018b20a8 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1894,6 +1894,9 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
>  		xas_set(&xas, index);
>  		folio = xas_load(&xas);
>
> +		if (try_to_freeze())
> +			goto xa_unlocked;
> +
>  		VM_BUG_ON(index != xas.xa_index);
>  		if (is_shmem) {
>  			if (!folio) {

Your analysis is reasonable. When the system is freezing, khugepaged is
still trying to swap-in shmem to collapse, which prevents the system from
entering suspend state. However, it’s not only shmem that will swap in,
collapsing anonymous folios may also trigger swap-in operations.

Therefore, I think we should skip all collapse scans for anonymous and
file pages in the main scan function khugepaged_do_scan() if the system
is attempting to freeze. Some sample code is as follows:

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index fa1e57fd2c46..cfa7882585ad 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2560,9 +2560,18 @@ static void khugepaged_do_scan(struct collapse_control *cc)
 	lru_add_drain_all();
 
 	while (true) {
+		bool was_frozen;
+
 		cond_resched();
 
-		if (unlikely(kthread_should_stop()))
+		if (unlikely(kthread_freezable_should_stop(&was_frozen)))
+			break;
+
+		/*
+		 * We can speed up thawing tasks if we don't call khugepaged_scan_mm_slot()
+		 * after returning from the refrigerator
+		 */
+		if (was_frozen)
 			break;
 
 		spin_lock(&khugepaged_mm_lock);

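[Editorial sketch, not part of the original message.] The outer-loop idea above can be modelled in plain userspace C. This is only a sketch under stated assumptions: it assumes kthread_freezable_should_stop() enters the refrigerator when a freeze is pending, reports that through its was_frozen argument, and returns whether the thread was asked to stop. Every model_* name and the freeze_at_pass field are hypothetical and exist only for illustration; none of them are kernel APIs.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; these are not kernel APIs. */
struct model_freezer {
	bool freeze_requested;	/* set by the modelled "suspend" side */
	bool stop_requested;	/* models what kthread_should_stop() reports */
	int  freeze_at_pass;	/* pass at which a freeze request arrives, -1 = never */
};

/*
 * Models the assumed contract of kthread_freezable_should_stop(): if a
 * freeze is pending, "enter the refrigerator" (here: clear the request),
 * report that through *was_frozen, then return the stop request.
 */
static bool model_freezable_should_stop(struct model_freezer *f, bool *was_frozen)
{
	*was_frozen = false;
	if (f->freeze_requested) {
		f->freeze_requested = false;	/* frozen, then thawed */
		*was_frozen = true;
	}
	return f->stop_requested;
}

/* Returns the number of scan passes completed before the loop bailed out. */
static int model_do_scan(struct model_freezer *f, int max_passes)
{
	int passes = 0;

	while (passes < max_passes) {
		bool was_frozen;

		/* Simulate the suspend side raising a freeze request. */
		if (passes == f->freeze_at_pass)
			f->freeze_requested = true;

		if (model_freezable_should_stop(f, &was_frozen))
			break;
		if (was_frozen)
			break;	/* do not resume scanning right after thaw */

		passes++;	/* stands in for one khugepaged_scan_mm_slot() call */
	}
	return passes;
}
```

In this model the loop only notices a freeze request between passes, which is the limitation the follow-up messages discuss: a single long pass still delays the freeze.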
* Re: [stable-6.6.y] mm: khugepaged refuses to freeze
From: Sergey Senozhatsky @ 2026-02-06  3:38 UTC
To: Baolin Wang
Cc: Sergey Senozhatsky, Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang, linux-mm, linux-kernel

On (26/02/06 11:33), Baolin Wang wrote:
> > Freezing remaining freezable tasks failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
> > task:khugepaged state:D stack:0 pid:1345 ppid:2 flags:0x00004000
> > Call Trace:
> >  <TASK>
> >  schedule+0x523/0x16a0
> >  ? sysvec_apic_timer_interrupt+0xf/0x90
> >  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
> >  ? wait_for_completion_io_timeout+0xc5/0x170
> >  schedule_timeout+0x23b/0x6e0
> >  ? __pfx_process_timeout+0x10/0x10
> >  ? wait_for_completion_io_timeout+0xc5/0x170
> >  io_schedule_timeout+0x3f/0x80
> >  wait_for_completion_io_timeout+0xe4/0x170
> >  submit_bio_wait+0x79/0xc0
> >  swap_readpage+0x150/0x2d0
> >  ? __pfx_submit_bio_wait_endio+0x10/0x10
> >  swap_cluster_readahead+0x3be/0x750
> >  ? __pfx_workingset_update_node+0x10/0x10
> >  shmem_swapin+0xa7/0x100
> >  shmem_swapin_folio+0xcd/0x2e0
> >  shmem_get_folio+0x237/0x580
> >  collapse_file+0x247/0x1280
> >  hpage_collapse_scan_file+0x26e/0x380
> >  khugepaged+0x43b/0x810
> >  kthread+0xfb/0x120
> >  ? __pfx_khugepaged+0x10/0x10
> >  ? __pfx_kthread+0x10/0x10
> >  ret_from_fork+0x38/0x50
> >  ? __pfx_kthread+0x10/0x10
> >  ret_from_fork_asm+0x1b/0x30
> >  </TASK>
> > ...
> >
> > The system is using zram swap.  I wonder if khugepaged should
> > be suspend/freeze aware.  Does something like below make sense?
> > Or is the problem elsewhere?
> >
> > ---
> >  mm/khugepaged.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index eff9e3061925..fa6a018b20a8 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -1894,6 +1894,9 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
> >  		xas_set(&xas, index);
> >  		folio = xas_load(&xas);
> > +		if (try_to_freeze())
> > +			goto xa_unlocked;
> > +
> >  		VM_BUG_ON(index != xas.xa_index);
> >  		if (is_shmem) {
> >  			if (!folio) {
>
> Your analysis is reasonable. When the system is freezing, khugepaged is
> still trying to swap-in shmem to collapse, which prevents the system from
> entering suspend state. However, it’s not only shmem that will swap in,
> collapsing anonymous folios may also trigger swap-in operations.

Right, I thought about it but wasn't sure.  Could the inner loop (e.g.
collapse_file() in this particular case) loop long enough to fail suspend
w/o ever giving the outer loop (khugepaged_do_scan()) a chance to freeze?

* Re: [stable-6.6.y] mm: khugepaged refuses to freeze
From: Sergey Senozhatsky @ 2026-02-06  4:31 UTC
To: Baolin Wang
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang, linux-mm, linux-kernel, Sergey Senozhatsky

On (26/02/06 12:38), Sergey Senozhatsky wrote:
[..]
> > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > index eff9e3061925..fa6a018b20a8 100644
> > > --- a/mm/khugepaged.c
> > > +++ b/mm/khugepaged.c
> > > @@ -1894,6 +1894,9 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
> > >  		xas_set(&xas, index);
> > >  		folio = xas_load(&xas);
> > > +		if (try_to_freeze())
> > > +			goto xa_unlocked;
> > > +
> > >  		VM_BUG_ON(index != xas.xa_index);
> > >  		if (is_shmem) {
> > >  			if (!folio) {
> >
> > Your analysis is reasonable. When the system is freezing, khugepaged is
> > still trying to swap-in shmem to collapse, which prevents the system from
> > entering suspend state. However, it’s not only shmem that will swap in,
> > collapsing anonymous folios may also trigger swap-in operations.
>
> Right, I thought about it but wasn't sure.  Could the inner loop (e.g.
> collapse_file() in this particular case) loop long enough to fail suspend
> w/o ever giving the outer loop (khugepaged_do_scan()) a chance to freeze?

For inner loops I wondered if cond_resched() could be an indicator of
where try_to_freeze() should be placed.  Those cond_resched() calls
are there for a reason, after all.  E.g. something like:

---

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index fa6a018b20a8..cee08466a069 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2431,6 +2431,9 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, enum scan_result
 		unsigned long hstart, hend;
 
 		cond_resched();
+		if (try_to_freeze())
+			break;
+
 		if (unlikely(hpage_collapse_test_exit_or_disable(mm))) {
 			progress++;
 			break;
@@ -2453,6 +2456,9 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, enum scan_result
 			bool mmap_locked = true;
 
 			cond_resched();
+			if (try_to_freeze())
+				goto breakouterloop;
+
 			if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
 				goto breakouterloop;

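[Editorial sketch, not part of the original message.] The question of whether an inner loop can run too long before the outer loop gets a chance to freeze can be illustrated with a small userspace model. It counts how many work units still run after a freeze request until the next freeze point is reached, with and without a check inside the inner loop. The function name, its parameters, and the notion of a "work unit" are all hypothetical. The model also does not capture a single blocking swap-in, which is what the reported call trace actually shows.

```c
#include <stdbool.h>

/*
 * Returns how many work units run between a freeze request and the
 * point where the loop notices it.  Assumes
 * 0 <= request_at_unit < slots * units_per_slot.
 *
 * check_inner == false: only the outer loop has a freeze point.
 * check_inner == true:  the inner loop also has one, placed where a
 *                       cond_resched() would sit in the real code.
 */
static int units_until_freeze_point(int slots, int units_per_slot,
				    int request_at_unit, bool check_inner)
{
	int done = 0;		/* work units executed so far */
	bool requested = false;	/* has the freeze request arrived yet? */

	for (int s = 0; s < slots; s++) {
		/* Outer-loop freeze point. */
		if (requested)
			return done - request_at_unit;

		for (int u = 0; u < units_per_slot; u++) {
			if (done == request_at_unit)
				requested = true;

			/* Inner-loop freeze point. */
			if (check_inner && requested)
				return done - request_at_unit;

			done++;	/* one unit of scan/collapse work */
		}
	}

	/* The request was only noticed after the last slot finished. */
	return done - request_at_unit;
}
```

With 4 slots of 100 units each and a request arriving at unit 10, the outer-only variant runs 90 more units before it reaches a freeze point, while the inner-check variant runs none.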
* Re: [stable-6.6.y] mm: khugepaged refuses to freeze
From: Baolin Wang @ 2026-02-06  5:12 UTC
To: Sergey Senozhatsky
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang, linux-mm, linux-kernel

On 2/6/26 12:31 PM, Sergey Senozhatsky wrote:
> On (26/02/06 12:38), Sergey Senozhatsky wrote:
> [..]
>>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>>> index eff9e3061925..fa6a018b20a8 100644
>>>> --- a/mm/khugepaged.c
>>>> +++ b/mm/khugepaged.c
>>>> @@ -1894,6 +1894,9 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,
>>>>  		xas_set(&xas, index);
>>>>  		folio = xas_load(&xas);
>>>> +		if (try_to_freeze())
>>>> +			goto xa_unlocked;
>>>> +
>>>>  		VM_BUG_ON(index != xas.xa_index);
>>>>  		if (is_shmem) {
>>>>  			if (!folio) {
>>>
>>> Your analysis is reasonable. When the system is freezing, khugepaged is
>>> still trying to swap-in shmem to collapse, which prevents the system from
>>> entering suspend state. However, it’s not only shmem that will swap in,
>>> collapsing anonymous folios may also trigger swap-in operations.
>>
>> Right, I thought about it but wasn't sure.  Could the inner loop (e.g.
>> collapse_file() in this particular case) loop long enough to fail suspend
>> w/o ever giving the outer loop (khugepaged_do_scan()) a chance to freeze?

Yes, that’s possible. However, if we add a try_to_freeze() check in the
inner loop, we need to consider various scenarios (such as anonymous
folio swap-in and other potential cases?), which feels too hacky to me.

> For inner loops I wondered if cond_resched() could be an indicator of
> where try_to_freeze() should be placed.  Those cond_resched() calls
> are there for a reason, after all.  E.g. something like:
>
> ---
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index fa6a018b20a8..cee08466a069 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -2431,6 +2431,9 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, enum scan_result
>  		unsigned long hstart, hend;
>
>  		cond_resched();
> +		if (try_to_freeze())
> +			break;
> +
>  		if (unlikely(hpage_collapse_test_exit_or_disable(mm))) {
>  			progress++;
>  			break;
> @@ -2453,6 +2456,9 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, enum scan_result
>  			bool mmap_locked = true;
>
>  			cond_resched();
> +			if (try_to_freeze())
> +				goto breakouterloop;
> +
>  			if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
>  				goto breakouterloop;

This looks better than the previous version. Let’s also wait to see if
others have any better suggestions.

* Re: [stable-6.6.y] mm: khugepaged refuses to freeze
From: David Hildenbrand (Arm) @ 2026-02-06  8:36 UTC
To: Baolin Wang, Sergey Senozhatsky
Cc: Andrew Morton, Lorenzo Stoakes, Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang, linux-mm, linux-kernel

On 2/6/26 06:12, Baolin Wang wrote:
>
>
> On 2/6/26 12:31 PM, Sergey Senozhatsky wrote:
>> On (26/02/06 12:38), Sergey Senozhatsky wrote:
>> [..]
>>>
>>> Right, I thought about it but wasn't sure.  Could the inner loop (e.g.
>>> collapse_file() in this particular case) loop long enough to fail
>>> suspend
>>> w/o ever giving the outer loop (khugepaged_do_scan()) a chance to
>>> freeze?
>
> Yes, that’s possible. However, if we add a try_to_freeze() check in the
> inner loop, we need to consider various scenarios (such as anonymous
> folio swap-in and other potential cases?), which feels too hacky to me.
>
>> For inner loops I wondered if cond_resched() could be an indicator of
>> where try_to_freeze() should be placed.  Those cond_resched() calls
>> are there for a reason, after all.  E.g. something like:
>>
>> ---
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index fa6a018b20a8..cee08466a069 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -2431,6 +2431,9 @@ static unsigned int
>> khugepaged_scan_mm_slot(unsigned int pages, enum scan_result
>>           unsigned long hstart, hend;
>>           cond_resched();
>> +        if (try_to_freeze())
>> +            break;
>> +
>>           if (unlikely(hpage_collapse_test_exit_or_disable(mm))) {
>>               progress++;
>>               break;
>> @@ -2453,6 +2456,9 @@ static unsigned int
>> khugepaged_scan_mm_slot(unsigned int pages, enum scan_result
>>               bool mmap_locked = true;
>>               cond_resched();
>> +            if (try_to_freeze())
>> +                goto breakouterloop;
>> +
>>               if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
>>                   goto breakouterloop;
>
> This looks better than the previous version. Let’s also wait to see if
> others have any better suggestions.

What prevents other callpaths (faults, read(), write(), etc) from
similarly triggering swapin?

I recall that there is a notifier when the system is preparing to sleep
(pm notifier or something). Could we simply hook into that to tell
khugepaged to suspend+resume?

Essentially, making hpage_collapse_test_exit_or_disable() break out for us.

-- 
Cheers,

David

* Re: [stable-6.6.y] mm: khugepaged refuses to freeze
From: Baolin Wang @ 2026-02-06  8:55 UTC
To: David Hildenbrand (Arm), Sergey Senozhatsky
Cc: Andrew Morton, Lorenzo Stoakes, Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang, linux-mm, linux-kernel

On 2/6/26 4:36 PM, David Hildenbrand (Arm) wrote:
> On 2/6/26 06:12, Baolin Wang wrote:
>>
>>
>> On 2/6/26 12:31 PM, Sergey Senozhatsky wrote:
>>> On (26/02/06 12:38), Sergey Senozhatsky wrote:
>>> [..]
>>>>
>>>> Right, I thought about it but wasn't sure.  Could the inner loop (e.g.
>>>> collapse_file() in this particular case) loop long enough to fail
>>>> suspend
>>>> w/o ever giving the outer loop (khugepaged_do_scan()) a chance to
>>>> freeze?
>>
>> Yes, that’s possible. However, if we add a try_to_freeze() check in
>> the inner loop, we need to consider various scenarios (such as
>> anonymous folio swap-in and other potential cases?), which feels too
>> hacky to me.
>>
>>> For inner loops I wondered if cond_resched() could be an indicator of
>>> where try_to_freeze() should be placed.  Those cond_resched() calls
>>> are there for a reason, after all.  E.g. something like:
>>>
>>> ---
>>>
>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>> index fa6a018b20a8..cee08466a069 100644
>>> --- a/mm/khugepaged.c
>>> +++ b/mm/khugepaged.c
>>> @@ -2431,6 +2431,9 @@ static unsigned int
>>> khugepaged_scan_mm_slot(unsigned int pages, enum scan_result
>>>           unsigned long hstart, hend;
>>>           cond_resched();
>>> +        if (try_to_freeze())
>>> +            break;
>>> +
>>>           if (unlikely(hpage_collapse_test_exit_or_disable(mm))) {
>>>               progress++;
>>>               break;
>>> @@ -2453,6 +2456,9 @@ static unsigned int
>>> khugepaged_scan_mm_slot(unsigned int pages, enum scan_result
>>>               bool mmap_locked = true;
>>>               cond_resched();
>>> +            if (try_to_freeze())
>>> +                goto breakouterloop;
>>> +
>>>               if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
>>>                   goto breakouterloop;
>>
>> This looks better than the previous version. Let’s also wait to see if
>> others have any better suggestions.
>
> What prevents other callpaths (faults, read(), write(), etc) from
> similarly triggering swapin?

Usually it’s just a userspace process triggering one page fault to swap
a page in, after which it returns to userspace. There aren’t other kernel
threads that, like khugepaged, continuously swap in pages in a loop.

> I recall that there is a notifier when the system is preparing to sleep
> (pm notifier or something). Could we simply hook into that to tell
> khugepaged to suspend+resume?

Do you mean “struct dev_pm_ops”, which is used to register PM callbacks
for devices? However, I don’t know how to use it with a kernel thread.

Also look at how kswapd does it: kswapd also uses
kthread_freezable_should_stop() to check the freeze state.

> Essentially, making hpage_collapse_test_exit_or_disable() break out for us.

Ah, yes, even better:)

* Re: [stable-6.6.y] mm: khugepaged refuses to freeze
From: David Hildenbrand (Arm) @ 2026-02-06  9:00 UTC
To: Baolin Wang, Sergey Senozhatsky
Cc: Andrew Morton, Lorenzo Stoakes, Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang, linux-mm, linux-kernel

>> I recall that there is a notifier when the system is preparing to
>> sleep (pm notifier or something). Could we simply hook into that to
>> tell khugepaged to suspend+resume?
>
> Do you mean “struct dev_pm_ops”, which is used to register PM callbacks
> for devices? However, I don’t know how to use it with a kernel thread.
>
> Also look at how kswapd does it, kswapd also uses
> kthread_freezable_should_stop() to check the freeze state.

Right, mimicking what kswapd does sounds reasonable!

-- 
Cheers,

David

* Re: [stable-6.6.y] mm: khugepaged refuses to freeze
From: Sergey Senozhatsky @ 2026-02-10  3:21 UTC
To: David Hildenbrand (Arm)
Cc: Baolin Wang, Sergey Senozhatsky, Andrew Morton, Lorenzo Stoakes, Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang, linux-mm, linux-kernel

On (26/02/06 10:00), David Hildenbrand (Arm) wrote:
> > > I recall that there is a notifier when the system is preparing to
> > > sleep (pm notifier or something). Could we simply hook into that to
> > > tell khugepaged to suspend+resume?
> >
> > Do you mean “struct dev_pm_ops”, which is used to register PM callbacks
> > for devices? However, I don’t know how to use it with a kernel thread.
> >
> > Also look at how kswapd does it, kswapd also uses
> > kthread_freezable_should_stop() to check the freeze state.
>
> Right, mimicking what kswapd does sounds reasonable!

I may be missing something, as I'm not seeing dev_pm_ops in vmscan code.
Would something like this work?

---

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index fa6a018b20a8..c5d89ec223d3 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -394,8 +394,12 @@ static inline int hpage_collapse_test_exit(struct mm_struct *mm)
 
 static inline int hpage_collapse_test_exit_or_disable(struct mm_struct *mm)
 {
+	bool was_frozen;
+	int ret = kthread_freezable_should_stop(&was_frozen);
+
 	return hpage_collapse_test_exit(mm) ||
-	       mm_flags_test(MMF_DISABLE_THP_COMPLETELY, mm);
+	       mm_flags_test(MMF_DISABLE_THP_COMPLETELY, mm) ||
+	       was_frozen || ret;
 }
 
 static bool hugepage_pmd_enabled(void)

* Re: [stable-6.6.y] mm: khugepaged refuses to freeze
From: Baolin Wang @ 2026-02-10 10:07 UTC
To: Sergey Senozhatsky, David Hildenbrand (Arm)
Cc: Andrew Morton, Lorenzo Stoakes, Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang, linux-mm, linux-kernel

On 2/10/26 11:21 AM, Sergey Senozhatsky wrote:
> On (26/02/06 10:00), David Hildenbrand (Arm) wrote:
>>>> I recall that there is a notifier when the system is preparing to
>>>> sleep (pm notifier or something). Could we simply hook into that to
>>>> tell khugepaged to suspend+resume?
>>>
>>> Do you mean “struct dev_pm_ops”, which is used to register PM callbacks
>>> for devices? However, I don’t know how to use it with a kernel thread.
>>>
>>> Also look at how kswapd does it, kswapd also uses
>>> kthread_freezable_should_stop() to check the freeze state.
>>
>> Right, mimicking what kswapd does sounds reasonable!
>
> I may be missing something, as I'm not seeing dev_pm_ops in vmscan code.
> Would something like this work?
>
> ---
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index fa6a018b20a8..c5d89ec223d3 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -394,8 +394,12 @@ static inline int hpage_collapse_test_exit(struct mm_struct *mm)
>
>  static inline int hpage_collapse_test_exit_or_disable(struct mm_struct *mm)
>  {
> +	bool was_frozen;
> +	int ret = kthread_freezable_should_stop(&was_frozen);
> +
>  	return hpage_collapse_test_exit(mm) ||
> -	       mm_flags_test(MMF_DISABLE_THP_COMPLETELY, mm);
> +	       mm_flags_test(MMF_DISABLE_THP_COMPLETELY, mm) ||
> +	       was_frozen || ret;
>  }

Since hpage_collapse_test_exit_or_disable() can also be called by
madvise_collapse(), which is not a kernel thread, I think using
try_to_freeze() is enough? Or pass cc->is_khugepaged to check whether the
current thread is khugepaged.

* Re: [stable-6.6.y] mm: khugepaged refuses to freeze
From: Sergey Senozhatsky @ 2026-02-10 10:12 UTC
To: Baolin Wang
Cc: Sergey Senozhatsky, David Hildenbrand (Arm), Andrew Morton, Lorenzo Stoakes, Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang, linux-mm, linux-kernel

On (26/02/10 18:07), Baolin Wang wrote:
> > > > Do you mean “struct dev_pm_ops”, which is used to register PM callbacks
> > > > for devices? However, I don’t know how to use it with a kernel thread.
> > > >
> > > > Also look at how kswapd does it, kswapd also uses
> > > > kthread_freezable_should_stop() to check the freeze state.
> > >
> > > Right, mimicking what kswapd does sounds reasonable!
> >
> > I may be missing something, as I'm not seeing dev_pm_ops in vmscan code.
> > Would something like this work?
> >
> > ---
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index fa6a018b20a8..c5d89ec223d3 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -394,8 +394,12 @@ static inline int hpage_collapse_test_exit(struct mm_struct *mm)
> >  static inline int hpage_collapse_test_exit_or_disable(struct mm_struct *mm)
> >  {
> > +	bool was_frozen;
> > +	int ret = kthread_freezable_should_stop(&was_frozen);
> > +
> >  	return hpage_collapse_test_exit(mm) ||
> > -	       mm_flags_test(MMF_DISABLE_THP_COMPLETELY, mm);
> > +	       mm_flags_test(MMF_DISABLE_THP_COMPLETELY, mm) ||
> > +	       was_frozen || ret;
> >  }
>
> Since the hpage_collapse_test_exit_or_disable() can be called by
> madvise_collapse(), which is not a kernel thread. So I think using the
> try_to_freeze() is enough?

I guess try_to_freeze() should work.

> or pass the cc->is_khugepaged to check if current thread is khugepaged.

Or I guess I can check `current->flags & PF_KTHREAD` in
hpage_collapse_test_exit_or_disable().

* Re: [stable-6.6.y] mm: khugepaged refuses to freeze
From: David Hildenbrand (Arm) @ 2026-02-10 10:21 UTC
To: Baolin Wang, Sergey Senozhatsky
Cc: Andrew Morton, Lorenzo Stoakes, Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang, linux-mm, linux-kernel

On 2/10/26 11:07, Baolin Wang wrote:
>
>
> On 2/10/26 11:21 AM, Sergey Senozhatsky wrote:
>> On (26/02/06 10:00), David Hildenbrand (Arm) wrote:
>>>
>>> Right, mimicking what kswapd does sounds reasonable!
>>
>> I may be missing something, as I'm not seeing dev_pm_ops in vmscan code.
>> Would something like this work?
>>
>> ---
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index fa6a018b20a8..c5d89ec223d3 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -394,8 +394,12 @@ static inline int hpage_collapse_test_exit(struct
>> mm_struct *mm)
>>   static inline int hpage_collapse_test_exit_or_disable(struct
>> mm_struct *mm)
>>   {
>> +    bool was_frozen;
>> +    int ret = kthread_freezable_should_stop(&was_frozen);
>> +
>>       return hpage_collapse_test_exit(mm) ||
>> -           mm_flags_test(MMF_DISABLE_THP_COMPLETELY, mm);
>> +           mm_flags_test(MMF_DISABLE_THP_COMPLETELY, mm) ||
>> +           was_frozen || ret;
>>   }
>
> Since the hpage_collapse_test_exit_or_disable() can be called by
> madvise_collapse(), which is not a kernel thread.

Which raises the question whether we should forward that context
(khugepaged vs. madvise) to hpage_collapse_test_exit_or_disable().

-- 
Cheers,

David

* Re: [stable-6.6.y] mm: khugepaged refuses to freeze
From: Baolin Wang @ 2026-02-11  1:03 UTC
To: David Hildenbrand (Arm), Sergey Senozhatsky
Cc: Andrew Morton, Lorenzo Stoakes, Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang, linux-mm, linux-kernel

On 2/10/26 6:21 PM, David Hildenbrand (Arm) wrote:
> On 2/10/26 11:07, Baolin Wang wrote:
>>
>>
>> On 2/10/26 11:21 AM, Sergey Senozhatsky wrote:
>>> On (26/02/06 10:00), David Hildenbrand (Arm) wrote:
>>>>
>>>> Right, mimicking what kswapd does sounds reasonable!
>>>
>>> I may be missing something, as I'm not seeing dev_pm_ops in vmscan code.
>>> Would something like this work?
>>>
>>> ---
>>>
>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>> index fa6a018b20a8..c5d89ec223d3 100644
>>> --- a/mm/khugepaged.c
>>> +++ b/mm/khugepaged.c
>>> @@ -394,8 +394,12 @@ static inline int
>>> hpage_collapse_test_exit(struct mm_struct *mm)
>>>   static inline int hpage_collapse_test_exit_or_disable(struct
>>> mm_struct *mm)
>>>   {
>>> +    bool was_frozen;
>>> +    int ret = kthread_freezable_should_stop(&was_frozen);
>>> +
>>>       return hpage_collapse_test_exit(mm) ||
>>> -           mm_flags_test(MMF_DISABLE_THP_COMPLETELY, mm);
>>> +           mm_flags_test(MMF_DISABLE_THP_COMPLETELY, mm) ||
>>> +           was_frozen || ret;
>>>   }
>>
>> Since the hpage_collapse_test_exit_or_disable() can be called by
>> madvise_collapse(), which is not a kernel thread.
>
> Which raises the question whether we should forward that context
> (khugepaged vs. madvise) to hpage_collapse_test_exit_or_disable().

Passing in the 'cc' pointer looks fine to me. Something like:

static inline int hpage_collapse_test_exit_or_disable(struct mm_struct *mm,
						       struct collapse_control *cc)
{
	bool was_frozen = false;

	if (cc->is_khugepaged &&
	    unlikely(kthread_freezable_should_stop(&was_frozen)))
		return 1;

	return hpage_collapse_test_exit(mm) ||
	       mm_flags_test(MMF_DISABLE_THP_COMPLETELY, mm) ||
	       was_frozen;
}

Sergey, could you submit a formal patch for review?

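[Editorial sketch, not part of the original message.] The behaviour of the context-aware predicate proposed above can be checked with a small userspace model. It is a sketch under stated assumptions, not the kernel implementation: the model_* types and functions are hypothetical stand-ins, and the freezer helper is modelled as "clear the pending freeze, report it through was_frozen, return the stop request".

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are kernel types. */
struct model_mm   { bool exiting; bool thp_disabled; };
struct model_cc   { bool is_khugepaged; };
struct model_task { bool freeze_pending; bool stop_requested; };

/* Modelled freezer helper: freeze if pending, then report the stop request. */
static bool model_kthread_freezable_should_stop(struct model_task *t,
						bool *was_frozen)
{
	*was_frozen = t->freeze_pending;
	t->freeze_pending = false;	/* frozen, then thawed */
	return t->stop_requested;
}

/*
 * Modelled predicate: only the khugepaged context consults the freezer.
 * The madvise context skips it because it does not run in a kernel thread.
 */
static int model_test_exit_or_disable(const struct model_mm *mm,
				      const struct model_cc *cc,
				      struct model_task *t)
{
	bool was_frozen = false;

	if (cc->is_khugepaged &&
	    model_kthread_freezable_should_stop(t, &was_frozen))
		return 1;

	return mm->exiting || mm->thp_disabled || was_frozen;
}
```

In this model a pending freeze makes the predicate return nonzero once for the khugepaged context, so the scan loops that already test it would unwind, while the madvise context leaves the freeze request untouched.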
Thread overview: 13+ messages

2026-02-06  2:47 [stable-6.6.y] mm: khugepaged refuses to freeze Sergey Senozhatsky
2026-02-06  3:33 ` Baolin Wang
2026-02-06  3:38   ` Sergey Senozhatsky
2026-02-06  4:31     ` Sergey Senozhatsky
2026-02-06  5:12       ` Baolin Wang
2026-02-06  8:36         ` David Hildenbrand (Arm)
2026-02-06  8:55           ` Baolin Wang
2026-02-06  9:00             ` David Hildenbrand (Arm)
2026-02-10  3:21               ` Sergey Senozhatsky
2026-02-10 10:07                 ` Baolin Wang
2026-02-10 10:12                   ` Sergey Senozhatsky
2026-02-10 10:21                   ` David Hildenbrand (Arm)
2026-02-11  1:03                     ` Baolin Wang