linux-mm.kvack.org archive mirror
From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Zi Yan <ziy@nvidia.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Nico Pache <npache@redhat.com>,
	Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
	Barry Song <baohua@kernel.org>, Lance Yang <lance.yang@linux.dev>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCHv2] mm: khugepaged: make scan loops suspend aware
Date: Thu, 12 Feb 2026 09:44:13 +0100
Message-ID: <3571cf8b-9fb3-41b2-a402-a8537ee2c399@kernel.org>
In-Reply-To: <jftqdosilzfkd4ku6pjuevkvuwmhlqtua62jwsmmgdfcseitb4@gopt2xzqhyrk>

On 2/12/26 07:32, Sergey Senozhatsky wrote:
> On (26/02/11 10:50), David Hildenbrand (Arm) wrote:
>> On 2/11/26 04:15, Sergey Senozhatsky wrote:
>>> A number of khugepaged's loops, e.g. khugepaged_scan_mm_slot(),
>>> are unbounded in time, which can become problematic during system
>>> suspend:
>>>
>>> PM: suspend entry (s2idle)
>>> Filesystems sync: 0.003 seconds
>>> Freezing user space processes
>>> Freezing user space processes completed (elapsed 0.003 seconds)
>>> OOM killer disabled.
>>> Freezing remaining freezable tasks
>>> Freezing remaining freezable tasks failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
>>> task:khugepaged      state:D stack:0     pid:1345  ppid:2      flags:0x00004000
>>> Call Trace:
>>>    <TASK>
>>>    schedule+0x523/0x16a0
>>>    schedule_timeout+0x23b/0x6e0
>>>    io_schedule_timeout+0x3f/0x80
>>>    wait_for_completion_io_timeout+0xe4/0x170
>>>    submit_bio_wait+0x79/0xc0
>>>    swap_readpage+0x150/0x2d0
>>>    swap_cluster_readahead+0x3be/0x750
>>>    shmem_swapin+0xa7/0x100
>>>    shmem_swapin_folio+0xcd/0x2e0
>>>    shmem_get_folio+0x237/0x580
>>>    collapse_file+0x247/0x1280
>>>    hpage_collapse_scan_file+0x26e/0x380
>>>    khugepaged+0x43b/0x810
>>>    kthread+0xfb/0x120
>>>    </TASK>
>>>
>>> Make hpage_collapse_test_exit_or_disable() suspend aware so
>>> that khugepaged's scan loops can terminate in a timely manner
>>> and let the system enter the sleep state.
>>>
>>
>> Do we want a Fixes: tag, and maybe backport this to stable kernels?
> 
> I can Cc stable, but I don't know about Fixes - we are adding something
> that was never there, not fixing a regression.

Cc: stable is only possible with a valid Fixes:.

If we're fixing an issue, we usually try to identify which commit introduced the
issue.

For example, support for freezing was introduced in

commit 878aee7d6b5504e01b9caffce080e792b6b8d090
Author: Andrea Arcangeli <aarcange@redhat.com>
Date:   Thu Jan 13 15:47:10 2011 -0800

     thp: freeze khugepaged and ksmd
     
     It's unclear why schedule friendly kernel threads can't be taken away by
     the CPU through the scheduler itself.  It's safer to stop them as they can
     trigger memory allocation, if kswapd also freezes itself to avoid
     generating I/O they have too.



Now that I am looking through the history, I find:

commit b39ca208403c8f2c17dab1fbfef1f5ecaff25e53
Author: Kevin Hao <haokexin@gmail.com>
Date:   Wed Dec 20 07:17:53 2023 +0800

     mm/khugepaged: remove redundant try_to_freeze()
     
     A freezable kernel thread can enter frozen state during freezing by either
     calling try_to_freeze() or using wait_event_freezable() and its variants.
     However, there is no need to use both methods simultaneously.  The
     freezable wait variants have been used in khugepaged_wait_work() and
     khugepaged_alloc_sleep(), so remove this redundant try_to_freeze().
     
     I used the following stress-ng command to generate some memory load on my
     Intel Alder Lake board (24 CPUs, 32G memory).


I wonder if that made the issue more likely to appear?


Interestingly, we also had in the past:

commit 1dfb059b9438633b0546c5431538a47f6ed99028
Author: Andrea Arcangeli <aarcange@redhat.com>
Date:   Thu Dec 8 14:33:57 2011 -0800

     thp: reduce khugepaged freezing latency
     
     khugepaged can sometimes cause suspend to fail, requiring that the user
     retry the suspend operation.


So it's a recurring theme.


Given that we only scan "khugepaged_pages_to_scan" pages/ptes/etc. before going back to sleep,
I wonder how that can take that long in your setup.
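
To make the budget argument concrete, here is a minimal userspace toy model of one scan pass. It is not kernel code: struct scan_ctx, should_stop() and scan_pass() are hypothetical names made up for illustration, and freeze_at merely simulates the point at which a freeze request would arrive. It shows only that the budget bounds the number of steps, not the time each step takes, and that an in-loop check (in the spirit of the proposed change) ends the pass early.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy state; none of these names exist in the kernel. */
struct scan_ctx {
	size_t pages_to_scan;	/* per-pass budget, like khugepaged_pages_to_scan */
	size_t freeze_at;	/* simulated: freeze is requested before this step */
	bool freeze_aware;	/* does the loop check for a pending freeze? */
};

/* True when the pass should bail out before performing step number `done`. */
static bool should_stop(const struct scan_ctx *ctx, size_t done)
{
	return ctx->freeze_aware && done >= ctx->freeze_at;
}

/* One scan pass; returns how many steps ran before going back to sleep. */
static size_t scan_pass(const struct scan_ctx *ctx)
{
	size_t done = 0;

	while (done < ctx->pages_to_scan) {
		if (should_stop(ctx, done))
			break;
		/* A real step may block on I/O (e.g. swap-in); the toy does not. */
		done++;
	}
	return done;
}
```

With freeze_aware unset, the pass always consumes the whole budget regardless of freeze_at; with it set, the pass stops at the simulated freeze point.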

Why does it end up taking around 20 seconds in your setup? How is khugepaged_pages_to_scan
set in your environment?
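
In case it helps with collecting those values, a small userspace sketch that reads the khugepaged tunables from sysfs. It assumes the usual location under /sys/kernel/mm/transparent_hugepage/khugepaged/; the helper names (parse_tunable, read_tunable, print_khugepaged_tunables) are made up for this example, and a result of -1 only means the file could not be read or parsed.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed sysfs directory holding the khugepaged knobs. */
#define KHUGEPAGED_SYSFS "/sys/kernel/mm/transparent_hugepage/khugepaged/"

/* Parse a non-negative decimal value; returns -1 on malformed input. */
static long parse_tunable(const char *text)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(text, &end, 10);
	if (errno || end == text || val < 0)
		return -1;
	return val;
}

/* Read one tunable file; returns -1 if it cannot be opened or parsed. */
static long read_tunable(const char *path)
{
	char buf[64];
	FILE *f = fopen(path, "r");
	long val = -1;

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		val = parse_tunable(buf);
	fclose(f);
	return val;
}

/* Print the per-pass budget and the sleep intervals between passes. */
static void print_khugepaged_tunables(void)
{
	printf("pages_to_scan=%ld\n",
	       read_tunable(KHUGEPAGED_SYSFS "pages_to_scan"));
	printf("scan_sleep_millisecs=%ld\n",
	       read_tunable(KHUGEPAGED_SYSFS "scan_sleep_millisecs"));
	printf("alloc_sleep_millisecs=%ld\n",
	       read_tunable(KHUGEPAGED_SYSFS "alloc_sleep_millisecs"));
}
```

Calling print_khugepaged_tunables() from a main() on the affected machine would give the numbers asked about above.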


-- 
Cheers,

David

