From: Lance Yang <lance.yang@linux.dev>
To: "David Hildenbrand (Arm)" <david@kernel.org>,
Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Zi Yan <ziy@nvidia.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Nico Pache <npache@redhat.com>,
Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
Barry Song <baohua@kernel.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCHv2] mm: khugepaged: make scan loops suspend aware
Date: Sat, 14 Feb 2026 14:35:07 +0800
Message-ID: <56345542-544a-48e4-b127-49a850deee9b@linux.dev>
In-Reply-To: <16ce9ce2-8081-482c-a6ea-0932ebd081f1@kernel.org>
On 2026/2/12 17:10, David Hildenbrand (Arm) wrote:
> On 2/12/26 10:05, Sergey Senozhatsky wrote:
>> On (26/02/12 09:44), David Hildenbrand (Arm) wrote:
>> [..]
>>> If we're fixing an issue, we usually try to identify which commit
>>> introduced the
>>> issue.
>>>
>>> For example, support for freezing was introduced in
>>>
>>> commit 878aee7d6b5504e01b9caffce080e792b6b8d090
>>> Author: Andrea Arcangeli <aarcange@redhat.com>
>>> Date: Thu Jan 13 15:47:10 2011 -0800
>>>
>>> thp: freeze khugepaged and ksmd
>>> It's unclear why schedule friendly kernel threads can't be taken
>>> away by
>>> the CPU through the scheduler itself. It's safer to stop them
>>> as they can
>>> trigger memory allocation, if kswapd also freezes itself to avoid
>>> generating I/O they have too.
>>>
>>>
>>>
>>> Now that I am looking through the history, I find:
>>>
>>> commit b39ca208403c8f2c17dab1fbfef1f5ecaff25e53
>>> Author: Kevin Hao <haokexin@gmail.com>
>>> Date: Wed Dec 20 07:17:53 2023 +0800
>>>
>>> mm/khugepaged: remove redundant try_to_freeze()
>>> A freezable kernel thread can enter frozen state during freezing
>>> by either
>>> calling try_to_freeze() or using wait_event_freezable() and its
>>> variants.
>>> However, there is no need to use both methods simultaneously. The
>>> freezable wait variants have been used in khugepaged_wait_work()
>>> and
>>> khugepaged_alloc_sleep(), so remove this redundant try_to_freeze().
>>> I used the following stress-ng command to generate some memory
>>> load on my
>>> Intel Alder Lake board (24 CPUs, 32G memory).
>>>
>>>
>>> I wonder if that made the issue more likely to appear?
>>>
>>>
>>> Interestingly, we also had in the past:
>>>
>>> commit 1dfb059b9438633b0546c5431538a47f6ed99028
>>> Author: Andrea Arcangeli <aarcange@redhat.com>
>>> Date: Thu Dec 8 14:33:57 2011 -0800
>>>
>>> thp: reduce khugepaged freezing latency
>>> khugepaged can sometimes cause suspend to fail, requiring that
>>> the user
>>> retry the suspend operation.
>>>
>>>
>>> So it's a recurring theme.
>>
>> Interesting, so 1dfb059b9438633 and 878aee7d6b5504e fixed real
>> problems "khugepaged can sometimes cause suspend to fail", but
>> I don't see what exactly b39ca208403c8f2 fixed. Sounds more
>> like an "optimization"?
>
> Yes, a cleanup. I wonder if it caused harm.
>
>>
>>> Given that we only scan "khugepaged_pages_to_scan" pages/ptes/etc.
>>> before going back to sleep,
>>> I wonder how that can take in your setup that long.
>>>
>>> Why does it end up taking something around 20 seconds in your setup?
>>
>> I only have bug reports at hands, I don't have a repro. Can the fact
>> that swap reads require S/W decompression (zram) add enough latency?
>
> I guess so. 20 seconds is still a lot.
>
>>
>>> How is khugepaged_pages_to_scan set in your environment?
>>
>> Let me check.
>>
>> cat /sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan
>> 4096
>>
>> Hmm, doesn't sound too high. Let me look more.
>
> Yeah, that's not a lot of pages to scan. It's the default (8 *
> HPAGE_PMD_NR)

Right. 4096 pages is not much to scan :)

This patch lets khugepaged be frozen between VMAs.

But if khugepaged is already in the middle of a collapse when the freeze
starts, there are two places without freeze checks that could take a
long time:

- __collapse_huge_page_swapin() loops over 512 pages and calls
  do_swap_page() for each swap entry.

- collapse_file() loops over 512 pages and calls shmem_get_folio(). If
  pages are swapped out, shmem_swapin_folio() is called.

Each swap-in can block on I/O. With multiple pages swapped out, the
cumulative time adds up.

Maybe we also need checkpoints inside these loops to bail out early?
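
To make the checkpoint idea concrete, here is a rough userspace sketch. It
is not kernel code and not a proposed patch. It only models a 512-page loop
that checks for a pending freeze before each potentially blocking swap-in.
All names (struct sim, sim_freeze_requested(), sim_swapin(),
sim_collapse_swapin()) are made up for illustration; in the kernel the check
would presumably be something like freezing(current), and the bail-out would
have to unwind whatever the real loop has set up by then.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_PAGES 512	/* pages per collapse attempt, as in the loops above */

enum sim_result { SIM_DONE, SIM_BAILED_OUT };

/* Hypothetical simulation state, standing in for the task and its I/O. */
struct sim {
	size_t freeze_after;	/* freeze becomes pending after this many swap-ins */
	size_t swapins_done;	/* swap-ins performed so far */
};

/* Stand-in for a freeze check; assumed to be cheap and non-blocking. */
static bool sim_freeze_requested(const struct sim *s)
{
	return s->swapins_done >= s->freeze_after;
}

/* Stand-in for one swap-in that may block on I/O. */
static void sim_swapin(struct sim *s)
{
	s->swapins_done++;
}

/*
 * Walk nr pages; for each swapped-out page, check for a pending freeze
 * first and bail out early instead of starting another blocking swap-in.
 */
static enum sim_result sim_collapse_swapin(struct sim *s, const bool *swapped,
					   size_t nr)
{
	assert(s && swapped);

	for (size_t i = 0; i < nr; i++) {
		if (!swapped[i])
			continue;	/* resident page, nothing to wait for */
		if (sim_freeze_requested(s))
			return SIM_BAILED_OUT;	/* caller would retry later */
		sim_swapin(s);
	}
	return SIM_DONE;
}
```

With all 512 pages swapped out and a freeze arriving after 10 swap-ins, the
loop stops after 10 instead of waiting for the remaining 502. The cost is
that the work done so far in that attempt is thrown away and redone after
resume.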
Cheers,
Lance