From: Wei Yang <richard.weiyang@gmail.com>
To: Nico Pache <npache@redhat.com>
Cc: Wei Yang <richard.weiyang@gmail.com>,
linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
linux-mm@kvack.org, linux-doc@vger.kernel.org, david@redhat.com,
ziy@nvidia.com, baolin.wang@linux.alibaba.com,
lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
ryan.roberts@arm.com, dev.jain@arm.com, corbet@lwn.net,
rostedt@goodmis.org, mhiramat@kernel.org,
mathieu.desnoyers@efficios.com, akpm@linux-foundation.org,
baohua@kernel.org, willy@infradead.org, peterx@redhat.com,
wangkefeng.wang@huawei.com, usamaarif642@gmail.com,
sunnanyong@huawei.com, vishal.moola@gmail.com,
thomas.hellstrom@linux.intel.com, yang@os.amperecomputing.com,
kas@kernel.org, aarcange@redhat.com, raquini@redhat.com,
anshuman.khandual@arm.com, catalin.marinas@arm.com,
tiwai@suse.de, will@kernel.org, dave.hansen@linux.intel.com,
jack@suse.cz, cl@gentwo.org, jglisse@google.com,
surenb@google.com, zokeefe@google.com, hannes@cmpxchg.org,
rientjes@google.com, mhocko@suse.com, rdunlap@infradead.org,
hughd@google.com, lance.yang@linux.dev, vbabka@suse.cz,
rppt@kernel.org, jannh@google.com, pfalcato@suse.de
Subject: Re: [PATCH v12 mm-new 13/15] khugepaged: avoid unnecessary mTHP collapse attempts
Date: Tue, 18 Nov 2025 02:00:34 +0000 [thread overview]
Message-ID: <20251118020034.rdgisvkqs53lwabz@master> (raw)
In-Reply-To: <CAA1CXcA9zaGqLPHnJWH=fKPUjb02dV+rKgfmsRZOGdeukiC8eg@mail.gmail.com>
On Mon, Nov 17, 2025 at 11:16:53AM -0700, Nico Pache wrote:
>On Sat, Nov 8, 2025 at 7:40 PM Wei Yang <richard.weiyang@gmail.com> wrote:
>>
>> On Wed, Oct 22, 2025 at 12:37:15PM -0600, Nico Pache wrote:
>> >There are cases where, if an attempted collapse fails, all subsequent
>> >orders are guaranteed to also fail. Avoid these collapse attempts by
>> >bailing out early.
>> >
>> >Signed-off-by: Nico Pache <npache@redhat.com>
>> >---
>> > mm/khugepaged.c | 31 ++++++++++++++++++++++++++++++-
>> > 1 file changed, 30 insertions(+), 1 deletion(-)
>> >
>> >diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> >index e2319bfd0065..54f5c7888e46 100644
>> >--- a/mm/khugepaged.c
>> >+++ b/mm/khugepaged.c
>> >@@ -1431,10 +1431,39 @@ static int collapse_scan_bitmap(struct mm_struct *mm, unsigned long address,
>> > ret = collapse_huge_page(mm, address, referenced,
>> > unmapped, cc, mmap_locked,
>> > order, offset);
>> >- if (ret == SCAN_SUCCEED) {
>> >+
>> >+ /*
>> >+ * Analyze failure reason to determine next action:
>> >+ * - goto next_order: try smaller orders in same region
>> >+ * - continue: try other regions at same order
>> >+ * - break: stop all attempts (system-wide failure)
>> >+ */
>> >+ switch (ret) {
>> >+ /* Cases were we should continue to the next region */
>> >+ case SCAN_SUCCEED:
>> > collapsed += 1UL << order;
>> >+ fallthrough;
>> >+ case SCAN_PTE_MAPPED_HUGEPAGE:
>> > continue;
>> >+ /* Cases were lower orders might still succeed */
>> >+ case SCAN_LACK_REFERENCED_PAGE:
>> >+ case SCAN_EXCEED_NONE_PTE:
>> >+ case SCAN_EXCEED_SWAP_PTE:
>> >+ case SCAN_EXCEED_SHARED_PTE:
>> >+ case SCAN_PAGE_LOCK:
>> >+ case SCAN_PAGE_COUNT:
>> >+ case SCAN_PAGE_LRU:
>> >+ case SCAN_PAGE_NULL:
>> >+ case SCAN_DEL_PAGE_LRU:
>> >+ case SCAN_PTE_NON_PRESENT:
>> >+ case SCAN_PTE_UFFD_WP:
>> >+ case SCAN_ALLOC_HUGE_PAGE_FAIL:
>> >+ goto next_order;
>> >+ /* All other cases should stop collapse attempts */
>> >+ default:
>> >+ break;
>> > }
>> >+ break;
>>
>> One question here:
>
>Hi Wei Yang,
>
>Sorry I forgot to get back to this email.
>
No problem, thanks for taking a look.
>>
>> Suppose we have iterated several orders and not collapse successfully yet. So
>> the mthp_bitmap_stack[] would look like this:
>>
>> [8 7 6 6]
>> ^
>> |
>
>so we always pop before pushing. So it would go
>
>[9]
>pop
>if (collapse fails)
>[8 8]
>lets say we pop and successfully collapse a order 8
>[8]
>Then we fail the other order 8
>[7 7]
>now if we succeed the first order 7
>[7 6 6]
>I believe we are now in the state you wanted to describe.
>
>>
>> Now we found this one pass the threshold check, but it fails with other
>> result.
>
>ok lets say we pass the threshold checks, but the collapse fails for
>any reason that is described in the
>/* Cases were lower orders might still succeed */
>In this case we would continue to order 5 (or lower). Once we are done
>with this branch of the tree we go back to the other order 6 collapse.
>and eventually the order 7.
>
>>
>> Current code looks it would give up at all, but we may still have a chance to
>> collapse the above 3 range?
>
>for cases under /* All other cases should stop collapse attempts */
>Yes we would bail out and skip some collapses. I tried to think about
>all the cases were we would still want to continue trying, vs cases
>where the system is probably out of resources or hitting some major
>failure, and we should just break out (as others will probably fail
>too).
>
Thanks, your explanation is very clear.
>But this is also why I separated this patch out on its own. I was
>hoping to have some more focus on the different cases, and make sure I
>handled them in the best possible way. So I really appreciate the
>question :)
>
>* I did some digging through old message to find this *
>
>I believe these are the remaining cases. If these are hit I figured
>it's better to abort.
>
I agree we need to take care of those cases.
>/* cases where we must stop collapse attempts */
>case SCAN_CGROUP_CHARGE_FAIL:
>case SCAN_COPY_MC:
>case SCAN_ADDRESS_RANGE:
>case SCAN_PMD_NULL:
>case SCAN_ANY_PROCESS:
>case SCAN_VMA_NULL:
>case SCAN_VMA_CHECK:
>case SCAN_SCAN_ABORT:
>case SCAN_PMD_NONE:
>case SCAN_PAGE_ANON:
>case SCAN_PMD_MAPPED:
>case SCAN_FAIL:
>
>Please let me know if you think we should move these to either the
>`continue` or `next order` cases.
Take a look into these cases, it looks good to me now.
Also one of my concern is this coding style is a little hard to maintain. In
case we introduce a new result, we should remember to add it here. Otherwise
we may stop the collapse too early.
While it maybe a separate work after this patch set merged.
>
>Cheers,
>-- Nico
>
>>
>> > }
>> >
>> > next_order:
>> >--
>> >2.51.0
>>
>> --
>> Wei Yang
>> Help you, Help me
>>
--
Wei Yang
Help you, Help me
next prev parent reply other threads:[~2025-11-18 2:00 UTC|newest]
Thread overview: 91+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-22 18:37 [PATCH v12 mm-new 00/15] khugepaged: mTHP support Nico Pache
2025-10-22 18:37 ` [PATCH v12 mm-new 01/15] khugepaged: rename hpage_collapse_* to collapse_* Nico Pache
2025-11-08 1:42 ` Wei Yang
2025-10-22 18:37 ` [PATCH v12 mm-new 02/15] introduce collapse_single_pmd to unify khugepaged and madvise_collapse Nico Pache
2025-10-27 9:00 ` Lance Yang
2025-10-27 15:44 ` Lorenzo Stoakes
2025-11-08 1:44 ` Wei Yang
2025-10-22 18:37 ` [PATCH v12 mm-new 03/15] khugepaged: generalize hugepage_vma_revalidate for mTHP support Nico Pache
2025-10-27 9:02 ` Lance Yang
2025-11-08 1:54 ` Wei Yang
2025-10-22 18:37 ` [PATCH v12 mm-new 04/15] khugepaged: generalize alloc_charge_folio() Nico Pache
2025-10-27 9:05 ` Lance Yang
2025-11-08 2:34 ` Wei Yang
2025-10-22 18:37 ` [PATCH v12 mm-new 05/15] khugepaged: generalize __collapse_huge_page_* for mTHP support Nico Pache
2025-10-27 9:17 ` Lance Yang
2025-10-27 16:00 ` Lorenzo Stoakes
2025-11-10 13:20 ` Nico Pache
2025-11-08 3:01 ` Wei Yang
2025-10-22 18:37 ` [PATCH v12 mm-new 06/15] khugepaged: introduce collapse_max_ptes_none helper function Nico Pache
2025-10-27 17:53 ` Lorenzo Stoakes
2025-10-28 10:09 ` Baolin Wang
2025-10-28 13:57 ` Nico Pache
2025-10-28 17:07 ` Lorenzo Stoakes
2025-10-28 17:56 ` David Hildenbrand
2025-10-28 18:09 ` Lorenzo Stoakes
2025-10-28 18:17 ` David Hildenbrand
2025-10-28 18:41 ` Lorenzo Stoakes
2025-10-29 15:04 ` David Hildenbrand
2025-10-29 18:41 ` Lorenzo Stoakes
2025-10-29 21:10 ` Nico Pache
2025-10-30 18:03 ` Lorenzo Stoakes
2025-10-29 20:45 ` Nico Pache
2025-10-28 13:36 ` Nico Pache
2025-10-28 14:15 ` David Hildenbrand
2025-10-28 17:29 ` Lorenzo Stoakes
2025-10-28 17:36 ` Lorenzo Stoakes
2025-10-28 18:08 ` David Hildenbrand
2025-10-28 18:59 ` Lorenzo Stoakes
2025-10-28 19:08 ` Lorenzo Stoakes
2025-10-29 2:09 ` Baolin Wang
2025-10-29 2:49 ` Nico Pache
2025-10-29 18:55 ` Lorenzo Stoakes
2025-10-29 21:14 ` Nico Pache
2025-10-30 1:15 ` Baolin Wang
2025-10-29 2:47 ` Nico Pache
2025-10-29 18:58 ` Lorenzo Stoakes
2025-10-29 21:23 ` Nico Pache
2025-10-30 10:15 ` Lorenzo Stoakes
2025-10-31 11:12 ` David Hildenbrand
2025-10-28 16:57 ` Lorenzo Stoakes
2025-10-28 17:49 ` David Hildenbrand
2025-10-28 17:59 ` Lorenzo Stoakes
2025-10-22 18:37 ` [PATCH v12 mm-new 07/15] khugepaged: generalize collapse_huge_page for mTHP collapse Nico Pache
2025-10-27 3:25 ` Baolin Wang
2025-11-06 18:14 ` Lorenzo Stoakes
2025-11-07 3:09 ` Dev Jain
2025-11-07 9:18 ` Lorenzo Stoakes
2025-11-07 19:33 ` Nico Pache
2025-10-22 18:37 ` [PATCH v12 mm-new 08/15] khugepaged: skip collapsing mTHP to smaller orders Nico Pache
2025-10-22 18:37 ` [PATCH v12 mm-new 09/15] khugepaged: add per-order mTHP collapse failure statistics Nico Pache
2025-11-06 18:45 ` Lorenzo Stoakes
2025-11-07 17:14 ` Nico Pache
2025-10-22 18:37 ` [PATCH v12 mm-new 10/15] khugepaged: improve tracepoints for mTHP orders Nico Pache
2025-10-22 18:37 ` [PATCH v12 mm-new 11/15] khugepaged: introduce collapse_allowable_orders helper function Nico Pache
2025-11-06 18:49 ` Lorenzo Stoakes
2025-11-07 18:01 ` Nico Pache
2025-10-22 18:37 ` [PATCH v12 mm-new 12/15] khugepaged: Introduce mTHP collapse support Nico Pache
2025-10-27 6:28 ` Baolin Wang
2025-11-09 2:08 ` Wei Yang
2025-11-11 21:56 ` Nico Pache
2025-11-19 11:53 ` Lorenzo Stoakes
2025-11-19 12:08 ` Lorenzo Stoakes
2025-11-20 22:32 ` Nico Pache
2025-10-22 18:37 ` [PATCH v12 mm-new 13/15] khugepaged: avoid unnecessary mTHP collapse attempts Nico Pache
2025-11-09 2:40 ` Wei Yang
2025-11-17 18:16 ` Nico Pache
2025-11-18 2:00 ` Wei Yang [this message]
2025-11-19 12:05 ` Lorenzo Stoakes
2025-11-26 23:16 ` Nico Pache
2025-11-26 23:29 ` Nico Pache
2025-10-22 18:37 ` [PATCH v12 mm-new 14/15] khugepaged: run khugepaged for all orders Nico Pache
2025-11-19 12:13 ` Lorenzo Stoakes
2025-11-20 6:37 ` Baolin Wang
2025-10-22 18:37 ` [PATCH v12 mm-new 15/15] Documentation: mm: update the admin guide for mTHP collapse Nico Pache
2025-10-22 19:52 ` Christoph Lameter (Ampere)
2025-10-22 20:22 ` David Hildenbrand
2025-10-23 8:00 ` Lorenzo Stoakes
2025-10-23 8:44 ` Pedro Falcato
2025-10-24 13:54 ` Zach O'Keefe
2025-10-23 23:41 ` Christoph Lameter (Ampere)
2025-10-22 20:13 ` [PATCH v12 mm-new 00/15] khugepaged: mTHP support Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251118020034.rdgisvkqs53lwabz@master \
--to=richard.weiyang@gmail.com \
--cc=Liam.Howlett@oracle.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=anshuman.khandual@arm.com \
--cc=baohua@kernel.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=catalin.marinas@arm.com \
--cc=cl@gentwo.org \
--cc=corbet@lwn.net \
--cc=dave.hansen@linux.intel.com \
--cc=david@redhat.com \
--cc=dev.jain@arm.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=jack@suse.cz \
--cc=jannh@google.com \
--cc=jglisse@google.com \
--cc=kas@kernel.org \
--cc=lance.yang@linux.dev \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=mathieu.desnoyers@efficios.com \
--cc=mhiramat@kernel.org \
--cc=mhocko@suse.com \
--cc=npache@redhat.com \
--cc=peterx@redhat.com \
--cc=pfalcato@suse.de \
--cc=raquini@redhat.com \
--cc=rdunlap@infradead.org \
--cc=rientjes@google.com \
--cc=rostedt@goodmis.org \
--cc=rppt@kernel.org \
--cc=ryan.roberts@arm.com \
--cc=sunnanyong@huawei.com \
--cc=surenb@google.com \
--cc=thomas.hellstrom@linux.intel.com \
--cc=tiwai@suse.de \
--cc=usamaarif642@gmail.com \
--cc=vbabka@suse.cz \
--cc=vishal.moola@gmail.com \
--cc=wangkefeng.wang@huawei.com \
--cc=will@kernel.org \
--cc=willy@infradead.org \
--cc=yang@os.amperecomputing.com \
--cc=ziy@nvidia.com \
--cc=zokeefe@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox