From: Dev Jain <dev.jain@arm.com>
To: Vernon Yang <vernon2gm@gmail.com>,
	"David Hildenbrand (Red Hat)" <david@kernel.org>
Cc: akpm@linux-foundation.org, lorenzo.stoakes@oracle.com,
	ziy@nvidia.com, baohua@kernel.org, lance.yang@linux.dev,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Vernon Yang <yanglincheng@kylinos.cn>
Subject: Re: [PATCH 2/4] mm: khugepaged: remove mm when all memory has been collapsed
Date: Tue, 23 Dec 2025 16:48:57 +0530
Message-ID: <52174c05-e9ed-4049-ac05-d0d0b3228f2a@arm.com>
In-Reply-To: <djh6xaia56grmgxdok23kp6ly3oe3ugsinxdp6jie3k2tzwaml@57gbrcr75jng>


On 19/12/25 2:05 pm, Vernon Yang wrote:
> On Thu, Dec 18, 2025 at 10:29:18AM +0100, David Hildenbrand (Red Hat) wrote:
>> On 12/15/25 10:04, Vernon Yang wrote:
>>> The following data is traced by bpftrace on a desktop system. After
>>> the system has been left idle for 10 minutes upon booting, a lot of
>>> SCAN_PMD_MAPPED or SCAN_PMD_NONE are observed during a full scan by
>>> khugepaged.
>>>
>>> @scan_pmd_status[1]: 1           ## SCAN_SUCCEED
>>> @scan_pmd_status[4]: 158         ## SCAN_PMD_MAPPED
>>> @scan_pmd_status[3]: 174         ## SCAN_PMD_NONE
>>> total progress size: 701 MB
>>> Total time         : 440 seconds ## includes khugepaged_scan_sleep_millisecs
>>>
>>> The khugepaged_scan list holds every task that supports collapsing into
>>> hugepages; as long as the task is not destroyed, khugepaged will not
>>> remove it from the khugepaged_scan list. As a result, a task may already
>>> have collapsed all of its memory regions into hugepages, yet khugepaged
>>> keeps scanning it, which wastes CPU time. Because of
>>> khugepaged_scan_sleep_millisecs (default 10s), scanning a large number
>>> of such tasks takes a long time, so the tasks that actually need
>>> scanning are reached late.
>>>
>>> After applying this patch, when all memory is either SCAN_PMD_MAPPED or
>>> SCAN_PMD_NONE, the mm is automatically removed from khugepaged's scan
>>> list. If a page fault or MADV_HUGEPAGE occurs again, the mm is added
>>> back to khugepaged.
>> I don't like that, as it assumes that memory within such a process would be
>> rather static, which is easily not the case (e.g., allocators just doing
>> MADV_DONTNEED to free memory).
>>
>> If most stuff is collapsed to PMDs already, can't we just skip over these
>> regions a bit faster?
> I had a flash of inspiration and came up with an idea.
>
> If these regions have already been collapsed into hugepages, rechecking
> them is very fast. Since khugepaged_pages_to_scan can also represent
> the number of VMAs to skip, we can extend its semantics as follows:
>
> 	/*
> 	 * by default, scan 8*HPAGE_PMD_NR ptes, pmd_mapped, no_pte_table or
> 	 * vmas every 10 seconds.
> 	 */
> 	static unsigned int khugepaged_pages_to_scan __read_mostly;
>
> 	switch (*result) {
> 	case SCAN_NO_PTE_TABLE:
> 	case SCAN_PMD_MAPPED:
> 	case SCAN_PTE_MAPPED_HUGEPAGE:
> 		progress++; // here
> 		break;
> 	case SCAN_SUCCEED:
> 		++khugepaged_pages_collapsed;
> 		fallthrough;
> 	default:
> 		progress += HPAGE_PMD_NR;
> 	}
>
> This way we can achieve our goal. David, do you like it?

This looks good. Can you formally test this and see if it comes close to the
optimizations yielded by the current version of the patchset?

>
>> --
>> Cheers
>>
>> David
> --
> Thanks,
> Vernon
>


Thread overview: 42+ messages
2025-12-15  9:04 [PATCH 0/4] Improve khugepaged scan logic Vernon Yang
2025-12-15  9:04 ` [PATCH 1/4] mm: khugepaged: add trace_mm_khugepaged_scan event Vernon Yang
2025-12-18  9:24   ` David Hildenbrand (Red Hat)
2025-12-19  5:21     ` Vernon Yang
2025-12-15  9:04 ` [PATCH 2/4] mm: khugepaged: remove mm when all memory has been collapsed Vernon Yang
2025-12-15 11:52   ` Lance Yang
2025-12-16  6:27     ` Vernon Yang
2025-12-15 21:45   ` kernel test robot
2025-12-16  6:30     ` Vernon Yang
2025-12-15 23:01   ` kernel test robot
2025-12-16  6:32     ` Vernon Yang
2025-12-17  3:31   ` Wei Yang
2025-12-18  3:27     ` Vernon Yang
2025-12-18  3:48       ` Wei Yang
2025-12-18  4:41         ` Vernon Yang
2025-12-18  9:29   ` David Hildenbrand (Red Hat)
2025-12-19  5:24     ` Vernon Yang
2025-12-19  9:00       ` David Hildenbrand (Red Hat)
2025-12-19  8:35     ` Vernon Yang
2025-12-19  8:55       ` David Hildenbrand (Red Hat)
2025-12-23 11:18       ` Dev Jain [this message]
2025-12-25 16:07         ` Vernon Yang
2025-12-29  6:02         ` Vernon Yang
2025-12-22 19:00   ` kernel test robot
2025-12-15  9:04 ` [PATCH 3/4] mm: khugepaged: move mm to list tail when MADV_COLD/MADV_FREE Vernon Yang
2025-12-15 21:12   ` kernel test robot
2025-12-16  7:00     ` Vernon Yang
2025-12-16 13:08   ` kernel test robot
2025-12-16 13:31   ` kernel test robot
2025-12-18  9:31   ` David Hildenbrand (Red Hat)
2025-12-19  5:29     ` Vernon Yang
2025-12-19  8:58       ` David Hildenbrand (Red Hat)
2025-12-21  2:10         ` Wei Yang
2025-12-21  4:25           ` Vernon Yang
2025-12-21  9:24             ` David Hildenbrand (Red Hat)
2025-12-21 12:34               ` Vernon Yang
2025-12-23  9:59                 ` David Hildenbrand (Red Hat)
2025-12-25 15:12                   ` Vernon Yang
2025-12-21 12:38             ` Wei Yang
2025-12-15  9:04 ` [PATCH 4/4] mm: khugepaged: set to next mm direct when mm has MMF_DISABLE_THP_COMPLETELY Vernon Yang
2025-12-18  9:33   ` David Hildenbrand (Red Hat)
2025-12-19  5:31     ` Vernon Yang
