From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: Nico Pache <npache@redhat.com>
Cc: linux-mm@kvack.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	akpm@linux-foundation.org, corbet@lwn.net, rostedt@goodmis.org,
	mhiramat@kernel.org, mathieu.desnoyers@efficios.com,
	david@redhat.com, baohua@kernel.org, ryan.roberts@arm.com,
	willy@infradead.org, peterx@redhat.com, ziy@nvidia.com,
	wangkefeng.wang@huawei.com, usamaarif642@gmail.com,
	sunnanyong@huawei.com, vishal.moola@gmail.com,
	thomas.hellstrom@linux.intel.com, yang@os.amperecomputing.com,
	kirill.shutemov@linux.intel.com, aarcange@redhat.com,
	raquini@redhat.com, dev.jain@arm.com, anshuman.khandual@arm.com,
	catalin.marinas@arm.com, tiwai@suse.de, will@kernel.org,
	dave.hansen@linux.intel.com, jack@suse.cz, cl@gentwo.org,
	jglisse@google.com, surenb@google.com, zokeefe@google.com,
	hannes@cmpxchg.org, rientjes@google.com, mhocko@suse.com,
	rdunlap@infradead.org
Subject: Re: [PATCH v4 06/12] khugepaged: introduce khugepaged_scan_bitmap for mTHP support
Date: Tue, 29 Apr 2025 15:16:08 +0800
Message-ID: <ad93f480-431a-4f9b-9225-136d8c6c37df@linux.alibaba.com>
In-Reply-To: <CAA1CXcAiiBJ4mMp0WJUk7tWTF20guJi80P8wBh271yJ9P+c_VA@mail.gmail.com>



On 2025/4/28 22:47, Nico Pache wrote:
> On Sat, Apr 26, 2025 at 8:52 PM Baolin Wang
> <baolin.wang@linux.alibaba.com> wrote:
>>
>>
>>
>> On 2025/4/17 08:02, Nico Pache wrote:
>>> khugepaged scans PMD ranges for potential collapse to a hugepage. To add
>>> mTHP support we use this scan to instead record chunks of utilized
>>> sections of the PMD.
>>>
>>> khugepaged_scan_bitmap uses a stack struct to recursively scan a bitmap
>>> that represents chunks of utilized regions. We can then determine what
>>> mTHP size fits best and in the following patch, we set this bitmap while
>>> scanning the PMD.
>>>
>>> max_ptes_none is used as a scale to determine how "full" an order must
>>> be before being considered for collapse.
>>>
>>> When attempting to collapse to an order that is set to "always",
>>> let's always collapse to that order in a greedy manner, without
>>> considering the number of bits set.
>>>
>>> Signed-off-by: Nico Pache <npache@redhat.com>
>>> ---
>>>    include/linux/khugepaged.h |  4 ++
>>>    mm/khugepaged.c            | 94 ++++++++++++++++++++++++++++++++++----
>>>    2 files changed, 89 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
>>> index 1f46046080f5..18fe6eb5051d 100644
>>> --- a/include/linux/khugepaged.h
>>> +++ b/include/linux/khugepaged.h
>>> @@ -1,6 +1,10 @@
>>>    /* SPDX-License-Identifier: GPL-2.0 */
>>>    #ifndef _LINUX_KHUGEPAGED_H
>>>    #define _LINUX_KHUGEPAGED_H
>>> +#define KHUGEPAGED_MIN_MTHP_ORDER    2
>>
>> Why is the minimum mTHP order set to 2? IMO, the file large folios can
>> support order 1, so we don't expect to collapse exec file small folios
>> to order 1 if possible?
> I should have been more specific in the patch notes, but this affects
> anonymous only. I'll go over my commit messages and make sure this is
> reflected in the next version.

OK. I am looking into how to support shmem mTHP collapse based on your 
patch series.

>> (PS: I need more time to understand your logic in this patch, and any
>> additional explanation would be helpful:) )
> 
> We are currently scanning ptes in a PMD. The core principle/reasoning
> behind the bitmap is to keep the PMD scan while saving its state. We
> then use this bitmap to determine which chunks of the PMD are active
> and are the best candidates for mTHP collapse. We start at the PMD
> level, and recursively break down the bitmap to find the appropriate
> collapse sizes.
> 
> Looking at a simplified example: we scan a PMD and get the following
> bitmap, 1111101101101011 (in this case MIN_MTHP_ORDER = 5, so each bit
> == 32 ptes; in the actual patch set each bit == 4 ptes).
> We would first attempt a PMD collapse, checking the number of bits
> set against the max_ptes_none tunable. If that check fails, we try
> the next enabled mTHP order on each half of the bitmap.
> 
> i.e., an order 8 attempt on 11111011 and an order 8 attempt on 01101011.
> 
> If a collapse succeeds, we don't keep recursing on that portion of the
> bitmap. If not, we continue attempting lower orders.
> 
> Hopefully that helps you understand my logic here! Let me know if you
> need more clarification.

Thanks for your explanation. That's pretty much how I understand it. :)
I'll test your new version.
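
For readers following along, here is a minimal userspace sketch of the
recursion described in the quoted explanation, using the simplified
example (each bit == 32 ptes). It is not the code from the patch: the
names scan_pmd(), scan_range() and range_is_full_enough() are
hypothetical, every order is assumed to be enabled, and the way
max_ptes_none is scaled down to a range (allowed empty chunks =
nchunks * max_ptes_none / 512) is an assumption made for illustration.
The real series also uses an explicit stack struct rather than function
recursion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PMD_ORDER    9   /* assumption: 512 ptes per PMD (4K base pages) */
#define CHUNK_ORDER  5   /* each bit covers 32 ptes, as in the example */
#define MAX_RESULTS  16  /* at most one result per chunk */

struct collapse {
	int order;        /* order we would collapse to */
	int first_chunk;  /* index of the first bit in the range */
};

struct scan {
	const char *bits;            /* '1' = utilized chunk, '0' = empty */
	unsigned int max_ptes_none;  /* tunable, 0..511 */
	struct collapse out[MAX_RESULTS];
	int nr_out;
};

/* Hypothetical check: scale max_ptes_none down to this range's size. */
static int range_is_full_enough(const struct scan *s, int first, int nchunks)
{
	int empty = 0;

	for (int i = first; i < first + nchunks; i++)
		empty += (s->bits[i] == '0');

	int max_empty = (int)(((unsigned int)nchunks * s->max_ptes_none) >> PMD_ORDER);
	return empty <= max_empty;
}

/* Try 'order' on the range starting at 'first'; on failure, try each half. */
static void scan_range(struct scan *s, int order, int first)
{
	int nchunks = 1 << (order - CHUNK_ORDER);

	if (range_is_full_enough(s, first, nchunks)) {
		assert(s->nr_out < MAX_RESULTS);
		s->out[s->nr_out++] = (struct collapse){ order, first };
		return;  /* collapsed: stop recursing on this portion */
	}
	if (order == CHUNK_ORDER)
		return;  /* smallest order reached, nothing left to try */

	scan_range(s, order - 1, first);
	scan_range(s, order - 1, first + nchunks / 2);
}

/* Scan one PMD worth of bits; returns the number of recorded collapses. */
static int scan_pmd(struct scan *s, const char *bits, unsigned int max_ptes_none)
{
	assert(strlen(bits) == (1u << (PMD_ORDER - CHUNK_ORDER)));
	s->bits = bits;
	s->max_ptes_none = max_ptes_none;
	s->nr_out = 0;
	scan_range(s, PMD_ORDER, 0);
	return s->nr_out;
}
```

Calling scan_pmd() on "1111101101101011" with max_ptes_none = 64 rejects
the PMD attempt under this scaling, accepts order 8 for the left half
(11111011), and keeps splitting the right half (01101011) into smaller
orders.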

> 
> I gave a presentation on this that might help too:
> https://docs.google.com/presentation/d/1w9NYLuC2kRcMAwhcashU1LWTfmI5TIZRaTWuZq-CHEg/edit?usp=sharing&resourcekey=0-nBAGld8cP1kW26XE6i0Bpg

Unfortunately, this link requires access permission.

