linux-mm.kvack.org archive mirror
From: Dev Jain <dev.jain@arm.com>
To: Vernon Yang <vernon2gm@gmail.com>,
	david@kernel.org, Lance Yang <lance.yang@linux.dev>,
	baohua@kernel.org
Cc: lorenzo.stoakes@oracle.com, ziy@nvidia.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Vernon Yang <yanglincheng@kylinos.cn>,
	akpm@linux-foundation.org
Subject: Re: [PATCH mm-new v5 4/5] mm: khugepaged: skip lazy-free folios
Date: Sat, 24 Jan 2026 12:18:22 +0530
Message-ID: <18e34ad4-82b1-42c3-b01d-ac6e5330c4e0@arm.com>
In-Reply-To: <CACZaFFNY8+UKLzBGnmB3ij9amzBdKJgytcSNtA8fLCake8Ua=A@mail.gmail.com>


On 24/01/26 8:52 am, Vernon Yang wrote:
> On Sat, Jan 24, 2026 at 12:32 AM Lance Yang <lance.yang@linux.dev> wrote:
>> On 2026/1/23 23:08, Vernon Yang wrote:
>>> On Fri, Jan 23, 2026 at 5:09 PM Lance Yang <lance.yang@linux.dev> wrote:
>>>> On 2026/1/23 16:22, Vernon Yang wrote:
>>>>> From: Vernon Yang <yanglincheng@kylinos.cn>
>>>>>
>> [...]
>>
>>>>> @@ -583,6 +584,11 @@ static enum scan_result __collapse_huge_page_isolate(struct vm_area_struct *vma,
>>>>>                folio = page_folio(page);
>>>>>                VM_BUG_ON_FOLIO(!folio_test_anon(folio), folio);
>>>>>
>>>>> +             if (!pte_dirty(pteval) && folio_test_lazyfree(folio)) {
>>>> I'm wondering if we need "cc->is_khugepaged &&" as well here?
>>>>
>>>> We should allow users to enforce collapse via the madvise_collapse()
>>>> path even if pages are marked lazyfree, IMHO.
>>> $ man madvise
>>> MADV_COLLAPSE
>>>          Perform a best-effort synchronous collapse of the native pages
>>>          mapped by the memory range into Transparent Huge Pages (THPs).
>>>
>>> The semantics of MADV_COLLAPSE are best-effort and do not imply to enforce
>>> collapsing, so we don't need "cc->is_khugepaged" here.
>>>
>>> We can imagine that if a user simultaneously uses MADV_FREE and
>>> MADV_COLLAPSE, it indicates a misunderstanding of their semantics.
>>> As the kernel, we need to safeguard the baseline.
>> No. Afraid I don't think so.
>>
>> To be clear, what I meant by "enforce":
>>
>> Yep, MADV_COLLAPSE is best-effort - it can fail. But when users
>> call MADV_COLLAPSE, they're explicitly asking for collapse.
>>
>> Compared to khugepaged just scanning around, that's already "enforce"
>> - users are actively requesting it, not passively waiting for.
>>
>> Note that you're *breaking* userspace. Users would not be able
>> to collapse the range where there are any lazyfree pages anymore,
>> even when they explicitly call MADV_COLLAPSE.
>>
>> For khugepaged, skipping lazyfree makes sense.
> I got your meaning, this is equivalent to two questions:
>
> 1. Does the semantics of best-effort imply any "enforce" meaning?
> 2. When madvise(MADV_FREE| MADV_COLLAPSE), do we want to collapse
>    lazyfree folios?
>
> This is a semantic warning, and I'd like to hear others' opinions.

Lance is right. When the user calls MADV_COLLAPSE, the kernel needs to
try its best to collapse. It may not be in the user's best interest to
do MADV_FREE and then MADV_COLLAPSE, but that is something the user has
to fix; the kernel does not need to think about it.

Regarding "best-effort": it is best-effort in the sense that
madvise(MADV_COLLAPSE) is a syscall needed not for correctness
but for optimization. So it is not the end of the world
if the syscall fails. But since the user has decided to do an
expensive operation (a syscall), the kernel needs to try harder to
make sure those CPU cycles aren't wasted.
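
To make the suggested behaviour concrete, here is a rough userspace model
of the check with Lance's extra "cc->is_khugepaged &&" condition. It is
only a sketch and not kernel code: lazyfree_check(), the MODEL_* values
and the three bool parameters are hypothetical stand-ins for
cc->is_khugepaged, pte_dirty(pteval) and folio_test_lazyfree(folio)
in the quoted hunks.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's scan result; only two values modelled. */
enum model_scan_result {
	MODEL_SCAN_SUCCEED,
	MODEL_SCAN_PAGE_LAZYFREE,
};

/*
 * Model of the proposed check: skip a clean lazy-free folio only when the
 * scan was started by khugepaged. An explicit MADV_COLLAPSE request
 * (is_khugepaged == false) is allowed to proceed.
 */
static enum model_scan_result lazyfree_check(bool is_khugepaged,
					     bool pte_is_dirty,
					     bool folio_is_lazyfree)
{
	if (is_khugepaged && !pte_is_dirty && folio_is_lazyfree)
		return MODEL_SCAN_PAGE_LAZYFREE;

	return MODEL_SCAN_SUCCEED;
}
```

With this shape, khugepaged still skips clean lazy-free folios, while a
range that the user explicitly passes to MADV_COLLAPSE is not rejected
just because it contains such folios.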

>
>>>>> +                     result = SCAN_PAGE_LAZYFREE;
>>>>> +                     goto out;
>>>>> +             }
>>>>> +
>>>>>                /* See hpage_collapse_scan_pmd(). */
>>>>>                if (folio_maybe_mapped_shared(folio)) {
>>>>>                        ++shared;
>>>>> @@ -1330,6 +1336,11 @@ static enum scan_result hpage_collapse_scan_pmd(struct mm_struct *mm,
>>>>>                }
>>>>>                folio = page_folio(page);
>>>>>
>>>>> +             if (!pte_dirty(pteval) && folio_test_lazyfree(folio)) {
>>>> Ditto.
>>>>
>>>>> +                     result = SCAN_PAGE_LAZYFREE;
>>>>> +                     goto out_unmap;
>>>>> +             }
>>>>> +
>>>>>                if (!folio_test_anon(folio)) {
>>>>>                        result = SCAN_PAGE_ANON;
>>>>>                        goto out_unmap;

