linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Hugh Dickins <hughd@google.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	akpm@linux-foundation.org, ziy@nvidia.com,
	Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
	dev.jain@arm.com, baohua@kernel.org, zokeefe@google.com,
	shy828301@gmail.com, usamaarif642@gmail.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 0/2] fix MADV_COLLAPSE issue if THP settings are disabled
Date: Wed, 25 Jun 2025 10:40:23 +0200
Message-ID: <a027fe94-e6c2-46d0-8768-6acd8e801cc3@redhat.com>
In-Reply-To: <f36e64f2-f3d1-407e-862f-ceccc89ac9a8@lucifer.local>

On 25.06.25 10:22, Lorenzo Stoakes wrote:
> On Wed, Jun 25, 2025 at 10:16:46AM +0200, David Hildenbrand wrote:
>> On 25.06.25 09:49, David Hildenbrand wrote:
>>> I think the whole use case of using MADV_COLLAPSE to completely control
>>> THP allocation in a system is otherwise pretty hard to achieve, if there
>>> is no other way to tame THP allocation through page faults+khugepaged.
>>
>> Just want to add: for an app itself, it's doable in "madvise" mode perfectly
>> fine.
>>
>> If your app does a MADV_HUGEPAGE, it can get a THP during page-fault +
>> khugepaged.
>>
>> If your app does not do a MADV_HUGEPAGE, it can get a THP through
>> MADV_COLLAPSE.
>>
>> So the "madvise" mode actually works.
> 
> Right, but for me MADV_COLLAPSE is more about 'I want THPs _now_ (if available),
> not when khugepaged decides to give me some'.
> 
> So we have multiple semantics at work here, unfortunately.
> 
>>
>> The problem appears as soon as we want to control other processes that might
>> be setting MADV_HUGEPAGE, and we actually want to control the behavior using
>> process_madvise(MADV_COLLAPSE), to say "well, the MADV_HUGEPAGE should be
>> ignored".
> 
> This is a _very_ specialist use.
> 
> I'd argue for a 'manual' mode to be added to sysfs to cover this case, with
> 'never' having the 'actually means never' semantics.
> 
> You might argue that could confuse things, but it'd retain the 'de facto'
> understanding nearly everybody has about what these flags mean, but give
> whatever user is out there that needs this the ability to continue doing what
> they want.
> 
> And we get into philosophy about not 'breaking' userland, not sure we have a
> TLB/page fault/folio allocation efficiency contract with userland :)
> 
> No program will break with this patch applied. Just potentially get performance
> degradation in a very, very specialist case.
> 
>>
>> Then, you configure "never" system-wide and use
>> process_madvise(MADV_COLLAPSE) to drive it all manually.
>>
>> Curious to learn if there is such a user out there.
> 
> Oh me too :)

I just looked at the original use cases [1]; such a use case is not
mentioned.

But that series did add process_madvise(MADV_COLLAPSE) in commit
876b4a1896646cc85ec6b1fc1c9270928b7e0831, where we document

"
     This is useful for the development of userspace agents that seek to
     optimize THP utilization system-wide by using userspace signals to
     prioritize what memory is most deserving of being THP-backed.
"

The "prioritize" might indicate that this is used in combination with
"madvise", not with "never".
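
For reference, the in-process form of the call being discussed looks
roughly like the sketch below. It is an illustration only, not code from
the series: collapse_range() is a hypothetical helper name, and the
fallback definition of MADV_COLLAPSE (25, the asm-generic UAPI value,
available since Linux 6.1) is only there for older libc headers. An agent
acting on other processes would use process_madvise() with a pidfd
instead.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* asm-generic UAPI value, Linux 6.1+ */
#endif

/*
 * Hypothetical helper: ask the kernel to collapse [addr, addr + len)
 * into THPs right now, instead of waiting for khugepaged.
 * Returns 0 on success or a negative errno value on failure.
 */
static int collapse_range(void *addr, size_t len)
{
	if (madvise(addr, len, MADV_COLLAPSE) == 0)
		return 0;
	return -errno;
}
```

Called on a page-aligned range in a VMA that never saw MADV_HUGEPAGE,
this is the path that lets an app in "madvise" mode pick up THPs on
demand. A misaligned address fails with -EINVAL.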


So yeah, it all boils down to

(1) If there is no such use case, "never can mean never". Because there
     is nothing to break, really.

(2) If there is such a use case, we might be breaking it.
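
To make the two outcomes concrete, here is a toy model of who can get a
THP under each reading of "never". It is a simplification for discussion,
not kernel code: the names (thp_possible(), never_means_never, the enums)
are made up, and it ignores MADV_NOHUGEPAGE, prctl(), the per-size sysfs
knobs and shmem.

```c
#include <stdbool.h>

enum thp_mode { THP_ALWAYS, THP_MADVISE, THP_NEVER };
enum thp_path { PATH_FAULT_OR_KHUGEPAGED, PATH_MADV_COLLAPSE };

/*
 * Toy model only. never_means_never selects between the two outcomes:
 *   false: MADV_COLLAPSE ignores the system-wide mode
 *   true:  "never" blocks MADV_COLLAPSE as well
 */
static bool thp_possible(enum thp_mode mode, bool vma_madv_hugepage,
			 enum thp_path path, bool never_means_never)
{
	/* Explicit collapse: only "never means never" can refuse it. */
	if (path == PATH_MADV_COLLAPSE)
		return !(mode == THP_NEVER && never_means_never);

	/* Page fault + khugepaged follow the system-wide mode. */
	switch (mode) {
	case THP_ALWAYS:
		return true;
	case THP_MADVISE:
		return vma_madv_hugepage;
	default:
		return false;
	}
}
```

The only input where the two readings differ is mode == THP_NEVER with
path == PATH_MADV_COLLAPSE, which is exactly the "drive it all manually"
setup from (2).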

[1] 
https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@google.com/

-- 
Cheers,

David / dhildenb


