From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Usama Arif <usamaarif642@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
david@redhat.com, linux-mm@kvack.org, hannes@cmpxchg.org,
shakeel.butt@linux.dev, riel@surriel.com, ziy@nvidia.com,
laoar.shao@gmail.com, baolin.wang@linux.alibaba.com,
Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
kernel-team@meta.com
Subject: Re: [PATCH 0/6] prctl: introduce PR_SET/GET_THP_POLICY
Date: Thu, 15 May 2025 14:55:28 +0100 [thread overview]
Message-ID: <6502bbb7-e8b3-4520-9547-823207119061@lucifer.local> (raw)
In-Reply-To: <20250515133519.2779639-1-usamaarif642@gmail.com>
On Thu, May 15, 2025 at 02:33:29PM +0100, Usama Arif wrote:
> This allows to change the THP policy of a process, according to the value
> set in arg2, all of which will be inherited during fork+exec:
This is pretty confusing.
It should be something like 'add a new prctl() option that allows...' etc.
> - PR_THP_POLICY_DEFAULT_HUGE: This will set the MMF2_THP_VMA_DEFAULT_HUGE
> process flag which changes the default of new VMAs to be VM_HUGEPAGE. The
> call also modifies all existing VMAs that are not VM_NOHUGEPAGE
> to be VM_HUGEPAGE.
This is referring to implementation detail that doesn't matter for an overview,
just add a summary here e.g.
PR_THP_POLICY_DEFAULT_HUGE - set VM_HUGEPAGE flag in all VMAs by default,
including after fork/exec, ignoring global policy.
PR_THP_POLICY_DEFAULT_NOHUGE - clear VM_HUGEPAGE flag in all VMAs by default,
including after fork/exec, ignoring global policy.
PR_THP_POLICY_DEFAULT_SYSTEM - Eliminate any policy set above.
> This allows systems where the global policy is set to "madvise"
> to effectively have THPs always for the process. In an environment
> where different types of workloads are stacked on the same machine
> whose global policy is set to "madvise", this will allow workloads
> that benefit from always having hugepages to do so, without regressing
> those that don't.
So does this just ignore and override the global policy? I'm not sure I'm
comfortable with that.
What about if the the policy is 'never'? Does this override that? That seems
completely wrong.
> - PR_THP_POLICY_DEFAULT_NOHUGE: This will set the MMF2_THP_VMA_DEFAULT_NOHUGE
> process flag which changes the default of new VMAs to be VM_NOHUGEPAGE.
> The call also modifies all existing VMAs that are not VM_HUGEPAGE
> to be VM_NOHUGEPAGE.
> This allows systems where the global policy is set to "always"
> to effectively have THPs on madvise only for the process. In an
> environment where different types of workloads are stacked on the
> same machine whose global policy is set to "always", this will allow
> workloads that benefit from having hugepages on an madvise basis only
> to do so, without regressing those that benefit from having hugepages
> always.
Wait, so 'no huge' means 'madvise'? What? This is confusing.
> - PR_THP_POLICY_DEFAULT_SYSTEM: This will clear the MMF2_THP_VMA_DEFAULT_HUGE
> and MMF2_THP_VMA_DEFAULT_NOHUGE process flags.
>
> These patches are required in rolling out hugepages in hyperscaler
> configurations for workloads that benefit from them, where workloads are
> stacked anda single THP global policy is likely to be used across the entire
> fleet, and prctl will help override it.
I don't understand this justification whatsoever. What does 'stacked' mean? And
you're not justifying why you'd override the policy?
This series has no actual justificaiton here at all? You really need to provide one.
>
> v1->v2:
Where was the v1? Is it [0]?
This seems like a massive change compared to that series?
You've renamed it and not referenced the old series, please make sure you link
it or somehow let somebody see what this is against, because it makes review
difficult.
[0]: https://lore.kernel.org/linux-mm/20250507141132.2773275-1-usamaarif642@gmail.com/
> - change from modifying the THP decision making for the process, to modifying
> VMA flags only. This prevents further complicating the logic used to
> determine THP order (Thanks David!)
> - change from using a prctl per policy change to just using PR_SET_THP_POLICY
> and arg2 to set the policy. (Zi Yan)
> - Introduce PR_THP_POLICY_DEFAULT_NOHUGE and PR_THP_POLICY_DEFAULT_SYSTEM
> - Add selftests and documentation.
>
> Usama Arif (6):
> prctl: introduce PR_THP_POLICY_DEFAULT_HUGE for the process
> prctl: introduce PR_THP_POLICY_DEFAULT_NOHUGE for the process
> prctl: introduce PR_THP_POLICY_SYSTEM for the process
> selftests: prctl: introduce tests for PR_THP_POLICY_DEFAULT_NOHUGE
> selftests: prctl: introduce tests for PR_THP_POLICY_DEFAULT_HUGE
> docs: transhuge: document process level THP controls
>
> Documentation/admin-guide/mm/transhuge.rst | 40 +++
> include/linux/huge_mm.h | 4 +
> include/linux/mm_types.h | 14 +
> include/uapi/linux/prctl.h | 6 +
> kernel/fork.c | 1 +
> kernel/sys.c | 35 +++
> mm/huge_memory.c | 56 ++++
> mm/vma.c | 2 +
> tools/include/uapi/linux/prctl.h | 6 +
> .../trace/beauty/include/uapi/linux/prctl.h | 6 +
> tools/testing/selftests/prctl/Makefile | 2 +-
> tools/testing/selftests/prctl/thp_policy.c | 286 ++++++++++++++++++
> 12 files changed, 457 insertions(+), 1 deletion(-)
> create mode 100644 tools/testing/selftests/prctl/thp_policy.c
>
> --
> 2.47.1
>
next prev parent reply other threads:[~2025-05-15 13:55 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-15 13:33 Usama Arif
2025-05-15 13:33 ` [PATCH 1/6] prctl: introduce PR_THP_POLICY_DEFAULT_HUGE for the process Usama Arif
2025-05-15 14:40 ` Lorenzo Stoakes
2025-05-15 14:44 ` David Hildenbrand
2025-05-15 14:56 ` Usama Arif
2025-05-15 14:58 ` David Hildenbrand
2025-05-15 15:18 ` Lorenzo Stoakes
2025-05-15 15:45 ` Liam R. Howlett
2025-05-15 15:57 ` David Hildenbrand
2025-05-15 16:38 ` Lorenzo Stoakes
2025-05-15 17:29 ` David Hildenbrand
2025-05-15 18:09 ` Liam R. Howlett
2025-05-15 18:21 ` Lorenzo Stoakes
2025-05-15 18:42 ` Zi Yan
2025-05-15 21:04 ` Lorenzo Stoakes
2025-05-15 18:46 ` Usama Arif
2025-05-15 19:20 ` David Hildenbrand
2025-05-15 15:28 ` Usama Arif
2025-05-15 16:06 ` Lorenzo Stoakes
2025-05-15 16:11 ` David Hildenbrand
2025-05-15 18:08 ` Lorenzo Stoakes
2025-05-15 19:12 ` David Hildenbrand
2025-05-15 20:35 ` Lorenzo Stoakes
2025-05-16 7:45 ` David Hildenbrand
2025-05-16 10:57 ` Lorenzo Stoakes
2025-05-16 11:24 ` David Hildenbrand
2025-05-16 12:57 ` Lorenzo Stoakes
2025-05-16 17:19 ` Usama Arif
2025-05-16 17:51 ` Lorenzo Stoakes
2025-05-16 19:34 ` Usama Arif
2025-05-17 16:20 ` Is number of process_madvise()-able ranges limited to 8? (was Re: [PATCH 1/6] prctl: introduce PR_THP_POLICY_DEFAULT_HUGE for the process) SeongJae Park
2025-05-17 18:50 ` Lorenzo Stoakes
2025-05-17 20:25 ` SeongJae Park
2025-05-17 19:01 ` [PATCH 1/6] prctl: introduce PR_THP_POLICY_DEFAULT_HUGE for the process Lorenzo Stoakes
2025-05-15 16:47 ` Usama Arif
2025-05-15 18:36 ` Lorenzo Stoakes
2025-05-15 19:17 ` David Hildenbrand
2025-05-15 20:42 ` Lorenzo Stoakes
2025-05-16 6:12 ` kernel test robot
2025-05-15 13:33 ` [PATCH 2/6] prctl: introduce PR_THP_POLICY_DEFAULT_NOHUGE " Usama Arif
2025-05-16 8:19 ` kernel test robot
2025-05-15 13:33 ` [PATCH 3/6] prctl: introduce PR_THP_POLICY_SYSTEM " Usama Arif
2025-05-15 13:33 ` [PATCH 4/6] selftests: prctl: introduce tests for PR_THP_POLICY_DEFAULT_NOHUGE Usama Arif
2025-05-15 13:33 ` [PATCH 5/6] selftests: prctl: introduce tests for PR_THP_POLICY_DEFAULT_HUGE Usama Arif
2025-05-15 13:33 ` [PATCH 6/6] docs: transhuge: document process level THP controls Usama Arif
2025-05-15 13:55 ` Lorenzo Stoakes [this message]
2025-05-15 14:50 ` [PATCH 0/6] prctl: introduce PR_SET/GET_THP_POLICY Usama Arif
2025-05-15 15:15 ` Lorenzo Stoakes
2025-05-15 15:54 ` Usama Arif
2025-05-15 16:04 ` David Hildenbrand
2025-05-15 16:24 ` Lorenzo Stoakes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6502bbb7-e8b3-4520-9547-823207119061@lucifer.local \
--to=lorenzo.stoakes@oracle.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=david@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=kernel-team@meta.com \
--cc=laoar.shao@gmail.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=npache@redhat.com \
--cc=riel@surriel.com \
--cc=ryan.roberts@arm.com \
--cc=shakeel.butt@linux.dev \
--cc=usamaarif642@gmail.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox