linux-mm.kvack.org archive mirror
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Zi Yan <ziy@nvidia.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>,
	David Hildenbrand <david@redhat.com>,
	Usama Arif <usamaarif642@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, hannes@cmpxchg.org, shakeel.butt@linux.dev,
	riel@surriel.com, laoar.shao@gmail.com,
	baolin.wang@linux.alibaba.com, npache@redhat.com,
	ryan.roberts@arm.com, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, kernel-team@meta.com
Subject: Re: [PATCH 1/6] prctl: introduce PR_THP_POLICY_DEFAULT_HUGE for the process
Date: Thu, 15 May 2025 22:04:05 +0100
Message-ID: <b29f98c7-ad5e-4993-b98b-2b3e62c952e1@lucifer.local>
In-Reply-To: <EF16AFD9-DDBB-4FB0-BF70-B7282159EDB1@nvidia.com>

On Thu, May 15, 2025 at 02:42:02PM -0400, Zi Yan wrote:
> On 15 May 2025, at 14:21, Lorenzo Stoakes wrote:
>
> > On Thu, May 15, 2025 at 02:09:56PM -0400, Liam R. Howlett wrote:
> >> * David Hildenbrand <david@redhat.com> [250515 13:30]:
> >>>>>
> >>>>
> >>>> Did we document all this? :)
> >>>>
> >>>> It'd be good to be super explicit about these sorts of 'dependency chains'.
> >>>>
> >>>
> >>> Documentation/admin-guide/mm/transhuge.rst has under "Global THP controls"
> >>> quite some stuff about all that, yes.
> >>>
> >>> The whole document needs an overhaul, to clarify on the whole terminology,
> >>> make it consistent, and better explain how the pagecache behaves etc. On my
> >>> todo list, but I'm afraid it will be a bit of work to get it right / please
> >>> most people.
> >>
> >> Yes, the whole thing is making me grumpy (more than my default state).
> >> The more I think about it, the more I don't like the prctl approach
> >> either...
> >
> > prctl() feels like it's literally never, ever the right choice.
> >
> > It feels like we shove all the dark stuff we want to put under the rug
> > there.
> >
> > Reading the man page is genuinely frightening. There's stuff about VMAs _I
> > wasn't aware of_.
> >
> > It's also never really the _right time_ to do it - it's not process
> > inception is it? It's when the process has started, now you suddenly fiddle
> > with it.
> >
> > Then relying on mm flags being propagated over fork/exec is just, it's a
> > hack really.
> >
> >>
> >> I more than dislike flags2... I hate it.
> >
> > Yeah, to be clear - I will NACK any series that tries to add flags2 unless
> > a VERY VERY good justification is given. It's horrid. And frankly this
> > feature doesn't warrant something as horrible.
> >
> > But making mm->flags 64-bit on 32-bit kernels (which are in effect
> > deprecated in my view) would fix this.
> >
> >>
> >> but no prctl, no cgroups, no bpf.. what is left?  A new policy groups
> >> thing?  No, not that either, please.
>
> BPF might be OK, as long as we provide the right functions for BPF to manipulate
> system-, process-, MM-, and VMA-level knobs. My only objection to Yafang's patch[1] is
> that the patch adds a VMA parameter to the global hugepage checking functions.

Yeah, that was a good point to raise :)

>
> My take on the BPF approach is that it does not add new APIs, so we can change it
> at any time, assuming people are willing to accept that the functions instrumented
> by BPF can go away at any time and the corresponding BPF programs will not work
> forever. It allows us to explore various huge page policies without the burden
> of maintaining APIs. Eventually, huge page policies become transparent after
> we learn enough.

Yeah, to be honest I am quite worried about the consequences of letting BPF
infiltrate this far. We do want to get to a future where THP is automated and
not something people have to think about, and this feels like we might end up
putting ourselves in a position where we make that impossible?

It's interesting but I think needs really careful analysis rather than 'bpf all
the things'...

But we do have these awkward 'not really sure how to do this' scenarios that
fall between the gaps like the one here.

And I guess prctl() ends up being the catch-all because it saves us having to
create a new system call, etc.

>
> [1] https://lore.kernel.org/linux-mm/20250429024139.34365-1-laoar.shao@gmail.com/
>
>
>
> --
> Best Regards,
> Yan, Zi


