From: "Liam R. Howlett" <Liam.Howlett@oracle.com>
To: Zi Yan <ziy@nvidia.com>
Cc: Yafang Shao <laoar.shao@gmail.com>,
	akpm@linux-foundation.org, ast@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org, David Hildenbrand <david@redhat.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Nico Pache <npache@redhat.com>,
	Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
	bpf@vger.kernel.org, linux-mm@kvack.org,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.com>
Subject: Re: [RFC PATCH 0/4] mm, bpf: BPF based THP adjustment
Date: Wed, 30 Apr 2025 11:21:47 -0400
Message-ID: <mnv3jjbdqx3eqrcxjrn5eeql3kpcfa6jzyjihh2cdyvrd7ldga@3cmkqwudlomh>
In-Reply-To: <8F000270-A724-4536-B69E-C22701522B89@nvidia.com>

* Zi Yan <ziy@nvidia.com> [250430 11:01]:

...

> >>>>> Since multiple services run on a single host in a containerized environment,
> >>>>> enabling THP globally is not ideal. Previously, we set THP to madvise,
> >>>>> allowing selected services to opt in via MADV_HUGEPAGE. However, this
> >>>>> approach had a limitation:
> >>>>>
> >>>>> - Some services inadvertently used madvise(MADV_HUGEPAGE) through
> >>>>>   third-party libraries, bypassing our restrictions.
> >>>>
> >>>> Basically, you want more precise control of THP enablement and the
> >>>> ability of overriding madvise() from userspace.
> >>>>
> >>>> In terms of overriding madvise(), do you have any concrete example of
> >>>> these third-party libraries? madvise() users are supposed to know what
> >>>> they are doing, so I wonder why they are causing trouble in your
> >>>> environment.
> >>>
> >>> To my knowledge, jemalloc [0] supports THP.
> >>> Applications using jemalloc typically rely on its default
> >>> configurations rather than explicitly enabling or disabling THP. If
> >>> the system is configured with THP=madvise, these applications may
> >>> automatically leverage THP where appropriate.
> >>>
> >>> [0]. https://github.com/jemalloc/jemalloc
> >>
> >> It sounds like a userspace issue. For jemalloc, if applications require
> >> it, can't you replace jemalloc with one compiled with --disable-thp
> >> to work around the issue?
> >
> > That’s not the issue this patchset is trying to address or work
> > around. I believe we should focus on the actual problem it's meant to
> > solve.
> >
> > By the way, you might not raise this question if you were managing a
> > large fleet of servers. We're a platform provider, but we don’t
> > maintain all the packages ourselves. Users make their own choices
> > based on their specific requirements. It's not a feasible solution for
> > us to develop and maintain every package.
> 
> Basically, the user wants to use THP, but as a service provider, you think
> differently, so you want to override the userspace choice. Am I getting it right?

Who is the platform provider in question?  It makes me uneasy to see
such claims from an @gmail account, given current world events.

...

> >>>
> >>> I chose not to include this in the self-tests to avoid the complexity
> >>> of setting up cgroups for testing purposes. However, in patch #4 of
> >>> this series, I've included a simpler example demonstrating task-level
> >>> control.
> >>
> >> For task-level control, why not use prctl(PR_SET_THP_DISABLE)?
> >
> > You’ll need to modify the user-space code—and again, this likely
> > wouldn’t be a concern if you were managing a large fleet of servers.
> >
> >>
> >>> For service-level control, we could potentially utilize BPF task local
> >>> storage as an alternative approach.
> >>
> >> +cgroup people
> >>
> >> For service-level control, there was a proposal of adding cgroup based
> >> THP control[1]. You might need a strong use case to convince people.
> >>
> >> [1] https://lore.kernel.org/linux-mm/20241030083311.965933-1-gutierrez.asier@huawei-partners.com/
> >
> > Thanks for the reference. I've reviewed the related discussion, and if
> > I understand correctly, the proposal was rejected by the maintainers.

More to the point is why it was rejected.  Why is your motive different?

> 
> I wonder why your approach is better than the cgroup based THP control proposal.

I think Matthew's response in that thread is pretty clear and still
relevant.  If it isn't, can you state why?

The main difference is that you are saying it's in a container that you
don't control.  Your plan is to violate the control the internal
applications have over THP because you know better.  I'm not sure how
people might feel about you messing with workloads, but beyond that, you
are fundamentally fixing things at a sysadmin level because programmers
have made errors.  You state as much in the cover letter, yes?

Thanks,
Liam



