linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Zi Yan <ziy@nvidia.com>,
	akpm@linux-foundation.org, ast@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org, David Hildenbrand <david@redhat.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Nico Pache <npache@redhat.com>,
	Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
	bpf@vger.kernel.org, linux-mm@kvack.org,
	Michal Hocko <mhocko@suse.com>
Subject: Re: [RFC PATCH 0/4] mm, bpf: BPF based THP adjustment
Date: Wed, 30 Apr 2025 13:45:21 -0400	[thread overview]
Message-ID: <20250430174521.GC2020@cmpxchg.org> (raw)
In-Reply-To: <CALOAHbCL-NOEeA1+t=D2F_q7UUi7GvkLDry5=SiehtWs1TKX1Q@mail.gmail.com>

On Thu, May 01, 2025 at 12:06:31AM +0800, Yafang Shao wrote:
> > > > If it isn't, can you state why?
> > > >
> > > > The main difference is that you are saying it's in a container that you
> > > > don't control.  Your plan is to violate the control the internal
> > > > applications have over THP because you know better.  I'm not sure how
> > > > people might feel about you messing with workloads,
> > >
> > > It’s not a mess. They have the option to deploy their services on
> > > dedicated servers, but they would need to pay more for that choice.
> > > This is a two-way decision.
> >
> > This implies you want a container-level way of controlling the setting
> > and not a system service-level?
> 
> Right. We want to control the THP per container.

This does strike me as a reasonable usecase.

I think there is consensus that in the long-term we want this stuff to
just work and truly be transparent to userspace.

In the short-to-medium term, however, there are still quite a few
caveats. thp=always can significantly increase the memory footprint of
sparse virtual regions. Huge allocations are not as cheap and reliable
as we would like them to be, which for real production systems means
having to make workload-specifcic choices and tradeoffs.

There is ongoing work in these areas, but we do have a bit of a
chicken-and-egg problem: on the one hand, huge page adoption is slow
due to limitations in how they can be deployed. For example, we can't
do thp=always on a DC node that runs arbitary combinations of jobs
from a wide array of services. Some might benefit, some might hurt.

Yet, it's much easier to improve the kernel based on exactly such
production experience and data from real-world usecases. We can't
improve the THP shrinker if we can't run THP.

So I don't see it as overriding whoever wrote the software running
inside the container. They don't know, and they shouldn't have to care
about page sizes. It's about letting admins and kernel teams get
started on using and experimenting with this stuff, given the very
real constraints right now, so we can get the feedback necessary to
improve the situation.


  reply	other threads:[~2025-04-30 17:45 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-29  2:41 Yafang Shao
2025-04-29  2:41 ` [RFC PATCH 1/4] mm: move hugepage_global_{enabled,always}() to internal.h Yafang Shao
2025-04-29 15:13   ` Zi Yan
2025-04-30  2:40     ` Yafang Shao
2025-04-30 12:11       ` Zi Yan
2025-04-30 14:43         ` Yafang Shao
2025-04-29  2:41 ` [RFC PATCH 2/4] mm: pass VMA parameter to hugepage_global_{enabled,always}() Yafang Shao
2025-04-29 15:31   ` Zi Yan
2025-04-30  2:46     ` Yafang Shao
2025-04-29  2:41 ` [RFC PATCH 3/4] mm: add BPF hook for THP adjustment Yafang Shao
2025-04-29 15:19   ` Alexei Starovoitov
2025-04-30  2:48     ` Yafang Shao
2025-04-29  2:41 ` [RFC PATCH 4/4] selftests/bpf: Add selftest " Yafang Shao
2025-04-29  3:11 ` [RFC PATCH 0/4] mm, bpf: BPF based " Matthew Wilcox
2025-04-29  4:53   ` Yafang Shao
2025-04-29 15:09 ` Zi Yan
2025-04-30  2:33   ` Yafang Shao
2025-04-30 13:19     ` Zi Yan
2025-04-30 14:38       ` Yafang Shao
2025-04-30 15:00         ` Zi Yan
2025-04-30 15:16           ` Yafang Shao
2025-04-30 15:21           ` Liam R. Howlett
2025-04-30 15:37             ` Yafang Shao
2025-04-30 15:53               ` Liam R. Howlett
2025-04-30 16:06                 ` Yafang Shao
2025-04-30 17:45                   ` Johannes Weiner [this message]
2025-04-30 17:53                     ` Zi Yan
2025-05-01 19:36                       ` Gutierrez Asier
2025-05-02  5:48                         ` Yafang Shao
2025-05-02 12:00                           ` Zi Yan
2025-05-02 12:18                             ` Yafang Shao
2025-05-02 13:04                               ` David Hildenbrand
2025-05-02 13:06                                 ` Matthew Wilcox
2025-05-02 13:34                                 ` Zi Yan
2025-05-05  2:35                                 ` Yafang Shao
2025-05-05  9:11                           ` Gutierrez Asier
2025-05-05  9:38                             ` Yafang Shao
2025-04-30 17:59         ` Johannes Weiner
2025-05-01  0:40           ` Yafang Shao
2025-04-30 14:40     ` Liam R. Howlett
2025-04-30 14:49       ` Yafang Shao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250430174521.GC2020@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=david@redhat.com \
    --cc=dev.jain@arm.com \
    --cc=laoar.shao@gmail.com \
    --cc=linux-mm@kvack.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=mhocko@suse.com \
    --cc=npache@redhat.com \
    --cc=ryan.roberts@arm.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox