linux-mm.kvack.org archive mirror
From: Yafang Shao <laoar.shao@gmail.com>
To: akpm@linux-foundation.org, ast@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org
Cc: bpf@vger.kernel.org, linux-mm@kvack.org,
	Yafang Shao <laoar.shao@gmail.com>
Subject: [RFC PATCH 0/4] mm, bpf: BPF based THP adjustment
Date: Tue, 29 Apr 2025 10:41:35 +0800
Message-ID: <20250429024139.34365-1-laoar.shao@gmail.com>

In our container environment, we aim to enable THP selectively—allowing
specific services to use it while restricting others. This approach is
driven by the following considerations:

1. Memory Fragmentation
   THP can lead to increased memory fragmentation, so we want to limit its
   use across services.
2. Performance Impact
   Some services see no benefit from THP, making its usage unnecessary.
3. Performance Gains
   Certain workloads, such as machine learning services, experience
   significant performance improvements with THP, so we enable it for them
   specifically. 

Since multiple services run on a single host in a containerized environment,
enabling THP globally is not ideal. Previously, we set THP to madvise,
allowing selected services to opt in via MADV_HUGEPAGE. However, this
approach had a limitation:

- Some services inadvertently used madvise(MADV_HUGEPAGE) through
  third-party libraries, bypassing our restrictions.
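
For reference, the opt-in path described above looks roughly like this from
userspace. This is a minimal sketch, not code from our services: the helper
name map_with_thp_hint() and the 4 MiB REGION_SIZE are assumptions made for
illustration; only mmap(2), madvise(2) and MADV_HUGEPAGE are real interfaces.
A third-party allocator making the same call is enough to opt a process in
when the global mode is madvise.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Illustrative size: large enough to hold at least one 2 MiB huge page. */
#define REGION_SIZE (4UL << 20)

/*
 * Map an anonymous private region and hint that it should be backed by
 * THP. Returns the mapping, or NULL if mmap() failed. On return,
 * *advise_err is 0 if madvise(MADV_HUGEPAGE) succeeded, otherwise the
 * errno it reported (for example EINVAL on a kernel built without THP).
 */
void *map_with_thp_hint(size_t len, int *advise_err)
{
	void *p;

	assert(advise_err != NULL);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	/* The hint is advisory; the mapping stays usable if it fails. */
	*advise_err = (madvise(p, len, MADV_HUGEPAGE) == 0) ? 0 : errno;
	return p;
}
```

A caller would invoke map_with_thp_hint(REGION_SIZE, &err), touch the
memory, and munmap() it when done. Whether huge pages are actually used
still depends on the mode in /sys/kernel/mm/transparent_hugepage/enabled;
under never the hint has no effect.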

To address this issue, we initially hooked the __x64_sys_madvise() syscall,
which is error-injectable, to blacklist unwanted services. While this
worked, it was error-prone and ineffective for services needing always mode,
as modifying their code to use madvise was impractical.

To achieve finer-grained control, we introduced an fmod_ret-based solution.
Now, we dynamically adjust THP settings per service by hooking
hugepage_global_{enabled,always}() via BPF. This allows us to enable or
disable THP on a per-service basis without global impact.

The hugepage_global_{enabled,always}() functions currently share the same
BPF hook, which limits THP configuration to either always or never. While
this suffices for our specific use cases, full support for all three modes
(always, madvise, and never) would require splitting them into separate
hooks.
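
The limitation can be illustrated with a small userspace model. This is a
sketch under stated assumptions, not the code in this series: it assumes the
two predicates keep their current meaning (hugepage_global_enabled() is true
in always or madvise mode, hugepage_global_always() only in always mode), and
every name below (THP_FLAG_*, mode_from(), mode_with_shared_hook(),
mode_with_split_hooks()) is hypothetical.

```c
#include <stdbool.h>

/* Simplified stand-ins for the global THP flag bits (hypothetical names). */
enum {
	THP_FLAG_ALWAYS  = 1u << 0,	/* global mode "always"  */
	THP_FLAG_MADVISE = 1u << 1,	/* global mode "madvise" */
};

enum thp_mode { THP_NEVER, THP_MADVISE, THP_ALWAYS };

/* Model of hugepage_global_enabled(): true for always or madvise. */
bool global_enabled(unsigned int flags)
{
	return (flags & (THP_FLAG_ALWAYS | THP_FLAG_MADVISE)) != 0;
}

/* Model of hugepage_global_always(): true only for always. */
bool global_always(unsigned int flags)
{
	return (flags & THP_FLAG_ALWAYS) != 0;
}

/* Effective mode implied by the two predicate results. */
enum thp_mode mode_from(bool enabled, bool always)
{
	if (always)
		return THP_ALWAYS;
	return enabled ? THP_MADVISE : THP_NEVER;
}

/* Shared hook: one verdict overrides both predicates at once. */
enum thp_mode mode_with_shared_hook(bool verdict)
{
	return mode_from(verdict, verdict);
}

/* Split hooks: each predicate is overridden independently. */
enum thp_mode mode_with_split_hooks(bool enabled_verdict, bool always_verdict)
{
	return mode_from(enabled_verdict, always_verdict);
}
```

In this model a shared hook can only produce (enabled, always) = (1, 1) or
(0, 0), that is always or never. The madvise combination (1, 0) is reachable
only when the two predicates can be overridden separately.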

This is the initial RFC patch—feedback is welcome!

Yafang Shao (4):
  mm: move hugepage_global_{enabled,always}() to internal.h
  mm: pass VMA parameter to hugepage_global_{enabled,always}()
  mm: add BPF hook for THP adjustment
  selftests/bpf: Add selftest for THP adjustment

 include/linux/huge_mm.h                       |  54 +-----
 mm/Makefile                                   |   3 +
 mm/bpf.c                                      |  36 ++++
 mm/bpf.h                                      |  21 +++
 mm/huge_memory.c                              |  50 ++++-
 mm/internal.h                                 |  21 +++
 mm/khugepaged.c                               |  18 +-
 tools/testing/selftests/bpf/config            |   1 +
 .../selftests/bpf/prog_tests/thp_adjust.c     | 176 ++++++++++++++++++
 .../selftests/bpf/progs/test_thp_adjust.c     |  32 ++++
 10 files changed, 344 insertions(+), 68 deletions(-)
 create mode 100644 mm/bpf.c
 create mode 100644 mm/bpf.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/thp_adjust.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_thp_adjust.c

-- 
2.43.5




Thread overview: 41+ messages
2025-04-29  2:41 Yafang Shao [this message]
2025-04-29  2:41 ` [RFC PATCH 1/4] mm: move hugepage_global_{enabled,always}() to internal.h Yafang Shao
2025-04-29 15:13   ` Zi Yan
2025-04-30  2:40     ` Yafang Shao
2025-04-30 12:11       ` Zi Yan
2025-04-30 14:43         ` Yafang Shao
2025-04-29  2:41 ` [RFC PATCH 2/4] mm: pass VMA parameter to hugepage_global_{enabled,always}() Yafang Shao
2025-04-29 15:31   ` Zi Yan
2025-04-30  2:46     ` Yafang Shao
2025-04-29  2:41 ` [RFC PATCH 3/4] mm: add BPF hook for THP adjustment Yafang Shao
2025-04-29 15:19   ` Alexei Starovoitov
2025-04-30  2:48     ` Yafang Shao
2025-04-29  2:41 ` [RFC PATCH 4/4] selftests/bpf: Add selftest " Yafang Shao
2025-04-29  3:11 ` [RFC PATCH 0/4] mm, bpf: BPF based " Matthew Wilcox
2025-04-29  4:53   ` Yafang Shao
2025-04-29 15:09 ` Zi Yan
2025-04-30  2:33   ` Yafang Shao
2025-04-30 13:19     ` Zi Yan
2025-04-30 14:38       ` Yafang Shao
2025-04-30 15:00         ` Zi Yan
2025-04-30 15:16           ` Yafang Shao
2025-04-30 15:21           ` Liam R. Howlett
2025-04-30 15:37             ` Yafang Shao
2025-04-30 15:53               ` Liam R. Howlett
2025-04-30 16:06                 ` Yafang Shao
2025-04-30 17:45                   ` Johannes Weiner
2025-04-30 17:53                     ` Zi Yan
2025-05-01 19:36                       ` Gutierrez Asier
2025-05-02  5:48                         ` Yafang Shao
2025-05-02 12:00                           ` Zi Yan
2025-05-02 12:18                             ` Yafang Shao
2025-05-02 13:04                               ` David Hildenbrand
2025-05-02 13:06                                 ` Matthew Wilcox
2025-05-02 13:34                                 ` Zi Yan
2025-05-05  2:35                                 ` Yafang Shao
2025-05-05  9:11                           ` Gutierrez Asier
2025-05-05  9:38                             ` Yafang Shao
2025-04-30 17:59         ` Johannes Weiner
2025-05-01  0:40           ` Yafang Shao
2025-04-30 14:40     ` Liam R. Howlett
2025-04-30 14:49       ` Yafang Shao
