* [RFC v2 0/5] mm: introduce THP deferred setting
@ 2025-02-11  0:40 Nico Pache
  2025-02-11  0:40 ` [RFC v2 1/5] mm: defer THP insertion to khugepaged Nico Pache
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Nico Pache @ 2025-02-11  0:40 UTC
  To: linux-kernel, linux-doc, linux-kselftest, linux-mm
  Cc: ryan.roberts, anshuman.khandual, catalin.marinas, cl, vbabka,
	mhocko, apopple, dave.hansen, will, baohua, jack, srivatsa,
	haowenchao22, hughd, aneesh.kumar, yang, peterx, ioworker0,
	wangkefeng.wang, ziy, jglisse, surenb, vishal.moola, zokeefe,
	zhengqi.arch, jhubbard, 21cnbao, willy, kirill.shutemov, david,
	aarcange, raquini, dev.jain, sunnanyong, usamaarif642, audra,
	akpm, rostedt, mathieu.desnoyers, tiwai, baolin.wang, corbet,
	shuah

This series is a follow-up to [1], which adds mTHP support to khugepaged.
mTHP khugepaged support was necessary for the global="defer" and
mTHP="inherit" case (and others) to make sense.

We've seen cases where customers switching from RHEL7 to RHEL8 see a
significant increase in the memory footprint for the same workloads.

Through our investigations we found that a large contributing factor to
the increase in RSS was an increase in THP usage.

For workloads like MySQL, or when using allocators like jemalloc, it is
often recommended to set /transparent_hugepages/enabled=never. This is
in part due to performance degradations and increased memory waste.

This series introduces enabled=defer. This setting acts as a middle
ground between always and madvise. If the mapping is MADV_HUGEPAGE, the
page fault handler will act normally, making a hugepage if possible. If
the mapping is not MADV_HUGEPAGE, then the page fault handler will
default to the base page size allocation. The caveat is that khugepaged
can still operate on pages that are not MADV_HUGEPAGE.
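
As an illustration only, the policy described above can be modelled in a
few lines of user-space C. This is a sketch of the decision table, not
the code from the series: the enum and both function names are
hypothetical, and details such as MADV_NOHUGEPAGE, per-size mTHP
settings, and allocation failure are ignored.

```c
#include <stdbool.h>

/* Hypothetical names; they model the global "enabled" setting only. */
enum thp_mode { THP_NEVER, THP_MADVISE, THP_DEFER, THP_ALWAYS };

/* Would the page fault handler attempt a huge page directly? */
bool fault_tries_hugepage(enum thp_mode mode, bool vma_madv_hugepage)
{
	switch (mode) {
	case THP_ALWAYS:
		return true;
	case THP_MADVISE:
	case THP_DEFER:
		/* Only MADV_HUGEPAGE mappings get a THP at fault time. */
		return vma_madv_hugepage;
	default:
		return false;
	}
}

/* May khugepaged later collapse this mapping into a huge page? */
bool khugepaged_may_collapse(enum thp_mode mode, bool vma_madv_hugepage)
{
	switch (mode) {
	case THP_ALWAYS:
	case THP_DEFER:
		/* defer: khugepaged also handles non-MADV_HUGEPAGE memory. */
		return true;
	case THP_MADVISE:
		return vma_madv_hugepage;
	default:
		return false;
	}
}
```

In this model, defer matches madvise at fault time and matches always
for khugepaged, which is the "middle ground" the setting aims for.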

This allows for two things. First, applications specifically designed to
use hugepages will get them. Second, applications that don't use
hugepages can still benefit from them without THPs being aggressively
inserted at every possible chance. This curbs the memory waste and defers
the use of hugepages to khugepaged, which can then scan memory for
ranges eligible for collapse.
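
For reference, an application "specifically designed to use hugepages"
typically opts in with madvise(2). The following is a minimal sketch of
that opt-in; the helper name alloc_thp_hinted() is hypothetical, and the
madvise() call may fail (for example with EINVAL) on kernels built
without THP support, so its result is reported rather than treated as
fatal.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS and MADV_HUGEPAGE on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map an anonymous region and mark it MADV_HUGEPAGE.
 * Returns the mapping, or NULL if mmap() failed. The madvise() return
 * value (0 or -1) is stored in *madvise_rc when that pointer is non-NULL.
 */
void *alloc_thp_hinted(size_t len, int *madvise_rc)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	int rc = madvise(p, len, MADV_HUGEPAGE);
	if (madvise_rc)
		*madvise_rc = rc;
	return p;
}
```

With enabled=defer, a region hinted this way is handled by the fault
path as described above, while unhinted regions are left to khugepaged.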

Admins may want to lower max_ptes_none; otherwise, khugepaged may
aggressively collapse single allocations into hugepages.
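
To make the effect of max_ptes_none concrete, here is a rough model of
the PMD-level eligibility check. It is a simplification, not the
khugepaged code: the function name and PTES_PER_PMD are hypothetical,
512 PTEs per PMD assumes 4 KiB base pages with 2 MiB huge pages, and
only empty PTEs are counted (swap, shared, and young-page checks are
ignored).

```c
#include <stdbool.h>

/* Assumption: 4 KiB base pages and a 2 MiB PMD-sized huge page. */
#define PTES_PER_PMD 512u

/*
 * A range is a collapse candidate when the number of empty (none) PTEs
 * does not exceed max_ptes_none. A fully empty range never qualifies.
 */
bool range_collapsible(unsigned int populated_ptes,
		       unsigned int max_ptes_none)
{
	if (populated_ptes == 0 || populated_ptes > PTES_PER_PMD)
		return false;

	unsigned int none_ptes = PTES_PER_PMD - populated_ptes;
	return none_ptes <= max_ptes_none;
}
```

With max_ptes_none=511 a single populated page is enough to qualify in
this model, whereas max_ptes_none=64 requires at least 448 of the 512
PTEs to be populated.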

TESTING:
- Built for x86_64, aarch64, ppc64le, and s390x
- selftests mm
- In [1] I provided a script [2] that has multiple access patterns
- lots of general use. These changes have been running in my VM for some time
- redis testing. This test was my original case for the defer mode. What I was
   able to prove was that THP=always leads to increased max_latency cases, which
   is why it is recommended to disable THPs for redis servers. However, with
   'defer' we don't have the max_latency spikes and can still get the system to
   utilize THPs. I further tested this with the mTHP defer setting and found
   that redis (and probably other jemalloc users) can utilize THPs via defer
   (+mTHP defer) without a large latency penalty and with some potential gains.
   I uploaded some mmtest results here [3] which compares:
       stock+thp=never
       stock+(m)thp=always
       khugepaged-mthp + defer (max_ptes_none=64)

  The results show that (m)THPs can cause some throughput regression in some
  cases, but also have gains in other cases. The mTHP+defer results have more
  gains and fewer losses than the (m)THP=always case.

V2 Changes:
- base changes on mTHP khugepaged support
- Fix selftests parsing issue
- add mTHP defer option
- add mTHP defer Documentation

[1] - https://lkml.org/lkml/2025/2/10/1982
[2] - https://gitlab.com/npache/khugepaged_mthp_test
[3] - https://people.redhat.com/npache/mthp_khugepaged_defer/testoutput2/output.html

Nico Pache (5):
  mm: defer THP insertion to khugepaged
  mm: document transparent_hugepage=defer usage
  selftests: mm: add defer to thp setting parser
  khugepaged: add defer option to mTHP options
  mm: document mTHP defer setting

 Documentation/admin-guide/mm/transhuge.rst | 40 ++++++++++---
 include/linux/huge_mm.h                    | 18 +++++-
 mm/huge_memory.c                           | 69 +++++++++++++++++++---
 mm/khugepaged.c                            | 10 ++--
 tools/testing/selftests/mm/thp_settings.c  |  1 +
 tools/testing/selftests/mm/thp_settings.h  |  1 +
 6 files changed, 115 insertions(+), 24 deletions(-)

-- 
2.48.1




Thread overview: 14+ messages
2025-02-11  0:40 [RFC v2 0/5] mm: introduce THP deferred setting Nico Pache
2025-02-11  0:40 ` [RFC v2 1/5] mm: defer THP insertion to khugepaged Nico Pache
2025-02-17 14:59   ` Usama Arif
2025-02-17 19:24     ` Nico Pache
2025-02-11  0:40 ` [RFC v2 2/5] mm: document transparent_hugepage=defer usage Nico Pache
2025-02-17 15:04   ` Usama Arif
2025-02-17 19:30     ` Nico Pache
2025-02-11  0:40 ` [RFC v2 3/5] selftests: mm: add defer to thp setting parser Nico Pache
2025-02-11  0:40 ` [RFC v2 4/5] khugepaged: add defer option to mTHP options Nico Pache
2025-02-11  0:40 ` [RFC v2 5/5] mm: document mTHP defer setting Nico Pache
2025-02-17 15:13   ` Usama Arif
2025-02-17 19:40     ` Nico Pache
2025-02-17 14:53 ` [RFC v2 0/5] mm: introduce THP deferred setting Usama Arif
2025-02-17 19:23   ` Nico Pache
