* [PATCH v3 0/4] variable-order, large folios for anonymous memory
From: Ryan Roberts @ 2023-07-14 16:04 UTC
  To: Andrew Morton, Matthew Wilcox, Kirill A. Shutemov, Yin Fengwei,
	David Hildenbrand, Yu Zhao, Catalin Marinas, Will Deacon,
	Anshuman Khandual, Yang Shi, Huang, Ying, Zi Yan,
	Luis Chamberlain
  Cc: Ryan Roberts, linux-arm-kernel, linux-kernel, linux-mm

Hi All,

This is v3 of a series to implement variable-order, large folios for anonymous
memory (currently called "FLEXIBLE_THP"). The objective is to improve
performance by allocating larger chunks of memory during anonymous page faults.
See [1] and [2] for background.

There has been quite a bit more rework and simplification, mainly based on
feedback from Yu Zhao. Additionally, I've added a command line parameter,
flexthp_unhinted_max, the idea for which came from discussion with David
Hildenbrand (thanks for all your feedback!).

The last patch is for arm64 to explicitly override the default
arch_wants_pte_order() and is intended as an example. If this series is accepted
I suggest taking the first 3 patches through the mm tree and the arm64 change
could be handled through the arm64 tree separately. Neither has any build
dependency on the other.
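
The override mechanism mentioned above can be pictured with the common kernel
idiom of a generic default guarded by `#ifndef`. The sketch below is userspace C
with the arch header and the generic header collapsed into one file. The return
values (4 for the override, 3 for the default) are placeholders, not the values
used by the patches, and whether the series uses exactly this idiom is an
assumption.

```c
/*
 * Stand-in for an arch header (think arch/arm64/include/asm/pgtable.h).
 * The returned order is a placeholder chosen for this example only.
 */
static inline int arch_wants_pte_order(void)
{
	return 4;
}
/* Defining a macro with the same name signals that an override exists. */
#define arch_wants_pte_order arch_wants_pte_order

/*
 * Stand-in for the generic header (think include/linux/pgtable.h).
 * The default is compiled only when no arch override was defined above.
 * The returned order is again a placeholder.
 */
#ifndef arch_wants_pte_order
static inline int arch_wants_pte_order(void)
{
	return 3;
}
#endif
```

Because the arch definition comes first and defines the guard macro, callers of
arch_wants_pte_order() in this sketch get the overriding version. Removing the
first definition and the #define would make the generic default take effect.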

The patches are based on top of v6.5-rc1. I have a branch at [3].


Changes since v2 [2]
--------------------

  - Dropped commit "Allow deferred splitting of arbitrary large anon folios"
      - Huang, Ying suggested the "batch zap" work (which I dropped from this
        series after v1) is a prerequisite for merging FLEXIBLE_THP, so I've
        moved the deferred split patch to a separate series along with the batch
        zap changes. I plan to submit that series early next week.
  - Changed folio order fallback policy
      - We no longer iterate from preferred to 0 looking for acceptable policy
      - Instead we iterate through preferred, PAGE_ALLOC_COSTLY_ORDER and 0 only
  - Removed vma parameter from arch_wants_pte_order()
  - Added command line parameter `flexthp_unhinted_max`
      - clamps preferred order when vma hasn't explicitly opted in to THP
  - Never allocate a large folio for a MADV_NOHUGEPAGE vma (or when THP is
    disabled for the process or system).
  - Simplified implementation and integration with do_anonymous_page()
  - Removed dependency on set_ptes()
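
As an illustration of the fallback and clamping policy listed above, here is a
minimal userspace sketch in C. It is not the code from the patches: the helper
names (`preferred_order`, `candidate_orders`), the `enum thp_hint` type, and the
choice to express the unhinted maximum as a folio order are all assumptions made
for the example. Skipping PAGE_ALLOC_COSTLY_ORDER when the preferred order is
already at or below it is also an assumption. Only PAGE_ALLOC_COSTLY_ORDER (3 in
the kernel) and the three-step order list come from the description above.

```c
#include <stddef.h>

/* Same value as the kernel constant; redefined so the sketch builds alone. */
#define PAGE_ALLOC_COSTLY_ORDER 3

/* Hypothetical summary of how a vma relates to THP. */
enum thp_hint {
	THP_HINT_NONE,		/* vma neither opted in nor out */
	THP_HINT_OPT_IN,	/* vma explicitly opted in to THP */
	THP_HINT_OPT_OUT,	/* MADV_NOHUGEPAGE, or THP disabled */
};

/*
 * Hypothetical helper: pick the preferred folio order.
 * - Opted-out vmas never get a large folio (order 0).
 * - Unhinted vmas are clamped to unhinted_max_order.
 * - Opted-in vmas use the arch-preferred order unchanged.
 */
static int preferred_order(int arch_order, int unhinted_max_order,
			   enum thp_hint hint)
{
	if (hint == THP_HINT_OPT_OUT)
		return 0;
	if (hint == THP_HINT_NONE && arch_order > unhinted_max_order)
		return unhinted_max_order;
	return arch_order;
}

/*
 * Hypothetical helper: list the orders to try, in descending order:
 * preferred, then PAGE_ALLOC_COSTLY_ORDER (only if it is smaller than
 * preferred), then 0. Returns the number of entries written to out[].
 */
static size_t candidate_orders(int preferred, int out[3])
{
	size_t n = 0;

	if (preferred < 0)
		preferred = 0;

	out[n++] = preferred;
	if (preferred > PAGE_ALLOC_COSTLY_ORDER)
		out[n++] = PAGE_ALLOC_COSTLY_ORDER;
	if (out[n - 1] != 0)
		out[n++] = 0;

	return n;
}
```

With a preferred order of 4, this sketch would try orders 4, 3 and 0. With a
preferred order of 2 it would try only 2 and 0, since falling back to a larger
order would make no sense.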


Performance
-----------

Performance is still similar to v2; see cover letter at [2].


Opens
-----

  - Feature name: FLEXIBLE_THP or LARGE_ANON_FOLIO?
      - Given the closer policy ties to THP, I prefer FLEXIBLE_THP
  - Prerequisites for merging
      - Sounds like there is a consensus that we should wait until existing
        features are improved to play nicely with large folios.


[1] https://lore.kernel.org/linux-mm/20230626171430.3167004-1-ryan.roberts@arm.com/
[2] https://lore.kernel.org/linux-mm/20230703135330.1865927-1-ryan.roberts@arm.com/
[3] https://gitlab.arm.com/linux-arm/linux-rr/-/tree/features/granule_perf/anonfolio-lkml_v3


Thanks,
Ryan


Ryan Roberts (4):
  mm: Non-pmd-mappable, large folios for folio_add_new_anon_rmap()
  mm: Default implementation of arch_wants_pte_order()
  mm: FLEXIBLE_THP for improved performance
  arm64: mm: Override arch_wants_pte_order()

 .../admin-guide/kernel-parameters.txt         |  10 +
 arch/arm64/include/asm/pgtable.h              |   6 +
 include/linux/pgtable.h                       |  13 ++
 mm/Kconfig                                    |  10 +
 mm/memory.c                                   | 187 ++++++++++++++++--
 mm/rmap.c                                     |  28 ++-
 6 files changed, 230 insertions(+), 24 deletions(-)

--
2.25.1




Thread overview: 38+ messages
2023-07-14 16:04 [PATCH v3 0/4] variable-order, large folios for anonymous memory Ryan Roberts
2023-07-14 16:17 ` [PATCH v3 1/4] mm: Non-pmd-mappable, large folios for folio_add_new_anon_rmap() Ryan Roberts
2023-07-14 16:52   ` Yu Zhao
2023-07-14 18:01     ` Ryan Roberts
2023-07-17 13:00   ` David Hildenbrand
2023-07-17 13:13     ` Ryan Roberts
2023-07-17 13:19       ` David Hildenbrand
2023-07-17 13:21         ` Ryan Roberts
2023-07-14 16:17 ` [PATCH v3 2/4] mm: Default implementation of arch_wants_pte_order() Ryan Roberts
2023-07-14 16:54   ` Yu Zhao
2023-07-17 11:13   ` Yin Fengwei
2023-07-17 13:01   ` David Hildenbrand
2023-07-17 13:15     ` Ryan Roberts
2023-07-14 16:17 ` [PATCH v3 3/4] mm: FLEXIBLE_THP for improved performance Ryan Roberts
2023-07-14 17:17   ` Yu Zhao
2023-07-14 17:59     ` Ryan Roberts
2023-07-14 22:11       ` Yu Zhao
2023-07-17 13:36         ` Ryan Roberts
2023-07-17 19:31           ` Yu Zhao
2023-07-17 20:35             ` Yu Zhao
2023-07-17 23:37           ` Hugh Dickins
2023-07-18 10:36             ` Ryan Roberts
2023-07-17 13:06     ` David Hildenbrand
2023-07-17 13:20       ` Ryan Roberts
2023-07-17 13:56         ` David Hildenbrand
2023-07-17 14:47           ` Ryan Roberts
2023-07-17 14:55             ` David Hildenbrand
2023-07-17 17:07       ` Yu Zhao
2023-07-17 17:16         ` David Hildenbrand
2023-07-21 10:57   ` Ryan Roberts
2023-07-14 16:17 ` [PATCH v3 4/4] arm64: mm: Override arch_wants_pte_order() Ryan Roberts
2023-07-14 16:47   ` Yu Zhao
2023-07-24 11:59 ` [PATCH v3 0/4] variable-order, large folios for anonymous memory Ryan Roberts
2023-07-24 14:58   ` Zi Yan
2023-07-24 15:41     ` Ryan Roberts
2023-07-26  7:36       ` Itaru Kitayama
2023-07-26  8:42         ` Ryan Roberts
2023-07-26  8:47           ` Itaru Kitayama
