linux-mm.kvack.org archive mirror
* [PATCH v2 0/3] mm/swap_cgroup: remove global swap cgroup lock
@ 2024-12-10  9:28 Kairui Song
From: Kairui Song @ 2024-12-10  9:28 UTC
  To: linux-mm
  Cc: Andrew Morton, Chris Li, Hugh Dickins, Huang, Ying, Yosry Ahmed,
	Roman Gushchin, Shakeel Butt, Johannes Weiner, Barry Song,
	Michal Hocko, linux-kernel, Kairui Song

From: Kairui Song <kasong@tencent.com>

This series removes the global swap cgroup lock. The critical section of
this lock is very short, but it is still a bottleneck for massively
parallel swap workloads.

Up to a 10% performance gain for the tmpfs kernel build test on a
48c96t system, with no regression in other cases:

Testing with a 64G brd and building the kernel with make -j96 in a 1.5G
memory cgroup using 4k folios showed the improvement below (10 test runs):

Before this series:
Sys time: 10809.46 (stdev 80.831491)
Real time: 171.41 (stdev 1.239894)

After this series:
Sys time: 9621.26 (stdev 34.620000), -10.42%
Real time: 160.00 (stdev 0.497814), -6.57%

With 64k folios and 2G memcg:
Before this series:
Sys time: 8231.99 (stdev 30.030994)
Real time: 143.57 (stdev 0.577394)

After this series:
Sys time: 7403.47 (stdev 6.270000), -10.06%
Real time: 135.18 (stdev 0.605000), -5.84%
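
The percentages quoted for the 64k folio run can be reproduced as a plain
relative change against the "before" value. A minimal sketch, assuming that
is how the figures were derived (the helper name relative_change_pct is
hypothetical and not part of the series):

```c
#include <stdio.h>

/*
 * Relative change in percent, measured against the "before" value.
 * A negative result means the "after" value is lower (an improvement
 * when the quantity is time).
 */
static double relative_change_pct(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Print one result line in the same style as the cover letter. */
static void print_change(const char *label, double before, double after)
{
	printf("%s: %.2f -> %.2f, %+.2f%%\n", label, before, after,
	       relative_change_pct(before, after));
}
```

Calling print_change("Sys time", 8231.99, 7403.47) and
print_change("Real time", 143.57, 135.18) gives changes that round to
-10.06% and -5.84%.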

Sequential swapout of 8G 64k zero folios (24 test runs):
Before this series:
5461409.12 us (stdev 183957.827084)

After this series:
5420447.26 us (stdev 196419.240317)

Sequential swapin of 8G 4k zero folios (24 test runs):
Before this series:
19736958.916667 us (stdev 189027.246676)

After this series:
19662182.629630 us (stdev 172717.640614)

V1: https://lore.kernel.org/linux-mm/20241202184154.19321-1-ryncsn@gmail.com/
Updates:
- Collect Reviewed-by and Acked-by tags.
- Use bit shifts instead of mixing short and atomic accesses to
  emulate a 2-byte xchg [Chris Li].
- Merge patch 3 into patch 4 for simplicity [Roman Gushchin].
- Drop the call to mem_cgroup_disabled in patch 1 instead, and also fix
  the build error reported by the bot [Yosry Ahmed].
- Wrap accesses to the atomic_t map in helpers, so the emulation can be
  dropped in favor of a native 2-byte xchg once available.
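
The bit-shift emulation of a 2-byte xchg mentioned above can be illustrated
in userspace. This is only a sketch under stated assumptions, not the code
from the patches: it uses C11 <stdatomic.h> in place of the kernel's
atomic_t, packs two 16-bit IDs into each 32-bit word, and the names
id_read, id_xchg, ID_BITS, ID_MASK and IDS_PER_WORD are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define ID_BITS       16u      /* width of one ID */
#define ID_MASK       0xffffu  /* mask for one ID */
#define IDS_PER_WORD  2u       /* 32-bit word / 16-bit ID */

/* Read the 16-bit ID stored at index `offset`. */
static uint16_t id_read(_Atomic uint32_t *map, size_t offset)
{
	unsigned int shift = (unsigned int)(offset % IDS_PER_WORD) * ID_BITS;
	uint32_t word = atomic_load(&map[offset / IDS_PER_WORD]);

	return (uint16_t)((word >> shift) & ID_MASK);
}

/*
 * Atomically replace the 16-bit ID at index `offset` and return the
 * previous ID. Only the target half of the 32-bit word changes; the
 * neighbouring ID is preserved by retrying the compare-and-exchange
 * whenever the word was modified concurrently. No lock is taken.
 */
static uint16_t id_xchg(_Atomic uint32_t *map, size_t offset, uint16_t new_id)
{
	_Atomic uint32_t *word = &map[offset / IDS_PER_WORD];
	unsigned int shift = (unsigned int)(offset % IDS_PER_WORD) * ID_BITS;
	uint32_t old_word = atomic_load(word);
	uint32_t new_word;

	do {
		/* Clear the target slot, then place the new ID in it. */
		new_word = old_word & ~((uint32_t)ID_MASK << shift);
		new_word |= (uint32_t)new_id << shift;
		/* On failure, old_word is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(word, &old_word, new_word));

	return (uint16_t)((old_word >> shift) & ID_MASK);
}
```

Keeping the shift and mask arithmetic inside two small helpers means
callers never see the packing, so the helpers could later be switched to a
native 2-byte exchange without touching the call sites.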

Kairui Song (3):
  mm, memcontrol: avoid duplicated memcg enable check
  mm/swap_cgroup: remove swap_cgroup_cmpxchg
  mm, swap_cgroup: remove global swap cgroup lock

 include/linux/swap_cgroup.h |  2 -
 mm/memcontrol.c             |  2 +-
 mm/swap_cgroup.c            | 96 ++++++++++++++++---------------------
 3 files changed, 43 insertions(+), 57 deletions(-)

-- 
2.47.1



Thread overview: 8+ messages
2024-12-10  9:28 [PATCH v2 0/3] mm/swap_cgroup: remove global swap cgroup lock Kairui Song
2024-12-10  9:28 ` [PATCH v2 1/3] mm, memcontrol: avoid duplicated memcg enable check Kairui Song
2024-12-10  9:28 ` [PATCH v2 2/3] mm/swap_cgroup: remove swap_cgroup_cmpxchg Kairui Song
2024-12-10  9:28 ` [PATCH v2 3/3] mm, swap_cgroup: remove global swap cgroup lock Kairui Song
2024-12-11  1:19   ` Roman Gushchin
2024-12-14 16:07   ` Chris Li
2024-12-14 19:48     ` Kairui Song
2024-12-15 15:04       ` Chris Li