linux-mm.kvack.org archive mirror
* [PATCH v2 0/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout
@ 2024-06-22  7:12 Barry Song
  2024-06-22  7:12 ` [PATCH v2 1/1] " Barry Song
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Barry Song @ 2024-06-22  7:12 UTC (permalink / raw)
  To: akpm, chrisl, linux-mm, ryan.roberts
  Cc: david, hughd, kaleshsingh, kasong, linux-kernel, v-songbaohua,
	ying.huang

From: Barry Song <v-songbaohua@oppo.com>

-v2:
 * add swap-in which can either be aligned or not aligned, by "-a";
   Ying;
 * move the program to tools/mm; Ryan;
 * try to simulate scenarios where swap is full. Chris;

-v1:
 https://lore.kernel.org/linux-mm/20240620002648.75204-1-21cnbao@gmail.com/

I tested Ryan's RFC patchset[1] and Chris's v3[2] using this v2 tool:
[1] https://lore.kernel.org/linux-mm/20240618232648.4090299-1-ryan.roberts@arm.com/ 
[2] https://lore.kernel.org/linux-mm/20240614-swap-allocator-v2-0-2a513b4a7f2f@kernel.org/

Obviously, we're rarely hitting 100% even in the worst case without "-a" and with
"-s," which is good news!
If swapin is aligned (w/ "-a" and w/o "-s"), both Chris's and Ryan's patches show
a low fallback ratio, though Chris's numbers are above 0% while Ryan's are 0%
(value A).

The bad news is that unaligned swapin can significantly increase the fallback ratio,
reaching around 85% for Ryan's patchset and 95% for Chris's patchset without "-s". Both
patchsets approach 100% without "-a" and with "-s" (value B).

I believe real workloads should yield a value between A and B. Without "-a," and
lacking large folios swap-in, this tool randomly swaps in small folios without
considering spatial locality, which is a factor present in real workloads. This
typically results in values higher than A and lower than B.
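
To make the aligned vs. unaligned distinction concrete, here is a minimal
sketch of how a random swap-in offset can be picked at either granularity.
It follows the offset arithmetic used by the tool; pick_offset() is a
hypothetical helper name used only for illustration and is not part of
the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not in the patch): pick a random offset inside a
 * region of mem_size bytes that is a multiple of align bytes.
 * align = 64KB models aligned swap-in ("-a"); align = 4KB models
 * unaligned, page-by-page swap-in.
 */
static size_t pick_offset(size_t mem_size, size_t align)
{
	assert(align != 0 && mem_size >= align);
	return ((size_t)rand() % (mem_size / align)) * align;
}
```

With a 64KB alignment every touch covers one whole mTHP-sized unit; with a
4KB alignment a touch lands on a single page, so the rest of the
surrounding 64KB range can stay swapped out.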

Based on the below results, I believe that:
1. We truly require large folio swap-in to achieve results comparable with
aligned swap-in (based on the results w/o and w/ "-a")
2. We need a method to prevent small folios from scattering indiscriminately
(based on the result w/ "-a -s")
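
For reference, the "Fallback percentage" column in the results below is
fallback / (fallback + swpout) * 100. A minimal sketch of that arithmetic
follows; the zero check is an addition made for this sketch only, so that
an iteration in which neither counter moves yields 0 rather than 0/0.

```c
/*
 * Fallback percentage as reported per iteration:
 *   fallback_inc / (fallback_inc + swpout_inc) * 100
 * The total == 0 guard is an assumption added for this sketch.
 */
static double fallback_percentage(unsigned long swpout_inc,
				  unsigned long fallback_inc)
{
	unsigned long total = swpout_inc + fallback_inc;

	if (total == 0)
		return 0.0;
	return (double)fallback_inc / total * 100;
}
```

For example, swpout inc 208 and swpout fallback inc 25 give
25 / 233 * 100, which is about 10.73%.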

*
*  Test results on Ryan's patchset:
*

1. w/ -a
./thp_swap_allocator_test -a
Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 4: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
...
Iteration 100: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%

2. w/o -a
./thp_swap_allocator_test

Iteration 1: swpout inc: 208, swpout fallback inc: 25, Fallback percentage: 10.73%
Iteration 2: swpout inc: 118, swpout fallback inc: 114, Fallback percentage: 49.14%
Iteration 3: swpout inc: 63, swpout fallback inc: 163, Fallback percentage: 72.12%
Iteration 4: swpout inc: 45, swpout fallback inc: 178, Fallback percentage: 79.82%
Iteration 5: swpout inc: 42, swpout fallback inc: 184, Fallback percentage: 81.42%
Iteration 6: swpout inc: 31, swpout fallback inc: 193, Fallback percentage: 86.16%
Iteration 7: swpout inc: 27, swpout fallback inc: 201, Fallback percentage: 88.16%
Iteration 8: swpout inc: 30, swpout fallback inc: 198, Fallback percentage: 86.84%
Iteration 9: swpout inc: 32, swpout fallback inc: 194, Fallback percentage: 85.84%
...
Iteration 91: swpout inc: 26, swpout fallback inc: 194, Fallback percentage: 88.18%
Iteration 92: swpout inc: 35, swpout fallback inc: 196, Fallback percentage: 84.85%
Iteration 93: swpout inc: 33, swpout fallback inc: 191, Fallback percentage: 85.27%
Iteration 94: swpout inc: 26, swpout fallback inc: 193, Fallback percentage: 88.13%
Iteration 95: swpout inc: 39, swpout fallback inc: 189, Fallback percentage: 82.89%
Iteration 96: swpout inc: 28, swpout fallback inc: 196, Fallback percentage: 87.50%
Iteration 97: swpout inc: 25, swpout fallback inc: 194, Fallback percentage: 88.58%
Iteration 98: swpout inc: 31, swpout fallback inc: 196, Fallback percentage: 86.34%
Iteration 99: swpout inc: 32, swpout fallback inc: 202, Fallback percentage: 86.32%
Iteration 100: swpout inc: 33, swpout fallback inc: 195, Fallback percentage: 85.53%

3. w/ -a and -s
./thp_swap_allocator_test -a -s
Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 2: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 3: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 4: swpout inc: 220, swpout fallback inc: 6, Fallback percentage: 2.65%
Iteration 5: swpout inc: 206, swpout fallback inc: 16, Fallback percentage: 7.21%
Iteration 6: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 7: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 8: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 9: swpout inc: 217, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 10: swpout inc: 224, swpout fallback inc: 3, Fallback percentage: 1.32%
Iteration 11: swpout inc: 211, swpout fallback inc: 12, Fallback percentage: 5.38%
Iteration 12: swpout inc: 200, swpout fallback inc: 32, Fallback percentage: 13.79%
Iteration 13: swpout inc: 189, swpout fallback inc: 29, Fallback percentage: 13.30%
Iteration 14: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
Iteration 15: swpout inc: 198, swpout fallback inc: 27, Fallback percentage: 12.00%
Iteration 16: swpout inc: 201, swpout fallback inc: 17, Fallback percentage: 7.80%
Iteration 17: swpout inc: 206, swpout fallback inc: 6, Fallback percentage: 2.83%
Iteration 18: swpout inc: 220, swpout fallback inc: 14, Fallback percentage: 5.98%
Iteration 19: swpout inc: 181, swpout fallback inc: 45, Fallback percentage: 19.91%
Iteration 20: swpout inc: 223, swpout fallback inc: 8, Fallback percentage: 3.46%
Iteration 21: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
Iteration 22: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
Iteration 23: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 24: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 25: swpout inc: 214, swpout fallback inc: 1, Fallback percentage: 0.47%
Iteration 26: swpout inc: 229, swpout fallback inc: 1, Fallback percentage: 0.43%
Iteration 27: swpout inc: 214, swpout fallback inc: 5, Fallback percentage: 2.28%
Iteration 28: swpout inc: 211, swpout fallback inc: 15, Fallback percentage: 6.64%
Iteration 29: swpout inc: 188, swpout fallback inc: 40, Fallback percentage: 17.54%
Iteration 30: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
Iteration 31: swpout inc: 215, swpout fallback inc: 10, Fallback percentage: 4.44%
Iteration 32: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
Iteration 33: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 34: swpout inc: 218, swpout fallback inc: 10, Fallback percentage: 4.39%
Iteration 35: swpout inc: 203, swpout fallback inc: 30, Fallback percentage: 12.88%
Iteration 36: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
Iteration 37: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
Iteration 38: swpout inc: 193, swpout fallback inc: 28, Fallback percentage: 12.67%
Iteration 39: swpout inc: 210, swpout fallback inc: 20, Fallback percentage: 8.70%
Iteration 40: swpout inc: 223, swpout fallback inc: 5, Fallback percentage: 2.19%
Iteration 41: swpout inc: 224, swpout fallback inc: 7, Fallback percentage: 3.03%
Iteration 42: swpout inc: 200, swpout fallback inc: 23, Fallback percentage: 10.31%
Iteration 43: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
Iteration 44: swpout inc: 206, swpout fallback inc: 18, Fallback percentage: 8.04%
Iteration 45: swpout inc: 210, swpout fallback inc: 11, Fallback percentage: 4.98%
Iteration 46: swpout inc: 204, swpout fallback inc: 19, Fallback percentage: 8.52%
Iteration 47: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 48: swpout inc: 219, swpout fallback inc: 2, Fallback percentage: 0.90%
Iteration 49: swpout inc: 212, swpout fallback inc: 6, Fallback percentage: 2.75%
Iteration 50: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
Iteration 51: swpout inc: 190, swpout fallback inc: 36, Fallback percentage: 15.93%
Iteration 52: swpout inc: 212, swpout fallback inc: 17, Fallback percentage: 7.42%
Iteration 53: swpout inc: 179, swpout fallback inc: 43, Fallback percentage: 19.37%
Iteration 54: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 55: swpout inc: 224, swpout fallback inc: 2, Fallback percentage: 0.88%
Iteration 56: swpout inc: 220, swpout fallback inc: 8, Fallback percentage: 3.51%
Iteration 57: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
Iteration 58: swpout inc: 213, swpout fallback inc: 6, Fallback percentage: 2.74%
Iteration 59: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
Iteration 60: swpout inc: 216, swpout fallback inc: 14, Fallback percentage: 6.09%
Iteration 61: swpout inc: 183, swpout fallback inc: 34, Fallback percentage: 15.67%
Iteration 62: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
Iteration 63: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
Iteration 64: swpout inc: 210, swpout fallback inc: 15, Fallback percentage: 6.67%
Iteration 65: swpout inc: 178, swpout fallback inc: 48, Fallback percentage: 21.24%
Iteration 66: swpout inc: 188, swpout fallback inc: 30, Fallback percentage: 13.76%
Iteration 67: swpout inc: 193, swpout fallback inc: 29, Fallback percentage: 13.06%
Iteration 68: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
Iteration 69: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
Iteration 70: swpout inc: 204, swpout fallback inc: 15, Fallback percentage: 6.85%
Iteration 71: swpout inc: 180, swpout fallback inc: 45, Fallback percentage: 20.00%
Iteration 72: swpout inc: 210, swpout fallback inc: 21, Fallback percentage: 9.09%
Iteration 73: swpout inc: 216, swpout fallback inc: 7, Fallback percentage: 3.14%
Iteration 74: swpout inc: 209, swpout fallback inc: 19, Fallback percentage: 8.33%
Iteration 75: swpout inc: 222, swpout fallback inc: 7, Fallback percentage: 3.06%
Iteration 76: swpout inc: 212, swpout fallback inc: 14, Fallback percentage: 6.19%
Iteration 77: swpout inc: 188, swpout fallback inc: 41, Fallback percentage: 17.90%
Iteration 78: swpout inc: 198, swpout fallback inc: 17, Fallback percentage: 7.91%
Iteration 79: swpout inc: 209, swpout fallback inc: 16, Fallback percentage: 7.11%
Iteration 80: swpout inc: 182, swpout fallback inc: 41, Fallback percentage: 18.39%
Iteration 81: swpout inc: 217, swpout fallback inc: 1, Fallback percentage: 0.46%
Iteration 82: swpout inc: 225, swpout fallback inc: 3, Fallback percentage: 1.32%
Iteration 83: swpout inc: 222, swpout fallback inc: 8, Fallback percentage: 3.48%
Iteration 84: swpout inc: 201, swpout fallback inc: 21, Fallback percentage: 9.46%
Iteration 85: swpout inc: 211, swpout fallback inc: 3, Fallback percentage: 1.40%
Iteration 86: swpout inc: 209, swpout fallback inc: 14, Fallback percentage: 6.28%
Iteration 87: swpout inc: 181, swpout fallback inc: 42, Fallback percentage: 18.83%
Iteration 88: swpout inc: 223, swpout fallback inc: 4, Fallback percentage: 1.76%
Iteration 89: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
Iteration 90: swpout inc: 192, swpout fallback inc: 33, Fallback percentage: 14.67%
Iteration 91: swpout inc: 184, swpout fallback inc: 31, Fallback percentage: 14.42%
Iteration 92: swpout inc: 201, swpout fallback inc: 32, Fallback percentage: 13.73%
Iteration 93: swpout inc: 181, swpout fallback inc: 40, Fallback percentage: 18.10%
Iteration 94: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
Iteration 95: swpout inc: 198, swpout fallback inc: 25, Fallback percentage: 11.21%
Iteration 96: swpout inc: 205, swpout fallback inc: 22, Fallback percentage: 9.69%
Iteration 97: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
Iteration 98: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
Iteration 99: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
Iteration 100: swpout inc: 195, swpout fallback inc: 34, Fallback percentage: 14.85%

4. w/o -a and w/ -s
./thp_swap_allocator_test -s
Iteration 1: swpout inc: 173, swpout fallback inc: 60, Fallback percentage: 25.75%
Iteration 2: swpout inc: 85, swpout fallback inc: 147, Fallback percentage: 63.36%
Iteration 3: swpout inc: 39, swpout fallback inc: 195, Fallback percentage: 83.33%
Iteration 4: swpout inc: 13, swpout fallback inc: 220, Fallback percentage: 94.42%
Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
Iteration 6: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
Iteration 7: swpout inc: 6, swpout fallback inc: 217, Fallback percentage: 97.31%
Iteration 8: swpout inc: 6, swpout fallback inc: 215, Fallback percentage: 97.29%
Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
Iteration 10: swpout inc: 1, swpout fallback inc: 229, Fallback percentage: 99.57%
Iteration 11: swpout inc: 2, swpout fallback inc: 216, Fallback percentage: 99.08%
Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
Iteration 13: swpout inc: 4, swpout fallback inc: 211, Fallback percentage: 98.14%
Iteration 14: swpout inc: 1, swpout fallback inc: 221, Fallback percentage: 99.55%
Iteration 15: swpout inc: 2, swpout fallback inc: 223, Fallback percentage: 99.11%
Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
Iteration 17: swpout inc: 2, swpout fallback inc: 231, Fallback percentage: 99.14%
...

*
*  Test results on Chris's v3 patchset:
*
1. w/ -a
./thp_swap_allocator_test -a
Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 4: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
Iteration 5: swpout inc: 215, swpout fallback inc: 12, Fallback percentage: 5.29%
Iteration 6: swpout inc: 213, swpout fallback inc: 14, Fallback percentage: 6.17%
Iteration 7: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
Iteration 8: swpout inc: 193, swpout fallback inc: 33, Fallback percentage: 14.60%
Iteration 9: swpout inc: 214, swpout fallback inc: 13, Fallback percentage: 5.73%
Iteration 10: swpout inc: 199, swpout fallback inc: 25, Fallback percentage: 11.16%
Iteration 11: swpout inc: 208, swpout fallback inc: 14, Fallback percentage: 6.31%
Iteration 12: swpout inc: 203, swpout fallback inc: 31, Fallback percentage: 13.25%
Iteration 13: swpout inc: 192, swpout fallback inc: 25, Fallback percentage: 11.52%
Iteration 14: swpout inc: 193, swpout fallback inc: 36, Fallback percentage: 15.72%
Iteration 15: swpout inc: 188, swpout fallback inc: 33, Fallback percentage: 14.93%
...

It seems Chris's approach can be negatively affected even by aligned swapin:
its fallback ratio is low but not 0%, while Ryan's patchset does not have this
issue.

2. w/o -a
./thp_swap_allocator_test
Iteration 1: swpout inc: 209, swpout fallback inc: 24, Fallback percentage: 10.30%
Iteration 2: swpout inc: 100, swpout fallback inc: 132, Fallback percentage: 56.90%
Iteration 3: swpout inc: 43, swpout fallback inc: 183, Fallback percentage: 80.97%
Iteration 4: swpout inc: 30, swpout fallback inc: 193, Fallback percentage: 86.55%
Iteration 5: swpout inc: 21, swpout fallback inc: 205, Fallback percentage: 90.71%
Iteration 6: swpout inc: 10, swpout fallback inc: 214, Fallback percentage: 95.54%
Iteration 7: swpout inc: 16, swpout fallback inc: 212, Fallback percentage: 92.98%
Iteration 8: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
Iteration 9: swpout inc: 6, swpout fallback inc: 220, Fallback percentage: 97.35%
Iteration 10: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
Iteration 11: swpout inc: 7, swpout fallback inc: 222, Fallback percentage: 96.94%
Iteration 12: swpout inc: 8, swpout fallback inc: 212, Fallback percentage: 96.36%
...

Ryan's fallback ratio (around 85%) seems to be a little better, while both are much
worse than with "-a".

3. w/ -a and -s
./thp_swap_allocator_test -a -s
Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 2: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
Iteration 3: swpout inc: 215, swpout fallback inc: 7, Fallback percentage: 3.15%
Iteration 4: swpout inc: 210, swpout fallback inc: 16, Fallback percentage: 7.08%
Iteration 5: swpout inc: 212, swpout fallback inc: 10, Fallback percentage: 4.50%
Iteration 6: swpout inc: 215, swpout fallback inc: 18, Fallback percentage: 7.73%
Iteration 7: swpout inc: 181, swpout fallback inc: 43, Fallback percentage: 19.20%
Iteration 8: swpout inc: 173, swpout fallback inc: 55, Fallback percentage: 24.12%
Iteration 9: swpout inc: 163, swpout fallback inc: 54, Fallback percentage: 24.88%
Iteration 10: swpout inc: 168, swpout fallback inc: 59, Fallback percentage: 25.99%
Iteration 11: swpout inc: 154, swpout fallback inc: 69, Fallback percentage: 30.94%
Iteration 12: swpout inc: 166, swpout fallback inc: 66, Fallback percentage: 28.45%
Iteration 13: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
Iteration 14: swpout inc: 158, swpout fallback inc: 68, Fallback percentage: 30.09%
Iteration 15: swpout inc: 168, swpout fallback inc: 57, Fallback percentage: 25.33%
Iteration 16: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
Iteration 17: swpout inc: 163, swpout fallback inc: 49, Fallback percentage: 23.11%
Iteration 18: swpout inc: 172, swpout fallback inc: 62, Fallback percentage: 26.50%
Iteration 19: swpout inc: 183, swpout fallback inc: 43, Fallback percentage: 19.03%
Iteration 20: swpout inc: 158, swpout fallback inc: 73, Fallback percentage: 31.60%
Iteration 21: swpout inc: 147, swpout fallback inc: 81, Fallback percentage: 35.53%
Iteration 22: swpout inc: 140, swpout fallback inc: 86, Fallback percentage: 38.05%
Iteration 23: swpout inc: 144, swpout fallback inc: 79, Fallback percentage: 35.43%
Iteration 24: swpout inc: 132, swpout fallback inc: 101, Fallback percentage: 43.35%
Iteration 25: swpout inc: 133, swpout fallback inc: 82, Fallback percentage: 38.14%
Iteration 26: swpout inc: 152, swpout fallback inc: 78, Fallback percentage: 33.91%
Iteration 27: swpout inc: 138, swpout fallback inc: 81, Fallback percentage: 36.99%
Iteration 28: swpout inc: 152, swpout fallback inc: 74, Fallback percentage: 32.74%
Iteration 29: swpout inc: 153, swpout fallback inc: 75, Fallback percentage: 32.89%
Iteration 30: swpout inc: 151, swpout fallback inc: 74, Fallback percentage: 32.89%
...

Chris's approach appears to be more susceptible to negative effects from
small folios.

4. w/o -a and w/ -s
./thp_swap_allocator_test -s
Iteration 1: swpout inc: 183, swpout fallback inc: 50, Fallback percentage: 21.46%
Iteration 2: swpout inc: 75, swpout fallback inc: 157, Fallback percentage: 67.67%
Iteration 3: swpout inc: 33, swpout fallback inc: 201, Fallback percentage: 85.90%
Iteration 4: swpout inc: 11, swpout fallback inc: 222, Fallback percentage: 95.28%
Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
Iteration 6: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
Iteration 7: swpout inc: 2, swpout fallback inc: 221, Fallback percentage: 99.10%
Iteration 8: swpout inc: 4, swpout fallback inc: 217, Fallback percentage: 98.19%
Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
Iteration 10: swpout inc: 3, swpout fallback inc: 227, Fallback percentage: 98.70%
Iteration 11: swpout inc: 1, swpout fallback inc: 217, Fallback percentage: 99.54%
Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
Iteration 13: swpout inc: 1, swpout fallback inc: 214, Fallback percentage: 99.53%
Iteration 14: swpout inc: 2, swpout fallback inc: 220, Fallback percentage: 99.10%
Iteration 15: swpout inc: 1, swpout fallback inc: 224, Fallback percentage: 99.56%
Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
...

Barry Song (1):
  tools/mm: Introduce a tool to assess swap entry allocation for
    thp_swapout

 tools/mm/Makefile                  |   2 +-
 tools/mm/thp_swap_allocator_test.c | 233 +++++++++++++++++++++++++++++
 2 files changed, 234 insertions(+), 1 deletion(-)
 create mode 100644 tools/mm/thp_swap_allocator_test.c

-- 
2.34.1




* [PATCH v2 1/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout
  2024-06-22  7:12 [PATCH v2 0/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout Barry Song
@ 2024-06-22  7:12 ` Barry Song
  2024-06-25 17:22   ` Kairui Song
  2024-07-05  9:31   ` Ryan Roberts
  2024-06-24  8:26 ` [PATCH v2 0/1] " Ryan Roberts
  2024-06-24 10:06 ` Chris Li
  2 siblings, 2 replies; 16+ messages in thread
From: Barry Song @ 2024-06-22  7:12 UTC (permalink / raw)
  To: akpm, chrisl, linux-mm, ryan.roberts
  Cc: david, hughd, kaleshsingh, kasong, linux-kernel, v-songbaohua,
	ying.huang

From: Barry Song <v-songbaohua@oppo.com>

Both Ryan and Chris have been utilizing the small test program to aid
in debugging and identifying issues with swap entry allocation. While
a real or intricate workload might be more suitable for assessing the
correctness and effectiveness of the swap allocation policy, a small
test program presents a simpler means of understanding the problem and
initially verifying the improvements being made.

Let's endeavor to integrate it into tools/mm. Although it presently
only accommodates 64KB and 4KB, I'm optimistic that we can expand
its capabilities to support multiple sizes and simulate more
complex systems in the future as required.

Basically, we
1. Use MADV_PAGEOUT for rapid swap-out, putting the swap allocation code
under heavy exercise in a short time.
2. Use MADV_DONTNEED to simulate the behavior of libc and Java heap in
freeing memory, as well as for munmap, app exits, or OOM killer scenarios.
This ensures new mTHP is always generated, released or swapped out, similar
to the behavior on a PC or Android phone where many applications are
frequently started and terminated.
3. Swap in with or without the "-a" option to observe how fragments
due to swap-in and the incoming swap-in of large folios will impact
swap-out fallback.
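
The three steps above can be sketched with plain madvise() calls. This is
an illustrative outline only: the helper names are hypothetical, and the
fallback definition of MADV_PAGEOUT (21, the Linux value) is an assumption
for libc headers that predate it.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21	/* Linux value; assumed for older libc headers */
#endif

/* Step 1: ask the kernel to reclaim (swap out) the range right away. */
static int swap_out_range(void *mem, size_t len)
{
	return madvise(mem, len, MADV_PAGEOUT);
}

/*
 * Step 2: drop the range as a heap free or munmap would, which also
 * frees any swap entries backing it, then touch it again so that
 * fresh folios are faulted in.
 */
static int drop_and_refill(void *addr, size_t len)
{
	if (madvise(addr, len, MADV_DONTNEED) != 0)
		return -1;
	memset(addr, 0x11, len);
	return 0;
}

/* Step 3: touching swapped-out memory is what triggers swap-in. */
static void swap_in_range(void *addr, size_t len)
{
	memset(addr, 0x11, len);
}
```

All three helpers expect page-aligned addresses. Whether step 1 really
swaps anything out depends on the kernel version and on a swap device
being configured, so callers should treat a failure there as non-fatal
when only experimenting.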

Due to 2, we ensure a certain proportion of mTHP. Similarly, because
of 3, we maintain a certain proportion of small folios, as we don't
support large folios swap-in, meaning any swap-in will immediately
result in small folios. Therefore, with both 2 and 3, we automatically
achieve a system containing both mTHP and small folios. Additionally,
1 provides the ability to continuously swap them out.

We can also use "-s" to add a dedicated small folios memory area.

Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
 tools/mm/Makefile                  |   2 +-
 tools/mm/thp_swap_allocator_test.c | 233 +++++++++++++++++++++++++++++
 2 files changed, 234 insertions(+), 1 deletion(-)
 create mode 100644 tools/mm/thp_swap_allocator_test.c

diff --git a/tools/mm/Makefile b/tools/mm/Makefile
index 7bb03606b9ea..15791c1c5b28 100644
--- a/tools/mm/Makefile
+++ b/tools/mm/Makefile
@@ -3,7 +3,7 @@
 #
 include ../scripts/Makefile.include
 
-BUILD_TARGETS=page-types slabinfo page_owner_sort
+BUILD_TARGETS=page-types slabinfo page_owner_sort thp_swap_allocator_test
 INSTALL_TARGETS = $(BUILD_TARGETS) thpmaps
 
 LIB_DIR = ../lib/api
diff --git a/tools/mm/thp_swap_allocator_test.c b/tools/mm/thp_swap_allocator_test.c
new file mode 100644
index 000000000000..a363bdde55f0
--- /dev/null
+++ b/tools/mm/thp_swap_allocator_test.c
@@ -0,0 +1,233 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * thp_swap_allocator_test
+ *
+ * The purpose of this test program is to help check whether THP swpout
+ * can correctly get swap slots to swap out as a whole instead of
+ * being split. It randomly releases swap entries through madvise
+ * DONTNEED and swapin/out on two memory areas: a memory area for
+ * 64KB THP and the other area for small folios. The second memory
+ * can be enabled by "-s".
+ * Before running the program, we need to set up a zRAM or similar
+ * swap device by:
+ *  echo lzo > /sys/block/zram0/comp_algorithm
+ *  echo 64M > /sys/block/zram0/disksize
+ *  echo never > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
+ *  echo always > /sys/kernel/mm/transparent_hugepage/hugepages-64kB/enabled
+ *  mkswap /dev/zram0
+ *  swapon /dev/zram0
+ * The expected result should be 0% anon swpout fallback ratio w/ or
+ * w/o "-s".
+ *
+ * Author(s): Barry Song <v-songbaohua@oppo.com>
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <errno.h>
+#include <time.h>
+
+#define MEMSIZE_MTHP (60 * 1024 * 1024)
+#define MEMSIZE_SMALLFOLIO (4 * 1024 * 1024)
+#define ALIGNMENT_MTHP (64 * 1024)
+#define ALIGNMENT_SMALLFOLIO (4 * 1024)
+#define TOTAL_DONTNEED_MTHP (16 * 1024 * 1024)
+#define TOTAL_DONTNEED_SMALLFOLIO (1 * 1024 * 1024)
+#define MTHP_FOLIO_SIZE (64 * 1024)
+
+#define SWPOUT_PATH \
+	"/sys/kernel/mm/transparent_hugepage/hugepages-64kB/stats/swpout"
+#define SWPOUT_FALLBACK_PATH \
+	"/sys/kernel/mm/transparent_hugepage/hugepages-64kB/stats/swpout_fallback"
+
+static void *aligned_alloc_mem(size_t size, size_t alignment)
+{
+	void *mem = NULL;
+
+	if (posix_memalign(&mem, alignment, size) != 0) {
+		perror("posix_memalign");
+		return NULL;
+	}
+	return mem;
+}
+
+/*
+ * This emulates the behavior of native libc and Java heap,
+ * as well as process exit and munmap. It helps generate mTHP
+ * and ensures that iterations can proceed with mTHP, as we
+ * currently don't support large folios swap-in.
+ */
+static void random_madvise_dontneed(void *mem, size_t mem_size,
+		size_t align_size, size_t total_dontneed_size)
+{
+	size_t num_pages = total_dontneed_size / align_size;
+	size_t i;
+	size_t offset;
+	void *addr;
+
+	for (i = 0; i < num_pages; ++i) {
+		offset = (rand() % (mem_size / align_size)) * align_size;
+		addr = (char *)mem + offset;
+		if (madvise(addr, align_size, MADV_DONTNEED) != 0)
+			perror("madvise dontneed");
+
+		memset(addr, 0x11, align_size);
+	}
+}
+
+static void random_swapin(void *mem, size_t mem_size,
+		size_t align_size, size_t total_swapin_size)
+{
+	size_t num_pages = total_swapin_size / align_size;
+	size_t i;
+	size_t offset;
+	void *addr;
+
+	for (i = 0; i < num_pages; ++i) {
+		offset = (rand() % (mem_size / align_size)) * align_size;
+		addr = (char *)mem + offset;
+		memset(addr, 0x11, align_size);
+	}
+}
+
+static unsigned long read_stat(const char *path)
+{
+	FILE *file;
+	unsigned long value;
+
+	file = fopen(path, "r");
+	if (!file) {
+		perror("fopen");
+		return 0;
+	}
+
+	if (fscanf(file, "%lu", &value) != 1) {
+		perror("fscanf");
+		fclose(file);
+		return 0;
+	}
+
+	fclose(file);
+	return value;
+}
+
+int main(int argc, char *argv[])
+{
+	int use_small_folio = 0, aligned_swapin = 0;
+	void *mem1 = NULL, *mem2 = NULL;
+	int i;
+
+	for (i = 1; i < argc; ++i) {
+		if (strcmp(argv[i], "-s") == 0)
+			use_small_folio = 1;
+		else if (strcmp(argv[i], "-a") == 0)
+			aligned_swapin = 1;
+	}
+
+	mem1 = aligned_alloc_mem(MEMSIZE_MTHP, ALIGNMENT_MTHP);
+	if (mem1 == NULL) {
+		fprintf(stderr, "Failed to allocate large folios memory\n");
+		return EXIT_FAILURE;
+	}
+
+	if (madvise(mem1, MEMSIZE_MTHP, MADV_HUGEPAGE) != 0) {
+		perror("madvise hugepage for mem1");
+		free(mem1);
+		return EXIT_FAILURE;
+	}
+
+	if (use_small_folio) {
+		mem2 = aligned_alloc_mem(MEMSIZE_SMALLFOLIO, ALIGNMENT_MTHP);
+		if (mem2 == NULL) {
+			fprintf(stderr, "Failed to allocate small folios memory\n");
+			free(mem1);
+			return EXIT_FAILURE;
+		}
+
+		if (madvise(mem2, MEMSIZE_SMALLFOLIO, MADV_NOHUGEPAGE) != 0) {
+			perror("madvise nohugepage for mem2");
+			free(mem1);
+			free(mem2);
+			return EXIT_FAILURE;
+		}
+	}
+
+	/* warm-up phase to occupy the swapfile */
+	memset(mem1, 0x11, MEMSIZE_MTHP);
+	madvise(mem1, MEMSIZE_MTHP, MADV_PAGEOUT);
+	if (use_small_folio) {
+		memset(mem2, 0x11, MEMSIZE_SMALLFOLIO);
+		madvise(mem2, MEMSIZE_SMALLFOLIO, MADV_PAGEOUT);
+	}
+
+	/* iterations with newly created mTHP, swap-in, and swap-out */
+	for (i = 0; i < 100; ++i) {
+		unsigned long initial_swpout;
+		unsigned long initial_swpout_fallback;
+		unsigned long final_swpout;
+		unsigned long final_swpout_fallback;
+		unsigned long swpout_inc;
+		unsigned long swpout_fallback_inc;
+		double fallback_percentage;
+
+		initial_swpout = read_stat(SWPOUT_PATH);
+		initial_swpout_fallback = read_stat(SWPOUT_FALLBACK_PATH);
+
+		/*
+		 * The following setup creates a 1:1 ratio of mTHP to small folios
+		 * since large folio swap-in isn't supported yet. Once we support
+		 * mTHP swap-in, we'll likely need to reduce MEMSIZE_MTHP and
+		 * increase MEMSIZE_SMALLFOLIO to maintain the ratio.
+		 */
+		random_swapin(mem1, MEMSIZE_MTHP,
+				aligned_swapin ? ALIGNMENT_MTHP : ALIGNMENT_SMALLFOLIO,
+				TOTAL_DONTNEED_MTHP);
+		random_madvise_dontneed(mem1, MEMSIZE_MTHP, ALIGNMENT_MTHP,
+				TOTAL_DONTNEED_MTHP);
+
+		if (use_small_folio) {
+			random_swapin(mem2, MEMSIZE_SMALLFOLIO,
+					ALIGNMENT_SMALLFOLIO,
+					TOTAL_DONTNEED_SMALLFOLIO);
+		}
+
+		if (madvise(mem1, MEMSIZE_MTHP, MADV_PAGEOUT) != 0) {
+			perror("madvise pageout for mem1");
+			free(mem1);
+			if (mem2 != NULL)
+				free(mem2);
+			return EXIT_FAILURE;
+		}
+
+		if (use_small_folio) {
+			if (madvise(mem2, MEMSIZE_SMALLFOLIO, MADV_PAGEOUT) != 0) {
+				perror("madvise pageout for mem2");
+				free(mem1);
+				free(mem2);
+				return EXIT_FAILURE;
+			}
+		}
+
+		final_swpout = read_stat(SWPOUT_PATH);
+		final_swpout_fallback = read_stat(SWPOUT_FALLBACK_PATH);
+
+		swpout_inc = final_swpout - initial_swpout;
+		swpout_fallback_inc = final_swpout_fallback - initial_swpout_fallback;
+
+		fallback_percentage = (double)swpout_fallback_inc /
+			(swpout_fallback_inc + swpout_inc) * 100;
+
+		printf("Iteration %d: swpout inc: %lu, swpout fallback inc: %lu, Fallback percentage: %.2f%%\n",
+				i + 1, swpout_inc, swpout_fallback_inc, fallback_percentage);
+	}
+
+	free(mem1);
+	if (mem2 != NULL)
+		free(mem2);
+
+	return EXIT_SUCCESS;
+}
-- 
2.34.1




* Re: [PATCH v2 0/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout
  2024-06-22  7:12 [PATCH v2 0/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout Barry Song
  2024-06-22  7:12 ` [PATCH v2 1/1] " Barry Song
@ 2024-06-24  8:26 ` Ryan Roberts
  2024-06-24  8:42   ` Barry Song
  2024-06-24 10:06 ` Chris Li
  2 siblings, 1 reply; 16+ messages in thread
From: Ryan Roberts @ 2024-06-24  8:26 UTC (permalink / raw)
  To: Barry Song, akpm, chrisl, linux-mm
  Cc: david, hughd, kaleshsingh, kasong, linux-kernel, v-songbaohua,
	ying.huang

On 22/06/2024 08:12, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> -v2:
>  * add swap-in which can either be aligned or not aligned, by "-a";
>    Ying;
>  * move the program to tools/mm; Ryan;
>  * try to simulate the scenarios swap is full. Chris;
> 
> -v1:
>  https://lore.kernel.org/linux-mm/20240620002648.75204-1-21cnbao@gmail.com/
> 
> I tested Ryan's RFC patchset[1] and Chris's v3[2] using this v2 tool:
> [1] https://lore.kernel.org/linux-mm/20240618232648.4090299-1-ryan.roberts@arm.com/ 
> [2] https://lore.kernel.org/linux-mm/20240614-swap-allocator-v2-0-2a513b4a7f2f@kernel.org/
> 
> Obviously, we're rarely hitting 100% even in the worst case without "-a" and with
> "-s," which is good news!
> If swapin is aligned w/ "-a" and w/o "-s", both Chris's and Ryan's patches show
> a low fallback ratio though Chris's has the numbers above 0% but Ryan's are 0%
> (value A).
> 
> The bad news is that unaligned swapin can significantly increase the fallback ratio,
> reaching up to 85% for Ryan's patch and 95% for Chris's patchset without "-s." Both
> approaches approach 100% without "-a" and with "-s" (value B).
> 
> I believe real workloads should yield a value between A and B. Without "-a," and
> lacking large folios swap-in, this tool randomly swaps in small folios without
> considering spatial locality, which is a factor present in real workloads. This
> typically results in values higher than A and lower than B.
> 
> Based on the below results, I believe that:

Thanks for putting this together and providing such detailed results!

> 1. We truly require large folio swap-in to achieve comparable results with
> aligned swap-in(based on the result w/o and w/ "-a")

I certainly agree that as long as we require a high order swap entry to be
contiguous in the backing store then it looks like we are going to need large
folio swap-in to prevent enormous fragmentation. I guess Chris's proposed layer
of indirection to allow pages to be scattered in the backing store would also
solve the problem? Although, I'm not sure this would work well for zRAM?

Perhaps another way of looking at this is that we are doing a bad job of
selecting when to use an mTHP and when not to use one in the first place;
ideally the workload would access the data across the entire mTHP with high
temporal locality? In that case, we would expect the whole mTHP to be swapped in
even with the current page-by-page approach. Figuring out this "auto sizing"
seems like an incredibly complex problem to solve though.

> 2. We need a method to prevent small folios from scattering indiscriminately
> (based on the result "-a -s")

I'm confused by this statement; as I understand it, both my and Chris's patches
already try to do this. Certainly for mine, when searching for order-0 space, I
search the non-full order-0 clusters first (just like for other orders).
Although for order-0 I will still fall back to searching any cluster if no space
is found in an order-0 cluster. What more can we do?

When run against your v1 of the tool with "-s" (v1 always implicitly behaves as
if "-a" is specified, right?) my patch gives 0% fallback. So what's the
difference in v2 that causes higher fallback rate? Possibly just that
MEMSIZE_SMALLFOLIO has grown by 3MB so that the total memory matches the swap
size (64M)?
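
For reference, the fallback rate discussed here is the ratio the tool prints for
each iteration, derived from the two counter deltas. A minimal sketch of that
arithmetic follows; the helper name fallback_percentage() and the explicit
handling of a zero total are assumptions for illustration and are not taken
from the posted patch.

```c
#include <assert.h>

/*
 * Hypothetical helper: percentage of swap-out attempts that fell back,
 * computed from the per-iteration deltas of the swpout and
 * swpout_fallback counters.
 *
 * Assumption: when neither counter moved, report 0.0 rather than
 * dividing zero by zero.
 */
static double fallback_percentage(unsigned long swpout_inc,
				  unsigned long swpout_fallback_inc)
{
	unsigned long total = swpout_inc + swpout_fallback_inc;

	/* Guard against unsigned wraparound of the sum. */
	assert(total >= swpout_inc);

	if (total == 0)
		return 0.0;

	return (double)swpout_fallback_inc / (double)total * 100.0;
}
```

With the deltas from one of the quoted runs (swpout inc 208, fallback inc 25),
this gives 25 / 233, or roughly 10.73%.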

Thanks,
Ryan

> 
> *
> *  Test results on Ryan's patchset:
> *
> 
> 1. w/ -a
> ./thp_swap_allocator_test -a
> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 4: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
> ...
> Iteration 100: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
> 
> 2. w/o -a
> ./thp_swap_allocator_test
> 
> Iteration 1: swpout inc: 208, swpout fallback inc: 25, Fallback percentage: 10.73%
> Iteration 2: swpout inc: 118, swpout fallback inc: 114, Fallback percentage: 49.14%
> Iteration 3: swpout inc: 63, swpout fallback inc: 163, Fallback percentage: 72.12%
> Iteration 4: swpout inc: 45, swpout fallback inc: 178, Fallback percentage: 79.82%
> Iteration 5: swpout inc: 42, swpout fallback inc: 184, Fallback percentage: 81.42%
> Iteration 6: swpout inc: 31, swpout fallback inc: 193, Fallback percentage: 86.16%
> Iteration 7: swpout inc: 27, swpout fallback inc: 201, Fallback percentage: 88.16%
> Iteration 8: swpout inc: 30, swpout fallback inc: 198, Fallback percentage: 86.84%
> Iteration 9: swpout inc: 32, swpout fallback inc: 194, Fallback percentage: 85.84%
> ...
> Iteration 91: swpout inc: 26, swpout fallback inc: 194, Fallback percentage: 88.18%
> Iteration 92: swpout inc: 35, swpout fallback inc: 196, Fallback percentage: 84.85%
> Iteration 93: swpout inc: 33, swpout fallback inc: 191, Fallback percentage: 85.27%
> Iteration 94: swpout inc: 26, swpout fallback inc: 193, Fallback percentage: 88.13%
> Iteration 95: swpout inc: 39, swpout fallback inc: 189, Fallback percentage: 82.89%
> Iteration 96: swpout inc: 28, swpout fallback inc: 196, Fallback percentage: 87.50%
> Iteration 97: swpout inc: 25, swpout fallback inc: 194, Fallback percentage: 88.58%
> Iteration 98: swpout inc: 31, swpout fallback inc: 196, Fallback percentage: 86.34%
> Iteration 99: swpout inc: 32, swpout fallback inc: 202, Fallback percentage: 86.32%
> Iteration 100: swpout inc: 33, swpout fallback inc: 195, Fallback percentage: 85.53%
> 
> 3. w/ -a and -s
> ./thp_swap_allocator_test -a -s
> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 2: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 3: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 4: swpout inc: 220, swpout fallback inc: 6, Fallback percentage: 2.65%
> Iteration 5: swpout inc: 206, swpout fallback inc: 16, Fallback percentage: 7.21%
> Iteration 6: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 7: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 8: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 9: swpout inc: 217, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 10: swpout inc: 224, swpout fallback inc: 3, Fallback percentage: 1.32%
> Iteration 11: swpout inc: 211, swpout fallback inc: 12, Fallback percentage: 5.38%
> Iteration 12: swpout inc: 200, swpout fallback inc: 32, Fallback percentage: 13.79%
> Iteration 13: swpout inc: 189, swpout fallback inc: 29, Fallback percentage: 13.30%
> Iteration 14: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
> Iteration 15: swpout inc: 198, swpout fallback inc: 27, Fallback percentage: 12.00%
> Iteration 16: swpout inc: 201, swpout fallback inc: 17, Fallback percentage: 7.80%
> Iteration 17: swpout inc: 206, swpout fallback inc: 6, Fallback percentage: 2.83%
> Iteration 18: swpout inc: 220, swpout fallback inc: 14, Fallback percentage: 5.98%
> Iteration 19: swpout inc: 181, swpout fallback inc: 45, Fallback percentage: 19.91%
> Iteration 20: swpout inc: 223, swpout fallback inc: 8, Fallback percentage: 3.46%
> Iteration 21: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
> Iteration 22: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
> Iteration 23: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 24: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 25: swpout inc: 214, swpout fallback inc: 1, Fallback percentage: 0.47%
> Iteration 26: swpout inc: 229, swpout fallback inc: 1, Fallback percentage: 0.43%
> Iteration 27: swpout inc: 214, swpout fallback inc: 5, Fallback percentage: 2.28%
> Iteration 28: swpout inc: 211, swpout fallback inc: 15, Fallback percentage: 6.64%
> Iteration 29: swpout inc: 188, swpout fallback inc: 40, Fallback percentage: 17.54%
> Iteration 30: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
> Iteration 31: swpout inc: 215, swpout fallback inc: 10, Fallback percentage: 4.44%
> Iteration 32: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
> Iteration 33: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 34: swpout inc: 218, swpout fallback inc: 10, Fallback percentage: 4.39%
> Iteration 35: swpout inc: 203, swpout fallback inc: 30, Fallback percentage: 12.88%
> Iteration 36: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
> Iteration 37: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
> Iteration 38: swpout inc: 193, swpout fallback inc: 28, Fallback percentage: 12.67%
> Iteration 39: swpout inc: 210, swpout fallback inc: 20, Fallback percentage: 8.70%
> Iteration 40: swpout inc: 223, swpout fallback inc: 5, Fallback percentage: 2.19%
> Iteration 41: swpout inc: 224, swpout fallback inc: 7, Fallback percentage: 3.03%
> Iteration 42: swpout inc: 200, swpout fallback inc: 23, Fallback percentage: 10.31%
> Iteration 43: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
> Iteration 44: swpout inc: 206, swpout fallback inc: 18, Fallback percentage: 8.04%
> Iteration 45: swpout inc: 210, swpout fallback inc: 11, Fallback percentage: 4.98%
> Iteration 46: swpout inc: 204, swpout fallback inc: 19, Fallback percentage: 8.52%
> Iteration 47: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 48: swpout inc: 219, swpout fallback inc: 2, Fallback percentage: 0.90%
> Iteration 49: swpout inc: 212, swpout fallback inc: 6, Fallback percentage: 2.75%
> Iteration 50: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
> Iteration 51: swpout inc: 190, swpout fallback inc: 36, Fallback percentage: 15.93%
> Iteration 52: swpout inc: 212, swpout fallback inc: 17, Fallback percentage: 7.42%
> Iteration 53: swpout inc: 179, swpout fallback inc: 43, Fallback percentage: 19.37%
> Iteration 54: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 55: swpout inc: 224, swpout fallback inc: 2, Fallback percentage: 0.88%
> Iteration 56: swpout inc: 220, swpout fallback inc: 8, Fallback percentage: 3.51%
> Iteration 57: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
> Iteration 58: swpout inc: 213, swpout fallback inc: 6, Fallback percentage: 2.74%
> Iteration 59: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
> Iteration 60: swpout inc: 216, swpout fallback inc: 14, Fallback percentage: 6.09%
> Iteration 61: swpout inc: 183, swpout fallback inc: 34, Fallback percentage: 15.67%
> Iteration 62: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
> Iteration 63: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
> Iteration 64: swpout inc: 210, swpout fallback inc: 15, Fallback percentage: 6.67%
> Iteration 65: swpout inc: 178, swpout fallback inc: 48, Fallback percentage: 21.24%
> Iteration 66: swpout inc: 188, swpout fallback inc: 30, Fallback percentage: 13.76%
> Iteration 67: swpout inc: 193, swpout fallback inc: 29, Fallback percentage: 13.06%
> Iteration 68: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
> Iteration 69: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
> Iteration 70: swpout inc: 204, swpout fallback inc: 15, Fallback percentage: 6.85%
> Iteration 71: swpout inc: 180, swpout fallback inc: 45, Fallback percentage: 20.00%
> Iteration 72: swpout inc: 210, swpout fallback inc: 21, Fallback percentage: 9.09%
> Iteration 73: swpout inc: 216, swpout fallback inc: 7, Fallback percentage: 3.14%
> Iteration 74: swpout inc: 209, swpout fallback inc: 19, Fallback percentage: 8.33%
> Iteration 75: swpout inc: 222, swpout fallback inc: 7, Fallback percentage: 3.06%
> Iteration 76: swpout inc: 212, swpout fallback inc: 14, Fallback percentage: 6.19%
> Iteration 77: swpout inc: 188, swpout fallback inc: 41, Fallback percentage: 17.90%
> Iteration 78: swpout inc: 198, swpout fallback inc: 17, Fallback percentage: 7.91%
> Iteration 79: swpout inc: 209, swpout fallback inc: 16, Fallback percentage: 7.11%
> Iteration 80: swpout inc: 182, swpout fallback inc: 41, Fallback percentage: 18.39%
> Iteration 81: swpout inc: 217, swpout fallback inc: 1, Fallback percentage: 0.46%
> Iteration 82: swpout inc: 225, swpout fallback inc: 3, Fallback percentage: 1.32%
> Iteration 83: swpout inc: 222, swpout fallback inc: 8, Fallback percentage: 3.48%
> Iteration 84: swpout inc: 201, swpout fallback inc: 21, Fallback percentage: 9.46%
> Iteration 85: swpout inc: 211, swpout fallback inc: 3, Fallback percentage: 1.40%
> Iteration 86: swpout inc: 209, swpout fallback inc: 14, Fallback percentage: 6.28%
> Iteration 87: swpout inc: 181, swpout fallback inc: 42, Fallback percentage: 18.83%
> Iteration 88: swpout inc: 223, swpout fallback inc: 4, Fallback percentage: 1.76%
> Iteration 89: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
> Iteration 90: swpout inc: 192, swpout fallback inc: 33, Fallback percentage: 14.67%
> Iteration 91: swpout inc: 184, swpout fallback inc: 31, Fallback percentage: 14.42%
> Iteration 92: swpout inc: 201, swpout fallback inc: 32, Fallback percentage: 13.73%
> Iteration 93: swpout inc: 181, swpout fallback inc: 40, Fallback percentage: 18.10%
> Iteration 94: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
> Iteration 95: swpout inc: 198, swpout fallback inc: 25, Fallback percentage: 11.21%
> Iteration 96: swpout inc: 205, swpout fallback inc: 22, Fallback percentage: 9.69%
> Iteration 97: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
> Iteration 98: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
> Iteration 99: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
> Iteration 100: swpout inc: 195, swpout fallback inc: 34, Fallback percentage: 14.85%
> 
> 4. w/o -a and w/ -s
> thp_swap_allocator_test  -s
> Iteration 1: swpout inc: 173, swpout fallback inc: 60, Fallback percentage: 25.75%
> Iteration 2: swpout inc: 85, swpout fallback inc: 147, Fallback percentage: 63.36%
> Iteration 3: swpout inc: 39, swpout fallback inc: 195, Fallback percentage: 83.33%
> Iteration 4: swpout inc: 13, swpout fallback inc: 220, Fallback percentage: 94.42%
> Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
> Iteration 6: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
> Iteration 7: swpout inc: 6, swpout fallback inc: 217, Fallback percentage: 97.31%
> Iteration 8: swpout inc: 6, swpout fallback inc: 215, Fallback percentage: 97.29%
> Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
> Iteration 10: swpout inc: 1, swpout fallback inc: 229, Fallback percentage: 99.57%
> Iteration 11: swpout inc: 2, swpout fallback inc: 216, Fallback percentage: 99.08%
> Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
> Iteration 13: swpout inc: 4, swpout fallback inc: 211, Fallback percentage: 98.14%
> Iteration 14: swpout inc: 1, swpout fallback inc: 221, Fallback percentage: 99.55%
> Iteration 15: swpout inc: 2, swpout fallback inc: 223, Fallback percentage: 99.11%
> Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
> Iteration 17: swpout inc: 2, swpout fallback inc: 231, Fallback percentage: 99.14%
> ...
> 
> *
> *  Test results on Chris's v3 patchset:
> *
> 1. w/ -a
> ./thp_swap_allocator_test -a
> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 4: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
> Iteration 5: swpout inc: 215, swpout fallback inc: 12, Fallback percentage: 5.29%
> Iteration 6: swpout inc: 213, swpout fallback inc: 14, Fallback percentage: 6.17%
> Iteration 7: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
> Iteration 8: swpout inc: 193, swpout fallback inc: 33, Fallback percentage: 14.60%
> Iteration 9: swpout inc: 214, swpout fallback inc: 13, Fallback percentage: 5.73%
> Iteration 10: swpout inc: 199, swpout fallback inc: 25, Fallback percentage: 11.16%
> Iteration 11: swpout inc: 208, swpout fallback inc: 14, Fallback percentage: 6.31%
> Iteration 12: swpout inc: 203, swpout fallback inc: 31, Fallback percentage: 13.25%
> Iteration 13: swpout inc: 192, swpout fallback inc: 25, Fallback percentage: 11.52%
> Iteration 14: swpout inc: 193, swpout fallback inc: 36, Fallback percentage: 15.72%
> Iteration 15: swpout inc: 188, swpout fallback inc: 33, Fallback percentage: 14.93%
> ...
> 
> It seems Chris's approach can be negatively affected even by aligned swapin,
> having a low fallback ratio but not 0% while Ryan's patchset hasn't this
> issue.
> 
> 2. w/o -a
> ./thp_swap_allocator_test
> Iteration 1: swpout inc: 209, swpout fallback inc: 24, Fallback percentage: 10.30%
> Iteration 2: swpout inc: 100, swpout fallback inc: 132, Fallback percentage: 56.90%
> Iteration 3: swpout inc: 43, swpout fallback inc: 183, Fallback percentage: 80.97%
> Iteration 4: swpout inc: 30, swpout fallback inc: 193, Fallback percentage: 86.55%
> Iteration 5: swpout inc: 21, swpout fallback inc: 205, Fallback percentage: 90.71%
> Iteration 6: swpout inc: 10, swpout fallback inc: 214, Fallback percentage: 95.54%
> Iteration 7: swpout inc: 16, swpout fallback inc: 212, Fallback percentage: 92.98%
> Iteration 8: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
> Iteration 9: swpout inc: 6, swpout fallback inc: 220, Fallback percentage: 97.35%
> Iteration 10: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
> Iteration 11: swpout inc: 7, swpout fallback inc: 222, Fallback percentage: 96.94%
> Iteration 12: swpout inc: 8, swpout fallback inc: 212, Fallback percentage: 96.36%
> ...
> 
> Ryan's fallback ratio(around 85%) seems to be a little better while both are much
> worse than "-a".
> 
> 3. w/ -a and -s
> ./thp_swap_allocator_test -a -s
> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 2: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
> Iteration 3: swpout inc: 215, swpout fallback inc: 7, Fallback percentage: 3.15%
> Iteration 4: swpout inc: 210, swpout fallback inc: 16, Fallback percentage: 7.08%
> Iteration 5: swpout inc: 212, swpout fallback inc: 10, Fallback percentage: 4.50%
> Iteration 6: swpout inc: 215, swpout fallback inc: 18, Fallback percentage: 7.73%
> Iteration 7: swpout inc: 181, swpout fallback inc: 43, Fallback percentage: 19.20%
> Iteration 8: swpout inc: 173, swpout fallback inc: 55, Fallback percentage: 24.12%
> Iteration 9: swpout inc: 163, swpout fallback inc: 54, Fallback percentage: 24.88%
> Iteration 10: swpout inc: 168, swpout fallback inc: 59, Fallback percentage: 25.99%
> Iteration 11: swpout inc: 154, swpout fallback inc: 69, Fallback percentage: 30.94%
> Iteration 12: swpout inc: 166, swpout fallback inc: 66, Fallback percentage: 28.45%
> Iteration 13: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
> Iteration 14: swpout inc: 158, swpout fallback inc: 68, Fallback percentage: 30.09%
> Iteration 15: swpout inc: 168, swpout fallback inc: 57, Fallback percentage: 25.33%
> Iteration 16: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
> Iteration 17: swpout inc: 163, swpout fallback inc: 49, Fallback percentage: 23.11%
> Iteration 18: swpout inc: 172, swpout fallback inc: 62, Fallback percentage: 26.50%
> Iteration 19: swpout inc: 183, swpout fallback inc: 43, Fallback percentage: 19.03%
> Iteration 20: swpout inc: 158, swpout fallback inc: 73, Fallback percentage: 31.60%
> Iteration 21: swpout inc: 147, swpout fallback inc: 81, Fallback percentage: 35.53%
> Iteration 22: swpout inc: 140, swpout fallback inc: 86, Fallback percentage: 38.05%
> Iteration 23: swpout inc: 144, swpout fallback inc: 79, Fallback percentage: 35.43%
> Iteration 24: swpout inc: 132, swpout fallback inc: 101, Fallback percentage: 43.35%
> Iteration 25: swpout inc: 133, swpout fallback inc: 82, Fallback percentage: 38.14%
> Iteration 26: swpout inc: 152, swpout fallback inc: 78, Fallback percentage: 33.91%
> Iteration 27: swpout inc: 138, swpout fallback inc: 81, Fallback percentage: 36.99%
> Iteration 28: swpout inc: 152, swpout fallback inc: 74, Fallback percentage: 32.74%
> Iteration 29: swpout inc: 153, swpout fallback inc: 75, Fallback percentage: 32.89%
> Iteration 30: swpout inc: 151, swpout fallback inc: 74, Fallback percentage: 32.89%
> ...
> 
> Chris's approach appears to be more susceptible to negative effects from
> small folios.
> 
> 4. w/o -a and w/ -s
> ./thp_swap_allocator_test -s
> Iteration 1: swpout inc: 183, swpout fallback inc: 50, Fallback percentage: 21.46%
> Iteration 2: swpout inc: 75, swpout fallback inc: 157, Fallback percentage: 67.67%
> Iteration 3: swpout inc: 33, swpout fallback inc: 201, Fallback percentage: 85.90%
> Iteration 4: swpout inc: 11, swpout fallback inc: 222, Fallback percentage: 95.28%
> Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
> Iteration 6: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
> Iteration 7: swpout inc: 2, swpout fallback inc: 221, Fallback percentage: 99.10%
> Iteration 8: swpout inc: 4, swpout fallback inc: 217, Fallback percentage: 98.19%
> Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
> Iteration 10: swpout inc: 3, swpout fallback inc: 227, Fallback percentage: 98.70%
> Iteration 11: swpout inc: 1, swpout fallback inc: 217, Fallback percentage: 99.54%
> Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
> Iteration 13: swpout inc: 1, swpout fallback inc: 214, Fallback percentage: 99.53%
> Iteration 14: swpout inc: 2, swpout fallback inc: 220, Fallback percentage: 99.10%
> Iteration 15: swpout inc: 1, swpout fallback inc: 224, Fallback percentage: 99.56%
> Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
> ...
> 
> Barry Song (1):
>   tools/mm: Introduce a tool to assess swap entry allocation for
>     thp_swapout
> 
>  tools/mm/Makefile                  |   2 +-
>  tools/mm/thp_swap_allocator_test.c | 233 +++++++++++++++++++++++++++++
>  2 files changed, 234 insertions(+), 1 deletion(-)
>  create mode 100644 tools/mm/thp_swap_allocator_test.c
> 




* Re: [PATCH v2 0/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout
  2024-06-24  8:26 ` [PATCH v2 0/1] " Ryan Roberts
@ 2024-06-24  8:42   ` Barry Song
  2024-06-24 10:35     ` Ryan Roberts
  0 siblings, 1 reply; 16+ messages in thread
From: Barry Song @ 2024-06-24  8:42 UTC (permalink / raw)
  To: Ryan Roberts
  Cc: akpm, chrisl, linux-mm, david, hughd, kaleshsingh, kasong,
	linux-kernel, v-songbaohua, ying.huang

On Mon, Jun 24, 2024 at 8:26 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 22/06/2024 08:12, Barry Song wrote:
> > From: Barry Song <v-songbaohua@oppo.com>
> >
> > -v2:
> >  * add swap-in which can either be aligned or not aligned, by "-a";
> >    Ying;
> >  * move the program to tools/mm; Ryan;
> >  * try to simulate the scenarios swap is full. Chris;
> >
> > -v1:
> >  https://lore.kernel.org/linux-mm/20240620002648.75204-1-21cnbao@gmail.com/
> >
> > I tested Ryan's RFC patchset[1] and Chris's v3[2] using this v2 tool:
> > [1] https://lore.kernel.org/linux-mm/20240618232648.4090299-1-ryan.roberts@arm.com/
> > [2] https://lore.kernel.org/linux-mm/20240614-swap-allocator-v2-0-2a513b4a7f2f@kernel.org/
> >
> > Obviously, we're rarely hitting 100% even in the worst case without "-a" and with
> > "-s," which is good news!
> > If swapin is aligned w/ "-a" and w/o "-s", both Chris's and Ryan's patches show
> > a low fallback ratio though Chris's has the numbers above 0% but Ryan's are 0%
> > (value A).
> >
> > The bad news is that unaligned swapin can significantly increase the fallback ratio,
> > reaching up to 85% for Ryan's patch and 95% for Chris's patchset without "-s." Both
> > approaches approach 100% without "-a" and with "-s" (value B).
> >
> > I believe real workloads should yield a value between A and B. Without "-a," and
> > lacking large folios swap-in, this tool randomly swaps in small folios without
> > considering spatial locality, which is a factor present in real workloads. This
> > typically results in values higher than A and lower than B.
> >
> > Based on the below results, I believe that:
>
> Thanks for putting this together and providing such detailed results!
>
> > 1. We truly require large folio swap-in to achieve comparable results with
> > aligned swap-in(based on the result w/o and w/ "-a")
>
> I certainly agree that as long as we require a high order swap entry to be
> contiguous in the backing store then it looks like we are going to need large
> folio swap-in to prevent enormous fragmentation. I guess Chris's proposed layer
> of indirection to allow pages to be scattered in the backing store would also
> solve the problem? Although, I'm not sure this would work well for zRAM?

The challenge is that we also want to benefit from improving zsmalloc to store
compressed multi-page blocks. However, this seems nearly impossible for
zsmalloc to achieve if an mTHP is scattered rather than kept together in
zRAM.

>
> Perhaps another way of looking at this is that we are doing a bad job of
> selecting when to use an mTHP and when not to use one in the first place;
> ideally the workload would access the data across the entire mTHP with high
> temporal locality? In that case, we would expect the whole mTHP to be swapped in
> even with the current page-by-page approach. Figuring out this "auto sizing"
> seems like an incredibly complex problem to solve though.

The good news is that this is exactly what we're implementing in our products,
and it has been deployed on millions of phones.

  *  Allocate an mTHP and swap in the entire mTHP in do_swap_page();
  *  If mTHP allocation fails, allocate 16 pages in do_swap_page() and swap them in;

To be honest, we haven't noticed a visible increase in memory footprint. This is
likely because Android's anonymous memory exhibits good spatial locality, and
64KiB strikes a good balance—neither too large nor too small.
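
As a rough userspace illustration of the range arithmetic behind the two steps
above (this is not the kernel code, and every name in it is hypothetical): with
4KiB base pages assumed, a 64KiB mTHP covers 16 pages, and the range to swap in
is the 64KiB-aligned window containing the faulting address.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES  4096UL         /* assumption: 4KiB base pages */
#define MTHP_SIZE_BYTES  (64UL * 1024)  /* 64KiB mTHP, as discussed above */
#define MTHP_NR_PAGES    (MTHP_SIZE_BYTES / PAGE_SIZE_BYTES)  /* 16 pages */

/* Hypothetical description of the window to swap in around a fault. */
struct swapin_range {
	uintptr_t start;         /* 64KiB-aligned start address */
	unsigned long nr_pages;  /* number of base pages in the window */
};

/*
 * Round the faulting address down to the mTHP boundary. The same window
 * applies whether the pages end up in one large folio or, if that
 * allocation fails, in 16 individually allocated pages.
 */
static struct swapin_range aligned_swapin_range(uintptr_t fault_addr)
{
	struct swapin_range r;

	/* The mask trick below requires a power-of-two size. */
	assert((MTHP_SIZE_BYTES & (MTHP_SIZE_BYTES - 1)) == 0);

	r.start = fault_addr & ~(uintptr_t)(MTHP_SIZE_BYTES - 1);
	r.nr_pages = MTHP_NR_PAGES;
	return r;
}
```

For example, a fault at address 0x12345 maps to the window starting at 0x10000
and spanning 16 pages.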

The bad news is that I haven't found a way to convince the community this
is universally correct.

>
> > 2. We need a method to prevent small folios from scattering indiscriminately
> > (based on the result "-a -s")
>
> I'm confused by this statement; as I understand it, both my and Chris's patches
> already try to do this. Certainly for mine, when searching for order-0 space, I
> search the non-full order-0 clusters first (just like for other orders).
> Although for order-0 I will still fall back to searching any cluster if no space
> is found in an order-0 cluster. What more can we do?
>
> When run against your v1 of the tool with "-s" (v1 always implicitly behaves as
> if "-a" is specified, right?) my patch gives 0% fallback. So what's the
> difference in v2 that causes higher fallback rate? Possibly just that
> MEMSIZE_SMALLFOLIO has grown by 3MB so that the total memory matches the swap
> size (64M)?

Exactly. From my understanding, we've reached a point where small folios are
struggling to find swap slots. Note that I always swap out mTHP before swapping
out small folios. Additionally, I have already swapped in 1MB of small folios
before swapping out, which means zRAM has 1MB-4KB of redundant space available
for mTHP to swap out.

>
> Thanks,
> Ryan
>
> >
> > *
> > *  Test results on Ryan's patchset:
> > *
> >
> > 1. w/ -a
> > ./thp_swap_allocator_test -a
> > Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 4: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
> > ...
> > Iteration 100: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
> >
> > 2. w/o -a
> > ./thp_swap_allocator_test
> >
> > Iteration 1: swpout inc: 208, swpout fallback inc: 25, Fallback percentage: 10.73%
> > Iteration 2: swpout inc: 118, swpout fallback inc: 114, Fallback percentage: 49.14%
> > Iteration 3: swpout inc: 63, swpout fallback inc: 163, Fallback percentage: 72.12%
> > Iteration 4: swpout inc: 45, swpout fallback inc: 178, Fallback percentage: 79.82%
> > Iteration 5: swpout inc: 42, swpout fallback inc: 184, Fallback percentage: 81.42%
> > Iteration 6: swpout inc: 31, swpout fallback inc: 193, Fallback percentage: 86.16%
> > Iteration 7: swpout inc: 27, swpout fallback inc: 201, Fallback percentage: 88.16%
> > Iteration 8: swpout inc: 30, swpout fallback inc: 198, Fallback percentage: 86.84%
> > Iteration 9: swpout inc: 32, swpout fallback inc: 194, Fallback percentage: 85.84%
> > ...
> > Iteration 91: swpout inc: 26, swpout fallback inc: 194, Fallback percentage: 88.18%
> > Iteration 92: swpout inc: 35, swpout fallback inc: 196, Fallback percentage: 84.85%
> > Iteration 93: swpout inc: 33, swpout fallback inc: 191, Fallback percentage: 85.27%
> > Iteration 94: swpout inc: 26, swpout fallback inc: 193, Fallback percentage: 88.13%
> > Iteration 95: swpout inc: 39, swpout fallback inc: 189, Fallback percentage: 82.89%
> > Iteration 96: swpout inc: 28, swpout fallback inc: 196, Fallback percentage: 87.50%
> > Iteration 97: swpout inc: 25, swpout fallback inc: 194, Fallback percentage: 88.58%
> > Iteration 98: swpout inc: 31, swpout fallback inc: 196, Fallback percentage: 86.34%
> > Iteration 99: swpout inc: 32, swpout fallback inc: 202, Fallback percentage: 86.32%
> > Iteration 100: swpout inc: 33, swpout fallback inc: 195, Fallback percentage: 85.53%
> >
> > 3. w/ -a and -s
> > ./thp_swap_allocator_test -a -s
> > Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 2: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 3: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 4: swpout inc: 220, swpout fallback inc: 6, Fallback percentage: 2.65%
> > Iteration 5: swpout inc: 206, swpout fallback inc: 16, Fallback percentage: 7.21%
> > Iteration 6: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 7: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 8: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 9: swpout inc: 217, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 10: swpout inc: 224, swpout fallback inc: 3, Fallback percentage: 1.32%
> > Iteration 11: swpout inc: 211, swpout fallback inc: 12, Fallback percentage: 5.38%
> > Iteration 12: swpout inc: 200, swpout fallback inc: 32, Fallback percentage: 13.79%
> > Iteration 13: swpout inc: 189, swpout fallback inc: 29, Fallback percentage: 13.30%
> > Iteration 14: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
> > Iteration 15: swpout inc: 198, swpout fallback inc: 27, Fallback percentage: 12.00%
> > Iteration 16: swpout inc: 201, swpout fallback inc: 17, Fallback percentage: 7.80%
> > Iteration 17: swpout inc: 206, swpout fallback inc: 6, Fallback percentage: 2.83%
> > Iteration 18: swpout inc: 220, swpout fallback inc: 14, Fallback percentage: 5.98%
> > Iteration 19: swpout inc: 181, swpout fallback inc: 45, Fallback percentage: 19.91%
> > Iteration 20: swpout inc: 223, swpout fallback inc: 8, Fallback percentage: 3.46%
> > Iteration 21: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
> > Iteration 22: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
> > Iteration 23: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 24: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 25: swpout inc: 214, swpout fallback inc: 1, Fallback percentage: 0.47%
> > Iteration 26: swpout inc: 229, swpout fallback inc: 1, Fallback percentage: 0.43%
> > Iteration 27: swpout inc: 214, swpout fallback inc: 5, Fallback percentage: 2.28%
> > Iteration 28: swpout inc: 211, swpout fallback inc: 15, Fallback percentage: 6.64%
> > Iteration 29: swpout inc: 188, swpout fallback inc: 40, Fallback percentage: 17.54%
> > Iteration 30: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
> > Iteration 31: swpout inc: 215, swpout fallback inc: 10, Fallback percentage: 4.44%
> > Iteration 32: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
> > Iteration 33: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 34: swpout inc: 218, swpout fallback inc: 10, Fallback percentage: 4.39%
> > Iteration 35: swpout inc: 203, swpout fallback inc: 30, Fallback percentage: 12.88%
> > Iteration 36: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
> > Iteration 37: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
> > Iteration 38: swpout inc: 193, swpout fallback inc: 28, Fallback percentage: 12.67%
> > Iteration 39: swpout inc: 210, swpout fallback inc: 20, Fallback percentage: 8.70%
> > Iteration 40: swpout inc: 223, swpout fallback inc: 5, Fallback percentage: 2.19%
> > Iteration 41: swpout inc: 224, swpout fallback inc: 7, Fallback percentage: 3.03%
> > Iteration 42: swpout inc: 200, swpout fallback inc: 23, Fallback percentage: 10.31%
> > Iteration 43: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
> > Iteration 44: swpout inc: 206, swpout fallback inc: 18, Fallback percentage: 8.04%
> > Iteration 45: swpout inc: 210, swpout fallback inc: 11, Fallback percentage: 4.98%
> > Iteration 46: swpout inc: 204, swpout fallback inc: 19, Fallback percentage: 8.52%
> > Iteration 47: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 48: swpout inc: 219, swpout fallback inc: 2, Fallback percentage: 0.90%
> > Iteration 49: swpout inc: 212, swpout fallback inc: 6, Fallback percentage: 2.75%
> > Iteration 50: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
> > Iteration 51: swpout inc: 190, swpout fallback inc: 36, Fallback percentage: 15.93%
> > Iteration 52: swpout inc: 212, swpout fallback inc: 17, Fallback percentage: 7.42%
> > Iteration 53: swpout inc: 179, swpout fallback inc: 43, Fallback percentage: 19.37%
> > Iteration 54: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 55: swpout inc: 224, swpout fallback inc: 2, Fallback percentage: 0.88%
> > Iteration 56: swpout inc: 220, swpout fallback inc: 8, Fallback percentage: 3.51%
> > Iteration 57: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
> > Iteration 58: swpout inc: 213, swpout fallback inc: 6, Fallback percentage: 2.74%
> > Iteration 59: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
> > Iteration 60: swpout inc: 216, swpout fallback inc: 14, Fallback percentage: 6.09%
> > Iteration 61: swpout inc: 183, swpout fallback inc: 34, Fallback percentage: 15.67%
> > Iteration 62: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
> > Iteration 63: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
> > Iteration 64: swpout inc: 210, swpout fallback inc: 15, Fallback percentage: 6.67%
> > Iteration 65: swpout inc: 178, swpout fallback inc: 48, Fallback percentage: 21.24%
> > Iteration 66: swpout inc: 188, swpout fallback inc: 30, Fallback percentage: 13.76%
> > Iteration 67: swpout inc: 193, swpout fallback inc: 29, Fallback percentage: 13.06%
> > Iteration 68: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
> > Iteration 69: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
> > Iteration 70: swpout inc: 204, swpout fallback inc: 15, Fallback percentage: 6.85%
> > Iteration 71: swpout inc: 180, swpout fallback inc: 45, Fallback percentage: 20.00%
> > Iteration 72: swpout inc: 210, swpout fallback inc: 21, Fallback percentage: 9.09%
> > Iteration 73: swpout inc: 216, swpout fallback inc: 7, Fallback percentage: 3.14%
> > Iteration 74: swpout inc: 209, swpout fallback inc: 19, Fallback percentage: 8.33%
> > Iteration 75: swpout inc: 222, swpout fallback inc: 7, Fallback percentage: 3.06%
> > Iteration 76: swpout inc: 212, swpout fallback inc: 14, Fallback percentage: 6.19%
> > Iteration 77: swpout inc: 188, swpout fallback inc: 41, Fallback percentage: 17.90%
> > Iteration 78: swpout inc: 198, swpout fallback inc: 17, Fallback percentage: 7.91%
> > Iteration 79: swpout inc: 209, swpout fallback inc: 16, Fallback percentage: 7.11%
> > Iteration 80: swpout inc: 182, swpout fallback inc: 41, Fallback percentage: 18.39%
> > Iteration 81: swpout inc: 217, swpout fallback inc: 1, Fallback percentage: 0.46%
> > Iteration 82: swpout inc: 225, swpout fallback inc: 3, Fallback percentage: 1.32%
> > Iteration 83: swpout inc: 222, swpout fallback inc: 8, Fallback percentage: 3.48%
> > Iteration 84: swpout inc: 201, swpout fallback inc: 21, Fallback percentage: 9.46%
> > Iteration 85: swpout inc: 211, swpout fallback inc: 3, Fallback percentage: 1.40%
> > Iteration 86: swpout inc: 209, swpout fallback inc: 14, Fallback percentage: 6.28%
> > Iteration 87: swpout inc: 181, swpout fallback inc: 42, Fallback percentage: 18.83%
> > Iteration 88: swpout inc: 223, swpout fallback inc: 4, Fallback percentage: 1.76%
> > Iteration 89: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
> > Iteration 90: swpout inc: 192, swpout fallback inc: 33, Fallback percentage: 14.67%
> > Iteration 91: swpout inc: 184, swpout fallback inc: 31, Fallback percentage: 14.42%
> > Iteration 92: swpout inc: 201, swpout fallback inc: 32, Fallback percentage: 13.73%
> > Iteration 93: swpout inc: 181, swpout fallback inc: 40, Fallback percentage: 18.10%
> > Iteration 94: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
> > Iteration 95: swpout inc: 198, swpout fallback inc: 25, Fallback percentage: 11.21%
> > Iteration 96: swpout inc: 205, swpout fallback inc: 22, Fallback percentage: 9.69%
> > Iteration 97: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
> > Iteration 98: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
> > Iteration 99: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
> > Iteration 100: swpout inc: 195, swpout fallback inc: 34, Fallback percentage: 14.85%
> >
> > 4. w/o -a and w/ -s
> > ./thp_swap_allocator_test -s
> > Iteration 1: swpout inc: 173, swpout fallback inc: 60, Fallback percentage: 25.75%
> > Iteration 2: swpout inc: 85, swpout fallback inc: 147, Fallback percentage: 63.36%
> > Iteration 3: swpout inc: 39, swpout fallback inc: 195, Fallback percentage: 83.33%
> > Iteration 4: swpout inc: 13, swpout fallback inc: 220, Fallback percentage: 94.42%
> > Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
> > Iteration 6: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
> > Iteration 7: swpout inc: 6, swpout fallback inc: 217, Fallback percentage: 97.31%
> > Iteration 8: swpout inc: 6, swpout fallback inc: 215, Fallback percentage: 97.29%
> > Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
> > Iteration 10: swpout inc: 1, swpout fallback inc: 229, Fallback percentage: 99.57%
> > Iteration 11: swpout inc: 2, swpout fallback inc: 216, Fallback percentage: 99.08%
> > Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
> > Iteration 13: swpout inc: 4, swpout fallback inc: 211, Fallback percentage: 98.14%
> > Iteration 14: swpout inc: 1, swpout fallback inc: 221, Fallback percentage: 99.55%
> > Iteration 15: swpout inc: 2, swpout fallback inc: 223, Fallback percentage: 99.11%
> > Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
> > Iteration 17: swpout inc: 2, swpout fallback inc: 231, Fallback percentage: 99.14%
> > ...
> >
> > *
> > *  Test results on Chris's v3 patchset:
> > *
> > 1. w/ -a
> > ./thp_swap_allocator_test -a
> > Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 4: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
> > Iteration 5: swpout inc: 215, swpout fallback inc: 12, Fallback percentage: 5.29%
> > Iteration 6: swpout inc: 213, swpout fallback inc: 14, Fallback percentage: 6.17%
> > Iteration 7: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
> > Iteration 8: swpout inc: 193, swpout fallback inc: 33, Fallback percentage: 14.60%
> > Iteration 9: swpout inc: 214, swpout fallback inc: 13, Fallback percentage: 5.73%
> > Iteration 10: swpout inc: 199, swpout fallback inc: 25, Fallback percentage: 11.16%
> > Iteration 11: swpout inc: 208, swpout fallback inc: 14, Fallback percentage: 6.31%
> > Iteration 12: swpout inc: 203, swpout fallback inc: 31, Fallback percentage: 13.25%
> > Iteration 13: swpout inc: 192, swpout fallback inc: 25, Fallback percentage: 11.52%
> > Iteration 14: swpout inc: 193, swpout fallback inc: 36, Fallback percentage: 15.72%
> > Iteration 15: swpout inc: 188, swpout fallback inc: 33, Fallback percentage: 14.93%
> > ...
> >
> > It seems Chris's approach can be negatively affected even by aligned swapin:
> > it has a low fallback ratio, but not 0%, while Ryan's patchset doesn't have
> > this issue.
> >
> > 2. w/o -a
> > ./thp_swap_allocator_test
> > Iteration 1: swpout inc: 209, swpout fallback inc: 24, Fallback percentage: 10.30%
> > Iteration 2: swpout inc: 100, swpout fallback inc: 132, Fallback percentage: 56.90%
> > Iteration 3: swpout inc: 43, swpout fallback inc: 183, Fallback percentage: 80.97%
> > Iteration 4: swpout inc: 30, swpout fallback inc: 193, Fallback percentage: 86.55%
> > Iteration 5: swpout inc: 21, swpout fallback inc: 205, Fallback percentage: 90.71%
> > Iteration 6: swpout inc: 10, swpout fallback inc: 214, Fallback percentage: 95.54%
> > Iteration 7: swpout inc: 16, swpout fallback inc: 212, Fallback percentage: 92.98%
> > Iteration 8: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
> > Iteration 9: swpout inc: 6, swpout fallback inc: 220, Fallback percentage: 97.35%
> > Iteration 10: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
> > Iteration 11: swpout inc: 7, swpout fallback inc: 222, Fallback percentage: 96.94%
> > Iteration 12: swpout inc: 8, swpout fallback inc: 212, Fallback percentage: 96.36%
> > ..
> >
> > Ryan's fallback ratio (around 85%) seems to be a little better, while both are
> > much worse than with "-a".
> >
> > 3. w/ -a and -s
> > ./thp_swap_allocator_test -a -s
> > Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> > Iteration 2: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
> > Iteration 3: swpout inc: 215, swpout fallback inc: 7, Fallback percentage: 3.15%
> > Iteration 4: swpout inc: 210, swpout fallback inc: 16, Fallback percentage: 7.08%
> > Iteration 5: swpout inc: 212, swpout fallback inc: 10, Fallback percentage: 4.50%
> > Iteration 6: swpout inc: 215, swpout fallback inc: 18, Fallback percentage: 7.73%
> > Iteration 7: swpout inc: 181, swpout fallback inc: 43, Fallback percentage: 19.20%
> > Iteration 8: swpout inc: 173, swpout fallback inc: 55, Fallback percentage: 24.12%
> > Iteration 9: swpout inc: 163, swpout fallback inc: 54, Fallback percentage: 24.88%
> > Iteration 10: swpout inc: 168, swpout fallback inc: 59, Fallback percentage: 25.99%
> > Iteration 11: swpout inc: 154, swpout fallback inc: 69, Fallback percentage: 30.94%
> > Iteration 12: swpout inc: 166, swpout fallback inc: 66, Fallback percentage: 28.45%
> > Iteration 13: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
> > Iteration 14: swpout inc: 158, swpout fallback inc: 68, Fallback percentage: 30.09%
> > Iteration 15: swpout inc: 168, swpout fallback inc: 57, Fallback percentage: 25.33%
> > Iteration 16: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
> > Iteration 17: swpout inc: 163, swpout fallback inc: 49, Fallback percentage: 23.11%
> > Iteration 18: swpout inc: 172, swpout fallback inc: 62, Fallback percentage: 26.50%
> > Iteration 19: swpout inc: 183, swpout fallback inc: 43, Fallback percentage: 19.03%
> > Iteration 20: swpout inc: 158, swpout fallback inc: 73, Fallback percentage: 31.60%
> > Iteration 21: swpout inc: 147, swpout fallback inc: 81, Fallback percentage: 35.53%
> > Iteration 22: swpout inc: 140, swpout fallback inc: 86, Fallback percentage: 38.05%
> > Iteration 23: swpout inc: 144, swpout fallback inc: 79, Fallback percentage: 35.43%
> > Iteration 24: swpout inc: 132, swpout fallback inc: 101, Fallback percentage: 43.35%
> > Iteration 25: swpout inc: 133, swpout fallback inc: 82, Fallback percentage: 38.14%
> > Iteration 26: swpout inc: 152, swpout fallback inc: 78, Fallback percentage: 33.91%
> > Iteration 27: swpout inc: 138, swpout fallback inc: 81, Fallback percentage: 36.99%
> > Iteration 28: swpout inc: 152, swpout fallback inc: 74, Fallback percentage: 32.74%
> > Iteration 29: swpout inc: 153, swpout fallback inc: 75, Fallback percentage: 32.89%
> > Iteration 30: swpout inc: 151, swpout fallback inc: 74, Fallback percentage: 32.89%
> > ...
> >
> > Chris's approach appears to be more susceptible to negative effects from
> > small folios.
> >
> > 4. w/o -a and w/ -s
> > ./thp_swap_allocator_test -s
> > Iteration 1: swpout inc: 183, swpout fallback inc: 50, Fallback percentage: 21.46%
> > Iteration 2: swpout inc: 75, swpout fallback inc: 157, Fallback percentage: 67.67%
> > Iteration 3: swpout inc: 33, swpout fallback inc: 201, Fallback percentage: 85.90%
> > Iteration 4: swpout inc: 11, swpout fallback inc: 222, Fallback percentage: 95.28%
> > Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
> > Iteration 6: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
> > Iteration 7: swpout inc: 2, swpout fallback inc: 221, Fallback percentage: 99.10%
> > Iteration 8: swpout inc: 4, swpout fallback inc: 217, Fallback percentage: 98.19%
> > Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
> > Iteration 10: swpout inc: 3, swpout fallback inc: 227, Fallback percentage: 98.70%
> > Iteration 11: swpout inc: 1, swpout fallback inc: 217, Fallback percentage: 99.54%
> > Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
> > Iteration 13: swpout inc: 1, swpout fallback inc: 214, Fallback percentage: 99.53%
> > Iteration 14: swpout inc: 2, swpout fallback inc: 220, Fallback percentage: 99.10%
> > Iteration 15: swpout inc: 1, swpout fallback inc: 224, Fallback percentage: 99.56%
> > Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
> > ...
> >
> > Barry Song (1):
> >   tools/mm: Introduce a tool to assess swap entry allocation for
> >     thp_swapout
> >
> >  tools/mm/Makefile                  |   2 +-
> >  tools/mm/thp_swap_allocator_test.c | 233 +++++++++++++++++++++++++++++
> >  2 files changed, 234 insertions(+), 1 deletion(-)
> >  create mode 100644 tools/mm/thp_swap_allocator_test.c
> >
>

Thanks
Barry



* Re: [PATCH v2 0/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout
  2024-06-22  7:12 [PATCH v2 0/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout Barry Song
  2024-06-22  7:12 ` [PATCH v2 1/1] " Barry Song
  2024-06-24  8:26 ` [PATCH v2 0/1] " Ryan Roberts
@ 2024-06-24 10:06 ` Chris Li
  2 siblings, 0 replies; 16+ messages in thread
From: Chris Li @ 2024-06-24 10:06 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, linux-mm, ryan.roberts, david, hughd, kaleshsingh, kasong,
	linux-kernel, v-songbaohua, ying.huang

On Sat, Jun 22, 2024 at 12:12 AM Barry Song <21cnbao@gmail.com> wrote:
>
> From: Barry Song <v-songbaohua@oppo.com>
>
> -v2:
>  * add swap-in which can either be aligned or not aligned, by "-a";
>    Ying;
>  * move the program to tools/mm; Ryan;
>  * try to simulate the scenarios swap is full. Chris;
>
> -v1:
>  https://lore.kernel.org/linux-mm/20240620002648.75204-1-21cnbao@gmail.com/
>
> I tested Ryan's RFC patchset[1] and Chris's v3[2] using this v2 tool:
> [1] https://lore.kernel.org/linux-mm/20240618232648.4090299-1-ryan.roberts@arm.com/
> [2] https://lore.kernel.org/linux-mm/20240614-swap-allocator-v2-0-2a513b4a7f2f@kernel.org/
>
> Obviously, we're rarely hitting 100% even in the worst case without "-a" and with
> "-s", which is good news!
> If swapin is aligned w/ "-a" and w/o "-s", both Chris's and Ryan's patches show
> a low fallback ratio, though Chris's numbers are above 0% while Ryan's are 0%
> (value A).
>
> The bad news is that unaligned swapin can significantly increase the fallback ratio,
> reaching up to 85% for Ryan's patch and 95% for Chris's patchset without "-s". Both
> approaches come close to 100% without "-a" and with "-s" (value B).
>
> I believe real workloads should yield a value between A and B. Without "-a", and
> lacking large folio swap-in, this tool randomly swaps in small folios without
> considering spatial locality, which is a factor present in real workloads. This
> typically results in values higher than A and lower than B.
>
> Based on the results below, I believe that:
> 1. We truly require large folio swap-in to achieve comparable results with
> aligned swap-in (based on the results w/o and w/ "-a")
> 2. We need a method to prevent small folios from scattering indiscriminately
> (based on the result "-a -s")
>
> *
> *  Test results on Ryan's patchset:
> *
>
> 1. w/ -a
> ./thp_swap_allocator_test -a
> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 4: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
> ...
> Iteration 100: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
>
> 2. w/o -a
> ./thp_swap_allocator_test
>
> Iteration 1: swpout inc: 208, swpout fallback inc: 25, Fallback percentage: 10.73%
> Iteration 2: swpout inc: 118, swpout fallback inc: 114, Fallback percentage: 49.14%
> Iteration 3: swpout inc: 63, swpout fallback inc: 163, Fallback percentage: 72.12%
> Iteration 4: swpout inc: 45, swpout fallback inc: 178, Fallback percentage: 79.82%
> Iteration 5: swpout inc: 42, swpout fallback inc: 184, Fallback percentage: 81.42%
> Iteration 6: swpout inc: 31, swpout fallback inc: 193, Fallback percentage: 86.16%
> Iteration 7: swpout inc: 27, swpout fallback inc: 201, Fallback percentage: 88.16%
> Iteration 8: swpout inc: 30, swpout fallback inc: 198, Fallback percentage: 86.84%
> Iteration 9: swpout inc: 32, swpout fallback inc: 194, Fallback percentage: 85.84%
> ...
> Iteration 91: swpout inc: 26, swpout fallback inc: 194, Fallback percentage: 88.18%
> Iteration 92: swpout inc: 35, swpout fallback inc: 196, Fallback percentage: 84.85%
> Iteration 93: swpout inc: 33, swpout fallback inc: 191, Fallback percentage: 85.27%
> Iteration 94: swpout inc: 26, swpout fallback inc: 193, Fallback percentage: 88.13%
> Iteration 95: swpout inc: 39, swpout fallback inc: 189, Fallback percentage: 82.89%
> Iteration 96: swpout inc: 28, swpout fallback inc: 196, Fallback percentage: 87.50%
> Iteration 97: swpout inc: 25, swpout fallback inc: 194, Fallback percentage: 88.58%
> Iteration 98: swpout inc: 31, swpout fallback inc: 196, Fallback percentage: 86.34%
> Iteration 99: swpout inc: 32, swpout fallback inc: 202, Fallback percentage: 86.32%
> Iteration 100: swpout inc: 33, swpout fallback inc: 195, Fallback percentage: 85.53%
>
> 3. w/ -a and -s
> ./thp_swap_allocator_test -a -s
> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 2: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 3: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 4: swpout inc: 220, swpout fallback inc: 6, Fallback percentage: 2.65%
> Iteration 5: swpout inc: 206, swpout fallback inc: 16, Fallback percentage: 7.21%
> Iteration 6: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 7: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 8: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 9: swpout inc: 217, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 10: swpout inc: 224, swpout fallback inc: 3, Fallback percentage: 1.32%
> Iteration 11: swpout inc: 211, swpout fallback inc: 12, Fallback percentage: 5.38%
> Iteration 12: swpout inc: 200, swpout fallback inc: 32, Fallback percentage: 13.79%
> Iteration 13: swpout inc: 189, swpout fallback inc: 29, Fallback percentage: 13.30%
> Iteration 14: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
> Iteration 15: swpout inc: 198, swpout fallback inc: 27, Fallback percentage: 12.00%
> Iteration 16: swpout inc: 201, swpout fallback inc: 17, Fallback percentage: 7.80%
> Iteration 17: swpout inc: 206, swpout fallback inc: 6, Fallback percentage: 2.83%
> Iteration 18: swpout inc: 220, swpout fallback inc: 14, Fallback percentage: 5.98%
> Iteration 19: swpout inc: 181, swpout fallback inc: 45, Fallback percentage: 19.91%
> Iteration 20: swpout inc: 223, swpout fallback inc: 8, Fallback percentage: 3.46%
> Iteration 21: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
> Iteration 22: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
> Iteration 23: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 24: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 25: swpout inc: 214, swpout fallback inc: 1, Fallback percentage: 0.47%
> Iteration 26: swpout inc: 229, swpout fallback inc: 1, Fallback percentage: 0.43%
> Iteration 27: swpout inc: 214, swpout fallback inc: 5, Fallback percentage: 2.28%
> Iteration 28: swpout inc: 211, swpout fallback inc: 15, Fallback percentage: 6.64%
> Iteration 29: swpout inc: 188, swpout fallback inc: 40, Fallback percentage: 17.54%
> Iteration 30: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
> Iteration 31: swpout inc: 215, swpout fallback inc: 10, Fallback percentage: 4.44%
> Iteration 32: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
> Iteration 33: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 34: swpout inc: 218, swpout fallback inc: 10, Fallback percentage: 4.39%
> Iteration 35: swpout inc: 203, swpout fallback inc: 30, Fallback percentage: 12.88%
> Iteration 36: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
> Iteration 37: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
> Iteration 38: swpout inc: 193, swpout fallback inc: 28, Fallback percentage: 12.67%
> Iteration 39: swpout inc: 210, swpout fallback inc: 20, Fallback percentage: 8.70%
> Iteration 40: swpout inc: 223, swpout fallback inc: 5, Fallback percentage: 2.19%
> Iteration 41: swpout inc: 224, swpout fallback inc: 7, Fallback percentage: 3.03%
> Iteration 42: swpout inc: 200, swpout fallback inc: 23, Fallback percentage: 10.31%
> Iteration 43: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
> Iteration 44: swpout inc: 206, swpout fallback inc: 18, Fallback percentage: 8.04%
> Iteration 45: swpout inc: 210, swpout fallback inc: 11, Fallback percentage: 4.98%
> Iteration 46: swpout inc: 204, swpout fallback inc: 19, Fallback percentage: 8.52%
> Iteration 47: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 48: swpout inc: 219, swpout fallback inc: 2, Fallback percentage: 0.90%
> Iteration 49: swpout inc: 212, swpout fallback inc: 6, Fallback percentage: 2.75%
> Iteration 50: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
> Iteration 51: swpout inc: 190, swpout fallback inc: 36, Fallback percentage: 15.93%
> Iteration 52: swpout inc: 212, swpout fallback inc: 17, Fallback percentage: 7.42%
> Iteration 53: swpout inc: 179, swpout fallback inc: 43, Fallback percentage: 19.37%
> Iteration 54: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 55: swpout inc: 224, swpout fallback inc: 2, Fallback percentage: 0.88%
> Iteration 56: swpout inc: 220, swpout fallback inc: 8, Fallback percentage: 3.51%
> Iteration 57: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
> Iteration 58: swpout inc: 213, swpout fallback inc: 6, Fallback percentage: 2.74%
> Iteration 59: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
> Iteration 60: swpout inc: 216, swpout fallback inc: 14, Fallback percentage: 6.09%
> Iteration 61: swpout inc: 183, swpout fallback inc: 34, Fallback percentage: 15.67%
> Iteration 62: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
> Iteration 63: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
> Iteration 64: swpout inc: 210, swpout fallback inc: 15, Fallback percentage: 6.67%
> Iteration 65: swpout inc: 178, swpout fallback inc: 48, Fallback percentage: 21.24%
> Iteration 66: swpout inc: 188, swpout fallback inc: 30, Fallback percentage: 13.76%
> Iteration 67: swpout inc: 193, swpout fallback inc: 29, Fallback percentage: 13.06%
> Iteration 68: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
> Iteration 69: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
> Iteration 70: swpout inc: 204, swpout fallback inc: 15, Fallback percentage: 6.85%
> Iteration 71: swpout inc: 180, swpout fallback inc: 45, Fallback percentage: 20.00%
> Iteration 72: swpout inc: 210, swpout fallback inc: 21, Fallback percentage: 9.09%
> Iteration 73: swpout inc: 216, swpout fallback inc: 7, Fallback percentage: 3.14%
> Iteration 74: swpout inc: 209, swpout fallback inc: 19, Fallback percentage: 8.33%
> Iteration 75: swpout inc: 222, swpout fallback inc: 7, Fallback percentage: 3.06%
> Iteration 76: swpout inc: 212, swpout fallback inc: 14, Fallback percentage: 6.19%
> Iteration 77: swpout inc: 188, swpout fallback inc: 41, Fallback percentage: 17.90%
> Iteration 78: swpout inc: 198, swpout fallback inc: 17, Fallback percentage: 7.91%
> Iteration 79: swpout inc: 209, swpout fallback inc: 16, Fallback percentage: 7.11%
> Iteration 80: swpout inc: 182, swpout fallback inc: 41, Fallback percentage: 18.39%
> Iteration 81: swpout inc: 217, swpout fallback inc: 1, Fallback percentage: 0.46%
> Iteration 82: swpout inc: 225, swpout fallback inc: 3, Fallback percentage: 1.32%
> Iteration 83: swpout inc: 222, swpout fallback inc: 8, Fallback percentage: 3.48%
> Iteration 84: swpout inc: 201, swpout fallback inc: 21, Fallback percentage: 9.46%
> Iteration 85: swpout inc: 211, swpout fallback inc: 3, Fallback percentage: 1.40%
> Iteration 86: swpout inc: 209, swpout fallback inc: 14, Fallback percentage: 6.28%
> Iteration 87: swpout inc: 181, swpout fallback inc: 42, Fallback percentage: 18.83%
> Iteration 88: swpout inc: 223, swpout fallback inc: 4, Fallback percentage: 1.76%
> Iteration 89: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
> Iteration 90: swpout inc: 192, swpout fallback inc: 33, Fallback percentage: 14.67%
> Iteration 91: swpout inc: 184, swpout fallback inc: 31, Fallback percentage: 14.42%
> Iteration 92: swpout inc: 201, swpout fallback inc: 32, Fallback percentage: 13.73%
> Iteration 93: swpout inc: 181, swpout fallback inc: 40, Fallback percentage: 18.10%
> Iteration 94: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
> Iteration 95: swpout inc: 198, swpout fallback inc: 25, Fallback percentage: 11.21%
> Iteration 96: swpout inc: 205, swpout fallback inc: 22, Fallback percentage: 9.69%
> Iteration 97: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
> Iteration 98: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
> Iteration 99: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
> Iteration 100: swpout inc: 195, swpout fallback inc: 34, Fallback percentage: 14.85%
>
> 4. w/o -a and w/ -s
> ./thp_swap_allocator_test -s
> Iteration 1: swpout inc: 173, swpout fallback inc: 60, Fallback percentage: 25.75%
> Iteration 2: swpout inc: 85, swpout fallback inc: 147, Fallback percentage: 63.36%
> Iteration 3: swpout inc: 39, swpout fallback inc: 195, Fallback percentage: 83.33%
> Iteration 4: swpout inc: 13, swpout fallback inc: 220, Fallback percentage: 94.42%
> Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
> Iteration 6: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
> Iteration 7: swpout inc: 6, swpout fallback inc: 217, Fallback percentage: 97.31%
> Iteration 8: swpout inc: 6, swpout fallback inc: 215, Fallback percentage: 97.29%
> Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
> Iteration 10: swpout inc: 1, swpout fallback inc: 229, Fallback percentage: 99.57%
> Iteration 11: swpout inc: 2, swpout fallback inc: 216, Fallback percentage: 99.08%
> Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
> Iteration 13: swpout inc: 4, swpout fallback inc: 211, Fallback percentage: 98.14%
> Iteration 14: swpout inc: 1, swpout fallback inc: 221, Fallback percentage: 99.55%
> Iteration 15: swpout inc: 2, swpout fallback inc: 223, Fallback percentage: 99.11%
> Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
> Iteration 17: swpout inc: 2, swpout fallback inc: 231, Fallback percentage: 99.14%
> ...
>
> *
> *  Test results on Chris's v3 patchset:
> *


> 1. w/ -a
> ./thp_swap_allocator_test -a
> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 4: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
> Iteration 5: swpout inc: 215, swpout fallback inc: 12, Fallback percentage: 5.29%
> Iteration 6: swpout inc: 213, swpout fallback inc: 14, Fallback percentage: 6.17%
> Iteration 7: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
> Iteration 8: swpout inc: 193, swpout fallback inc: 33, Fallback percentage: 14.60%
> Iteration 9: swpout inc: 214, swpout fallback inc: 13, Fallback percentage: 5.73%
> Iteration 10: swpout inc: 199, swpout fallback inc: 25, Fallback percentage: 11.16%
> Iteration 11: swpout inc: 208, swpout fallback inc: 14, Fallback percentage: 6.31%
> Iteration 12: swpout inc: 203, swpout fallback inc: 31, Fallback percentage: 13.25%
> Iteration 13: swpout inc: 192, swpout fallback inc: 25, Fallback percentage: 11.52%
> Iteration 14: swpout inc: 193, swpout fallback inc: 36, Fallback percentage: 15.72%
> Iteration 15: swpout inc: 188, swpout fallback inc: 33, Fallback percentage: 14.93%

My work-in-progress allocator patch:
linux# ../thp_swap_allocator_test -a
Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 4: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 5: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 6: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 7: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 8: swpout inc: 226, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 9: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 10: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 11: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 12: swpout inc: 234, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 13: swpout inc: 217, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 14: swpout inc: 229, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 15: swpout inc: 221, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 16: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 17: swpout inc: 232, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 18: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 19: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 20: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 21: swpout inc: 219, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 22: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 23: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 24: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 25: swpout inc: 212, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 26: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 27: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 28: swpout inc: 226, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 29: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
...
(Zero fallback in the iterations in between.)
...
Iteration 78: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 79: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 80: swpout inc: 230, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 81: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 82: swpout inc: 221, swpout fallback inc: 5, Fallback percentage: 2.21%
Iteration 83: swpout inc: 221, swpout fallback inc: 2, Fallback percentage: 0.90%
Iteration 84: swpout inc: 230, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 85: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 86: swpout inc: 219, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 87: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 88: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 89: swpout inc: 232, swpout fallback inc: 1, Fallback percentage: 0.43%
Iteration 90: swpout inc: 222, swpout fallback inc: 5, Fallback percentage: 2.20%
Iteration 91: swpout inc: 217, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 92: swpout inc: 229, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 93: swpout inc: 220, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 94: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 95: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 96: swpout inc: 226, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 97: swpout inc: 226, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 98: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 99: swpout inc: 215, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 100: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%

Occasionally I get a full 0.00% run, but the above is more typical.
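
For anyone post-processing these logs: the percentage printed per iteration is consistent with fallback / (swpout + fallback). Below is a minimal Python sketch, assuming only the line format shown above. The regex and the name fallback_percentages() are illustrative and not part of the tool. It only reads the two counters, so it also copes with lines that a mail client wrapped after "Fallback".

```python
import re

# Matches the two counters the tool prints for every iteration.
LINE_RE = re.compile(r"swpout inc: (\d+), swpout fallback inc: (\d+)")


def fallback_percentages(log_text):
    """Return one fallback percentage per iteration found in log_text."""
    result = []
    for swpout, fallback in LINE_RE.findall(log_text):
        swpout, fallback = int(swpout), int(fallback)
        total = swpout + fallback
        # Guard against an iteration in which nothing was swapped out.
        result.append(100.0 * fallback / total if total else 0.0)
    return result


sample = """\
Iteration 4: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
Iteration 82: swpout inc: 221, swpout fallback inc: 5, Fallback
percentage: 2.21%
"""

print([f"{p:.2f}" for p in fallback_percentages(sample)])  # ['2.25', '2.21']
```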


> ...
>
> It seems Chris's approach can be negatively affected even by aligned swapin,
> having a low fallback ratio but not 0% while Ryan's patchset hasn't this
> issue.
>
> 2. w/o -a
> ./thp_swap_allocator_test

My WIP patch:
linux# ../thp_swap_allocator_test
Iteration 1: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 2: swpout inc: 146, swpout fallback inc: 86, Fallback percentage: 37.07%
Iteration 3: swpout inc: 75, swpout fallback inc: 151, Fallback percentage: 66.81%
Iteration 4: swpout inc: 52, swpout fallback inc: 171, Fallback percentage: 76.68%
Iteration 5: swpout inc: 47, swpout fallback inc: 179, Fallback percentage: 79.20%
Iteration 6: swpout inc: 40, swpout fallback inc: 184, Fallback percentage: 82.14%
Iteration 7: swpout inc: 40, swpout fallback inc: 188, Fallback percentage: 82.46%
Iteration 8: swpout inc: 39, swpout fallback inc: 189, Fallback percentage: 82.89%
Iteration 9: swpout inc: 38, swpout fallback inc: 188, Fallback percentage: 83.19%
Iteration 10: swpout inc: 41, swpout fallback inc: 187, Fallback percentage: 82.02%
Iteration 11: swpout inc: 32, swpout fallback inc: 197, Fallback percentage: 86.03%
Iteration 12: swpout inc: 31, swpout fallback inc: 189, Fallback percentage: 85.91%
Iteration 13: swpout inc: 31, swpout fallback inc: 197, Fallback percentage: 86.40%
Iteration 14: swpout inc: 29, swpout fallback inc: 195, Fallback percentage: 87.05%
Iteration 15: swpout inc: 32, swpout fallback inc: 194, Fallback percentage: 85.84%
Iteration 16: swpout inc: 25, swpout fallback inc: 190, Fallback percentage: 88.37%
Iteration 17: swpout inc: 31, swpout fallback inc: 198, Fallback percentage: 86.46%
Iteration 18: swpout inc: 27, swpout fallback inc: 200, Fallback percentage: 88.11%
Iteration 19: swpout inc: 29, swpout fallback inc: 189, Fallback percentage: 86.70%
Iteration 20: swpout inc: 26, swpout fallback inc: 194, Fallback percentage: 88.18%
Iteration 21: swpout inc: 24, swpout fallback inc: 201, Fallback percentage: 89.33%
Iteration 22: swpout inc: 28, swpout fallback inc: 202, Fallback percentage: 87.83%
Iteration 23: swpout inc: 31, swpout fallback inc: 198, Fallback percentage: 86.46%
Iteration 24: swpout inc: 32, swpout fallback inc: 194, Fallback percentage: 85.84%
Iteration 25: swpout inc: 30, swpout fallback inc: 199, Fallback percentage: 86.90%
Iteration 26: swpout inc: 38, swpout fallback inc: 193, Fallback percentage: 83.55%
Iteration 27: swpout inc: 28, swpout fallback inc: 195, Fallback percentage: 87.44%
Iteration 28: swpout inc: 29, swpout fallback inc: 195, Fallback percentage: 87.05%
Iteration 29: swpout inc: 34, swpout fallback inc: 191, Fallback percentage: 84.89%
Iteration 30: swpout inc: 28, swpout fallback inc: 195, Fallback percentage: 87.44%
Iteration 31: swpout inc: 36, swpout fallback inc: 184, Fallback percentage: 83.64%
Iteration 32: swpout inc: 38, swpout fallback inc: 187, Fallback percentage: 83.11%
Iteration 33: swpout inc: 37, swpout fallback inc: 192, Fallback percentage: 83.84%
Iteration 34: swpout inc: 39, swpout fallback inc: 191, Fallback percentage: 83.04%
Iteration 35: swpout inc: 30, swpout fallback inc: 197, Fallback percentage: 86.78%
Iteration 36: swpout inc: 34, swpout fallback inc: 195, Fallback percentage: 85.15%
Iteration 37: swpout inc: 35, swpout fallback inc: 182, Fallback percentage: 83.87%
Iteration 38: swpout inc: 29, swpout fallback inc: 196, Fallback percentage: 87.11%
Iteration 39: swpout inc: 33, swpout fallback inc: 190, Fallback percentage: 85.20%
Iteration 40: swpout inc: 33, swpout fallback inc: 184, Fallback percentage: 84.79%
Iteration 41: swpout inc: 30, swpout fallback inc: 188, Fallback percentage: 86.24%
Iteration 42: swpout inc: 35, swpout fallback inc: 190, Fallback percentage: 84.44%
Iteration 43: swpout inc: 30, swpout fallback inc: 193, Fallback percentage: 86.55%

> Iteration 1: swpout inc: 209, swpout fallback inc: 24, Fallback percentage: 10.30%
> Iteration 2: swpout inc: 100, swpout fallback inc: 132, Fallback percentage: 56.90%
> Iteration 3: swpout inc: 43, swpout fallback inc: 183, Fallback percentage: 80.97%
> Iteration 4: swpout inc: 30, swpout fallback inc: 193, Fallback percentage: 86.55%
> Iteration 5: swpout inc: 21, swpout fallback inc: 205, Fallback percentage: 90.71%
> Iteration 6: swpout inc: 10, swpout fallback inc: 214, Fallback percentage: 95.54%
> Iteration 7: swpout inc: 16, swpout fallback inc: 212, Fallback percentage: 92.98%
> Iteration 8: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
> Iteration 9: swpout inc: 6, swpout fallback inc: 220, Fallback percentage: 97.35%
> Iteration 10: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
> Iteration 11: swpout inc: 7, swpout fallback inc: 222, Fallback percentage: 96.94%
> Iteration 12: swpout inc: 8, swpout fallback inc: 212, Fallback percentage: 96.36%
> ..
>
> Ryan's fallback ratio(around 85%) seems to be a little better while both are much
> worse than "-a".
>
> 3. w/ -a and -s
> ./thp_swap_allocator_test -a -s

My WIP patch:
linux# ../thp_swap_allocator_test -a -s
Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 2: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 3: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 4: swpout inc: 226, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 5: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 6: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 7: swpout inc: 220, swpout fallback inc: 4, Fallback percentage: 1.79%
Iteration 8: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 9: swpout inc: 217, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 10: swpout inc: 226, swpout fallback inc: 1, Fallback percentage: 0.44%
Iteration 11: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 12: swpout inc: 232, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 13: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 14: swpout inc: 221, swpout fallback inc: 5, Fallback percentage: 2.21%
Iteration 15: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 16: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 17: swpout inc: 209, swpout fallback inc: 3, Fallback percentage: 1.42%
Iteration 18: swpout inc: 233, swpout fallback inc: 1, Fallback percentage: 0.43%
Iteration 19: swpout inc: 219, swpout fallback inc: 7, Fallback percentage: 3.10%
Iteration 20: swpout inc: 225, swpout fallback inc: 6, Fallback percentage: 2.60%
Iteration 21: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 22: swpout inc: 226, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 23: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 24: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 25: swpout inc: 215, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 26: swpout inc: 230, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 27: swpout inc: 219, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 28: swpout inc: 224, swpout fallback inc: 2, Fallback percentage: 0.88%
Iteration 29: swpout inc: 225, swpout fallback inc: 3, Fallback percentage: 1.32%
Iteration 30: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 31: swpout inc: 224, swpout fallback inc: 1, Fallback percentage: 0.44%
Iteration 32: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 33: swpout inc: 221, swpout fallback inc: 2, Fallback percentage: 0.90%
Iteration 34: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 35: swpout inc: 229, swpout fallback inc: 4, Fallback percentage: 1.72%
Iteration 36: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 37: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 38: swpout inc: 220, swpout fallback inc: 1, Fallback percentage: 0.45%
Iteration 39: swpout inc: 227, swpout fallback inc: 3, Fallback percentage: 1.30%
Iteration 40: swpout inc: 223, swpout fallback inc: 5, Fallback percentage: 2.19%
Iteration 41: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 42: swpout inc: 221, swpout fallback inc: 2, Fallback percentage: 0.90%
Iteration 43: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 44: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 45: swpout inc: 219, swpout fallback inc: 2, Fallback percentage: 0.90%
Iteration 46: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 47: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 48: swpout inc: 220, swpout fallback inc: 1, Fallback percentage: 0.45%
Iteration 49: swpout inc: 216, swpout fallback inc: 2, Fallback percentage: 0.92%
Iteration 50: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 51: swpout inc: 226, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 52: swpout inc: 229, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 53: swpout inc: 220, swpout fallback inc: 2, Fallback percentage: 0.90%
Iteration 54: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 55: swpout inc: 226, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 56: swpout inc: 223, swpout fallback inc: 5, Fallback percentage: 2.19%
Iteration 57: swpout inc: 226, swpout fallback inc: 2, Fallback percentage: 0.88%
Iteration 58: swpout inc: 215, swpout fallback inc: 4, Fallback percentage: 1.83%
Iteration 59: swpout inc: 222, swpout fallback inc: 3, Fallback percentage: 1.33%
Iteration 60: swpout inc: 227, swpout fallback inc: 3, Fallback percentage: 1.30%
Iteration 61: swpout inc: 215, swpout fallback inc: 2, Fallback percentage: 0.92%
Iteration 62: swpout inc: 214, swpout fallback inc: 9, Fallback percentage: 4.04%
Iteration 63: swpout inc: 220, swpout fallback inc: 3, Fallback percentage: 1.35%
Iteration 64: swpout inc: 220, swpout fallback inc: 5, Fallback percentage: 2.22%
Iteration 65: swpout inc: 216, swpout fallback inc: 10, Fallback percentage: 4.42%
Iteration 66: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
Iteration 67: swpout inc: 212, swpout fallback inc: 10, Fallback percentage: 4.50%
Iteration 68: swpout inc: 216, swpout fallback inc: 8, Fallback percentage: 3.57%
Iteration 69: swpout inc: 214, swpout fallback inc: 4, Fallback percentage: 1.83%
Iteration 70: swpout inc: 209, swpout fallback inc: 10, Fallback percentage: 4.57%
Iteration 71: swpout inc: 217, swpout fallback inc: 8, Fallback percentage: 3.56%
Iteration 72: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 73: swpout inc: 215, swpout fallback inc: 8, Fallback percentage: 3.59%
Iteration 74: swpout inc: 221, swpout fallback inc: 7, Fallback percentage: 3.07%
Iteration 75: swpout inc: 221, swpout fallback inc: 8, Fallback percentage: 3.49%
Iteration 76: swpout inc: 219, swpout fallback inc: 7, Fallback percentage: 3.10%
Iteration 77: swpout inc: 221, swpout fallback inc: 8, Fallback percentage: 3.49%
Iteration 78: swpout inc: 214, swpout fallback inc: 1, Fallback percentage: 0.47%
Iteration 79: swpout inc: 223, swpout fallback inc: 2, Fallback percentage: 0.89%
Iteration 80: swpout inc: 220, swpout fallback inc: 3, Fallback percentage: 1.35%
Iteration 81: swpout inc: 216, swpout fallback inc: 2, Fallback percentage: 0.92%
Iteration 82: swpout inc: 225, swpout fallback inc: 3, Fallback percentage: 1.32%
Iteration 83: swpout inc: 228, swpout fallback inc: 2, Fallback percentage: 0.87%
Iteration 84: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 85: swpout inc: 214, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 86: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 87: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 88: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 89: swpout inc: 223, swpout fallback inc: 5, Fallback percentage: 2.19%
Iteration 90: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 91: swpout inc: 215, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 92: swpout inc: 228, swpout fallback inc: 5, Fallback percentage: 2.15%
Iteration 93: swpout inc: 219, swpout fallback inc: 2, Fallback percentage: 0.90%
Iteration 94: swpout inc: 221, swpout fallback inc: 4, Fallback percentage: 1.78%
Iteration 95: swpout inc: 220, swpout fallback inc: 3, Fallback percentage: 1.35%
Iteration 96: swpout inc: 211, swpout fallback inc: 16, Fallback percentage: 7.05%
Iteration 97: swpout inc: 224, swpout fallback inc: 6, Fallback percentage: 2.61%
Iteration 98: swpout inc: 223, swpout fallback inc: 5, Fallback percentage: 2.19%
Iteration 99: swpout inc: 226, swpout fallback inc: 4, Fallback percentage: 1.74%
Iteration 100: swpout inc: 228, swpout fallback inc: 1, Fallback percentage: 0.44%



> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> Iteration 2: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
> Iteration 3: swpout inc: 215, swpout fallback inc: 7, Fallback percentage: 3.15%
> Iteration 4: swpout inc: 210, swpout fallback inc: 16, Fallback percentage: 7.08%
> Iteration 5: swpout inc: 212, swpout fallback inc: 10, Fallback percentage: 4.50%
> Iteration 6: swpout inc: 215, swpout fallback inc: 18, Fallback percentage: 7.73%
> Iteration 7: swpout inc: 181, swpout fallback inc: 43, Fallback percentage: 19.20%
> Iteration 8: swpout inc: 173, swpout fallback inc: 55, Fallback percentage: 24.12%
> Iteration 9: swpout inc: 163, swpout fallback inc: 54, Fallback percentage: 24.88%
> Iteration 10: swpout inc: 168, swpout fallback inc: 59, Fallback percentage: 25.99%
> Iteration 11: swpout inc: 154, swpout fallback inc: 69, Fallback percentage: 30.94%
> Iteration 12: swpout inc: 166, swpout fallback inc: 66, Fallback percentage: 28.45%
> Iteration 13: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
> Iteration 14: swpout inc: 158, swpout fallback inc: 68, Fallback percentage: 30.09%
> Iteration 15: swpout inc: 168, swpout fallback inc: 57, Fallback percentage: 25.33%
> Iteration 16: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
> Iteration 17: swpout inc: 163, swpout fallback inc: 49, Fallback percentage: 23.11%
> Iteration 18: swpout inc: 172, swpout fallback inc: 62, Fallback percentage: 26.50%
> Iteration 19: swpout inc: 183, swpout fallback inc: 43, Fallback percentage: 19.03%
> Iteration 20: swpout inc: 158, swpout fallback inc: 73, Fallback percentage: 31.60%
> Iteration 21: swpout inc: 147, swpout fallback inc: 81, Fallback percentage: 35.53%
> Iteration 22: swpout inc: 140, swpout fallback inc: 86, Fallback percentage: 38.05%
> Iteration 23: swpout inc: 144, swpout fallback inc: 79, Fallback percentage: 35.43%
> Iteration 24: swpout inc: 132, swpout fallback inc: 101, Fallback percentage: 43.35%
> Iteration 25: swpout inc: 133, swpout fallback inc: 82, Fallback percentage: 38.14%
> Iteration 26: swpout inc: 152, swpout fallback inc: 78, Fallback percentage: 33.91%
> Iteration 27: swpout inc: 138, swpout fallback inc: 81, Fallback percentage: 36.99%
> Iteration 28: swpout inc: 152, swpout fallback inc: 74, Fallback percentage: 32.74%
> Iteration 29: swpout inc: 153, swpout fallback inc: 75, Fallback percentage: 32.89%
> Iteration 30: swpout inc: 151, swpout fallback inc: 74, Fallback percentage: 32.89%
> ...
>
> Chris's approach appears to be more susceptible to negative effects from
> small folios.
>
> 4. w/o -a and w/ -s

My current WIP patch:
linux# ../thp_swap_allocator_test -s
Iteration 1: swpout inc: 227, swpout fallback inc: 6, Fallback percentage: 2.58%
Iteration 2: swpout inc: 91, swpout fallback inc: 141, Fallback percentage: 60.78%
Iteration 3: swpout inc: 36, swpout fallback inc: 198, Fallback percentage: 84.62%
Iteration 4: swpout inc: 19, swpout fallback inc: 214, Fallback percentage: 91.85%
Iteration 5: swpout inc: 12, swpout fallback inc: 213, Fallback percentage: 94.67%
Iteration 6: swpout inc: 11, swpout fallback inc: 217, Fallback percentage: 95.18%
Iteration 7: swpout inc: 8, swpout fallback inc: 215, Fallback percentage: 96.41%
Iteration 8: swpout inc: 8, swpout fallback inc: 213, Fallback percentage: 96.38%
Iteration 9: swpout inc: 2, swpout fallback inc: 223, Fallback percentage: 99.11%
Iteration 10: swpout inc: 7, swpout fallback inc: 223, Fallback percentage: 96.96%
Iteration 11: swpout inc: 5, swpout fallback inc: 213, Fallback percentage: 97.71%
Iteration 12: swpout inc: 8, swpout fallback inc: 223, Fallback percentage: 96.54%
Iteration 13: swpout inc: 4, swpout fallback inc: 211, Fallback percentage: 98.14%
Iteration 14: swpout inc: 6, swpout fallback inc: 216, Fallback percentage: 97.30%
Iteration 15: swpout inc: 9, swpout fallback inc: 216, Fallback percentage: 96.00%
Iteration 16: swpout inc: 9, swpout fallback inc: 218, Fallback percentage: 96.04%
Iteration 17: swpout inc: 10, swpout fallback inc: 223, Fallback percentage: 95.71%
Iteration 18: swpout inc: 7, swpout fallback inc: 211, Fallback percentage: 96.79%
Iteration 19: swpout inc: 7, swpout fallback inc: 220, Fallback percentage: 96.92%
Iteration 20: swpout inc: 7, swpout fallback inc: 220, Fallback percentage: 96.92%
Iteration 21: swpout inc: 6, swpout fallback inc: 221, Fallback percentage: 97.36%
Iteration 22: swpout inc: 5, swpout fallback inc: 227, Fallback percentage: 97.84%
Iteration 23: swpout inc: 4, swpout fallback inc: 213, Fallback percentage: 98.16%
Iteration 24: swpout inc: 6, swpout fallback inc: 218, Fallback percentage: 97.32%
Iteration 25: swpout inc: 6, swpout fallback inc: 214, Fallback percentage: 97.27%
Iteration 26: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
Iteration 27: swpout inc: 7, swpout fallback inc: 216, Fallback percentage: 96.86%
Iteration 28: swpout inc: 7, swpout fallback inc: 214, Fallback percentage: 96.83%

Chris

> ./thp_swap_allocator_test -s
> Iteration 1: swpout inc: 183, swpout fallback inc: 50, Fallback percentage: 21.46%
> Iteration 2: swpout inc: 75, swpout fallback inc: 157, Fallback percentage: 67.67%
> Iteration 3: swpout inc: 33, swpout fallback inc: 201, Fallback percentage: 85.90%
> Iteration 4: swpout inc: 11, swpout fallback inc: 222, Fallback percentage: 95.28%
> Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
> Iteration 6: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
> Iteration 7: swpout inc: 2, swpout fallback inc: 221, Fallback percentage: 99.10%
> Iteration 8: swpout inc: 4, swpout fallback inc: 217, Fallback percentage: 98.19%
> Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
> Iteration 10: swpout inc: 3, swpout fallback inc: 227, Fallback percentage: 98.70%
> Iteration 11: swpout inc: 1, swpout fallback inc: 217, Fallback percentage: 99.54%
> Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
> Iteration 13: swpout inc: 1, swpout fallback inc: 214, Fallback percentage: 99.53%
> Iteration 14: swpout inc: 2, swpout fallback inc: 220, Fallback percentage: 99.10%
> Iteration 15: swpout inc: 1, swpout fallback inc: 224, Fallback percentage: 99.56%
> Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
> ...
>
> Barry Song (1):
>   tools/mm: Introduce a tool to assess swap entry allocation for
>     thp_swapout
>
>  tools/mm/Makefile                  |   2 +-
>  tools/mm/thp_swap_allocator_test.c | 233 +++++++++++++++++++++++++++++
>  2 files changed, 234 insertions(+), 1 deletion(-)
>  create mode 100644 tools/mm/thp_swap_allocator_test.c
>
> --
> 2.34.1
>
>



* Re: [PATCH v2 0/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout
  2024-06-24  8:42   ` Barry Song
@ 2024-06-24 10:35     ` Ryan Roberts
  2024-06-25  0:11       ` Barry Song
  0 siblings, 1 reply; 16+ messages in thread
From: Ryan Roberts @ 2024-06-24 10:35 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, chrisl, linux-mm, david, hughd, kaleshsingh, kasong,
	linux-kernel, v-songbaohua, ying.huang

On 24/06/2024 09:42, Barry Song wrote:
> On Mon, Jun 24, 2024 at 8:26 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> On 22/06/2024 08:12, Barry Song wrote:
>>> From: Barry Song <v-songbaohua@oppo.com>
>>>
>>> -v2:
>>>  * add swap-in which can either be aligned or not aligned, by "-a";
>>>    Ying;
>>>  * move the program to tools/mm; Ryan;
>>>  * try to simulate the scenarios swap is full. Chris;
>>>
>>> -v1:
>>>  https://lore.kernel.org/linux-mm/20240620002648.75204-1-21cnbao@gmail.com/
>>>
>>> I tested Ryan's RFC patchset[1] and Chris's v3[2] using this v2 tool:
>>> [1] https://lore.kernel.org/linux-mm/20240618232648.4090299-1-ryan.roberts@arm.com/
>>> [2] https://lore.kernel.org/linux-mm/20240614-swap-allocator-v2-0-2a513b4a7f2f@kernel.org/
>>>
>>> Obviously, we're rarely hitting 100% even in the worst case without "-a" and with
>>> "-s," which is good news!
>>> If swapin is aligned w/ "-a" and w/o "-s", both Chris's and Ryan's patches show
>>> a low fallback ratio though Chris's has the numbers above 0% but Ryan's are 0%
>>> (value A).
>>>
>>> The bad news is that unaligned swapin can significantly increase the fallback ratio,
>>> reaching up to 85% for Ryan's patch and 95% for Chris's patchset without "-s." Both
>>> approaches approach 100% without "-a" and with "-s" (value B).
>>>
>>> I believe real workloads should yield a value between A and B. Without "-a," and
>>> lacking large folios swap-in, this tool randomly swaps in small folios without
>>> considering spatial locality, which is a factor present in real workloads. This
>>> typically results in values higher than A and lower than B.
>>>
>>> Based on the below results, I believe that:
>>
>> Thanks for putting this together and providing such detailed results!
>>
>>> 1. We truly require large folio swap-in to achieve comparable results with
>>> aligned swap-in(based on the result w/o and w/ "-a")
>>
>> I certainly agree that as long as we require a high order swap entry to be
>> contiguous in the backing store then it looks like we are going to need large
>> folio swap-in to prevent enormous fragmentation. I guess Chris's proposed layer
>> of indirection to allow pages to be scattered in the backing store would also
>> solve the problem? Although, I'm not sure this would work well for zRam?
> 
> The challenge is that we also want to take advantage of improving zsmalloc
> to save compressed multi-pages. However, it seems nearly impossible for
> zsmalloc to achieve this if an mTHP is scattered rather than kept together
> in zRAM.

Yes, understood. I finally got around to watching the lsfmm videos; I believe the
suggested solution with a fs-like approach would be to let the fs handle the
compression, which means compressing extents? So even with that approach,
presumably it's still valuable to be able to allocate the biggest extents possible.

> 
>>
>> Perhaps another way of looking at this is that we are doing a bad job of
>> selecting when to use an mTHP and when not to use one in the first place;
>> ideally the workload would access the data across the entire mTHP with high
>> temporal locality? In that case, we would expect the whole mTHP to be swapped in
>> even with the current page-by-page approach. Figuring out this "auto sizing"
>> seems like an incredibly complex problem to solve though.
> 
> The good news is that this is exactly what we're implementing in our products,
> and it has been deployed on millions of phones.
> 
>   *  Allocate mTHP and swap in the entire mTHP  in do_swap_page();
>   *  If mTHP allocation fails, allocate 16 pages to swap-in in do_swap_page();

I think we were talking at cross-purposes here. What I meant was that in an ideal
world we would only allocate a (64K) mTHP for a page fault if we had confidence
(via some heuristic) that the virtual 64K area was likely to always be accessed
together, else just allocate a small folio. i.e. choose the folio size to cover
a single object from user space's PoV. That would have the side effect that a
page-by-page swap-in approach (the current approach in mainline) would still
effectively result in swapping in the whole folio and therefore reduce
fragmentation in the swap file. (Or thinking about it slightly differently, it
would give us confidence to always swap-in a large folio at a time, because we
know it's all highly likely to get used in the near future).
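
To illustrate the fragmentation argument, here is a toy model in Python. It is not the test tool's logic and not how either allocator works. It assumes a 64M swap area of 4K slots that is completely filled by 64K (16-slot) mTHP entries, then frees the same number of slots either as whole aligned 16-slot groups or as random single slots, and counts how many aligned 16-slot groups end up entirely free. NR_SLOTS, FOLIO_SLOTS, simulate() and free_aligned_groups() are names made up for this sketch.

```python
import random

NR_SLOTS = 16384    # assumption: 64M of swap in 4K slots
FOLIO_SLOTS = 16    # a 64K mTHP covers 16 slots


def free_aligned_groups(freed):
    """Count aligned 16-slot groups in which every slot is free."""
    count = 0
    for base in range(0, NR_SLOTS, FOLIO_SLOTS):
        if all(slot in freed for slot in range(base, base + FOLIO_SLOTS)):
            count += 1
    return count


def simulate(aligned, nr_freed=4096, seed=0):
    """Free nr_freed slots and return the number of reusable 64K groups."""
    rng = random.Random(seed)
    if aligned:
        # Whole-folio swap-in: every freed slot belongs to a fully freed group.
        groups = rng.sample(range(NR_SLOTS // FOLIO_SLOTS),
                            nr_freed // FOLIO_SLOTS)
        freed = {g * FOLIO_SLOTS + i for g in groups for i in range(FOLIO_SLOTS)}
    else:
        # Page-by-page swap-in with no locality: freed slots are scattered.
        freed = set(rng.sample(range(NR_SLOTS), nr_freed))
    return free_aligned_groups(freed)


print("aligned:", simulate(aligned=True))
print("random: ", simulate(aligned=False))
```

In this model whole-folio swap-in leaves every freed slot usable for a new 64K entry, while random page-by-page swap-in leaves few, if any, fully free groups. That is the same direction as the w/ and w/o "-a" results, although real workloads have locality that this sketch ignores.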

I suspect this is a moot point though, because divining a suitable heuristic
with low overhead is basically impossible.

> 
> To be honest, we haven't noticed a visible increase in memory footprint. This is
> likely because Android's anonymous memory exhibits good spatial locality, and
> 64KiB strikes a good balance—neither too large nor too small.

Indeed.

> 
> The bad news is that I haven't found a way to convince the community this
> is universally correct.

I think we will want to be pragmatic and at least implement an option (sysfs?)
to swap-in a large folio up to a certain size; these test results clearly show
the value. And I believe you have real-world data for Android that shows the
same thing.

Just to creep the scope of this thread slightly, after watching your and Yu
Zhao's presentations around TAO, IIUC, even with TAO enabled, 64K folio
allocation fallback is still above 50%? I still believe that once the Android
filesystems are converted to use large folios that number will improve
substantially; especially if the page cache can be convinced to only allocate
64K folios (with 4K fallback). At that point you're predominantly using 64K
folios so intuitively there will be less fragmentation.

But allocations by the page cache today start at 16K and increment by 2 orders
for every new readahead IIRC. So we end up with lots of large folio sizes, and
presumably the potential for lots of fallbacks.
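
As a rough illustration of that progression (this is from memory, so treat the starting order, the step and the cap as assumptions rather than the exact readahead code): with 4K base pages, starting at order 2 and adding 2 orders per readahead round gives the sizes below. max_order=9 is a hypothetical cap chosen only for the sketch.

```python
PAGE_SIZE_KIB = 4  # assumption: 4K base pages


def readahead_folio_sizes(start_order=2, step=2, max_order=9, rounds=5):
    """Return the folio size in KiB used in each successive readahead round."""
    sizes = []
    order = start_order
    for _ in range(rounds):
        sizes.append(PAGE_SIZE_KIB << order)
        # Grow by `step` orders per round, but never beyond the cap.
        order = min(order + step, max_order)
    return sizes


print(readahead_folio_sizes())  # [16, 64, 256, 1024, 2048]
```

So after only a few rounds the page cache is asking for several different folio sizes, which is where the potential for mixed-size fallbacks comes from.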

All of this is just to suggest that we may end up wanting controls to specify
which folio sizes the page cache can attempt to use. At that point, adding
similar controls for swap-in doesn't feel unreasonable to me.

Just my 2 cents!

> 
>>
>>> 2. We need a method to prevent small folios from scattering indiscriminately
>>> (based on the result "-a -s")
>>
>> I'm confused by this statement; as I understand it, both my and Chris's patches
>> already try to do this. Certainly for mine, when searching for order-0 space, I
>> search the non-full order-0 clusters first (just like for other orders).
>> Although for order-0 I will still fall back to searching any cluster if no space
>> is found in an order-0 cluster. What more can we do?
>>
>> When run against your v1 of the tool with "-s" (v1 always implicitly behaves as
>> if "-a" is specified, right?) my patch gives 0% fallback. So what's the
>> difference in v2 that causes higher fallback rate? Possibly just that
>> MEMSIZE_SMALLFOLIO has grown by 3MB so that the total memory matches the swap
>> size (64M)?
> 
> Exactly. From my understanding, we've reached a point where small folios are
> struggling to find swap slots. Note that I always swap out mTHP before swapping
> out small folios. Additionally, I have already swapped in 1MB small folios
> before swapping out, which means zRAM has 1MB-4KB of redundant space available
> for mTHP to swap out.
> 
>>
>> Thanks,
>> Ryan
>>
>>>
>>> *
>>> *  Test results on Ryan's patchset:
>>> *
>>>
>>> 1. w/ -a
>>> ./thp_swap_allocator_test -a
>>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 4: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> ...
>>> Iteration 100: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>
>>> 2. w/o -a
>>> ./thp_swap_allocator_test
>>>
>>> Iteration 1: swpout inc: 208, swpout fallback inc: 25, Fallback percentage: 10.73%
>>> Iteration 2: swpout inc: 118, swpout fallback inc: 114, Fallback percentage: 49.14%
>>> Iteration 3: swpout inc: 63, swpout fallback inc: 163, Fallback percentage: 72.12%
>>> Iteration 4: swpout inc: 45, swpout fallback inc: 178, Fallback percentage: 79.82%
>>> Iteration 5: swpout inc: 42, swpout fallback inc: 184, Fallback percentage: 81.42%
>>> Iteration 6: swpout inc: 31, swpout fallback inc: 193, Fallback percentage: 86.16%
>>> Iteration 7: swpout inc: 27, swpout fallback inc: 201, Fallback percentage: 88.16%
>>> Iteration 8: swpout inc: 30, swpout fallback inc: 198, Fallback percentage: 86.84%
>>> Iteration 9: swpout inc: 32, swpout fallback inc: 194, Fallback percentage: 85.84%
>>> ...
>>> Iteration 91: swpout inc: 26, swpout fallback inc: 194, Fallback percentage: 88.18%
>>> Iteration 92: swpout inc: 35, swpout fallback inc: 196, Fallback percentage: 84.85%
>>> Iteration 93: swpout inc: 33, swpout fallback inc: 191, Fallback percentage: 85.27%
>>> Iteration 94: swpout inc: 26, swpout fallback inc: 193, Fallback percentage: 88.13%
>>> Iteration 95: swpout inc: 39, swpout fallback inc: 189, Fallback percentage: 82.89%
>>> Iteration 96: swpout inc: 28, swpout fallback inc: 196, Fallback percentage: 87.50%
>>> Iteration 97: swpout inc: 25, swpout fallback inc: 194, Fallback percentage: 88.58%
>>> Iteration 98: swpout inc: 31, swpout fallback inc: 196, Fallback percentage: 86.34%
>>> Iteration 99: swpout inc: 32, swpout fallback inc: 202, Fallback percentage: 86.32%
>>> Iteration 100: swpout inc: 33, swpout fallback inc: 195, Fallback percentage: 85.53%
>>>
>>> 3. w/ -a and -s
>>> ./thp_swap_allocator_test -a -s
>>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 2: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 3: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 4: swpout inc: 220, swpout fallback inc: 6, Fallback percentage: 2.65%
>>> Iteration 5: swpout inc: 206, swpout fallback inc: 16, Fallback percentage: 7.21%
>>> Iteration 6: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 7: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 8: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 9: swpout inc: 217, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 10: swpout inc: 224, swpout fallback inc: 3, Fallback percentage: 1.32%
>>> Iteration 11: swpout inc: 211, swpout fallback inc: 12, Fallback percentage: 5.38%
>>> Iteration 12: swpout inc: 200, swpout fallback inc: 32, Fallback percentage: 13.79%
>>> Iteration 13: swpout inc: 189, swpout fallback inc: 29, Fallback percentage: 13.30%
>>> Iteration 14: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
>>> Iteration 15: swpout inc: 198, swpout fallback inc: 27, Fallback percentage: 12.00%
>>> Iteration 16: swpout inc: 201, swpout fallback inc: 17, Fallback percentage: 7.80%
>>> Iteration 17: swpout inc: 206, swpout fallback inc: 6, Fallback percentage: 2.83%
>>> Iteration 18: swpout inc: 220, swpout fallback inc: 14, Fallback percentage: 5.98%
>>> Iteration 19: swpout inc: 181, swpout fallback inc: 45, Fallback percentage: 19.91%
>>> Iteration 20: swpout inc: 223, swpout fallback inc: 8, Fallback percentage: 3.46%
>>> Iteration 21: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
>>> Iteration 22: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
>>> Iteration 23: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 24: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 25: swpout inc: 214, swpout fallback inc: 1, Fallback percentage: 0.47%
>>> Iteration 26: swpout inc: 229, swpout fallback inc: 1, Fallback percentage: 0.43%
>>> Iteration 27: swpout inc: 214, swpout fallback inc: 5, Fallback percentage: 2.28%
>>> Iteration 28: swpout inc: 211, swpout fallback inc: 15, Fallback percentage: 6.64%
>>> Iteration 29: swpout inc: 188, swpout fallback inc: 40, Fallback percentage: 17.54%
>>> Iteration 30: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
>>> Iteration 31: swpout inc: 215, swpout fallback inc: 10, Fallback percentage: 4.44%
>>> Iteration 32: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
>>> Iteration 33: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 34: swpout inc: 218, swpout fallback inc: 10, Fallback percentage: 4.39%
>>> Iteration 35: swpout inc: 203, swpout fallback inc: 30, Fallback percentage: 12.88%
>>> Iteration 36: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
>>> Iteration 37: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
>>> Iteration 38: swpout inc: 193, swpout fallback inc: 28, Fallback percentage: 12.67%
>>> Iteration 39: swpout inc: 210, swpout fallback inc: 20, Fallback percentage: 8.70%
>>> Iteration 40: swpout inc: 223, swpout fallback inc: 5, Fallback percentage: 2.19%
>>> Iteration 41: swpout inc: 224, swpout fallback inc: 7, Fallback percentage: 3.03%
>>> Iteration 42: swpout inc: 200, swpout fallback inc: 23, Fallback percentage: 10.31%
>>> Iteration 43: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
>>> Iteration 44: swpout inc: 206, swpout fallback inc: 18, Fallback percentage: 8.04%
>>> Iteration 45: swpout inc: 210, swpout fallback inc: 11, Fallback percentage: 4.98%
>>> Iteration 46: swpout inc: 204, swpout fallback inc: 19, Fallback percentage: 8.52%
>>> Iteration 47: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 48: swpout inc: 219, swpout fallback inc: 2, Fallback percentage: 0.90%
>>> Iteration 49: swpout inc: 212, swpout fallback inc: 6, Fallback percentage: 2.75%
>>> Iteration 50: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
>>> Iteration 51: swpout inc: 190, swpout fallback inc: 36, Fallback percentage: 15.93%
>>> Iteration 52: swpout inc: 212, swpout fallback inc: 17, Fallback percentage: 7.42%
>>> Iteration 53: swpout inc: 179, swpout fallback inc: 43, Fallback percentage: 19.37%
>>> Iteration 54: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 55: swpout inc: 224, swpout fallback inc: 2, Fallback percentage: 0.88%
>>> Iteration 56: swpout inc: 220, swpout fallback inc: 8, Fallback percentage: 3.51%
>>> Iteration 57: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
>>> Iteration 58: swpout inc: 213, swpout fallback inc: 6, Fallback percentage: 2.74%
>>> Iteration 59: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
>>> Iteration 60: swpout inc: 216, swpout fallback inc: 14, Fallback percentage: 6.09%
>>> Iteration 61: swpout inc: 183, swpout fallback inc: 34, Fallback percentage: 15.67%
>>> Iteration 62: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
>>> Iteration 63: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
>>> Iteration 64: swpout inc: 210, swpout fallback inc: 15, Fallback percentage: 6.67%
>>> Iteration 65: swpout inc: 178, swpout fallback inc: 48, Fallback percentage: 21.24%
>>> Iteration 66: swpout inc: 188, swpout fallback inc: 30, Fallback percentage: 13.76%
>>> Iteration 67: swpout inc: 193, swpout fallback inc: 29, Fallback percentage: 13.06%
>>> Iteration 68: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
>>> Iteration 69: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
>>> Iteration 70: swpout inc: 204, swpout fallback inc: 15, Fallback percentage: 6.85%
>>> Iteration 71: swpout inc: 180, swpout fallback inc: 45, Fallback percentage: 20.00%
>>> Iteration 72: swpout inc: 210, swpout fallback inc: 21, Fallback percentage: 9.09%
>>> Iteration 73: swpout inc: 216, swpout fallback inc: 7, Fallback percentage: 3.14%
>>> Iteration 74: swpout inc: 209, swpout fallback inc: 19, Fallback percentage: 8.33%
>>> Iteration 75: swpout inc: 222, swpout fallback inc: 7, Fallback percentage: 3.06%
>>> Iteration 76: swpout inc: 212, swpout fallback inc: 14, Fallback percentage: 6.19%
>>> Iteration 77: swpout inc: 188, swpout fallback inc: 41, Fallback percentage: 17.90%
>>> Iteration 78: swpout inc: 198, swpout fallback inc: 17, Fallback percentage: 7.91%
>>> Iteration 79: swpout inc: 209, swpout fallback inc: 16, Fallback percentage: 7.11%
>>> Iteration 80: swpout inc: 182, swpout fallback inc: 41, Fallback percentage: 18.39%
>>> Iteration 81: swpout inc: 217, swpout fallback inc: 1, Fallback percentage: 0.46%
>>> Iteration 82: swpout inc: 225, swpout fallback inc: 3, Fallback percentage: 1.32%
>>> Iteration 83: swpout inc: 222, swpout fallback inc: 8, Fallback percentage: 3.48%
>>> Iteration 84: swpout inc: 201, swpout fallback inc: 21, Fallback percentage: 9.46%
>>> Iteration 85: swpout inc: 211, swpout fallback inc: 3, Fallback percentage: 1.40%
>>> Iteration 86: swpout inc: 209, swpout fallback inc: 14, Fallback percentage: 6.28%
>>> Iteration 87: swpout inc: 181, swpout fallback inc: 42, Fallback percentage: 18.83%
>>> Iteration 88: swpout inc: 223, swpout fallback inc: 4, Fallback percentage: 1.76%
>>> Iteration 89: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
>>> Iteration 90: swpout inc: 192, swpout fallback inc: 33, Fallback percentage: 14.67%
>>> Iteration 91: swpout inc: 184, swpout fallback inc: 31, Fallback percentage: 14.42%
>>> Iteration 92: swpout inc: 201, swpout fallback inc: 32, Fallback percentage: 13.73%
>>> Iteration 93: swpout inc: 181, swpout fallback inc: 40, Fallback percentage: 18.10%
>>> Iteration 94: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
>>> Iteration 95: swpout inc: 198, swpout fallback inc: 25, Fallback percentage: 11.21%
>>> Iteration 96: swpout inc: 205, swpout fallback inc: 22, Fallback percentage: 9.69%
>>> Iteration 97: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
>>> Iteration 98: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
>>> Iteration 99: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
>>> Iteration 100: swpout inc: 195, swpout fallback inc: 34, Fallback percentage: 14.85%
>>>
>>> 4. w/o -a and w/ -s
>>> ./thp_swap_allocator_test -s
>>> Iteration 1: swpout inc: 173, swpout fallback inc: 60, Fallback percentage: 25.75%
>>> Iteration 2: swpout inc: 85, swpout fallback inc: 147, Fallback percentage: 63.36%
>>> Iteration 3: swpout inc: 39, swpout fallback inc: 195, Fallback percentage: 83.33%
>>> Iteration 4: swpout inc: 13, swpout fallback inc: 220, Fallback percentage: 94.42%
>>> Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
>>> Iteration 6: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
>>> Iteration 7: swpout inc: 6, swpout fallback inc: 217, Fallback percentage: 97.31%
>>> Iteration 8: swpout inc: 6, swpout fallback inc: 215, Fallback percentage: 97.29%
>>> Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
>>> Iteration 10: swpout inc: 1, swpout fallback inc: 229, Fallback percentage: 99.57%
>>> Iteration 11: swpout inc: 2, swpout fallback inc: 216, Fallback percentage: 99.08%
>>> Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
>>> Iteration 13: swpout inc: 4, swpout fallback inc: 211, Fallback percentage: 98.14%
>>> Iteration 14: swpout inc: 1, swpout fallback inc: 221, Fallback percentage: 99.55%
>>> Iteration 15: swpout inc: 2, swpout fallback inc: 223, Fallback percentage: 99.11%
>>> Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
>>> Iteration 17: swpout inc: 2, swpout fallback inc: 231, Fallback percentage: 99.14%
>>> ...
>>>
>>> *
>>> *  Test results on Chris's v3 patchset:
>>> *
>>> 1. w/ -a
>>> ./thp_swap_allocator_test -a
>>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 4: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
>>> Iteration 5: swpout inc: 215, swpout fallback inc: 12, Fallback percentage: 5.29%
>>> Iteration 6: swpout inc: 213, swpout fallback inc: 14, Fallback percentage: 6.17%
>>> Iteration 7: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
>>> Iteration 8: swpout inc: 193, swpout fallback inc: 33, Fallback percentage: 14.60%
>>> Iteration 9: swpout inc: 214, swpout fallback inc: 13, Fallback percentage: 5.73%
>>> Iteration 10: swpout inc: 199, swpout fallback inc: 25, Fallback percentage: 11.16%
>>> Iteration 11: swpout inc: 208, swpout fallback inc: 14, Fallback percentage: 6.31%
>>> Iteration 12: swpout inc: 203, swpout fallback inc: 31, Fallback percentage: 13.25%
>>> Iteration 13: swpout inc: 192, swpout fallback inc: 25, Fallback percentage: 11.52%
>>> Iteration 14: swpout inc: 193, swpout fallback inc: 36, Fallback percentage: 15.72%
>>> Iteration 15: swpout inc: 188, swpout fallback inc: 33, Fallback percentage: 14.93%
>>> ...
>>>
>>> It seems Chris's approach can be negatively affected even by aligned swapin,
>>> having a low fallback ratio but not 0% while Ryan's patchset doesn't have this
>>> issue.
>>>
>>> 2. w/o -a
>>> ./thp_swap_allocator_test
>>> Iteration 1: swpout inc: 209, swpout fallback inc: 24, Fallback percentage: 10.30%
>>> Iteration 2: swpout inc: 100, swpout fallback inc: 132, Fallback percentage: 56.90%
>>> Iteration 3: swpout inc: 43, swpout fallback inc: 183, Fallback percentage: 80.97%
>>> Iteration 4: swpout inc: 30, swpout fallback inc: 193, Fallback percentage: 86.55%
>>> Iteration 5: swpout inc: 21, swpout fallback inc: 205, Fallback percentage: 90.71%
>>> Iteration 6: swpout inc: 10, swpout fallback inc: 214, Fallback percentage: 95.54%
>>> Iteration 7: swpout inc: 16, swpout fallback inc: 212, Fallback percentage: 92.98%
>>> Iteration 8: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
>>> Iteration 9: swpout inc: 6, swpout fallback inc: 220, Fallback percentage: 97.35%
>>> Iteration 10: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
>>> Iteration 11: swpout inc: 7, swpout fallback inc: 222, Fallback percentage: 96.94%
>>> Iteration 12: swpout inc: 8, swpout fallback inc: 212, Fallback percentage: 96.36%
>>> ...
>>>
>>> Ryan's fallback ratio (around 85%) seems to be a little better while both are much
>>> worse than "-a".
>>>
>>> 3. w/ -a and -s
>>> ./thp_swap_allocator_test -a -s
>>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
>>> Iteration 2: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
>>> Iteration 3: swpout inc: 215, swpout fallback inc: 7, Fallback percentage: 3.15%
>>> Iteration 4: swpout inc: 210, swpout fallback inc: 16, Fallback percentage: 7.08%
>>> Iteration 5: swpout inc: 212, swpout fallback inc: 10, Fallback percentage: 4.50%
>>> Iteration 6: swpout inc: 215, swpout fallback inc: 18, Fallback percentage: 7.73%
>>> Iteration 7: swpout inc: 181, swpout fallback inc: 43, Fallback percentage: 19.20%
>>> Iteration 8: swpout inc: 173, swpout fallback inc: 55, Fallback percentage: 24.12%
>>> Iteration 9: swpout inc: 163, swpout fallback inc: 54, Fallback percentage: 24.88%
>>> Iteration 10: swpout inc: 168, swpout fallback inc: 59, Fallback percentage: 25.99%
>>> Iteration 11: swpout inc: 154, swpout fallback inc: 69, Fallback percentage: 30.94%
>>> Iteration 12: swpout inc: 166, swpout fallback inc: 66, Fallback percentage: 28.45%
>>> Iteration 13: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
>>> Iteration 14: swpout inc: 158, swpout fallback inc: 68, Fallback percentage: 30.09%
>>> Iteration 15: swpout inc: 168, swpout fallback inc: 57, Fallback percentage: 25.33%
>>> Iteration 16: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
>>> Iteration 17: swpout inc: 163, swpout fallback inc: 49, Fallback percentage: 23.11%
>>> Iteration 18: swpout inc: 172, swpout fallback inc: 62, Fallback percentage: 26.50%
>>> Iteration 19: swpout inc: 183, swpout fallback inc: 43, Fallback percentage: 19.03%
>>> Iteration 20: swpout inc: 158, swpout fallback inc: 73, Fallback percentage: 31.60%
>>> Iteration 21: swpout inc: 147, swpout fallback inc: 81, Fallback percentage: 35.53%
>>> Iteration 22: swpout inc: 140, swpout fallback inc: 86, Fallback percentage: 38.05%
>>> Iteration 23: swpout inc: 144, swpout fallback inc: 79, Fallback percentage: 35.43%
>>> Iteration 24: swpout inc: 132, swpout fallback inc: 101, Fallback percentage: 43.35%
>>> Iteration 25: swpout inc: 133, swpout fallback inc: 82, Fallback percentage: 38.14%
>>> Iteration 26: swpout inc: 152, swpout fallback inc: 78, Fallback percentage: 33.91%
>>> Iteration 27: swpout inc: 138, swpout fallback inc: 81, Fallback percentage: 36.99%
>>> Iteration 28: swpout inc: 152, swpout fallback inc: 74, Fallback percentage: 32.74%
>>> Iteration 29: swpout inc: 153, swpout fallback inc: 75, Fallback percentage: 32.89%
>>> Iteration 30: swpout inc: 151, swpout fallback inc: 74, Fallback percentage: 32.89%
>>> ...
>>>
>>> Chris's approach appears to be more susceptible to negative effects from
>>> small folios.
>>>
>>> 4. w/o -a and w/ -s
>>> ./thp_swap_allocator_test -s
>>> Iteration 1: swpout inc: 183, swpout fallback inc: 50, Fallback percentage: 21.46%
>>> Iteration 2: swpout inc: 75, swpout fallback inc: 157, Fallback percentage: 67.67%
>>> Iteration 3: swpout inc: 33, swpout fallback inc: 201, Fallback percentage: 85.90%
>>> Iteration 4: swpout inc: 11, swpout fallback inc: 222, Fallback percentage: 95.28%
>>> Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
>>> Iteration 6: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
>>> Iteration 7: swpout inc: 2, swpout fallback inc: 221, Fallback percentage: 99.10%
>>> Iteration 8: swpout inc: 4, swpout fallback inc: 217, Fallback percentage: 98.19%
>>> Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
>>> Iteration 10: swpout inc: 3, swpout fallback inc: 227, Fallback percentage: 98.70%
>>> Iteration 11: swpout inc: 1, swpout fallback inc: 217, Fallback percentage: 99.54%
>>> Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
>>> Iteration 13: swpout inc: 1, swpout fallback inc: 214, Fallback percentage: 99.53%
>>> Iteration 14: swpout inc: 2, swpout fallback inc: 220, Fallback percentage: 99.10%
>>> Iteration 15: swpout inc: 1, swpout fallback inc: 224, Fallback percentage: 99.56%
>>> Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
>>> ...
>>>
>>> Barry Song (1):
>>>   tools/mm: Introduce a tool to assess swap entry allocation for
>>>     thp_swapout
>>>
>>>  tools/mm/Makefile                  |   2 +-
>>>  tools/mm/thp_swap_allocator_test.c | 233 +++++++++++++++++++++++++++++
>>>  2 files changed, 234 insertions(+), 1 deletion(-)
>>>  create mode 100644 tools/mm/thp_swap_allocator_test.c
>>>
>>
> 
> Thanks
> Barry




* Re: [PATCH v2 0/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout
  2024-06-24 10:35     ` Ryan Roberts
@ 2024-06-25  0:11       ` Barry Song
  2024-06-25  8:11         ` Ryan Roberts
  0 siblings, 1 reply; 16+ messages in thread
From: Barry Song @ 2024-06-25  0:11 UTC (permalink / raw)
  To: Ryan Roberts
  Cc: akpm, chrisl, linux-mm, david, hughd, kaleshsingh, kasong,
	linux-kernel, v-songbaohua, ying.huang

On Mon, Jun 24, 2024 at 10:35 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 24/06/2024 09:42, Barry Song wrote:
> > On Mon, Jun 24, 2024 at 8:26 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
> >>
> >> On 22/06/2024 08:12, Barry Song wrote:
> >>> From: Barry Song <v-songbaohua@oppo.com>
> >>>
> >>> -v2:
> >>>  * add swap-in which can either be aligned or not aligned, by "-a";
> >>>    Ying;
> >>>  * move the program to tools/mm; Ryan;
> >>>  * try to simulate the scenarios swap is full. Chris;
> >>>
> >>> -v1:
> >>>  https://lore.kernel.org/linux-mm/20240620002648.75204-1-21cnbao@gmail.com/
> >>>
> >>> I tested Ryan's RFC patchset[1] and Chris's v3[2] using this v2 tool:
> >>> [1] https://lore.kernel.org/linux-mm/20240618232648.4090299-1-ryan.roberts@arm.com/
> >>> [2] https://lore.kernel.org/linux-mm/20240614-swap-allocator-v2-0-2a513b4a7f2f@kernel.org/
> >>>
> >>> Obviously, we're rarely hitting 100% even in the worst case without "-a" and with
> >>> "-s," which is good news!
> >>> If swapin is aligned w/ "-a" and w/o "-s", both Chris's and Ryan's patches show
> >>> a low fallback ratio though Chris's has the numbers above 0% but Ryan's are 0%
> >>> (value A).
> >>>
> >>> The bad news is that unaligned swapin can significantly increase the fallback ratio,
> >>> reaching up to 85% for Ryan's patch and 95% for Chris's patchset without "-s." Both
> >>> approaches approach 100% without "-a" and with "-s" (value B).
> >>>
> >>> I believe real workloads should yield a value between A and B. Without "-a," and
> >>> lacking large folios swap-in, this tool randomly swaps in small folios without
> >>> considering spatial locality, which is a factor present in real workloads. This
> >>> typically results in values higher than A and lower than B.
> >>>
> >>> Based on the below results, I believe that:
> >>
> >> Thanks for putting this together and providing such detailed results!
> >>
> >>> 1. We truly require large folio swap-in to achieve comparable results with
> >>> aligned swap-in(based on the result w/o and w/ "-a")
> >>
> >> I certainly agree that as long as we require a high order swap entry to be
> >> contiguous in the backing store then it looks like we are going to need large
> >> folio swap-in to prevent enormous fragmentation. I guess Chris's proposed layer
> >> of indirection to allow pages to be scattered in the backing store would also
> >> solve the problem? Although, I'm not sure this would work well for zRam?
> >
> > The challenge is that we also want to take advantage of improving zsmalloc
> > to save compressed multi-pages. However, it seems nearly impossible for
> > zsmalloc to achieve this if an mTHP is scattered rather than kept together
> > in zRAM.
>
> Yes understood. I finally got around to watching the lsfmm videos; I believe the
> suggested solution with a fs-like approach would be to let the fs handle the
> compression, which means compressing extents? So even with that approach,
> presumably it's still valuable to be able to allocate the biggest extents possible.
>
> >
> >>
> >> Perhaps another way of looking at this is that we are doing a bad job of
> >> selecting when to use an mTHP and when not to use one in the first place;
> >> ideally the workload would access the data across the entire mTHP with high
> >> temporal locality? In that case, we would expect the whole mTHP to be swapped in
> >> even with the current page-by-page approach. Figuring out this "auto sizing"
> >> seems like an incredibly complex problem to solve though.
> >
> > The good news is that this is exactly what we're implementing in our products,
> > and it has been deployed on millions of phones.
> >
> >   *  Allocate mTHP and swap in the entire mTHP  in do_swap_page();
> >   *  If mTHP allocation fails, allocate 16 pages to swap-in in do_swap_page();
>
> I think we were talking at cross-purposes here. What I meant was that in an ideal
> world we would only allocate a (64K) mTHP for a page fault if we had confidence
> (via some heuristic) that the virtual 64K area was likely to always be accessed
> together, else just allocate a small folio. i.e. choose the folio size to cover
> a single object from user space's PoV. That would have the side effect that a
> page-by-page swap-in approach (the current approach in mainline) would still
> effectively result in swapping in the whole folio and therefore reduce
> fragmentation in the swap file. (Or thinking about it slightly differently, it
> would give us confidence to always swap-in a large folio at a time, because we
> know it's all highly likely to get used in the near future).
>
> I suspect this is a moot point though, because divining a suitable heuristic
> with low overhead is basically impossible.
>
> >
> > To be honest, we haven't noticed a visible increase in memory footprint. This is
> > likely because Android's anonymous memory exhibits good spatial locality, and
> > 64KiB strikes a good balance—neither too large nor too small.
>
> Indeed.
>
> >
> > The bad news is that I haven't found a way to convince the community this
> > is universally correct.
>
> I think we will want to be pragmatic and at least implement an option (sysfs?)
> to swap-in a large folio up to a certain size; These test results clearly show
> the value. And I believe you have real-world data for Android that shows the
> same thing.
>
> Just to creep the scope of this thread slightly, after watching yours and Yu
> Zhou's presentations around TAO, IIUC, even with TAO enabled, 64K folio
> allocation fallback is still above 50%? I still believe that once the Android
> filesystems are converted to use large folios that number will improve
> substantially; especially if the page cache can be convinced to only allocate
> 64K folios (with 4K fallback). At that point you're predominantly using 64K
> folios so intuitively there will be less fragmentation.

Absolutely agreed. I did report an allocation fallback rate slightly above 50%,
but this is not because TAO is ineffective. It's simply because, in the test, we
set up a conservative 15% virtzone for mTHP. If we enlarge the zone, we would
definitely achieve a lower fallback ratio. The catch is that allocations needing
a large number of small folios (for example, for the page cache) might then
suffer. Still, we should be able to increase the percentage of the virtzone
after some fine-tuning, as the report was based on an initial test to
demonstrate that TAO can provide guaranteed mTHP coverage.

If we can somehow unify the mTHP size for both the page cache and anon, things
might improve.

Xiang promised to deliver EROFS large folio support. If we also get this in
f2fs, things will be quite different.

>
> But allocations by the page cache today start at 16K and increment by 2 orders
> for every new readahead IIRC. So we end up with lots of large folio sizes, and
> presumably the potential for lots of fallbacks.
>
> All of this is just to suggest that we may end up wanting controls to specify
> which folio sizes the page cache can attempt to use. At that point, adding
> similar controls for swap-in doesn't feel unreasonable to me.
>
> Just my 2 cents!
>
> >
> >>
> >>> 2. We need a method to prevent small folios from scattering indiscriminately
> >>> (based on the result "-a -s")
> >>
> >> I'm confused by this statement; as I understand it, both my and Chris's patches
> >> already try to do this. Certainly for mine, when searching for order-0 space, I
> >> search the non-full order-0 clusters first (just like for other orders).
> >> Although for order-0 I will still fallback to searching any cluster if no space
> >> is found in an order-0 cluster. What more can we do?
> >>
> >> When run against your v1 of the tool with "-s" (v1 always implicitly behaves as
> >> if "-a" is specified, right?) my patch gives 0% fallback. So what's the
> >> difference in v2 that causes higher fallback rate? Possibly just that
> >> MEMSIZE_SMALLFOLIO has grown by 3MB so that the total memory matches the swap
> >> size (64M)?
> >
> > Exactly. From my understanding, we've reached a point where small folios are
> > struggling to find swap slots. Note that I always swap out mTHP before swapping
> > out small folios. Additionally, I have already swapped in 1MB small folios
> > before swapping out, which means zRAM has 1MB-4KB of redundant space available
> > for mTHP to swap out.
> >
> >>
> >> Thanks,
> >> Ryan
> >>
> >>>
> >>> *
> >>> *  Test results on Ryan's patchset:
> >>> *
> >>>
> >>> 1. w/ -a
> >>> ./thp_swap_allocator_test -a
> >>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 4: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> ...
> >>> Iteration 100: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>
> >>> 2. w/o -a
> >>> ./thp_swap_allocator_test
> >>>
> >>> Iteration 1: swpout inc: 208, swpout fallback inc: 25, Fallback percentage: 10.73%
> >>> Iteration 2: swpout inc: 118, swpout fallback inc: 114, Fallback percentage: 49.14%
> >>> Iteration 3: swpout inc: 63, swpout fallback inc: 163, Fallback percentage: 72.12%
> >>> Iteration 4: swpout inc: 45, swpout fallback inc: 178, Fallback percentage: 79.82%
> >>> Iteration 5: swpout inc: 42, swpout fallback inc: 184, Fallback percentage: 81.42%
> >>> Iteration 6: swpout inc: 31, swpout fallback inc: 193, Fallback percentage: 86.16%
> >>> Iteration 7: swpout inc: 27, swpout fallback inc: 201, Fallback percentage: 88.16%
> >>> Iteration 8: swpout inc: 30, swpout fallback inc: 198, Fallback percentage: 86.84%
> >>> Iteration 9: swpout inc: 32, swpout fallback inc: 194, Fallback percentage: 85.84%
> >>> ...
> >>> Iteration 91: swpout inc: 26, swpout fallback inc: 194, Fallback percentage: 88.18%
> >>> Iteration 92: swpout inc: 35, swpout fallback inc: 196, Fallback percentage: 84.85%
> >>> Iteration 93: swpout inc: 33, swpout fallback inc: 191, Fallback percentage: 85.27%
> >>> Iteration 94: swpout inc: 26, swpout fallback inc: 193, Fallback percentage: 88.13%
> >>> Iteration 95: swpout inc: 39, swpout fallback inc: 189, Fallback percentage: 82.89%
> >>> Iteration 96: swpout inc: 28, swpout fallback inc: 196, Fallback percentage: 87.50%
> >>> Iteration 97: swpout inc: 25, swpout fallback inc: 194, Fallback percentage: 88.58%
> >>> Iteration 98: swpout inc: 31, swpout fallback inc: 196, Fallback percentage: 86.34%
> >>> Iteration 99: swpout inc: 32, swpout fallback inc: 202, Fallback percentage: 86.32%
> >>> Iteration 100: swpout inc: 33, swpout fallback inc: 195, Fallback percentage: 85.53%
> >>>
> >>> 3. w/ -a and -s
> >>> ./thp_swap_allocator_test -a -s
> >>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 2: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 3: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 4: swpout inc: 220, swpout fallback inc: 6, Fallback percentage: 2.65%
> >>> Iteration 5: swpout inc: 206, swpout fallback inc: 16, Fallback percentage: 7.21%
> >>> Iteration 6: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 7: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 8: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 9: swpout inc: 217, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 10: swpout inc: 224, swpout fallback inc: 3, Fallback percentage: 1.32%
> >>> Iteration 11: swpout inc: 211, swpout fallback inc: 12, Fallback percentage: 5.38%
> >>> Iteration 12: swpout inc: 200, swpout fallback inc: 32, Fallback percentage: 13.79%
> >>> Iteration 13: swpout inc: 189, swpout fallback inc: 29, Fallback percentage: 13.30%
> >>> Iteration 14: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
> >>> Iteration 15: swpout inc: 198, swpout fallback inc: 27, Fallback percentage: 12.00%
> >>> Iteration 16: swpout inc: 201, swpout fallback inc: 17, Fallback percentage: 7.80%
> >>> Iteration 17: swpout inc: 206, swpout fallback inc: 6, Fallback percentage: 2.83%
> >>> Iteration 18: swpout inc: 220, swpout fallback inc: 14, Fallback percentage: 5.98%
> >>> Iteration 19: swpout inc: 181, swpout fallback inc: 45, Fallback percentage: 19.91%
> >>> Iteration 20: swpout inc: 223, swpout fallback inc: 8, Fallback percentage: 3.46%
> >>> Iteration 21: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
> >>> Iteration 22: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
> >>> Iteration 23: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 24: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 25: swpout inc: 214, swpout fallback inc: 1, Fallback percentage: 0.47%
> >>> Iteration 26: swpout inc: 229, swpout fallback inc: 1, Fallback percentage: 0.43%
> >>> Iteration 27: swpout inc: 214, swpout fallback inc: 5, Fallback percentage: 2.28%
> >>> Iteration 28: swpout inc: 211, swpout fallback inc: 15, Fallback percentage: 6.64%
> >>> Iteration 29: swpout inc: 188, swpout fallback inc: 40, Fallback percentage: 17.54%
> >>> Iteration 30: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
> >>> Iteration 31: swpout inc: 215, swpout fallback inc: 10, Fallback percentage: 4.44%
> >>> Iteration 32: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
> >>> Iteration 33: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 34: swpout inc: 218, swpout fallback inc: 10, Fallback percentage: 4.39%
> >>> Iteration 35: swpout inc: 203, swpout fallback inc: 30, Fallback percentage: 12.88%
> >>> Iteration 36: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
> >>> Iteration 37: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
> >>> Iteration 38: swpout inc: 193, swpout fallback inc: 28, Fallback percentage: 12.67%
> >>> Iteration 39: swpout inc: 210, swpout fallback inc: 20, Fallback percentage: 8.70%
> >>> Iteration 40: swpout inc: 223, swpout fallback inc: 5, Fallback percentage: 2.19%
> >>> Iteration 41: swpout inc: 224, swpout fallback inc: 7, Fallback percentage: 3.03%
> >>> Iteration 42: swpout inc: 200, swpout fallback inc: 23, Fallback percentage: 10.31%
> >>> Iteration 43: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
> >>> Iteration 44: swpout inc: 206, swpout fallback inc: 18, Fallback percentage: 8.04%
> >>> Iteration 45: swpout inc: 210, swpout fallback inc: 11, Fallback percentage: 4.98%
> >>> Iteration 46: swpout inc: 204, swpout fallback inc: 19, Fallback percentage: 8.52%
> >>> Iteration 47: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 48: swpout inc: 219, swpout fallback inc: 2, Fallback percentage: 0.90%
> >>> Iteration 49: swpout inc: 212, swpout fallback inc: 6, Fallback percentage: 2.75%
> >>> Iteration 50: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
> >>> Iteration 51: swpout inc: 190, swpout fallback inc: 36, Fallback percentage: 15.93%
> >>> Iteration 52: swpout inc: 212, swpout fallback inc: 17, Fallback percentage: 7.42%
> >>> Iteration 53: swpout inc: 179, swpout fallback inc: 43, Fallback percentage: 19.37%
> >>> Iteration 54: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 55: swpout inc: 224, swpout fallback inc: 2, Fallback percentage: 0.88%
> >>> Iteration 56: swpout inc: 220, swpout fallback inc: 8, Fallback percentage: 3.51%
> >>> Iteration 57: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
> >>> Iteration 58: swpout inc: 213, swpout fallback inc: 6, Fallback percentage: 2.74%
> >>> Iteration 59: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
> >>> Iteration 60: swpout inc: 216, swpout fallback inc: 14, Fallback percentage: 6.09%
> >>> Iteration 61: swpout inc: 183, swpout fallback inc: 34, Fallback percentage: 15.67%
> >>> Iteration 62: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
> >>> Iteration 63: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
> >>> Iteration 64: swpout inc: 210, swpout fallback inc: 15, Fallback percentage: 6.67%
> >>> Iteration 65: swpout inc: 178, swpout fallback inc: 48, Fallback percentage: 21.24%
> >>> Iteration 66: swpout inc: 188, swpout fallback inc: 30, Fallback percentage: 13.76%
> >>> Iteration 67: swpout inc: 193, swpout fallback inc: 29, Fallback percentage: 13.06%
> >>> Iteration 68: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
> >>> Iteration 69: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
> >>> Iteration 70: swpout inc: 204, swpout fallback inc: 15, Fallback percentage: 6.85%
> >>> Iteration 71: swpout inc: 180, swpout fallback inc: 45, Fallback percentage: 20.00%
> >>> Iteration 72: swpout inc: 210, swpout fallback inc: 21, Fallback percentage: 9.09%
> >>> Iteration 73: swpout inc: 216, swpout fallback inc: 7, Fallback percentage: 3.14%
> >>> Iteration 74: swpout inc: 209, swpout fallback inc: 19, Fallback percentage: 8.33%
> >>> Iteration 75: swpout inc: 222, swpout fallback inc: 7, Fallback percentage: 3.06%
> >>> Iteration 76: swpout inc: 212, swpout fallback inc: 14, Fallback percentage: 6.19%
> >>> Iteration 77: swpout inc: 188, swpout fallback inc: 41, Fallback percentage: 17.90%
> >>> Iteration 78: swpout inc: 198, swpout fallback inc: 17, Fallback percentage: 7.91%
> >>> Iteration 79: swpout inc: 209, swpout fallback inc: 16, Fallback percentage: 7.11%
> >>> Iteration 80: swpout inc: 182, swpout fallback inc: 41, Fallback percentage: 18.39%
> >>> Iteration 81: swpout inc: 217, swpout fallback inc: 1, Fallback percentage: 0.46%
> >>> Iteration 82: swpout inc: 225, swpout fallback inc: 3, Fallback percentage: 1.32%
> >>> Iteration 83: swpout inc: 222, swpout fallback inc: 8, Fallback percentage: 3.48%
> >>> Iteration 84: swpout inc: 201, swpout fallback inc: 21, Fallback percentage: 9.46%
> >>> Iteration 85: swpout inc: 211, swpout fallback inc: 3, Fallback percentage: 1.40%
> >>> Iteration 86: swpout inc: 209, swpout fallback inc: 14, Fallback percentage: 6.28%
> >>> Iteration 87: swpout inc: 181, swpout fallback inc: 42, Fallback percentage: 18.83%
> >>> Iteration 88: swpout inc: 223, swpout fallback inc: 4, Fallback percentage: 1.76%
> >>> Iteration 89: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
> >>> Iteration 90: swpout inc: 192, swpout fallback inc: 33, Fallback percentage: 14.67%
> >>> Iteration 91: swpout inc: 184, swpout fallback inc: 31, Fallback percentage: 14.42%
> >>> Iteration 92: swpout inc: 201, swpout fallback inc: 32, Fallback percentage: 13.73%
> >>> Iteration 93: swpout inc: 181, swpout fallback inc: 40, Fallback percentage: 18.10%
> >>> Iteration 94: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
> >>> Iteration 95: swpout inc: 198, swpout fallback inc: 25, Fallback percentage: 11.21%
> >>> Iteration 96: swpout inc: 205, swpout fallback inc: 22, Fallback percentage: 9.69%
> >>> Iteration 97: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
> >>> Iteration 98: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
> >>> Iteration 99: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
> >>> Iteration 100: swpout inc: 195, swpout fallback inc: 34, Fallback percentage: 14.85%
> >>>
> >>> 4. w/o -a and w/ -s
> >>> ./thp_swap_allocator_test -s
> >>> Iteration 1: swpout inc: 173, swpout fallback inc: 60, Fallback percentage: 25.75%
> >>> Iteration 2: swpout inc: 85, swpout fallback inc: 147, Fallback percentage: 63.36%
> >>> Iteration 3: swpout inc: 39, swpout fallback inc: 195, Fallback percentage: 83.33%
> >>> Iteration 4: swpout inc: 13, swpout fallback inc: 220, Fallback percentage: 94.42%
> >>> Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
> >>> Iteration 6: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
> >>> Iteration 7: swpout inc: 6, swpout fallback inc: 217, Fallback percentage: 97.31%
> >>> Iteration 8: swpout inc: 6, swpout fallback inc: 215, Fallback percentage: 97.29%
> >>> Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
> >>> Iteration 10: swpout inc: 1, swpout fallback inc: 229, Fallback percentage: 99.57%
> >>> Iteration 11: swpout inc: 2, swpout fallback inc: 216, Fallback percentage: 99.08%
> >>> Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
> >>> Iteration 13: swpout inc: 4, swpout fallback inc: 211, Fallback percentage: 98.14%
> >>> Iteration 14: swpout inc: 1, swpout fallback inc: 221, Fallback percentage: 99.55%
> >>> Iteration 15: swpout inc: 2, swpout fallback inc: 223, Fallback percentage: 99.11%
> >>> Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
> >>> Iteration 17: swpout inc: 2, swpout fallback inc: 231, Fallback percentage: 99.14%
> >>> ...
> >>>
> >>> *
> >>> *  Test results on Chris's v3 patchset:
> >>> *
> >>> 1. w/ -a
> >>> ./thp_swap_allocator_test -a
> >>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 4: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
> >>> Iteration 5: swpout inc: 215, swpout fallback inc: 12, Fallback percentage: 5.29%
> >>> Iteration 6: swpout inc: 213, swpout fallback inc: 14, Fallback percentage: 6.17%
> >>> Iteration 7: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
> >>> Iteration 8: swpout inc: 193, swpout fallback inc: 33, Fallback percentage: 14.60%
> >>> Iteration 9: swpout inc: 214, swpout fallback inc: 13, Fallback percentage: 5.73%
> >>> Iteration 10: swpout inc: 199, swpout fallback inc: 25, Fallback percentage: 11.16%
> >>> Iteration 11: swpout inc: 208, swpout fallback inc: 14, Fallback percentage: 6.31%
> >>> Iteration 12: swpout inc: 203, swpout fallback inc: 31, Fallback percentage: 13.25%
> >>> Iteration 13: swpout inc: 192, swpout fallback inc: 25, Fallback percentage: 11.52%
> >>> Iteration 14: swpout inc: 193, swpout fallback inc: 36, Fallback percentage: 15.72%
> >>> Iteration 15: swpout inc: 188, swpout fallback inc: 33, Fallback percentage: 14.93%
> >>> ...
> >>>
> >>> It seems Chris's approach can be negatively affected even by aligned swapin:
> >>> its fallback ratio is low but not 0%, while Ryan's patchset doesn't have this
> >>> issue.
> >>>
> >>> 2. w/o -a
> >>> ./thp_swap_allocator_test
> >>> Iteration 1: swpout inc: 209, swpout fallback inc: 24, Fallback percentage: 10.30%
> >>> Iteration 2: swpout inc: 100, swpout fallback inc: 132, Fallback percentage: 56.90%
> >>> Iteration 3: swpout inc: 43, swpout fallback inc: 183, Fallback percentage: 80.97%
> >>> Iteration 4: swpout inc: 30, swpout fallback inc: 193, Fallback percentage: 86.55%
> >>> Iteration 5: swpout inc: 21, swpout fallback inc: 205, Fallback percentage: 90.71%
> >>> Iteration 6: swpout inc: 10, swpout fallback inc: 214, Fallback percentage: 95.54%
> >>> Iteration 7: swpout inc: 16, swpout fallback inc: 212, Fallback percentage: 92.98%
> >>> Iteration 8: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
> >>> Iteration 9: swpout inc: 6, swpout fallback inc: 220, Fallback percentage: 97.35%
> >>> Iteration 10: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
> >>> Iteration 11: swpout inc: 7, swpout fallback inc: 222, Fallback percentage: 96.94%
> >>> Iteration 12: swpout inc: 8, swpout fallback inc: 212, Fallback percentage: 96.36%
> >>> ...
> >>>
> >>> Ryan's fallback ratio (around 85%) seems to be a little better, while both are
> >>> much worse than with "-a".
> >>>
> >>> 3. w/ -a and -s
> >>> ./thp_swap_allocator_test -a -s
> >>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>> Iteration 2: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
> >>> Iteration 3: swpout inc: 215, swpout fallback inc: 7, Fallback percentage: 3.15%
> >>> Iteration 4: swpout inc: 210, swpout fallback inc: 16, Fallback percentage: 7.08%
> >>> Iteration 5: swpout inc: 212, swpout fallback inc: 10, Fallback percentage: 4.50%
> >>> Iteration 6: swpout inc: 215, swpout fallback inc: 18, Fallback percentage: 7.73%
> >>> Iteration 7: swpout inc: 181, swpout fallback inc: 43, Fallback percentage: 19.20%
> >>> Iteration 8: swpout inc: 173, swpout fallback inc: 55, Fallback percentage: 24.12%
> >>> Iteration 9: swpout inc: 163, swpout fallback inc: 54, Fallback percentage: 24.88%
> >>> Iteration 10: swpout inc: 168, swpout fallback inc: 59, Fallback percentage: 25.99%
> >>> Iteration 11: swpout inc: 154, swpout fallback inc: 69, Fallback percentage: 30.94%
> >>> Iteration 12: swpout inc: 166, swpout fallback inc: 66, Fallback percentage: 28.45%
> >>> Iteration 13: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
> >>> Iteration 14: swpout inc: 158, swpout fallback inc: 68, Fallback percentage: 30.09%
> >>> Iteration 15: swpout inc: 168, swpout fallback inc: 57, Fallback percentage: 25.33%
> >>> Iteration 16: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
> >>> Iteration 17: swpout inc: 163, swpout fallback inc: 49, Fallback percentage: 23.11%
> >>> Iteration 18: swpout inc: 172, swpout fallback inc: 62, Fallback percentage: 26.50%
> >>> Iteration 19: swpout inc: 183, swpout fallback inc: 43, Fallback percentage: 19.03%
> >>> Iteration 20: swpout inc: 158, swpout fallback inc: 73, Fallback percentage: 31.60%
> >>> Iteration 21: swpout inc: 147, swpout fallback inc: 81, Fallback percentage: 35.53%
> >>> Iteration 22: swpout inc: 140, swpout fallback inc: 86, Fallback percentage: 38.05%
> >>> Iteration 23: swpout inc: 144, swpout fallback inc: 79, Fallback percentage: 35.43%
> >>> Iteration 24: swpout inc: 132, swpout fallback inc: 101, Fallback percentage: 43.35%
> >>> Iteration 25: swpout inc: 133, swpout fallback inc: 82, Fallback percentage: 38.14%
> >>> Iteration 26: swpout inc: 152, swpout fallback inc: 78, Fallback percentage: 33.91%
> >>> Iteration 27: swpout inc: 138, swpout fallback inc: 81, Fallback percentage: 36.99%
> >>> Iteration 28: swpout inc: 152, swpout fallback inc: 74, Fallback percentage: 32.74%
> >>> Iteration 29: swpout inc: 153, swpout fallback inc: 75, Fallback percentage: 32.89%
> >>> Iteration 30: swpout inc: 151, swpout fallback inc: 74, Fallback percentage: 32.89%
> >>> ...
> >>>
> >>> Chris's approach appears to be more susceptible to negative effects from
> >>> small folios.
> >>>
> >>> 4. w/o -a and w/ -s
> >>> ./thp_swap_allocator_test -s
> >>> Iteration 1: swpout inc: 183, swpout fallback inc: 50, Fallback percentage: 21.46%
> >>> Iteration 2: swpout inc: 75, swpout fallback inc: 157, Fallback percentage: 67.67%
> >>> Iteration 3: swpout inc: 33, swpout fallback inc: 201, Fallback percentage: 85.90%
> >>> Iteration 4: swpout inc: 11, swpout fallback inc: 222, Fallback percentage: 95.28%
> >>> Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
> >>> Iteration 6: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
> >>> Iteration 7: swpout inc: 2, swpout fallback inc: 221, Fallback percentage: 99.10%
> >>> Iteration 8: swpout inc: 4, swpout fallback inc: 217, Fallback percentage: 98.19%
> >>> Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
> >>> Iteration 10: swpout inc: 3, swpout fallback inc: 227, Fallback percentage: 98.70%
> >>> Iteration 11: swpout inc: 1, swpout fallback inc: 217, Fallback percentage: 99.54%
> >>> Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
> >>> Iteration 13: swpout inc: 1, swpout fallback inc: 214, Fallback percentage: 99.53%
> >>> Iteration 14: swpout inc: 2, swpout fallback inc: 220, Fallback percentage: 99.10%
> >>> Iteration 15: swpout inc: 1, swpout fallback inc: 224, Fallback percentage: 99.56%
> >>> Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
> >>> ...
> >>>
> >>> Barry Song (1):
> >>>   tools/mm: Introduce a tool to assess swap entry allocation for
> >>>     thp_swapout
> >>>
> >>>  tools/mm/Makefile                  |   2 +-
> >>>  tools/mm/thp_swap_allocator_test.c | 233 +++++++++++++++++++++++++++++
> >>>  2 files changed, 234 insertions(+), 1 deletion(-)
> >>>  create mode 100644 tools/mm/thp_swap_allocator_test.c
> >>>
> >>
> >

Thanks
Barry


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 0/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout
  2024-06-25  0:11       ` Barry Song
@ 2024-06-25  8:11         ` Ryan Roberts
  2024-06-27  0:02           ` Barry Song
  0 siblings, 1 reply; 16+ messages in thread
From: Ryan Roberts @ 2024-06-25  8:11 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, chrisl, linux-mm, david, hughd, kaleshsingh, kasong,
	linux-kernel, v-songbaohua, ying.huang

On 25/06/2024 01:11, Barry Song wrote:
> On Mon, Jun 24, 2024 at 10:35 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> On 24/06/2024 09:42, Barry Song wrote:
>>> On Mon, Jun 24, 2024 at 8:26 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>>>
>>>> On 22/06/2024 08:12, Barry Song wrote:
>>>>> From: Barry Song <v-songbaohua@oppo.com>
>>>>>
>>>>> -v2:
>>>>>  * add swap-in which can either be aligned or not aligned, by "-a";
>>>>>    Ying;
>>>>>  * move the program to tools/mm; Ryan;
>>>>>  * try to simulate the scenarios swap is full. Chris;
>>>>>
>>>>> -v1:
>>>>>  https://lore.kernel.org/linux-mm/20240620002648.75204-1-21cnbao@gmail.com/
>>>>>
>>>>> I tested Ryan's RFC patchset[1] and Chris's v3[2] using this v2 tool:
>>>>> [1] https://lore.kernel.org/linux-mm/20240618232648.4090299-1-ryan.roberts@arm.com/
>>>>> [2] https://lore.kernel.org/linux-mm/20240614-swap-allocator-v2-0-2a513b4a7f2f@kernel.org/
>>>>>
>>>>> Obviously, we're rarely hitting 100% even in the worst case without "-a" and with
>>>>> "-s," which is good news!
>>>>> If swapin is aligned w/ "-a" and w/o "-s", both Chris's and Ryan's patches show
>>>>> a low fallback ratio though Chris's has the numbers above 0% but Ryan's are 0%
>>>>> (value A).
>>>>>
>>>>> The bad news is that unaligned swapin can significantly increase the fallback ratio,
>>>>> reaching up to 85% for Ryan's patch and 95% for Chris's patchset without "-s." Both
>>>>> approaches approach 100% without "-a" and with "-s" (value B).
>>>>>
>>>>> I believe real workloads should yield a value between A and B. Without "-a," and
>>>>> lacking large folios swap-in, this tool randomly swaps in small folios without
>>>>> considering spatial locality, which is a factor present in real workloads. This
>>>>> typically results in values higher than A and lower than B.
>>>>>
>>>>> Based on the below results, I believe that:
>>>>
>>>> Thanks for putting this together and providing such detailed results!
>>>>
>>>>> 1. We truly require large folio swap-in to achieve comparable results with
>>>>> aligned swap-in(based on the result w/o and w/ "-a")
>>>>
>>>> I certainly agree that as long as we require a high order swap entry to be
>>>> contiguous in the backing store then it looks like we are going to need large
>>>> folio swap-in to prevent enormous fragmentation. I guess Chris's proposed layer
>>>> of indirection to allow pages to be scattered in the backing store would also
>>>> solve the problem? Although, I'm not sure this would work well for zRam?
>>>
>>> The challenge is that we also want to take advantage of improving zsmalloc
>>> to save compressed multi-pages. However, it seems nearly impossible for
>>> zsmalloc to achieve this if an mTHP is scattered rather than stored
>>> contiguously in zRAM.
>>
>> Yes understood. I finally got around to watching the lsfmm videos; I believe the
>> suggested solution with a fs-like approach would be to let the fs handle the
>> compression, which means compressing extents? So even with that approach,
>> presumably it's still valuable to be able to allocate the biggest extents possible.
>>
>>>
>>>>
>>>> Perhaps another way of looking at this is that we are doing a bad job of
>>>> selecting when to use an mTHP and when not to use one in the first place;
>>>> ideally the workload would access the data across the entire mTHP with high
>>>> temporal locality? In that case, we would expect the whole mTHP to be swapped in
>>>> even with the current page-by-page approach. Figuring out this "auto sizing"
>>>> seems like an incredibly complex problem to solve though.
>>>
>>> The good news is that this is exactly what we're implementing in our products,
>>> and it has been deployed on millions of phones.
>>>
>>>   *  Allocate mTHP and swap in the entire mTHP  in do_swap_page();
>>>   *  If mTHP allocation fails, allocate 16 pages to swap-in in do_swap_page();
>>
>> I think we were talking at cross-purposes here. What I meant was that in an ideal
>> world we would only allocate a (64K) mTHP for a page fault if we had confidence
>> (via some heuristic) that the virtual 64K area was likely to always be accessed
>> together, else just allocate a small folio. i.e. choose the folio size to cover
>> a single object from user space's PoV. That would have the side effect that a
>> page-by-page swap-in approach (the current approach in mainline) would still
>> effectively result in swapping in the whole folio and therefore reduce
>> fragmentation in the swap file. (Or thinking about it slightly differently, it
>> would give us confidence to always swap-in a large folio at a time, because we
>> know it's all highly likely to get used in the near future).
>>
>> I suspect this is a moot point though, because divining a suitable heuristic
>> with low overhead is basically impossible.
>>
>>>
>>> To be honest, we haven't noticed a visible increase in memory footprint. This is
>>> likely because Android's anonymous memory exhibits good spatial locality, and
>>> 64KiB strikes a good balance—neither too large nor too small.
>>
>> Indeed.
>>
>>>
>>> The bad news is that I haven't found a way to convince the community this
>>> is universally correct.
>>
>> I think we will want to be pragmatic and at least implement an option (sysfs?)
>> to swap-in a large folio up to a certain size; these test results clearly show
>> the value. And I believe you have real-world data for Android that shows the
>> same thing.
>>
>> Just to creep the scope of this thread slightly, after watching your and Yu
>> Zhao's presentations around TAO, IIUC, even with TAO enabled, 64K folio
>> allocation fallback is still above 50%? I still believe that once the Android
>> filesystems are converted to use large folios that number will improve
>> substantially; especially if the page cache can be convinced to only allocate
>> 64K folios (with 4K fallback). At that point you're predominantly using 64K
>> folios so intuitively there will be less fragmentation.
> 
> Absolutely agreed. Currently, I reported an allocation fallback rate slightly
> above 50%, but this is not because TAO is ineffective. It's simply because, in
> the test, we set up a conservative 15% virtzone for mTHP. If we increase the
> zone, we would definitely achieve a lower fallback ratio. However, the issue
> arises when we need a large number of small folios (for example, for the page
> cache), because they might suffer. Still, we should be able to increase the
> percentage of the virtzone after some fine-tuning, as the report was based on
> an initial test to demonstrate that TAO can provide guaranteed mTHP coverage.
> 
> If we can somehow unify the mTHP size for both the page cache and anon, things
> might improve.

Indeed. And that implies we might need extra controls for the page cache, which
I don't think Willy will be a fan of. It would be good to get some fragmentation
data for Android with a file system that supports large folios, both with the
page cache folio allocation scheme as it is today, and constrained to 64K and
4K. Rather than waiting for all the Android file systems to land support for
large folios, is it possible to hand roll all the Android partitions as XFS?
I've done that in the past for the user data partition at least.

> 
> Xiang promised to deliver EROFS large folio support. If we also get this in
> f2fs, things will be quite different.

Excellent! So Oppo is using only erofs and f2fs? What about ext4? And all the
ancillary things like fscrypt and fsverity, etc? (I've been hand waving a bit to
this point, but it would be good to build a full list of all the components that
need large folio support for large folio file-backed memory to be viable on
Android, if you can help enumerate that?)

> 
>>
>> But allocations by the page cache today start at 16K and increment by 2 orders
>> for every new readahead IIRC. So we end up with lots of large folio sizes, and
>> presumably the potential for lots of fallbacks.
>>
>> All of this is just to suggest that we may end up wanting controls to specify
>> which folio sizes the page cache can attempt to use. At that point, adding
>> similar controls for swap-in doesn't feel unreasonable to me.
>>
>> Just my 2 cents!
>>
>>>
>>>>
>>>>> 2. We need a method to prevent small folios from scattering indiscriminately
>>>>> (based on the result "-a -s")
>>>>
>>>> I'm confused by this statement; as I understand it, both my and Chris's patches
>>>> already try to do this. Certainly for mine, when searching for order-0 space, I
>>>> search the non-full order-0 clusters first (just like for other orders).
>>>> Although for order-0 I will still fall back to searching any cluster if no space
>>>> is found in an order-0 cluster. What more can we do?
>>>>
>>>> When run against your v1 of the tool with "-s" (v1 always implicitly behaves as
>>>> if "-a" is specified, right?) my patch gives 0% fallback. So what's the
>>>> difference in v2 that causes higher fallback rate? Possibly just that
>>>> MEMSIZE_SMALLFOLIO has grown by 3MB so that the total memory matches the swap
>>>> size (64M)?
>>>
>>> Exactly. From my understanding, we've reached a point where small folios are
>>> struggling to find swap slots. Note that I always swap out mTHP before swapping
>>> out small folios. Additionally, I have already swapped in 1MB small folios
>>> before swapping out, which means zRAM has 1MB-4KB of redundant space available
>>> for mTHP to swap out.
>>>
>>>>
>>>> Thanks,
>>>> Ryan
>>>>
>>>>>
>>>>> *
>>>>> *  Test results on Ryan's patchset:
>>>>> *
>>>>>
>>>>> 1. w/ -a
>>>>> ./thp_swap_allocator_test -a
>>>>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 4: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> ...
>>>>> Iteration 100: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>
>>>>> 2. w/o -a
>>>>> ./thp_swap_allocator_test
>>>>>
>>>>> Iteration 1: swpout inc: 208, swpout fallback inc: 25, Fallback percentage: 10.73%
>>>>> Iteration 2: swpout inc: 118, swpout fallback inc: 114, Fallback percentage: 49.14%
>>>>> Iteration 3: swpout inc: 63, swpout fallback inc: 163, Fallback percentage: 72.12%
>>>>> Iteration 4: swpout inc: 45, swpout fallback inc: 178, Fallback percentage: 79.82%
>>>>> Iteration 5: swpout inc: 42, swpout fallback inc: 184, Fallback percentage: 81.42%
>>>>> Iteration 6: swpout inc: 31, swpout fallback inc: 193, Fallback percentage: 86.16%
>>>>> Iteration 7: swpout inc: 27, swpout fallback inc: 201, Fallback percentage: 88.16%
>>>>> Iteration 8: swpout inc: 30, swpout fallback inc: 198, Fallback percentage: 86.84%
>>>>> Iteration 9: swpout inc: 32, swpout fallback inc: 194, Fallback percentage: 85.84%
>>>>> ...
>>>>> Iteration 91: swpout inc: 26, swpout fallback inc: 194, Fallback percentage: 88.18%
>>>>> Iteration 92: swpout inc: 35, swpout fallback inc: 196, Fallback percentage: 84.85%
>>>>> Iteration 93: swpout inc: 33, swpout fallback inc: 191, Fallback percentage: 85.27%
>>>>> Iteration 94: swpout inc: 26, swpout fallback inc: 193, Fallback percentage: 88.13%
>>>>> Iteration 95: swpout inc: 39, swpout fallback inc: 189, Fallback percentage: 82.89%
>>>>> Iteration 96: swpout inc: 28, swpout fallback inc: 196, Fallback percentage: 87.50%
>>>>> Iteration 97: swpout inc: 25, swpout fallback inc: 194, Fallback percentage: 88.58%
>>>>> Iteration 98: swpout inc: 31, swpout fallback inc: 196, Fallback percentage: 86.34%
>>>>> Iteration 99: swpout inc: 32, swpout fallback inc: 202, Fallback percentage: 86.32%
>>>>> Iteration 100: swpout inc: 33, swpout fallback inc: 195, Fallback percentage: 85.53%
>>>>>
>>>>> 3. w/ -a and -s
>>>>> ./thp_swap_allocator_test -a -s
>>>>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 2: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 3: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 4: swpout inc: 220, swpout fallback inc: 6, Fallback percentage: 2.65%
>>>>> Iteration 5: swpout inc: 206, swpout fallback inc: 16, Fallback percentage: 7.21%
>>>>> Iteration 6: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 7: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 8: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 9: swpout inc: 217, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 10: swpout inc: 224, swpout fallback inc: 3, Fallback percentage: 1.32%
>>>>> Iteration 11: swpout inc: 211, swpout fallback inc: 12, Fallback percentage: 5.38%
>>>>> Iteration 12: swpout inc: 200, swpout fallback inc: 32, Fallback percentage: 13.79%
>>>>> Iteration 13: swpout inc: 189, swpout fallback inc: 29, Fallback percentage: 13.30%
>>>>> Iteration 14: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
>>>>> Iteration 15: swpout inc: 198, swpout fallback inc: 27, Fallback percentage: 12.00%
>>>>> Iteration 16: swpout inc: 201, swpout fallback inc: 17, Fallback percentage: 7.80%
>>>>> Iteration 17: swpout inc: 206, swpout fallback inc: 6, Fallback percentage: 2.83%
>>>>> Iteration 18: swpout inc: 220, swpout fallback inc: 14, Fallback percentage: 5.98%
>>>>> Iteration 19: swpout inc: 181, swpout fallback inc: 45, Fallback percentage: 19.91%
>>>>> Iteration 20: swpout inc: 223, swpout fallback inc: 8, Fallback percentage: 3.46%
>>>>> Iteration 21: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
>>>>> Iteration 22: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
>>>>> Iteration 23: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 24: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 25: swpout inc: 214, swpout fallback inc: 1, Fallback percentage: 0.47%
>>>>> Iteration 26: swpout inc: 229, swpout fallback inc: 1, Fallback percentage: 0.43%
>>>>> Iteration 27: swpout inc: 214, swpout fallback inc: 5, Fallback percentage: 2.28%
>>>>> Iteration 28: swpout inc: 211, swpout fallback inc: 15, Fallback percentage: 6.64%
>>>>> Iteration 29: swpout inc: 188, swpout fallback inc: 40, Fallback percentage: 17.54%
>>>>> Iteration 30: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
>>>>> Iteration 31: swpout inc: 215, swpout fallback inc: 10, Fallback percentage: 4.44%
>>>>> Iteration 32: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
>>>>> Iteration 33: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 34: swpout inc: 218, swpout fallback inc: 10, Fallback percentage: 4.39%
>>>>> Iteration 35: swpout inc: 203, swpout fallback inc: 30, Fallback percentage: 12.88%
>>>>> Iteration 36: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
>>>>> Iteration 37: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
>>>>> Iteration 38: swpout inc: 193, swpout fallback inc: 28, Fallback percentage: 12.67%
>>>>> Iteration 39: swpout inc: 210, swpout fallback inc: 20, Fallback percentage: 8.70%
>>>>> Iteration 40: swpout inc: 223, swpout fallback inc: 5, Fallback percentage: 2.19%
>>>>> Iteration 41: swpout inc: 224, swpout fallback inc: 7, Fallback percentage: 3.03%
>>>>> Iteration 42: swpout inc: 200, swpout fallback inc: 23, Fallback percentage: 10.31%
>>>>> Iteration 43: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
>>>>> Iteration 44: swpout inc: 206, swpout fallback inc: 18, Fallback percentage: 8.04%
>>>>> Iteration 45: swpout inc: 210, swpout fallback inc: 11, Fallback percentage: 4.98%
>>>>> Iteration 46: swpout inc: 204, swpout fallback inc: 19, Fallback percentage: 8.52%
>>>>> Iteration 47: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 48: swpout inc: 219, swpout fallback inc: 2, Fallback percentage: 0.90%
>>>>> Iteration 49: swpout inc: 212, swpout fallback inc: 6, Fallback percentage: 2.75%
>>>>> Iteration 50: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
>>>>> Iteration 51: swpout inc: 190, swpout fallback inc: 36, Fallback percentage: 15.93%
>>>>> Iteration 52: swpout inc: 212, swpout fallback inc: 17, Fallback percentage: 7.42%
>>>>> Iteration 53: swpout inc: 179, swpout fallback inc: 43, Fallback percentage: 19.37%
>>>>> Iteration 54: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 55: swpout inc: 224, swpout fallback inc: 2, Fallback percentage: 0.88%
>>>>> Iteration 56: swpout inc: 220, swpout fallback inc: 8, Fallback percentage: 3.51%
>>>>> Iteration 57: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
>>>>> Iteration 58: swpout inc: 213, swpout fallback inc: 6, Fallback percentage: 2.74%
>>>>> Iteration 59: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
>>>>> Iteration 60: swpout inc: 216, swpout fallback inc: 14, Fallback percentage: 6.09%
>>>>> Iteration 61: swpout inc: 183, swpout fallback inc: 34, Fallback percentage: 15.67%
>>>>> Iteration 62: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
>>>>> Iteration 63: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
>>>>> Iteration 64: swpout inc: 210, swpout fallback inc: 15, Fallback percentage: 6.67%
>>>>> Iteration 65: swpout inc: 178, swpout fallback inc: 48, Fallback percentage: 21.24%
>>>>> Iteration 66: swpout inc: 188, swpout fallback inc: 30, Fallback percentage: 13.76%
>>>>> Iteration 67: swpout inc: 193, swpout fallback inc: 29, Fallback percentage: 13.06%
>>>>> Iteration 68: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
>>>>> Iteration 69: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
>>>>> Iteration 70: swpout inc: 204, swpout fallback inc: 15, Fallback percentage: 6.85%
>>>>> Iteration 71: swpout inc: 180, swpout fallback inc: 45, Fallback percentage: 20.00%
>>>>> Iteration 72: swpout inc: 210, swpout fallback inc: 21, Fallback percentage: 9.09%
>>>>> Iteration 73: swpout inc: 216, swpout fallback inc: 7, Fallback percentage: 3.14%
>>>>> Iteration 74: swpout inc: 209, swpout fallback inc: 19, Fallback percentage: 8.33%
>>>>> Iteration 75: swpout inc: 222, swpout fallback inc: 7, Fallback percentage: 3.06%
>>>>> Iteration 76: swpout inc: 212, swpout fallback inc: 14, Fallback percentage: 6.19%
>>>>> Iteration 77: swpout inc: 188, swpout fallback inc: 41, Fallback percentage: 17.90%
>>>>> Iteration 78: swpout inc: 198, swpout fallback inc: 17, Fallback percentage: 7.91%
>>>>> Iteration 79: swpout inc: 209, swpout fallback inc: 16, Fallback percentage: 7.11%
>>>>> Iteration 80: swpout inc: 182, swpout fallback inc: 41, Fallback percentage: 18.39%
>>>>> Iteration 81: swpout inc: 217, swpout fallback inc: 1, Fallback percentage: 0.46%
>>>>> Iteration 82: swpout inc: 225, swpout fallback inc: 3, Fallback percentage: 1.32%
>>>>> Iteration 83: swpout inc: 222, swpout fallback inc: 8, Fallback percentage: 3.48%
>>>>> Iteration 84: swpout inc: 201, swpout fallback inc: 21, Fallback percentage: 9.46%
>>>>> Iteration 85: swpout inc: 211, swpout fallback inc: 3, Fallback percentage: 1.40%
>>>>> Iteration 86: swpout inc: 209, swpout fallback inc: 14, Fallback percentage: 6.28%
>>>>> Iteration 87: swpout inc: 181, swpout fallback inc: 42, Fallback percentage: 18.83%
>>>>> Iteration 88: swpout inc: 223, swpout fallback inc: 4, Fallback percentage: 1.76%
>>>>> Iteration 89: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
>>>>> Iteration 90: swpout inc: 192, swpout fallback inc: 33, Fallback percentage: 14.67%
>>>>> Iteration 91: swpout inc: 184, swpout fallback inc: 31, Fallback percentage: 14.42%
>>>>> Iteration 92: swpout inc: 201, swpout fallback inc: 32, Fallback percentage: 13.73%
>>>>> Iteration 93: swpout inc: 181, swpout fallback inc: 40, Fallback percentage: 18.10%
>>>>> Iteration 94: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
>>>>> Iteration 95: swpout inc: 198, swpout fallback inc: 25, Fallback percentage: 11.21%
>>>>> Iteration 96: swpout inc: 205, swpout fallback inc: 22, Fallback percentage: 9.69%
>>>>> Iteration 97: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
>>>>> Iteration 98: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
>>>>> Iteration 99: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
>>>>> Iteration 100: swpout inc: 195, swpout fallback inc: 34, Fallback percentage: 14.85%
>>>>>
>>>>> 4. w/o -a and w/ -s
>>>>> thp_swap_allocator_test  -s
>>>>> Iteration 1: swpout inc: 173, swpout fallback inc: 60, Fallback percentage: 25.75%
>>>>> Iteration 2: swpout inc: 85, swpout fallback inc: 147, Fallback percentage: 63.36%
>>>>> Iteration 3: swpout inc: 39, swpout fallback inc: 195, Fallback percentage: 83.33%
>>>>> Iteration 4: swpout inc: 13, swpout fallback inc: 220, Fallback percentage: 94.42%
>>>>> Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
>>>>> Iteration 6: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
>>>>> Iteration 7: swpout inc: 6, swpout fallback inc: 217, Fallback percentage: 97.31%
>>>>> Iteration 8: swpout inc: 6, swpout fallback inc: 215, Fallback percentage: 97.29%
>>>>> Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
>>>>> Iteration 10: swpout inc: 1, swpout fallback inc: 229, Fallback percentage: 99.57%
>>>>> Iteration 11: swpout inc: 2, swpout fallback inc: 216, Fallback percentage: 99.08%
>>>>> Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
>>>>> Iteration 13: swpout inc: 4, swpout fallback inc: 211, Fallback percentage: 98.14%
>>>>> Iteration 14: swpout inc: 1, swpout fallback inc: 221, Fallback percentage: 99.55%
>>>>> Iteration 15: swpout inc: 2, swpout fallback inc: 223, Fallback percentage: 99.11%
>>>>> Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
>>>>> Iteration 17: swpout inc: 2, swpout fallback inc: 231, Fallback percentage: 99.14%
>>>>> ...
>>>>>
>>>>> *
>>>>> *  Test results on Chris's v3 patchset:
>>>>> *
>>>>> 1. w/ -a
>>>>> ./thp_swap_allocator_test -a
>>>>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 4: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
>>>>> Iteration 5: swpout inc: 215, swpout fallback inc: 12, Fallback percentage: 5.29%
>>>>> Iteration 6: swpout inc: 213, swpout fallback inc: 14, Fallback percentage: 6.17%
>>>>> Iteration 7: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
>>>>> Iteration 8: swpout inc: 193, swpout fallback inc: 33, Fallback percentage: 14.60%
>>>>> Iteration 9: swpout inc: 214, swpout fallback inc: 13, Fallback percentage: 5.73%
>>>>> Iteration 10: swpout inc: 199, swpout fallback inc: 25, Fallback percentage: 11.16%
>>>>> Iteration 11: swpout inc: 208, swpout fallback inc: 14, Fallback percentage: 6.31%
>>>>> Iteration 12: swpout inc: 203, swpout fallback inc: 31, Fallback percentage: 13.25%
>>>>> Iteration 13: swpout inc: 192, swpout fallback inc: 25, Fallback percentage: 11.52%
>>>>> Iteration 14: swpout inc: 193, swpout fallback inc: 36, Fallback percentage: 15.72%
>>>>> Iteration 15: swpout inc: 188, swpout fallback inc: 33, Fallback percentage: 14.93%
>>>>> ...
>>>>>
>>>>> It seems Chris's approach can be negatively affected even by aligned swapin:
>>>>> its fallback ratio is low but not 0%, while Ryan's patchset doesn't have
>>>>> this issue.
>>>>>
>>>>> 2. w/o -a
>>>>> ./thp_swap_allocator_test
>>>>> Iteration 1: swpout inc: 209, swpout fallback inc: 24, Fallback percentage: 10.30%
>>>>> Iteration 2: swpout inc: 100, swpout fallback inc: 132, Fallback percentage: 56.90%
>>>>> Iteration 3: swpout inc: 43, swpout fallback inc: 183, Fallback percentage: 80.97%
>>>>> Iteration 4: swpout inc: 30, swpout fallback inc: 193, Fallback percentage: 86.55%
>>>>> Iteration 5: swpout inc: 21, swpout fallback inc: 205, Fallback percentage: 90.71%
>>>>> Iteration 6: swpout inc: 10, swpout fallback inc: 214, Fallback percentage: 95.54%
>>>>> Iteration 7: swpout inc: 16, swpout fallback inc: 212, Fallback percentage: 92.98%
>>>>> Iteration 8: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
>>>>> Iteration 9: swpout inc: 6, swpout fallback inc: 220, Fallback percentage: 97.35%
>>>>> Iteration 10: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
>>>>> Iteration 11: swpout inc: 7, swpout fallback inc: 222, Fallback percentage: 96.94%
>>>>> Iteration 12: swpout inc: 8, swpout fallback inc: 212, Fallback percentage: 96.36%
>>>>> ..
>>>>>
>>>>> Ryan's fallback ratio (around 85%) seems to be a little better, but both are
>>>>> much worse than with "-a".
>>>>>
>>>>> 3. w/ -a and -s
>>>>> ./thp_swap_allocator_test -a -s
>>>>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>> Iteration 2: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
>>>>> Iteration 3: swpout inc: 215, swpout fallback inc: 7, Fallback percentage: 3.15%
>>>>> Iteration 4: swpout inc: 210, swpout fallback inc: 16, Fallback percentage: 7.08%
>>>>> Iteration 5: swpout inc: 212, swpout fallback inc: 10, Fallback percentage: 4.50%
>>>>> Iteration 6: swpout inc: 215, swpout fallback inc: 18, Fallback percentage: 7.73%
>>>>> Iteration 7: swpout inc: 181, swpout fallback inc: 43, Fallback percentage: 19.20%
>>>>> Iteration 8: swpout inc: 173, swpout fallback inc: 55, Fallback percentage: 24.12%
>>>>> Iteration 9: swpout inc: 163, swpout fallback inc: 54, Fallback percentage: 24.88%
>>>>> Iteration 10: swpout inc: 168, swpout fallback inc: 59, Fallback percentage: 25.99%
>>>>> Iteration 11: swpout inc: 154, swpout fallback inc: 69, Fallback percentage: 30.94%
>>>>> Iteration 12: swpout inc: 166, swpout fallback inc: 66, Fallback percentage: 28.45%
>>>>> Iteration 13: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
>>>>> Iteration 14: swpout inc: 158, swpout fallback inc: 68, Fallback percentage: 30.09%
>>>>> Iteration 15: swpout inc: 168, swpout fallback inc: 57, Fallback percentage: 25.33%
>>>>> Iteration 16: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
>>>>> Iteration 17: swpout inc: 163, swpout fallback inc: 49, Fallback percentage: 23.11%
>>>>> Iteration 18: swpout inc: 172, swpout fallback inc: 62, Fallback percentage: 26.50%
>>>>> Iteration 19: swpout inc: 183, swpout fallback inc: 43, Fallback percentage: 19.03%
>>>>> Iteration 20: swpout inc: 158, swpout fallback inc: 73, Fallback percentage: 31.60%
>>>>> Iteration 21: swpout inc: 147, swpout fallback inc: 81, Fallback percentage: 35.53%
>>>>> Iteration 22: swpout inc: 140, swpout fallback inc: 86, Fallback percentage: 38.05%
>>>>> Iteration 23: swpout inc: 144, swpout fallback inc: 79, Fallback percentage: 35.43%
>>>>> Iteration 24: swpout inc: 132, swpout fallback inc: 101, Fallback percentage: 43.35%
>>>>> Iteration 25: swpout inc: 133, swpout fallback inc: 82, Fallback percentage: 38.14%
>>>>> Iteration 26: swpout inc: 152, swpout fallback inc: 78, Fallback percentage: 33.91%
>>>>> Iteration 27: swpout inc: 138, swpout fallback inc: 81, Fallback percentage: 36.99%
>>>>> Iteration 28: swpout inc: 152, swpout fallback inc: 74, Fallback percentage: 32.74%
>>>>> Iteration 29: swpout inc: 153, swpout fallback inc: 75, Fallback percentage: 32.89%
>>>>> Iteration 30: swpout inc: 151, swpout fallback inc: 74, Fallback percentage: 32.89%
>>>>> ...
>>>>>
>>>>> Chris's approach appears to be more susceptible to negative effects from
>>>>> small folios.
>>>>>
>>>>> 4. w/o -a and w/ -s
>>>>> ./thp_swap_allocator_test -s
>>>>> Iteration 1: swpout inc: 183, swpout fallback inc: 50, Fallback percentage: 21.46%
>>>>> Iteration 2: swpout inc: 75, swpout fallback inc: 157, Fallback percentage: 67.67%
>>>>> Iteration 3: swpout inc: 33, swpout fallback inc: 201, Fallback percentage: 85.90%
>>>>> Iteration 4: swpout inc: 11, swpout fallback inc: 222, Fallback percentage: 95.28%
>>>>> Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
>>>>> Iteration 6: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
>>>>> Iteration 7: swpout inc: 2, swpout fallback inc: 221, Fallback percentage: 99.10%
>>>>> Iteration 8: swpout inc: 4, swpout fallback inc: 217, Fallback percentage: 98.19%
>>>>> Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
>>>>> Iteration 10: swpout inc: 3, swpout fallback inc: 227, Fallback percentage: 98.70%
>>>>> Iteration 11: swpout inc: 1, swpout fallback inc: 217, Fallback percentage: 99.54%
>>>>> Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
>>>>> Iteration 13: swpout inc: 1, swpout fallback inc: 214, Fallback percentage: 99.53%
>>>>> Iteration 14: swpout inc: 2, swpout fallback inc: 220, Fallback percentage: 99.10%
>>>>> Iteration 15: swpout inc: 1, swpout fallback inc: 224, Fallback percentage: 99.56%
>>>>> Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
>>>>> ...
>>>>>
>>>>> Barry Song (1):
>>>>>   tools/mm: Introduce a tool to assess swap entry allocation for
>>>>>     thp_swapout
>>>>>
>>>>>  tools/mm/Makefile                  |   2 +-
>>>>>  tools/mm/thp_swap_allocator_test.c | 233 +++++++++++++++++++++++++++++
>>>>>  2 files changed, 234 insertions(+), 1 deletion(-)
>>>>>  create mode 100644 tools/mm/thp_swap_allocator_test.c
>>>>>
>>>>
>>>
> 
> Thanks
> Barry
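
For anyone reading the per-iteration lines quoted above: the percentages
are consistent with fallback_inc / (swpout_inc + fallback_inc) * 100. The
sketch below is only an illustration of that arithmetic, not the tool's
code; the function names are hypothetical and the print format simply
mirrors the quoted log lines.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Fallback percentage for one iteration, computed from the two deltas
 * printed on each line. Returns 0.0 when nothing was swapped out at all.
 */
static double fallback_percentage(unsigned long swpout_inc,
				  unsigned long fallback_inc)
{
	unsigned long total = swpout_inc + fallback_inc;

	if (total == 0)
		return 0.0;
	return (double)fallback_inc / (double)total * 100.0;
}

/* Print one line in the same shape as the quoted output. */
static void print_iteration(int iter, unsigned long swpout_inc,
			    unsigned long fallback_inc)
{
	double pct = fallback_percentage(swpout_inc, fallback_inc);

	assert(pct >= 0.0 && pct <= 100.0);
	printf("Iteration %d: swpout inc: %lu, swpout fallback inc: %lu, "
	       "Fallback percentage: %.2f%%\n",
	       iter, swpout_inc, fallback_inc, pct);
}
```

For example, 206 successful swap-outs and 16 fallbacks give 16 / 222,
i.e. the 7.21% reported for such an iteration.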



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout
  2024-06-22  7:12 ` [PATCH v2 1/1] " Barry Song
@ 2024-06-25 17:22   ` Kairui Song
  2024-06-25 22:13     ` Barry Song
  2024-07-05  9:31   ` Ryan Roberts
  1 sibling, 1 reply; 16+ messages in thread
From: Kairui Song @ 2024-06-25 17:22 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, chrisl, linux-mm, ryan.roberts, david, hughd, kaleshsingh,
	linux-kernel, v-songbaohua, ying.huang

On Sat, Jun 22, 2024 at 3:13 PM Barry Song <21cnbao@gmail.com> wrote:
>
> From: Barry Song <v-songbaohua@oppo.com>
>
> Both Ryan and Chris have been utilizing the small test program to aid
> in debugging and identifying issues with swap entry allocation. While
> a real or intricate workload might be more suitable for assessing the
> correctness and effectiveness of the swap allocation policy, a small
> test program presents a simpler means of understanding the problem and
> initially verifying the improvements being made.
>
> Let's endeavor to integrate it into tools/mm. Although it presently
> only accommodates 64KB and 4KB, I'm optimistic that we can expand
> its capabilities to support multiple sizes and simulate more
> complex systems in the future as required.
>
> Basically, we have
> 1. Use MADV_PAGEOUT for rapid swap-out, putting the swap allocation code
> under high exercise in a short time.
> 2. Use MADV_DONTNEED to simulate the behavior of libc and Java heap in
> freeing memory, as well as for munmap, app exits, or OOM killer scenarios.
> This ensures new mTHP is always generated, released or swapped out, similar
> to the behavior on a PC or Android phone where many applications are
> frequently started and terminated.
> 3. Swap in with or without the "-a" option to observe how fragments
> due to swap-in and the incoming swap-in of large folios will impact
> swap-out fallback.
>
> Due to 2, we ensure a certain proportion of mTHP. Similarly, because
> of 3, we maintain a certain proportion of small folios, as we don't
> support large folios swap-in, meaning any swap-in will immediately
> result in small folios. Therefore, with both 2 and 3, we automatically
> achieve a system containing both mTHP and small folios. Additionally,
> 1 provides the ability to continuously swap them out.
>
> We can also use "-s" to add a dedicated small folios memory area.
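
The three steps above can be sketched roughly as follows. This is a
simplified illustration under stated assumptions, not the code from the
patch: the sizes (16 chunks of 64KB, 4KB base pages), the helper names and
the number of touches are all hypothetical, and the area only receives
64KB folios if hugepages-64kB is enabled as described in the file header.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <linux/mman.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21			/* uapi value, for old libc headers */
#endif

#define PAGE_SZ    4096UL		/* assumed base page size */
#define CHUNK_SIZE (64UL * 1024)	/* one 64KB mTHP-sized chunk */
#define NR_CHUNKS  16UL			/* hypothetical, kept small */
#define AREA_SIZE  (CHUNK_SIZE * NR_CHUNKS)

/* Map an anonymous area whose start is aligned to CHUNK_SIZE. */
static char *map_aligned_area(void)
{
	/* Over-allocate by one chunk so an aligned start always exists. */
	char *raw = mmap(NULL, AREA_SIZE + CHUNK_SIZE, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (raw == MAP_FAILED)
		return NULL;
	return (char *)(((uintptr_t)raw + CHUNK_SIZE - 1) & ~(CHUNK_SIZE - 1));
}

/* Read 'count' random granule-sized ranges, faulting them back in. */
static unsigned long touch_random(char *area, size_t granule, size_t count)
{
	size_t slots = AREA_SIZE / granule;
	unsigned long sum = 0;

	for (size_t i = 0; i < count; i++) {
		char *p = area + ((size_t)rand() % slots) * granule;

		for (size_t off = 0; off < granule; off += PAGE_SZ)
			sum += (unsigned char)p[off];
	}
	return sum;
}

/* One round of populate, swap-out, free, swap-in. Returns 0 on success. */
static int run_iteration(char *area, int aligned_swapin)
{
	size_t granule = aligned_swapin ? CHUNK_SIZE : PAGE_SZ;
	volatile unsigned long sink;
	char *victim;

	memset(area, 0x5a, AREA_SIZE);			/* fault memory in */

	if (madvise(area, AREA_SIZE, MADV_PAGEOUT))	/* step 1: swap out */
		return -1;

	/* step 2: free one random 64KB chunk and its swap entries */
	victim = area + ((size_t)rand() % NR_CHUNKS) * CHUNK_SIZE;
	if (madvise(victim, CHUNK_SIZE, MADV_DONTNEED))
		return -1;

	/* step 3: swap in at 64KB ("-a") or 4KB granularity */
	sink = touch_random(area, granule, 8);
	(void)sink;
	return 0;
}
```

Calling run_iteration() in a loop while sampling the swap-out counters
before and after each round gives the kind of per-iteration deltas shown
in the cover letter.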
>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> ---
>  tools/mm/Makefile                  |   2 +-
>  tools/mm/thp_swap_allocator_test.c | 233 +++++++++++++++++++++++++++++
>  2 files changed, 234 insertions(+), 1 deletion(-)
>  create mode 100644 tools/mm/thp_swap_allocator_test.c
>
> diff --git a/tools/mm/Makefile b/tools/mm/Makefile
> index 7bb03606b9ea..15791c1c5b28 100644
> --- a/tools/mm/Makefile
> +++ b/tools/mm/Makefile
> @@ -3,7 +3,7 @@
>  #
>  include ../scripts/Makefile.include
>
> -BUILD_TARGETS=page-types slabinfo page_owner_sort
> +BUILD_TARGETS=page-types slabinfo page_owner_sort thp_swap_allocator_test
>  INSTALL_TARGETS = $(BUILD_TARGETS) thpmaps
>
>  LIB_DIR = ../lib/api
> diff --git a/tools/mm/thp_swap_allocator_test.c b/tools/mm/thp_swap_allocator_test.c
> new file mode 100644
> index 000000000000..a363bdde55f0
> --- /dev/null
> +++ b/tools/mm/thp_swap_allocator_test.c
> @@ -0,0 +1,233 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * thp_swap_allocator_test
> + *
> + * The purpose of this test program is helping check if THP swpout
> + * can correctly get swap slots to swap out as a whole instead of
> + * being split. It randomly releases swap entries through madvise
> + * DONTNEED and swapin/out on two memory areas: a memory area for
> + * 64KB THP and the other area for small folios. The second memory
> + * can be enabled by "-s".
> + * Before running the program, we need to setup a zRAM or similar
> + * swap device by:
> + *  echo lzo > /sys/block/zram0/comp_algorithm
> + *  echo 64M > /sys/block/zram0/disksize
> + *  echo never > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
> + *  echo always > /sys/kernel/mm/transparent_hugepage/hugepages-64kB/enabled
> + *  mkswap /dev/zram0
> + *  swapon /dev/zram0
> + * The expected result should be 0% anon swpout fallback ratio w/ or
> + * w/o "-s".
> + *
> + * Author(s): Barry Song <v-songbaohua@oppo.com>
> + */
> +
> +#define _GNU_SOURCE
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <unistd.h>
> +#include <string.h>
> +#include <sys/mman.h>

Hi Barry,

I found a small issue while testing your tool. For better
compatibility, I think you need to include <linux/mman.h>; without it
I get the following error (with glibc-headers-2.28-236 on an el8
system):

thp_swap_allocator_test.c:161:30: error: ‘MADV_PAGEOUT’ undeclared
(first use in this function); did you mean ‘MADV_RANDOM’?
  madvise(mem1, MEMSIZE_MTHP, MADV_PAGEOUT);
                              ^~~~~~~~~~~~

Other in-tree test tools using this flag also include <linux/mman.h>.
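
A minimal sketch of what I mean, assuming the fix is just the extra
include. The #ifndef fallback is optional and is my own addition for very
old header sets; 21 is the value used in
include/uapi/asm-generic/mman-common.h. The pageout_range() wrapper is a
hypothetical helper, only there to show the flag being used.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <linux/mman.h>		/* provides MADV_PAGEOUT on older glibc */

/* Optional fallback in case neither header defines the flag. */
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

/*
 * Ask the kernel to reclaim (swap out) the given page-aligned range.
 * Returns 0 on success, -1 with errno set otherwise (for example
 * EINVAL on kernels that predate MADV_PAGEOUT).
 */
static int pageout_range(void *addr, size_t len)
{
	return madvise(addr, len, MADV_PAGEOUT);
}
```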


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout
  2024-06-25 17:22   ` Kairui Song
@ 2024-06-25 22:13     ` Barry Song
  0 siblings, 0 replies; 16+ messages in thread
From: Barry Song @ 2024-06-25 22:13 UTC (permalink / raw)
  To: Kairui Song
  Cc: akpm, chrisl, linux-mm, ryan.roberts, david, hughd, kaleshsingh,
	linux-kernel, v-songbaohua, ying.huang

On Wed, Jun 26, 2024 at 5:22 AM Kairui Song <ryncsn@gmail.com> wrote:
>
> On Sat, Jun 22, 2024 at 3:13 PM Barry Song <21cnbao@gmail.com> wrote:
> >
> > From: Barry Song <v-songbaohua@oppo.com>
> >
> > Both Ryan and Chris have been utilizing the small test program to aid
> > in debugging and identifying issues with swap entry allocation. While
> > a real or intricate workload might be more suitable for assessing the
> > correctness and effectiveness of the swap allocation policy, a small
> > test program presents a simpler means of understanding the problem and
> > initially verifying the improvements being made.
> >
> > Let's endeavor to integrate it into tools/mm. Although it presently
> > only accommodates 64KB and 4KB, I'm optimistic that we can expand
> > its capabilities to support multiple sizes and simulate more
> > complex systems in the future as required.
> >
> > Basically, we have
> > 1. Use MADV_PAGEOUT for rapid swap-out, putting the swap allocation code
> > under high exercise in a short time.
> > 2. Use MADV_DONTNEED to simulate the behavior of libc and Java heap in
> > freeing memory, as well as for munmap, app exits, or OOM killer scenarios.
> > This ensures new mTHP is always generated, released or swapped out, similar
> > to the behavior on a PC or Android phone where many applications are
> > frequently started and terminated.
> > 3. Swap in with or without the "-a" option to observe how fragments
> > due to swap-in and the incoming swap-in of large folios will impact
> > swap-out fallback.
> >
> > Due to 2, we ensure a certain proportion of mTHP. Similarly, because
> > of 3, we maintain a certain proportion of small folios, as we don't
> > support large folios swap-in, meaning any swap-in will immediately
> > result in small folios. Therefore, with both 2 and 3, we automatically
> > achieve a system containing both mTHP and small folios. Additionally,
> > 1 provides the ability to continuously swap them out.
> >
> > We can also use "-s" to add a dedicated small folios memory area.
> >
> > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > ---
> >  tools/mm/Makefile                  |   2 +-
> >  tools/mm/thp_swap_allocator_test.c | 233 +++++++++++++++++++++++++++++
> >  2 files changed, 234 insertions(+), 1 deletion(-)
> >  create mode 100644 tools/mm/thp_swap_allocator_test.c
> >
> > diff --git a/tools/mm/Makefile b/tools/mm/Makefile
> > index 7bb03606b9ea..15791c1c5b28 100644
> > --- a/tools/mm/Makefile
> > +++ b/tools/mm/Makefile
> > @@ -3,7 +3,7 @@
> >  #
> >  include ../scripts/Makefile.include
> >
> > -BUILD_TARGETS=page-types slabinfo page_owner_sort
> > +BUILD_TARGETS=page-types slabinfo page_owner_sort thp_swap_allocator_test
> >  INSTALL_TARGETS = $(BUILD_TARGETS) thpmaps
> >
> >  LIB_DIR = ../lib/api
> > diff --git a/tools/mm/thp_swap_allocator_test.c b/tools/mm/thp_swap_allocator_test.c
> > new file mode 100644
> > index 000000000000..a363bdde55f0
> > --- /dev/null
> > +++ b/tools/mm/thp_swap_allocator_test.c
> > @@ -0,0 +1,233 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * thp_swap_allocator_test
> > + *
> > + * The purpose of this test program is helping check if THP swpout
> > + * can correctly get swap slots to swap out as a whole instead of
> > + * being split. It randomly releases swap entries through madvise
> > + * DONTNEED and swapin/out on two memory areas: a memory area for
> > + * 64KB THP and the other area for small folios. The second memory
> > + * can be enabled by "-s".
> > + * Before running the program, we need to setup a zRAM or similar
> > + * swap device by:
> > + *  echo lzo > /sys/block/zram0/comp_algorithm
> > + *  echo 64M > /sys/block/zram0/disksize
> > + *  echo never > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
> > + *  echo always > /sys/kernel/mm/transparent_hugepage/hugepages-64kB/enabled
> > + *  mkswap /dev/zram0
> > + *  swapon /dev/zram0
> > + * The expected result should be 0% anon swpout fallback ratio w/ or
> > + * w/o "-s".
> > + *
> > + * Author(s): Barry Song <v-songbaohua@oppo.com>
> > + */
> > +
> > +#define _GNU_SOURCE
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <unistd.h>
> > +#include <string.h>
> > +#include <sys/mman.h>
>
> Hi Barry,
>
> I found a small issue while testing your tool. For better
> compatibility, I think you need to include <linux/mman.h>; without it
> I get the following error (with glibc-headers-2.28-236 on an el8
> system):
>
> thp_swap_allocator_test.c:161:30: error: ‘MADV_PAGEOUT’ undeclared
> (first use in this function); did you mean ‘MADV_RANDOM’?
>   madvise(mem1, MEMSIZE_MTHP, MADV_PAGEOUT);
>                               ^~~~~~~~~~~~
>
> Other in-tree test tools using this flag also include <linux/mman.h>.

Thanks very much, Kairui.

I tested with toolchains on both arm64 and x86, and neither of them
complained.
I agree <linux/mman.h> is the correct uapi header for MADV_PAGEOUT; the
flag is defined in:

   1     72  arch/alpha/include/uapi/asm/mman.h <<MADV_PAGEOUT>>
             #define MADV_PAGEOUT 21
   2     99  arch/mips/include/uapi/asm/mman.h <<MADV_PAGEOUT>>
             #define MADV_PAGEOUT 21
   3     66  arch/parisc/include/uapi/asm/mman.h <<MADV_PAGEOUT>>
             #define MADV_PAGEOUT 21
   4    107  arch/xtensa/include/uapi/asm/mman.h <<MADV_PAGEOUT>>
             #define MADV_PAGEOUT 21
   5     73  include/uapi/asm-generic/mman-common.h <<MADV_PAGEOUT>>
             #define MADV_PAGEOUT 21


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 0/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout
  2024-06-25  8:11         ` Ryan Roberts
@ 2024-06-27  0:02           ` Barry Song
  2024-06-27  8:50             ` Ryan Roberts
  0 siblings, 1 reply; 16+ messages in thread
From: Barry Song @ 2024-06-27  0:02 UTC (permalink / raw)
  To: Ryan Roberts
  Cc: akpm, chrisl, linux-mm, david, hughd, kaleshsingh, kasong,
	linux-kernel, v-songbaohua, ying.huang

On Tue, Jun 25, 2024 at 8:11 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 25/06/2024 01:11, Barry Song wrote:
> > On Mon, Jun 24, 2024 at 10:35 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
> >>
> >> On 24/06/2024 09:42, Barry Song wrote:
> >>> On Mon, Jun 24, 2024 at 8:26 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
> >>>>
> >>>> On 22/06/2024 08:12, Barry Song wrote:
> >>>>> From: Barry Song <v-songbaohua@oppo.com>
> >>>>>
> >>>>> -v2:
> >>>>>  * add swap-in which can either be aligned or not aligned, by "-a";
> >>>>>    Ying;
> >>>>>  * move the program to tools/mm; Ryan;
> >>>>>  * try to simulate the scenarios swap is full. Chris;
> >>>>>
> >>>>> -v1:
> >>>>>  https://lore.kernel.org/linux-mm/20240620002648.75204-1-21cnbao@gmail.com/
> >>>>>
> >>>>> I tested Ryan's RFC patchset[1] and Chris's v3[2] using this v2 tool:
> >>>>> [1] https://lore.kernel.org/linux-mm/20240618232648.4090299-1-ryan.roberts@arm.com/
> >>>>> [2] https://lore.kernel.org/linux-mm/20240614-swap-allocator-v2-0-2a513b4a7f2f@kernel.org/
> >>>>>
> >>>>> Obviously, we're rarely hitting 100% even in the worst case without "-a" and with
> >>>>> "-s," which is good news!
> >>>>> If swapin is aligned w/ "-a" and w/o "-s", both Chris's and Ryan's patches show
> >>>>> a low fallback ratio though Chris's has the numbers above 0% but Ryan's are 0%
> >>>>> (value A).
> >>>>>
> >>>>> The bad news is that unaligned swapin can significantly increase the fallback ratio,
> >>>>> reaching up to 85% for Ryan's patch and 95% for Chris's patchset without "-s." Both
> >>>>> approaches approach 100% without "-a" and with "-s" (value B).
> >>>>>
> >>>>> I believe real workloads should yield a value between A and B. Without "-a," and
> >>>>> lacking large folios swap-in, this tool randomly swaps in small folios without
> >>>>> considering spatial locality, which is a factor present in real workloads. This
> >>>>> typically results in values higher than A and lower than B.
> >>>>>
> >>>>> Based on the below results, I believe that:
> >>>>
> >>>> Thanks for putting this together and providing such detailed results!
> >>>>
> >>>>> 1. We truly require large folio swap-in to achieve comparable results with
> >>>>> aligned swap-in(based on the result w/o and w/ "-a")
> >>>>
> >>>> I certainly agree that as long as we require a high order swap entry to be
> >>>> contiguous in the backing store then it looks like we are going to need large
> >>>> folio swap-in to prevent enormous fragmentation. I guess Chris's proposed layer
> >>>> of indirection to allow pages to be scattered in the backing store would also
> >>>> solve the problem? Although, I'm not sure this would work well for zRam?
> >>>
> >>> The challenge is that we also want to take advantage of improving zsmalloc
> >>> to save compressed multi-pages. However, it seems nearly impossible for
> >>> zsmalloc to achieve this if an mTHP is scattered rather than stored
> >>> contiguously in zRAM.
> >>
> >> Yes understood. I finally got around to watching the lsfmm videos; I believe the
> >> suggested solution with a fs-like approach would be to let the fs handle the
> >> compression, which means compressing extents? So even with that approach,
> >> presumably it's still valuable to be able to allocate the biggest extents possible.
> >>
> >>>
> >>>>
> >>>> Perhaps another way of looking at this is that we are doing a bad job of
> >>>> selecting when to use an mTHP and when not to use one in the first place;
> >>>> ideally the workload would access the data across the entire mTHP with high
> >>>> temporal locality? In that case, we would expect the whole mTHP to be swapped in
> >>>> even with the current page-by-page approach. Figuring out this "auto sizing"
> >>>> seems like an incredibly complex problem to solve though.
> >>>
> >>> The good news is that this is exactly what we're implementing in our products,
> >>> and it has been deployed on millions of phones.
> >>>
> >>>   *  Allocate mTHP and swap in the entire mTHP  in do_swap_page();
> >>>   *  If mTHP allocation fails, allocate 16 pages to swap-in in do_swap_page();
> >>
> >> I think we were talking at cross-purposes here. What I meant was that in an ideal
> >> world we would only allocate a (64K) mTHP for a page fault if we had confidence
> >> (via some heuristic) that the virtual 64K area was likely to always be accessed
> >> together, else just allocate a small folio. i.e. choose the folio size to cover
> >> a single object from user space's PoV. That would have the side effect that a
> >> page-by-page swap-in approach (the current approach in mainline) would still
> >> effectively result in swapping in the whole folio and therefore reduce
> >> fragmentation in the swap file. (Or thinking about it slightly differently, it
> >> would give us confidence to always swap-in a large folio at a time, because we
> >> know it's all highly likely to get used in the near future).
> >>
> >> I suspect this is a moot point though, because divinging a suitable heuristic
> >> with low overhead is basically impossible.
> >>
> >>>
> >>> To be honest, we haven't noticed a visible increase in memory footprint. This is
> >>> likely because Android's anonymous memory exhibits good spatial locality, and
> >>> 64KiB strikes a good balance—neither too large nor too small.
> >>
> >> Indeed.
> >>
> >>>
> >>> The bad news is that I haven't found a way to convince the community this
> >>> is universally correct.
> >>
> >> I think we will want to be pragmatic and at least implement an option (sysfs?)
> >> to swap-in a large folio up to a certain size; These test results clearly show
> >> the value. And I believe you have real-world data for Android that shows the
> >> same thing.
> >>
> >> Just to creep the scope of this thread slightly, after watching your and Yu
> >> Zhao's presentations around TAO, IIUC, even with TAO enabled, 64K folio
> >> allocation fallback is still above 50%? I still believe that once the Android
> >> filesystems are converted to use large folios that number will improve
> >> substantially; especially if the page cache can be convinced to only allocate
> >> 64K folios (with 4K fallback). At that point you're predominantly using 64K
> >> folios so intuitively there will be less fragmentation.
> >
> > Absolutely agreed. I reported an allocation fallback rate slightly above 50%,
> > but this is not because TAO is ineffective. It's simply because, in the test,
> > we set up a conservative 15% virtzone for mTHP. If we increase the zone, we
> > would definitely achieve a lower fallback ratio. However, an issue arises when
> > we need a large number of small folios, for example for the page cache, because
> > they might suffer. Still, we should be able to increase the percentage of the
> > virtzone after some fine-tuning, as the report was based on an initial test to
> > demonstrate that TAO can provide guaranteed mTHP coverage.
> >
> > If we can somehow unify the mTHP size for both the page cache and anon, things
> > might improve.
>
> Indeed. And that implies we might need extra controls for the page cache, which
> I don't think Willy will be a fan of. It would be good to get some fragmentation
> data for Android with a file system that supports large folios, both with the
> page cache folio allocation scheme as it is today, and constrained to 64K and
> 4K. Rather than waiting for all the Android file systems to land support for
> large folios, is it possible to hand roll all the Android partitions as XFS?
> I've done that in the past for the user data partition at least.

It might be a good idea to evaluate page cache large folios without waiting for
EROFS and F2FS. I need to do more research on deploying XFS on Android before
getting back to you.

>
> >
> > Xiang promised to deliver EROFS large folio support. If we also get this in
> > f2fs, things will be quite different.
>
> Excellent! So Oppo is using only erofs and f2fs? What about ext4? And all the
> ancillary things like fscrypt and fsverity, etc? (I've been hand waving a bit to
> this point, but it would be good to build a full list of all the components that
> need large folio support for large folio file-backed memory to be viable on
> Android, if you can help enumerate that?)

We have all of tmpfs, ext4, f2fs, erofs, vfat and fuse, each used for different
directories.

>
> >
> >>
> >> But allocations by the page cache today start at 16K and increment by 2 orders
> >> for every new readahead IIRC. So we end up with lots of large folio sizes, and
> >> presumably the potential for lots of fallbacks.
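
A minimal sketch of the size progression described above, assuming 4K base
pages, a starting order of 2 (16K) and a caller-supplied cap. The names
folio_bytes() and next_ra_order() and the max_order parameter are illustrative
only; this is not the kernel's readahead code:

```c
#include <stddef.h>

/* Bytes covered by a folio of the given order, assuming 4 KiB base pages. */
size_t folio_bytes(unsigned int order)
{
	return (size_t)4096 << order;
}

/*
 * Hypothetical model of "start at 16K, grow by 2 orders per readahead":
 * step the order up by 2 and clamp it at max_order.
 */
unsigned int next_ra_order(unsigned int order, unsigned int max_order)
{
	unsigned int next = order + 2;

	return next > max_order ? max_order : next;
}
```

Starting from order 2 with max_order set to 9, successive calls yield orders
4, 6, 8 and 9, i.e. 64K, 256K, 1M and 2M folios, so several distinct folio
sizes end up competing for contiguous memory.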
> >>
> >> All of this is just to suggest that we may end up wanting controls to specify
> >> which folio sizes the page cache can attempt to use. At that point, adding
> >> similar controls for swap-in doesn't feel unreasonable to me.
> >>
> >> Just my 2 cents!
> >>
> >>>
> >>>>
> >>>>> 2. We need a method to prevent small folios from scattering indiscriminately
> >>>>> (based on the result "-a -s")
> >>>>
> >>>> I'm confused by this statement; as I understand it, both my and Chris's patches
> >>>> already try to do this. Certainly for mine, when searching for order-0 space, I
> >>>> search the non-full order-0 clusters first (just like for other orders).
> >>>> Although for order-0 I will still fallback to searching any cluster if no space
> >>>> is found in an order-0 cluster. What more can we do?
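
The order-0 search policy described above can be sketched as a two-pass scan.
This is an illustration only: struct cluster, its fields and
pick_cluster_order0() are hypothetical names, not the data structures used in
either patchset:

```c
#include <stddef.h>

/* Hypothetical, simplified view of a swap cluster. */
struct cluster {
	unsigned int order;	/* order this cluster is currently assigned to */
	unsigned int free;	/* number of free swap slots left */
};

/*
 * Return the index of the cluster to allocate an order-0 slot from,
 * or -1 if no cluster has free space.
 */
int pick_cluster_order0(const struct cluster *c, size_t n)
{
	size_t i;

	/* Pass 1: prefer non-full clusters already used for order-0. */
	for (i = 0; i < n; i++)
		if (c[i].order == 0 && c[i].free > 0)
			return (int)i;

	/* Pass 2: fall back to any cluster that still has free space. */
	for (i = 0; i < n; i++)
		if (c[i].free > 0)
			return (int)i;

	return -1;
}
```

Pass 2 is where a small folio can land in a cluster otherwise used for mTHP,
which is the scattering that point 2 is concerned about.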
> >>>>
> >>>> When run against your v1 of the tool with "-s" (v1 always implicitly behaves as
> >>>> if "-a" is specified, right?) my patch gives 0% fallback. So what's the
> >>>> difference in v2 that causes higher fallback rate? Possibly just that
> >>>> MEMSIZE_SMALLFOLIO has grown by 3MB so that the total memory matches the swap
> >>>> size (64M)?
> >>>
> >>> Exactly. From my understanding, we've reached a point where small folios are
> >>> struggling to find swap slots. Note that I always swap out mTHP before swapping
> >>> out small folios. Additionally, I have already swapped in 1MB of small folios
> >>> before swapping out, which means zRAM has 1MB-4KB of redundant space available
> >>> for mTHP to swap out.
> >>>
> >>>>
> >>>> Thanks,
> >>>> Ryan
> >>>>
> >>>>>
> >>>>> *
> >>>>> *  Test results on Ryan's patchset:
> >>>>> *
> >>>>>
> >>>>> 1. w/ -a
> >>>>> ./thp_swap_allocator_test -a
> >>>>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 4: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> ...
> >>>>> Iteration 100: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>>
> >>>>> 2. w/o -a
> >>>>> ./thp_swap_allocator_test
> >>>>>
> >>>>> Iteration 1: swpout inc: 208, swpout fallback inc: 25, Fallback percentage: 10.73%
> >>>>> Iteration 2: swpout inc: 118, swpout fallback inc: 114, Fallback percentage: 49.14%
> >>>>> Iteration 3: swpout inc: 63, swpout fallback inc: 163, Fallback percentage: 72.12%
> >>>>> Iteration 4: swpout inc: 45, swpout fallback inc: 178, Fallback percentage: 79.82%
> >>>>> Iteration 5: swpout inc: 42, swpout fallback inc: 184, Fallback percentage: 81.42%
> >>>>> Iteration 6: swpout inc: 31, swpout fallback inc: 193, Fallback percentage: 86.16%
> >>>>> Iteration 7: swpout inc: 27, swpout fallback inc: 201, Fallback percentage: 88.16%
> >>>>> Iteration 8: swpout inc: 30, swpout fallback inc: 198, Fallback percentage: 86.84%
> >>>>> Iteration 9: swpout inc: 32, swpout fallback inc: 194, Fallback percentage: 85.84%
> >>>>> ...
> >>>>> Iteration 91: swpout inc: 26, swpout fallback inc: 194, Fallback percentage: 88.18%
> >>>>> Iteration 92: swpout inc: 35, swpout fallback inc: 196, Fallback percentage: 84.85%
> >>>>> Iteration 93: swpout inc: 33, swpout fallback inc: 191, Fallback percentage: 85.27%
> >>>>> Iteration 94: swpout inc: 26, swpout fallback inc: 193, Fallback percentage: 88.13%
> >>>>> Iteration 95: swpout inc: 39, swpout fallback inc: 189, Fallback percentage: 82.89%
> >>>>> Iteration 96: swpout inc: 28, swpout fallback inc: 196, Fallback percentage: 87.50%
> >>>>> Iteration 97: swpout inc: 25, swpout fallback inc: 194, Fallback percentage: 88.58%
> >>>>> Iteration 98: swpout inc: 31, swpout fallback inc: 196, Fallback percentage: 86.34%
> >>>>> Iteration 99: swpout inc: 32, swpout fallback inc: 202, Fallback percentage: 86.32%
> >>>>> Iteration 100: swpout inc: 33, swpout fallback inc: 195, Fallback percentage: 85.53%
> >>>>>
> >>>>> 3. w/ -a and -s
> >>>>> ./thp_swap_allocator_test -a -s
> >>>>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 2: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 3: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 4: swpout inc: 220, swpout fallback inc: 6, Fallback percentage: 2.65%
> >>>>> Iteration 5: swpout inc: 206, swpout fallback inc: 16, Fallback percentage: 7.21%
> >>>>> Iteration 6: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 7: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 8: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 9: swpout inc: 217, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 10: swpout inc: 224, swpout fallback inc: 3, Fallback percentage: 1.32%
> >>>>> Iteration 11: swpout inc: 211, swpout fallback inc: 12, Fallback percentage: 5.38%
> >>>>> Iteration 12: swpout inc: 200, swpout fallback inc: 32, Fallback percentage: 13.79%
> >>>>> Iteration 13: swpout inc: 189, swpout fallback inc: 29, Fallback percentage: 13.30%
> >>>>> Iteration 14: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
> >>>>> Iteration 15: swpout inc: 198, swpout fallback inc: 27, Fallback percentage: 12.00%
> >>>>> Iteration 16: swpout inc: 201, swpout fallback inc: 17, Fallback percentage: 7.80%
> >>>>> Iteration 17: swpout inc: 206, swpout fallback inc: 6, Fallback percentage: 2.83%
> >>>>> Iteration 18: swpout inc: 220, swpout fallback inc: 14, Fallback percentage: 5.98%
> >>>>> Iteration 19: swpout inc: 181, swpout fallback inc: 45, Fallback percentage: 19.91%
> >>>>> Iteration 20: swpout inc: 223, swpout fallback inc: 8, Fallback percentage: 3.46%
> >>>>> Iteration 21: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
> >>>>> Iteration 22: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
> >>>>> Iteration 23: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 24: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 25: swpout inc: 214, swpout fallback inc: 1, Fallback percentage: 0.47%
> >>>>> Iteration 26: swpout inc: 229, swpout fallback inc: 1, Fallback percentage: 0.43%
> >>>>> Iteration 27: swpout inc: 214, swpout fallback inc: 5, Fallback percentage: 2.28%
> >>>>> Iteration 28: swpout inc: 211, swpout fallback inc: 15, Fallback percentage: 6.64%
> >>>>> Iteration 29: swpout inc: 188, swpout fallback inc: 40, Fallback percentage: 17.54%
> >>>>> Iteration 30: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
> >>>>> Iteration 31: swpout inc: 215, swpout fallback inc: 10, Fallback percentage: 4.44%
> >>>>> Iteration 32: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
> >>>>> Iteration 33: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 34: swpout inc: 218, swpout fallback inc: 10, Fallback percentage: 4.39%
> >>>>> Iteration 35: swpout inc: 203, swpout fallback inc: 30, Fallback percentage: 12.88%
> >>>>> Iteration 36: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
> >>>>> Iteration 37: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
> >>>>> Iteration 38: swpout inc: 193, swpout fallback inc: 28, Fallback percentage: 12.67%
> >>>>> Iteration 39: swpout inc: 210, swpout fallback inc: 20, Fallback percentage: 8.70%
> >>>>> Iteration 40: swpout inc: 223, swpout fallback inc: 5, Fallback percentage: 2.19%
> >>>>> Iteration 41: swpout inc: 224, swpout fallback inc: 7, Fallback percentage: 3.03%
> >>>>> Iteration 42: swpout inc: 200, swpout fallback inc: 23, Fallback percentage: 10.31%
> >>>>> Iteration 43: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
> >>>>> Iteration 44: swpout inc: 206, swpout fallback inc: 18, Fallback percentage: 8.04%
> >>>>> Iteration 45: swpout inc: 210, swpout fallback inc: 11, Fallback percentage: 4.98%
> >>>>> Iteration 46: swpout inc: 204, swpout fallback inc: 19, Fallback percentage: 8.52%
> >>>>> Iteration 47: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 48: swpout inc: 219, swpout fallback inc: 2, Fallback percentage: 0.90%
> >>>>> Iteration 49: swpout inc: 212, swpout fallback inc: 6, Fallback percentage: 2.75%
> >>>>> Iteration 50: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
> >>>>> Iteration 51: swpout inc: 190, swpout fallback inc: 36, Fallback percentage: 15.93%
> >>>>> Iteration 52: swpout inc: 212, swpout fallback inc: 17, Fallback percentage: 7.42%
> >>>>> Iteration 53: swpout inc: 179, swpout fallback inc: 43, Fallback percentage: 19.37%
> >>>>> Iteration 54: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 55: swpout inc: 224, swpout fallback inc: 2, Fallback percentage: 0.88%
> >>>>> Iteration 56: swpout inc: 220, swpout fallback inc: 8, Fallback percentage: 3.51%
> >>>>> Iteration 57: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
> >>>>> Iteration 58: swpout inc: 213, swpout fallback inc: 6, Fallback percentage: 2.74%
> >>>>> Iteration 59: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
> >>>>> Iteration 60: swpout inc: 216, swpout fallback inc: 14, Fallback percentage: 6.09%
> >>>>> Iteration 61: swpout inc: 183, swpout fallback inc: 34, Fallback percentage: 15.67%
> >>>>> Iteration 62: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
> >>>>> Iteration 63: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
> >>>>> Iteration 64: swpout inc: 210, swpout fallback inc: 15, Fallback percentage: 6.67%
> >>>>> Iteration 65: swpout inc: 178, swpout fallback inc: 48, Fallback percentage: 21.24%
> >>>>> Iteration 66: swpout inc: 188, swpout fallback inc: 30, Fallback percentage: 13.76%
> >>>>> Iteration 67: swpout inc: 193, swpout fallback inc: 29, Fallback percentage: 13.06%
> >>>>> Iteration 68: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
> >>>>> Iteration 69: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
> >>>>> Iteration 70: swpout inc: 204, swpout fallback inc: 15, Fallback percentage: 6.85%
> >>>>> Iteration 71: swpout inc: 180, swpout fallback inc: 45, Fallback percentage: 20.00%
> >>>>> Iteration 72: swpout inc: 210, swpout fallback inc: 21, Fallback percentage: 9.09%
> >>>>> Iteration 73: swpout inc: 216, swpout fallback inc: 7, Fallback percentage: 3.14%
> >>>>> Iteration 74: swpout inc: 209, swpout fallback inc: 19, Fallback percentage: 8.33%
> >>>>> Iteration 75: swpout inc: 222, swpout fallback inc: 7, Fallback percentage: 3.06%
> >>>>> Iteration 76: swpout inc: 212, swpout fallback inc: 14, Fallback percentage: 6.19%
> >>>>> Iteration 77: swpout inc: 188, swpout fallback inc: 41, Fallback percentage: 17.90%
> >>>>> Iteration 78: swpout inc: 198, swpout fallback inc: 17, Fallback percentage: 7.91%
> >>>>> Iteration 79: swpout inc: 209, swpout fallback inc: 16, Fallback percentage: 7.11%
> >>>>> Iteration 80: swpout inc: 182, swpout fallback inc: 41, Fallback percentage: 18.39%
> >>>>> Iteration 81: swpout inc: 217, swpout fallback inc: 1, Fallback percentage: 0.46%
> >>>>> Iteration 82: swpout inc: 225, swpout fallback inc: 3, Fallback percentage: 1.32%
> >>>>> Iteration 83: swpout inc: 222, swpout fallback inc: 8, Fallback percentage: 3.48%
> >>>>> Iteration 84: swpout inc: 201, swpout fallback inc: 21, Fallback percentage: 9.46%
> >>>>> Iteration 85: swpout inc: 211, swpout fallback inc: 3, Fallback percentage: 1.40%
> >>>>> Iteration 86: swpout inc: 209, swpout fallback inc: 14, Fallback percentage: 6.28%
> >>>>> Iteration 87: swpout inc: 181, swpout fallback inc: 42, Fallback percentage: 18.83%
> >>>>> Iteration 88: swpout inc: 223, swpout fallback inc: 4, Fallback percentage: 1.76%
> >>>>> Iteration 89: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
> >>>>> Iteration 90: swpout inc: 192, swpout fallback inc: 33, Fallback percentage: 14.67%
> >>>>> Iteration 91: swpout inc: 184, swpout fallback inc: 31, Fallback percentage: 14.42%
> >>>>> Iteration 92: swpout inc: 201, swpout fallback inc: 32, Fallback percentage: 13.73%
> >>>>> Iteration 93: swpout inc: 181, swpout fallback inc: 40, Fallback percentage: 18.10%
> >>>>> Iteration 94: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
> >>>>> Iteration 95: swpout inc: 198, swpout fallback inc: 25, Fallback percentage: 11.21%
> >>>>> Iteration 96: swpout inc: 205, swpout fallback inc: 22, Fallback percentage: 9.69%
> >>>>> Iteration 97: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
> >>>>> Iteration 98: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
> >>>>> Iteration 99: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
> >>>>> Iteration 100: swpout inc: 195, swpout fallback inc: 34, Fallback percentage: 14.85%
> >>>>>
> >>>>> 4. w/o -a and w/ -s
> >>>>> ./thp_swap_allocator_test -s
> >>>>> Iteration 1: swpout inc: 173, swpout fallback inc: 60, Fallback percentage: 25.75%
> >>>>> Iteration 2: swpout inc: 85, swpout fallback inc: 147, Fallback percentage: 63.36%
> >>>>> Iteration 3: swpout inc: 39, swpout fallback inc: 195, Fallback percentage: 83.33%
> >>>>> Iteration 4: swpout inc: 13, swpout fallback inc: 220, Fallback percentage: 94.42%
> >>>>> Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
> >>>>> Iteration 6: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
> >>>>> Iteration 7: swpout inc: 6, swpout fallback inc: 217, Fallback percentage: 97.31%
> >>>>> Iteration 8: swpout inc: 6, swpout fallback inc: 215, Fallback percentage: 97.29%
> >>>>> Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
> >>>>> Iteration 10: swpout inc: 1, swpout fallback inc: 229, Fallback percentage: 99.57%
> >>>>> Iteration 11: swpout inc: 2, swpout fallback inc: 216, Fallback percentage: 99.08%
> >>>>> Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
> >>>>> Iteration 13: swpout inc: 4, swpout fallback inc: 211, Fallback percentage: 98.14%
> >>>>> Iteration 14: swpout inc: 1, swpout fallback inc: 221, Fallback percentage: 99.55%
> >>>>> Iteration 15: swpout inc: 2, swpout fallback inc: 223, Fallback percentage: 99.11%
> >>>>> Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
> >>>>> Iteration 17: swpout inc: 2, swpout fallback inc: 231, Fallback percentage: 99.14%
> >>>>> ...
> >>>>>
> >>>>> *
> >>>>> *  Test results on Chris's v3 patchset:
> >>>>> *
> >>>>> 1. w/ -a
> >>>>> ./thp_swap_allocator_test -a
> >>>>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 4: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
> >>>>> Iteration 5: swpout inc: 215, swpout fallback inc: 12, Fallback percentage: 5.29%
> >>>>> Iteration 6: swpout inc: 213, swpout fallback inc: 14, Fallback percentage: 6.17%
> >>>>> Iteration 7: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
> >>>>> Iteration 8: swpout inc: 193, swpout fallback inc: 33, Fallback percentage: 14.60%
> >>>>> Iteration 9: swpout inc: 214, swpout fallback inc: 13, Fallback percentage: 5.73%
> >>>>> Iteration 10: swpout inc: 199, swpout fallback inc: 25, Fallback percentage: 11.16%
> >>>>> Iteration 11: swpout inc: 208, swpout fallback inc: 14, Fallback percentage: 6.31%
> >>>>> Iteration 12: swpout inc: 203, swpout fallback inc: 31, Fallback percentage: 13.25%
> >>>>> Iteration 13: swpout inc: 192, swpout fallback inc: 25, Fallback percentage: 11.52%
> >>>>> Iteration 14: swpout inc: 193, swpout fallback inc: 36, Fallback percentage: 15.72%
> >>>>> Iteration 15: swpout inc: 188, swpout fallback inc: 33, Fallback percentage: 14.93%
> >>>>> ...
> >>>>>
> >>>>> It seems Chris's approach can be negatively affected even by aligned swapin,
> >>>>> having a low fallback ratio but not 0%, while Ryan's patchset doesn't have
> >>>>> this issue.
> >>>>>
> >>>>> 2. w/o -a
> >>>>> ./thp_swap_allocator_test
> >>>>> Iteration 1: swpout inc: 209, swpout fallback inc: 24, Fallback percentage: 10.30%
> >>>>> Iteration 2: swpout inc: 100, swpout fallback inc: 132, Fallback percentage: 56.90%
> >>>>> Iteration 3: swpout inc: 43, swpout fallback inc: 183, Fallback percentage: 80.97%
> >>>>> Iteration 4: swpout inc: 30, swpout fallback inc: 193, Fallback percentage: 86.55%
> >>>>> Iteration 5: swpout inc: 21, swpout fallback inc: 205, Fallback percentage: 90.71%
> >>>>> Iteration 6: swpout inc: 10, swpout fallback inc: 214, Fallback percentage: 95.54%
> >>>>> Iteration 7: swpout inc: 16, swpout fallback inc: 212, Fallback percentage: 92.98%
> >>>>> Iteration 8: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
> >>>>> Iteration 9: swpout inc: 6, swpout fallback inc: 220, Fallback percentage: 97.35%
> >>>>> Iteration 10: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
> >>>>> Iteration 11: swpout inc: 7, swpout fallback inc: 222, Fallback percentage: 96.94%
> >>>>> Iteration 12: swpout inc: 8, swpout fallback inc: 212, Fallback percentage: 96.36%
> >>>>> ...
> >>>>>
> >>>>> Ryan's fallback ratio (around 85%) seems to be a little better, while both are
> >>>>> much worse than with "-a".
> >>>>>
> >>>>> 3. w/ -a and -s
> >>>>> ./thp_swap_allocator_test -a -s
> >>>>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
> >>>>> Iteration 2: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
> >>>>> Iteration 3: swpout inc: 215, swpout fallback inc: 7, Fallback percentage: 3.15%
> >>>>> Iteration 4: swpout inc: 210, swpout fallback inc: 16, Fallback percentage: 7.08%
> >>>>> Iteration 5: swpout inc: 212, swpout fallback inc: 10, Fallback percentage: 4.50%
> >>>>> Iteration 6: swpout inc: 215, swpout fallback inc: 18, Fallback percentage: 7.73%
> >>>>> Iteration 7: swpout inc: 181, swpout fallback inc: 43, Fallback percentage: 19.20%
> >>>>> Iteration 8: swpout inc: 173, swpout fallback inc: 55, Fallback percentage: 24.12%
> >>>>> Iteration 9: swpout inc: 163, swpout fallback inc: 54, Fallback percentage: 24.88%
> >>>>> Iteration 10: swpout inc: 168, swpout fallback inc: 59, Fallback percentage: 25.99%
> >>>>> Iteration 11: swpout inc: 154, swpout fallback inc: 69, Fallback percentage: 30.94%
> >>>>> Iteration 12: swpout inc: 166, swpout fallback inc: 66, Fallback percentage: 28.45%
> >>>>> Iteration 13: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
> >>>>> Iteration 14: swpout inc: 158, swpout fallback inc: 68, Fallback percentage: 30.09%
> >>>>> Iteration 15: swpout inc: 168, swpout fallback inc: 57, Fallback percentage: 25.33%
> >>>>> Iteration 16: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
> >>>>> Iteration 17: swpout inc: 163, swpout fallback inc: 49, Fallback percentage: 23.11%
> >>>>> Iteration 18: swpout inc: 172, swpout fallback inc: 62, Fallback percentage: 26.50%
> >>>>> Iteration 19: swpout inc: 183, swpout fallback inc: 43, Fallback percentage: 19.03%
> >>>>> Iteration 20: swpout inc: 158, swpout fallback inc: 73, Fallback percentage: 31.60%
> >>>>> Iteration 21: swpout inc: 147, swpout fallback inc: 81, Fallback percentage: 35.53%
> >>>>> Iteration 22: swpout inc: 140, swpout fallback inc: 86, Fallback percentage: 38.05%
> >>>>> Iteration 23: swpout inc: 144, swpout fallback inc: 79, Fallback percentage: 35.43%
> >>>>> Iteration 24: swpout inc: 132, swpout fallback inc: 101, Fallback percentage: 43.35%
> >>>>> Iteration 25: swpout inc: 133, swpout fallback inc: 82, Fallback percentage: 38.14%
> >>>>> Iteration 26: swpout inc: 152, swpout fallback inc: 78, Fallback percentage: 33.91%
> >>>>> Iteration 27: swpout inc: 138, swpout fallback inc: 81, Fallback percentage: 36.99%
> >>>>> Iteration 28: swpout inc: 152, swpout fallback inc: 74, Fallback percentage: 32.74%
> >>>>> Iteration 29: swpout inc: 153, swpout fallback inc: 75, Fallback percentage: 32.89%
> >>>>> Iteration 30: swpout inc: 151, swpout fallback inc: 74, Fallback percentage: 32.89%
> >>>>> ...
> >>>>>
> >>>>> Chris's approach appears to be more susceptible to negative effects from
> >>>>> small folios.
> >>>>>
> >>>>> 4. w/o -a and w/ -s
> >>>>> ./thp_swap_allocator_test -s
> >>>>> Iteration 1: swpout inc: 183, swpout fallback inc: 50, Fallback percentage: 21.46%
> >>>>> Iteration 2: swpout inc: 75, swpout fallback inc: 157, Fallback percentage: 67.67%
> >>>>> Iteration 3: swpout inc: 33, swpout fallback inc: 201, Fallback percentage: 85.90%
> >>>>> Iteration 4: swpout inc: 11, swpout fallback inc: 222, Fallback percentage: 95.28%
> >>>>> Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
> >>>>> Iteration 6: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
> >>>>> Iteration 7: swpout inc: 2, swpout fallback inc: 221, Fallback percentage: 99.10%
> >>>>> Iteration 8: swpout inc: 4, swpout fallback inc: 217, Fallback percentage: 98.19%
> >>>>> Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
> >>>>> Iteration 10: swpout inc: 3, swpout fallback inc: 227, Fallback percentage: 98.70%
> >>>>> Iteration 11: swpout inc: 1, swpout fallback inc: 217, Fallback percentage: 99.54%
> >>>>> Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
> >>>>> Iteration 13: swpout inc: 1, swpout fallback inc: 214, Fallback percentage: 99.53%
> >>>>> Iteration 14: swpout inc: 2, swpout fallback inc: 220, Fallback percentage: 99.10%
> >>>>> Iteration 15: swpout inc: 1, swpout fallback inc: 224, Fallback percentage: 99.56%
> >>>>> Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
> >>>>> ...
> >>>>>
> >>>>> Barry Song (1):
> >>>>>   tools/mm: Introduce a tool to assess swap entry allocation for
> >>>>>     thp_swapout
> >>>>>
> >>>>>  tools/mm/Makefile                  |   2 +-
> >>>>>  tools/mm/thp_swap_allocator_test.c | 233 +++++++++++++++++++++++++++++
> >>>>>  2 files changed, 234 insertions(+), 1 deletion(-)
> >>>>>  create mode 100644 tools/mm/thp_swap_allocator_test.c
> >>>>>
> >>>>
> >>>
> >
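
For reference, the "Fallback percentage" column in the logs above is consistent
with fallback / (swpout + fallback); for example, Iteration 1 of the "w/o -a"
run on Ryan's patchset gives 25 / (208 + 25) = 10.73%. A minimal sketch, where
fallback_percentage() is an illustrative helper and not necessarily how the
tool itself computes the value:

```c
/*
 * Percentage of mTHP swap-out attempts that fell back to small folios.
 * Returns 0.0 when there were no attempts, to avoid dividing by zero.
 */
double fallback_percentage(unsigned long swpout_inc,
			   unsigned long fallback_inc)
{
	unsigned long total = swpout_inc + fallback_inc;

	if (total == 0)
		return 0.0;
	return 100.0 * (double)fallback_inc / (double)total;
}
```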

Thanks
Barry



* Re: [PATCH v2 0/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout
  2024-06-27  0:02           ` Barry Song
@ 2024-06-27  8:50             ` Ryan Roberts
  2024-07-04 23:10               ` Andrew Morton
  0 siblings, 1 reply; 16+ messages in thread
From: Ryan Roberts @ 2024-06-27  8:50 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, chrisl, linux-mm, david, hughd, kaleshsingh, kasong,
	linux-kernel, v-songbaohua, ying.huang

On 27/06/2024 01:02, Barry Song wrote:
> On Tue, Jun 25, 2024 at 8:11 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> On 25/06/2024 01:11, Barry Song wrote:
>>> On Mon, Jun 24, 2024 at 10:35 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>>>
>>>> On 24/06/2024 09:42, Barry Song wrote:
>>>>> On Mon, Jun 24, 2024 at 8:26 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>>>>>
>>>>>> On 22/06/2024 08:12, Barry Song wrote:
>>>>>>> From: Barry Song <v-songbaohua@oppo.com>
>>>>>>>
>>>>>>> -v2:
>>>>>>>  * add swap-in which can either be aligned or not aligned, by "-a";
>>>>>>>    Ying;
>>>>>>>  * move the program to tools/mm; Ryan;
>>>>>>>  * try to simulate scenarios where swap is full. Chris;
>>>>>>>
>>>>>>> -v1:
>>>>>>>  https://lore.kernel.org/linux-mm/20240620002648.75204-1-21cnbao@gmail.com/
>>>>>>>
>>>>>>> I tested Ryan's RFC patchset[1] and Chris's v3[2] using this v2 tool:
>>>>>>> [1] https://lore.kernel.org/linux-mm/20240618232648.4090299-1-ryan.roberts@arm.com/
>>>>>>> [2] https://lore.kernel.org/linux-mm/20240614-swap-allocator-v2-0-2a513b4a7f2f@kernel.org/
>>>>>>>
>>>>>>> Obviously, we're rarely hitting 100% even in the worst case without "-a" and with
>>>>>>> "-s," which is good news!
>>>>>>> If swapin is aligned w/ "-a" and w/o "-s", both Chris's and Ryan's patches show
>>>>>>> a low fallback ratio though Chris's has the numbers above 0% but Ryan's are 0%
>>>>>>> (value A).
>>>>>>>
>>>>>>> The bad news is that unaligned swapin can significantly increase the fallback ratio,
>>>>>>> reaching up to 85% for Ryan's patch and 95% for Chris's patchset without "-s." Both
>>>>>>> approaches approach 100% without "-a" and with "-s" (value B).
>>>>>>>
>>>>>>> I believe real workloads should yield a value between A and B. Without "-a," and
>>>>>>> lacking large folios swap-in, this tool randomly swaps in small folios without
>>>>>>> considering spatial locality, which is a factor present in real workloads. This
>>>>>>> typically results in values higher than A and lower than B.
>>>>>>>
>>>>>>> Based on the below results, I believe that:
>>>>>>
>>>>>> Thanks for putting this together and providing such detailed results!
>>>>>>
>>>>>>> 1. We truly require large folio swap-in to achieve comparable results with
>>>>>>> aligned swap-in (based on the result w/o and w/ "-a")
>>>>>>
>>>>>> I certainly agree that as long as we require a high order swap entry to be
>>>>>> contiguous in the backing store then it looks like we are going to need large
>>>>>> folio swap-in to prevent enormous fragmentation. I guess Chris's proposed layer
>>>>>> of indirection to allow pages to be scattered in the backing store would also
>>>>>> solve the problem? Although, I'm not sure this would work well for zRam?
>>>>>
>>>>> The challenge is that we also want to take advantage of improving zsmalloc
>>>>> to save compressed multi-pages. However, it seems nearly impossible for
>>>>> zsmalloc to achieve this if an mTHP is scattered rather than stored together
>>>>> in zRAM.
>>>>
>>>> Yes understood. I finally got around to watching the lsfmm videos; I believe the
>>>> suggested solution with a fs-like approach would be to let the fs handle the
>>>> compression, which means compressing extents? So even with that approach,
>>>> presumably it's still valuable to be able to allocate the biggest extents possible.
>>>>
>>>>>
>>>>>>
>>>>>> Perhaps another way of looking at this is that we are doing a bad job of
>>>>>> selecting when to use an mTHP and when not to use one in the first place;
>>>>>> ideally the workload would access the data across the entire mTHP with high
>>>>>> temporal locality? In that case, we would expect the whole mTHP to be swapped in
>>>>>> even with the current page-by-page approach. Figuring out this "auto sizing"
>>>>>> seems like an incredibly complex problem to solve though.
>>>>>
>>>>> The good news is that this is exactly what we're implementing in our products,
>>>>> and it has been deployed on millions of phones.
>>>>>
>>>>>   *  Allocate mTHP and swap in the entire mTHP in do_swap_page();
>>>>>   *  If mTHP allocation fails, allocate 16 pages to swap-in in do_swap_page();
>>>>
>>>> I think we were talking at cross-purposes here. What I meant was that in an ideal
>>>> world we would only allocate a (64K) mTHP for a page fault if we had confidence
>>>> (via some heuristic) that the virtual 64K area was likely to always be accessed
>>>> together, else just allocate a small folio. i.e. choose the folio size to cover
>>>> a single object from user space's PoV. That would have the side effect that a
>>>> page-by-page swap-in approach (the current approach in mainline) would still
>>>> effectively result in swapping in the whole folio and therefore reduce
>>>> fragmentation in the swap file. (Or thinking about it slightly differently, it
>>>> would give us confidence to always swap-in a large folio at a time, because we
>>>> know it's all highly likely to get used in the near future).
>>>>
>>>> I suspect this is a moot point though, because divining a suitable heuristic
>>>> with low overhead is basically impossible.
>>>>
>>>>>
>>>>> To be honest, we haven't noticed a visible increase in memory footprint. This is
>>>>> likely because Android's anonymous memory exhibits good spatial locality, and
>>>>> 64KiB strikes a good balance—neither too large nor too small.
>>>>
>>>> Indeed.
>>>>
>>>>>
>>>>> The bad news is that I haven't found a way to convince the community this
>>>>> is universally correct.
>>>>
>>>> I think we will want to be pragmatic and at least implement an option (sysfs?)
>>>> to swap-in a large folio up to a certain size; these test results clearly show
>>>> the value. And I believe you have real-world data for Android that shows the
>>>> same thing.
>>>>
>>>> Just to creep the scope of this thread slightly, after watching yours and Yu
>>>> Zhao's presentations around TAO, IIUC, even with TAO enabled, 64K folio
>>>> allocation fallback is still above 50%? I still believe that once the Android
>>>> filesystems are converted to use large folios that number will improve
>>>> substantially; especially if the page cache can be convinced to only allocate
>>>> 64K folios (with 4K fallback). At that point you're predominantly using 64K
>>>> folios so intuitively there will be less fragmentation.
>>>
>>> Absolutely agreed. Currently, I reported an allocation fallback rate slightly
>>> above 50%, but this is not because TAO is ineffective. It's simply because, in
>>> the test, we set up a conservative 15% virtzone for mTHP. If we increase the
>>> zone, we would definitely achieve a lower fallback ratio. However, the issue
>>> arises when we need a large number of small folios, for example for the page
>>> cache, because they might suffer. However, we should be able to increase the
>>> percentage of the virtzone after some fine-tuning, as the report was based on
>>> an initial test to demonstrate that TAO can provide guaranteed mTHP coverage.
>>>
>>> If we can somehow unify the mTHP size for both the page cache and anon, things
>>> might improve.
>>
>> Indeed. And that implies we might need extra controls for the page cache, which
>> I don't think Willy will be a fan of. It would be good to get some fragmentation
>> data for Android with a file system that supports large folios, both with the
>> page cache folio allocation scheme as it is today, and constrained to 64K and
>> 4K. Rather than waiting for all the Android file systems to land support for
>> large folios, is it possible to hand roll all the Android partitions as XFS?
>> I've done that in the past for the user data partition at least.
> 
> It might be a good idea to evaluate page cache large folios without waiting
> for EROFS and F2FS.
> I need to do more research on deploying XFS on Android before getting back to
> you.

In the past, I've hacked the Android userdata.img like this. I'm sure there are
better ways, but it worked (as long as the kernel was also compiled with the XFS
driver enabled), and I guess it should work for the other partitions too:

# Install tools, inflate userdata.img and mount.
sudo apt install android-sdk-libsparse-utils
simg2img userdata.img userdata.ext4.raw
mkdir ext4_mount
sudo mount -o ro userdata.ext4.raw ext4_mount

# Create empty file for xfs raw image (size is the same as userdata.ext4.raw).
# loopback the file as blk device and format it for xfs.
dd if=/dev/zero of=userdata.xfs.raw bs=1MiB count=11250
sudo losetup /dev/loop4 userdata.xfs.raw
sudo mkfs -t xfs /dev/loop4
mkdir xfs_mount
sudo mount userdata.xfs.raw xfs_mount

# If there is anything on the userdata.img (mounted above) copy the contents
# (e.g. cp -r ...).

# Unmount and remove the loopback.
sudo umount xfs_mount
sudo losetup --detach /dev/loop4

# Convert raw image to a sparse image ready for flashing.
img2simg userdata.xfs.raw userdata.xfs.img


> 
>>
>>>
>>> Xiang promised to deliver EROFS large folio support. If we also get this in
>>> f2fs, things will be quite different.
>>
>> Excellent! So Oppo is using only erofs and f2fs? What about ext4? And all the
>> ancillary things like fscrypt and fsverity, etc? (I've been hand waving a bit to
>> this point, but it would be good to build a full list of all the components that
>> need large folio support for large folio file-backed memory to be viable on
>> Android, if you can help enumerate that?)
> 
> We get all of tmpfs, ext4, f2fs, erofs, vfat and fuse for different folders.

OK, but I think there are also other components like fsverity and fscrypt that
would need to support large folios too? And possibly overlayfs? (you can
probably tell by now that I know very little about file systems :) )

> 
>>
>>>
>>>>
>>>> But allocations by the page cache today start at 16K and increment by 2 orders
>>>> for every new readahead IIRC. So we end up with lots of large folio sizes, and
>>>> presumably the potential for lots of fallbacks.
>>>>
>>>> All of this is just to suggest that we may end up wanting controls to specify
>>>> which folio sizes the page cache can attempt to use. At that point, adding
>>>> similar controls for swap-in doesn't feel unreasonable to me.
>>>>
>>>> Just my 2 cents!
>>>>
>>>>>
>>>>>>
>>>>>>> 2. We need a method to prevent small folios from scattering indiscriminately
>>>>>>> (based on the result "-a -s")
>>>>>>
>>>>>> I'm confused by this statement; as I understand it, both my and Chris's patches
>>>>>> already try to do this. Certainly for mine, when searching for order-0 space, I
>>>>>> search the non-full order-0 clusters first (just like for other orders).
>>>>>> Although for order-0 I will still fallback to searching any cluster if no space
>>>>>> is found in an order-0 cluster. What more can we do?
>>>>>>
>>>>>> When run against your v1 of the tool with "-s" (v1 always implicitly behaves as
>>>>>> if "-a" is specified, right?) my patch gives 0% fallback. So what's the
>>>>>> difference in v2 that causes higher fallback rate? Possibly just that
>>>>>> MEMSIZE_SMALLFOLIO has grown by 3MB so that the total memory matches the swap
>>>>>> size (64M)?
>>>>>
>>>>> Exactly. From my understanding, we've reached a point where small folios are
>>>>> struggling to find swap slots. Note that I always swap out mTHP before swapping
>>>>> out small folios. Additionally, I have already swapped in 1MB small folios
>>>>> before swapping out, which means zRAM has 1MB-4KB of redundant space available
>>>>> for mTHP to swap out.
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Ryan
>>>>>>
>>>>>>>
>>>>>>> *
>>>>>>> *  Test results on Ryan's patchset:
>>>>>>> *
>>>>>>>
>>>>>>> 1. w/ -a
>>>>>>> ./thp_swap_allocator_test -a
>>>>>>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 4: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> ...
>>>>>>> Iteration 100: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>>
>>>>>>> 2. w/o -a
>>>>>>> ./thp_swap_allocator_test
>>>>>>>
>>>>>>> Iteration 1: swpout inc: 208, swpout fallback inc: 25, Fallback percentage: 10.73%
>>>>>>> Iteration 2: swpout inc: 118, swpout fallback inc: 114, Fallback percentage: 49.14%
>>>>>>> Iteration 3: swpout inc: 63, swpout fallback inc: 163, Fallback percentage: 72.12%
>>>>>>> Iteration 4: swpout inc: 45, swpout fallback inc: 178, Fallback percentage: 79.82%
>>>>>>> Iteration 5: swpout inc: 42, swpout fallback inc: 184, Fallback percentage: 81.42%
>>>>>>> Iteration 6: swpout inc: 31, swpout fallback inc: 193, Fallback percentage: 86.16%
>>>>>>> Iteration 7: swpout inc: 27, swpout fallback inc: 201, Fallback percentage: 88.16%
>>>>>>> Iteration 8: swpout inc: 30, swpout fallback inc: 198, Fallback percentage: 86.84%
>>>>>>> Iteration 9: swpout inc: 32, swpout fallback inc: 194, Fallback percentage: 85.84%
>>>>>>> ...
>>>>>>> Iteration 91: swpout inc: 26, swpout fallback inc: 194, Fallback percentage: 88.18%
>>>>>>> Iteration 92: swpout inc: 35, swpout fallback inc: 196, Fallback percentage: 84.85%
>>>>>>> Iteration 93: swpout inc: 33, swpout fallback inc: 191, Fallback percentage: 85.27%
>>>>>>> Iteration 94: swpout inc: 26, swpout fallback inc: 193, Fallback percentage: 88.13%
>>>>>>> Iteration 95: swpout inc: 39, swpout fallback inc: 189, Fallback percentage: 82.89%
>>>>>>> Iteration 96: swpout inc: 28, swpout fallback inc: 196, Fallback percentage: 87.50%
>>>>>>> Iteration 97: swpout inc: 25, swpout fallback inc: 194, Fallback percentage: 88.58%
>>>>>>> Iteration 98: swpout inc: 31, swpout fallback inc: 196, Fallback percentage: 86.34%
>>>>>>> Iteration 99: swpout inc: 32, swpout fallback inc: 202, Fallback percentage: 86.32%
>>>>>>> Iteration 100: swpout inc: 33, swpout fallback inc: 195, Fallback percentage: 85.53%
>>>>>>>
>>>>>>> 3. w/ -a and -s
>>>>>>> ./thp_swap_allocator_test -a -s
>>>>>>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 2: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 3: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 4: swpout inc: 220, swpout fallback inc: 6, Fallback percentage: 2.65%
>>>>>>> Iteration 5: swpout inc: 206, swpout fallback inc: 16, Fallback percentage: 7.21%
>>>>>>> Iteration 6: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 7: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 8: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 9: swpout inc: 217, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 10: swpout inc: 224, swpout fallback inc: 3, Fallback percentage: 1.32%
>>>>>>> Iteration 11: swpout inc: 211, swpout fallback inc: 12, Fallback percentage: 5.38%
>>>>>>> Iteration 12: swpout inc: 200, swpout fallback inc: 32, Fallback percentage: 13.79%
>>>>>>> Iteration 13: swpout inc: 189, swpout fallback inc: 29, Fallback percentage: 13.30%
>>>>>>> Iteration 14: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
>>>>>>> Iteration 15: swpout inc: 198, swpout fallback inc: 27, Fallback percentage: 12.00%
>>>>>>> Iteration 16: swpout inc: 201, swpout fallback inc: 17, Fallback percentage: 7.80%
>>>>>>> Iteration 17: swpout inc: 206, swpout fallback inc: 6, Fallback percentage: 2.83%
>>>>>>> Iteration 18: swpout inc: 220, swpout fallback inc: 14, Fallback percentage: 5.98%
>>>>>>> Iteration 19: swpout inc: 181, swpout fallback inc: 45, Fallback percentage: 19.91%
>>>>>>> Iteration 20: swpout inc: 223, swpout fallback inc: 8, Fallback percentage: 3.46%
>>>>>>> Iteration 21: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
>>>>>>> Iteration 22: swpout inc: 195, swpout fallback inc: 31, Fallback percentage: 13.72%
>>>>>>> Iteration 23: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 24: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 25: swpout inc: 214, swpout fallback inc: 1, Fallback percentage: 0.47%
>>>>>>> Iteration 26: swpout inc: 229, swpout fallback inc: 1, Fallback percentage: 0.43%
>>>>>>> Iteration 27: swpout inc: 214, swpout fallback inc: 5, Fallback percentage: 2.28%
>>>>>>> Iteration 28: swpout inc: 211, swpout fallback inc: 15, Fallback percentage: 6.64%
>>>>>>> Iteration 29: swpout inc: 188, swpout fallback inc: 40, Fallback percentage: 17.54%
>>>>>>> Iteration 30: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
>>>>>>> Iteration 31: swpout inc: 215, swpout fallback inc: 10, Fallback percentage: 4.44%
>>>>>>> Iteration 32: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
>>>>>>> Iteration 33: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 34: swpout inc: 218, swpout fallback inc: 10, Fallback percentage: 4.39%
>>>>>>> Iteration 35: swpout inc: 203, swpout fallback inc: 30, Fallback percentage: 12.88%
>>>>>>> Iteration 36: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
>>>>>>> Iteration 37: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
>>>>>>> Iteration 38: swpout inc: 193, swpout fallback inc: 28, Fallback percentage: 12.67%
>>>>>>> Iteration 39: swpout inc: 210, swpout fallback inc: 20, Fallback percentage: 8.70%
>>>>>>> Iteration 40: swpout inc: 223, swpout fallback inc: 5, Fallback percentage: 2.19%
>>>>>>> Iteration 41: swpout inc: 224, swpout fallback inc: 7, Fallback percentage: 3.03%
>>>>>>> Iteration 42: swpout inc: 200, swpout fallback inc: 23, Fallback percentage: 10.31%
>>>>>>> Iteration 43: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
>>>>>>> Iteration 44: swpout inc: 206, swpout fallback inc: 18, Fallback percentage: 8.04%
>>>>>>> Iteration 45: swpout inc: 210, swpout fallback inc: 11, Fallback percentage: 4.98%
>>>>>>> Iteration 46: swpout inc: 204, swpout fallback inc: 19, Fallback percentage: 8.52%
>>>>>>> Iteration 47: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 48: swpout inc: 219, swpout fallback inc: 2, Fallback percentage: 0.90%
>>>>>>> Iteration 49: swpout inc: 212, swpout fallback inc: 6, Fallback percentage: 2.75%
>>>>>>> Iteration 50: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
>>>>>>> Iteration 51: swpout inc: 190, swpout fallback inc: 36, Fallback percentage: 15.93%
>>>>>>> Iteration 52: swpout inc: 212, swpout fallback inc: 17, Fallback percentage: 7.42%
>>>>>>> Iteration 53: swpout inc: 179, swpout fallback inc: 43, Fallback percentage: 19.37%
>>>>>>> Iteration 54: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 55: swpout inc: 224, swpout fallback inc: 2, Fallback percentage: 0.88%
>>>>>>> Iteration 56: swpout inc: 220, swpout fallback inc: 8, Fallback percentage: 3.51%
>>>>>>> Iteration 57: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
>>>>>>> Iteration 58: swpout inc: 213, swpout fallback inc: 6, Fallback percentage: 2.74%
>>>>>>> Iteration 59: swpout inc: 207, swpout fallback inc: 18, Fallback percentage: 8.00%
>>>>>>> Iteration 60: swpout inc: 216, swpout fallback inc: 14, Fallback percentage: 6.09%
>>>>>>> Iteration 61: swpout inc: 183, swpout fallback inc: 34, Fallback percentage: 15.67%
>>>>>>> Iteration 62: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
>>>>>>> Iteration 63: swpout inc: 184, swpout fallback inc: 39, Fallback percentage: 17.49%
>>>>>>> Iteration 64: swpout inc: 210, swpout fallback inc: 15, Fallback percentage: 6.67%
>>>>>>> Iteration 65: swpout inc: 178, swpout fallback inc: 48, Fallback percentage: 21.24%
>>>>>>> Iteration 66: swpout inc: 188, swpout fallback inc: 30, Fallback percentage: 13.76%
>>>>>>> Iteration 67: swpout inc: 193, swpout fallback inc: 29, Fallback percentage: 13.06%
>>>>>>> Iteration 68: swpout inc: 202, swpout fallback inc: 22, Fallback percentage: 9.82%
>>>>>>> Iteration 69: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
>>>>>>> Iteration 70: swpout inc: 204, swpout fallback inc: 15, Fallback percentage: 6.85%
>>>>>>> Iteration 71: swpout inc: 180, swpout fallback inc: 45, Fallback percentage: 20.00%
>>>>>>> Iteration 72: swpout inc: 210, swpout fallback inc: 21, Fallback percentage: 9.09%
>>>>>>> Iteration 73: swpout inc: 216, swpout fallback inc: 7, Fallback percentage: 3.14%
>>>>>>> Iteration 74: swpout inc: 209, swpout fallback inc: 19, Fallback percentage: 8.33%
>>>>>>> Iteration 75: swpout inc: 222, swpout fallback inc: 7, Fallback percentage: 3.06%
>>>>>>> Iteration 76: swpout inc: 212, swpout fallback inc: 14, Fallback percentage: 6.19%
>>>>>>> Iteration 77: swpout inc: 188, swpout fallback inc: 41, Fallback percentage: 17.90%
>>>>>>> Iteration 78: swpout inc: 198, swpout fallback inc: 17, Fallback percentage: 7.91%
>>>>>>> Iteration 79: swpout inc: 209, swpout fallback inc: 16, Fallback percentage: 7.11%
>>>>>>> Iteration 80: swpout inc: 182, swpout fallback inc: 41, Fallback percentage: 18.39%
>>>>>>> Iteration 81: swpout inc: 217, swpout fallback inc: 1, Fallback percentage: 0.46%
>>>>>>> Iteration 82: swpout inc: 225, swpout fallback inc: 3, Fallback percentage: 1.32%
>>>>>>> Iteration 83: swpout inc: 222, swpout fallback inc: 8, Fallback percentage: 3.48%
>>>>>>> Iteration 84: swpout inc: 201, swpout fallback inc: 21, Fallback percentage: 9.46%
>>>>>>> Iteration 85: swpout inc: 211, swpout fallback inc: 3, Fallback percentage: 1.40%
>>>>>>> Iteration 86: swpout inc: 209, swpout fallback inc: 14, Fallback percentage: 6.28%
>>>>>>> Iteration 87: swpout inc: 181, swpout fallback inc: 42, Fallback percentage: 18.83%
>>>>>>> Iteration 88: swpout inc: 223, swpout fallback inc: 4, Fallback percentage: 1.76%
>>>>>>> Iteration 89: swpout inc: 214, swpout fallback inc: 14, Fallback percentage: 6.14%
>>>>>>> Iteration 90: swpout inc: 192, swpout fallback inc: 33, Fallback percentage: 14.67%
>>>>>>> Iteration 91: swpout inc: 184, swpout fallback inc: 31, Fallback percentage: 14.42%
>>>>>>> Iteration 92: swpout inc: 201, swpout fallback inc: 32, Fallback percentage: 13.73%
>>>>>>> Iteration 93: swpout inc: 181, swpout fallback inc: 40, Fallback percentage: 18.10%
>>>>>>> Iteration 94: swpout inc: 211, swpout fallback inc: 14, Fallback percentage: 6.22%
>>>>>>> Iteration 95: swpout inc: 198, swpout fallback inc: 25, Fallback percentage: 11.21%
>>>>>>> Iteration 96: swpout inc: 205, swpout fallback inc: 22, Fallback percentage: 9.69%
>>>>>>> Iteration 97: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
>>>>>>> Iteration 98: swpout inc: 203, swpout fallback inc: 25, Fallback percentage: 10.96%
>>>>>>> Iteration 99: swpout inc: 218, swpout fallback inc: 12, Fallback percentage: 5.22%
>>>>>>> Iteration 100: swpout inc: 195, swpout fallback inc: 34, Fallback percentage: 14.85%
>>>>>>>
>>>>>>> 4. w/o -a and w/ -s
>>>>>>> ./thp_swap_allocator_test -s
>>>>>>> Iteration 1: swpout inc: 173, swpout fallback inc: 60, Fallback percentage: 25.75%
>>>>>>> Iteration 2: swpout inc: 85, swpout fallback inc: 147, Fallback percentage: 63.36%
>>>>>>> Iteration 3: swpout inc: 39, swpout fallback inc: 195, Fallback percentage: 83.33%
>>>>>>> Iteration 4: swpout inc: 13, swpout fallback inc: 220, Fallback percentage: 94.42%
>>>>>>> Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
>>>>>>> Iteration 6: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
>>>>>>> Iteration 7: swpout inc: 6, swpout fallback inc: 217, Fallback percentage: 97.31%
>>>>>>> Iteration 8: swpout inc: 6, swpout fallback inc: 215, Fallback percentage: 97.29%
>>>>>>> Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
>>>>>>> Iteration 10: swpout inc: 1, swpout fallback inc: 229, Fallback percentage: 99.57%
>>>>>>> Iteration 11: swpout inc: 2, swpout fallback inc: 216, Fallback percentage: 99.08%
>>>>>>> Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
>>>>>>> Iteration 13: swpout inc: 4, swpout fallback inc: 211, Fallback percentage: 98.14%
>>>>>>> Iteration 14: swpout inc: 1, swpout fallback inc: 221, Fallback percentage: 99.55%
>>>>>>> Iteration 15: swpout inc: 2, swpout fallback inc: 223, Fallback percentage: 99.11%
>>>>>>> Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
>>>>>>> Iteration 17: swpout inc: 2, swpout fallback inc: 231, Fallback percentage: 99.14%
>>>>>>> ...
>>>>>>>
>>>>>>> *
>>>>>>> *  Test results on Chris's v3 patchset:
>>>>>>> *
>>>>>>> 1. w/ -a
>>>>>>> ./thp_swap_allocator_test -a
>>>>>>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 4: swpout inc: 217, swpout fallback inc: 5, Fallback percentage: 2.25%
>>>>>>> Iteration 5: swpout inc: 215, swpout fallback inc: 12, Fallback percentage: 5.29%
>>>>>>> Iteration 6: swpout inc: 213, swpout fallback inc: 14, Fallback percentage: 6.17%
>>>>>>> Iteration 7: swpout inc: 207, swpout fallback inc: 15, Fallback percentage: 6.76%
>>>>>>> Iteration 8: swpout inc: 193, swpout fallback inc: 33, Fallback percentage: 14.60%
>>>>>>> Iteration 9: swpout inc: 214, swpout fallback inc: 13, Fallback percentage: 5.73%
>>>>>>> Iteration 10: swpout inc: 199, swpout fallback inc: 25, Fallback percentage: 11.16%
>>>>>>> Iteration 11: swpout inc: 208, swpout fallback inc: 14, Fallback percentage: 6.31%
>>>>>>> Iteration 12: swpout inc: 203, swpout fallback inc: 31, Fallback percentage: 13.25%
>>>>>>> Iteration 13: swpout inc: 192, swpout fallback inc: 25, Fallback percentage: 11.52%
>>>>>>> Iteration 14: swpout inc: 193, swpout fallback inc: 36, Fallback percentage: 15.72%
>>>>>>> Iteration 15: swpout inc: 188, swpout fallback inc: 33, Fallback percentage: 14.93%
>>>>>>> ...
>>>>>>>
>>>>>>> It seems Chris's approach can be negatively affected even by aligned swapin,
>>>>>>> having a low fallback ratio but not 0%, while Ryan's patchset doesn't have this
>>>>>>> issue.
>>>>>>>
>>>>>>> 2. w/o -a
>>>>>>> ./thp_swap_allocator_test
>>>>>>> Iteration 1: swpout inc: 209, swpout fallback inc: 24, Fallback percentage: 10.30%
>>>>>>> Iteration 2: swpout inc: 100, swpout fallback inc: 132, Fallback percentage: 56.90%
>>>>>>> Iteration 3: swpout inc: 43, swpout fallback inc: 183, Fallback percentage: 80.97%
>>>>>>> Iteration 4: swpout inc: 30, swpout fallback inc: 193, Fallback percentage: 86.55%
>>>>>>> Iteration 5: swpout inc: 21, swpout fallback inc: 205, Fallback percentage: 90.71%
>>>>>>> Iteration 6: swpout inc: 10, swpout fallback inc: 214, Fallback percentage: 95.54%
>>>>>>> Iteration 7: swpout inc: 16, swpout fallback inc: 212, Fallback percentage: 92.98%
>>>>>>> Iteration 8: swpout inc: 9, swpout fallback inc: 219, Fallback percentage: 96.05%
>>>>>>> Iteration 9: swpout inc: 6, swpout fallback inc: 220, Fallback percentage: 97.35%
>>>>>>> Iteration 10: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
>>>>>>> Iteration 11: swpout inc: 7, swpout fallback inc: 222, Fallback percentage: 96.94%
>>>>>>> Iteration 12: swpout inc: 8, swpout fallback inc: 212, Fallback percentage: 96.36%
>>>>>>> ..
>>>>>>>
>>>>>>> Ryan's fallback ratio(around 85%) seems to be a little better while both are much
>>>>>>> worse than "-a".
>>>>>>>
>>>>>>> 3. w/ -a and -s
>>>>>>> ./thp_swap_allocator_test -a -s
>>>>>>> Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
>>>>>>> Iteration 2: swpout inc: 213, swpout fallback inc: 5, Fallback percentage: 2.29%
>>>>>>> Iteration 3: swpout inc: 215, swpout fallback inc: 7, Fallback percentage: 3.15%
>>>>>>> Iteration 4: swpout inc: 210, swpout fallback inc: 16, Fallback percentage: 7.08%
>>>>>>> Iteration 5: swpout inc: 212, swpout fallback inc: 10, Fallback percentage: 4.50%
>>>>>>> Iteration 6: swpout inc: 215, swpout fallback inc: 18, Fallback percentage: 7.73%
>>>>>>> Iteration 7: swpout inc: 181, swpout fallback inc: 43, Fallback percentage: 19.20%
>>>>>>> Iteration 8: swpout inc: 173, swpout fallback inc: 55, Fallback percentage: 24.12%
>>>>>>> Iteration 9: swpout inc: 163, swpout fallback inc: 54, Fallback percentage: 24.88%
>>>>>>> Iteration 10: swpout inc: 168, swpout fallback inc: 59, Fallback percentage: 25.99%
>>>>>>> Iteration 11: swpout inc: 154, swpout fallback inc: 69, Fallback percentage: 30.94%
>>>>>>> Iteration 12: swpout inc: 166, swpout fallback inc: 66, Fallback percentage: 28.45%
>>>>>>> Iteration 13: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
>>>>>>> Iteration 14: swpout inc: 158, swpout fallback inc: 68, Fallback percentage: 30.09%
>>>>>>> Iteration 15: swpout inc: 168, swpout fallback inc: 57, Fallback percentage: 25.33%
>>>>>>> Iteration 16: swpout inc: 165, swpout fallback inc: 53, Fallback percentage: 24.31%
>>>>>>> Iteration 17: swpout inc: 163, swpout fallback inc: 49, Fallback percentage: 23.11%
>>>>>>> Iteration 18: swpout inc: 172, swpout fallback inc: 62, Fallback percentage: 26.50%
>>>>>>> Iteration 19: swpout inc: 183, swpout fallback inc: 43, Fallback percentage: 19.03%
>>>>>>> Iteration 20: swpout inc: 158, swpout fallback inc: 73, Fallback percentage: 31.60%
>>>>>>> Iteration 21: swpout inc: 147, swpout fallback inc: 81, Fallback percentage: 35.53%
>>>>>>> Iteration 22: swpout inc: 140, swpout fallback inc: 86, Fallback percentage: 38.05%
>>>>>>> Iteration 23: swpout inc: 144, swpout fallback inc: 79, Fallback percentage: 35.43%
>>>>>>> Iteration 24: swpout inc: 132, swpout fallback inc: 101, Fallback percentage: 43.35%
>>>>>>> Iteration 25: swpout inc: 133, swpout fallback inc: 82, Fallback percentage: 38.14%
>>>>>>> Iteration 26: swpout inc: 152, swpout fallback inc: 78, Fallback percentage: 33.91%
>>>>>>> Iteration 27: swpout inc: 138, swpout fallback inc: 81, Fallback percentage: 36.99%
>>>>>>> Iteration 28: swpout inc: 152, swpout fallback inc: 74, Fallback percentage: 32.74%
>>>>>>> Iteration 29: swpout inc: 153, swpout fallback inc: 75, Fallback percentage: 32.89%
>>>>>>> Iteration 30: swpout inc: 151, swpout fallback inc: 74, Fallback percentage: 32.89%
>>>>>>> ...
>>>>>>>
>>>>>>> Chris's approach appears to be more susceptible to negative effects from
>>>>>>> small folios.
>>>>>>>
>>>>>>> 4. w/o -a and w/ -s
>>>>>>> ./thp_swap_allocator_test -s
>>>>>>> Iteration 1: swpout inc: 183, swpout fallback inc: 50, Fallback percentage: 21.46%
>>>>>>> Iteration 2: swpout inc: 75, swpout fallback inc: 157, Fallback percentage: 67.67%
>>>>>>> Iteration 3: swpout inc: 33, swpout fallback inc: 201, Fallback percentage: 85.90%
>>>>>>> Iteration 4: swpout inc: 11, swpout fallback inc: 222, Fallback percentage: 95.28%
>>>>>>> Iteration 5: swpout inc: 10, swpout fallback inc: 215, Fallback percentage: 95.56%
>>>>>>> Iteration 6: swpout inc: 7, swpout fallback inc: 221, Fallback percentage: 96.93%
>>>>>>> Iteration 7: swpout inc: 2, swpout fallback inc: 221, Fallback percentage: 99.10%
>>>>>>> Iteration 8: swpout inc: 4, swpout fallback inc: 217, Fallback percentage: 98.19%
>>>>>>> Iteration 9: swpout inc: 0, swpout fallback inc: 225, Fallback percentage: 100.00%
>>>>>>> Iteration 10: swpout inc: 3, swpout fallback inc: 227, Fallback percentage: 98.70%
>>>>>>> Iteration 11: swpout inc: 1, swpout fallback inc: 217, Fallback percentage: 99.54%
>>>>>>> Iteration 12: swpout inc: 2, swpout fallback inc: 229, Fallback percentage: 99.13%
>>>>>>> Iteration 13: swpout inc: 1, swpout fallback inc: 214, Fallback percentage: 99.53%
>>>>>>> Iteration 14: swpout inc: 2, swpout fallback inc: 220, Fallback percentage: 99.10%
>>>>>>> Iteration 15: swpout inc: 1, swpout fallback inc: 224, Fallback percentage: 99.56%
>>>>>>> Iteration 16: swpout inc: 3, swpout fallback inc: 224, Fallback percentage: 98.68%
>>>>>>> ...
>>>>>>>
>>>>>>> Barry Song (1):
>>>>>>>   tools/mm: Introduce a tool to assess swap entry allocation for
>>>>>>>     thp_swapout
>>>>>>>
>>>>>>>  tools/mm/Makefile                  |   2 +-
>>>>>>>  tools/mm/thp_swap_allocator_test.c | 233 +++++++++++++++++++++++++++++
>>>>>>>  2 files changed, 234 insertions(+), 1 deletion(-)
>>>>>>>  create mode 100644 tools/mm/thp_swap_allocator_test.c
>>>>>>>
>>>>>>
>>>>>
>>>
> 
> Thanks
> Barry



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 0/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout
  2024-06-27  8:50             ` Ryan Roberts
@ 2024-07-04 23:10               ` Andrew Morton
  2024-07-05  9:31                 ` Ryan Roberts
  2024-07-05 16:38                 ` Chris Li
  0 siblings, 2 replies; 16+ messages in thread
From: Andrew Morton @ 2024-07-04 23:10 UTC (permalink / raw)
  To: Ryan Roberts
  Cc: Barry Song, chrisl, linux-mm, david, hughd, kaleshsingh, kasong,
	linux-kernel, v-songbaohua, ying.huang

This all seems to have gone way off track.

acks or nacks on the patch, please?


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout
  2024-06-22  7:12 ` [PATCH v2 1/1] " Barry Song
  2024-06-25 17:22   ` Kairui Song
@ 2024-07-05  9:31   ` Ryan Roberts
  1 sibling, 0 replies; 16+ messages in thread
From: Ryan Roberts @ 2024-07-05  9:31 UTC (permalink / raw)
  To: Barry Song, akpm, chrisl, linux-mm
  Cc: david, hughd, kaleshsingh, kasong, linux-kernel, v-songbaohua,
	ying.huang

On 22/06/2024 08:12, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> Both Ryan and Chris have been utilizing the small test program to aid
> in debugging and identifying issues with swap entry allocation. While
> a real or intricate workload might be more suitable for assessing the
> correctness and effectiveness of the swap allocation policy, a small
> test program presents a simpler means of understanding the problem and
> initially verifying the improvements being made.
> 
> Let's endeavor to integrate it into tools/mm. Although it presently
> only accommodates 64KB and 4KB, I'm optimistic that we can expand
> its capabilities to support multiple sizes and simulate more
> complex systems in the future as required.
> 
> Basically, we have
> 1. Use MADV_PAGEOUT for rapid swap-out, putting the swap allocation code
> under high exercise in a short time.
> 2. Use MADV_DONTNEED to simulate the behavior of libc and Java heap in
> freeing memory, as well as for munmap, app exits, or OOM killer scenarios.
> This ensures new mTHP is always generated, released or swapped out, similar
> to the behavior on a PC or Android phone where many applications are
> frequently started and terminated.
> 3. Swap in with or without the "-a" option to observe how fragments
> due to swap-in and the incoming swap-in of large folios will impact
> swap-out fallback.
> 
> Due to 2, we ensure a certain proportion of mTHP. Similarly, because
> of 3, we maintain a certain proportion of small folios, as we don't
> support large folios swap-in, meaning any swap-in will immediately
> result in small folios. Therefore, with both 2 and 3, we automatically
> achieve a system containing both mTHP and small folios. Additionally,
> 1 provides the ability to continuously swap them out.
> 
> We can also use "-s" to add a dedicated small folios memory area.
> 
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>

I note there is an open thread about a compilation failure due to a missing header
include with a specific toolchain. But once that is cleared up:

Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>

I didn't hit the compile issue so:

Tested-by: Ryan Roberts <ryan.roberts@arm.com>


> ---
>  tools/mm/Makefile                  |   2 +-
>  tools/mm/thp_swap_allocator_test.c | 233 +++++++++++++++++++++++++++++
>  2 files changed, 234 insertions(+), 1 deletion(-)
>  create mode 100644 tools/mm/thp_swap_allocator_test.c
> 
> diff --git a/tools/mm/Makefile b/tools/mm/Makefile
> index 7bb03606b9ea..15791c1c5b28 100644
> --- a/tools/mm/Makefile
> +++ b/tools/mm/Makefile
> @@ -3,7 +3,7 @@
>  #
>  include ../scripts/Makefile.include
>  
> -BUILD_TARGETS=page-types slabinfo page_owner_sort
> +BUILD_TARGETS=page-types slabinfo page_owner_sort thp_swap_allocator_test
>  INSTALL_TARGETS = $(BUILD_TARGETS) thpmaps
>  
>  LIB_DIR = ../lib/api
> diff --git a/tools/mm/thp_swap_allocator_test.c b/tools/mm/thp_swap_allocator_test.c
> new file mode 100644
> index 000000000000..a363bdde55f0
> --- /dev/null
> +++ b/tools/mm/thp_swap_allocator_test.c
> @@ -0,0 +1,233 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * thp_swap_allocator_test
> + *
> + * The purpose of this test program is helping check if THP swpout
> + * can correctly get swap slots to swap out as a whole instead of
> + * being split. It randomly releases swap entries through madvise
> + * DONTNEED and swapin/out on two memory areas: a memory area for
> + * 64KB THP and the other area for small folios. The second memory
> + * can be enabled by "-s".
> + * Before running the program, we need to setup a zRAM or similar
> + * swap device by:
> + *  echo lzo > /sys/block/zram0/comp_algorithm
> + *  echo 64M > /sys/block/zram0/disksize
> + *  echo never > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
> + *  echo always > /sys/kernel/mm/transparent_hugepage/hugepages-64kB/enabled
> + *  mkswap /dev/zram0
> + *  swapon /dev/zram0
> + * The expected result should be 0% anon swpout fallback ratio w/ or
> + * w/o "-s".
> + *
> + * Author(s): Barry Song <v-songbaohua@oppo.com>
> + */
> +
> +#define _GNU_SOURCE
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <unistd.h>
> +#include <string.h>
> +#include <sys/mman.h>
> +#include <errno.h>
> +#include <time.h>
> +
> +#define MEMSIZE_MTHP (60 * 1024 * 1024)
> +#define MEMSIZE_SMALLFOLIO (4 * 1024 * 1024)
> +#define ALIGNMENT_MTHP (64 * 1024)
> +#define ALIGNMENT_SMALLFOLIO (4 * 1024)
> +#define TOTAL_DONTNEED_MTHP (16 * 1024 * 1024)
> +#define TOTAL_DONTNEED_SMALLFOLIO (1 * 1024 * 1024)
> +#define MTHP_FOLIO_SIZE (64 * 1024)
> +
> +#define SWPOUT_PATH \
> +	"/sys/kernel/mm/transparent_hugepage/hugepages-64kB/stats/swpout"
> +#define SWPOUT_FALLBACK_PATH \
> +	"/sys/kernel/mm/transparent_hugepage/hugepages-64kB/stats/swpout_fallback"
> +
> +static void *aligned_alloc_mem(size_t size, size_t alignment)
> +{
> +	void *mem = NULL;
> +
> +	if (posix_memalign(&mem, alignment, size) != 0) {
> +		perror("posix_memalign");
> +		return NULL;
> +	}
> +	return mem;
> +}
> +
> +/*
> + * This emulates the behavior of native libc and Java heap,
> + * as well as process exit and munmap. It helps generate mTHP
> + * and ensures that iterations can proceed with mTHP, as we
> + * currently don't support large folios swap-in.
> + */
> +static void random_madvise_dontneed(void *mem, size_t mem_size,
> +		size_t align_size, size_t total_dontneed_size)
> +{
> +	size_t num_pages = total_dontneed_size / align_size;
> +	size_t i;
> +	size_t offset;
> +	void *addr;
> +
> +	for (i = 0; i < num_pages; ++i) {
> +		offset = (rand() % (mem_size / align_size)) * align_size;
> +		addr = (char *)mem + offset;
> +		if (madvise(addr, align_size, MADV_DONTNEED) != 0)
> +			perror("madvise dontneed");
> +
> +		memset(addr, 0x11, align_size);
> +	}
> +}
> +
> +static void random_swapin(void *mem, size_t mem_size,
> +		size_t align_size, size_t total_swapin_size)
> +{
> +	size_t num_pages = total_swapin_size / align_size;
> +	size_t i;
> +	size_t offset;
> +	void *addr;
> +
> +	for (i = 0; i < num_pages; ++i) {
> +		offset = (rand() % (mem_size / align_size)) * align_size;
> +		addr = (char *)mem + offset;
> +		memset(addr, 0x11, align_size);
> +	}
> +}
> +
> +static unsigned long read_stat(const char *path)
> +{
> +	FILE *file;
> +	unsigned long value;
> +
> +	file = fopen(path, "r");
> +	if (!file) {
> +		perror("fopen");
> +		return 0;
> +	}
> +
> +	if (fscanf(file, "%lu", &value) != 1) {
> +		perror("fscanf");
> +		fclose(file);
> +		return 0;
> +	}
> +
> +	fclose(file);
> +	return value;
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +	int use_small_folio = 0, aligned_swapin = 0;
> +	void *mem1 = NULL, *mem2 = NULL;
> +	int i;
> +
> +	for (i = 1; i < argc; ++i) {
> +		if (strcmp(argv[i], "-s") == 0)
> +			use_small_folio = 1;
> +		else if (strcmp(argv[i], "-a") == 0)
> +			aligned_swapin = 1;
> +	}
> +
> +	mem1 = aligned_alloc_mem(MEMSIZE_MTHP, ALIGNMENT_MTHP);
> +	if (mem1 == NULL) {
> +		fprintf(stderr, "Failed to allocate large folios memory\n");
> +		return EXIT_FAILURE;
> +	}
> +
> +	if (madvise(mem1, MEMSIZE_MTHP, MADV_HUGEPAGE) != 0) {
> +		perror("madvise hugepage for mem1");
> +		free(mem1);
> +		return EXIT_FAILURE;
> +	}
> +
> +	if (use_small_folio) {
> +		mem2 = aligned_alloc_mem(MEMSIZE_SMALLFOLIO, ALIGNMENT_MTHP);
> +		if (mem2 == NULL) {
> +			fprintf(stderr, "Failed to allocate small folios memory\n");
> +			free(mem1);
> +			return EXIT_FAILURE;
> +		}
> +
> +		if (madvise(mem2, MEMSIZE_SMALLFOLIO, MADV_NOHUGEPAGE) != 0) {
> +			perror("madvise nohugepage for mem2");
> +			free(mem1);
> +			free(mem2);
> +			return EXIT_FAILURE;
> +		}
> +	}
> +
> +	/* warm-up phase to occupy the swapfile */
> +	memset(mem1, 0x11, MEMSIZE_MTHP);
> +	madvise(mem1, MEMSIZE_MTHP, MADV_PAGEOUT);
> +	if (use_small_folio) {
> +		memset(mem2, 0x11, MEMSIZE_SMALLFOLIO);
> +		madvise(mem2, MEMSIZE_SMALLFOLIO, MADV_PAGEOUT);
> +	}
> +
> +	/* iterations with newly created mTHP, swap-in, and swap-out */
> +	for (i = 0; i < 100; ++i) {
> +		unsigned long initial_swpout;
> +		unsigned long initial_swpout_fallback;
> +		unsigned long final_swpout;
> +		unsigned long final_swpout_fallback;
> +		unsigned long swpout_inc;
> +		unsigned long swpout_fallback_inc;
> +		double fallback_percentage;
> +
> +		initial_swpout = read_stat(SWPOUT_PATH);
> +		initial_swpout_fallback = read_stat(SWPOUT_FALLBACK_PATH);
> +
> +		/*
> +		 * The following setup creates a 1:1 ratio of mTHP to small folios
> +		 * since large folio swap-in isn't supported yet. Once we support
> +		 * mTHP swap-in, we'll likely need to reduce MEMSIZE_MTHP and
> +		 * increase MEMSIZE_SMALLFOLIO to maintain the ratio.
> +		 */
> +		random_swapin(mem1, MEMSIZE_MTHP,
> +				aligned_swapin ? ALIGNMENT_MTHP : ALIGNMENT_SMALLFOLIO,
> +				TOTAL_DONTNEED_MTHP);
> +		random_madvise_dontneed(mem1, MEMSIZE_MTHP, ALIGNMENT_MTHP,
> +				TOTAL_DONTNEED_MTHP);
> +
> +		if (use_small_folio) {
> +			random_swapin(mem2, MEMSIZE_SMALLFOLIO,
> +					ALIGNMENT_SMALLFOLIO,
> +					TOTAL_DONTNEED_SMALLFOLIO);
> +		}
> +
> +		if (madvise(mem1, MEMSIZE_MTHP, MADV_PAGEOUT) != 0) {
> +			perror("madvise pageout for mem1");
> +			free(mem1);
> +			if (mem2 != NULL)
> +				free(mem2);
> +			return EXIT_FAILURE;
> +		}
> +
> +		if (use_small_folio) {
> +			if (madvise(mem2, MEMSIZE_SMALLFOLIO, MADV_PAGEOUT) != 0) {
> +				perror("madvise pageout for mem2");
> +				free(mem1);
> +				free(mem2);
> +				return EXIT_FAILURE;
> +			}
> +		}
> +
> +		final_swpout = read_stat(SWPOUT_PATH);
> +		final_swpout_fallback = read_stat(SWPOUT_FALLBACK_PATH);
> +
> +		swpout_inc = final_swpout - initial_swpout;
> +		swpout_fallback_inc = final_swpout_fallback - initial_swpout_fallback;
> +
> +		fallback_percentage = (double)swpout_fallback_inc /
> +			(swpout_fallback_inc + swpout_inc) * 100;
> +
> +		printf("Iteration %d: swpout inc: %lu, swpout fallback inc: %lu, Fallback percentage: %.2f%%\n",
> +				i + 1, swpout_inc, swpout_fallback_inc, fallback_percentage);
> +	}
> +
> +	free(mem1);
> +	if (mem2 != NULL)
> +		free(mem2);
> +
> +	return EXIT_SUCCESS;
> +}
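
One observation on the final calculation in the loop: if an iteration records
neither swpout nor swpout_fallback events (for example, when no swap device is
set up), the denominator is zero and the printed percentage is NaN. A minimal
sketch of a guarded version follows. The helper name fallback_percentage() and
the choice to report 0.0 for an idle interval are assumptions for illustration,
not part of the patch.

```c
/*
 * Hypothetical helper: share of mTHP swap-out attempts that fell back
 * to splitting, in percent. Returns 0.0 when neither counter moved,
 * so the caller never divides by zero.
 */
static double fallback_percentage(unsigned long swpout_inc,
				  unsigned long swpout_fallback_inc)
{
	unsigned long total = swpout_inc + swpout_fallback_inc;

	if (total == 0)
		return 0.0;

	return (double)swpout_fallback_inc / (double)total * 100.0;
}
```

With swpout_inc = 3 and swpout_fallback_inc = 1 this returns 25.0, matching
the formula in the patch whenever at least one counter advanced.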




* Re: [PATCH v2 0/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout
  2024-07-04 23:10               ` Andrew Morton
@ 2024-07-05  9:31                 ` Ryan Roberts
  2024-07-05 16:38                 ` Chris Li
  1 sibling, 0 replies; 16+ messages in thread
From: Ryan Roberts @ 2024-07-05  9:31 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Barry Song, chrisl, linux-mm, david, hughd, kaleshsingh, kasong,
	linux-kernel, v-songbaohua, ying.huang

On 05/07/2024 00:10, Andrew Morton wrote:
> This all seems to have gone way off track.

Yes sorry about that; my fault.

> 
> acks or nacks on the patch, please?

I've replied to the actual patch with R-b and T-b.




* Re: [PATCH v2 0/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout
  2024-07-04 23:10               ` Andrew Morton
  2024-07-05  9:31                 ` Ryan Roberts
@ 2024-07-05 16:38                 ` Chris Li
  1 sibling, 0 replies; 16+ messages in thread
From: Chris Li @ 2024-07-05 16:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ryan Roberts, Barry Song, linux-mm, david, hughd, kaleshsingh,
	kasong, linux-kernel, v-songbaohua, ying.huang

I have been using this tool to help me develop the swap allocator
patches. It helped me reproduce the kernel crash.

Acked-by: Chris Li <chrisl@kernel.org>
Tested-by: Chris Li <chrisl@kernel.org>

Chris

On Thu, Jul 4, 2024 at 4:10 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> This all seems to have gone way off track.
>
> acks or nacks on the patch, please?




Thread overview: 16+ messages
2024-06-22  7:12 [PATCH v2 0/1] tools/mm: Introduce a tool to assess swap entry allocation for thp_swapout Barry Song
2024-06-22  7:12 ` [PATCH v2 1/1] " Barry Song
2024-06-25 17:22   ` Kairui Song
2024-06-25 22:13     ` Barry Song
2024-07-05  9:31   ` Ryan Roberts
2024-06-24  8:26 ` [PATCH v2 0/1] " Ryan Roberts
2024-06-24  8:42   ` Barry Song
2024-06-24 10:35     ` Ryan Roberts
2024-06-25  0:11       ` Barry Song
2024-06-25  8:11         ` Ryan Roberts
2024-06-27  0:02           ` Barry Song
2024-06-27  8:50             ` Ryan Roberts
2024-07-04 23:10               ` Andrew Morton
2024-07-05  9:31                 ` Ryan Roberts
2024-07-05 16:38                 ` Chris Li
2024-06-24 10:06 ` Chris Li
