From: "Huang, Ying" <ying.huang@intel.com>
To: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
"yosryahmed@google.com" <yosryahmed@google.com>,
"nphamcs@gmail.com" <nphamcs@gmail.com>,
"chengming.zhou@linux.dev" <chengming.zhou@linux.dev>,
"usamaarif642@gmail.com" <usamaarif642@gmail.com>,
"shakeel.butt@linux.dev" <shakeel.butt@linux.dev>,
"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
"21cnbao@gmail.com" <21cnbao@gmail.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"Zou, Nanhai" <nanhai.zou@intel.com>,
"Feghali, Wajdi K" <wajdi.k.feghali@intel.com>,
"Gopal, Vinodh" <vinodh.gopal@intel.com>
Subject: Re: [PATCH v7 0/8] mm: ZSWAP swap-out of mTHP folios
Date: Thu, 26 Sep 2024 08:44:36 +0800
Message-ID: <877cazs0p7.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <SJ0PR11MB5678BC6BBF8A4D7694EDDFDAC9692@SJ0PR11MB5678.namprd11.prod.outlook.com> (Kanchana P. Sridhar's message of "Thu, 26 Sep 2024 02:39:25 +0800")
"Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com> writes:
>> -----Original Message-----
>> From: Huang, Ying <ying.huang@intel.com>
>> Sent: Tuesday, September 24, 2024 11:35 PM
>> To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
>> Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
>> hannes@cmpxchg.org; yosryahmed@google.com; nphamcs@gmail.com;
>> chengming.zhou@linux.dev; usamaarif642@gmail.com;
>> shakeel.butt@linux.dev; ryan.roberts@arm.com; 21cnbao@gmail.com;
>> akpm@linux-foundation.org; Zou, Nanhai <nanhai.zou@intel.com>; Feghali,
>> Wajdi K <wajdi.k.feghali@intel.com>; Gopal, Vinodh
>> <vinodh.gopal@intel.com>
>> Subject: Re: [PATCH v7 0/8] mm: ZSWAP swap-out of mTHP folios
>>
>> Kanchana P Sridhar <kanchana.p.sridhar@intel.com> writes:
>>
>> [snip]
>>
>> >
>> > Case 1: Comparing zswap 4K vs. zswap mTHP
>> > =========================================
>> >
>> > In this scenario, the "before" is CONFIG_THP_SWAP set to off, which results in
>> > 64K/2M (m)THP being split into 4K folios that get processed by zswap.
>> >
>> > The "after" is CONFIG_THP_SWAP set to on, plus this patch-series, which
>> > results in 64K/2M (m)THP not being split before being processed by zswap.
>> >
>> > 64KB mTHP (cgroup memory.high set to 40G):
>> > ==========================================
>> >
>> > --------------------------------------------------------------------------------------------------
>> >                              mm-unstable 9-23-2024              zswap-mTHP     Change wrt Baseline
>> >                                  CONFIG_THP_SWAP=N       CONFIG_THP_SWAP=Y
>> >                                           Baseline
>> > --------------------------------------------------------------------------------------------------
>> > ZSWAP compressor                  zstd deflate-iaa        zstd deflate-iaa        zstd deflate-iaa
>> > --------------------------------------------------------------------------------------------------
>> > Throughput (KB/s)              143,323     125,485     153,550     129,609          7%          3%
>> > elapsed time (sec)               24.97       25.42       23.90       25.19          4%          1%
>> > sys time (sec)                  822.72      750.96      757.70      731.13          8%          3%
>> > memcg_high                     132,743     169,825     148,075     192,744
>> > memcg_swap_fail                639,067     841,553       2,204       2,215
>> > pswpin                               0           0           0           0
>> > pswpout                              0           0           0           0
>> > zswpin                             795         873         760         902
>> > zswpout                     10,011,266  13,195,137  10,010,017  13,193,554
>> > thp_swpout                           0           0           0           0
>> > thp_swpout_fallback                  0           0           0           0
>> > 64kB-mthp_swpout_fallback      639,065     841,553       2,204       2,215
>> > pgmajfault                       2,861       2,924       3,054       3,259
>> > ZSWPOUT-64kB                       n/a         n/a     623,451     822,268
>> > SWPOUT-64kB                          0           0           0           0
>> > --------------------------------------------------------------------------------------------------
>> >
>>
>> IIUC, the throughput is the sum of throughput of all usemem processes?
>>
>> One possible issue with the usemem test case is "imbalance". That
>> is, some usemem processes may swap-out/swap-in less, so their scores
>> are very high, while other processes may swap-out/swap-in more, so
>> their scores are very low. Sometimes the total score decreases but the
>> scores of the usemem processes are more balanced, and in that case the
>> performance should be considered better. In general, we should make
>> the usemem scores balanced among processes via, say, a longer test
>> time. Can you check this in your test results?
>
> Actually, the throughput data listed in the cover-letter is the average of
> all the usemem processes. Your observation about the "imbalance" issue is
> right. Some processes see a higher throughput than others. I have noticed
> that the throughputs progressively reduce as the individual processes exit
> and print their stats.
>
> Listed below are the stats from two runs of usemem70: sleep 10 and sleep 30.
> Both are run with a cgroup mem-limit of 40G. Data is with v7, 64K folios are
> enabled, zswap uses zstd.
>
>
> -----------------------------------------------
> sleep 10 sleep 30
> Throughput (KB/s) Throughput (KB/s)
> -----------------------------------------------
> 181,540 191,686
> 179,651 191,459
> 179,068 188,834
> 177,244 187,568
> 177,215 186,703
> 176,565 185,584
> 176,546 185,370
> 176,470 185,021
> 176,214 184,303
> 176,128 184,040
> 175,279 183,932
> 174,745 180,831
> 173,935 179,418
> 161,546 168,014
> 160,332 167,540
> 160,122 167,364
> 159,613 167,020
> 159,546 166,590
> 159,021 166,483
> 158,845 166,418
> 158,426 166,264
> 158,396 166,066
> 158,371 165,944
> 158,298 165,866
> 158,250 165,884
> 158,057 165,533
> 158,011 165,532
> 157,899 165,457
> 157,894 165,424
> 157,839 165,410
> 157,731 165,407
> 157,629 165,273
> 157,626 164,867
> 157,581 164,636
> 157,471 164,266
> 157,430 164,225
> 157,287 163,290
> 156,289 153,597
> 153,970 147,494
> 148,244 147,102
> 142,907 146,111
> 142,811 145,789
> 139,171 141,168
> 136,314 140,714
> 133,616 140,111
> 132,881 139,636
> 132,729 136,943
> 132,680 136,844
> 132,248 135,726
> 132,027 135,384
> 131,929 135,270
> 131,766 134,748
> 131,667 134,733
> 131,576 134,582
> 131,396 134,302
> 131,351 134,160
> 131,135 134,102
> 130,885 134,097
> 130,854 134,058
> 130,767 134,006
> 130,666 133,960
> 130,647 133,894
> 130,152 133,837
> 130,006 133,747
> 129,921 133,679
> 129,856 133,666
> 129,377 133,564
> 128,366 133,331
> 127,988 132,938
> 126,903 132,746
> -----------------------------------------------
> sum                  10,526,916   10,919,561
> average                 150,385      155,994
> stddev                   17,551       19,633
> -----------------------------------------------
> elapsed time (sec)        24.40        43.66
> sys time (sec)           806.25       766.05
> zswpout              10,008,713   10,008,407
> 64K folio swpout        623,463      623,629
> -----------------------------------------------
Although there is some imbalance, I don't think it is too much. So I
think the test result is reasonable. Please pay attention to the
imbalance issue in future tests.
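
One rough way to put a number on the imbalance is the coefficient of
variation (stddev divided by average). This is only a sketch under my own
assumptions: usemem does not report this metric, the function name
`imbalance` is hypothetical, and the summary values are copied from the
"average" and "stddev" rows of the table above.

```python
from statistics import mean, pstdev


def imbalance(throughputs):
    """Return (average, stddev, coefficient of variation) of per-process scores."""
    avg = mean(throughputs)
    # Population stddev is an assumption; the table does not say which
    # estimator produced its "stddev" row.
    sd = pstdev(throughputs)
    return avg, sd, sd / avg


# Coefficient of variation computed from the reported average/stddev rows.
cv_sleep10 = 17551 / 150385
cv_sleep30 = 19633 / 155994
print(f"sleep 10: {cv_sleep10:.1%}, sleep 30: {cv_sleep30:.1%}")
```

With the reported numbers this gives about 11.7% for sleep 10 and 12.6%
for sleep 30, so the spread is slightly wider in the longer run. The
per-process throughput lists can be passed to imbalance() to get the same
figure directly from raw data.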
> As we increase the time for which allocations are maintained,
> there seems to be a slight improvement in throughput, but the
> variance increases as well. The processes with lower throughput
> could be the ones that handle the memcg being over limit by
> doing reclaim, possibly before they can allocate.
>
> Interestingly, the longer test time does seem to reduce the amount
> of reclaim (hence lower sys time), but more 64K large folios seem to
> be reclaimed. Could this mean that with longer test time (sleep 30),
> more cold memory residing in large folios is getting reclaimed, as
> against memory just relinquished by the exiting processes?
I don't think a longer sleep time in the test helps much with balance.
Can you try with fewer processes and a larger memory size per process?
I guess that this will improve the balance.
--
Best Regards,
Huang, Ying