From: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
"yosryahmed@google.com" <yosryahmed@google.com>,
"nphamcs@gmail.com" <nphamcs@gmail.com>,
"chengming.zhou@linux.dev" <chengming.zhou@linux.dev>,
"usamaarif642@gmail.com" <usamaarif642@gmail.com>,
"shakeel.butt@linux.dev" <shakeel.butt@linux.dev>,
"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
"21cnbao@gmail.com" <21cnbao@gmail.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"Zou, Nanhai" <nanhai.zou@intel.com>,
"Feghali, Wajdi K" <wajdi.k.feghali@intel.com>,
"Gopal, Vinodh" <vinodh.gopal@intel.com>,
"Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
Subject: RE: [PATCH v7 0/8] mm: ZSWAP swap-out of mTHP folios
Date: Wed, 25 Sep 2024 18:39:25 +0000
Message-ID: <SJ0PR11MB5678BC6BBF8A4D7694EDDFDAC9692@SJ0PR11MB5678.namprd11.prod.outlook.com>
In-Reply-To: <87v7yks0kd.fsf@yhuang6-desk2.ccr.corp.intel.com>
> -----Original Message-----
> From: Huang, Ying <ying.huang@intel.com>
> Sent: Tuesday, September 24, 2024 11:35 PM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
> Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> hannes@cmpxchg.org; yosryahmed@google.com; nphamcs@gmail.com;
> chengming.zhou@linux.dev; usamaarif642@gmail.com;
> shakeel.butt@linux.dev; ryan.roberts@arm.com; 21cnbao@gmail.com;
> akpm@linux-foundation.org; Zou, Nanhai <nanhai.zou@intel.com>; Feghali,
> Wajdi K <wajdi.k.feghali@intel.com>; Gopal, Vinodh
> <vinodh.gopal@intel.com>
> Subject: Re: [PATCH v7 0/8] mm: ZSWAP swap-out of mTHP folios
>
> Kanchana P Sridhar <kanchana.p.sridhar@intel.com> writes:
>
> [snip]
>
> >
> > Case 1: Comparing zswap 4K vs. zswap mTHP
> > =========================================
> >
> > In this scenario, the "before" is CONFIG_THP_SWAP set to off, which results in
> > 64K/2M (m)THP being split into 4K folios that are then processed by zswap.
> >
> > The "after" is CONFIG_THP_SWAP set to on, plus this patch-series, which results
> > in 64K/2M (m)THP not being split, and instead being processed by zswap as mTHP.
> >
> > 64KB mTHP (cgroup memory.high set to 40G):
> > ==========================================
> >
> > -------------------------------------------------------------------------------
> >                            mm-unstable 9-23-2024     zswap-mTHP             Change wrt
> >                            CONFIG_THP_SWAP=N         CONFIG_THP_SWAP=Y      Baseline
> >                            Baseline
> > -------------------------------------------------------------------------------
> > ZSWAP compressor           zstd       deflate-       zstd       deflate-    zstd  deflate-
> >                                       iaa                       iaa               iaa
> > -------------------------------------------------------------------------------
> > Throughput (KB/s)             143,323    125,485        153,550    129,609    7%    3%
> > elapsed time (sec)              24.97      25.42          23.90      25.19    4%    1%
> > sys time (sec)                 822.72     750.96         757.70     731.13    8%    3%
> > memcg_high                    132,743    169,825        148,075    192,744
> > memcg_swap_fail               639,067    841,553          2,204      2,215
> > pswpin                              0          0              0          0
> > pswpout                             0          0              0          0
> > zswpin                            795        873            760        902
> > zswpout                    10,011,266 13,195,137     10,010,017 13,193,554
> > thp_swpout                          0          0              0          0
> > thp_swpout_fallback                 0          0              0          0
> > 64kB-mthp_swpout_fallback     639,065    841,553          2,204      2,215
> > pgmajfault                      2,861      2,924          3,054      3,259
> > ZSWPOUT-64kB                      n/a        n/a        623,451    822,268
> > SWPOUT-64kB                         0          0              0          0
> > -------------------------------------------------------------------------------
> >
>
> IIUC, the throughput is the sum of throughput of all usemem processes?
>
> One possible issue of usemem test case is the "imbalance" issue. That
> is, some usemem processes may swap-out/swap-in less, so the score is
> very high; while some other processes may swap-out/swap-in more, so the
> score is very low. Sometimes, the total score decreases, but the scores
> of usemem processes are more balanced, so that the performance should be
> considered better. And, in general, we should make usemem score
> balanced among processes via say longer test time. Can you check this
> in your test results?
Actually, the throughput data listed in the cover-letter is the average across
all the usemem processes. Your observation about the "imbalance" issue is
right: some processes see a higher throughput than others. I have noticed
that the throughputs progressively decrease as the individual processes exit
and print their stats.

Listed below are the stats from two runs of usemem70: sleep 10 and sleep 30.
Both were run with a cgroup memory limit of 40G. Data is with v7, with 64K
folios enabled and zswap using zstd.
-----------------------------------------------
sleep 10 sleep 30
Throughput (KB/s) Throughput (KB/s)
-----------------------------------------------
181,540 191,686
179,651 191,459
179,068 188,834
177,244 187,568
177,215 186,703
176,565 185,584
176,546 185,370
176,470 185,021
176,214 184,303
176,128 184,040
175,279 183,932
174,745 180,831
173,935 179,418
161,546 168,014
160,332 167,540
160,122 167,364
159,613 167,020
159,546 166,590
159,021 166,483
158,845 166,418
158,426 166,264
158,396 166,066
158,371 165,944
158,298 165,866
158,250 165,884
158,057 165,533
158,011 165,532
157,899 165,457
157,894 165,424
157,839 165,410
157,731 165,407
157,629 165,273
157,626 164,867
157,581 164,636
157,471 164,266
157,430 164,225
157,287 163,290
156,289 153,597
153,970 147,494
148,244 147,102
142,907 146,111
142,811 145,789
139,171 141,168
136,314 140,714
133,616 140,111
132,881 139,636
132,729 136,943
132,680 136,844
132,248 135,726
132,027 135,384
131,929 135,270
131,766 134,748
131,667 134,733
131,576 134,582
131,396 134,302
131,351 134,160
131,135 134,102
130,885 134,097
130,854 134,058
130,767 134,006
130,666 133,960
130,647 133,894
130,152 133,837
130,006 133,747
129,921 133,679
129,856 133,666
129,377 133,564
128,366 133,331
127,988 132,938
126,903 132,746
-----------------------------------------------
sum                  10,526,916   10,919,561
average                 150,385      155,994
stddev                   17,551       19,633
-----------------------------------------------
elapsed time (sec)        24.40        43.66
sys time (sec)           806.25       766.05
zswpout              10,008,713   10,008,407
64K folio swpout        623,463      623,629
-----------------------------------------------
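
For completeness, below is a minimal sketch of how the sum/average/stddev
rows above can be computed from the per-process throughput values. This is
not the script actually used; the input file name and the one-value-per-line
format are assumptions for illustration.

#!/usr/bin/env python3
# Minimal sketch (assumptions): per-process usemem throughputs (KB/s),
# one value per line, saved in "throughput_kbps.txt".
import statistics

with open("throughput_kbps.txt") as f:
    vals = [float(line.replace(",", "")) for line in f if line.strip()]

print(f"sum     {sum(vals):12,.0f}")
print(f"average {statistics.mean(vals):12,.0f}")
# Sample standard deviation; the table may use the population form (pstdev).
print(f"stddev  {statistics.stdev(vals):12,.0f}")
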
As we increase the time for which allocations are maintained, there seems to
be a slight improvement in throughput, but the variance increases as well.
The processes with lower throughput could be the ones that handle the memcg
going over its limit by doing reclaim, possibly before they can allocate.

Interestingly, the longer test time does seem to reduce the amount of reclaim
(hence the lower sys time), yet slightly more 64K large folios are reclaimed.
Could this mean that with the longer test time (sleep 30), more cold memory
residing in large folios is getting reclaimed, as opposed to memory just
relinquished by the exiting processes?
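
(To put rough numbers on this, from the summary above: sys time drops from
806.25 sec to 766.05 sec, i.e. (806.25 - 766.05) / 806.25 ~= 5% less, while
the 64K folio swpouts only increase from 623,463 to 623,629, ~0.03% more.)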
Thanks,
Kanchana
>
> [snip]
>
> --
> Best Regards,
> Huang, Ying