From: "Huang, Ying" <ying.huang@intel.com>
To: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
Cc: Yosry Ahmed <yosryahmed@google.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
"nphamcs@gmail.com" <nphamcs@gmail.com>,
"chengming.zhou@linux.dev" <chengming.zhou@linux.dev>,
"usamaarif642@gmail.com" <usamaarif642@gmail.com>,
"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
"21cnbao@gmail.com" <21cnbao@gmail.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"Zou, Nanhai" <nanhai.zou@intel.com>,
"Feghali, Wajdi K" <wajdi.k.feghali@intel.com>,
"Gopal, Vinodh" <vinodh.gopal@intel.com>
Subject: Re: [PATCH v6 0/3] mm: ZSWAP swap-out of mTHP folios
Date: Fri, 20 Sep 2024 17:29:14 +0800
Message-ID: <87ikuqvfkl.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <SJ0PR11MB567893E61CB522991ED1379EC96C2@SJ0PR11MB5678.namprd11.prod.outlook.com> (Kanchana P. Sridhar's message of "Fri, 20 Sep 2024 09:41:02 +0800")
"Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com> writes:
[snip]
>
> Thanks, these are good points. I ran this experiment with mm-unstable 9-17-2024,
> commit 248ba8004e76eb335d7e6079724c3ee89a011389.
>
> Data is based on average of 3 runs of the vm-scalability "usemem" test.
>
> 4G SSD backing zswap, each process sleeps before exiting
> ========================================================
>
> 64KB mTHP (cgroup memory.high set to 60G, no swap limit):
> =========================================================
> CONFIG_THP_SWAP=Y
> Sapphire Rapids server with 503 GiB RAM and 4G SSD swap backing device
> for zswap.
>
> Experiment 1: Each process sleeps for 0 sec after allocating memory
> (usemem --init-time -w -O --sleep 0 -n 70 1g):
>
> -------------------------------------------------------------------------------
>                    mm-unstable 9-17-2024     zswap-mTHP v6        Change wrt
>                          Baseline                                  Baseline
>                          "before"               "after"           (sleep 0)
> -------------------------------------------------------------------------------
> ZSWAP compressor         zstd   deflate-       zstd   deflate-    zstd deflate-
>                                      iaa                   iaa              iaa
> -------------------------------------------------------------------------------
> Throughput (KB/s)     296,684    274,207    359,722    390,162     21%      42%
> sys time (sec)          92.67      93.33     251.06     237.56   -171%    -155%
> memcg_high              3,503      3,769     44,425     27,154
> memcg_swap_fail             0          0    115,814    141,936
> pswpin                     17          0          0          0
> pswpout               370,853    393,232          0          0
> zswpin                    693        123        666        667
> zswpout                 1,484        123  1,366,680  1,199,645
> thp_swpout                  0          0          0          0
> thp_swpout_                 0          0          0          0
> fallback
> pgmajfault              3,384      2,951      3,656      3,468
> ZSWPOUT-64kB              n/a        n/a     82,940     73,121
> SWPOUT-64kB            23,178     24,577          0          0
> -------------------------------------------------------------------------------
>
>
> Experiment 2: Each process sleeps for 10 sec after allocating memory
> (usemem --init-time -w -O --sleep 10 -n 70 1g):
>
> -------------------------------------------------------------------------------
>                    mm-unstable 9-17-2024     zswap-mTHP v6        Change wrt
>                          Baseline                                  Baseline
>                          "before"               "after"          (sleep 10)
> -------------------------------------------------------------------------------
> ZSWAP compressor         zstd   deflate-       zstd   deflate-    zstd deflate-
>                                      iaa                   iaa              iaa
> -------------------------------------------------------------------------------
> Throughput (KB/s)      86,744     93,730    157,528    113,110     82%      21%
> sys time (sec)         308.87     315.29     477.55     629.98    -55%    -100%
What is the elapsed time for all cases?
> memcg_high            169,450    188,700    143,691    177,887
> memcg_swap_fail    10,131,859  9,740,646 18,738,715 19,528,110
> pswpin                     17         16          0          0
> pswpout             1,154,779  1,210,485          0          0
> zswpin                    711        659      1,016        736
> zswpout                70,212     50,128  1,235,560  1,275,917
> thp_swpout                  0          0          0          0
> thp_swpout_                 0          0          0          0
> fallback
> pgmajfault              6,120      6,291      8,789      6,474
> ZSWPOUT-64kB              n/a        n/a     67,587     68,912
> SWPOUT-64kB            72,174     75,655          0          0
> -------------------------------------------------------------------------------
>
>
> Conclusions from the experiments:
> =================================
> 1) zswap-mTHP improves throughput as compared to the baseline, for zstd and
> deflate-iaa.
>
> 2) Yosry's theory is proved correct in the 4G constrained swap setup.
> When the processes are constrained to sleep 10 sec after allocating
> memory, thereby keeping the memory allocated longer, the "Baseline" or
> "before" with mTHP getting stored in SSD shows a degradation of 71% in
> throughput and 238% in sys time, as compared to the "Baseline" with
Could the higher sys time come from compressing on the CPU rather than
writing to disk?
> sleep 0 that benefits from serialization of disk IO not allowing all
> processes to allocate memory at the same time.
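The sleep 0 vs. sleep 10 baseline figures quoted above can be cross-checked
against the two tables. The following is a minimal sketch, not part of the
patch series; the helper name pct_change and the dictionary layout are
illustrative assumptions, and the inputs are copied from the "before"
columns of Experiment 1 and Experiment 2.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change of `after` with respect to `before`, in percent."""
    return (after - before) / before * 100.0


# Baseline ("before") figures from the tables: (sleep 0, sleep 10).
baseline = {
    "zstd": {"throughput": (296_684, 86_744), "sys_time": (92.67, 308.87)},
    "deflate-iaa": {"throughput": (274_207, 93_730), "sys_time": (93.33, 315.29)},
}

for compressor, metrics in baseline.items():
    for name, (sleep0, sleep10) in metrics.items():
        change = pct_change(sleep0, sleep10)
        print(f"{compressor:12s} {name:10s} {change:+.0f}%")
```

With these inputs the 71% throughput drop matches the zstd column, while
the 238% sys time increase matches the deflate-iaa column; the other two
combinations come out a few points lower.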
>
> 3) In the 4G SSD "sleep 0" case, zswap-mTHP shows an increase in sys time
> due to the cgroup charging and consequently higher memcg.high breaches
> and swapout activity.
>
> However, the "sleep 10" case's sys time seems to degrade less, and the
> memcg.high breaches and swapout activity are nearly the same between
> before and after (confirming Yosry's hypothesis). Further, the
> memcg_swap_fail activity in the "after" scenario is almost 2X that of
> the "before". This indicates failure to obtain swap offsets, resulting
> in the folio remaining active in memory.
>
> I tried to better understand this through the 64k mTHP swpout_fallback
> stats in the "sleep 10" zstd experiments:
>
> --------------------------------------------------------------
>                                         "before"       "after"
> --------------------------------------------------------------
> 64k mTHP swpout_fallback                 627,308       897,407
> 64k folio swapouts                        72,174        67,587
> [p|z]swpout events due to 64k mTHP     1,154,779     1,081,397
> 4k folio swapouts                         70,212       154,163
> --------------------------------------------------------------
>
> The data indicates a higher number of 64k folio swpout_fallback events
> with zswap-mTHP, which correlates with the higher memcg_swap_fail counts
> and 4k folio swapouts with zswap-mTHP. Could the root cause be
> fragmentation of the swap space due to zswap swapout being faster than
> SSD swapout?
>
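The breakdown in the small table can be approximated from the counters in
the "sleep 10" zstd columns. This is a rough sketch under two assumptions
that the thread does not state: 4 KiB base pages (so a 64 KiB folio is 16
pages), and every 64k swap-out attempt being counted once, either as a
fallback or as a 64k swap-out. The function names are made up for
illustration.

```python
# Assumption: 4 KiB base pages, so one 64 KiB folio covers 16 pages.
PAGES_PER_64K_FOLIO = 64 // 4


def split_swpout(total_page_swpouts: int, folios_64k: int) -> tuple[int, int]:
    """Split a per-page swap-out count into
    (pages written as part of 64k folios, remaining 4k-folio pages)."""
    from_64k = folios_64k * PAGES_PER_64K_FOLIO
    return from_64k, total_page_swpouts - from_64k


def fallback_share(fallbacks: int, folios_64k: int) -> float:
    """Fraction of 64k swap-out attempts that fell back, assuming each
    attempt is counted once as either a fallback or a 64k swap-out."""
    return fallbacks / (fallbacks + folios_64k)


# "after" (zswap-mTHP v6, zstd, sleep 10): zswpout and ZSWPOUT-64kB.
print(split_swpout(1_235_560, 67_587))
# swpout_fallback vs. 64k folio swapouts, "before" and "after".
print(f"{fallback_share(627_308, 72_174):.3f}")
print(f"{fallback_share(897_407, 67_587):.3f}")
```

The split lands within a few pages of the table's 1,081,397 and 154,163,
presumably because the reported counters are averages over three runs.
Under the stated assumption, the fallback share rises from roughly 90% to
roughly 93% of 64k swap-out attempts.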
[snip]
--
Best Regards,
Huang, Ying