From: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
To: Yosry Ahmed <yosryahmed@google.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
"nphamcs@gmail.com" <nphamcs@gmail.com>,
"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
"Huang, Ying" <ying.huang@intel.com>,
"21cnbao@gmail.com" <21cnbao@gmail.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"Zou, Nanhai" <nanhai.zou@intel.com>,
"Feghali, Wajdi K" <wajdi.k.feghali@intel.com>,
"Gopal, Vinodh" <vinodh.gopal@intel.com>,
Chris Li <chrisl@kernel.org>,
"Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
Subject: RE: [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios
Date: Thu, 29 Aug 2024 03:10:36 +0000 [thread overview]
Message-ID: <SJ0PR11MB56780CF8B669BE97DEAE8DCAC9962@SJ0PR11MB5678.namprd11.prod.outlook.com> (raw)
In-Reply-To: <CAJD7tkYBaHpJG8Uzo592cXYwNbzRJ0G8Ju71mkFA+T8uS3eARg@mail.gmail.com>
Hi Yosry,
> -----Original Message-----
> From: Yosry Ahmed <yosryahmed@google.com>
> Sent: Wednesday, August 28, 2024 6:02 PM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
> Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> hannes@cmpxchg.org; nphamcs@gmail.com; ryan.roberts@arm.com;
> Huang, Ying <ying.huang@intel.com>; 21cnbao@gmail.com; akpm@linux-
> foundation.org; Zou, Nanhai <nanhai.zou@intel.com>; Feghali, Wajdi K
> <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com>; Chris
> Li <chrisl@kernel.org>
> Subject: Re: [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios
>
> [..]
> > > > In the "Before" scenario, when zswap does not store mTHP, only
> allocations
> > > > count towards the cgroup memory limit. However, in the "After"
> scenario,
> > > > with the introduction of zswap_store() mTHP, both, allocations as well as
> > > > the zswap compressed pool usage from all 70 processes are counted
> > > towards
> > > > the memory limit. As a result, we see higher swapout activity in the
> > > > "After" data. Hence, more time is spent doing reclaim as the zswap
> cgroup
> > > > charge leads to more frequent memory.high breaches.
> > > >
> > > > This causes degradation in throughput and sys time with zswap mTHP,
> more
> > > so
> > > > in case of zstd than deflate-iaa. Compress latency could play a part in
> > > > this - when there is more swapout activity happening, a slower
> compressor
> > > > would cause allocations to stall for any/all of the 70 processes.
> > > >
> > > > In my opinion, even though the test set up does not provide an accurate
> > > > way for a direct before/after comparison (because of zswap usage being
> > > > counted in cgroup, hence towards the memory.high), it still seems
> > > > reasonable for zswap_store to support (m)THP, so that further
> performance
> > > > improvements can be implemented.
> > >
> > > Are you saying that in the "Before" data we end up skipping zswap
> > > completely because of using mTHPs?
> >
> > That's right, Yosry.
> >
> > >
> > > Does it make more sense to turn CONFIG_THP_SWAP in the "Before" data
> >
> > We could do this, however I am not sure if turning off CONFIG_THP_SWAP
> > will have other side-effects in terms of disabling mm code paths outside of
> > zswap that are intended to be mTHP optimizations that could again skew
> > the before/after comparisons.
>
> Yeah that's possible, but right now we are testing mTHP swapout that
> does not go through zswap at all vs. mTHP swapout going through zswap.
>
> I think what we really want to measure is 4K swapout going through
> zswap vs. mTHP swapout going through zswap. This assumes that current
> zswap setups disable CONFIG_THP_SWAP, so we would be measuring the
> benefit of allowing them to enable CONFIG_THP_SWAP by supporting it
> properly in zswap.
>
> If some setups with zswap have CONFIG_THP_SWAP enabled then that's a
> different story, but we already have the data for this case as well
> right now in case this is a legitimate setup.
>
> Adding Chris Li here from Google. We have CONFIG_THP_SWAP disabled
> with zswap, so for us we would want to know the benefit of supporting
> CONFIG_THP_SWAP properly in zswap. At least I think so :)
Sure, this makes sense. Here's the data that I gathered with CONFIG_THP_SWAP
disabled. We see improvements overall in throughput and sys time for zstd and
deflate-iaa, when comparing before (THP_SWAP=N) vs. after (THP_SWAP=Y):
64K mTHP:
=========
-------------------------------------------------------------------------------
v6.11-rc3 mainline zswap-mTHP Change wrt
Baseline Baseline
CONFIG_THP_SWAP=N CONFIG_THP_SWAP=Y
--------------------------------------------------------------------------------
ZSWAP compressor zstd deflate- zstd deflate- zstd deflate-
iaa iaa iaa
-------------------------------------------------------------------------------
Throughput (KB/s) 136,113 140,044 140,363 151,938 3% 8%
sys time (sec) 986.78 951.95 954.85 735.47 3% 23%
memcg_high 124,183 127,513 138,651 133,884
memcg_swap_high 0 0 0 0
memcg_swap_fail 619,020 751,099 0 0
pswpin 0 0 0 0
pswpout 0 0 0 0
zswpin 656 569 624 639
zswpout 9,413,603 11,284,812 9,453,761 9,385,910
thp_swpout 0 0 0 0
thp_swpout_ 0 0 0 0
fallback
pgmajfault 3,470 3,382 4,633 3,611
ZSWPOUT-64kB n/a n/a 590,768 586,521
SWPOUT-64kB 0 0 0 0
-------------------------------------------------------------------------------
2M THP:
=======
------------------------------------------------------------------------------
v6.11-rc3 mainline zswap-mTHP Change wrt
Baseline Baseline
CONFIG_THP_SWAP=N CONFIG_THP_SWAP=Y
------------------------------------------------------------------------------
ZSWAP compressor zstd deflate- zstd deflate- zstd deflate-
iaa iaa iaa
------------------------------------------------------------------------------
Throughput (KB/s) 164,220 172,523 165,005 174,536 0.5% 1%
sys time (sec) 855.76 686.94 801.72 676.65 6% 1%
memcg_high 14,628 16,247 14,951 16,096
memcg_swap_high 0 0 0 0
memcg_swap_fail 18,698 21,114 0 0
pswpin 0 0 0 0
pswpout 0 0 0 0
zswpin 663 665 5,333 781
zswpout 8,419,458 8,992,065 8,546,895 9,355,760
thp_swpout 0 0 0 0
thp_swpout_ 18,697 21,113 0 0
fallback
pgmajfault 3,439 3,496 8,139 3,582
ZSWPOUT-2048kB n/a n/a 16,684 18,270
SWPOUT-2048kB 0 0 0 0
-----------------------------------------------------------------------------
Thanks,
Kanchana
>
> >
> > Will wait for Nhat's comments as well.
> >
> > Thanks,
> > Kanchana
> >
> > > to force the mTHPs to be split and for the data to be stored in zswap?
> > > This would be a more fair Before/After comparison where the memory
> > > goes to zswap in both cases, but "Before" has to be split because of
> > > zswap's lack of support for mTHP. I assume most setups relying on
> > > zswap will be turning CONFIG_THP_SWAP off today anyway, but maybe
> not.
> > > Nhat, is this something you can share?
next prev parent reply other threads:[~2024-08-29 3:10 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-28 9:35 Kanchana P Sridhar
2024-08-28 9:35 ` [PATCH v5 1/3] mm: Define obj_cgroup_get() if CONFIG_MEMCG is not defined Kanchana P Sridhar
2024-08-28 9:35 ` [PATCH v5 2/3] mm: zswap: zswap_store() extended to handle mTHP folios Kanchana P Sridhar
2024-08-28 9:35 ` [PATCH v5 3/3] mm: swap: Count successful mTHP ZSWAP stores in sysfs mTHP zswpout stats Kanchana P Sridhar
2024-08-28 15:55 ` [PATCH v5 0/3] mm: ZSWAP swap-out of mTHP folios Nhat Pham
2024-08-28 17:23 ` Nhat Pham
2024-08-28 19:30 ` Sridhar, Kanchana P
2024-08-28 19:24 ` Sridhar, Kanchana P
2024-08-28 21:35 ` Nhat Pham
2024-08-29 0:06 ` Sridhar, Kanchana P
2024-08-29 17:10 ` Nhat Pham
2024-08-29 19:38 ` Sridhar, Kanchana P
2024-08-30 4:52 ` Chengming Zhou
2024-09-20 2:34 ` Sridhar, Kanchana P
2024-08-29 3:59 ` Sridhar, Kanchana P
2024-08-28 22:37 ` Yosry Ahmed
2024-08-29 0:20 ` Sridhar, Kanchana P
2024-08-29 1:01 ` Yosry Ahmed
2024-08-29 3:10 ` Sridhar, Kanchana P [this message]
2024-08-29 23:33 ` Nhat Pham
2024-08-29 23:38 ` Yosry Ahmed
2024-08-29 23:47 ` Nhat Pham
2024-08-29 23:55 ` Yosry Ahmed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=SJ0PR11MB56780CF8B669BE97DEAE8DCAC9962@SJ0PR11MB5678.namprd11.prod.outlook.com \
--to=kanchana.p.sridhar@intel.com \
--cc=21cnbao@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=chrisl@kernel.org \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nanhai.zou@intel.com \
--cc=nphamcs@gmail.com \
--cc=ryan.roberts@arm.com \
--cc=vinodh.gopal@intel.com \
--cc=wajdi.k.feghali@intel.com \
--cc=ying.huang@intel.com \
--cc=yosryahmed@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox