From: Takero Funaki <flintglass@gmail.com>
To: Nhat Pham <nphamcs@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Yosry Ahmed <yosryahmed@google.com>,
Chengming Zhou <chengming.zhou@linux.dev>,
Jonathan Corbet <corbet@lwn.net>,
Andrew Morton <akpm@linux-foundation.org>,
Domenico Cerasuolo <cerasuolodomenico@gmail.com>,
linux-mm@kvack.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 0/6] mm: zswap: global shrinker fix and proactive shrink
Date: Mon, 15 Jul 2024 17:20:06 +0900
Message-ID: <CAPpoddfNfrGjhHzQ4KURv2y_z-iyY8cTzG+7d2ooQFU5NcU80w@mail.gmail.com>
In-Reply-To: <CAKEwX=MFdjryQRDm9b-Oxquhw954HUipCCpABSLwH9mrV4D3WA@mail.gmail.com>
On Sat, Jul 13, 2024 at 8:02 Nhat Pham <nphamcs@gmail.com> wrote:
> >
> > I agree this does not follow LRU order, but I think the LRU priority
> > inversion is unavoidable once the pool limit is hit.
> > accept_thr_percent should be lowered to reduce the probability of LRU
> > inversion if it matters. (That is why I implemented the proactive
> > shrinker.)
>
> And yet, in your own benchmark it fails to prevent that, no? I think
> you lowered it all the way down to 50%.
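
For reference, that corresponds to lowering the zswap module parameter
at runtime, e.g.:

        echo 50 > /sys/module/zswap/parameters/accept_threshold_percent

against the default of 90.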
>
> >
> > When writeback throughput is slower than the rate at which memory
> > usage grows, zswap_store() will have to reject pages sooner or later.
> > If we evict the oldest stored pages synchronously before rejecting a
> > new page (rotating the pool to preserve LRU order), it will affect
> > latency depending on how much writeback is required to store the new
> > page. If the oldest pages were compressed well, we would have to
> > evict too many pages to store a warmer page, which blocks reclaim
> > progress. Fragmentation in the zspool may also increase the required
> > writeback amount. We cannot both preserve LRU order and keep pageout
> > latency low.
>
> Hmm yeah, I guess this is fair. Looks like there is not a lot of
> choice, if you want to maintain decent pageout latency...
>
> I could suggest that you have a budgeted zswap writeback on zswap
> store - i.e., if the pool is full, try zswap writeback until we have
> enough space or the budget is exhausted. But that feels like even
> more engineering - the IO priority approach might even be easier at
> that point LOL.
>
> Oh well, global shrinker delay it is :)
>
> >
> > Additionally, zswap_writeback_entry() is slower than direct pageout.
> > I assume this is because the shrinker performs 4KB IO synchronously.
> > I am seeing that shrinker throughput is limited to disk IOPS * 4KB,
> > while much higher throughput can be achieved by disabling zswap.
> > Direct pageout can be faster than zswap writeback, possibly because
> > of bio optimization or sequential allocation of swap.
>
> Hah, this is interesting!
>
> I wonder, though, if the solution here is to perform some sort of
> batching for zswap writeback.
>
> BTW, what type of storage device are you using for swap? Is it SSD,
> HDD, etc.?
>
It was tested on an Azure VM with SSD-backed storage. Total IOPS was
capped at 4K by the VM host, so the maximum throughput of the global
shrinker was around 16 MB/s (4K IOPS * 4 KB). Proactive shrinking
cannot prevent pool_limit_hit since memory allocation can run on the
order of GB/s. (The benchmark script allocates 2 GB sequentially, which
compressed to 1.3 GB, while the zswap pool was limited to 200 MB.)
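
On batching: since each writeback currently issues a single 4 KB
synchronous write, one direction might be to plug the block queue
around a batch so adjacent writes can be merged. Just a sketch;
shrink_one_entry() is a hypothetical stand-in for the per-entry
writeback, and the benefit depends on the swap slots being allocated
close together:

#include <linux/blkdev.h>

static void zswap_shrink_batch(unsigned int batch)
{
        struct blk_plug plug;
        unsigned int i;

        /*
         * Hold back dispatch until the whole batch is queued, so the
         * block layer can merge adjacent swap slots into larger IOs
         * instead of issuing 'batch' separate 4 KB writes.
         */
        blk_start_plug(&plug);
        for (i = 0; i < batch; i++) {
                if (!shrink_one_entry())
                        break;
        }
        blk_finish_plug(&plug);
}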
> >
> >
> > > Have you experimented with synchronous reclaim when the pool is
> > > full? All the way to the acceptance threshold is too aggressive of
> > > course - you might need to find something in between :)
> > >
> >
> > I am not sure what the expected situation is.
> > The benchmark for patch 6 is effectively performing synchronous
> > reclaim when the pool is full, since bulk memory allocation (writes
> > to mmapped space) is much faster than writeback throughput. The
> > zswap pool is filled instantly at the beginning of benchmark runs.
> > accept_thr_percent is not significant for the benchmark, I think.
>
> No. I meant synchronous reclaim as in triggering zswap writeback
> within the zswap store path, to make space for the incoming new zswap
> pages. But you already addressed it above :)
>
>
> >
> >
> > >
> > > I wonder if this contention would show up in PSI metrics
> > > (/proc/pressure/io, or the cgroup variants if you use them). Maybe
> > > correlate reclaim counters (pgscan, zswpout, pswpout, zswpwb, etc.)
> > > with IO pressure to show the pattern, i.e. the contention problem
> > > was there before, and is now resolved? :)
> >
> > Unfortunately, I could not find a reliable metric other than elapsed
> > time. It seems PSI does not distinguish stalls caused by rejected
> > pageouts from stalls caused by shrinker writeback.
> > As for counters, this issue affects latency but does not increase
> > the number of pageins/pageouts. Is there a better way to observe the
> > origin of the contention?
> >
> > Thanks.
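
To follow up on my own question above: the best I could do so far is
to sample the reclaim counters next to IO pressure by hand, with a
throwaway userspace sampler along these lines. Nothing zswap-specific;
it just reads /proc/vmstat and the "some" line of /proc/pressure/io
once per second (counter names as in your list; zswpwb needs a recent
kernel):

#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Return one /proc/vmstat counter by name, or -1 if absent. */
static long vmstat(const char *name)
{
        char line[128], key[64];
        long v, val = -1;
        FILE *f = fopen("/proc/vmstat", "r");

        if (!f)
                return -1;
        while (fgets(line, sizeof(line), f)) {
                if (sscanf(line, "%63s %ld", key, &v) == 2 &&
                    !strcmp(key, name)) {
                        val = v;
                        break;
                }
        }
        fclose(f);
        return val;
}

int main(void)
{
        char psi[256] = "";

        for (;;) {
                FILE *f = fopen("/proc/pressure/io", "r");

                if (f) {
                        if (fgets(psi, sizeof(psi), f))
                                psi[strcspn(psi, "\n")] = '\0';
                        fclose(f);
                }
                printf("pswpout=%ld zswpout=%ld zswpwb=%ld | %s\n",
                       vmstat("pswpout"), vmstat("zswpout"),
                       vmstat("zswpwb"), psi);
                sleep(1);
        }
        return 0;
}

That said, as noted above, the counters show where the IO went but not
which stalls came from rejected pageouts versus shrinker writeback.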