From: Takero Funaki <flintglass@gmail.com>
To: Nhat Pham <nphamcs@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Yosry Ahmed <yosryahmed@google.com>,
	 Chengming Zhou <chengming.zhou@linux.dev>,
	Jonathan Corbet <corbet@lwn.net>,
	 Andrew Morton <akpm@linux-foundation.org>,
	 Domenico Cerasuolo <cerasuolodomenico@gmail.com>,
	linux-mm@kvack.org, linux-doc@vger.kernel.org,
	 linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 0/6] mm: zswap: global shrinker fix and proactive shrink
Date: Mon, 15 Jul 2024 17:20:06 +0900	[thread overview]
Message-ID: <CAPpoddfNfrGjhHzQ4KURv2y_z-iyY8cTzG+7d2ooQFU5NcU80w@mail.gmail.com> (raw)
In-Reply-To: <CAKEwX=MFdjryQRDm9b-Oxquhw954HUipCCpABSLwH9mrV4D3WA@mail.gmail.com>

On Sat, Jul 13, 2024 at 8:02 Nhat Pham <nphamcs@gmail.com> wrote:

> >
> > I agree this does not follow LRU, but I think LRU priority
> > inversion is unavoidable once the pool limit is hit.
> > accept_thr_percent should be lowered to reduce the probability of
> > LRU inversion if it matters. (That is why I implemented the
> > proactive shrinker.)
>
> And yet, in your own benchmark it fails to prevent that, no? I think
> you lowered it all the way down to 50%.
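
(For reference: on an upstream kernel this knob is the zswap module
parameter accept_threshold_percent, i.e.
/sys/module/zswap/parameters/accept_threshold_percent.)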
>
> >
> > When writeback throughput is slower than the rate of memory usage
> > growth, zswap_store() will have to reject pages sooner or later.
> > If we evict the oldest stored pages synchronously before rejecting a
> > new page (rotating the pool to keep LRU order), it will affect
> > latency depending on how much writeback is required to store the new
> > page. If the oldest pages were compressed well, we would have to
> > evict too many pages to store a warmer page, which blocks reclaim
> > progress. Fragmentation in the zspool may also increase the required
> > writeback amount.
> > We cannot both maintain LRU priority and keep pageout latency low.
>
> Hmm yeah, I guess this is fair. Looks like there is not a lot of
> choice, if you want to maintain decent pageout latency...
>
> I could suggest that you have a budgeted zswap writeback on zswap
> store - i.e. if the pool is full, then try zswap writeback until we
> have enough space or the budget is exhausted. But that feels like
> even more engineering - the IO priority approach might even be easier
> at that point LOL.
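
For concreteness, I read the budgeted idea as roughly the following
shape. This is a sketch only; zswap_pool_over_limit() and
zswap_writeback_oldest() are hypothetical helpers, not functions in
the current tree:

/*
 * Sketch: bounded synchronous writeback before rejecting a store.
 * Evict at most ZSWAP_STORE_WB_BUDGET of the oldest entries; give up
 * early if the LRU is empty or writeback fails.
 */
#define ZSWAP_STORE_WB_BUDGET	8	/* max pages evicted per store */

static bool zswap_try_make_room(void)
{
	int budget = ZSWAP_STORE_WB_BUDGET;

	while (zswap_pool_over_limit() && budget-- > 0) {
		if (zswap_writeback_oldest() < 0)
			break;	/* LRU empty or IO error */
	}
	/* Caller rejects the incoming page if we are still over limit. */
	return !zswap_pool_over_limit();
}

This keeps the worst-case store latency bounded by the budget, at the
cost of only partially restoring LRU order when the pool is full.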
>
> Oh well, global shrinker delay it is :)
>
> >
> > Additionally, zswap_writeback_entry() is slower than direct pageout.
> > I assume this is because the shrinker performs 4KB IO synchronously.
> > I am seeing that shrinking throughput is limited to disk IOPS * 4KB,
> > while much higher throughput can be achieved by disabling zswap.
> > Direct pageout can be faster than zswap writeback, possibly because
> > of bio optimization or sequential allocation of swap.
>
> Hah, this is interesting!
>
> I wonder though, if the solution here is to perform some sort of
> batching for zswap writeback.
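
One cheap form of batching might be to issue a burst of writebacks
under a single block-layer plug (blk_start_plug()/blk_finish_plug())
so the block layer gets a chance to merge the bios. A rough sketch,
reusing the hypothetical zswap_writeback_oldest() helper from above:

#include <linux/blkdev.h>

/*
 * Sketch: write back up to @nr of the oldest entries in one plugged
 * burst so their bios can be merged before submission.
 */
static unsigned int zswap_writeback_batch(unsigned int nr)
{
	struct blk_plug plug;
	unsigned int done = 0;

	blk_start_plug(&plug);
	while (done < nr && zswap_writeback_oldest() == 0)
		done++;
	blk_finish_plug(&plug);

	return done;
}

Whether this helps would depend on how sequential the swap slots of
neighboring LRU entries actually are, so it would need measuring.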
>
> BTW, what is the type of the storage device you are using for swap? Is
> it SSD or HDD etc?
>

It was tested on an Azure VM with SSD-backed storage. The total IOPS
was capped at 4K by the VM host, and the maximum throughput of the
global shrinker was around 16 MB/s - consistent with 4,000 IOPS * 4 KB
per synchronous writeback. Proactive shrinking cannot prevent
pool_limit_hit, since memory allocation can proceed on the order of
GB/s. (The benchmark script allocates 2 GB sequentially, which
compressed to 1.3 GB, while the zswap pool was limited to 200 MB.)


> >
> >
> > > Have you experimented with synchronous reclaim in the case the pool is
> > > full? All the way to the acceptance threshold is too aggressive of
> > > course - you might need to find something in between :)
> > >
> >
> > I don't get what the expected situation is.
> > The benchmark of patch 6 is already performing synchronous reclaim
> > with the pool full, since bulk memory allocation (writes to mmapped
> > space) is much faster than writeback throughput. The zswap pool is
> > filled instantly at the beginning of benchmark runs. The
> > accept_thr_percent is not significant for the benchmark, I think.
>
> No. I meant synchronous reclaim as in triggering zswap writeback
> within the zswap store path, to make space for the incoming new zswap
> pages. But you already addressed it above :)
>
>
> >
> >
> > >
> > > I wonder if this contention would show up in PSI metrics
> > > (/proc/pressure/io, or the cgroup variants if you use them). Maybe
> > > correlate reclaim counters (pgscan, zswpout, pswpout, zswpwb, etc.)
> > > with IO pressure to show the pattern, i.e. the contention problem
> > > was there before, and is now resolved? :)
> >
> > Unfortunately, I could not find a reliable metric other than elapsed
> > time. It seems PSI does not distinguish stalls caused by rejected
> > pageout from stalls caused by shrinker writeback.
> > As for counters, this issue affects latency but does not increase
> > the number of pageins/pageouts. Is there any better way to observe
> > the origin of the contention?
> >
> > Thanks.
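
Following up on my own question: one crude way to eyeball the
correlation Johannes suggested is a periodic sampler that prints IO
pressure next to the reclaim counters. A userspace sketch (counter
names as exposed in /proc/vmstat on recent kernels):

#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Read one counter value from /proc/vmstat, or -1 on failure. */
static long vmstat_value(const char *key)
{
	char line[128];
	size_t len = strlen(key);
	long val = -1;
	FILE *f = fopen("/proc/vmstat", "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		if (!strncmp(line, key, len) && line[len] == ' ') {
			sscanf(line + len, "%ld", &val);
			break;
		}
	}
	fclose(f);
	return val;
}

int main(void)
{
	for (;;) {
		char psi[256] = "";
		FILE *f = fopen("/proc/pressure/io", "r");

		if (f) {
			/* First line: "some avg10=... total=..." */
			if (fgets(psi, sizeof(psi), f))
				psi[strcspn(psi, "\n")] = '\0';
			fclose(f);
		}
		printf("%s | zswpout=%ld zswpwb=%ld pswpout=%ld\n", psi,
		       vmstat_value("zswpout"), vmstat_value("zswpwb"),
		       vmstat_value("pswpout"));
		sleep(1);
	}
}

It still cannot attribute individual stalls, but plotting zswpwb
against the io pressure spikes would at least show whether the two
move together before and after the series.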

