From: Barry Song <21cnbao@gmail.com>
To: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, axboe@kernel.dk,
bala.seshasayee@linux.intel.com, chrisl@kernel.org,
david@redhat.com, hannes@cmpxchg.org,
kanchana.p.sridhar@intel.com, kasong@tencent.com,
linux-block@vger.kernel.org, minchan@kernel.org,
nphamcs@gmail.com, ryan.roberts@arm.com, surenb@google.com,
terrelln@fb.com, usamaarif642@gmail.com, v-songbaohua@oppo.com,
wajdi.k.feghali@intel.com, willy@infradead.org,
ying.huang@intel.com, yosryahmed@google.com, yuzhao@google.com,
zhengtangquan@oppo.com, zhouchengming@bytedance.com
Subject: Re: [PATCH RFC v3 0/4] mTHP-friendly compression in zsmalloc and zram based on multi-pages
Date: Fri, 29 Nov 2024 09:40:09 +1300
Message-ID: <CAGsJ_4wpHMoSk=6UXDNv_r+AR_-TwyQ5bM=sObvgcXinBOhFtA@mail.gmail.com>
In-Reply-To: <20241127045224.GF440697@google.com>
On Wed, Nov 27, 2024 at 5:52 PM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> On (24/11/27 09:20), Barry Song wrote:
> [..]
> > > 390 12736
> > > 395 13056
> > > 404 13632
> > > 410 14016
> > > 415 14336
> > > 418 14528
> > > 447 16384
> > >
> > > E.g. 13632 and 13056 are more than 500 bytes apart.
> > >
> > > > swap-out time(ms) 68711 49908
> > > > swap-in time(ms) 30687 20685
> > > > compression ratio 20.49% 16.9%
> > >
> > > These are not the only numbers to focus on, really important metrics
> > > are: zsmalloc pages-used and zsmalloc max-pages-used. Then we can
> > > calculate the pool memory usage ratio (the size of compressed data vs
> > > the number of pages zsmalloc pool allocated to keep them).
> >
> > To address this, we plan to collect more data and get back to you
> > afterwards. From my understanding, we still have an opportunity
> > to refine the CHAIN SIZE?
>
> Do you mean changing the value? It's configurable.
>
> > Essentially, each small object might cause some waste within the
> > original PAGE_SIZE. Now, with 4 * PAGE_SIZE, there could be a
> > single instance of waste. If we can manage the ratio, this could be
> > optimized?
>
> All size classes work the same and we merge size-classes with equal
> characteristics. So in the example above
>
> 395 13056
> 404 13632
>
> size-classes #396-403 are merged with size-class #404. And #404 size-class
> splits zspage into 13632-byte chunks, any smaller objects (e.g. an object
> from size-class #396 (which can be just one byte larger than #395
> objects)) takes that entire chunk and the rest of the space in the chunk
> is just padding.
>
> CHAIN_SIZE is how we find the optimal balance. The larger the zspage
> the more likely we squeeze some space for extra objects, which otherwise
> would have been just a waste. With a large CHAIN_SIZE we also change the
> characteristics of many size classes, so we merge fewer classes and have
> more clusters. The price, on the other hand, is more physical 0-order
> pages per zspage, which can be painful. On all the tests I ran 8 or 10
> worked best.
Thanks very much for the explanation. We’ll gather more data on this and follow
up with you.
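Just to check my understanding of the padding/chain trade-off, here is a toy
calculation (plain userspace arithmetic, not zsmalloc internals -- the real
allocator picks pages_per_zspage per class, up to CHAIN_SIZE, to minimise
waste) for the 13632-byte class from the table above:

/* Toy arithmetic only; not how zsmalloc actually sizes zspages. */
#include <stdio.h>

int main(void)
{
	const unsigned int page = 4096, chunk = 13632;
	unsigned int pages;

	for (pages = 4; pages <= 10; pages++) {
		/* bytes available in a zspage built from `pages` 0-order pages */
		unsigned int zspage = pages * page;

		printf("%2u pages: %u objects, %5u bytes left over\n",
		       pages, zspage / chunk, zspage % chunk);
	}
	return 0;
}

With 7 chained pages the leftover drops to ~1.4KB, and with 10 pages to 64
bytes, so a longer chain can turn what would otherwise be padding into an
extra object, at the cost of more 0-order pages per zspage -- which matches
your description.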
>
> [..]
> > > another option might be to just use a faster algorithm and then utilize
> > > post-processing (re-compression with zstd or writeback) for memory
> > > savings?
> >
> > The concern lies in power consumption
>
> But the power consumption concern is also in "decompress just one middle
> page from very large object" case, and size-classes de-fragmentation
That's why we have "[PATCH 4/4] mm: fall back to four small folios if mTHP
allocation fails", which addresses the issue of decompressing just one middle
page from a very large object. I assume recompression and writeback should
also operate on large objects as a whole if the original compression covered
multiple pages?
> which requires moving around lots of objects in order to form fuller
> zspages and release empty zspages. There are concerns everywhere, how
I assume the cost of defragmentation is roughly M * N, where:
* M is the number of objects,
* N is the average object size.
With multi-page objects, M drops to 1/4 of the original count. Although N
increases, the overall M * N ends up slightly smaller than before, since N
grows to just under 4 times the original object size?
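For illustration only (made-up numbers): defragmenting M = 1000 objects
averaging N = 3KB means copying roughly 3MB; with 4-page objects M drops to
250, and if each one compresses to a bit under 12KB (say 11.5KB) thanks to
the larger compression window, the total copied is roughly 2.8MB, i.e.
slightly less than before.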
> many of them are measured and analyzed and either ruled out or confirmed
> is another question.
In phone scenarios, if recompression uses zstd and the original compression
is lz4 on 4KB blocks, the cost of obtaining zstd-compressed objects would be:
* A: compressing 4 x 4KB with lz4
* B: decompressing 4 x 4KB with lz4
* C: compressing 4 x 4KB with zstd
By leveraging the speed advantages of mTHP swap and zstd's large-block
compression, the cost becomes:
* D: compressing one 16KB block with zstd
Since D alone is already smaller than C (one 16KB compression vs. four
separate 4KB compressions), it follows that D < A + B + C?
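If it helps, here is a rough userspace sketch of how the two paths could be
compared (assuming liblz4 and libzstd are available, e.g. built with
"cc cmp.c -llz4 -lzstd"; this is not the zram code path, and the synthetic
buffer means the absolute numbers are only indicative):

/*
 * Userspace sketch only, not the zram path: compare
 *   A + B + C: lz4-compress 4 x 4KB, lz4-decompress them,
 *              then zstd-compress the four 4KB pages again
 *   D:         zstd-compress one 16KB block directly
 */
#include <lz4.h>
#include <zstd.h>
#include <stdio.h>
#include <time.h>

#define PAGE_SZ  4096
#define NR_PAGES 4

static double now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

int main(void)
{
	static char src[NR_PAGES * PAGE_SZ];
	static char lz4_buf[NR_PAGES][LZ4_COMPRESSBOUND(PAGE_SZ)];
	static char tmp[NR_PAGES * PAGE_SZ];
	static char zstd_buf[64 * 1024];
	int lz4_len[NR_PAGES];
	double t0;
	int i;

	/* synthetic, mildly compressible data */
	for (i = 0; i < (int)sizeof(src); i++)
		src[i] = (i * 7) & 0x3f;

	t0 = now_ms();
	for (i = 0; i < NR_PAGES; i++)				/* A */
		lz4_len[i] = LZ4_compress_default(src + i * PAGE_SZ, lz4_buf[i],
						  PAGE_SZ, (int)sizeof(lz4_buf[i]));
	for (i = 0; i < NR_PAGES; i++)				/* B */
		LZ4_decompress_safe(lz4_buf[i], tmp + i * PAGE_SZ,
				    lz4_len[i], PAGE_SZ);
	for (i = 0; i < NR_PAGES; i++)				/* C */
		ZSTD_compress(zstd_buf, sizeof(zstd_buf),
			      tmp + i * PAGE_SZ, PAGE_SZ, 3);
	printf("A + B + C: %.3f ms\n", now_ms() - t0);

	t0 = now_ms();						/* D */
	ZSTD_compress(zstd_buf, sizeof(zstd_buf), src, sizeof(src), 3);
	printf("D        : %.3f ms\n", now_ms() - t0);

	return 0;
}

Of course we would need to measure this with real swapped-out pages rather
than a synthetic buffer before drawing any conclusions.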
Thanks
Barry