From: Sergey Senozhatsky <senozhatsky@chromium.org>
To: Barry Song <21cnbao@gmail.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>,
akpm@linux-foundation.org, minchan@kernel.org,
linux-block@vger.kernel.org, axboe@kernel.dk, linux-mm@kvack.org,
terrelln@fb.com, chrisl@kernel.org, david@redhat.com,
kasong@tencent.com, yuzhao@google.com, yosryahmed@google.com,
nphamcs@gmail.com, willy@infradead.org, hannes@cmpxchg.org,
ying.huang@intel.com, surenb@google.com,
wajdi.k.feghali@intel.com, kanchana.p.sridhar@intel.com,
corbet@lwn.net, zhouchengming@bytedance.com,
Tangquan Zheng <zhengtangquan@oppo.com>,
Barry Song <v-songbaohua@oppo.com>
Subject: Re: [PATCH RFC 2/2] zram: support compression at the granularity of multi-pages
Date: Fri, 19 Apr 2024 12:41:15 +0900 [thread overview]
Message-ID: <20240419034115.GD14947@google.com> (raw)
In-Reply-To: <CAGsJ_4wdBtoBUbwmDs+FnPvinG-PtKKY7SzOinr_6DzrZ22_0A@mail.gmail.com>
On (24/04/11 19:49), Barry Song wrote:
> > > This question brings up an interesting observation. In our actual product,
> > > we've noticed a success rate of over 90% when allocating large folios in
> > > do_swap_page, but occasionally we encounter failures. In such cases,
> > > instead of resorting to partial reads, we opt to allocate 16 small folios
> > > and request zram to fill them all. This strategy effectively minimizes
> > > partial reads to nearly zero. However, integrating this into the upstream
> > > codebase seems like a considerable task, and for now it remains part of
> > > our out-of-tree code [1], which is also open-source. We're gradually
> > > sending patches for the swap-in process, systematically cleaning up the
> > > product's code.
> >
> > I see, thanks for the explanation.
> > Does this mean this series is ahead of its time?
>
> I feel it is necessary to present the whole picture together with the large
> folios swap-in series [1].
Yeah, makes sense.
> > These partial reads/writes are difficult to justify - instead of doing
> > comp_op(PAGE_SIZE) we, in the worst case, now can do ZCOMP_MULTI_PAGES_NR
> > of comp_op(ZCOMP_MULTI_PAGES_ORDER) (assuming an access pattern that
> > touches each of multi-pages individually). That is a potentially huge
> > increase in CPU/power usage, which cannot be easily sacrificed. In fact,
> > I'd probably say that power usage is more important here than zspool
> > memory usage (that we have means to deal with).
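(To put rough numbers on that worst case, assuming ZCOMP_MULTI_PAGES_ORDER = 4
and 4KiB pages, i.e. ZCOMP_MULTI_PAGES_NR = 16: an access pattern that touches
each subpage individually triggers 16 decompressions of a 64KiB object, about
1MiB of decompression work to fault in 64KiB of data, versus 16 decompressions
of 4KiB objects today, so roughly 16x the decompression work.)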
>
> Once Ryan's mTHP swapout without splitting [2] is integrated into the
> mainline, this patchset certainly gains an advantage for SWPOUT. However,
> for SWPIN the situation is more nuanced. There's a risk of failing to
> allocate an mTHP, which could result in the allocation of a small folio
> instead. In such cases, decompressing a large folio but copying only one
> subpage is inefficient.
>
> In real-world products, we've addressed this challenge in two ways:
> 1. We've enhanced reserved page blocks for mTHP to boost allocation
>    success rates.
> 2. In instances where we fail to allocate a large folio, we fall back to
>    allocating nr_pages small folios instead of just one, so we still
>    decompress only once per multi-page (see the sketch below).
>
> With these measures in place, we consistently achieve wins in both power
> consumption and memory savings. However, it's important to note that these
> optimizations are specific to our product, and there's still much work
> needed to upstream them all.
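For illustration, here is a rough sketch of the fallback described in (2)
above. The names are hypothetical (zram_read_multi_page() is not an API from
the posted series); the point is only that the multi-page object is
decompressed once and then scattered into nr_pages order-0 folios:

/*
 * Sketch only, not code from the posted series.  When mTHP allocation
 * fails, allocate nr_pages order-0 folios, decompress the whole
 * multi-page object once into a scratch buffer, and copy the subpages
 * out: one decompression instead of nr_pages of them.
 * zram_read_multi_page() is a hypothetical helper.
 */
static int swapin_multipage_fallback(swp_entry_t entry, struct folio **folios,
				     unsigned int nr_pages, void *scratch)
{
	unsigned int i;
	int err;

	for (i = 0; i < nr_pages; i++) {
		folios[i] = folio_alloc(GFP_HIGHUSER_MOVABLE, 0);
		if (!folios[i])
			return -ENOMEM;	/* caller frees what was allocated */
	}

	err = zram_read_multi_page(entry, scratch);	/* one decompression */
	if (err)
		return err;

	for (i = 0; i < nr_pages; i++)
		memcpy(folio_address(folios[i]),
		       scratch + i * PAGE_SIZE, PAGE_SIZE);

	return 0;
}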
Do you track any other metrics? Memory savings is just one way of looking
at it. The other metric is the utilization ratio of the zspool:
    compressed size : zs_get_total_pages(zram->mem_pool)
Compaction and migration can also be interesting, given that zsmalloc is
changing.
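For concreteness, a rough sketch of how that ratio could be computed in
zram_drv.c, assuming the mainline field names (stats.compr_data_size,
mem_pool), which this series may of course change:

/*
 * Sketch only: how full the zsmalloc pool actually is, i.e. compressed
 * data vs. the memory zsmalloc has allocated for it.  Field names are
 * taken from the mainline driver and may differ in the multi-page
 * series; needs <linux/math64.h> for div64_u64().
 */
static u64 zram_pool_util_permille(struct zram *zram)
{
	u64 compr = atomic64_read(&zram->stats.compr_data_size);
	u64 used  = (u64)zs_get_total_pages(zram->mem_pool) << PAGE_SHIFT;

	return used ? div64_u64(compr * 1000, used) : 0;
}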
> > Have you evaluated power usage?
> >
> > I also wonder if it brings down the number of ZRAM_SAME pages. Suppose
> > several pages within a ZCOMP_MULTI_PAGES_ORDER chunk are filled with
> > zeroes (or some other recognizable pattern); previously those would have
> > been stored using just an unsigned long each. It even makes me wonder
> > whether the ZRAM_SAME test makes sense on multi-pages at all, for that
> > matter.
>
> I don't think we need to worry about ZRAM_SAME.
Oh, it's not that I worry about it, it's just another thing that is
changing. E.g. having memcpy() /* current ZRAM_SAME handling */
vs decomp(order 4) and then memcpy().
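Roughly the difference below (purely illustrative, not the actual zram read
path; struct mp_chunk and decompress_chunk() are made-up names):

/*
 * Illustrative sketch only.  With per-page objects a same-filled page is
 * served by a pattern fill; with an order-4 object the whole 64KiB chunk
 * has to be decompressed even when the requested subpage is all zeroes,
 * unless the entire chunk happens to be same-filled.
 */
struct mp_chunk {
	bool same_filled;		/* whole 64KiB chunk is one pattern */
	unsigned long fill_value;
	/* ... zsmalloc handle, compressed length, etc. ... */
};

void decompress_chunk(struct mp_chunk *c, void *dst);	/* hypothetical */

static void read_subpage_sketch(void *dst, struct mp_chunk *c,
				unsigned int subpage, void *scratch_64k)
{
	if (c->same_filled) {
		/* old ZRAM_SAME-style fast path: just a pattern fill */
		memset_l(dst, c->fill_value,
			 PAGE_SIZE / sizeof(unsigned long));
		return;
	}
	/* otherwise: decompress all 16 pages to copy one of them out */
	decompress_chunk(c, scratch_64k);
	memcpy(dst, scratch_64k + subpage * PAGE_SIZE, PAGE_SIZE);
}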
> mTHP offers a means to emulate a 16KiB/64KiB base page while maintaining
> software compatibility with a 4KiB base page. The primary concern here lies
> in partial read/write operations. In our product, we've successfully
> addressed these issues. However, convincing people in the mainline
> community may take considerable time and effort :-)
Do you have a rebased zram/zsmalloc series somewhere publicly accessible
that I can test?