From: Barry Song <21cnbao@gmail.com>
To: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: akpm@linux-foundation.org, minchan@kernel.org,
	linux-block@vger.kernel.org,  axboe@kernel.dk,
	linux-mm@kvack.org, terrelln@fb.com, chrisl@kernel.org,
	 david@redhat.com, kasong@tencent.com, yuzhao@google.com,
	 yosryahmed@google.com, nphamcs@gmail.com, willy@infradead.org,
	 hannes@cmpxchg.org, ying.huang@intel.com, surenb@google.com,
	 wajdi.k.feghali@intel.com, kanchana.p.sridhar@intel.com,
	corbet@lwn.net,  zhouchengming@bytedance.com,
	Tangquan Zheng <zhengtangquan@oppo.com>,
	 Barry Song <v-songbaohua@oppo.com>
Subject: Re: [PATCH RFC 2/2] zram: support compression at the granularity of multi-pages
Date: Thu, 11 Apr 2024 14:03:46 +1200
Message-ID: <CAGsJ_4yKjfr1kgFKufM68yJoTysgT_gri4Dbg-aghj70F0Zf0Q@mail.gmail.com>
In-Reply-To: <20240411014237.GB8743@google.com>

On Thu, Apr 11, 2024 at 1:42 PM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> On (24/03/28 10:48), Barry Song wrote:
> [..]
> > +/*
> > + * Use a temporary buffer to decompress the page, as the decompressor
> > + * always expects a full page for the output.
> > + */
> > +static int zram_bvec_read_multi_pages_partial(struct zram *zram, struct bio_vec *bvec,
> > +                               u32 index, int offset)
> > +{
> > +     struct page *page = alloc_pages(GFP_NOIO | __GFP_COMP, ZCOMP_MULTI_PAGES_ORDER);
> > +     int ret;
> > +
> > +     if (!page)
> > +             return -ENOMEM;
> > +     ret = zram_read_multi_pages(zram, page, index, NULL);
> > +     if (likely(!ret)) {
> > +             atomic64_inc(&zram->stats.zram_bio_read_multi_pages_partial_count);
> > +             void *dst = kmap_local_page(bvec->bv_page);
> > +             void *src = kmap_local_page(page);
> > +
> > +             memcpy(dst + bvec->bv_offset, src + offset, bvec->bv_len);
> > +             kunmap_local(src);
> > +             kunmap_local(dst);
> > +     }
> > +     __free_pages(page, ZCOMP_MULTI_PAGES_ORDER);
> > +     return ret;
> > +}
>
> [..]
>
> > +static int zram_bvec_write_multi_pages_partial(struct zram *zram, struct bio_vec *bvec,
> > +                                u32 index, int offset, struct bio *bio)
> > +{
> > +     struct page *page = alloc_pages(GFP_NOIO | __GFP_COMP, ZCOMP_MULTI_PAGES_ORDER);
> > +     int ret;
> > +     void *src, *dst;
> > +
> > +     if (!page)
> > +             return -ENOMEM;
> > +
> > +     ret = zram_read_multi_pages(zram, page, index, bio);
> > +     if (!ret) {
> > +             src = kmap_local_page(bvec->bv_page);
> > +             dst = kmap_local_page(page);
> > +             memcpy(dst + offset, src + bvec->bv_offset, bvec->bv_len);
> > +             kunmap_local(dst);
> > +             kunmap_local(src);
> > +
> > +             atomic64_inc(&zram->stats.zram_bio_write_multi_pages_partial_count);
> > +             ret = zram_write_page(zram, page, index);
> > +     }
> > +     __free_pages(page, ZCOMP_MULTI_PAGES_ORDER);
> > +     return ret;
> > +}
>
> What type of testing did you run on it? How often do you see partial
> reads and writes? This looks concerning - zsmalloc memory usage
> reduction is one metric, but it can also be achieved via recompression,
> writeback, or even a different compression algorithm, whereas higher
> CPU/power usage and a higher requirement for physically contiguous
> pages cannot be offset easily. (Another corner case: assume we have
> partial read requests on every CPU simultaneously.)

This question brings up an interesting observation. In our actual product,
we've seen a success rate of over 90% when allocating large folios in
do_swap_page(), but we occasionally encounter failures. In those cases,
instead of resorting to partial reads, we allocate 16 small folios and ask
zram to fill them all, which brings partial reads down to nearly zero.
However, integrating this into the upstream codebase looks like a
considerable task, so for now it remains part of our out-of-tree code[1],
which is also open source. We're gradually sending patches for the swap-in
process and systematically cleaning up the product's code.
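
Roughly, the idea looks like the sketch below. This is only an
illustration (the function name and the bounce-buffer copy are mine for
clarity, not the actual out-of-tree code, which may decompress directly
into the small folios), reusing the helpers from this series:

static int zram_read_into_small_pages(struct zram *zram, u32 index,
				      struct page **pages, int nr_pages)
{
	/* bounce buffer, as in zram_bvec_read_multi_pages_partial() */
	struct page *buf = alloc_pages(GFP_NOIO | __GFP_COMP,
				       ZCOMP_MULTI_PAGES_ORDER);
	int i, ret;

	if (!buf)
		return -ENOMEM;

	/* one decompression for the whole multi-page object */
	ret = zram_read_multi_pages(zram, buf, index, NULL);
	if (!ret) {
		/* scatter the result into the nr_pages small folios */
		for (i = 0; i < nr_pages; i++) {
			void *dst = kmap_local_page(pages[i]);
			void *src = kmap_local_page(buf + i);

			memcpy(dst, src, PAGE_SIZE);
			kunmap_local(src);
			kunmap_local(dst);
		}
	}
	__free_pages(buf, ZCOMP_MULTI_PAGES_ORDER);
	return ret;
}

The win is that the multi-page object is decompressed only once instead
of 16 times through the partial-read path.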

To improve the success rate of large folio allocation, we've reserved some
page blocks for mTHP. This approach is also absent from the mainline
codebase (Yu Zhao is trying to address it with TAO [2]). Consequently, we
anticipate that partial reads may reach 50% or more until something along
those lines is incorporated upstream.
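
For context, it is the fallback to order-0 (or smaller-order) folios
that produces those partial reads. A rough sketch of the attempt order,
modeled on the mainline alloc_anon_folio() loop (names, gfp flags and
helper signatures here are illustrative, not our product code):

static struct folio *alloc_swapin_folio(struct vm_fault *vmf,
					unsigned long orders)
{
	struct vm_area_struct *vma = vmf->vma;
	struct folio *folio;
	unsigned long addr;
	int order;

	/* try the enabled mTHP orders from largest to smallest */
	for (order = highest_order(orders); orders;
	     order = next_order(&orders, order)) {
		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order);
		folio = vma_alloc_folio(GFP_TRANSHUGE_LIGHT, order, vma,
					addr, true);
		if (folio)
			/* a matching-order folio avoids partial reads */
			return folio;
	}

	/* order-0 fallback: swap-in hits the partial-read path in zram */
	return vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, vmf->address,
			       false);
}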

[1] https://github.com/OnePlusOSS/android_kernel_oneplus_sm8550/tree/oneplus/sm8550_u_14.0.0_oneplus11
[2] https://lore.kernel.org/linux-mm/20240229183436.4110845-1-yuzhao@google.com/

Thanks
Barry


