From: Sergey Senozhatsky <senozhatsky@chromium.org>
To: Yosry Ahmed <yosry.ahmed@linux.dev>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Nhat Pham <nphamcs@gmail.com>, Minchan Kim <minchan@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Brian Geffon <bgeffon@google.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Sergey Senozhatsky <senozhatsky@chromium.org>
Subject: Re: [RFC PATCH 2/2] zsmalloc: chain-length configuration should consider other metrics
Date: Mon, 5 Jan 2026 16:23:39 +0900
Message-ID: <gdtbjepmea65zajh6dafxiekv2wmoerje7v3qxjundbezdnpps@f5ofogyw7wnl>
In-Reply-To: <gg5zdpzrk47tljbnaudcy2gnsodyhmmar23qb57b67bhx6ntje@eq2fcrl2dk4z>
On (26/01/05 10:42), Sergey Senozhatsky wrote:
> On (26/01/02 18:29), Yosry Ahmed wrote:
> > On Thu, Jan 01, 2026 at 10:38:14AM +0900, Sergey Senozhatsky wrote:
> [..]
> >
> > I worry that the heuristics are too hand-wavy
>
> I don't disagree. Am not super excited about the heuristics either.
>
> > and I wonder if the memcpy savings actually show up as perf improvements
> > in any real life workload. Do we have data about this?
>
> I don't have real life 16K PAGE_SIZE devices. However, on 16K PAGE_SIZE
> systems we have "normal" size-classes up to a very large size, and normal
> class means chaining of 0-order physical pages, and chaining means spanning.
> So on 16K memcpy overhead is expected to be somewhat noticeable.
By the way, while looking at it, I think we need to "fix" zs_obj_read_begin().
Currently, it uses "off + class->size" to detect spanning objects, which is
incorrect: size classes get merged, so a typical size class can hold a range
of sizes, using padding for smaller objects. As a result, an object can be
treated as spanning (and copied via memcpy) even when the data that was
actually written fits entirely into the first physical page. So instead of
class->size we need to use the actual compressed object's size (we already do
that in zs_obj_write()). I'll cook a patch.
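To illustrate the difference between the two checks, here is a minimal
userspace sketch. It is not kernel code: SKETCH_PAGE_SIZE, obj_offset() and
spans_pages() are hypothetical names, the 4K page size and the 3264-byte class
are made-up values, and the sketch ignores any per-object metadata the
allocator may store in front of the payload.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption for illustration only: 4K physical pages. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Offset of object number obj_idx inside its physical page, mirroring
 * offset_in_page(class->size * obj_idx) from the patch.
 */
static size_t obj_offset(size_t class_size, size_t obj_idx)
{
	return (class_size * obj_idx) % SKETCH_PAGE_SIZE;
}

/* True if len bytes starting at off cross the end of the page. */
static bool spans_pages(size_t off, size_t len)
{
	return off + len > SKETCH_PAGE_SIZE;
}
```

With a hypothetical merged class of 3264 bytes, the object at index 1 starts
at offset 3264. Checking with the class size reports it as spanning
(3264 + 3264 > 4096), while checking with a 100-byte payload does not
(3264 + 100 <= 4096), so the read could map the page directly and skip the
copy.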
Something like this:
---
drivers/block/zram/zram_drv.c | 8 +++++---
include/linux/zsmalloc.h | 2 +-
mm/zsmalloc.c | 4 ++--
mm/zswap.c | 3 ++-
4 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index a6587bed6a03..b371ba6bfec2 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -2065,7 +2065,7 @@ static int read_incompressible_page(struct zram *zram, struct page *page,
 	void *src, *dst;
 
 	handle = get_slot_handle(zram, index);
-	src = zs_obj_read_begin(zram->mem_pool, handle, NULL);
+	src = zs_obj_read_begin(zram->mem_pool, handle, PAGE_SIZE, NULL);
 	dst = kmap_local_page(page);
 	copy_page(dst, src);
 	kunmap_local(dst);
@@ -2087,7 +2087,8 @@ static int read_compressed_page(struct zram *zram, struct page *page, u32 index)
 	prio = get_slot_comp_priority(zram, index);
 
 	zstrm = zcomp_stream_get(zram->comps[prio]);
-	src = zs_obj_read_begin(zram->mem_pool, handle, zstrm->local_copy);
+	src = zs_obj_read_begin(zram->mem_pool, handle, size,
+				zstrm->local_copy);
 	dst = kmap_local_page(page);
 	ret = zcomp_decompress(zram->comps[prio], zstrm, src, size, dst);
 	kunmap_local(dst);
@@ -2114,7 +2115,8 @@ static int read_from_zspool_raw(struct zram *zram, struct page *page, u32 index)
 	 * takes place here, as we read raw compressed data.
 	 */
 	zstrm = zcomp_stream_get(zram->comps[ZRAM_PRIMARY_COMP]);
-	src = zs_obj_read_begin(zram->mem_pool, handle, zstrm->local_copy);
+	src = zs_obj_read_begin(zram->mem_pool, handle, size,
+				zstrm->local_copy);
 	memcpy_to_page(page, 0, src, size);
 	zs_obj_read_end(zram->mem_pool, handle, src);
 	zcomp_stream_put(zstrm);
diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
index f3ccff2d966c..64f65c1f14d6 100644
--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -40,7 +40,7 @@ unsigned int zs_lookup_class_index(struct zs_pool *pool, unsigned int size);
 void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats);
 
 void *zs_obj_read_begin(struct zs_pool *pool, unsigned long handle,
-			void *local_copy);
+			size_t mem_len, void *local_copy);
 void zs_obj_read_end(struct zs_pool *pool, unsigned long handle,
 		     void *handle_mem);
 void zs_obj_write(struct zs_pool *pool, unsigned long handle,
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index be385609ef8a..2da60c23cd18 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1070,7 +1070,7 @@ unsigned long zs_get_total_pages(struct zs_pool *pool)
 EXPORT_SYMBOL_GPL(zs_get_total_pages);
 
 void *zs_obj_read_begin(struct zs_pool *pool, unsigned long handle,
-			void *local_copy)
+			size_t mem_len, void *local_copy)
 {
 	struct zspage *zspage;
 	struct zpdesc *zpdesc;
@@ -1092,7 +1092,7 @@ void *zs_obj_read_begin(struct zs_pool *pool, unsigned long handle,
 	class = zspage_class(pool, zspage);
 	off = offset_in_page(class->size * obj_idx);
 
-	if (off + class->size <= PAGE_SIZE) {
+	if (off + mem_len <= PAGE_SIZE) {
 		/* this object is contained entirely within a page */
 		addr = kmap_local_zpdesc(zpdesc);
 		addr += off;
diff --git a/mm/zswap.c b/mm/zswap.c
index de8858ff1521..291352629616 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -937,7 +937,8 @@ static bool zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 	u8 *src, *obj;
 
 	acomp_ctx = acomp_ctx_get_cpu_lock(pool);
-	obj = zs_obj_read_begin(pool->zs_pool, entry->handle, acomp_ctx->buffer);
+	obj = zs_obj_read_begin(pool->zs_pool, entry->handle, entry->length,
+				acomp_ctx->buffer);
 
 	/* zswap entries of length PAGE_SIZE are not compressed. */
 	if (entry->length == PAGE_SIZE) {
--
2.52.0.351.gbe84eed79e-goog