From: Yosry Ahmed <yosry.ahmed@linux.dev>
To: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Nhat Pham <nphamcs@gmail.com>, Minchan Kim <minchan@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Brian Geffon <bgeffon@google.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] zsmalloc: use actual object size to detect spans
Date: Wed, 7 Jan 2026 05:19:16 +0000
Message-ID: <u6dy6oa6ztghy7ozficimubhb2mwppcq6gosupepnn63uu6oq7@qyph3nyq7las>
In-Reply-To: <tzppylvpaq7lk4re33h4niofgfwiziibp3jdoos2ofeepxx5yh@sh2doyzze25z>
On Wed, Jan 07, 2026 at 11:20:20AM +0900, Sergey Senozhatsky wrote:
> On (26/01/07 02:10), Yosry Ahmed wrote:
> > On Wed, Jan 07, 2026 at 11:06:09AM +0900, Sergey Senozhatsky wrote:
> > > On (26/01/07 01:56), Yosry Ahmed wrote:
> > > > > I recall us having exactly this idea when we first introduced
> > > > > zs_obj_{read,write}_end() functions, and I do recall that it
> > > > > did not work. Somehow this panics in __memcpy+0xc/0x44. Let
> > > > > me dig into it again.
> > > >
> > > > Maybe because at this point we are trying to memcpy() class->size, which
> > > > already includes ZS_HANDLE_SIZE. So reading after increasing the offset
> > > > reads ZS_HANDLE_SIZE after class->size.
> > >
> > > Yeah, I guess that falsely hits the spanning path because of extra
> > > sizeof(unsigned long).
> >
> > Or the object could be spanning two pages indeed, but we're copying
> > extra sizeof(unsigned long), that shouldn't crash tho.
>
> It seems there is no second page, it's a pow-of-two size class. So
> we mis-detect spanning.
>
> [ 51.406310] zsmalloc: :: size class 48, orig offt 16336, page size 16384, memcpy sizes 40, 8
> [ 51.407571] Unable to handle kernel paging request at virtual address ffffc04000000000
> [ 51.420816] pc : __memcpy+0xc/0x44
>
> Second memcpy() of sizeof(unsigned long) traps.

I think this case is exactly what you expected earlier (I'm not sure what
you meant by the pow-of-two reply). We increase the offset by 8 bytes
(ZS_HANDLE_SIZE) but still copy 48 bytes, even though those 48 bytes
already include both the object and ZS_HANDLE_SIZE. So we end up copying
8 bytes beyond the end of the object, which puts us in the next page,
which we should not be copying from.
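
To make the arithmetic concrete, here is a minimal user-space sketch (not
kernel code) of the split computation, fed with the numbers from the log
above. demo_split() and its parameters are hypothetical names used only for
illustration, and the 8-byte handle assumes a 64-bit sizeof(unsigned long).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: decide whether reading len bytes at offset off
 * crosses the end of a page, and how many bytes fall on each side.
 * Returns true when the read is split across two pages.
 */
static bool demo_split(size_t off, size_t len, size_t page_size,
		       size_t sizes[2])
{
	assert(off < page_size && len > 0);

	if (off + len <= page_size) {
		/* the read is contained entirely within one page */
		sizes[0] = len;
		sizes[1] = 0;
		return false;
	}

	/* the read spills into the following page */
	sizes[0] = page_size - off;
	sizes[1] = len - sizes[0];
	return true;
}
```

With off = 16336 + 8 and len = 48 on a 16384-byte page this reports a span
with sizes 40 and 8, the same pair the log prints. With the unadjusted
offset 16336 the same 48 bytes end exactly on the page boundary and no span
is detected.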

I think to fix the bug at this point we need to subtract ZS_HANDLE_SIZE
from class->size before we use it for copying or spanning detection.

Something like (untested):

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 5bf832f9c05c..894783d2526c 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1072,6 +1072,7 @@ void *zs_obj_read_begin(struct zs_pool *pool, unsigned long handle,
 	unsigned long obj, off;
 	unsigned int obj_idx;
 	struct size_class *class;
+	unsigned long size;
 	void *addr;
 
 	/* Guarantee we can get zspage from handle safely */
@@ -1087,7 +1088,13 @@ void *zs_obj_read_begin(struct zs_pool *pool, unsigned long handle,
 	class = zspage_class(pool, zspage);
 	off = offset_in_page(class->size * obj_idx);
 
-	if (off + class->size <= PAGE_SIZE) {
+	size = class->size;
+	if (!ZsHugePage(zspage)) {
+		off += ZS_HANDLE_SIZE;
+		size -= ZS_HANDLE_SIZE;
+	}
+
+	if (off + size <= PAGE_SIZE) {
 		/* this object is contained entirely within a page */
 		addr = kmap_local_zpdesc(zpdesc);
 		addr += off;
@@ -1096,7 +1103,7 @@ void *zs_obj_read_begin(struct zs_pool *pool, unsigned long handle,
 
 		/* this object spans two pages */
 		sizes[0] = PAGE_SIZE - off;
-		sizes[1] = class->size - sizes[0];
+		sizes[1] = size - sizes[0];
 		addr = local_copy;
 
 		memcpy_from_page(addr, zpdesc_page(zpdesc),
@@ -1107,9 +1114,6 @@ void *zs_obj_read_begin(struct zs_pool *pool, unsigned long handle,
 				 0, sizes[1]);
 	}
 
-	if (!ZsHugePage(zspage))
-		addr += ZS_HANDLE_SIZE;
-
 	return addr;
 }
 EXPORT_SYMBOL_GPL(zs_obj_read_begin);
@@ -1121,6 +1125,7 @@ void zs_obj_read_end(struct zs_pool *pool, unsigned long handle,
 	struct zpdesc *zpdesc;
 	unsigned long obj, off;
 	unsigned int obj_idx;
+	unsigned long size;
 	struct size_class *class;
 
 	obj = handle_to_obj(handle);
@@ -1129,9 +1134,13 @@ void zs_obj_read_end(struct zs_pool *pool, unsigned long handle,
 	class = zspage_class(pool, zspage);
 	off = offset_in_page(class->size * obj_idx);
 
-	if (off + class->size <= PAGE_SIZE) {
-		if (!ZsHugePage(zspage))
-			off += ZS_HANDLE_SIZE;
+	size = class->size;
+	if (!ZsHugePage(zspage)) {
+		off += ZS_HANDLE_SIZE;
+		size -= ZS_HANDLE_SIZE;
+	}
+
+	if (off + size <= PAGE_SIZE) {
 		handle_mem -= off;
 		kunmap_local(handle_mem);
 	}

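
For what it's worth, the effect of shrinking the length can be checked in
user space with a rough sketch like the one below. It walks every object
slot of a zspage and counts reads that would run past the end of the
zspage. Everything here is an assumption for illustration:
demo_count_overruns() is a hypothetical helper, the handle is taken to be
8 bytes, the huge-page case is ignored, and the 3-page zspage is picked
only because 1024 objects of 48 bytes fill 3 * 16384 bytes exactly, which
puts the last slot at page offset 16336 as in the log.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_HANDLE_SIZE 8u	/* assumed sizeof(unsigned long) on 64-bit */

/*
 * Count object slots whose read would run past the end of the zspage.
 * Every read starts DEMO_HANDLE_SIZE bytes into its slot; read_len is
 * the number of bytes read from that point.
 */
static size_t demo_count_overruns(size_t page_size, size_t nr_pages,
				  size_t class_size, size_t read_len)
{
	size_t zspage_bytes = page_size * nr_pages;
	size_t nr_objs;
	size_t overruns = 0;

	assert(class_size > 0);
	nr_objs = zspage_bytes / class_size;

	for (size_t idx = 0; idx < nr_objs; idx++) {
		size_t read_end = class_size * idx + DEMO_HANDLE_SIZE + read_len;

		if (read_end > zspage_bytes)
			overruns++;
	}
	return overruns;
}
```

Reading the full class size (48) after skipping the handle overruns once,
on the last slot. Reading class size minus the handle (40) never does.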
>
> > I think the changes need to be shuffled around to avoid this, or just
> > have a combined patch, which would be less pretty.
>
> I think I prefer a shuffle.
>
> There is another possible improvement point (UNTESTED): if the first
> page holds only ZS_HANDLE bytes, then we can avoid memcpy() path and
> instead just kmap the second page + offset.

Yeah, good point.
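
A rough user-space sketch of that idea, purely to illustrate the three
cases and not a proposed implementation: demo_classify() and the enum are
hypothetical names, and the offsets and sizes used with it are arbitrary
illustrative values, not necessarily real size classes.

```c
#include <assert.h>
#include <stddef.h>

enum demo_layout {
	DEMO_FIRST_PAGE,	/* payload fits in the first page */
	DEMO_SECOND_PAGE,	/* first page holds only the handle */
	DEMO_SPANS,		/* payload is split across both pages */
};

/*
 * Classify where the payload (the object without its handle) lives.
 * off is the page offset of the slot, handle included. On return,
 * *map_off is the payload's offset within the page it starts in.
 */
static enum demo_layout demo_classify(size_t off, size_t class_size,
				      size_t page_size, size_t handle_size,
				      size_t *map_off)
{
	size_t start, size;

	assert(class_size > handle_size && off < page_size);
	start = off + handle_size;
	size = class_size - handle_size;

	if (start + size <= page_size) {
		*map_off = start;
		return DEMO_FIRST_PAGE;
	}
	if (start >= page_size) {
		/* nothing but the handle is in the first page */
		*map_off = start - page_size;
		return DEMO_SECOND_PAGE;
	}
	*map_off = start;
	return DEMO_SPANS;
}
```

In the DEMO_SECOND_PAGE case a caller could map the next page and add
*map_off instead of going through the two memcpy() calls. Whether that
layout actually occurs depends on the size classes in use.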