From: Joshua Hahn <joshua.hahnjy@gmail.com>
To: Yosry Ahmed <yosry@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>,
Sergey Senozhatsky <senozhatsky@chromium.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Yosry Ahmed <yosry.ahmed@linux.dev>,
Nhat Pham <hoangnhat.pham@linux.dev>,
Nhat Pham <nphamcs@gmail.com>,
Chengming Zhou <chengming.zhou@linux.dev>,
Michal Hocko <mhocko@kernel.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
Shakeel Butt <shakeel.butt@linux.dev>,
Muchun Song <muchun.song@linux.dev>,
Andrew Morton <akpm@linux-foundation.org>,
cgroups@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: Re: [PATCH 6/8] mm/zsmalloc, zswap: Handle objcg charging and lifetime in zsmalloc
Date: Wed, 4 Mar 2026 07:11:19 -0800 [thread overview]
Message-ID: <20260304151120.3512645-1-joshua.hahnjy@gmail.com> (raw)
In-Reply-To: <CAO9r8zOFS7zU-eGkErcjud=DW0t00_WqNqCzq_QDf061dOsYSQ@mail.gmail.com>
On Tue, 3 Mar 2026 15:53:31 -0800 Yosry Ahmed <yosry@kernel.org> wrote:
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index 067215a6ddcc..88c7cd399261 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -963,6 +963,44 @@ static bool alloc_zspage_objcgs(struct size_class *class, gfp_t gfp,
> > return true;
> > }
> >
> > +static void zs_charge_objcg(struct zpdesc *zpdesc, struct obj_cgroup *objcg,
> > + int size, unsigned long offset)
> > +{
> > + struct mem_cgroup *memcg;
> > +
> > + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> > + return;
> > +
> > + VM_WARN_ON_ONCE(!(current->flags & PF_MEMALLOC));
> > +
> > + /* PF_MEMALLOC context, charging must succeed */
> > + if (obj_cgroup_charge(objcg, GFP_KERNEL, size))
> > + VM_WARN_ON_ONCE(1);
> > +
> > + rcu_read_lock();
> > + memcg = obj_cgroup_memcg(objcg);
> > + mod_memcg_state(memcg, MEMCG_ZSWAP_B, size);
> > + mod_memcg_state(memcg, MEMCG_ZSWAPPED, 1);
Hello Yosry, I hope you are doing well!
Thank you for your feedback : -)
> Zsmalloc should not be updating zswap stats (e.g. in case zram starts
> supporting memcg charging). How about moving the stat updates to
> zswap?
Yeah... I think this was also a big point of concern for me. While reading
the code, I was really amazed by how clean the logical divide between
zsmalloc and zswap / zram were, and I wanted to preserve it as much as
possible.
There are a few problems, though. Probably the biggest is that migration
of zpdescs and compressed objects within them are invisible to zswap.
Of course, this is by design, but this leads to two problems.
zswap's ignorance of compressed objects' movements across physical nodes
makes it impossible to accurately charge and uncharge from the correct
memcg-lruvec.
Conversely, zsmalloc's ignorance of memcg association makes it impossible
to correctly restrict cpusets.mems during migration.
So the clean logical divide makes a lot of sense for separating the
high-level cgroup association, compression, etc. from the physical
location of the memory and migration / zpdesc compaction, but it would
appear that this comes at a cost of oversimplifying the logic and missing
out on accurate memory charging and a unified source of truth for the
counters.
The last thing I wanted to note was that I agree that zsmalloc doing
explicit zswap stat updates feels a bit awkward. The reason I chose to do
this right now is because when enlightening zsmalloc about the compressed
objs' objcgs, zswap is the only one that does this memory accounting.
So having an objcg is a bit of a proxy to understand that the consumer
is zswap (as opposed to zram). Of course, if zram starts to do memcg
accounting as well, we'll have to start doing some other checks to
see if the compresed object should be accounted as zram or zswap.
OK. That's all the defense I have for my design : -) Now for thinking
about other designs:
I also explored whether it makes sense to make zsmalloc call a hook into
zswap code during and after migrations. The problem is that there isn't
a good way to do the compressed object --> zswap entry lookup, and this
still doesn't solve the issue of zsmalloc migrating compressed objects
without checking whether that object can live on another node.
Maybe one possible approach is to turn the array of objcgs into an array
of backpointers from compressed objects to their corresponding zswap_entries?
One concern is that this does add 8 bytes of additional overhead per
zswap entry, and I'm not sure that this is acceptable. I'll keep thinking
on whether there's a creative way to save some memory here, though...
Of course the other concern is what this will look like for zram users.
I guess it can be done similarly to what is done here, and only allocate
the array of pointers when called in from zswap.
Anyways, thank you for bringing this up. What do you think about the
options we have here? I hope that I've motivated why we want
per-memcg-lruvec accounting as well. Please let me know if there is anything
I can provide additional context for : -)
Have a great day!
Joshua
next prev parent reply other threads:[~2026-03-04 15:11 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-26 19:29 [PATCH 0/8] mm/zswap, zsmalloc: Per-memcg-lruvec zswap accounting Joshua Hahn
2026-02-26 19:29 ` [PATCH 1/8] mm/zsmalloc: Rename zs_object_copy to zs_obj_copy Joshua Hahn
2026-02-26 19:29 ` [PATCH 2/8] mm/zsmalloc: Make all obj_idx unsigned ints Joshua Hahn
2026-02-26 19:29 ` [PATCH 3/8] mm/zsmalloc: Introduce objcgs pointer in struct zpdesc Joshua Hahn
2026-02-26 21:37 ` Shakeel Butt
2026-02-26 21:43 ` Joshua Hahn
2026-03-04 16:58 ` Yosry Ahmed
2026-03-04 18:03 ` Joshua Hahn
2026-02-26 19:29 ` [PATCH 4/8] mm/zsmalloc: Store obj_cgroup pointer in zpdesc Joshua Hahn
2026-02-26 19:29 ` [PATCH 5/8] mm/zsmalloc,zswap: Redirect zswap_entry->obcg to zpdesc Joshua Hahn
2026-02-26 23:13 ` kernel test robot
2026-02-27 19:10 ` Joshua Hahn
2026-02-26 19:29 ` [PATCH 6/8] mm/zsmalloc, zswap: Handle objcg charging and lifetime in zsmalloc Joshua Hahn
2026-03-03 23:53 ` Yosry Ahmed
2026-03-04 15:11 ` Joshua Hahn [this message]
2026-03-04 15:46 ` Yosry Ahmed
2026-03-04 16:26 ` Joshua Hahn
2026-03-04 16:27 ` Nhat Pham
2026-03-04 16:45 ` Yosry Ahmed
2026-03-04 16:49 ` Nhat Pham
2026-02-26 19:29 ` [PATCH 7/8] mm/memcontrol: Track MEMCG_ZSWAPPED in bytes Joshua Hahn
2026-02-26 19:29 ` [PATCH 8/8] mm/vmstat, memcontrol: Track ZSWAP_B, ZSWAPPED_B per-memcg-lruvec Joshua Hahn
2026-02-26 22:40 ` kernel test robot
2026-02-27 19:45 ` Joshua Hahn
2026-02-26 23:02 ` kernel test robot
2026-03-02 21:31 ` [PATCH 0/8] mm/zswap, zsmalloc: Per-memcg-lruvec zswap accounting Nhat Pham
2026-03-03 17:51 ` Joshua Hahn
2026-03-03 18:01 ` Nhat Pham
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260304151120.3512645-1-joshua.hahnjy@gmail.com \
--to=joshua.hahnjy@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=cgroups@vger.kernel.org \
--cc=chengming.zhou@linux.dev \
--cc=hannes@cmpxchg.org \
--cc=hoangnhat.pham@linux.dev \
--cc=kernel-team@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=minchan@kernel.org \
--cc=muchun.song@linux.dev \
--cc=nphamcs@gmail.com \
--cc=roman.gushchin@linux.dev \
--cc=senozhatsky@chromium.org \
--cc=shakeel.butt@linux.dev \
--cc=yosry.ahmed@linux.dev \
--cc=yosry@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox