linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Yosry Ahmed <yosry@kernel.org>
To: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>,
	Sergey Senozhatsky <senozhatsky@chromium.org>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	Yosry Ahmed <yosry.ahmed@linux.dev>,
	 Nhat Pham <hoangnhat.pham@linux.dev>,
	Nhat Pham <nphamcs@gmail.com>,
	 Chengming Zhou <chengming.zhou@linux.dev>,
	Michal Hocko <mhocko@kernel.org>,
	 Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	 Muchun Song <muchun.song@linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	 cgroups@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,  kernel-team@meta.com
Subject: Re: [PATCH 6/8] mm/zsmalloc, zswap: Handle objcg charging and lifetime in zsmalloc
Date: Wed, 4 Mar 2026 07:46:48 -0800	[thread overview]
Message-ID: <CAO9r8zOJ5bkJzptM7+8V2G00dHuycAEAF4CDcaR1YMk0kWyktA@mail.gmail.com> (raw)
In-Reply-To: <20260304151120.3512645-1-joshua.hahnjy@gmail.com>

On Wed, Mar 4, 2026 at 7:11 AM Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
>
> On Tue, 3 Mar 2026 15:53:31 -0800 Yosry Ahmed <yosry@kernel.org> wrote:
>
> > > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > > index 067215a6ddcc..88c7cd399261 100644
> > > --- a/mm/zsmalloc.c
> > > +++ b/mm/zsmalloc.c
> > > @@ -963,6 +963,44 @@ static bool alloc_zspage_objcgs(struct size_class *class, gfp_t gfp,
> > >         return true;
> > >  }
> > >
> > > +static void zs_charge_objcg(struct zpdesc *zpdesc, struct obj_cgroup *objcg,
> > > +                           int size, unsigned long offset)
> > > +{
> > > +       struct mem_cgroup *memcg;
> > > +
> > > +       if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> > > +               return;
> > > +
> > > +       VM_WARN_ON_ONCE(!(current->flags & PF_MEMALLOC));
> > > +
> > > +       /* PF_MEMALLOC context, charging must succeed */
> > > +       if (obj_cgroup_charge(objcg, GFP_KERNEL, size))
> > > +               VM_WARN_ON_ONCE(1);
> > > +
> > > +       rcu_read_lock();
> > > +       memcg = obj_cgroup_memcg(objcg);
> > > +       mod_memcg_state(memcg, MEMCG_ZSWAP_B, size);
> > > +       mod_memcg_state(memcg, MEMCG_ZSWAPPED, 1);
>
> Hello Yosry, I hope you are doing well!
> Thank you for your feedback : -)
>
> > Zsmalloc should not be updating zswap stats (e.g. in case zram starts
> > supporting memcg charging).  How about moving the stat updates to
> > zswap?
>
> Yeah... I think this was also a big point of concern for me. While reading
> the code, I was really amazed by how clean the logical divide between
> zsmalloc and zswap / zram were, and I wanted to preserve it as much as
> possible.
>
> There are a few problems, though. Probably the biggest is that migration
> of zpdescs and compressed objects within them are invisible to zswap.
> Of course, this is by design, but this leads to two problems.
>
> zswap's ignorance of compressed objects' movements across physical nodes
> makes it impossible to accurately charge and uncharge from the correct
> memcg-lruvec.
>
> Conversely, zsmalloc's ignorance of memcg association makes it impossible
> to correctly restrict cpusets.mems during migration.
>
> So the clean logical divide makes a lot of sense for separating the
> high-level cgroup association, compression, etc. from the physical
> location of the memory and migration / zpdesc compaction, but it would
> appear that this comes at a cost of oversimplifying the logic and missing
> out on accurate memory charging and a unified source of truth for the
> counters.
>
> The last thing I wanted to note was that I agree that zsmalloc doing
> explicit zswap stat updates feels a bit awkward. The reason I chose to do
> this right now is because when enlightening zsmalloc about the compressed
> objs' objcgs, zswap is the only one that does this memory accounting.
> So having an objcg is a bit of a proxy to understand that the consumer
> is zswap (as opposed to zram). Of course, if zram starts to do memcg
> accounting as well, we'll have to start doing some other checks to
> see if the compresed object should be accounted as zram or zswap.
>
> OK. That's all the defense I have for my design : -) Now for thinking
> about other designs:
>
> I also explored whether it makes sense to make zsmalloc call a hook into
> zswap code during and after migrations. The problem is that there isn't
> a good way to do the compressed object --> zswap entry lookup, and this
> still doesn't solve the issue of zsmalloc migrating compressed objects
> without checking whether that object can live on another node.
>
> Maybe one possible approach is to turn the array of objcgs into an array
> of backpointers from compressed objects to their corresponding zswap_entries?
> One concern is that this does add 8 bytes of additional overhead per
> zswap entry, and I'm not sure that this is acceptable. I'll keep thinking
> on whether there's a creative way to save some memory here, though...
>
> Of course the other concern is what this will look like for zram users.
> I guess it can be done similarly to what is done here, and only allocate
> the array of pointers when called in from zswap.
>
> Anyways, thank you for bringing this up. What do you think about the
> options we have here? I hope that I've motivated why we want
> per-memcg-lruvec accounting as well. Please let me know if there is anything
> I can provide additional context for : -)

Thanks for the detailed elaboration.

AFAICT the only zswap-specific part is the actual stat indexes, what
if these are parameterized at the zsmalloc pool level? AFAICT zswap
and zram will never share a pool.


  reply	other threads:[~2026-03-04 15:47 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-26 19:29 [PATCH 0/8] mm/zswap, zsmalloc: Per-memcg-lruvec zswap accounting Joshua Hahn
2026-02-26 19:29 ` [PATCH 1/8] mm/zsmalloc: Rename zs_object_copy to zs_obj_copy Joshua Hahn
2026-02-26 19:29 ` [PATCH 2/8] mm/zsmalloc: Make all obj_idx unsigned ints Joshua Hahn
2026-02-26 19:29 ` [PATCH 3/8] mm/zsmalloc: Introduce objcgs pointer in struct zpdesc Joshua Hahn
2026-02-26 21:37   ` Shakeel Butt
2026-02-26 21:43     ` Joshua Hahn
2026-03-04 16:58   ` Yosry Ahmed
2026-03-04 18:03     ` Joshua Hahn
2026-02-26 19:29 ` [PATCH 4/8] mm/zsmalloc: Store obj_cgroup pointer in zpdesc Joshua Hahn
2026-02-26 19:29 ` [PATCH 5/8] mm/zsmalloc,zswap: Redirect zswap_entry->obcg to zpdesc Joshua Hahn
2026-02-26 23:13   ` kernel test robot
2026-02-27 19:10     ` Joshua Hahn
2026-02-26 19:29 ` [PATCH 6/8] mm/zsmalloc, zswap: Handle objcg charging and lifetime in zsmalloc Joshua Hahn
2026-03-03 23:53   ` Yosry Ahmed
2026-03-04 15:11     ` Joshua Hahn
2026-03-04 15:46       ` Yosry Ahmed [this message]
2026-03-04 16:26         ` Joshua Hahn
2026-03-04 16:27         ` Nhat Pham
2026-03-04 16:45           ` Yosry Ahmed
2026-03-04 16:49             ` Nhat Pham
2026-02-26 19:29 ` [PATCH 7/8] mm/memcontrol: Track MEMCG_ZSWAPPED in bytes Joshua Hahn
2026-02-26 19:29 ` [PATCH 8/8] mm/vmstat, memcontrol: Track ZSWAP_B, ZSWAPPED_B per-memcg-lruvec Joshua Hahn
2026-02-26 22:40   ` kernel test robot
2026-02-27 19:45     ` Joshua Hahn
2026-02-26 23:02   ` kernel test robot
2026-03-02 21:31 ` [PATCH 0/8] mm/zswap, zsmalloc: Per-memcg-lruvec zswap accounting Nhat Pham
2026-03-03 17:51   ` Joshua Hahn
2026-03-03 18:01     ` Nhat Pham

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAO9r8zOJ5bkJzptM7+8V2G00dHuycAEAF4CDcaR1YMk0kWyktA@mail.gmail.com \
    --to=yosry@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=chengming.zhou@linux.dev \
    --cc=hannes@cmpxchg.org \
    --cc=hoangnhat.pham@linux.dev \
    --cc=joshua.hahnjy@gmail.com \
    --cc=kernel-team@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=minchan@kernel.org \
    --cc=muchun.song@linux.dev \
    --cc=nphamcs@gmail.com \
    --cc=roman.gushchin@linux.dev \
    --cc=senozhatsky@chromium.org \
    --cc=shakeel.butt@linux.dev \
    --cc=yosry.ahmed@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox