linux-mm.kvack.org archive mirror
From: "Yosry Ahmed" <yosry.ahmed@linux.dev>
To: "Jiayuan Chen" <jiayuan.chen@linux.dev>, "SeongJae Park" <sj@kernel.org>
Cc: "SeongJae Park" <sj@kernel.org>,
	linux-mm@kvack.org, "Jiayuan Chen" <jiayuan.chen@shopee.com>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Shakeel Butt" <shakeel.butt@linux.dev>,
	"Muchun Song" <muchun.song@linux.dev>,
	"Nhat Pham" <nphamcs@gmail.com>,
	"Chengming Zhou" <chengming.zhou@linux.dev>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Nick Terrell" <terrelln@fb.com>,
	"David Sterba" <dsterba@suse.com>,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1] mm: zswap: add per-memcg stat for incompressible pages
Date: Fri, 06 Feb 2026 04:12:57 +0000
Message-ID: <b40e6e0a63c4303e048ed0dbc61c09788de98e19@linux.dev>
In-Reply-To: <3179fa38027bdacdd38b4ef34b493bdb5ef7a19a@linux.dev>

> > 
> > On Thu, 5 Feb 2026 13:30:12 +0800 Jiayuan Chen <> wrote:
> >  
> >  
> >  From: Jiayuan Chen <jiayuan.chen@shopee.com>
> >  
> >  The global zswap_stored_incompressible_pages counter was added in commit
> >  dca4437a5861 ("mm/zswap: store <PAGE_SIZE compression failed page as-is")
> >  to track how many pages are stored in raw (uncompressed) form in zswap.
> >  However, in containerized environments, knowing which cgroup is
> >  contributing incompressible pages is essential for effective resource
> >  management.
> >  
> >  Add a new memcg stat 'zswpraw' to track incompressible pages per cgroup.
> >  This helps administrators and orchestrators to:
> >  
> >  1. Identify workloads that produce incompressible data (e.g., encrypted
> >  data, already-compressed media, random data) and may not benefit from
> >  zswap.
> >  
> >  2. Make informed decisions about workload placement - moving
> >  incompressible workloads to nodes with larger swap backing devices
> >  rather than relying on zswap.
> >  
> >  3. Debug zswap efficiency issues at the cgroup level without needing to
> >  correlate global stats with individual cgroups.
> >  
> >  While the compression ratio can be estimated from existing stats
> >  (zswap / zswapped * PAGE_SIZE), this doesn't distinguish between
> >  "uniformly poor compression" and "a few completely incompressible pages
> >  mixed with highly compressible ones". The zswpraw stat provides direct
> >  visibility into the latter case.
> >  
> >  Changes
> >  -------
> >  
> >  1. Add zswap_is_raw() helper (include/linux/zswap.h)
> >  - Abstract the PAGE_SIZE comparison logic for identifying raw entries
> >  - Keep the incompressible check in one place for maintainability
> >  
> >  2. Add MEMCG_ZSWAP_RAW stat definition (include/linux/memcontrol.h,
> >  mm/memcontrol.c)
> >  - Add MEMCG_ZSWAP_RAW to memcg_stat_item enum
> >  - Register in memcg_stat_items[] and memory_stats[] arrays
> >  - Export as "zswpraw" in memory.stat
> >  
> >  3. Update statistics accounting (mm/memcontrol.c, mm/zswap.c)
> >  - Track MEMCG_ZSWAP_RAW in obj_cgroup_charge/uncharge_zswap()
> >  - Use zswap_is_raw() helper in zswap.c for consistency
> >  
> >  Test
> >  ----
> >  
> >  I wrote a simple test program[1] that allocates memory and compresses it
> >  with zstd, so kernel zswap cannot compress further.
> >  
> >  $ cgcreate -g memory:test
> >  $ cgexec -g memory:test ./test_zswpraw &
> >  $ cat /sys/fs/cgroup/test/memory.stat | grep zswp
> >  zswpraw 0
> >  zswpin 0
> >  zswpout 0
> >  zswpwb 0
> >  
> >  $ echo "100M" > /sys/fs/cgroup/test/memory.reclaim
> >  $ cat /sys/fs/cgroup/test/memory.stat | grep zswp
> >  zswpraw 104800256
> >  zswpin 0
> >  zswpout 51222
> >  zswpwb 0
> >  
> >  $ pkill test_zswpraw
> >  $ cat /sys/fs/cgroup/test/memory.stat | grep zswp
> >  zswpraw 0
> >  zswpin 1
> >  zswpout 51222
> >  zswpwb 0
> >  
> >  [1] https://gist.github.com/mrpre/00432c6154250326994fbeaf62e0e6f1
> >  
> >  Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
> >  ---
> >  include/linux/memcontrol.h | 1 +
> >  include/linux/zswap.h | 9 +++++++++
> >  mm/memcontrol.c | 6 ++++++
> >  mm/zswap.c | 6 +++---
> >  4 files changed, 19 insertions(+), 3 deletions(-)
> >  
> >  As others also mentioned, documentation for the new stat is needed.
> >  
> >  
> >  diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> >  index b6c82c8f73e1..83d1328f81d1 100644
> >  --- a/include/linux/memcontrol.h
> >  +++ b/include/linux/memcontrol.h
> >  @@ -39,6 +39,7 @@ enum memcg_stat_item {
> >  MEMCG_KMEM,
> >  MEMCG_ZSWAP_B,
> >  MEMCG_ZSWAPPED,
> >  + MEMCG_ZSWAP_RAW,
> >  MEMCG_NR_STAT,
> >  };
> >  
> >  diff --git a/include/linux/zswap.h b/include/linux/zswap.h
> >  index 30c193a1207e..94f84b154b71 100644
> >  --- a/include/linux/zswap.h
> >  +++ b/include/linux/zswap.h
> >  @@ -7,6 +7,15 @@
> >  
> >  struct lruvec;
> >  
> >  +/*
> >  + * Check if a zswap entry is stored in raw (uncompressed) form.
> >  + * This happens when compression doesn't reduce the size.
> >  + */
> >  +static inline bool zswap_is_raw(size_t size)
> >  +{
> >  + return size == PAGE_SIZE;
> >  +}
> >  +
> >  
> >  No strong opinion, but I'm not really sure the helper is needed, because the
> >  logic feels quite simple:
> >  
> >  "If an object is compressed and the size is the same as the original one, the
> >  object is incompressible."
> >  
> >  I also find the function name a bit odd, given the type of the parameter. Based
> >  on the function name and the comment, I'd expect it to receive a zswap_entry
> >  object. I understand it is better to receive a size_t so it can be called from
> >  obj_cgroup_[un]charge_zswap(), though. Even in that case, I think the name could
> >  be better (e.g., zswap_compression_failed() or zswap_was_incompressible()?),
> >  or at least the comment could explain more clearly that the parameter
> >  is the size of the object after the compression attempt.
> >  
> >  I vote to drop the helper.
> > 
> The reason I introduced the helper is that the incompressible check now lives in two places:
> 
> In zswap.c - for the global zswap_stored_incompressible_pages counter
> In memcontrol.c - for the per-memcg MEMCG_ZSWAP_INCOMP stat
> 
> By extracting a shared helper, both modules use the same logic, which helps with maintainability.
> 
> That said, I'm fine with dropping the helper if preferred. I can add a comment in memcontrol.c
> explaining the logic. My only concern is that if the incompressible detection logic in zswap
> ever changes, someone might forget to update the memcg accounting accordingly.
> 
> But perhaps that's an unlikely scenario.

Well, a selftest would be the right way to detect such a problem, IMO. Even if we later need a custom definition of incompressible, it should remain in zswap, and we should pass it into the memcg code.

For now, I think we should keep open-coding the PAGE_SIZE check.
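
To illustrate the idea of keeping the definition of "incompressible" in zswap and passing the result into the memcg code, here is a minimal userspace sketch. It is not kernel code: struct model_memcg, model_charge_zswap(), model_uncharge_zswap(), model_zswap_store(), model_zswap_free() and MODEL_PAGE_SIZE are all hypothetical names made up for this example, and the 4 KiB page size and byte-based raw counter are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption for this model only: 4 KiB pages. */
#define MODEL_PAGE_SIZE 4096UL

/* Hypothetical stand-in for the per-cgroup zswap counters. */
struct model_memcg {
	unsigned long zswap_bytes;     /* all compressed bytes charged */
	unsigned long zswap_raw_bytes; /* bytes stored as-is (incompressible) */
};

/*
 * "memcg side": it never compares the size against the page size itself.
 * It only trusts the flag handed over by the caller.
 */
static void model_charge_zswap(struct model_memcg *cg, size_t size,
			       bool incompressible)
{
	cg->zswap_bytes += size;
	if (incompressible)
		cg->zswap_raw_bytes += size;
}

static void model_uncharge_zswap(struct model_memcg *cg, size_t size,
				 bool incompressible)
{
	assert(cg->zswap_bytes >= size);
	cg->zswap_bytes -= size;
	if (incompressible) {
		assert(cg->zswap_raw_bytes >= size);
		cg->zswap_raw_bytes -= size;
	}
}

/*
 * "zswap side": the only place that decides what incompressible means.
 * The check is open-coded: a stored length equal to the page size means
 * the page was kept as-is.
 */
static void model_zswap_store(struct model_memcg *cg, size_t stored_len)
{
	model_charge_zswap(cg, stored_len, stored_len == MODEL_PAGE_SIZE);
}

static void model_zswap_free(struct model_memcg *cg, size_t stored_len)
{
	model_uncharge_zswap(cg, stored_len, stored_len == MODEL_PAGE_SIZE);
}
```

In this model, changing the definition of incompressible would only touch the two model_zswap_*() functions, and the accounting side would follow automatically. A selftest would still be what catches any divergence between the global and per-memcg counters.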

