From: "Jiayuan Chen" <jiayuan.chen@linux.dev>
To: "Yosry Ahmed" <yosry.ahmed@linux.dev>
Cc: linux-mm@kvack.org, "Jiayuan Chen" <jiayuan.chen@shopee.com>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Hocko" <mhocko@kernel.org>,
"Roman Gushchin" <roman.gushchin@linux.dev>,
"Shakeel Butt" <shakeel.butt@linux.dev>,
"Muchun Song" <muchun.song@linux.dev>,
"Nhat Pham" <nphamcs@gmail.com>,
"Chengming Zhou" <chengming.zhou@linux.dev>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Nick Terrell" <terrelln@fb.com>,
"David Sterba" <dsterba@suse.com>,
cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1] mm: zswap: add per-memcg stat for incompressible pages
Date: Fri, 06 Feb 2026 02:05:25 +0000
Message-ID: <2928382d5614d5632e3508c81878c8ad720ccd7e@linux.dev>
In-Reply-To: <hevovghpl2udhpof66oz26ulrpqcrtuwjxcakyskoeoil2wo6x@osbrncj7ifwz>
February 6, 2026 at 08:39, "Yosry Ahmed" <yosry.ahmed@linux.dev> wrote:
>
> On Thu, Feb 05, 2026 at 01:30:12PM +0800, Jiayuan Chen wrote:
>
> >
> > From: Jiayuan Chen <jiayuan.chen@shopee.com>
> >
> > The global zswap_stored_incompressible_pages counter was added in commit
> > dca4437a5861 ("mm/zswap: store <PAGE_SIZE compression failed page as-is")
> > to track how many pages are stored in raw (uncompressed) form in zswap.
> > However, in containerized environments, knowing which cgroup is
> > contributing incompressible pages is essential for effective resource
> > management.
> >
> > Add a new memcg stat 'zswpraw' to track incompressible pages per cgroup.
> > This helps administrators and orchestrators to:
> >
> > 1. Identify workloads that produce incompressible data (e.g., encrypted
> > data, already-compressed media, random data) and may not benefit from
> > zswap.
> >
> > 2. Make informed decisions about workload placement - moving
> > incompressible workloads to nodes with larger swap backing devices
> > rather than relying on zswap.
> >
> > 3. Debug zswap efficiency issues at the cgroup level without needing to
> > correlate global stats with individual cgroups.
> >
> > While the compression ratio can be estimated from existing stats
> > (zswap / zswapped * PAGE_SIZE), this doesn't distinguish between
> > "uniformly poor compression" and "a few completely incompressible pages
> > mixed with highly compressible ones". The zswpraw stat provides direct
> > visibility into the latter case.
> >
> > Changes
> > -------
> >
> > 1. Add zswap_is_raw() helper (include/linux/zswap.h)
> > - Abstract the PAGE_SIZE comparison logic for identifying raw entries
> > - Keep the incompressible check in one place for maintainability
> >
> > 2. Add MEMCG_ZSWAP_RAW stat definition (include/linux/memcontrol.h,
> > mm/memcontrol.c)
> > - Add MEMCG_ZSWAP_RAW to memcg_stat_item enum
> > - Register in memcg_stat_items[] and memory_stats[] arrays
> > - Export as "zswpraw" in memory.stat
> >
> > 3. Update statistics accounting (mm/memcontrol.c, mm/zswap.c)
> > - Track MEMCG_ZSWAP_RAW in obj_cgroup_charge/uncharge_zswap()
> > - Use zswap_is_raw() helper in zswap.c for consistency
> >
> > Test
> > ----
> >
> > I wrote a simple test program[1] that allocates memory and compresses it
> > with zstd, so kernel zswap cannot compress further.
> >
> > $ cgcreate -g memory:test
> > $ cgexec -g memory:test ./test_zswpraw &
> > $ cat /sys/fs/cgroup/test/memory.stat | grep zswp
> > zswpraw 0
> > zswpin 0
> > zswpout 0
> > zswpwb 0
> >
> > $ echo "100M" > /sys/fs/cgroup/test/memory.reclaim
> > $ cat /sys/fs/cgroup/test/memory.stat | grep zswp
> > zswpraw 104800256
> > zswpin 0
> > zswpout 51222
> > zswpwb 0
> >
> > $ pkill test_zswpraw
> > $ cat /sys/fs/cgroup/test/memory.stat | grep zswp
> > zswpraw 0
> > zswpin 1
> > zswpout 51222
> > zswpwb 0
> >
> > [1] https://gist.github.com/mrpre/00432c6154250326994fbeaf62e0e6f1
> >
> > Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
> > ---
> > include/linux/memcontrol.h | 1 +
> > include/linux/zswap.h | 9 +++++++++
> > mm/memcontrol.c | 6 ++++++
> > mm/zswap.c | 6 +++---
> > 4 files changed, 19 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index b6c82c8f73e1..83d1328f81d1 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -39,6 +39,7 @@ enum memcg_stat_item {
> > MEMCG_KMEM,
> > MEMCG_ZSWAP_B,
> > MEMCG_ZSWAPPED,
> > + MEMCG_ZSWAP_RAW,
> >
> Please change the name as Shakeel suggested.
>
> >
> > MEMCG_NR_STAT,
> > };
> >
> > diff --git a/include/linux/zswap.h b/include/linux/zswap.h
> > index 30c193a1207e..94f84b154b71 100644
> > --- a/include/linux/zswap.h
> > +++ b/include/linux/zswap.h
> > @@ -7,6 +7,15 @@
> >
> > struct lruvec;
> >
> > +/*
> > + * Check if a zswap entry is stored in raw (uncompressed) form.
> > + * This happens when compression doesn't reduce the size.
> > + */
> > +static inline bool zswap_is_raw(size_t size)
> >
> Internally as well, please rename this to zswap_is_incompressible() or
> zswap_is_incomp(). Not a big fan of the helper because it doesn't add
> much, but I don't feel strongly either way.
>
> >
> > +{
> > + return size == PAGE_SIZE;
> > +}
> > +
> > extern atomic_long_t zswap_stored_pages;
> >
> > #ifdef CONFIG_ZSWAP
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 007413a53b45..32fb801530a3 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -341,6 +341,7 @@ static const unsigned int memcg_stat_items[] = {
> > MEMCG_KMEM,
> > MEMCG_ZSWAP_B,
> > MEMCG_ZSWAPPED,
> > + MEMCG_ZSWAP_RAW,
> > };
> >
> > #define NR_MEMCG_NODE_STAT_ITEMS ARRAY_SIZE(memcg_node_stat_items)
> > @@ -1346,6 +1347,7 @@ static const struct memory_stat memory_stats[] = {
> > #ifdef CONFIG_ZSWAP
> > { "zswap", MEMCG_ZSWAP_B },
> > { "zswapped", MEMCG_ZSWAPPED },
> > + { "zswpraw", MEMCG_ZSWAP_RAW },
> >
> Here as well: zswap_incompressible or zswap_incomp?
>
> Other than the renames and doc, LGTM.
>
> >
Thanks Yosry!
I'll rename everything to use "incomp" (MEMCG_ZSWAP_INCOMP, zswap_is_incomp(), "zswap_incomp")
and update the documentation in v2.
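
For reference, a rough userspace sketch of the renamed helper, assuming v2
changes only the names and keeps the PAGE_SIZE comparison from v1. The
PAGE_SIZE fallback (4 KiB) and zswap_incomp_bytes_to_pages() are
hypothetical and exist only for this illustration. The conversion also
assumes the stat is reported in bytes, which the v1 test output suggests
but which is not confirmed here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. In the kernel, PAGE_SIZE comes from the arch headers. */
#ifndef PAGE_SIZE
#define PAGE_SIZE 4096UL
#endif

/*
 * Renamed from zswap_is_raw() in v1. An entry whose stored size equals
 * PAGE_SIZE was kept as-is because compression did not reduce its size.
 */
static inline bool zswap_is_incomp(size_t size)
{
	return size == PAGE_SIZE;
}

/*
 * Hypothetical helper, not part of the patch: convert a byte value read
 * from memory.stat into a page count, assuming the stat is in bytes.
 */
static inline unsigned long zswap_incomp_bytes_to_pages(unsigned long bytes)
{
	return bytes / PAGE_SIZE;
}
```

Under those assumptions, the 104800256 reported in the v1 test would
correspond to 25586 incompressible pages out of the 51222 pages in zswpout.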