From: Nhat Pham <nphamcs@gmail.com>
To: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>,
Sergey Senozhatsky <senozhatsky@chromium.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Yosry Ahmed <yosry.ahmed@linux.dev>,
Nhat Pham <hoangnhat.pham@linux.dev>,
Chengming Zhou <chengming.zhou@linux.dev>,
Michal Hocko <mhocko@kernel.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
Shakeel Butt <shakeel.butt@linux.dev>,
Muchun Song <muchun.song@linux.dev>,
Andrew Morton <akpm@linux-foundation.org>,
cgroups@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: Re: [PATCH 0/8] mm/zswap, zsmalloc: Per-memcg-lruvec zswap accounting
Date: Tue, 3 Mar 2026 10:01:51 -0800 [thread overview]
Message-ID: <CAKEwX=NW75_ftM5ZuJJRMB2CLnB-25KPvamb4L1eP=i0XuFS_g@mail.gmail.com> (raw)
In-Reply-To: <20260303175140.1032459-1-joshua.hahnjy@gmail.com>
On Tue, Mar 3, 2026 at 9:51 AM Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
>
> On Mon, 2 Mar 2026 13:31:32 -0800 Nhat Pham <nphamcs@gmail.com> wrote:
>
> > On Thu, Feb 26, 2026 at 11:29 AM Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
>
> [...snip...]
>
> > > Introduce a new per-zpdesc array of objcg pointers to track
> > > per-memcg-lruvec memory usage by zswap, while leaving zram users
> > > unaffected.
>
> [...snip...]
>
> Hi Nhat! I hope you are doing well :-) Thank you for taking a look!
>
> > I might have missed it (it might be in one of the later patches), but
> > could you also add some quick and dirty benchmarks for zswap to ensure
> > there are no (or minimal) performance implications? IIUC there is a
> > small amount of extra overhead in certain steps, because we have to go
> > through zsmalloc to query the objcg. Usemem or a kernel build should
> > suffice IMHO.
>
> Yup, this was one of my concerns too. I tried to do a somewhat comprehensive
> analysis below, hopefully this can show a good picture of what's happening.
> Spoilers: there don't seem to be any significant regressions (< 1%),
> and the regressions that do appear are within a small fraction of the
> standard deviation.
>
> One thing that I have noticed is that there is a tangible reduction in
> standard deviation for some of these benchmarks. I can't exactly pinpoint
> why this is happening, but I'll take it as a win :p
>
> > To be clear, I don't anticipate any observable performance change, but
> > it's a good sanity check :) Besides, can't be too careful with stress
> > testing stuff :P
>
> For sure. I should have done these and included them in the original RFC,
> but I think I might have been too eager to get the RFC out :-)
> I will include them in the second version of the series!
>
> All the experiments below are done on a 2-NUMA system. The data is quite
> compressible, which I think makes sense for measuring the overhead of accounting.
>
> Benchmark 1
> Allocating 2G memory to one node with 1G memory.high. Average across 10 trials
> +-------------------------+---------+----------+
> |                         | average | stddev   |
> +-------------------------+---------+----------+
> | Baseline (11439c4635ed) | 8887.82 | 362.40   |
> | Baseline + Series       | 8944.16 | 356.45   |
> +-------------------------+---------+----------+
> | Delta                   | +0.634% | -1.642%  |
> +-------------------------+---------+----------+
>
> Benchmark 2
> Allocating 2G memory to one node with 1G memory.high, churn 5x through the
> memory. Average across 5 trials.
> +-------------------------+----------+----------+
> |                         | average  | stddev   |
> +-------------------------+----------+----------+
> | Baseline (11439c4635ed) | 31152.96 | 166.23   |
> | Baseline + Series       | 31355.28 | 64.86    |
> +-------------------------+----------+----------+
> | Delta                   | +0.649%  | -60.981% |
> +-------------------------+----------+----------+
>
> Benchmark 3
> Allocating 2G memory to one node with 1G memory.high, split across 2 nodes.
> Average across 5 trials.
> +-------------------------+---------+----------+
> |                         | average | stddev   |
> +-------------------------+---------+----------+
> | Baseline (11439c4635ed) | 16101.6 | 174.18   |
> | Baseline + Series       | 16022.4 | 117.17   |
> +-------------------------+---------+----------+
> | Delta                   | -0.492% | -32.731% |
> +-------------------------+---------+----------+
>
> Benchmark 4
> Reading stat files 10000 times under memory pressure
>
> memory.stat
> +-------------------------+---------+----------+
> |                         | average | stddev   |
> +-------------------------+---------+----------+
> | Baseline (11439c4635ed) | 24524.4 | 501.7    |
> | Baseline + Series       | 24807.2 | 444.53   |
> +-------------------------+---------+----------+
> | Delta                   | +1.153% | -11.395% |
> +-------------------------+---------+----------+
>
> memory.numa_stat
> +-------------------------+---------+----------+
> |                         | average | stddev   |
> +-------------------------+---------+----------+
> | Baseline (11439c4635ed) | 24807.2 | 444.53   |
> | Baseline + Series       | 23837.6 | 521.68   |
> +-------------------------+---------+----------+
> | Delta                   | -3.905% | +17.355% |
> +-------------------------+---------+----------+
>
> /proc/vmstat
> +-------------------------+---------+----------+
> |                         | average | stddev   |
> +-------------------------+---------+----------+
> | Baseline (11439c4635ed) | 24793.6 | 285.26   |
> | Baseline + Series       | 23815.6 | 553.44   |
> +-------------------------+---------+----------+
> | Delta                   | -3.945% | +94.012% |
> +-------------------------+---------+----------+
>
> ^^^ Some big increase in standard deviation here, although there is some
> decrease in the average time. Probably the most notable change that I've seen
> from this patch.
>
> node0/vmstat
> +-------------------------+---------+----------+
> |                         | average | stddev   |
> +-------------------------+---------+----------+
> | Baseline (11439c4635ed) | 24541.4 | 281.41   |
> | Baseline + Series       | 24479   | 241.29   |
> +-------------------------+---------+----------+
> | Delta                   | -0.254% | -14.257% |
> +-------------------------+---------+----------+
>
> That's a lot of results; the averages are mostly negligible, but there
> are some non-negligible changes in standard deviation going in both
> directions. I don't see anything too concerning off the top of my head,
> but for the next version I'll try to do some more testing across
> different machines as well. (I don't have any machines with > 2 nodes,
> but maybe I can do some tests on QEMU just to sanity check.)
>
> Thanks again, Nhat. Have a great day!
> Joshua
Sounds like any meagre performance difference is smaller than the noise :P
If it's this negligible on these microbenchmarks, it should be
infinitesimal in production workloads, where these operations are only a
very small part of the overall work.
Kinda makes sense, because obj_cgroup access only happens in a small
subset of operations, namely zswap entry store and zswap entry free,
each of which can only happen once per zswap entry.
I think we're fine, but I'll let other reviewers comment on it as well.
Thread overview: 19+ messages
2026-02-26 19:29 Joshua Hahn
2026-02-26 19:29 ` [PATCH 1/8] mm/zsmalloc: Rename zs_object_copy to zs_obj_copy Joshua Hahn
2026-02-26 19:29 ` [PATCH 2/8] mm/zsmalloc: Make all obj_idx unsigned ints Joshua Hahn
2026-02-26 19:29 ` [PATCH 3/8] mm/zsmalloc: Introduce objcgs pointer in struct zpdesc Joshua Hahn
2026-02-26 21:37 ` Shakeel Butt
2026-02-26 21:43 ` Joshua Hahn
2026-02-26 19:29 ` [PATCH 4/8] mm/zsmalloc: Store obj_cgroup pointer in zpdesc Joshua Hahn
2026-02-26 19:29 ` [PATCH 5/8] mm/zsmalloc,zswap: Redirect zswap_entry->obcg to zpdesc Joshua Hahn
2026-02-26 23:13 ` kernel test robot
2026-02-27 19:10 ` Joshua Hahn
2026-02-26 19:29 ` [PATCH 6/8] mm/zsmalloc, zswap: Handle objcg charging and lifetime in zsmalloc Joshua Hahn
2026-02-26 19:29 ` [PATCH 7/8] mm/memcontrol: Track MEMCG_ZSWAPPED in bytes Joshua Hahn
2026-02-26 19:29 ` [PATCH 8/8] mm/vmstat, memcontrol: Track ZSWAP_B, ZSWAPPED_B per-memcg-lruvec Joshua Hahn
2026-02-26 22:40 ` kernel test robot
2026-02-27 19:45 ` Joshua Hahn
2026-02-26 23:02 ` kernel test robot
2026-03-02 21:31 ` [PATCH 0/8] mm/zswap, zsmalloc: Per-memcg-lruvec zswap accounting Nhat Pham
2026-03-03 17:51 ` Joshua Hahn
2026-03-03 18:01 ` Nhat Pham [this message]