From: Johannes Weiner <hannes@cmpxchg.org>
To: Chris Li <chrisl@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>,
	akpm@linux-foundation.org, hughd@google.com,
	yosry.ahmed@linux.dev, mhocko@kernel.org,
	roman.gushchin@linux.dev, shakeel.butt@linux.dev,
	muchun.song@linux.dev, len.brown@intel.com,
	chengming.zhou@linux.dev, kasong@tencent.com,
	huang.ying.caritas@gmail.com, ryan.roberts@arm.com,
	shikemeng@huaweicloud.com, viro@zeniv.linux.org.uk,
	baohua@kernel.org, bhe@redhat.com, osalvador@suse.de,
	christophe.leroy@csgroup.eu, pavel@kernel.org,
	linux-mm@kvack.org, kernel-team@meta.com,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	linux-pm@vger.kernel.org, peterx@redhat.com, riel@surriel.com,
	joshua.hahnjy@gmail.com, npache@redhat.com, gourry@gourry.net,
	axelrasmussen@google.com, yuanchu@google.com, weixugc@google.com,
	rafael@kernel.org, jannh@google.com, pfalcato@suse.de,
	zhengqi.arch@bytedance.com
Subject: Re: [PATCH v3 00/20] Virtual Swap Space
Date: Tue, 10 Feb 2026 18:01:24 -0500
Message-ID: <aYu4xK8HhjO7mLP8@cmpxchg.org>
In-Reply-To: <CACePvbWnJFkMOtX8LbL+0hm5RP6jD5nfZcYUyxrJsPNTq0vbPg@mail.gmail.com>

Hi Chris,

On Tue, Feb 10, 2026 at 01:24:03PM -0800, Chris Li wrote:
> Hi Johannes,
> On Mon, Feb 9, 2026 at 6:36 PM Johannes Weiner <hannes@cmpxchg.org> wrote:
> > Here is the more detailed breakdown:
> 
> It seems you did not finish your sentence before sending your reply.

I did. I trimmed the quote of Nhat's cover letter to the parts
addressing your questions. If you use Gmail, click the three dots to
expand it:

> > > > The size of the virtual swap descriptor is 24 bytes. Note that this is
> > > > not all "new" overhead, as the swap descriptor will replace:
> > > > * the swap_cgroup arrays (one per swap type) in the old design, which
> > > >   are a massive source of static memory overhead. With the new design,
> > > >   this state is only allocated for used clusters.
> > > > * the swap tables, which hold the swap cache and workingset shadows.
> > > > * the zeromap bitmap, a bitmap over the physical swap slots indicating
> > > >   whether each swapped-out page is zero-filled or not.
> > > > * huge chunk of the swap_map. The swap_map is now replaced by 2 bitmaps,
> > > >   one for allocated slots, and one for bad slots, representing 3 possible
> > > >   states of a slot on the swapfile: allocated, free, and bad.
> > > > * the zswap tree.
> > > >
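
To make the 24-byte figure a little more concrete: purely as an
illustration (this is my own sketch, not the structure from the
series), the pieces listed above fit into three 64-bit words along
these lines:

struct vswap_desc_sketch {		/* illustrative layout only */
	union {				/* where the data currently lives */
		void *zswap_entry;		/* compressed copy in the zswap pool */
		unsigned long phys_slot;	/* slot on the physical swap device */
	};
	void *cache_or_shadow;		/* folio in swap cache, or workingset shadow */
	unsigned int swap_count;	/* full-width count, no swap continuation */
	unsigned short memcgid;		/* replaces the static swap_cgroup array */
	unsigned short flags;		/* e.g. zero-filled, replacing the zeromap bit */
};					/* 8 + 8 + 4 + 2 + 2 = 24 bytes on 64-bit */

The actual layout in the series will differ; the point is just that a
backing-store word, a swap-cache/shadow word, a full swap count and a
cgroup id account for the three words.
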
> > > > So, in terms of additional memory overhead:
> > > > * For zswap entries, the added memory overhead is rather minimal. The
> > > >   new indirection pointer neatly replaces the existing zswap tree.
> > > >   We really only incur less than one word of overhead for swap count
> > > >   blow up (since we no longer use swap continuation) and the swap type.
> > > > * For physical swap entries, the new design will impose fewer than 3
> > > >   words of memory overhead. However, as noted above, this overhead is
> > > >   only incurred for actively used swap entries, whereas in the current
> > > >   design the overhead is static (the swap cgroup array, for example).
> > > >
> > > >   The primary victim of this overhead will be zram users. However, as
> > > >   zswap now no longer takes up disk space, zram users can consider
> > > >   switching to zswap (which, as a bonus, has a lot of useful features
> > > >   out of the box, such as cgroup tracking, dynamic zswap pool sizing,
> > > >   LRU-ordering writeback, etc.).
> > > >
> > > > For a more concrete example, suppose we have a 32 GB swapfile (i.e.
> > > > 8,388,608 swap entries), and we use zswap.
> > > >
> > > > 0% usage, or 0 entries: 0.00 MB
> > > > * Old design total overhead: 25.00 MB
> > > > * Vswap total overhead: 0.00 MB
> > > >
> > > > 25% usage, or 2,097,152 entries:
> > > > * Old design total overhead: 57.00 MB
> > > > * Vswap total overhead: 48.25 MB
> > > >
> > > > 50% usage, or 4,194,304 entries:
> > > > * Old design total overhead: 89.00 MB
> > > > * Vswap total overhead: 96.50 MB
> > > >
> > > > 75% usage, or 6,291,456 entries:
> > > > * Old design total overhead: 121.00 MB
> > > > * Vswap total overhead: 144.75 MB
> > > >
> > > > 100% usage, or 8,388,608 entries:
> > > > * Old design total overhead: 153.00 MB
> > > > * Vswap total overhead: 193.00 MB
> > > >
> > > > So even in the worst-case scenario for virtual swap, i.e. when we
> > > > somehow have an oracle to correctly size the swapfile for the zswap
> > > > pool to 32 GB, the added overhead is only 40 MB, which is a mere
> > > > 0.12% of the total swapfile :)
> > > >
> > > > In practice, the overhead will be closer to the 50-75% usage case, as
> > > > systems tend to leave swap headroom for pathological events or sudden
> > > > spikes in memory requirements. The added overhead in these cases is
> > > > practically negligible. And in deployments where swapfiles for zswap
> > > > were previously sparsely used, switching over to virtual swap will
> > > > actually reduce memory overhead.
> > > >
> > > > Doing the same math for the disk swap, which is the worst case for
> > > > virtual swap in terms of swap backends:
> > > >
> > > > 0% usage, or 0 entries: 0.00 MB
> > > > * Old design total overhead: 25.00 MB
> > > > * Vswap total overhead: 2.00 MB
> > > >
> > > > 25% usage, or 2,097,152 entries:
> > > > * Old design total overhead: 41.00 MB
> > > > * Vswap total overhead: 66.25 MB
> > > >
> > > > 50% usage, or 4,194,304 entries:
> > > > * Old design total overhead: 57.00 MB
> > > > * Vswap total overhead: 130.50 MB
> > > >
> > > > 75% usage, or 6,291,456 entries:
> > > > * Old design total overhead: 73.00 MB
> > > > * Vswap total overhead: 194.75 MB
> > > >
> > > > 100% usage, or 8,388,608 entries:
> > > > * Old design total overhead: 89.00 MB
> > > > * Vswap total overhead: 259.00 MB
> > > >
> > > > The added overhead is 170 MB, which is 0.5% of the total swapfile size,
> > > > again in the worst case when we have a sizing oracle.
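
For anyone who wants to sanity-check the arithmetic: the numbers above
are consistent with a simple per-slot accounting. The program below is
my own back-of-the-envelope reconstruction -- the per-slot constants
are assumptions on my part rather than figures taken from the series --
but it reproduces the quoted overheads to within about a megabyte:

/*
 * Back-of-the-envelope reconstruction of the overhead figures quoted
 * above (32 GB swapfile, 4 KiB pages, 8,388,608 slots).  Per-slot
 * costs assumed here:
 *   old design:   1-byte swap_map + 2-byte swap_cgroup + 1-bit zeromap
 *                 per slot (static), plus ~8 bytes of swap cache/table
 *                 and, for zswap, ~8 bytes of zswap tree per used slot.
 *   virtual swap: 24-byte descriptor per used slot, plus (for disk
 *                 swap) two 1-bit bitmaps and ~8 bytes of
 *                 slot-to-descriptor mapping per used slot.
 */
#include <stdio.h>

#define GB		(1ULL << 30)
#define PAGE_SIZE	4096ULL
#define NR_SLOTS	(32 * GB / PAGE_SIZE)	/* 8,388,608 */

static double mb(double bytes)
{
	return bytes / (1 << 20);
}

int main(void)
{
	unsigned long long used;

	for (used = 0; used <= NR_SLOTS; used += NR_SLOTS / 4) {
		double old_static = NR_SLOTS * (1 + 2 + 1.0 / 8);

		double old_zswap = old_static + used * (8 + 8.0);
		double new_zswap = used * 24.0;

		double old_disk = old_static + used * 8.0;
		double new_disk = NR_SLOTS / 8.0 * 2 + used * (24 + 8.0);

		printf("%3llu%%: zswap old %7.2f MB new %7.2f MB | disk old %7.2f MB new %7.2f MB\n",
		       used * 100 / NR_SLOTS, mb(old_zswap), mb(new_zswap),
		       mb(old_disk), mb(new_disk));
	}
	return 0;
}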


Thread overview: 52+ messages
2026-02-08 21:58 Nhat Pham
2026-02-08 21:58 ` [PATCH v3 01/20] mm/swap: decouple swap cache from physical swap infrastructure Nhat Pham
2026-02-08 22:26   ` [PATCH v3 00/20] Virtual Swap Space Nhat Pham
2026-02-10 17:59     ` Kairui Song
2026-02-10 18:52       ` Johannes Weiner
2026-02-10 19:11       ` Nhat Pham
2026-02-10 19:23         ` Nhat Pham
2026-02-12  5:07         ` Chris Li
2026-02-17 23:36         ` Nhat Pham
2026-02-10 21:58       ` Chris Li
2026-02-20 21:05       ` [PATCH] vswap: fix poor batching behavior of vswap free path Nhat Pham
2026-02-08 22:31   ` [PATCH v3 00/20] Virtual Swap Space Nhat Pham
2026-02-09 12:20     ` Chris Li
2026-02-10  2:36       ` Johannes Weiner
2026-02-10 21:24         ` Chris Li
2026-02-10 23:01           ` Johannes Weiner [this message]
2026-02-10 18:00       ` Nhat Pham
2026-02-10 23:17         ` Chris Li
2026-02-08 22:39   ` Nhat Pham
2026-02-09  2:22   ` [PATCH v3 01/20] mm/swap: decouple swap cache from physical swap infrastructure kernel test robot
2026-02-08 21:58 ` [PATCH v3 02/20] swap: rearrange the swap header file Nhat Pham
2026-02-08 21:58 ` [PATCH v3 03/20] mm: swap: add an abstract API for locking out swapoff Nhat Pham
2026-02-08 21:58 ` [PATCH v3 04/20] zswap: add new helpers for zswap entry operations Nhat Pham
2026-02-08 21:58 ` [PATCH v3 05/20] mm/swap: add a new function to check if a swap entry is in swap cached Nhat Pham
2026-02-08 21:58 ` [PATCH v3 06/20] mm: swap: add a separate type for physical swap slots Nhat Pham
2026-02-08 21:58 ` [PATCH v3 07/20] mm: create scaffolds for the new virtual swap implementation Nhat Pham
2026-02-08 21:58 ` [PATCH v3 08/20] zswap: prepare zswap for swap virtualization Nhat Pham
2026-02-08 21:58 ` [PATCH v3 09/20] mm: swap: allocate a virtual swap slot for each swapped out page Nhat Pham
2026-02-09 17:12   ` kernel test robot
2026-02-11 13:42   ` kernel test robot
2026-02-08 21:58 ` [PATCH v3 10/20] swap: move swap cache to virtual swap descriptor Nhat Pham
2026-02-08 21:58 ` [PATCH v3 11/20] zswap: move zswap entry management to the " Nhat Pham
2026-02-08 21:58 ` [PATCH v3 12/20] swap: implement the swap_cgroup API using virtual swap Nhat Pham
2026-02-08 21:58 ` [PATCH v3 13/20] swap: manage swap entry lifecycle at the virtual swap layer Nhat Pham
2026-02-08 21:58 ` [PATCH v3 14/20] mm: swap: decouple virtual swap slot from backing store Nhat Pham
2026-02-10  6:31   ` Dan Carpenter
2026-02-08 21:58 ` [PATCH v3 15/20] zswap: do not start zswap shrinker if there is no physical swap slots Nhat Pham
2026-02-08 21:58 ` [PATCH v3 16/20] swap: do not unnecesarily pin readahead swap entries Nhat Pham
2026-02-08 21:58 ` [PATCH v3 17/20] swapfile: remove zeromap bitmap Nhat Pham
2026-02-08 21:58 ` [PATCH v3 18/20] memcg: swap: only charge physical swap slots Nhat Pham
2026-02-09  2:01   ` kernel test robot
2026-02-09  2:12   ` kernel test robot
2026-02-08 21:58 ` [PATCH v3 19/20] swap: simplify swapoff using virtual swap Nhat Pham
2026-02-08 21:58 ` [PATCH v3 20/20] swapfile: replace the swap map with bitmaps Nhat Pham
2026-02-08 22:51 ` [PATCH v3 00/20] Virtual Swap Space Nhat Pham
2026-02-12 12:23   ` David Hildenbrand (Arm)
2026-02-12 17:29     ` Nhat Pham
2026-02-12 17:39       ` Nhat Pham
2026-02-12 20:11         ` David Hildenbrand (Arm)
2026-02-12 17:41       ` David Hildenbrand (Arm)
2026-02-12 17:45         ` Nhat Pham
2026-02-10 15:45 ` [syzbot ci] " syzbot ci
