From: Kairui Song <ryncsn@gmail.com>
To: Barry Song <21cnbao@gmail.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Zi Yan <ziy@nvidia.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
Hugh Dickins <hughd@google.com>, Chris Li <chrisl@kernel.org>,
Kemeng Shi <shikemeng@huaweicloud.com>,
Nhat Pham <nphamcs@gmail.com>, Baoquan He <bhe@redhat.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Yosry Ahmed <yosry.ahmed@linux.dev>,
Youngjun Park <youngjun.park@lge.com>,
Chengming Zhou <chengming.zhou@linux.dev>,
Roman Gushchin <roman.gushchin@linux.dev>,
Shakeel Butt <shakeel.butt@linux.dev>,
Muchun Song <muchun.song@linux.dev>,
Qi Zheng <zhengqi.arch@bytedance.com>,
linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Subject: Re: [PATCH RFC 00/15] mm, swap: swap table phase IV with dynamic ghost swapfile
Date: Sat, 21 Feb 2026 17:07:03 +0800
Message-ID: <CAMgjq7CXgGxhtU3XJYnxVQ8fFYtNZBN3uF4FgqbBVV75ohOhtg@mail.gmail.com>
In-Reply-To: <CAGsJ_4xF5sK8H1RsqRNoi7DfGBtThASsozY30gq_kdRLaYgaTw@mail.gmail.com>
On Sat, Feb 21, 2026 at 4:16 PM Barry Song <21cnbao@gmail.com> wrote:
>
> On Fri, Feb 20, 2026 at 7:42 AM Kairui Song via B4 Relay
> <devnull+kasong.tencent.com@kernel.org> wrote:
>
> To be honest, I really dislike the name "ghost." I would
> prefer something that reflects its actual functionality.
> "Ghost" does not describe what it does and feels rather
> arbitrary.
Hi Barry,
That can easily be changed with a search and replace. I kept the name
only because patch 13 comes directly from Chris and I didn't change
it.
>
> I suggest retiring the name "ghost" and replacing it with
> something more appropriate. "vswap" could be a good option,
That looks good to me too. You can also check page 23 of the slides
from LSFMM last year to see how I imagined things would work out at
that time:
https://drive.google.com/file/d/1_QKlXErUkQ-TXmJJy79fJoLPui9TGK1S/view
The actual layout will be a bit different from that slide: the
redirect entry will be in the lower devices, and the virtual device
will have an extra virtual table to hold its redirect entry. Still,
I'm glad that plain swap keeps zero overhead, so ZRAM or
high-performance NVMe is still good.
> > Currently, the dynamic ghost files are just reported as ordinary swap files
> > in /proc/swaps and we can have multiple ones, so users will have a full
> > view of what's going on. This is a very easy-to-change design decision.
> > I'm open to ideas about how we should present this to users. e.g., Hiding
> > it will make it more "virtual", but I don't think that's a good idea.
>
> Even if it remains visible in /proc/swaps, I would rather
> not represent it as a real file in any filesystem. Putting
> a "ghost" swapfile on something like ext4 seems unnatural.
What do you think about this? Here is the output after this series:
# swapon
NAME           TYPE       SIZE USED PRIO
/dev/ghostswap ghost     11.5G 821M   -1
/dev/ram0      partition 1024G 9.9M   -1
/dev/vdb2      partition    2G 112K   -1
Or we can rename it to:
# swapon
NAME       TYPE       SIZE USED PRIO
/dev/xswap xswap     11.5G 821M   -1
/dev/ram0  partition 1024G 9.9M   -1
/dev/vdb2  partition    2G 112K   -1
`swapon /dev/xswap` enables this layer (for now I just hardcoded its
size to 8 times the total RAM), and `swapoff /dev/xswap` disables it.
We can also change the priority, or hide it.
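To make the hardcoded sizing concrete, here is a minimal userspace
sketch of the "8 times total RAM" rule. It is not the code from this
series; the names VSWAP_RAM_MULTIPLIER, vswap_size_from_ram() and
vswap_default_size() are hypothetical, and it reads total RAM through
sysinfo(2) purely for illustration.

```c
#include <stdint.h>
#include <sys/sysinfo.h>

/* Hypothetical name for the hardcoded factor mentioned above. */
#define VSWAP_RAM_MULTIPLIER 8ULL

/* Size of the virtual layer, in bytes, for a given amount of RAM. */
static uint64_t vswap_size_from_ram(uint64_t total_ram_bytes)
{
	return total_ram_bytes * VSWAP_RAM_MULTIPLIER;
}

/*
 * Illustration only: derive the default size from this machine's RAM.
 * Returns 0 on success and -1 if sysinfo(2) fails.
 */
static int vswap_default_size(uint64_t *size_bytes)
{
	struct sysinfo info;

	if (sysinfo(&info) != 0)
		return -1;

	/* totalram is expressed in units of mem_unit bytes. */
	*size_bytes = vswap_size_from_ram((uint64_t)info.totalram *
					  info.mem_unit);
	return 0;
}
```

Since the size is only a number, the multiplier could just as well be
a parameter instead of a constant.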
> > And for easier testing, I added a /dev/ghostswap in this RFC. `swapon
> > /dev/ghostswap` enables that. Without swapon /dev/ghostswap, any existing
> > users, including ZRAM, won't observe any change.
>
> /dev/ghostswap is assumed to be a virtual block device or
> something similar? If it is a block device, how is its size
> related to si->size?
It's not a real device, just a placeholder that makes swapon usable
without any modification, for easier testing (some user space
implementations don't work well with the dummy header). And it has
nothing to do with si->size.
>
> Looking at [PATCH RFC 14/15] mm, swap: add a special device
> for ghost swap setup, it appears to be a character device.
> This feels very odd to me. I’m not in favor of coupling the
> ghost swapfile with a memdev character device.
> A cdev should be a true character device.
No coupling at all. It's just a placeholder so that swapon (the
syscall) knows it's a virtual device. It is simply an alternative to
the dummy header approach from Chris, so people can test it more
easily.
The si->size is just a number, and any value can be given. I just
haven't decided whether we should pass the number to the kernel or
make it dynamic, e.g. set it to the total RAM size and increase it by
2M every time a new cluster is used.
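The dynamic option can be sketched in a few lines of userspace C. This
is only a model of the idea, not the implementation in this series:
struct vswap_model and both helpers are hypothetical names, and the
only facts taken from the text above are the starting value (total RAM
size) and the 2M growth per newly used cluster.

```c
#include <stdint.h>

/* 2M growth step per newly used cluster, as described above. */
#define VSWAP_GROW_STEP_BYTES (2ULL << 20)

/* Hypothetical stand-in for the advertised size of the virtual device. */
struct vswap_model {
	uint64_t size_bytes;    /* advertised size; "just a number" */
	uint64_t clusters_used; /* clusters handed out so far */
};

/* Start with the advertised size equal to total RAM. */
static void vswap_model_init(struct vswap_model *v, uint64_t total_ram_bytes)
{
	v->size_bytes = total_ram_bytes;
	v->clusters_used = 0;
}

/* Each newly used cluster grows the advertised size by one step. */
static void vswap_model_use_new_cluster(struct vswap_model *v)
{
	v->clusters_used++;
	v->size_bytes += VSWAP_GROW_STEP_BYTES;
}
```

With this scheme the reported size tracks actual usage, so nothing has
to be passed in from user space at swapon time.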