From: Barry Song <21cnbao@gmail.com>
To: Kairui Song <ryncsn@gmail.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	 David Hildenbrand <david@kernel.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Zi Yan <ziy@nvidia.com>,
	 Baolin Wang <baolin.wang@linux.alibaba.com>,
	Hugh Dickins <hughd@google.com>,  Chris Li <chrisl@kernel.org>,
	Kemeng Shi <shikemeng@huaweicloud.com>,
	 Nhat Pham <nphamcs@gmail.com>, Baoquan He <bhe@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	 Yosry Ahmed <yosry.ahmed@linux.dev>,
	Youngjun Park <youngjun.park@lge.com>,
	 Chengming Zhou <chengming.zhou@linux.dev>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	 Shakeel Butt <shakeel.butt@linux.dev>,
	Muchun Song <muchun.song@linux.dev>,
	 Qi Zheng <zhengqi.arch@bytedance.com>,
	linux-kernel@vger.kernel.org,  cgroups@vger.kernel.org
Subject: Re: [PATCH RFC 00/15] mm, swap: swap table phase IV with dynamic ghost swapfile
Date: Sat, 21 Feb 2026 17:30:15 +0800
Message-ID: <CAGsJ_4zewviHRYcDVe5RSDKR5XyRppLj=7BN4dyyCCGDTKhD1A@mail.gmail.com>
In-Reply-To: <CAMgjq7CXgGxhtU3XJYnxVQ8fFYtNZBN3uF4FgqbBVV75ohOhtg@mail.gmail.com>

On Sat, Feb 21, 2026 at 5:07 PM Kairui Song <ryncsn@gmail.com> wrote:
>
> On Sat, Feb 21, 2026 at 4:16 PM Barry Song <21cnbao@gmail.com> wrote:
> >
> > On Fri, Feb 20, 2026 at 7:42 AM Kairui Song via B4 Relay
> > <devnull+kasong.tencent.com@kernel.org> wrote:
> >
> > To be honest, I really dislike the name "ghost." I would
> > prefer something that reflects its actual functionality.
> > "Ghost" does not describe what it does and feels rather
> > arbitrary.
>
> Hi Barry,
>
> That can easily be changed with a search and replace. I kept the
> name only because patch 13 comes directly from Chris and I didn't
> change it.
>
> >
> > I suggest retiring the name "ghost" and replacing it with
> > something more appropriate. "vswap" could be a good option,
>
> That looks good to me too. You can also check page 23 of the slides
> from LSFMM last year to see how I imagined things would work out at
> that time:
> https://drive.google.com/file/d/1_QKlXErUkQ-TXmJJy79fJoLPui9TGK1S/view
>
> The actual layout will be a bit different from that slide: the
> redirect entry will be in the lower devices, and the virtual device
> will have an extra virtual table to hold its redirect entry. Still,
> I'm glad that plain swap keeps zero overhead, so ZRAM or
> high-performance NVMe is still good.
>
> > > Currently, the dynamic ghost files are just reported as ordinary swap files
> > > in /proc/swaps and we can have multiple ones, so users will have a full
> > > view of what's going on. This is a very easy-to-change design decision.
> > > I'm open to ideas about how we should present this to users. e.g., Hiding
> > > it will make it more "virtual", but I don't think that's a good idea.
> >
> > Even if it remains visible in /proc/swaps, I would rather
> > not represent it as a real file in any filesystem. Putting
> > a "ghost" swapfile on something like ext4 seems unnatural.
>
> What do you think about this? Here is the output after this series:
> # swapon
> NAME           TYPE       SIZE USED PRIO
> /dev/ghostswap ghost     11.5G 821M   -1
> /dev/ram0      partition 1024G 9.9M   -1
> /dev/vdb2      partition    2G 112K   -1

I’d rather have a “virtual” block device, /dev/xswap, with
its size displayed as 11.5G via `ls -l filename`. This is
also more natural than relying on a cdev placeholder.
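
For reference, here is a minimal userspace sketch of how a tool could tell a block node from a cdev placeholder and read the size of a block device. The helper names `node_kind` and `node_size_bytes` are made up for illustration, and /dev/xswap is only the name proposed in this thread, so nothing here describes the RFC's behavior. Since `ls -l` prints major/minor numbers rather than a size for device nodes, the sketch reads the size by seeking to the end of the opened node; that a virtual swap block device would report its capacity this way is an assumption.

```python
import os
import stat


def node_kind(mode: int) -> str:
    """Classify an st_mode value as block, char, regular, or other."""
    if stat.S_ISBLK(mode):
        return "block"
    if stat.S_ISCHR(mode):
        return "char"
    if stat.S_ISREG(mode):
        return "regular"
    return "other"


def node_size_bytes(path: str) -> int:
    """Return the size of a regular file or block device by seeking to its end."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, 0, os.SEEK_END)
    finally:
        os.close(fd)
```

A caller would pass `os.stat(path).st_mode` to `node_kind` and only ask for a size when the result is "block" or "regular".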

If

>
> Or we can rename it to:
> # swapon
> NAME           TYPE       SIZE USED PRIO
> /dev/xswap     xswap     11.5G 821M   -1
> /dev/ram0      partition 1024G 9.9M   -1
> /dev/vdb2      partition    2G 112K   -1
>
> swapon /dev/xswap will enable this layer (for now I just hardcoded it
> to be 8 times the size of total RAM). swapoff /dev/xswap disables it.
> We can also change the priority.
>
> We can also hide it.
>
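
To make the "full view for users" point concrete, here is a small sketch of how a script could pick the virtual entry out of the table quoted above. It assumes the TYPE column would carry a distinct value such as "xswap", which is only a proposal in this thread; `parse_swapon_table` and `virtual_entries` are hypothetical helper names, and the parser assumes device names contain no spaces.

```python
# Sample table copied from the proposal quoted above.
SAMPLE = """\
NAME           TYPE       SIZE USED PRIO
/dev/xswap     xswap     11.5G 821M   -1
/dev/ram0      partition 1024G 9.9M   -1
/dev/vdb2      partition    2G 112K   -1
"""


def parse_swapon_table(text: str) -> list[dict[str, str]]:
    """Split a whitespace-separated table with a header row into dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def virtual_entries(rows: list[dict[str, str]], virtual_type: str = "xswap") -> list[str]:
    """Return the names of rows whose TYPE matches the (assumed) virtual type."""
    return [row["NAME"] for row in rows if row["TYPE"] == virtual_type]
```

If the device were hidden instead, a script like this would have nothing to match on, which is one practical cost of hiding it.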
> > > And for easier testing, I added a /dev/ghostswap in this RFC. `swapon
> > > /dev/ghostswap` enables that. Without swapon /dev/ghostswap, any existing
> > > users, including ZRAM, won't observe any change.
> >
> > /dev/ghostswap is assumed to be a virtual block device or
> > something similar? If it is a block device, how is its size
> > related to si->size?
>
> It's not a real device, just a placeholder to make swapon usable
> without any modification for easier testing (some userspace
> implementations don't work well with a dummy header). And it has
> nothing to do with si->size.

I understand it is a placeholder for swap, but if it appears
as /dev/ghostswap, users browsing /dev/ will see it as a
real cdev. A character device under /dev is intended for
user read/write access.
Also, udev rules can act on an exported cdev. This couples
us to a lot of userspace behavior.

>
> >
> > Looking at [PATCH RFC 14/15] mm, swap: add a special device
> > for ghost swap setup, it appears to be a character device.
> > This feels very odd to me. I’m not in favor of coupling the
> > ghost swapfile with a memdev character device.
> > A cdev should be a true character device.
>
> No coupling at all, it's just a placeholder so swapon (the syscall)
> knows it's a virtual device, which is just an alternative to the dummy
> header approach from Chris, so people can test it more easily.

Using a cdev as a placeholder has introduced behavioral
coupling. For swap, it serves as a placeholder; for anything
outside swap, it behaves as a regular cdev.

>
> The si->size is just a number and any value can be given. I just
> haven't decided whether we should pass the number to the kernel or
> make it dynamic, e.g. set it to the total RAM size and increase it
> by 2M every time a new cluster is used.

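The two sizing options mentioned in the thread (a fixed 8x of total RAM, or starting at total RAM and growing by 2M per newly used cluster) come down to simple arithmetic. The sketch below only illustrates that arithmetic; the names `fixed_size` and `DynamicSize` are hypothetical and this is not how the RFC computes si->size.

```python
CLUSTER_BYTES = 2 << 20  # 2M growth step mentioned in the thread
RAM_FACTOR = 8           # hardcoded multiplier mentioned in the thread


def fixed_size(total_ram_bytes: int, factor: int = RAM_FACTOR) -> int:
    """Fixed policy: the reported size is a multiple of total RAM."""
    return total_ram_bytes * factor


class DynamicSize:
    """Dynamic policy: start at total RAM, grow by one cluster at a time."""

    def __init__(self, total_ram_bytes: int) -> None:
        self.size = total_ram_bytes

    def on_new_cluster(self) -> int:
        """Account for a newly used cluster and return the new size."""
        self.size += CLUSTER_BYTES
        return self.size
```

With the dynamic policy the reported size changes over time, so whatever node represents the device would need to tolerate a size that keeps growing.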

