From: Nhat Pham <nphamcs@gmail.com>
To: YoungJun Park <youngjun.park@lge.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
hannes@cmpxchg.org, hughd@google.com, yosry.ahmed@linux.dev,
mhocko@kernel.org, roman.gushchin@linux.dev,
shakeel.butt@linux.dev, muchun.song@linux.dev,
len.brown@intel.com, chengming.zhou@linux.dev,
kasong@tencent.com, chrisl@kernel.org,
huang.ying.caritas@gmail.com, ryan.roberts@arm.com,
viro@zeniv.linux.org.uk, baohua@kernel.org, osalvador@suse.de,
lorenzo.stoakes@oracle.com, christophe.leroy@csgroup.eu,
pavel@kernel.org, kernel-team@meta.com,
linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
linux-pm@vger.kernel.org, peterx@redhat.com, gunho.lee@lge.com,
taejoon.song@lge.com, iamjoonsoo.kim@lge.com
Subject: Re: [RFC PATCH v2 00/18] Virtual Swap Space
Date: Fri, 30 May 2025 09:54:43 -0700
Message-ID: <CAKEwX=NUD7w1CD30FY-_FdxQ6u1sUOAyvKhg-mDr6BUOkfFq_g@mail.gmail.com>
In-Reply-To: <CAKEwX=MjyEsoyDmMBCRr0QnBfgkTA5bfrshPbfSgNp887zaxVw@mail.gmail.com>
On Fri, May 30, 2025 at 9:52 AM Nhat Pham <nphamcs@gmail.com> wrote:
>
> On Thu, May 29, 2025 at 11:47 PM YoungJun Park <youngjun.park@lge.com> wrote:
> >
> > On Tue, Apr 29, 2025 at 04:38:28PM -0700, Nhat Pham wrote:
> > > Changelog:
> > > * v2:
> > > * Use a single atomic type (swap_refs) for reference counting
> > > purposes. This brings the size of the swap descriptor from 64 bytes
> > > down to 48 bytes (25% reduction). Suggested by Yosry Ahmed.
> > > * Zeromap bitmap is removed in the virtual swap implementation.
> > > This saves one bit per physical swapfile slot.
> > > * Rearrange the patches and the code change to make things more
> > > reviewable. Suggested by Johannes Weiner.
> > > * Update the cover letter a bit.
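To illustrate what the single-atomic refcount change above means in practice, here is a tiny sketch; the names and layout are guesses for illustration only, not the actual fields or helpers in the series:

    /* Illustrative sketch only, not the actual layout in the series. */
    #include <linux/atomic.h>
    #include <linux/types.h>

    struct swp_desc_sketch {
            atomic_t swap_refs;     /* all refcounting state in one word */
            /* backing-store fields elided */
    };

    static void vswap_ref_get(struct swp_desc_sketch *desc)
    {
            atomic_inc(&desc->swap_refs);
    }

    /* Returns true when the last reference is dropped. */
    static bool vswap_ref_put(struct swp_desc_sketch *desc)
    {
            return atomic_dec_and_test(&desc->swap_refs);
    }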
> >
> > Hi Nhat,
> >
> > Thank you for sharing this patch series.
> > I’ve read through it with great interest.
> >
> > I’m part of a kernel team working on features related to multi-tier swapping,
> > and this patch set appears quite relevant
> > to our ongoing discussions and early-stage implementation.
>
> May I ask - what's the use case you're thinking of here? Remote swapping?
>
> >
> > I had a couple of questions regarding the future direction.
> >
> > > * Multi-tier swapping (as mentioned in [5]), with transparent
> > > transferring (promotion/demotion) of pages across tiers (see [8] and
> > > [9]). Similar to swapoff, with the old design we would need to
> > > perform the expensive page table walk.
> >
> > Based on the discussion in [5], it seems there was some exploration
> > around enabling per-cgroup selection of multiple tiers.
> > Do you envision the current design evolving in a similar direction
> > to those past discussions, or is there a different direction you're aiming for?
To be extra clear, I don't have an issue with a cgroup-based interface
for swap tiering like that.
I think the only objection at the time was that we did not really have
a concrete use case in mind.
>
> IIRC, that past design focused on the interface aspect of the problem,
> but never actually touched on the mechanism needed to implement a
> multi-tier swapping solution.
>
> The simple reason is that it's impossible, or at least highly
> inefficient, to do this in the current design, i.e. without
> virtualizing swap. Storing the physical swap location in PTEs means
> that changing the swap backend requires a full page table walk to
> update all the PTEs that refer to the old physical swap location. So
> you have to pick your poison - either:
>
> 1. Pick your backend at swap-out time, and never change it. You might
> not have sufficient information to decide at that time, and it
> prevents you from adapting to changes in workload dynamics and the
> working set - the access frequency of pages might change, so their
> physical location should change accordingly.
>
> 2. Reserve space in every tier, and associate all of them with the
> same handle. This is kinda what zswap is doing. It is space
> efficient, and creates a lot of operational issues in production.
s/efficient/inefficient
>
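To make the indirection concrete, here is a rough sketch of the idea; all names here (vswap_desc, vswap_backing, vswap_retarget) are illustrative, not the actual types or functions in this series. PTEs keep holding a stable virtual slot, and only the per-slot descriptor records which tier currently backs the page, so moving a page between tiers updates one descriptor instead of walking page tables:

    /* Illustrative sketch only, not the actual data structures in the series. */
    #include <linux/atomic.h>
    #include <linux/spinlock.h>
    #include <linux/types.h>

    enum vswap_backing {
            VSWAP_ZSWAP,            /* compressed, in-memory pool */
            VSWAP_SWAPFILE,         /* slot on a physical swap device */
    };

    struct vswap_desc {
            atomic_t swap_refs;             /* consolidated reference count */
            spinlock_t lock;                /* protects the backing fields below */
            enum vswap_backing backing;     /* tier currently holding the page */
            unsigned long phys_slot;        /* location within that tier */
    };

    /*
     * Move a swapped-out page to a different tier (promotion/demotion).
     * Every PTE keeps pointing at the same virtual slot, so no page table
     * walk is needed; only the descriptor changes.
     */
    static void vswap_retarget(struct vswap_desc *desc,
                               enum vswap_backing new_backing,
                               unsigned long new_slot)
    {
            spin_lock(&desc->lock);
            desc->backing = new_backing;
            desc->phys_slot = new_slot;
            spin_unlock(&desc->lock);
    }

Compare this with the status quo, where the PTE itself encodes the physical slot, so the equivalent of vswap_retarget() would have to walk every page table that maps the entry.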
Thread overview: 30+ messages
2025-04-29 23:38 Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 01/18] swap: rearrange the swap header file Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 02/18] swapfile: rearrange functions Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 03/18] swapfile: rearrange freeing steps Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 04/18] mm: swap: add an abstract API for locking out swapoff Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 05/18] mm: swap: add a separate type for physical swap slots Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 06/18] mm: create scaffolds for the new virtual swap implementation Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 07/18] mm: swap: zswap: swap cache and zswap support for virtualized swap Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 08/18] mm: swap: allocate a virtual swap slot for each swapped out page Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 09/18] swap: implement the swap_cgroup API using virtual swap Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 10/18] swap: manage swap entry lifetime at the virtual swap layer Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 11/18] mm: swap: temporarily disable THP swapin and batched freeing swap Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 12/18] mm: swap: decouple virtual swap slot from backing store Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 13/18] zswap: do not start zswap shrinker if there is no physical swap slots Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 14/18] memcg: swap: only charge " Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 15/18] vswap: support THP swapin and batch free_swap_and_cache Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 16/18] swap: simplify swapoff using virtual swap Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 17/18] swapfile: move zeromap setup out of enable_swap_info Nhat Pham
2025-04-29 23:38 ` [RFC PATCH v2 18/18] swapfile: remove zeromap in virtual swap implementation Nhat Pham
2025-04-29 23:51 ` [RFC PATCH v2 00/18] Virtual Swap Space Nhat Pham
2025-05-30 6:47 ` YoungJun Park
2025-05-30 16:52 ` Nhat Pham
2025-05-30 16:54 ` Nhat Pham [this message]
2025-06-01 12:56 ` YoungJun Park
2025-06-01 16:14 ` Kairui Song
2025-06-02 15:17 ` YoungJun Park
2025-06-02 18:29 ` Nhat Pham
2025-06-03 9:50 ` Kairui Song
2025-06-01 21:08 ` Nhat Pham
2025-06-02 15:03 ` YoungJun Park