linux-mm.kvack.org archive mirror
From: Chris Li <chrisl@kernel.org>
To: Nhat Pham <nphamcs@gmail.com>
Cc: lsf-pc@lists.linux-foundation.org, linux-mm <linux-mm@kvack.org>,
	 ryan.roberts@arm.com, David Hildenbrand <david@redhat.com>,
	Barry Song <21cnbao@gmail.com>,
	 Chuanhua Han <hanchuanhua@oppo.com>
Subject: Re: [LSF/MM/BPF TOPIC] Swap Abstraction "the pony"
Date: Fri, 1 Mar 2024 10:57:07 -0800
Message-ID: <CAF8kJuNFtejEtjQHg5UBGduvFNn3AaGn4ffyoOrEnXfHpx6Ubg@mail.gmail.com>
In-Reply-To: <CAKEwX=P7AE8Ofqi4CyL0UOwSOVvHEG4kUFmRBzHH_N=NPxPDuA@mail.gmail.com>

On Fri, Mar 1, 2024 at 1:53 AM Nhat Pham <nphamcs@gmail.com> wrote:
> > At the swap entry level, here is the list of existing swap entry usage:
> >
> > * Swap entry allocation and free. Each swap entry needs to be
> > associated with a location of the disk space in the swapfile. (offset
> > of swap entry).
> > * Each swap entry needs to track the map count of the entry. (swap_map)
> > * Each swap entry needs to be able to find the associated memory
> > cgroup. (swap_cgroup_ctrl->map)
> > * Swap cache. Lookup folio/shadow from swap entry
> > * Swap page writes through a swapfile in a file system other than a
> > block device. (swap_extent)
> > * Shadow entry. (store in swap cache)
>
> IMHO, one thing this new abstraction should support is seamless
> transfer/migration of pages from one backend to another (perhaps from
> high to low priority backends, i.e. writeback).

Yes, that is the next step. I am just covering the existing usage here.
What you describe is what I call "the swap tiers". I considered that
topic but did not submit it this year. The current swap back end is
too entangled (for lack of a better word). It is very hard to add more
complex data structures to the existing swap back end. That is why I
want to untangle it a bit before attacking the next-level stuff.
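
To make the existing per-entry usage from the quoted list concrete, here
is a rough userspace sketch. It is a toy model, not kernel code: every
name in it (toy_swap_dev, toy_alloc, toy_dup, toy_put) is hypothetical,
and it only mirrors the roles of swap_map, swap_cgroup_ctrl->map and the
swap cache. The swap_extent mapping and all locking are left out.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_SLOTS 8 /* hypothetical device size, in swap slots */

/* Toy stand-in for one swap device. All names are hypothetical. */
struct toy_swap_dev {
	unsigned char map_count[TOY_NR_SLOTS];  /* role of swap_map; 0 = free */
	unsigned short memcg_id[TOY_NR_SLOTS];  /* role of swap_cgroup_ctrl->map */
	void *cache[TOY_NR_SLOTS];              /* role of swap cache: folio or shadow */
};

/* Mark every slot free and clear its metadata. */
void toy_dev_init(struct toy_swap_dev *dev)
{
	for (size_t i = 0; i < TOY_NR_SLOTS; i++) {
		dev->map_count[i] = 0;
		dev->memcg_id[i] = 0;
		dev->cache[i] = NULL;
	}
}

/* Allocate the lowest free slot; the offset doubles as the disk location. */
long toy_alloc(struct toy_swap_dev *dev, unsigned short memcg)
{
	for (long off = 0; off < TOY_NR_SLOTS; off++) {
		if (dev->map_count[off] == 0) {
			dev->map_count[off] = 1;
			dev->memcg_id[off] = memcg;
			return off;
		}
	}
	return -1; /* no free slot */
}

/* Take one more reference on an allocated entry (toy: overflow only asserted). */
void toy_dup(struct toy_swap_dev *dev, long off)
{
	assert(off >= 0 && off < TOY_NR_SLOTS);
	assert(dev->map_count[off] > 0 && dev->map_count[off] < 255);
	dev->map_count[off]++;
}

/* Drop one reference; returns 1 if the slot became free, 0 otherwise. */
int toy_put(struct toy_swap_dev *dev, long off)
{
	assert(off >= 0 && off < TOY_NR_SLOTS && dev->map_count[off] > 0);
	if (--dev->map_count[off] > 0)
		return 0;
	dev->memcg_id[off] = 0;
	dev->cache[off] = NULL;
	return 1;
}
```

The point of the sketch is that all of this state hangs off the same
offset into one device, which is part of what makes the entry hard to
move somewhere else.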

>
> I think this will require some careful redesigns. The closest thing we
> have right now is zswap -> backing swapfile. But it is currently
> handled in a rather peculiar manner - the underlying swap slot has
> already been reserved for the zswap entry. But there's a couple of
> problems with this:
>
> a) This is wasteful. We're essentially having the same piece of data
> occupying spaces in two levels in the hierarchies.

Can you elaborate? If you have a ghost swap file, zswap will not
store data in two swap devices.
The price to pay is that you need to allocate another swap slot on the
real backing swap file. That is the same as when you move data from an
SSD to a hard disk: you need to allocate a new swap entry on the
destination device.

> b) How do we generalize to a multi-tier hierarchy?

If zswap runs on a ghost swap file, flushing from zswap to another
real swap file would be very similar to flushing from one SSD to
another. That is the more generalized case. Zswap sharing a swap slot
with the backing swapfile is a very special case.
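
A minimal sketch of that generalized flush, as a toy userspace model
rather than kernel code. All names (tier_dev, tier_store, tier_migrate)
are hypothetical, and the model ignores concurrency, the swap cache and
page table updates. It only shows the accounting: the data lives in one
device at a time, and the cost of moving it is one newly allocated slot
on the destination.

```c
#include <assert.h>
#include <string.h>

#define TIER_NR_SLOTS 2   /* hypothetical slots per device */
#define TIER_PAGE_SIZE 16 /* toy "page" size in bytes */

/* One toy backend: a ghost swap file, an SSD, a hard disk, ... */
struct tier_dev {
	int used[TIER_NR_SLOTS];
	char data[TIER_NR_SLOTS][TIER_PAGE_SIZE];
};

/* Reserve the lowest free slot, or return -1 if the device is full. */
long tier_alloc(struct tier_dev *dev)
{
	for (long off = 0; off < TIER_NR_SLOTS; off++) {
		if (!dev->used[off]) {
			dev->used[off] = 1;
			return off;
		}
	}
	return -1;
}

/* Release a slot that is currently in use. */
void tier_free(struct tier_dev *dev, long off)
{
	assert(off >= 0 && off < TIER_NR_SLOTS && dev->used[off]);
	dev->used[off] = 0;
}

/* Store one page in a fresh slot; returns the slot or -1 if full. */
long tier_store(struct tier_dev *dev, const char *page)
{
	long off = tier_alloc(dev);

	if (off >= 0)
		memcpy(dev->data[off], page, TIER_PAGE_SIZE);
	return off;
}

/* Count the slots currently holding data. */
int tier_used_count(const struct tier_dev *dev)
{
	int n = 0;

	for (long off = 0; off < TIER_NR_SLOTS; off++)
		n += dev->used[off] ? 1 : 0;
	return n;
}

/*
 * Move one entry from src to dst: allocate a new slot on the
 * destination, copy the data, then free the source slot.
 * Returns the new offset, or -1 (source untouched) if dst is full.
 */
long tier_migrate(struct tier_dev *src, long src_off, struct tier_dev *dst)
{
	long dst_off;

	assert(src_off >= 0 && src_off < TIER_NR_SLOTS && src->used[src_off]);
	dst_off = tier_alloc(dst);
	if (dst_off < 0)
		return -1;
	memcpy(dst->data[dst_off], src->data[src_off], TIER_PAGE_SIZE);
	tier_free(src, src_off);
	return dst_off;
}
```

In this model nothing in tier_migrate() depends on what kind of device
src or dst is, which is the sense in which the ghost swap file case and
the SSD-to-SSD case look the same.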

> c) This is a bit too backend-specific. It'd be nice if we can make
> this as backend-agnostic as possible (if possible).

Totally agree, that is one of my motivations for the "swap.tiers" idea.

>
> Motivation: I'm currently working on/thinking about decoupling zswap and
> swap, and this is one of the more challenging aspects (as I can't seem
> to find a precedent in the swap world for page migration between swap
> backends), especially with respect to concurrent loads (and
> swapcache interactions).

It will be very messy if you try that in the current swap back end.

Chris

>
> I don't have good answers/designs quite yet - just raising some
> questions/concerns :)
>


