linux-mm.kvack.org archive mirror
From: Chris Li <chrisl@kernel.org>
To: Karim Manaouil <kmanaouil.dev@gmail.com>
Cc: Jan Kara <jack@suse.cz>, Chuanhua Han <hanchuanhua@oppo.com>,
	linux-mm <linux-mm@kvack.org>,
	 lsf-pc@lists.linux-foundation.org, ryan.roberts@arm.com,
	21cnbao@gmail.com,  david@redhat.com
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Swap Abstraction "the pony"
Date: Tue, 21 May 2024 13:40:56 -0700
Message-ID: <CANeU7QnoKUSdMjOGNFWFueH4LG+mj8J0Ezp_KetHhHUr2_pC_w@mail.gmail.com>
In-Reply-To: <ZkdJoOdr_IdGgpYz@localhost.localdomain>

Hi Karim,

On Fri, May 17, 2024 at 5:12 AM Karim Manaouil <kmanaouil.dev@gmail.com> wrote:
>
> On Thu, Mar 07, 2024 at 03:03:44PM +0100, Jan Kara wrote:
> > Frankly as I'm reading the discussions here, it seems to me you are trying
> > to reinvent a lot of things from the filesystem space :) Like block
> > allocation with reasonably efficient fragmentation prevention, transparent
> > data compression (zswap), hierarchical storage management (i.e., moving
> > data between different backing stores), efficient way to get from
> > VMA+offset to the place on disk where the content is stored. Sure you still
> > don't need a lot of things modern filesystems do like permissions,
> > directory structure (or even more complex namespacing stuff), all the stuff
> > achieving fs consistency after a crash, etc. But still what you need is a
> > notable portion of what filesystems do.
> >
> > So maybe it would be time to implement swap as a proper filesystem? Or even
> > better we could think about factoring out these bits out of some existing
> > filesystem to share code?
>
> I definitely agree with you on this point. I had the same exact thought,
> reading the discussion.
>
> Filesystems already implemented a lot of solutions for fragmentation
> avoidance that are more appropriate for slow storage media.
>

Swap and filesystems have very different requirements, usage
patterns, and I/O patterns.

> Also, writing chunks of any size (e.g. to directly write compressed
> pages) means slab-based management of swap space might not be ideal
> and will waste space for internal fragmentation. Also compaction
> for slow media is obviously harder and slower to implement compared
> to doing it in memory. You can do it in memory as well, but that is
> at the expense of more I/O.

I don't follow what you describe above. Swap entries are currently
not allocated from slab. The compressed swap backends, zswap and
zram, both use zsmalloc as the backend to store compressed pages.

>
> It sounds to me that all the problems above can be solved with an
> extent-based filesystem implementation of swap.

It looks good on paper, but once you try to actually implement it you
will run into a lot of new obstacles.

One challenging aspect is that the current swap back end has a very
low per-swap-entry memory overhead: about 1 byte (swap_map), 2 bytes
(swap cgroup), and 8 bytes (swap cache pointer). The inode struct is
more than 64 bytes per file. That is a big jump if you map each swap
entry to a file. If you map more than one swap entry to a file, then
you need to track the mapping from file offset to swap entry, and the
reverse lookup from swap entry to file and offset. Whichever way you
cut it, it will significantly increase the per-swap-entry memory
overhead.
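
To put rough numbers on that, here is a back-of-the-envelope sketch.
The per-entry byte counts are the ones quoted above; the 4 KiB page
size, the 16 GiB swap device, and the helper names are assumptions
made only for illustration. The 64-byte inode figure is used as a
lower bound, so the real cost of one inode per swap entry would be
higher than what this prints.

```python
# Back-of-the-envelope comparison of swap metadata overhead.
# Assumptions (not from the kernel source): 4 KiB pages, 16 GiB of swap.

PAGE_SIZE = 4096          # assumed page size in bytes
SWAP_SIZE = 16 * 2**30    # assumed swap device size in bytes

# Per-swap-entry costs quoted in the message, in bytes.
CURRENT_OVERHEAD = {
    "swap_map": 1,
    "swap_cgroup": 2,
    "swap_cache_pointer": 8,
}

# Lower bound taken from "more than 64 bytes per file".
INODE_LOWER_BOUND = 64


def per_entry_bytes() -> int:
    """Total metadata bytes the current back end spends per swap entry."""
    return sum(CURRENT_OVERHEAD.values())


def metadata_bytes(swap_bytes: int, per_entry: int) -> int:
    """Metadata needed to track every page-sized entry of a swap device."""
    entries = swap_bytes // PAGE_SIZE
    return entries * per_entry


if __name__ == "__main__":
    current = metadata_bytes(SWAP_SIZE, per_entry_bytes())
    one_inode_each = metadata_bytes(SWAP_SIZE, INODE_LOWER_BOUND)
    print(f"per entry today:      {per_entry_bytes()} bytes")
    print(f"current back end:     {current / 2**20:.0f} MiB")
    print(f"one inode per entry: >{one_inode_each / 2**20:.0f} MiB")
```

Under these assumptions the current scheme costs 11 bytes per entry,
and the inode alone, before any offset-to-entry or reverse mapping,
is already several times that.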

Chris


