From: Barry Song <21cnbao@gmail.com>
To: Jared Hulbert <jaredeh@gmail.com>
Cc: Jan Kara <jack@suse.cz>, Chuanhua Han <hanchuanhua@oppo.com>,
Chris Li <chrisl@kernel.org>, linux-mm <linux-mm@kvack.org>,
lsf-pc@lists.linux-foundation.org, ryan.roberts@arm.com,
david@redhat.com
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Swap Abstraction "the pony"
Date: Fri, 8 Mar 2024 05:17:46 +0800
Message-ID: <CAGsJ_4xH72GAeqrXAmjNRy0GbLRU+4mcSAbp_5R527Hb1X0G0Q@mail.gmail.com>
In-Reply-To: <CA+ZsKJ5VDf3YgOyiANNJiPomtkFqCT2QpvDCNNwtW6kyoDxjbQ@mail.gmail.com>
On Fri, Mar 8, 2024 at 5:06 AM Jared Hulbert <jaredeh@gmail.com> wrote:
>
> On Thu, Mar 7, 2024 at 9:35 AM Jan Kara <jack@suse.cz> wrote:
> >
> > Well, but then if you fill in space of a particular order and need to swap
> > out a page of that order what do you do? Return ENOSPC prematurely?
> >
> > Frankly as I'm reading the discussions here, it seems to me you are trying
> > to reinvent a lot of things from the filesystem space :) Like block
> > allocation with reasonably efficient fragmentation prevention, transparent
> > data compression (zswap), hierarchical storage management (i.e., moving
> > data between different backing stores), efficient way to get from
> > VMA+offset to the place on disk where the content is stored. Sure you still
> > don't need a lot of things modern filesystems do like permissions,
> > directory structure (or even more complex namespacing stuff), all the stuff
> > achieving fs consistency after a crash, etc. But still what you need is a
> > notable portion of what filesystems do.
> >
> > So maybe it would be time to implement swap as a proper filesystem? Or even
> > better we could think about factoring out these bits out of some existing
> > filesystem to share code?
>
> Yes. Thank you. I've been struggling to communicate this.
>
> I'm thinking you can just use existing filesystems as a first step
> with a modest glue layer. See the branch of this thread where I'm
> babbling on to Chris about this.
>
> "efficient way to get from VMA+offset to place on the disk where
> content is stored"
> You mean treat swapped pages like they were mmap'ed files and use the
> same code paths? How big of a project is that? That seems either
> deceptively easy or really hard... I've been away too long and was
> never really good enough to have a clear vision of the scale.
I don't understand why we need this level of complexity. All we need to know
are the offsets during pageout. After that, the large folio is destroyed, and
all of the offsets are stored in page table entries (PTEs) or in an xarray (xa).
Swap-in doesn't depend on a complex filesystem; it can decide for itself how to
swap in based on the values it reads from the PTEs.
Swap-in doesn't need to know whether the swapped-out folio was large or not.
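To make the point above concrete, here is a rough toy model in Python, not kernel code. Everything in it (SwapEntry, page_out, swap_in_batch, the dict standing in for a page table, the four-page window) is a hypothetical name or choice made up for illustration. It only tries to show that the per-page offsets recorded at pageout are enough for swap-in to choose between a batched read and a single-page read, without knowing anything about the original folio.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SwapEntry:
    """Toy stand-in for the swap entry kept in a non-present PTE (assumption)."""
    dev: int     # swap device index
    offset: int  # slot offset on that device


def page_out(page_table: Dict[int, SwapEntry], first_vpn: int,
             offsets: List[int], dev: int = 0) -> None:
    """Record one swap entry per page of a folio that is being paged out.

    After this returns, nothing remembers that the pages once formed a
    large folio; only the per-page entries remain.
    """
    for i, off in enumerate(offsets):
        page_table[first_vpn + i] = SwapEntry(dev, off)


def swap_in_batch(page_table: Dict[int, SwapEntry], fault_vpn: int,
                  max_pages: int = 4) -> List[int]:
    """Return the virtual page numbers to read in for a fault at fault_vpn.

    The decision uses only what is stored in the page table: if every
    entry in the aligned window is a swap entry on the same device with
    consecutive offsets, read the whole window; otherwise read one page.
    """
    start = fault_vpn - fault_vpn % max_pages
    window = [page_table.get(start + i) for i in range(max_pages)]
    if any(entry is None for entry in window):
        return [fault_vpn]
    first = window[0]
    contiguous = all(
        entry.dev == first.dev and entry.offset == first.offset + i
        for i, entry in enumerate(window)
    )
    return list(range(start, start + max_pages)) if contiguous else [fault_vpn]
```

With offsets 100..103 recorded for pages 8..11, a fault on page 10 would select all four pages; with scattered offsets the same call falls back to the faulting page alone.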
>
> On the file side we have the page cache, but on the swap side you have
> swap cache and zswap. If we reconciled file pages and swap pages you
> could have page cache and zpage_cache(?) bringing gains in both
> directions. If the argument is that the swap fault path is a lot
> faster, then shouldn't we be talking about fixing the file fault path
> anyway?
>
> I'd love to hear the real experts chime in.
Thanks
Barry