From: "Huang, Ying" <ying.huang@intel.com>
To: Chris Li <chrisl@kernel.org>
Cc: Yosry Ahmed <yosryahmed@google.com>,
lsf-pc@lists.linux-foundation.org,
Johannes Weiner <hannes@cmpxchg.org>,
Linux-MM <linux-mm@kvack.org>, Michal Hocko <mhocko@kernel.org>,
Shakeel Butt <shakeelb@google.com>,
David Rientjes <rientjes@google.com>,
Hugh Dickins <hughd@google.com>,
Seth Jennings <sjenning@redhat.com>,
Dan Streetman <ddstreet@ieee.org>,
Vitaly Wool <vitaly.wool@konsulko.com>,
Yang Shi <shy828301@gmail.com>, Peter Xu <peterx@redhat.com>,
Minchan Kim <minchan@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>,
Michal Hocko <mhocko@suse.com>, Wei Xu <weixugc@google.com>
Subject: Re: [LSF/MM/BPF TOPIC] Swap Abstraction / Native Zswap
Date: Tue, 04 Apr 2023 16:24:58 +0800
Message-ID: <87ttxwxdet.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <ZCRhhHb7MuhD7Oit@google.com> (Chris Li's message of "Wed, 29 Mar 2023 09:04:20 -0700")

Chris Li <chrisl@kernel.org> writes:
> On Tue, Mar 28, 2023 at 06:41:54PM -0700, Yosry Ahmed wrote:
>> My main concern here would be having two separate swap counting
>> implementations -- although it might not be the end of the world. It
>> would be useful to consider all the options. So far, I think we have
>
> Agree.
>
>> been discussing 3 alternatives:
>>
>> (a) The initial swap_desc proposal.
>> (b) Add an optional indirection layer that can move swap entries
>> between swap devices and add a virtual swap device for zswap in the
>> kernel.
>
> For completeness' sake, let me add some options that have both pros
> and cons.
>
> (d) There is Google's ghost swap file. I understand it means a bit
> of ABI change. It has the advantage that it allows more than one
> zswap swapfile. Google uses it that way. Another consideration is
> that the ghost swap file is compatible with existing swapon behavior.
> You can see how many swap entries are used from the swapon summary.
> Some applications might depend on that.
>
> We might be able to find some way to break the ABI less.
>
>> (c) Add an optional indirection layer that can move entries between
>> different swap backends. Swap backends would be zswap & swap devices
>> for now. Zswap needs to implement swap entry management, swap
>> counting, etc.
> (f) I have been thinking of variants of (b) without adding a virtual
> swap device for zswap, using the ghost swap file instead.
>
> Also the indirection is optional per swap entry at run time.
> Some swap devices can have some entries moved to another swap device.
> Only those swap entries pay the price of the indirection layer.
>
> (e) This is the long-term goal I have in mind. A VFS-like
> implementation for swap files. Let's call it VSW.
> This allows different swap devices to use different
> swap file system implementations.
I like this too!
> One of the difficult trade-offs we have right now:
> a smaller per-entry up-front allocation, like swap_map[], for all
> entries vs. allocating memory only for swap entries that have been
> swapped out, but with a larger per-entry allocation.
Yes.
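
To put rough numbers on this trade-off, here is a minimal arithmetic sketch. It is not taken from any patch in this thread. The only figure based on the current kernel is the one-byte swap_map[] slot; the 32-byte descriptor size and the function names are assumptions made up for illustration, and the sketch ignores whatever index structure would be needed to look up dynamically allocated entries.

```python
# Rough model of the two allocation strategies discussed above.
SWAP_MAP_BYTES = 1   # swap_map[] holds one unsigned char per swap slot
DESC_BYTES = 32      # ASSUMED size of a dynamically allocated per-entry descriptor


def static_overhead(total_slots):
    """Up-front cost: every slot of the device is paid for at swapon time."""
    return total_slots * SWAP_MAP_BYTES


def dynamic_overhead(used_slots, desc_bytes=DESC_BYTES):
    """On-demand cost: only swapped-out entries are paid for, but each costs more."""
    return used_slots * desc_bytes


def break_even_utilization(desc_bytes=DESC_BYTES):
    """Fraction of slots in use above which the up-front array becomes cheaper."""
    return SWAP_MAP_BYTES / desc_bytes


if __name__ == "__main__":
    slots = (64 << 30) // 4096                # 64 GiB swap device, 4 KiB pages
    print(static_overhead(slots))             # bytes reserved up front
    print(dynamic_overhead(slots // 100))     # bytes used at about 1% utilization
    print(break_even_utilization())
```

Under these assumed sizes, the on-demand scheme only wins while a small fraction of the device is in use, which is why the choice depends so much on the workload.
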
> I believe some of those trade-offs can be addressed by having a
> different swap file system. By that I mean a different "mkswap"
> kind of file system.
We may not need that, because the swap on-disk format needn't be
persistent across reboots.
> We can write out some of the swap
> entry metadata to the swap file system as well. It means
> we don't have to pay the larger per-swap-entry allocation overhead
> for very cold pages. It might take two reads to swap
> in some of the very cold swap entries. But that should be rare.
Sounds like a good idea. At least it can be investigated further.
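
As a toy illustration of the "two reads for very cold entries" idea, the sketch below models a swap file with a metadata area and a data area. Everything in it (the SwapFile class, its method names, the dict-based "device") is hypothetical and only meant to show where the extra read would come from. It is not a proposal for the actual on-disk layout.

```python
class SwapFile:
    """Toy swap file: counts device reads for hot and cold swap entries."""

    def __init__(self):
        self.data = {}          # slot -> page contents (always on the device)
        self.meta_in_ram = {}   # slot -> metadata still kept in memory
        self.meta_on_disk = {}  # slot -> metadata written out for cold entries
        self.reads = 0          # number of simulated device reads

    def swap_out(self, slot, page, swap_count=1):
        self.data[slot] = page
        self.meta_in_ram[slot] = {"count": swap_count}

    def evict_metadata(self, slot):
        """Entry went cold: write its metadata to the swap file, free the RAM copy."""
        self.meta_on_disk[slot] = self.meta_in_ram.pop(slot)

    def swap_in(self, slot):
        if slot not in self.meta_in_ram:
            self.reads += 1     # extra read: fetch the metadata first
            self.meta_in_ram[slot] = self.meta_on_disk.pop(slot)
        self.reads += 1         # read the page itself
        return self.data[slot]


if __name__ == "__main__":
    sf = SwapFile()
    sf.swap_out(1, b"hot")
    sf.swap_out(2, b"cold")
    sf.evict_metadata(2)
    sf.swap_in(1)
    sf.swap_in(2)
    print(sf.reads)
```

A hot entry costs one read, while an entry whose metadata was written out costs two, so the scheme only pays off if such entries are rarely swapped in.
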
> It can offer benefits for swapping out larger folio as well.
> Right now swapping out large folios still needs to go through
> the per 4k page swap index allocation and break down.
>
> Basically, modernize the swap file system.
>
> It should be possible to implement the redirection layer within VSW
> as well.
>
> I know that is a very ambitious plan :-)
Yes.
> We can do that incrementally. The swap file system doesn't need
> much backward compatibility across reboots, so it should be easier
> than a normal file system.
Agree.
Best Regards,
Huang, Ying