From: Kairui Song <ryncsn@gmail.com>
To: Chris Li <chrisl@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>,
akpm@linux-foundation.org, tj@kernel.org,
lizefan.x@bytedance.com, hannes@cmpxchg.org,
cerasuolodomenico@gmail.com, yosryahmed@google.com,
sjenning@redhat.com, ddstreet@ieee.org,
vitaly.wool@konsulko.com, mhocko@kernel.org,
roman.gushchin@linux.dev, shakeelb@google.com,
muchun.song@linux.dev, hughd@google.com, corbet@lwn.net,
konrad.wilk@oracle.com, senozhatsky@chromium.org,
rppt@kernel.org, linux-mm@kvack.org, kernel-team@meta.com,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
david@ixit.cz, Minchan Kim <minchan@google.com>,
Zhongkun He <hezhongkun.hzk@bytedance.com>
Subject: Re: [PATCH v6] zswap: memcontrol: implement zswap writeback disabling
Date: Wed, 20 Dec 2023 18:21:28 +0800
Message-ID: <CAMgjq7BXXSHKJXijpB_FfNA9N=dh5uWHBJmHrJKoLOShrqvDYA@mail.gmail.com>
In-Reply-To: <CAF8kJuN=E0RA_JyVnAVraYSyHx5sk=znM2A-JKnAfDc4M2BYGg@mail.gmail.com>
On Wed, Dec 13, 2023 at 07:39, Chris Li <chrisl@kernel.org> wrote:
>
> Hi Kairui,
>
> Thanks for sharing the information on how you use swap.
Hi Chris,
>
> On Mon, Dec 11, 2023 at 1:31 AM Kairui Song <ryncsn@gmail.com> wrote:
> > > 2) As indicated by this discussion, Tencent has a usage case for SSD
> > > and hard disk swap as overflow.
> > > https://lore.kernel.org/linux-mm/20231119194740.94101-9-ryncsn@gmail.com/
> > > +Kairui
> >
> > Yes, we are not using zswap. We are using ZRAM for swap since we have
> > many different varieties of workload instances, with a very flexible
> > storage setup. Some of them don't have the ability to set up a
> > swapfile. So we built a pack of kernel infrastructures based on ZRAM,
> > which so far worked pretty well.
>
> This is great. The usage case is actually much more than I expected.
> For example, I never thought of zram as a swap tier. Now you mention
> it. I am considering whether it makes sense to add zram to the
> memory.swap.tiers as well as zswap.
>
> >
> > The concern from some teams is that ZRAM (or zswap) can't always free
> > up memory so they may lead to higher risk of OOM compared to a
> > physical swap device, and they do have suitable devices for doing swap
> > on some of their machines. So a secondary swap support is very helpful
> > in case of memory usage peak.
> >
> > Besides this, another requirement is that different containers may
> > have different priority, some containers can tolerate high swap
> > overhead while some cannot, so swap tiering is useful for us in many
> > ways.
> >
> > And thanks to cloud infrastructure the disk setup could change from
> > time to time depending on workload requirements, so our requirement is
> > to support ZRAM (always) + SSD (optional) + HDD (also optional) as
> > swap backends, while not making things too complex to maintain.
>
> Just curious, do you use ZRAM + SSD + HDD all enabled? Do you ever
> consider moving data from ZRAM to SSD, or from SSD to HDD? If you do,
> I do see the possibility of having more general swap tiers support and
> sharing the shrinking code between tiers somehow. Granted there are
> many unanswered questions and a lot of infrastructure is lacking.
> Gathering requirements and weighing the priority of each requirement
> is the first step towards a possible solution.
Sorry for the late response. Yes, our plan is to use ZRAM + SSD + HDD
all enabled when possible, although currently only ZRAM + SSD is
expected.
I see this discussion is still going on, so I'll just add some info here...
We have some test environments with a kernel worker enabled that moves
data from ZRAM to SSD, and from SSD to HDD too, to free up space on
the higher-tier swap devices. The kworker is simple: it maintains a
swap entry LRU for every swap device. (Maybe worth noting here: there
is currently no LRU-based writeback for ZRAM, ZRAM writeback requires
a fixed block device at init, and a swap-device-level LRU is also
helpful for migrating entries from SSD to HDD.) It walks the page
table to swap in the coldest swap entry, then immediately swaps it out
to a lower tier, doing this page by page periodically. Overhead and
memory footprint are minimal with a limited moving rate, but the
efficiency for large-scale data moving is terrible, so it has only
very limited usage. I was trying to come up with a better design but
am currently not working on it.
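To make the worker's behaviour concrete, here is a minimal user-space
sketch in Python. It is a toy model, not the kernel code used in those
test environments: SwapTier, demote_pass, min_free and budget are
hypothetical names, the page-table walk is reduced to a dictionary
lookup, and the "only demote when the upper tier is short on free
slots" trigger is an assumption made for illustration. It only shows
the per-device LRU and the rate-limited, page-by-page move of the
coldest entry to the next lower tier.

```python
from collections import OrderedDict


class SwapTier:
    """Toy model of one swap device with its own LRU of swap entries."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.lru = OrderedDict()  # first item is the coldest entry

    def free_slots(self):
        return self.capacity - len(self.lru)

    def store(self, entry, page):
        if self.free_slots() <= 0:
            raise MemoryError(f"{self.name} is full")
        self.lru[entry] = page  # new entries land at the hot end

    def pop_coldest(self):
        return self.lru.popitem(last=False)


def demote_pass(tiers, min_free, budget):
    """Run one periodic pass; return the number of pages moved.

    tiers is ordered fastest first, e.g. [zram, ssd, hdd].
    budget caps the pages moved per pass (the limited moving rate).
    """
    moved = 0
    # Visit pairs bottom-up so a lower tier drains before it receives pages.
    for upper, lower in reversed(list(zip(tiers, tiers[1:]))):
        while (moved < budget
               and upper.lru
               and upper.free_slots() < min_free
               and lower.free_slots() > 0):
            entry, page = upper.pop_coldest()  # stands in for "swap in"
            lower.store(entry, page)           # immediate "swap out"
            moved += 1
    return moved
```

Calling demote_pass([zram, ssd, hdd], min_free=2, budget=8) from a
timer would approximate the periodic behaviour. Moving one page per
loop iteration keeps the footprint small, which also shows why bulk
migration through such a loop is slow.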