From: Chris Li <chrisl@kernel.org>
To: Chengming Zhou <zhouchengming@bytedance.com>
Cc: "Yosry Ahmed" <yosryahmed@google.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
"Wei Xu" <weixugc@google.com>, "Yu Zhao" <yuzhao@google.com>,
"Greg Thelen" <gthelen@google.com>,
"Chun-Tse Shao" <ctshao@google.com>,
"Suren Baghdasaryan" <surenb@google.com>,
"Brain Geffon" <bgeffon@google.com>,
"Minchan Kim" <minchan@kernel.org>,
"Michal Hocko" <mhocko@suse.com>,
"Mel Gorman" <mgorman@techsingularity.net>,
"Huang Ying" <ying.huang@intel.com>,
"Nhat Pham" <nphamcs@gmail.com>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Kairui Song" <kasong@tencent.com>,
"Zhongkun He" <hezhongkun.hzk@bytedance.com>,
"Kemeng Shi" <shikemeng@huaweicloud.com>,
"Barry Song" <v-songbaohua@oppo.com>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
"Joel Fernandes" <joel@joelfernandes.org>
Subject: Re: [PATCH 0/2] RFC: zswap tree use xarray instead of RB tree
Date: Fri, 19 Jan 2024 03:59:07 -0800
Message-ID: <CAF8kJuM4ybP+4_3zssCfV3-Vf9_gE2P7jiOcD9OGgT4JjFC0bg@mail.gmail.com>
In-Reply-To: <ad007bf8-ab06-4414-8675-e689c5c84fc9@bytedance.com>
On Fri, Jan 19, 2024 at 3:12 AM Chengming Zhou
<zhouchengming@bytedance.com> wrote:
>
> On 2024/1/19 18:26, Chris Li wrote:
> > On Thu, Jan 18, 2024 at 10:19 PM Chengming Zhou
> > <zhouchengming@bytedance.com> wrote:
> >>
> >> On 2024/1/19 12:59, Chris Li wrote:
> >>> On Wed, Jan 17, 2024 at 11:35 PM Chengming Zhou
> >>> <zhouchengming@bytedance.com> wrote:
> >>>
> >>>>>>> mm-stable zswap-split-tree zswap-xarray
> >>>>>>> real 1m10.442s 1m4.157s 1m9.962s
> >>>>>>> user 17m48.232s 17m41.477s 17m45.887s
> >>>>>>> sys 8m13.517s 5m2.226s 7m59.305s
> >>>>>>>
> >>>>>>> Looks like the contention of concurrency is still there, I haven't
> >>>>>>> look into the code yet, will review it later.
> >>>>>
> >>>>> Thanks for the quick test. Interesting to see the sys usage drop for
> >>>>> the xarray case even with the spin lock.
> >>>>> Not sure if the 13 second saving is statistically significant or not.
> >>>>>
> >>>>> We might need to have both xarray and split trees for the zswap. It is
> >>>>> likely removing the spin lock wouldn't be able to make up the 35%
> >>>>> difference. That is just my guess. There is only one way to find out.
> >>>>
> >>>> Yes, I totally agree with this! IMHO, concurrent zswap_store paths still
> >>>> have to contend for the xarray spinlock even though we would have converted
> >>>> the rb-tree to the xarray structure at last. So I think we should have both.
> >>>>
> >>>>>
> >>>>> BTW, do you have a script I can run to replicate your results?
> >>>
> >>> Hi Chengming,
> >>>
> >>> Thanks for your script.
> >>>
> >>>>
> >>>> ```
> >>>> #!/bin/bash
> >>>>
> >>>> testname="build-kernel-tmpfs"
> >>>> cgroup="/sys/fs/cgroup/$testname"
> >>>>
> >>>> tmpdir="/tmp/vm-scalability-tmp"
> >>>> workdir="$tmpdir/$testname"
> >>>>
> >>>> memory_max="$((2 * 1024 * 1024 * 1024))"
> >>>>
> >>>> linux_src="/root/zcm/linux-6.6.tar.xz"
> >>>> NR_TASK=32
> >>>>
> >>>> swapon ~/zcm/swapfile
> >>>
> >>> How big is your swapfile here?
> >>
> >> The swapfile is big enough here, I use a 50GB swapfile.
> >
> > Thanks,
> >
> >>
> >>>
> >>> It seems you have only one swapfile there. That can explain the contention.
> >>> Have you tried multiple swapfiles for the same test?
> >>> That should reduce the contention without using your patch.
> >> Do you mean to have many 64MB swapfiles to swapon at the same time?
> >
> > 64MB is too small. There are limits to MAX_SWAPFILES. It is less than
> > (32 - n) swap files.
> > If you want to use 50G swap space, you can have MAX_SWAPFILES, each
> > swapfile 50GB / MAX_SWAPFILES.
>
> Right.
>
> >
> >> Maybe it's feasible to test,
> >
> > Of course it is testable, I am curious to see the test results.
> >
> >> I'm not sure how swapout will choose.
> >
> > It will rotate through the same priority swap files first.
> > swapfile.c: get_swap_pages().
> >
> >> But in our usecase, we normally have only one swapfile.
> >
> > Is there a good reason why you can't use more than one swapfile?
>
> I think no, but it seems an unneeded change/burden to our admin.
> So I just tested and optimized for the normal case.

I understand. Just saying it is not really a kernel limitation per se.
I blame the user space :-)
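
For reference, a rough sketch of what the multi-swapfile setup could look
like on the admin side. It only prints the commands instead of running them.
The function name plan_swapfiles, the /var/swap directory, the count of 16
files, and the priority value 10 are all assumptions picked for illustration;
the only hard requirement is that the file count stays below MAX_SWAPFILES.
All files get the same priority so that swapout rotates among them.

```shell
#!/bin/bash
# Dry-run sketch: print (do not execute) the commands that would split
# one large swap area into several equal-sized, equal-priority swapfiles.
# All names, paths, and numbers below are illustrative assumptions.

plan_swapfiles() {
    local total_mb=$1      # total swap space in MiB
    local nr_files=$2      # number of swapfiles; keep it below MAX_SWAPFILES
    local dir=$3           # directory that would hold the swapfiles
    local prio=${4:-10}    # same priority everywhere, so swapout rotates
    local per_file_mb=$(( total_mb / nr_files ))
    local i f

    for (( i = 0; i < nr_files; i++ )); do
        f="$dir/swapfile.$i"
        echo "dd if=/dev/zero of=$f bs=1M count=$per_file_mb"
        echo "chmod 600 $f"
        echo "mkswap $f"
        echo "swapon -p $prio $f"
    done
}

# Example: 50GB split across 16 files (16 is an arbitrary choice).
plan_swapfiles $(( 50 * 1024 )) 16 /var/swap
```

Review the printed commands first, then pipe them to a root shell if they
look right for your system.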
>
> > One swapfile will not take the full advantage of the existing code.
> > Even if you split the zswap trees within a swapfile. With only one
> > swapfile, you will still be having lock contention on "(struct
> > swap_info_struct).lock".
> > It is one lock per swapfile.
> > Using more than one swap file should get you better results.
>
> IIUC, we already have the per-cpu swap entry cache to not contend for
> this lock? And I don't see much hot of this lock in the testing.

Yes. The swap entry cache helps. The cache batching also causes other
problems, e.g. long tail latency in swap fault handling.
Shameless plug: I posted a patch earlier to address the swap fault
long tail latencies.
https://lore.kernel.org/linux-mm/20231221-async-free-v1-1-94b277992cb0@kernel.org/T/

Chris