From: Kairui Song <ryncsn@gmail.com>
To: Chris Li <chrisl@kernel.org>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Matthew Wilcox <willy@infradead.org>,
Hugh Dickins <hughd@google.com>, Barry Song <baohua@kernel.org>,
Baoquan He <bhe@redhat.com>, Nhat Pham <nphamcs@gmail.com>,
Kemeng Shi <shikemeng@huaweicloud.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
Ying Huang <ying.huang@linux.alibaba.com>,
Johannes Weiner <hannes@cmpxchg.org>,
David Hildenbrand <david@redhat.com>,
Yosry Ahmed <yosryahmed@google.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Zi Yan <ziy@nvidia.com>,
linux-kernel@vger.kernel.org,
kernel test robot <oliver.sang@intel.com>
Subject: Re: [PATCH 7/9] mm, swap: remove contention workaround for swap cache
Date: Sun, 31 Aug 2025 23:54:32 +0800
Message-ID: <CAMgjq7AdauQ8=X0zeih2r21QoV=-WWj1hyBxLWRzq74n-C=-Ng@mail.gmail.com>
In-Reply-To: <CAMgjq7Aznd7=m6JTNGM4EyFj+6pqHTRBCo2hsQL-cKi0LZggOg@mail.gmail.com>
On Sat, Aug 30, 2025 at 11:24 PM Kairui Song <ryncsn@gmail.com> wrote:
>
> On Sat, Aug 30, 2025 at 1:03 PM Chris Li <chrisl@kernel.org> wrote:
> >
> > Hi Kairui,
> >
> > It feels so good to remove that 64M swap cache space. Thank you for
> > making it happen.
> >
> > Some nitpick follows. I am fine as is as well.
> >
> > Acked-by: Chris Li <chrisl@kernel.org>
>
> Thanks.
>
> >
> > Chris
> >
> > On Fri, Aug 22, 2025 at 12:21 PM Kairui Song <ryncsn@gmail.com> wrote:
> > >
> > > From: Kairui Song <kasong@tencent.com>
> > >
> > > Swap cluster setup will try to shuffle the clusters on initialization.
> > > It was helpful to avoid contention for the swap cache space. The cluster
> > > size (2M) was much smaller than each swap cache space (64M), so shuffling
> > > the cluster means the allocator will try to allocate swap slots that are
> > > in different swap cache spaces for each CPU, reducing the chance of two
> > > CPUs using the same swap cache space, and hence reducing the contention.
> > >
> > > Now, swap cache is managed by swap clusters, this shuffle is pointless.
> > > Just remove it, and clean up related macros.
> > >
> > > This should also improve the HDD swap performance as shuffling IO is a
> > > bad idea for HDD, and now the shuffling is gone.
> >
> > Did you have any numbers to prove that :-). Last time the swap
> > allocator stress testing has already destroyed two of my SAS drives
> > dedicated for testing. So I am not very keen on running the HDD swap
> > stress test. The HDD swap stress test are super slow to run, it takes
> > ages.
>
> I did some test months before, removing the cluster shuffle did help.
> I didn't test it again this time, only did some stress test. Doing
> performance test on HDD is really not a good experience as my HDD
> drives are too old so a long running test kills them easily.
>
> And I couldn't find any other factor that is causing a serial HDD IO
> regression, maybe the bot can help verify. If this doesn't help, we'll
> think of something else. But I don't think HDD based SWAP will ever
> have a practical good performance as they are terrible at rand read...
>
> Anyway, let me try again with HDD today, maybe I'll get some useful data.

So I ran an HDD test manually for several rounds, basically following
the test in the URLs below. The test uses nr_task = 8, and the HDD swap
partition size is 8G.

The preparation follows:
https://github.com/intel/lkp-tests/blob/master/setup/swapin_setup
(Make usemem hold 8G of memory and push it out to swap)

The test itself follows:
https://github.com/intel/lkp-tests/blob/master/programs/swapin/run
(Send SIGUSR1 to make usemem read its memory back, triggering swapin)

Before this patch:
Test run 1:
1073741824 bytes / 878662493 usecs = 1193 KB/s
33019 usecs to free memory
1073741824 bytes / 891315681 usecs = 1176 KB/s
35144 usecs to free memory
1073741824 bytes / 898801090 usecs = 1166 KB/s
36305 usecs to free memory
1073741824 bytes / 925899753 usecs = 1132 KB/s
20498 usecs to free memory
1073741824 bytes / 927522592 usecs = 1130 KB/s
34397 usecs to free memory
1073741824 bytes / 928164994 usecs = 1129 KB/s
35908 usecs to free memory
1073741824 bytes / 929890294 usecs = 1127 KB/s
35014 usecs to free memory
1073741824 bytes / 929997808 usecs = 1127 KB/s
30491 usecs to free memory
test done
Test run 2:
1073741824 bytes / 771932432 usecs = 1358 KB/s
31194 usecs to free memory
1073741824 bytes / 788739551 usecs = 1329 KB/s
25714 usecs to free memory
1073741824 bytes / 795853979 usecs = 1317 KB/s
33809 usecs to free memory
1073741824 bytes / 798019211 usecs = 1313 KB/s
32019 usecs to free memory
1073741824 bytes / 798771141 usecs = 1312 KB/s
31689 usecs to free memory
1073741824 bytes / 800384757 usecs = 1310 KB/s
32622 usecs to free memory
1073741824 bytes / 800822764 usecs = 1309 KB/s
1073741824 bytes / 800882227 usecs = 1309 KB/s
32789 usecs to free memory
30577 usecs to free memory
test done
Test run 3:
1073741824 bytes / 775202370 usecs = 1352 KB/s
31832 usecs to free memory
1073741824 bytes / 777618372 usecs = 1348 KB/s
30172 usecs to free memory
1073741824 bytes / 778180006 usecs = 1347 KB/s
32482 usecs to free memory
1073741824 bytes / 778521023 usecs = 1346 KB/s
30188 usecs to free memory
1073741824 bytes / 779207791 usecs = 1345 KB/s
29364 usecs to free memory
1073741824 bytes / 780753200 usecs = 1343 KB/s
29860 usecs to free memory
1073741824 bytes / 781078362 usecs = 1342 KB/s
30449 usecs to free memory
1073741824 bytes / 781224993 usecs = 1342 KB/s
19557 usecs to free memory
test done
After this patch:
Test run 1:
1073741824 bytes / 569803736 usecs = 1840 KB/s
29032 usecs to free memory
1073741824 bytes / 573718349 usecs = 1827 KB/s
30399 usecs to free memory
1073741824 bytes / 592070142 usecs = 1771 KB/s
31896 usecs to free memory
1073741824 bytes / 593484694 usecs = 1766 KB/s
30650 usecs to free memory
1073741824 bytes / 596693866 usecs = 1757 KB/s
31582 usecs to free memory
1073741824 bytes / 597359263 usecs = 1755 KB/s
26436 usecs to free memory
1073741824 bytes / 598339187 usecs = 1752 KB/s
30697 usecs to free memory
1073741824 bytes / 598674138 usecs = 1751 KB/s
29791 usecs to free memory
test done
Test run 2:
1073741824 bytes / 578821803 usecs = 1811 KB/s
28433 usecs to free memory
1073741824 bytes / 584262760 usecs = 1794 KB/s
28565 usecs to free memory
1073741824 bytes / 586118970 usecs = 1789 KB/s
27365 usecs to free memory
1073741824 bytes / 589159154 usecs = 1779 KB/s
42645 usecs to free memory
1073741824 bytes / 593487980 usecs = 1766 KB/s
28684 usecs to free memory
1073741824 bytes / 606025290 usecs = 1730 KB/s
28974 usecs to free memory
1073741824 bytes / 607547362 usecs = 1725 KB/s
33221 usecs to free memory
1073741824 bytes / 607882511 usecs = 1724 KB/s
31393 usecs to free memory
test done
Test run 3:
1073741824 bytes / 487637856 usecs = 2150 KB/s
28022 usecs to free memory
1073741824 bytes / 491211037 usecs = 2134 KB/s
28229 usecs to free memory
1073741824 bytes / 527698561 usecs = 1987 KB/s
30265 usecs to free memory
1073741824 bytes / 531719920 usecs = 1972 KB/s
30373 usecs to free memory
1073741824 bytes / 532555758 usecs = 1968 KB/s
30019 usecs to free memory
1073741824 bytes / 532942789 usecs = 1967 KB/s
29354 usecs to free memory
1073741824 bytes / 540793872 usecs = 1938 KB/s
32703 usecs to free memory
1073741824 bytes / 541343777 usecs = 1936 KB/s
33428 usecs to free memory
test done

This seems to match the ~33% swapin.throughput regression reported by
the bot: swapin is about 40% faster with this patch applied. I'll add
this test result to V2.
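
For reference, the averages behind that estimate can be reproduced with a
quick script. This is only a sketch: the KB/s values are copied from the
usemem output above, and the helper names (throughput_kb_s, overall_mean,
speedup_percent, regression_percent) are made up for illustration and are
not part of usemem or lkp-tests. It also assumes usemem reports KB/s as
(bytes / 1024) per second, rounded down, which is consistent with the
figures above.

```python
from statistics import mean

# Per-task swapin throughput in KB/s, copied from the usemem output above
# (3 runs x 8 tasks, each task reading back 1073741824 bytes).
before = {
    1: [1193, 1176, 1166, 1132, 1130, 1129, 1127, 1127],
    2: [1358, 1329, 1317, 1313, 1312, 1310, 1309, 1309],
    3: [1352, 1348, 1347, 1346, 1345, 1343, 1342, 1342],
}
after = {
    1: [1840, 1827, 1771, 1766, 1757, 1755, 1752, 1751],
    2: [1811, 1794, 1789, 1779, 1766, 1730, 1725, 1724],
    3: [2150, 2134, 1987, 1972, 1968, 1967, 1938, 1936],
}


def throughput_kb_s(nbytes, usecs):
    """Assumed usemem formula: KB (1024 bytes) per second, rounded down."""
    return int(nbytes / 1024 / (usecs / 1_000_000))


def overall_mean(runs):
    """Mean per-task throughput across every task of every run."""
    return mean(v for run in runs.values() for v in run)


def speedup_percent(old, new):
    """How much faster `new` is, measured against `old`."""
    return (new / old - 1) * 100


def regression_percent(old, new):
    """How much slower `old` is, measured against `new`."""
    return (1 - old / new) * 100


if __name__ == "__main__":
    # Sanity-check the assumed formula against the first "before" line.
    print(throughput_kb_s(1073741824, 878662493), "KB/s")

    old, new = overall_mean(before), overall_mean(after)
    print(f"before: {old:.1f} KB/s, after: {new:.1f} KB/s")
    print(f"speedup with the patch: {speedup_percent(old, new):.1f}%")
    print(f"slowdown without it:    {regression_percent(old, new):.1f}%")
```

Averaged over all 24 samples per kernel, this gives roughly 45% higher
throughput with the patch, or equivalently about 31% lower throughput
without it, which is in the same range as the ~33% figure from the bot.
Note that the two percentages describe the same gap from opposite
baselines, so they are not expected to be equal.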