From: Kairui Song <ryncsn@gmail.com>
To: Chris Li <chrisl@kernel.org>, "Huang, Ying" <ying.huang@intel.com>
Cc: Hugh Dickins <hughd@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Ryan Roberts <ryan.roberts@arm.com>,
Kalesh Singh <kaleshsingh@google.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Barry Song <baohua@kernel.org>
Subject: Re: [PATCH v5 0/9] mm: swap: mTHP swap allocator base on swap cluster order
Date: Mon, 19 Aug 2024 00:59:41 +0800 [thread overview]
Message-ID: <CAMgjq7DJwF+kwxJkDKnH-cnp-36xdEObrNpKGrH_GvNKQtqjSw@mail.gmail.com> (raw)
In-Reply-To: <CACePvbUenbKM+i5x6xR=2A=8tz4Eu2azDFAV_ksvn2TtrFsVOQ@mail.gmail.com>
On Fri, Aug 16, 2024 at 3:53 PM Chris Li <chrisl@kernel.org> wrote:
>
> On Thu, Aug 8, 2024 at 1:38 AM Huang, Ying <ying.huang@intel.com> wrote:
> >
> > Chris Li <chrisl@kernel.org> writes:
> >
> > > On Wed, Aug 7, 2024 at 12:59 AM Huang, Ying <ying.huang@intel.com> wrote:
> > >>
> > >> Hi, Chris,
> > >>
> > >> Chris Li <chrisl@kernel.org> writes:
> > >>
> > >> > This is the short-term solution "swap cluster order" listed
> > >> > in my "Swap Abstraction" discussion, slide 8, at the recent
> > >> > LSF/MM conference.
> > >> >
> > >> > When commit 845982eb264bc "mm: swap: allow storage of all mTHP
> > >> > orders" was introduced, it only allocated mTHP swap entries
> > >> > from the new empty cluster list. This has a fragmentation issue
> > >> > reported by Barry:
> > >> >
> > >> > https://lore.kernel.org/all/CAGsJ_4zAcJkuW016Cfi6wicRr8N9X+GJJhgMQdSMp+Ah+NSgNQ@mail.gmail.com/
> > >> >
> > >> > The reason is that all the empty clusters have been exhausted
> > >> > while there are plenty of free swap entries in clusters that
> > >> > are not 100% free.
> > >> >
> > >> > This series remembers the swap allocation order used in each
> > >> > cluster and keeps a per-order nonfull cluster list for later
> > >> > allocations.
> > >> >
> > >> > This series gives the swap SSD allocation a new code path,
> > >> > separate from the HDD allocation. The new allocator uses
> > >> > cluster lists only and no longer scans swap_map[] globally
> > >> > without a lock.
> > >>
> > >> This sounds good. Can we use the SSD allocation method for HDD too?
> > >> Then we may not need a swap entry allocator optimized for HDD.
> > >
> > > Yes, that is the plan as well. That way we can completely get rid of
> > > the old scan_swap_map_slots() code.
> >
> > Good!
> >
> > > However, considering the size of the series, let's focus on the
> > > cluster allocation path first, get it tested and reviewed.
> >
> > OK.
> >
> > > For HDD optimization, mostly just the new block allocation portion
> > > needs a separate code path from the new cluster allocator, so it
> > > does not do the per-CPU allocation. Allocating from the nonfull
> > > list doesn't need to change much.
> >
> > I suggest not considering HDD optimization at all. Just use the SSD
> > algorithm to simplify.
>
> Adding a global next allocating CI rather than the per CPU next CI
> pointer is pretty trivial as well. It is just a different way to fetch
> the next cluster pointer.
Yes, if we enable the new cluster-based allocator for HDD, we can
enable THP and mTHP for HDD too, and use a global cluster_next
instead of a per-CPU one for it.
It's easy to do with minimal changes, and it should actually boost
performance for HDD swap. I'm currently testing this locally.
> > >>
> > >> Hi, Hugh,
> > >>
> > >> What do you think about this?
> > >>
> > >> > This streamlines the swap allocation for SSD. The code matches
> > >> > the execution flow much better.
> > >> >
> > >> > User impact: for users that allocate and free mixed-order mTHP
> > >> > swap entries, it greatly improves the success rate of mTHP swap
> > >> > allocation after the initial phase.
> > >> >
> > >> > It also performs faster when the swapfile is close to full,
> > >> > because the allocator can take a nonfull cluster from a list
> > >> > rather than scanning a lot of swap_map entries.
> > >>
> > >> Do you have some test results to prove this? Or which test below can
> > >> prove this?
> > >
> > > The two zram tests already demonstrate this. The system time
> > > improvement is about 2% on my low-CPU-count machine.
> > > Kairui has a machine with a higher core count, and the difference
> > > is larger there. The theory is that more CPUs mean more contention.
> >
> > I will interpret this as the performance being better in theory, but
> > with almost no measurable results so far.
>
> I am trying to understand why you don't count the performance
> improvement in the zram setup in my cover letter as a measurable
> result.
Hi Ying, you can check the test with the 32-core AMD machine in the
cover letter; as Chris pointed out, the performance gain grows with
the core count. The gain is still not large (yet: based on this
design, things can go much faster once the HDD code is dropped,
which enables many other optimizations; this series mainly focuses
on the fragmentation issue), but I think a stable ~4 - 8%
improvement on a Linux kernel build test can be considered
measurable?
Thread overview: 32+ messages
2024-07-31 6:49 Chris Li
2024-07-31 6:49 ` [PATCH v5 1/9] mm: swap: swap cluster switch to double link list Chris Li
2024-07-31 6:49 ` [PATCH v5 2/9] mm: swap: mTHP allocate swap entries from nonfull list Chris Li
[not found] ` <87bk23250r.fsf@yhuang6-desk2.ccr.corp.intel.com>
2024-08-16 8:01 ` Chris Li
2024-08-19 8:08 ` Huang, Ying
2024-08-26 21:26 ` Chris Li
2024-09-09 7:19 ` Huang, Ying
2024-07-31 6:49 ` [PATCH v5 3/9] mm: swap: separate SSD allocation from scan_swap_map_slots() Chris Li
2024-07-31 6:49 ` [PATCH v5 4/9] mm: swap: clean up initialization helper chrisl
2024-07-31 6:49 ` [PATCH v5 5/9] mm: swap: skip slot cache on freeing for mTHP chrisl
2024-08-03 9:11 ` Barry Song
2024-08-03 10:57 ` Barry Song
2024-07-31 6:49 ` [PATCH v5 6/9] mm: swap: allow cache reclaim to skip slot cache chrisl
2024-08-03 10:38 ` Barry Song
2024-08-03 12:18 ` Kairui Song
2024-08-04 18:06 ` Chris Li
2024-08-05 1:53 ` Barry Song
2024-07-31 6:49 ` [PATCH v5 7/9] mm: swap: add a fragment cluster list chrisl
2024-07-31 6:49 ` [PATCH v5 8/9] mm: swap: relaim the cached parts that got scanned chrisl
2024-07-31 6:49 ` [PATCH v5 9/9] mm: swap: add a adaptive full cluster cache reclaim chrisl
2024-08-01 9:14 ` [PATCH v5 0/9] mm: swap: mTHP swap allocator base on swap cluster order David Hildenbrand
2024-08-01 9:59 ` Kairui Song
2024-08-01 10:06 ` Kairui Song
[not found] ` <87le17z9zr.fsf@yhuang6-desk2.ccr.corp.intel.com>
2024-08-16 7:36 ` Chris Li
2024-08-17 17:47 ` Kairui Song
[not found] ` <87h6bw3gxl.fsf@yhuang6-desk2.ccr.corp.intel.com>
[not found] ` <CACePvbXH8b9SOePQ-Ld_UBbcAdJ3gdYtEkReMto5Hbq9WAL7JQ@mail.gmail.com>
[not found] ` <87sevfza3w.fsf@yhuang6-desk2.ccr.corp.intel.com>
2024-08-16 7:47 ` Chris Li
2024-08-18 16:59 ` Kairui Song [this message]
2024-08-19 8:27 ` Huang, Ying
2024-08-19 8:47 ` Kairui Song
2024-08-19 21:27 ` Chris Li
2024-08-19 8:39 ` Huang, Ying
2024-09-02 1:20 ` Andrew Morton