linux-mm.kvack.org archive mirror
From: Chris Li <chrisl@kernel.org>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: Kairui Song <ryncsn@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Ryan Roberts <ryan.roberts@arm.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	 Barry Song <baohua@kernel.org>
Subject: Re: [PATCH 0/2] mm: swap: mTHP swap allocator base on swap cluster order
Date: Wed, 5 Jun 2024 00:40:31 -0700	[thread overview]
Message-ID: <CAF8kJuPVAvPVGze9dutzDGxJBYQ3T1vp8xOxYXX_iXqDwxHDTw@mail.gmail.com> (raw)
In-Reply-To: <87frttcgmc.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Tue, Jun 4, 2024 at 12:29 AM Huang, Ying <ying.huang@intel.com> wrote:
>
> Kairui Song <ryncsn@gmail.com> writes:
>
> > On Fri, May 31, 2024 at 10:37 AM Huang, Ying <ying.huang@intel.com> wrote:
> > Doesn't limiting order-0 allocation break the bottom line that order-0
> > allocation is a first-class citizen, and should not fail if there is
> > space?
>
> Sorry for the confusing words.  I meant limiting the maximum number of
> order-0 swap entry allocations in workloads, instead of limiting it in
> the kernel.

What interface would be used to limit the order-0 swap entries?

I was thinking the kernel would enforce the high-order swap space
reservation just like hugetlbfs does for huge pages.
We would need to introduce some interface to specify the reservation.

>
> > Just my two cents...
> >
> > I had a try locally based on Chris's work, allowing order 0 to use
> > nonfull_clusters as Ying suggested, starting with the lowest order
> > and increasing the order until nonfull_clusters[order] is not empty.
> > That way, higher orders are better protected, because a direct scan
> > won't happen unless we run out of both free_clusters and
> > nonfull_clusters.
> >
> > More concretely, I applied the following changes, which didn't change
> > the code much:
> > - In scan_swap_map_try_ssd_cluster, check nonfull_clusters first, then
> > free_clusters, then discard_clusters.
> > - If the order is 0, also check nonfull_clusters[i] for every order
> > (i = 0 .. SWAP_NR_ORDERS - 1) before scan_swap_map_try_ssd_cluster
> > returns false.
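The allocation order described above can be sketched as a toy model (the
names and data structures here are illustrative only, not the actual
kernel code; the real scan_swap_map_try_ssd_cluster works on per-CPU
cluster state under locks):

```python
# Toy model of the proposed lookup order: nonfull clusters of the
# requested order first, then free clusters, then discard clusters;
# for order 0 only, steal from any order's nonfull list before failing.
SWAP_NR_ORDERS = 8

def try_ssd_cluster(si, order):
    """Return a cluster to allocate from, or None (the caller would
    then fall back to scanning swap_map[] directly)."""
    if si["nonfull_clusters"][order]:
        return si["nonfull_clusters"][order].pop(0)
    if si["free_clusters"]:
        return si["free_clusters"].pop(0)
    if si["discard_clusters"]:
        return si["discard_clusters"].pop(0)
    if order == 0:
        # Order 0 is the first-class citizen: take a cluster from any
        # higher-order nonfull list rather than failing outright.
        for i in range(SWAP_NR_ORDERS):
            if si["nonfull_clusters"][i]:
                return si["nonfull_clusters"][i].pop(0)
    return None

si = {
    "nonfull_clusters": [[] for _ in range(SWAP_NR_ORDERS)],
    "free_clusters": [],
    "discard_clusters": [],
}
si["nonfull_clusters"][4] = ["cluster-A"]  # only an order-4 nonfull cluster left

print(try_ssd_cluster(si, 2))  # None: order 2 must not pollute other orders
print(try_ssd_cluster(si, 0))  # cluster-A: order 0 steals it as a last resort
```

This captures why higher orders end up better protected: only order 0 is
ever allowed to dip into another order's nonfull list.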
> >
> > A quick test, still using the memtier test, but with the swap
> > device size decreased from 10G to 8G for higher pressure.
> >
> > Before:
> > hugepages-32kB/stats/swpout:34013
> > hugepages-32kB/stats/swpout_fallback:266
> > hugepages-512kB/stats/swpout:0
> > hugepages-512kB/stats/swpout_fallback:77
> > hugepages-2048kB/stats/swpout:0
> > hugepages-2048kB/stats/swpout_fallback:1
> > hugepages-1024kB/stats/swpout:0
> > hugepages-1024kB/stats/swpout_fallback:0
> > hugepages-64kB/stats/swpout:35088
> > hugepages-64kB/stats/swpout_fallback:66
> > hugepages-16kB/stats/swpout:31848
> > hugepages-16kB/stats/swpout_fallback:402
> > hugepages-256kB/stats/swpout:390
> > hugepages-256kB/stats/swpout_fallback:7244
> > hugepages-128kB/stats/swpout:28573
> > hugepages-128kB/stats/swpout_fallback:474
> >
> > After:
> > hugepages-32kB/stats/swpout:31448
> > hugepages-32kB/stats/swpout_fallback:3354
> > hugepages-512kB/stats/swpout:30
> > hugepages-512kB/stats/swpout_fallback:33
> > hugepages-2048kB/stats/swpout:2
> > hugepages-2048kB/stats/swpout_fallback:0
> > hugepages-1024kB/stats/swpout:0
> > hugepages-1024kB/stats/swpout_fallback:0
> > hugepages-64kB/stats/swpout:31255
> > hugepages-64kB/stats/swpout_fallback:3112
> > hugepages-16kB/stats/swpout:29931
> > hugepages-16kB/stats/swpout_fallback:3397
> > hugepages-256kB/stats/swpout:5223
> > hugepages-256kB/stats/swpout_fallback:2351
> > hugepages-128kB/stats/swpout:25600
> > hugepages-128kB/stats/swpout_fallback:2194
> >
> > The high-order (256kB) swapout rate is significantly higher, and 512kB
> > is now possible, which indicates that high orders are better protected;
> > lower orders are sacrificed, but it seems worth it.
>
> Yes.  I think that this reflects another aspect of the problem.  In some
> situations, it's better to steal one high-order cluster and use it for
> order-0 allocation instead of scattering order-0 allocation in random
> high-order clusters.
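For a sense of scale, the success rates implied by the numbers quoted
above can be computed directly (a quick arithmetic check on the posted
counters, not data from the original thread):

```python
# Success rate = swpout / (swpout + swpout_fallback), from the
# memtier counters quoted earlier in this message.
before = {"32kB": (34013, 266), "256kB": (390, 7244), "512kB": (0, 77)}
after  = {"32kB": (31448, 3354), "256kB": (5223, 2351), "512kB": (30, 33)}

def rate(swpout, fallback):
    total = swpout + fallback
    return 100.0 * swpout / total if total else 0.0

for size in before:
    print(f"{size}: {rate(*before[size]):.1f}% -> {rate(*after[size]):.1f}%")
```

The 256kB success rate jumps from roughly 5% to roughly 69%, and 512kB
goes from 0% to nearly half, at the cost of a few percentage points on
the small orders, which is the trade-off described above.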

Agreed, the scan loop on swap_map[] causes the worst pollution.

Chris



Thread overview: 43+ messages
2024-05-24 17:17 Chris Li
2024-05-24 17:17 ` [PATCH 1/2] mm: swap: swap cluster switch to double link list Chris Li
2024-05-28 16:23   ` Kairui Song
2024-05-28 22:27     ` Chris Li
2024-05-29  0:50       ` Chris Li
2024-05-29  8:46   ` Huang, Ying
2024-05-30 21:49     ` Chris Li
2024-05-31  2:03       ` Huang, Ying
2024-05-24 17:17 ` [PATCH 2/2] mm: swap: mTHP allocate swap entries from nonfull list Chris Li
2024-06-07 10:35   ` Ryan Roberts
2024-06-07 10:57     ` Ryan Roberts
2024-06-07 20:53       ` Chris Li
2024-06-07 20:52     ` Chris Li
2024-06-10 11:18       ` Ryan Roberts
2024-06-11  6:09         ` Chris Li
2024-05-28  3:07 ` [PATCH 0/2] mm: swap: mTHP swap allocator base on swap cluster order Barry Song
2024-05-28 21:04 ` Chris Li
2024-05-29  8:55   ` Huang, Ying
2024-05-30  1:13     ` Chris Li
2024-05-30  2:52       ` Huang, Ying
2024-05-30  8:08         ` Kairui Song
2024-05-30 18:31           ` Chris Li
2024-05-30 21:44         ` Chris Li
2024-05-31  2:35           ` Huang, Ying
2024-05-31 12:40             ` Kairui Song
2024-06-04  7:27               ` Huang, Ying
2024-06-05  7:40                 ` Chris Li [this message]
2024-06-05  7:30               ` Chris Li
2024-06-05  7:08             ` Chris Li
2024-06-06  1:55               ` Huang, Ying
2024-06-07 18:40                 ` Chris Li
2024-06-11  2:36                   ` Huang, Ying
2024-06-11  7:11                     ` Chris Li
2024-06-13  8:38                       ` Huang, Ying
2024-06-18  4:35                         ` Chris Li
2024-06-18  6:54                           ` Huang, Ying
2024-06-18  9:31                             ` Chris Li
2024-06-19  9:21                               ` Huang, Ying
2024-05-30  7:49   ` Barry Song
2024-06-07 10:49     ` Ryan Roberts
2024-06-07 18:57       ` Chris Li
2024-06-07  9:43 ` Ryan Roberts
2024-06-07 18:48   ` Chris Li
