Re: [PATCH] mm: swap: async free swap slot cache entries

linux-mm.kvack.org archive mirror

From: Chris Li <chrisl@kernel.org>
To: Nhat Pham <nphamcs@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Wei Xu <weixugc@google.com>, Yu Zhao <yuzhao@google.com>,
	Greg Thelen <gthelen@google.com>,
	Chun-Tse Shao <ctshao@google.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Yosry Ahmed <yosryahmed@google.com>,
	Brian Geffon <bgeffon@google.com>,
	Minchan Kim <minchan@kernel.org>, Michal Hocko <mhocko@suse.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Huang Ying <ying.huang@intel.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Kairui Song <kasong@tencent.com>,
	Zhongkun He <hezhongkun.hzk@bytedance.com>,
	Kemeng Shi <shikemeng@huaweicloud.com>,
	Barry Song <v-songbaohua@oppo.com>
Subject: Re: [PATCH] mm: swap: async free swap slot cache entries
Date: Fri, 22 Dec 2023 20:41:01 -0800
Message-ID: <ZYZk3dISUtDe-0wT@google.com>
In-Reply-To: <CAKEwX=MNWcADDDWMo_V8V=1snAPKWmcxbnKX8jzt4XdNoXiV3Q@mail.gmail.com>

On Fri, Dec 22, 2023 at 05:44:19PM -0800, Nhat Pham wrote:
> On Thu, Dec 21, 2023 at 10:25 PM Chris Li <chrisl@kernel.org> wrote:
> >
> > We discovered that 1% swap page fault is 100us+ while 50% of
> > the swap fault is under 20us.
> >
> > Further investigation show that a large portion of the time
> > spent in the free_swap_slots() function for the long tail case.
> >
> > The percpu cache of swap slots is freed in a batch of 64 entries
> > inside free_swap_slots(). These cache entries are accumulated
> > from previous page faults, which may not be related to the current
> > process.
> >
> > Doing the batch free in the page fault handler causes longer
> > tail latencies and penalizes the current process.
> >
> > Move free_swap_slots() outside of the swapin page fault handler into an
> > async work queue to avoid such long tail latencies.
> >
> > Testing:
> >
> > Chun-Tse did some benchmark in chromebook, showing that
> > zram_wait_metrics improve about 15% with 80% and 95% confidence.

This benchmark result was obtained with zram. There are 3 micro
benchmarks, all showing about 15% improvement, each with a slightly
different confidence level. That is where the 80%-90% comes from.

> >
> > I recently ran some experiments on about 1000 Google production
> > machines. It shows swapin latency drops in the long tail
> > 100us - 500us bucket dramatically.
> >
> > platform        (100-500us)             (0-100us)
> > A               1.12% -> 0.36%          98.47% -> 99.22%
> > B               0.65% -> 0.15%          98.96% -> 99.46%
> > C               0.61% -> 0.23%          98.96% -> 99.38%
> 
> Nice! Are these values for zram as well, or ordinary (SSD?) swap? I
> imagine it will matter less for swap, right?

Those production servers only use zswap. There is no zram there.
For ordinary SSD swap the latency reduction is also there in terms
of absolute us. However, the raw savings get overshadowed by the SSD
IO latency, typically in the 100us range. In terms of percentage,
you don't see as dramatic an effect as with memory compression
based swapping (zswap and zram).
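
To put rough numbers on the percentage point above, here is a minimal
sketch in C. The 10us saving is a made-up figure for illustration only.
The 20us base is loosely taken from the "50% under 20us" figure quoted
earlier, and 100us stands in for a typical SSD IO. saving_percent() is a
hypothetical helper, not something from the patch.

```c
#include <assert.h>

/*
 * Percentage of the total fault latency removed by a fixed absolute
 * saving. Both arguments are in microseconds.
 */
double saving_percent(double saved_us, double total_us)
{
	/* Guard against nonsensical inputs in this toy calculation. */
	assert(total_us > 0.0);
	assert(saved_us >= 0.0 && saved_us <= total_us);

	return 100.0 * saved_us / total_us;
}
```

With those assumed inputs, saving_percent(10, 20) gives 50% for the
memory compression case, while saving_percent(10, 20 + 100) gives
roughly 8.3% once the SSD IO is added: the same absolute saving, a much
smaller relative one.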

> > @@ -348,3 +362,10 @@ swp_entry_t folio_alloc_swap(struct folio *folio)
> >         }
> >         return entry;
> >  }
> > +
> > +static int __init async_queue_init(void)
> > +{
> > +       swap_free_queue = create_workqueue("async swap cache");
> 
> nit(?): isn't create_workqueue() deprecated? from:
> 
> https://www.kernel.org/doc/html/latest/core-api/workqueue.html#application-programming-interface-api
> 
> I think there's a zswap patch proposing fixing that on the zswap side.
> 


Yes, I recall seeing that patch. I might have acked it as well.
Very good catch. I will fix it in the V2 spin.
Meanwhile, I will wait a bit to collect the other review
feedback.

Thanks for catching that.

Chris
