From: Aaron Lu <aaron.lu@intel.com>
To: Yang Shi <shy828301@gmail.com>
Cc: "ying.huang@intel.com" <ying.huang@intel.com>,
Michal Hocko <mhocko@suse.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linux MM <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Tim Chen <tim.c.chen@linux.intel.com>
Subject: Re: [PATCH] mm: swap: determine swap device by using page nid
Date: Fri, 29 Apr 2022 18:26:49 +0800
Message-ID: <Ymu9acl18pTA5GU6@ziqianlu-desk1>
In-Reply-To: <CAHbLzkriO6xWzyMNpcVFmyxSn=cqbz2qx+2mJ5d0m-beqPRCUg@mail.gmail.com>
On Fri, Apr 22, 2022 at 10:00:59AM -0700, Yang Shi wrote:
> On Thu, Apr 21, 2022 at 11:24 PM Aaron Lu <aaron.lu@intel.com> wrote:
> >
> > On Thu, Apr 21, 2022 at 04:34:09PM +0800, ying.huang@intel.com wrote:
> > > On Thu, 2022-04-21 at 16:17 +0800, Aaron Lu wrote:
> > > > On Thu, Apr 21, 2022 at 03:49:21PM +0800, ying.huang@intel.com wrote:
> >
> > ... ...
> >
> > > > > For swap-in latency, we can use pmbench, which can output latency
> > > > > information.
> > > > >
> > > >
> > > > OK, I'll give pmbench a run, thanks for the suggestion.
> > >
> > > Better to construct a scenario with more swapin than swapout. For
> > > example, start a memory eater, then kill it later.
> >
> > What about vm-scalability/case-swapin?
> > https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-swapin
> >
> > I think you are pretty familiar with it but still:
> > 1) it starts $nr_task processes, each mmaps a $size/$nr_task area and
> > then consumes that memory; after this, it waits for a signal;
> > 2) it starts another process that consumes $size memory to push the
> > memory from step 1) to the swap device;
> > 3) it kicks the processes from step 1) to access their memory again,
> > thus triggering swapins. The metric of this test case is the swapin
> > throughput.
> >
> > I plan to restrict the cgroup's limit to $size.
> >
> > Considering there is only one NVMe drive attached to node 0, I will run
> > the test as described before:
> > 1) bind processes to run on node 0 and allocate on node 1, to test the
> > performance when the reclaimer's node id is the same as the swap device's;
> > 2) bind processes to run on node 1 and allocate on node 0, to test the
> > performance when the page's node id is the same as the swap device's.
> >
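
For reference, the core of case-swapin described above boils down to
something like the sketch below for each of the $nr_task processes. This
is not the actual vm-scalability script; SIZE and the use of SIGUSR1 are
assumptions for illustration only.

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SIZE (1UL << 30)	/* per-task region; made-up size */

static volatile sig_atomic_t kicked;

static void handler(int sig)
{
	(void)sig;
	kicked = 1;
}

int main(void)
{
	char *buf;
	size_t i;

	signal(SIGUSR1, handler);

	buf = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* step 1: fault in and consume the memory, then wait */
	memset(buf, 1, SIZE);

	/* the memory eater from step 2 pushes these pages to swap */
	while (!kicked)
		pause();

	/* step 3: touch every page again, triggering swapins */
	for (i = 0; i < SIZE; i += 4096)
		buf[i]++;

	return 0;
}
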
Thanks to Tim, who found me a server with a single Optane disk attached
to node 0.

Let's use task0_mem0 to denote tasks bound to node 0 with memory also
bound to node 0 through cgroup cpuset. For the above swapin case:
when nr_task=1:
task0_mem0 throughput: [571652, 587158, 594316], avg=584375 -> baseline
task0_mem1 throughput: [582944, 583752, 589026], avg=585240 +0.15%
task1_mem0 throughput: [569349, 577459, 581107], avg=575971 -1.4%
task1_mem1 throughput: [564482, 570664, 571466], avg=568870 -2.6%
task0_mem1 is slightly better than task1_mem0.
I also gave nr_task=8 and nr_task=16 a run and the results are almost
the same for all 4 cases.
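
For completeness, a task0_mem1 style binding through cgroup v1 cpuset
looks roughly like the sketch below. The cpuset path, node 0's CPU list
and the way case-swapin is invoked are made up for illustration; a real
run also needs the cpuset directory created beforehand and the
vm-scalability environment variables set.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* write one value into a control file of the (pre-created) cpuset */
static void cpuset_write(const char *file, const char *val)
{
	char path[256];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/fs/cgroup/cpuset/swapin_test/%s", file);
	f = fopen(path, "w");
	if (!f) {
		perror(path);
		exit(1);
	}
	fprintf(f, "%s", val);
	fclose(f);
}

int main(void)
{
	char pid[32];

	cpuset_write("cpuset.cpus", "0-31");	/* CPUs of node 0 (made up) */
	cpuset_write("cpuset.mems", "1");	/* allocate from node 1 only */

	snprintf(pid, sizeof(pid), "%d", (int)getpid());
	cpuset_write("tasks", pid);	/* move this task into the cpuset */

	/* exec the actual test so it inherits the binding */
	execl("/bin/sh", "sh", "-c", "./case-swapin", (char *)NULL);
	perror("execl");
	return 1;
}
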
> > Ying and Yang,
> >
> > Let me know what you think about the case used and the way the test is
> > conducted.
>
> Looks fine to me. To measure the latency, you could also try the below
> bpftrace script:
>
Trying to install bpftrace on an old distro (Ubuntu 16.04) is a real
pain, so I gave up... But I managed to get an old bcc installed. Using
its funclatency script to profile swap_readpage() for 30 seconds,
there is no obvious difference in the histograms.
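(For reference, that was an invocation along the lines of
"funclatency -d 30 swap_readpage"; the exact tool path and supported
flags depend on the bcc version installed.)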
So for now, the existing results don't show a big difference.
Theoretically, for an I/O swap device, when swapping out a remote page,
using the swap device on the same node as the page can reduce cross-node
interconnect traffic and improve performance. I think this is the main
motivation for this code change?

At swap-in time, it's hard to predict which node the task will be
running on anyway, so it's hard to say which swap placement would be
beneficial then.