Re: hugepage compaction causes performance drop


From: Vlastimil Babka <vbabka@suse.cz>
To: Aaron Lu <aaron.lu@intel.com>, linux-mm@kvack.org
Cc: Huang Ying <ying.huang@intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	lkp@lists.01.org, Andrea Arcangeli <aarcange@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: Re: hugepage compaction causes performance drop
Date: Thu, 19 Nov 2015 14:29:10 +0100
Message-ID: <564DCEA6.3000802@suse.cz>
In-Reply-To: <20151119092920.GA11806@aaronlu.sh.intel.com>

+CC Andrea, David, Joonsoo

On 11/19/2015 10:29 AM, Aaron Lu wrote:
> Hi,
>
> One VM-related test case run by LKP on a Haswell EP with 128GiB of memory
> showed that the compaction code causes a performance drop of about 30%. To
> illustrate the problem, I've simplified the test with a program called
> usemem (see attached). The test goes like this:
> 1 Boot up the server;
> 2 modprobe scsi_debug (a module that can use memory as a SCSI device),
>    dev_size set to 4/5 of free memory, i.e. about 100GiB. Use it as swap.
> 3 run the usemem test, which uses mmap to map a MAP_PRIVATE | MAP_ANON
>    region with the size set to 3/4 of (remaining_free_memory + swap), and
>    then writes to that region sequentially to trigger page faults and
>    swap-out.
>
> The above test runs with two configs regarding the below two sysfs files:
> /sys/kernel/mm/transparent_hugepage/enabled
> /sys/kernel/mm/transparent_hugepage/defrag
> 1 transparent hugepage and defrag are both set to always, let's call it
>    always-always case;
> 2 transparent hugepage is set to always while defrag is set to never,
>    let's call it always-never case.
>
> The output from the always-always case is:
> Setting up swapspace version 1, size = 104627196 KiB
> no label, UUID=aafa53ae-af9e-46c9-acb9-8b3d4f57f610
> cmdline: /lkp/aaron/src/bin/usemem 99994672128
> 99994672128 transferred in 95 seconds, throughput: 1003 MB/s
>
> And the output from the always-never case is:
> Setting up swapspace version 1, size = 104629244 KiB
> no label, UUID=60563c82-d1c6-4d86-b9fa-b52f208097e9
> cmdline: /lkp/aaron/src/bin/usemem 99995965440
> 99995965440 transferred in 67 seconds, throughput: 1423 MB/s
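
As a quick sanity check on the two runs quoted above, the reported throughput can be recomputed from the byte counts and elapsed times. This is only a sketch: the helper names are made up for illustration, and it assumes that usemem's "MB/s" means MiB/s (2^20 bytes per "MB") truncated to an integer, which is inferred from the numbers rather than stated in the report.

```python
MIB = 1 << 20  # assumption: usemem's "MB" is 2**20 bytes


def throughput_mib_s(nbytes, seconds):
    """Whole MiB/s for nbytes transferred in the given number of seconds."""
    return int(nbytes / seconds / MIB)


def relative_drop(slow, fast):
    """Fraction by which `slow` falls short of `fast`."""
    return (fast - slow) / fast


# Figures copied from the two quoted runs.
always_always = throughput_mib_s(99994672128, 95)
always_never = throughput_mib_s(99995965440, 67)
drop = relative_drop(always_always, always_never)

print(always_always, always_never, round(drop * 100, 1))
```

Under that assumption the two results come out as 1003 and 1423, matching the quoted output, and the gap between them is a little under 30%, in line with the "about 30%" in the report.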

So yeah, this is an example of a workload that gets no benefit from THP
but pays all the cost. Fixing that is non-trivial, and I admit I haven't
pushed my prior efforts there much lately...
But it's also possible that there are still actual compaction bugs making
the issue worse.

> The vmstat and perf-profile are also attached, please let me know if you
> need any more information, thanks.

Output from vmstat (the tool) isn't very useful here; a periodic "cat
/proc/vmstat" would be much better.
The perf profiles are somewhat weirdly sorted by children cost (?), but
I noticed a very high cost (46%) in pageblock_pfn_to_page(). This could
be due to a very large but sparsely populated zone. Could you provide
/proc/zoneinfo?
If the compaction scanners behave strangely due to a bug, enabling the
ftrace compaction tracepoints should help find the cause. That might
produce very large output, but maybe it would be enough to see some
parts of it (e.g. towards the beginning, middle, and end of the
experiment).
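
For the periodic /proc/vmstat sampling, something along these lines could do instead of a shell loop. It is a rough sketch, not a tested tool: the function names are hypothetical, the 10-second interval is arbitrary, and limiting the output to counters starting with "compact_" or "thp_" is just a guess at what is most relevant here.

```python
import time


def parse_vmstat(text):
    """Parse "name value" lines, as found in /proc/vmstat, into a dict."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def counter_deltas(before, after, prefixes=("compact_", "thp_")):
    """Change in each counter whose name starts with one of the prefixes."""
    return {
        name: value - before.get(name, 0)
        for name, value in after.items()
        if name.startswith(prefixes)
    }


def watch(path="/proc/vmstat", interval=10.0, samples=6):
    """Print per-interval deltas; meant to run while the test is running."""
    with open(path) as f:
        prev = parse_vmstat(f.read())
    for _ in range(samples):
        time.sleep(interval)
        with open(path) as f:
            cur = parse_vmstat(f.read())
        print(time.strftime("%H:%M:%S"), counter_deltas(prev, cur))
        prev = cur
```

Calling watch() in a second terminal for the duration of the usemem run would give one line of deltas per interval, which is easier to compare between the always-always and always-never cases than raw cumulative counters.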

Vlastimil


