Re: [RFC 0/3] reduce latency of direct async compaction

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Aaron Lu <aaron.lu@intel.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Rik van Riel <riel@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Mel Gorman <mgorman@suse.de>, Minchan Kim <minchan@kernel.org>
Subject: Re: [RFC 0/3] reduce latency of direct async compaction
Date: Mon, 7 Dec 2015 16:35:24 +0900
Message-ID: <20151207073523.GA27292@js1304-P5Q-DELUXE>
In-Reply-To: <56618841.2080808@suse.cz>

On Fri, Dec 04, 2015 at 01:34:09PM +0100, Vlastimil Babka wrote:
> On 12/03/2015 12:52 PM, Aaron Lu wrote:
> >On Thu, Dec 03, 2015 at 07:35:08PM +0800, Aaron Lu wrote:
> >>On Thu, Dec 03, 2015 at 10:38:50AM +0100, Vlastimil Babka wrote:
> >>>On 12/03/2015 10:25 AM, Aaron Lu wrote:
> >>>>On Thu, Dec 03, 2015 at 09:10:44AM +0100, Vlastimil Babka wrote:
> >>
> >>My bad, I uploaded the wrong data :-/
> >>I uploaded again:
> >>https://drive.google.com/file/d/0B49uX3igf4K4UFI4TEQ3THYta0E
> >>
> >>And I just ran the base tree with trace-cmd and found that its
> >>performance drops significantly (from 1000MB/s to 6xxMB/s). Is it that
> >>trace-cmd will impact performance a lot?
> 
> Yeah it has some overhead depending on how many events it has to
> process. Your workload is quite sensitive to that.
> 
> >>Any suggestions on how to run
> >>the test regarding trace-cmd? i.e. should I always run usemem under
> >>trace-cmd or only when necessary?
> 
> I'd run it with tracing only when the goal is to collect traces, but
> not for any performance comparisons. Also it's not useful to collect
> perf data while also tracing.
> 
> >I just ran the test with the base tree and with this patch series
> >applied (head); I didn't use trace-cmd this time.
> >
> >The throughput for the base tree is 963MB/s while the head is 815MB/s. I
> >have attached pagetypeinfo/proc-vmstat/perf-profile for them.
> 
> The compact stats improvements look fine, perhaps better than in my tests:
> 
> base: compact_migrate_scanned 3476360
> head: compact_migrate_scanned 1020827
> 
> - that's the eager skipping of patch 2
> 
> base: compact_free_scanned 5924928
> head: compact_free_scanned 0
>       compact_free_direct 918813
>       compact_free_direct_miss 500308
> 
> As your workload does exclusively async direct compaction through
> THP faults, the traditional free scanner isn't used at all. Direct
> allocations should be much cheaper, although the "miss" ratio (the
> allocations that were from the same pageblock as the one we are
> compacting) is quite high. I should probably look into making
> migration release pages to the tails of the freelists - could be
> that it's grabbing the very pages that were just freed in the
> previous COMPACT_CLUSTER_MAX cycle (modulo pcplist buffering).
> 
> I however find it strange that your original stats (4.3?) differ
> from the base so much:
> 
> compact_migrate_scanned 1982396
> compact_free_scanned 40576943
> 
> That was an order of magnitude more free scanned on 4.3, and half the
> migrate scanned. But your throughput figures in the other mail
> suggested a regression from 4.3 to 4.4, which would be the opposite
> of what the stats say. And anyway, compaction code didn't change
> between 4.3 and 4.4 except changes to tracepoint format...
> 
> moving on...
> base:
> compact_isolated 731304
> compact_stall 10561
> compact_fail 9459
> compact_success 1102
> 
> head:
> compact_isolated 921087
> compact_stall 14451
> compact_fail 12550
> compact_success 1901
> 
> More success in both isolation and compaction results.
> 
> base:
> thp_fault_alloc 45337
> thp_fault_fallback 2349
> 
> head:
> thp_fault_alloc 45564
> thp_fault_fallback 2120
> 
> Somehow the extra compact success didn't fully translate to thp
> alloc success... But given how many of the allocs didn't even
> involve a compact_stall (two thirds of them), that interpretation
> could also be easily misleading. So, hard to say.
> 
> Looking at the perf profiles...
> base:
>     54.55%    54.55%            :1550  [kernel.kallsyms]   [k] pageblock_pfn_to_page
> 
> head:
>     40.13%    40.13%            :1551  [kernel.kallsyms]   [k] pageblock_pfn_to_page
> 
> Since the freepage allocation doesn't hit this code anymore, it
> shows that the bulk was actually from the migration scanner,
> although the perf callgraph and vmstats suggested otherwise.

It looks like the overhead still remains. I guess the migration scanner
calls pageblock_pfn_to_page() over a more extended range, so the
overhead is still there.

I have an idea to solve this problem. Aaron, could you test the following
patch on top of base? It tries to skip calling pageblock_pfn_to_page()
if we have already checked at the initialization stage that the zone is
contiguous.

Thanks.

---->8----

Thread overview: 22+ messages
2015-12-03  8:10 Vlastimil Babka
2015-12-03  8:10 ` [RFC 1/3] mm, compaction: reduce spurious pcplist drains Vlastimil Babka
2015-12-03  8:10 ` [RFC 2/3] mm, compaction: make async direct compaction skip blocks where isolation fails Vlastimil Babka
2015-12-03  8:10 ` [RFC 3/3] mm, compaction: direct freepage allocation for async direct compaction Vlastimil Babka
2015-12-03  9:25 ` [RFC 0/3] reduce latency of direct async compaction Aaron Lu
2015-12-03  9:38   ` Vlastimil Babka
2015-12-03 11:35     ` Aaron Lu
2015-12-03 11:52       ` Aaron Lu
2015-12-04 12:34         ` Vlastimil Babka
2015-12-07  7:35           ` Joonsoo Kim [this message]
2015-12-07  8:59             ` Aaron Lu
2015-12-08  0:41               ` Joonsoo Kim
2015-12-08  5:14                 ` Aaron Lu
2015-12-08  6:51                   ` Joonsoo Kim
2015-12-08  8:52                     ` Aaron Lu
2015-12-09  0:33                       ` Joonsoo Kim
2015-12-09  5:40                         ` Aaron Lu
2015-12-10  4:35                           ` Joonsoo Kim
2015-12-10  6:15                             ` Aaron Lu
2015-12-04  6:25 ` Aaron Lu
2015-12-04 12:38   ` Vlastimil Babka
2015-12-07  3:14     ` Aaron Lu
