From: Mel Gorman <mel@csn.ul.ie>
To: Shaohua Li <shaohua.li@intel.com>
Cc: linux-mm@kvack.org, cl@linux.com
Subject: Re: zone state overhead
Date: Fri, 8 Oct 2010 16:29:53 +0100
Message-ID: <20101008152953.GB3315@csn.ul.ie>
In-Reply-To: <20100928050801.GA29021@sli10-conroe.sh.intel.com>

On Tue, Sep 28, 2010 at 01:08:01PM +0800, Shaohua Li wrote:
> In a 4 socket 64 CPU system, zone_nr_free_pages() takes about 5% ~ 10% cpu time
> according to perf when memory pressure is high. The workload does something
> like:
> for i in `seq 1 $nr_cpu`
> do
>         create_sparse_file $SPARSE_FILE-$i $((10 * mem / nr_cpu))
>         $USEMEM -f $SPARSE_FILE-$i -j 4096 --readonly $((10 * mem / nr_cpu)) &
> done
> this simply reads a sparse file for each CPU. Apparently the
> zone->percpu_drift_mark is too big, and guess zone_page_state_snapshot() makes
> a lot of cache bounce for ->vm_stat_diff[]. below is the zoneinfo for reference.
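
For anyone following the thread, the drift in question can be modelled in
userspace. The following is a minimal Python sketch, not kernel code: the
Zone class, its method names and the numbers are all made up for
illustration. It assumes each CPU batches counter updates in a private
delta and only folds them into the shared counter once a threshold is
exceeded, so a cheap read can be stale by up to nr_cpus * threshold pages
while an accurate read has to visit every CPU's delta.

```python
# Userspace model of per-CPU counter drift. NOT kernel code: the class,
# method names and numbers are illustrative assumptions only.

class Zone:
    def __init__(self, nr_cpus, stat_threshold, nr_free):
        self.vm_stat = nr_free             # shared (global) counter
        self.vm_stat_diff = [0] * nr_cpus  # per-CPU pending deltas
        self.stat_threshold = stat_threshold

    def mod_state(self, cpu, delta):
        """Apply delta on one CPU. The shared counter is only touched
        once that CPU's pending delta exceeds the threshold."""
        x = self.vm_stat_diff[cpu] + delta
        if abs(x) > self.stat_threshold:
            self.vm_stat += x
            x = 0
        self.vm_stat_diff[cpu] = x

    def page_state(self):
        """Cheap read: shared counter only, possibly stale."""
        return max(self.vm_stat, 0)

    def page_state_snapshot(self):
        """Accurate read: has to visit every CPU's pending delta,
        which is where the cross-CPU cache traffic would come from."""
        return max(self.vm_stat + sum(self.vm_stat_diff), 0)

    def max_drift(self):
        """Worst-case gap between the cheap and the accurate read."""
        return len(self.vm_stat_diff) * self.stat_threshold


def demo():
    # 64 CPUs as in the report; the threshold of 125 and the starting
    # count of 100000 free pages are arbitrary values for the example.
    zone = Zone(nr_cpus=64, stat_threshold=125, nr_free=100000)
    for cpu in range(64):
        zone.mod_state(cpu, -100)  # each CPU allocates 100 pages
    return zone.page_state(), zone.page_state_snapshot(), zone.max_drift()


if __name__ == "__main__":
    cheap, accurate, drift = demo()
    print("cheap read:   ", cheap)
    print("accurate read:", accurate)
    print("max drift:    ", drift)
```

In this model every call to page_state_snapshot() costs one read per CPU,
so the more often the free page count sits below whatever mark triggers the
accurate read, the more often all 64 deltas get pulled across sockets.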

Would it be possible for you to post the oprofile report? I'm in the
early stages of trying to reproduce this locally based on your test
description. On the first machine I tried, zone_nr_free_pages accounted
for only 0.26% of profile time, with over half of it taken up by
do_mpage_readpage. The top of the report is as follows:

1599339  53.3463  vmlinux-2.6.36-rc7-pcpudrift do_mpage_readpage
131713    4.3933  vmlinux-2.6.36-rc7-pcpudrift __isolate_lru_page
103958    3.4675  vmlinux-2.6.36-rc7-pcpudrift free_pcppages_bulk
85024     2.8360  vmlinux-2.6.36-rc7-pcpudrift __rmqueue
78697     2.6250  vmlinux-2.6.36-rc7-pcpudrift native_flush_tlb_others
75678     2.5243  vmlinux-2.6.36-rc7-pcpudrift unlock_page
68741     2.2929  vmlinux-2.6.36-rc7-pcpudrift get_page_from_freelist
56043     1.8693  vmlinux-2.6.36-rc7-pcpudrift __alloc_pages_nodemask
55863     1.8633  vmlinux-2.6.36-rc7-pcpudrift ____pagevec_lru_add
46044     1.5358  vmlinux-2.6.36-rc7-pcpudrift radix_tree_delete
44543     1.4857  vmlinux-2.6.36-rc7-pcpudrift shrink_page_list
33636     1.1219  vmlinux-2.6.36-rc7-pcpudrift zone_watermark_ok
.....
7855      0.2620  vmlinux-2.6.36-rc7-pcpudrift zone_nr_free_pages

The machine I am testing on is a non-NUMA, single-socket, 4-core system with
totally different characteristics from yours, but I want to be sure I'm going
in more or less the right direction with the reproduction case before trying
to find a larger machine.

Thanks.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
