From: Andrey Korolyov <andrey@xdel.ru>
To: jesper@krogh.cc
Cc: linux-mm@kvack.org
Subject: Re: High system load and 3TB of memory.
Date: Sat, 14 Mar 2015 20:14:02 +0300	[thread overview]
Message-ID: <CABYiri81_RAtJizfpOdNPc6m9_Q2u0O35NX0ZhO1cxFpm866HQ@mail.gmail.com> (raw)
In-Reply-To: <52ec58f434865829c37337624d124981.squirrel@shrek.krogh.cc>

On Sat, Mar 14, 2015 at 8:05 PM,  <jesper@krogh.cc> wrote:
> Hi.
>
> I have a 3.13 (Ubuntu LTS) server with 3TB of memory and under certain load
> conditions it can spiral off to 80+% system load. Per recommendation on IRC
> yesterday I have captured 2 perf reports (I'm new to perf, so I'm not
> sure they tell precisely what's needed).
>
> Bad situation (high sysload 80%+)
>
> Samples: 381K of event 'cycles', Event count (approx.): 1228296411165
> +  27.84%         postgres  [kernel.kallsyms]     [k] isolate_freepages_block
> +  21.08%             psql  [kernel.kallsyms]     [k] isolate_freepages_block
> +  20.72%       pg_restore  [kernel.kallsyms]     [k] isolate_freepages_block
> +   3.94%         postgres  postgres              [.] pglz_compress
> +   2.86%         postgres  [kernel.kallsyms]     [k] set_pageblock_flags_mask
> +   2.35%        bacula-fd  [kernel.kallsyms]     [k] isolate_freepages_block
> +   2.07%       pg_restore  [kernel.kallsyms]     [k] set_pageblock_flags_mask
> +   2.06%             psql  [kernel.kallsyms]     [k] set_pageblock_flags_mask
> +   1.56%         postgres  libc-2.15.so          [.] 0x000000000003c95f
> +   0.93%       irqbalance  [kernel.kallsyms]     [k] isolate_freepages_block
> +   0.88%       pg_restore  [kernel.kallsyms]     [k] isolate_freepages
> +   0.87%             psql  [kernel.kallsyms]     [k] isolate_freepages
> +   0.86%         postgres  [kernel.kallsyms]     [k] isolate_freepages
> +   0.81%         postgres  postgres              [.] 0x000000000027ff5b
> +   0.60%         postgres  [kernel.kallsyms]     [k] get_pageblock_flags_mask
> +   0.44%         proc_pri  [kernel.kallsyms]     [k] isolate_freepages_block
>
> Good situation .. sysload < 5%
>
> Samples: 509K of event 'cycles', Event count (approx.): 1635259826919
> +  21.14%         postgres  postgres                  [.] pglz_compress
> +  14.46%         postgres  postgres                  [.] 0x000000000016b643
> +  10.11%         postgres  libc-2.15.so              [.] 0x0000000000092f69
> +   5.74%         postgres  postgres                  [.] s_lock
> +   2.86%         postgres  postgres                  [.] LWLockAcquire
> +   2.51%       pg_restore  [kernel.kallsyms]         [k] isolate_freepages_block
> +   2.33%         postgres  postgres                  [.] NextCopyFromRawFields
> +   2.15%         postgres  postgres                  [.] LWLockRelease
> +   2.10%         postgres  postgres                  [.] _start
> +   1.93%         postgres  [kernel.kallsyms]         [k] copy_user_enhanced_fast_string
> +   1.70%         postgres  [kernel.kallsyms]         [k] change_pte_range
> +   1.61%         postgres  postgres                  [.] pg_verify_mbstr_len
> +   1.31%         postgres  postgres                  [.] hash_search_with_hash_value
> +   1.21%         postgres  libc-2.15.so              [.] __strcoll_l
> +   0.86%          kswapd0  [kernel.kallsyms]         [k] __mem_cgroup_uncharge_common
> +   0.72%         postgres  postgres                  [.] heap_fill_tuple
> +   0.68%        bacula-fd  [kernel.kallsyms]         [k] isolate_freepages_block
> +   0.66%         postgres  [kernel.kallsyms]         [k] clear_page_c_e
> +   0.63%       pg_restore  [kernel.kallsyms]         [k] copy_user_enhanced_fast_string
>
>
> Hugepages are disabled. All suggestions for configuration changes, etc. are
> welcome.
>
> The IO subsystem is not particularly busy in either situation. A sar
> output can be seen here:
> http://thread.gmane.org/gmane.linux.kernel/1908263
>
> Jesper

Hi Jesper, please take a look at
http://marc.info/?l=linux-mm&m=141605213522925&w=2, where there is a long
and unfinished discussion; it has proven very hard to reproduce the bug
deterministically in our environments. If you can trigger the same
lockups more easily, it will help a lot in pinning down and fixing the
issue.
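
To make the two quoted reports easier to compare, here is a minimal
sketch that sums the perf overhead per symbol across all processes. It
assumes the plain-text `perf report` layout shown in the quoted message
(overhead, command, shared object, symbol), with any wrapped rows
rejoined first. The names `overhead_by_symbol`, `ROW` and `SAMPLE` are
made up for illustration and are not part of perf.

```python
import re
from collections import defaultdict

# One report row: "+  27.84%  postgres  [kernel.kallsyms]  [k] symbol_name"
ROW = re.compile(
    r"^\+?\s*(\d+(?:\.\d+)?)%\s+(\S+)\s+(\S+)\s+\[(.)\]\s+(\S+)\s*$"
)

# A few rows copied from the "bad situation" report quoted above.
SAMPLE = """\
> +  27.84%         postgres  [kernel.kallsyms]     [k] isolate_freepages_block
> +  21.08%             psql  [kernel.kallsyms]     [k] isolate_freepages_block
> +  20.72%       pg_restore  [kernel.kallsyms]     [k] isolate_freepages_block
> +   3.94%         postgres  postgres              [.] pglz_compress
"""


def overhead_by_symbol(report_text):
    """Return {symbol: summed overhead in percent} for all parsable rows."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        # Drop email quote markers and surrounding whitespace.
        line = line.lstrip("> ").rstrip()
        match = ROW.match(line)
        if match is None:
            continue  # header lines, blank lines, or rows still wrapped
        pct, _comm, _dso, _kind, symbol = match.groups()
        totals[symbol] += float(pct)
    return dict(totals)


if __name__ == "__main__":
    totals = overhead_by_symbol(SAMPLE)
    for symbol, pct in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{pct:6.2f}%  {symbol}")
```

Summing per symbol rather than per process shows how much of the total
goes to a single kernel function regardless of which task triggered it.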


Thread overview: 7+ messages
2015-03-14 17:05 jesper
2015-03-14 17:14 ` Andrey Korolyov [this message]
2015-03-14 17:25   ` jesper
2015-03-14 17:33     ` Andrey Korolyov
2015-03-18 14:15       ` Vlastimil Babka
2015-03-18 15:14         ` Jesper Krogh
2015-03-19 12:51           ` Joonsoo Kim
