From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
To: linux-mm@kvack.org
Subject: Re: isolate_freepages_block and excessive CPU usage by OSD process
Date: Wed, 3 Dec 2014 17:05:59 +0900
Message-ID: <20141203080559.GC6276@js1304-P5Q-DELUXE>
In-Reply-To: <20141203040404.GA16499@cucumber.bridge.anchor.net.au>

On Wed, Dec 03, 2014 at 03:04:04PM +1100, Christian Marie wrote:
> On Tue, Dec 02, 2014 at 04:06:08PM +1100, Christian Marie wrote:
> > I will attempt to do this tomorrow and should have results in around 24 hours.
> 
> I ran said test today and wasn't able to pinpoint a solid difference between a kernel
> with both patches and one with only the first. The one with both patches "felt"
> a little more responsive, probably a fluke.

Thanks! That helps me.

> 
> I'd really like to write a stress test that simulates what ceph/ipoib is doing
> here so that I can test this in a more scientific manner.
> 
> Here is some perf output, the kernel with only the first patch is on the right:
> 
> http://ponies.io/raw/before-after.png
> 
> 
> A note in passing: we left the cluster running with min_free_kbytes set to the
> default last night and within a few hours it started spewing the usual
> pre-patch allocation failures, so whilst this patch appears to make the system
> more responsive under adverse conditions the underlying
> not-keeping-up-with-pressure issue is still there.

I guess that it is caused by allocating too fast. If your allocation rate
exceeds kswapd's reclaim rate and the allocation is made without __GFP_WAIT,
failure is possible. The failure log below looks like that case. In this
case, enlarging min_free_kbytes may be the right solution, but I'm not an
expert, so please consult other MM guys.
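
As a hedged illustration of the point above: the failure quoted below reports
mode:0x20. Assuming the low gfp bit values from include/linux/gfp.h in
3.10-era kernels (they differ in other versions), this minimal Python sketch
decodes the mask and shows that __GFP_WAIT is not set, so the allocation
cannot sleep and wait for reclaim. The names GFP_BITS_3_10, decode_gfp and
may_sleep are made up for this example and are not kernel interfaces.

```python
# Low gfp bits, assuming the 3.10-era layout of include/linux/gfp.h.
# Assumption: other kernel versions use different values and names.
GFP_BITS_3_10 = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",
    0x08: "__GFP_MOVABLE",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
}


def decode_gfp(mode):
    """Return the names of the known low bits set in a gfp mask."""
    names = [name for bit, name in sorted(GFP_BITS_3_10.items()) if mode & bit]
    unknown = mode & ~sum(GFP_BITS_3_10)  # bits outside this table
    if unknown:
        names.append(hex(unknown))
    return names


def may_sleep(mode):
    """True if the mask lets the allocator block (__GFP_WAIT is set)."""
    return bool(mode & 0x10)


print(decode_gfp(0x20), may_sleep(0x20))  # mask from the failure log
print(decode_gfp(0xD0), may_sleep(0xD0))  # WAIT | IO | FS, for contrast
```

Under that assumption, 0x20 is __GFP_HIGH alone, which is what an atomic
allocation from softirq context would use.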

> There's enough starvation to break single page allocations.
> 
> Keep in mind that this is on a 3.10 kernel with the patches applied so I'm not
> expecting anyone to particularly care. I'm running out of time to test the
> whole cluster at 3.18 is all, I really do think that replicating the allocation
> pattern is the best way forward but my attempts at simply sending a lot of
> packets that look similar with lots of page cache don't do it.
> 
> Those allocation failures on 3.10 with both patches look like this:
> 
> 	[73138.803800] ceph-osd: page allocation failure: order:0, mode:0x20
> 	[73138.803802] CPU: 0 PID: 9214 Comm: ceph-osd Tainted: GF O--------------   3.10.0-123.9.3.anchor.x86_64 #1
> 	[73138.803803] Hardware name: Dell Inc. PowerEdge R720xd/0X3D66, BIOS 2.2.2 01/16/2014
> 	[73138.803803]  0000000000000020 00000000d6532f99 ffff88081fa03aa0 ffffffff815e23bb
> 	[73138.803806]  ffff88081fa03b30 ffffffff81147340 00000000ffffffff ffff8807da887900
> 	[73138.803808]  ffff88083ffd9e80 ffff8800b2242900 ffff8807d843c050 00000000d6532f99
> 	[73138.803812] Call Trace:
> 	[73138.803813]  <IRQ>  [<ffffffff815e23bb>] dump_stack+0x19/0x1b
> 	[73138.803817]  [<ffffffff81147340>] warn_alloc_failed+0x110/0x180
> 	[73138.803819]  [<ffffffff8114b4ee>] __alloc_pages_nodemask+0x91e/0xb20
> 	[73138.803821]  [<ffffffff8152f82a>] ? tcp_v4_rcv+0x67a/0x7c0
> 	[73138.803823]  [<ffffffff81509710>] ? ip_rcv_finish+0x350/0x350
> 	[73138.803826]  [<ffffffff81188369>] alloc_pages_current+0xa9/0x170
> 	[73138.803828]  [<ffffffff814bedb1>] __netdev_alloc_frag+0x91/0x140
> 	[73138.803831]  [<ffffffff814c0df7>] __netdev_alloc_skb+0x77/0xc0
> 	[73138.803834]  [<ffffffffa06b54c5>] ipoib_cm_handle_rx_wc+0xf5/0x940 [ib_ipoib]
> 	[73138.803838]  [<ffffffffa0625e78>] ? mlx4_ib_poll_cq+0xc8/0x210 [mlx4_ib]
> 	[73138.803841]  [<ffffffffa06a90ed>] ipoib_poll+0x8d/0x150 [ib_ipoib]
> 	[73138.803843]  [<ffffffff814d05aa>] net_rx_action+0x15a/0x250
> 	[73138.803846]  [<ffffffff81067047>] __do_softirq+0xf7/0x290
> 	[73138.803848]  [<ffffffff815f43dc>] call_softirq+0x1c/0x30
> 	[73138.803851]  [<ffffffff81014d25>] do_softirq+0x55/0x90
> 	[73138.803853]  [<ffffffff810673e5>] irq_exit+0x115/0x120
> 	[73138.803855]  [<ffffffff815f4cd8>] do_IRQ+0x58/0xf0
> 	[73138.803857]  [<ffffffff815e9e2d>] common_interrupt+0x6d/0x6d
> 	[73138.803858]  <EOI>  [<ffffffff815f2bc0>] ? sysret_audit+0x17/0x21
> 
> We get some like this, also:
> 
> [ 1293.152415] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
> [ 1293.152416]   cache: kmalloc-256, object size: 256, buffer size: 256, default order: 1, min order: 0
> [ 1293.152417]   node 0: slabs: 1789, objs: 57248, free: 0
> [ 1293.152418]   node 1: slabs: 449, objs: 14368, free: 2
> 
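
The SLUB report above can be sanity-checked with a little arithmetic. The
sketch below assumes 4 KiB pages (not stated in the report) and uses a
made-up helper name, objs_per_slab. It checks that the reported slab and
object counts match "default order: 1" and shows how little slack each node
has.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on typical x86_64


def objs_per_slab(order, buffer_size, page_size=PAGE_SIZE):
    """Objects that fit in one slab of 2**order contiguous pages."""
    return (page_size << order) // buffer_size


# Figures copied from the SLUB report quoted above (kmalloc-256).
nodes = {
    0: {"slabs": 1789, "objs": 57248, "free": 0},
    1: {"slabs": 449, "objs": 14368, "free": 2},
}

expected = objs_per_slab(order=1, buffer_size=256)
for node, s in nodes.items():
    per_slab = s["objs"] // s["slabs"]
    print(f"node {node}: {per_slab} objs/slab (expected {expected}), "
          f"{s['free']} free of {s['objs']}")
```

If this reading is right, node 0 has no free kmalloc-256 object left, so the
next allocation there needs a fresh slab from the page allocator, and with
gfp=0x20 that request cannot wait for reclaim either.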


