From: Andrey Korolyov <andrey@xdel.ru>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: linux-mm@kvack.org, Christian Marie <christian@ponies.io>
Subject: Re: isolate_freepages_block and excessive CPU usage by OSD process
Date: Thu, 20 Nov 2014 03:49:26 +0400
Message-ID: <CABYiri_UeWY6gzY_TKKJ+VPcrYowjN=5riwff30-WTgZAO5xeQ@mail.gmail.com>
In-Reply-To: <546D2366.1050506@suse.cz>
On Thu, Nov 20, 2014 at 2:10 AM, Vlastimil Babka <vbabka@suse.cz> wrote:
> On 11/19/2014 10:20 PM, Christian Marie wrote:
>> On Wed, Nov 19, 2014 at 10:03:44PM +0400, Andrey Korolyov wrote:
>>> > We are using Mellanox ipoib drivers which do not do scatter-gather, so I'm
>>> > currently working on adding support for that (the hardware supports it). Are
>>> > you also using ipoib or have something else doing high order allocations? It's
>>> > a bit concerning for me if you don't as it would suggest that cutting down on
>>> > those allocations won't help.
>>>
>>> So do I. On a test environment with regular tengig cards I was unable to
>>> reproduce the issue. Honestly, I thought that almost every contemporary
>>> driver for high-speed cards works with scatter-gather, so I did not have mlx
>>> in mind as a potential cause of this problem from the very beginning.
>>
>> Right, the drivers handle SG just fine, even in UD mode. It's just that as soon
>> as you switch to CM they turn off hardware IP csums and SG support. The only
>> question I still have to answer before testing a patched driver is whether or
>> not the messages sent by Ceph are fragmented enough to save allocations. If not,
>> we could always patch Ceph as well but this is beginning to snowball.
>>
>> Here is the untested WIP patch for SG support in ipoib CM mode. I'm currently
>> talking to the original author of a larger patch to review and split that and
>> get them both upstream:
>>
>> https://gist.github.com/christian-marie/e8048b9c118bd3925957
>>
>>> There are a couple of reports on the ceph lists complaining about OSD
>>> flapping/unresponsiveness without a clear reason under certain (though not
>>> always clear) conditions, which may have the same root cause.
>>
>> Possibly, though ipoib and Ceph seem to be a relatively rare combination.
>> Someone will likely find this thread if it is the same root cause.
>>
>>> I wonder if a numad-like mechanism would help there, but its usage is
>>> generally an anti-performance pattern in my experience.
>>
>> We've played with zone_reclaim_mode and numad to no avail. The only thing we
>> haven't tried is striping, which I don't want to do anyway.
>>
>> If these large allocations are indeed a reasonable thing to ask of the
>> compaction/reclaim subsystem that seems like the best way forward. I have two
>> questions that follow from this conjecture:
>>
>> Is compaction behaving badly or are we just asking for too many high order
>> allocations?
>>
>> Is this fixed in a later kernel? I haven't tested yet.
>
> As I said, recent kernels received many compaction performance tuning patches,
> and reclaim as well. I would recommend trying them, if it's possible.
>
> You mention 3.10.0-123.9.3.el7.x86_64, and I have no idea how it relates to
> the upstream stable kernel. Upstream version 3.10.44 received several compaction
> fixes that I'd deem critical for compaction to work as intended, and the lack of
> them could explain your problems:
>
> mm: compaction: reset cached scanner pfn's before reading them
> commit d3132e4b83e6bd383c74d716f7281d7c3136089c upstream.
>
> mm: compaction: detect when scanners meet in isolate_freepages
> commit 7ed695e069c3cbea5e1fd08f84a04536da91f584 upstream.
>
> mm/compaction: make isolate_freepages start at pageblock boundary
> commit 49e068f0b73dd042c186ffa9b420a9943e90389a upstream.
>
> You might want to check if those are included in your kernel package, and/or try
> upstream stable 3.10 (if you can't use the latest for some reason).
>
> Vlastimil
Thanks, neither Christian's build nor mine includes those. I
mentioned that I run -stable 3.10, but it was derived from the public
branch probably as early as RH's and has received only
performance/security fixes at most. I will check the issue soon and
report back.
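
For reference, the check for the three fixes listed above can be sketched
roughly as follows. This is only an illustration under assumptions: the helper
names (FIXES, missing_fixes) are made up for this sketch, and the input is
assumed to be plain log text from the tree or package in question, where a
stable backport carries either the upstream SHA ("commit <sha> upstream.") or
the original subject line. A distribution changelog may reword subjects, so a
miss here is a hint to look closer, not proof that a fix is absent.

```python
import re

# Upstream commits named in the quoted message (SHA -> subject line).
FIXES = {
    "d3132e4b83e6bd383c74d716f7281d7c3136089c":
        "mm: compaction: reset cached scanner pfn's before reading them",
    "7ed695e069c3cbea5e1fd08f84a04536da91f584":
        "mm: compaction: detect when scanners meet in isolate_freepages",
    "49e068f0b73dd042c186ffa9b420a9943e90389a":
        "mm/compaction: make isolate_freepages start at pageblock boundary",
}


def _normalize(text):
    """Lowercase and collapse whitespace so wrapped log lines still match."""
    return re.sub(r"\s+", " ", text.lower())


def missing_fixes(log_text, fixes=FIXES):
    """Return the SHAs for which neither the hash nor the subject is in log_text."""
    haystack = _normalize(log_text)
    missing = []
    for sha, subject in fixes.items():
        if sha in haystack or _normalize(subject) in haystack:
            continue  # found by upstream SHA or by subject line
        missing.append(sha)
    return missing
```

The log text could come from, for example, the output of
`git log -- mm/compaction.c` in the tree the kernel was built from; an empty
result from missing_fixes() would suggest all three fixes are present.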
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>