From: Marinko Catovic <marinko.catovic@gmail.com>
To: Marinko Catovic <marinko.catovic@gmail.com>
Cc: Christopher Lameter <cl@linux.com>,
Vlastimil Babka <vbabka@suse.cz>,
linux-mm@kvack.org
Subject: Re: Caching/buffers become useless after some time
Date: Tue, 21 Aug 2018 02:36:05 +0200
Message-ID: <CADF2uSp7MKYWL7Yu5TDOT4qe0v-0iiq+Tv9J6rnzCSgahXbNaA@mail.gmail.com>
In-Reply-To: <CADF2uSqzt+u7vMkcD-vvT6tjz2bdHtrFK+p6s7NXGP-BJ34dRA@mail.gmail.com>
>> The only way how kmemcg limit could help I can think of would be to
>> enforce metadata reclaim much more often. But that is rather a bad
>> workaround.
>
> would that have some significant performance impact?
> I would be willing to try if you think the idea is not that bad.
> If so, could you please explain what to do?
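
(If I understand the idea correctly, that would mean something like the
following rough sketch, assuming the cgroup v1 memory controller is mounted
at /sys/fs/cgroup/memory; the group name "meta-test" is made up, and the
kmem limit has to be set before tasks are moved into the group:)

# mkdir /sys/fs/cgroup/memory/meta-test
# echo 2G > /sys/fs/cgroup/memory/meta-test/memory.kmem.limit_in_bytes
# echo $$ > /sys/fs/cgroup/memory/meta-test/cgroup.procs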
>
>> > > Because a lot of FS metadata is fragmenting the memory and a large
>> > > number of high order allocations which want to be served reclaim a lot
>> > > of memory to achieve their goal. Considering a large part of memory is
>> > > fragmented by unmovable objects there is no other way than to use
>> > > reclaim to release that memory.
>> >
>> > Well it looks like the fragmentation issue gets worse. Is that enough to
>> > consider merging the slab defrag patchset and get some work done on
>> > inodes and dentries to make them movable (or use targeted reclaim)?
>
>> Is there anything to test?
>
> Are you referring to some known issue there, possibly directly related to
> mine? If so, I would be willing to test that patchset, if it makes it into
> the kernel.org sources, or if I'd have to patch it manually.
>
>
>> Well, there are some drivers (mostly out-of-tree) which are high order
>> hungry. You can try to trace all allocations with order > 0 and
>> see who that might be:
>> # mount -t tracefs none /debug/trace/
>> # echo stacktrace > /debug/trace/trace_options
>> # echo "order>0" > /debug/trace/events/kmem/mm_page_alloc/filter
>> # echo 1 > /debug/trace/events/kmem/mm_page_alloc/enable
>> # cat /debug/trace/trace_pipe
>>
>> And later this to disable tracing:
>> # echo 0 > /debug/trace/events/kmem/mm_page_alloc/enable
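
(A possibly useful follow-up, assuming the default mm_page_alloc event
format with its order= field: capture the pipe for a minute, then count
events per order to see which sizes dominate:)

# timeout 60 cat /debug/trace/trace_pipe > /tmp/alloc.trace
# grep -o 'order=[0-9]*' /tmp/alloc.trace | sort | uniq -c | sort -rn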
>
> I just had a major cache-useless situation, with only ~100M/8G in use
> and horrible performance. There you go:
>
> https://nofile.io/f/mmwVedaTFsd
>
> I think mysql shows up most often; regardless of the binary name, this is
> actually mariadb, version 10.1.
>
>> You do not have to drop all caches. echo 2 > /proc/sys/vm/drop_caches
>> should be sufficient to drop metadata only.
>
> That is exactly what I am doing. I already mentioned that echo 1 does not
> make any difference at all; echo 2 is the only way that helps.
> Just 5 minutes after doing that the usage grew to 2GB/10GB and is steadily
> going up, as usual.
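
(To put numbers on that growth, the metadata side can be snapshotted before
and after the drop; SReclaimable is the reclaimable slab, and slabtop from
procps shows which caches hold it, with dentry and *_inode_cache being the
interesting ones here:)

# grep -E '^Cached|Buffers|SReclaimable|SUnreclaim' /proc/meminfo
# slabtop -o -s c | head -n 15
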
Is there anything you can read from these results?

The issue keeps occurring; the latest occurrence was entirely unexpected, in
the morning hours, and caused downtime the whole morning until noon, when I
could check and drop the caches again.

I also switched mariadb from O_DIRECT back to `fsync`, the new default in
their latest release, hoping that this would help, but it did not.
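
(For completeness, a quick way to double-check which flush method is
actually active, assuming the local mysql client can connect with
sufficient privileges:)

# mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb_flush_method';"
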
Before giving up entirely, I'd like to know whether there is any solution
for this. Again, I cannot believe that I am the only one affected; this
*has* to affect anyone with a similar use case, and I do not see what is so
special about mine. This is simply many users with many files; every larger
shared hosting provider should see exactly the same behaviour with the 4.x
kernel branch.