From: Nikolay Borisov <kernel@kyup.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: Linux MM <linux-mm@kvack.org>
Subject: Re: Softlockup during memory allocation
Date: Thu, 24 Nov 2016 15:09:38 +0200
Message-ID: <a655e607-91c5-173c-ec3a-e211df598f92@kyup.com>
In-Reply-To: <20161124121209.GE20668@dhcp22.suse.cz>

On 11/24/2016 02:12 PM, Michal Hocko wrote:
> On Thu 24-11-16 13:45:03, Nikolay Borisov wrote:
> [...]
>> Ok, I think I know what has happened. Inspecting the data structures of
>> the respective cgroup here is what the mem_cgroup_per_zone looks like:
>>
>> zoneinfo[2] = {
>>   lruvec = {{
>>       lists = {
>>         {
>>           next = 0xffffea004f98c660,
>>           prev = 0xffffea0063f6b1a0
>>         },
>>         {
>>           next = 0xffffea0004123120,
>>           prev = 0xffffea002c2e2260
>>         },
>>         {
>>           next = 0xffff8818c37bb360,
>>           prev = 0xffff8818c37bb360
>>         },
>>         {
>>           next = 0xffff8818c37bb370,
>>           prev = 0xffff8818c37bb370
>>         },
>>         {
>>           next = 0xffff8818c37bb380,
>>           prev = 0xffff8818c37bb380
>>         }
>>       },
>>       reclaim_stat = {
>>         recent_rotated = {172969085, 43319509},
>>         recent_scanned = {173112994, 185446658}
>>       },
>>       zone = 0xffff88207fffcf00
>>   }},
>>   lru_size = {159722, 158714, 0, 0, 0},
>> }
>>
>> So this means that there are inactive_anon and active_anon pages only -
>> correct?
>
> yes. at least in this particular zone.
>
>> Since the machine doesn't have any swap this means anon memory
>> has nowhere to go. If I'm interpreting the data correctly then this
>> explains why reclaim makes no progress. If that's the case then I have
>> the following questions:
>>
>> 1. Shouldn't reclaim exit at some point rather than being stuck in
>> reclaim without making further progress?
>
> Reclaim (try_to_free_mem_cgroup_pages) has to go down through all
> priorities before it gets out. We are not doing any proactive checks
> whether there is anything reclaimable, but that alone shouldn't be such
> a big deal because shrink_node_memcg should simply do nothing, since
> get_scan_count will find no pages to scan. So it shouldn't take much
> time to realize there is nothing to reclaim and get back to try_charge,
> which retries a few more times and eventually goes OOM. I do not see how
> we could trigger RCU stalls here. There shouldn't be any long RCU
> critical section on the way, and there are preemption points on the way.
>
>> 2. It seems rather strange that there are no (INACTIVE|ACTIVE)_FILE
>> pages - is this possible?
>
> All of them might be reclaimed already as a result of the memory
> pressure in the memcg. So not all that surprising. But the fact that
> you are hitting the limit means that the anonymous pages saturate your
> hard limit so your memcg seems underprovisioned.
>
>> 3. Why hasn't OOM been activated in order to free up some anonymous memory?
>
> It should eventually. Maybe there still were some reclaimable pages in
> other zones for this memcg.

I just checked all the zones on both nodes (the machines have 2 NUMA
nodes), and there are essentially no reclaimable pages: everything is
anonymous. So the pertinent question is why processes are sleeping in
the reclaim path when there are no pages to free. I also observed the
same behavior on a different node; this time the priority was 0 and the
code still hadn't resorted to OOM. This all seems very strange.
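
For reference, here is a rough sketch of how I am reading the lru_size
array from the dump above. The index order follows the kernel's
enum lru_list (inactive_anon, active_anon, inactive_file, active_file,
unevictable). The summarize() helper and the have_swap flag are just
illustrative names of mine, and treating anon pages as unreclaimable
without swap is a simplification, not what get_scan_count literally does.

```python
# Index order of the kernel's enum lru_list (include/linux/mmzone.h).
LRU_NAMES = [
    "inactive_anon",
    "active_anon",
    "inactive_file",
    "active_file",
    "unevictable",
]

# Values copied from the zoneinfo[2] dump quoted above.
lru_size = [159722, 158714, 0, 0, 0]

# (next, prev) pairs of the five list heads from the same dump.
lists = [
    (0xffffea004f98c660, 0xffffea0063f6b1a0),
    (0xffffea0004123120, 0xffffea002c2e2260),
    (0xffff8818c37bb360, 0xffff8818c37bb360),
    (0xffff8818c37bb370, 0xffff8818c37bb370),
    (0xffff8818c37bb380, 0xffff8818c37bb380),
]


def summarize(lru_size, have_swap):
    """Map per-LRU page counts to names and count reclaimable pages.

    Simplifying assumption (mine): file pages are always reclaim
    candidates, anon pages only when swap is available.
    """
    sizes = dict(zip(LRU_NAMES, lru_size))
    reclaimable = sizes["inactive_file"] + sizes["active_file"]
    if have_swap:
        reclaimable += sizes["inactive_anon"] + sizes["active_anon"]
    return sizes, reclaimable


sizes, reclaimable = summarize(lru_size, have_swap=False)

for name, (nxt, prev) in zip(LRU_NAMES, lists):
    # next == prev is what an empty list_head looks like (both point back
    # at the head), although a single-entry list would look the same, so
    # this is only a consistency check against lru_size.
    maybe_empty = nxt == prev
    print(f"{name:14s} pages={sizes[name]:7d} next==prev: {maybe_empty}")

print("reclaimable pages without swap:", reclaimable)
```

With have_swap=False the reclaimable count comes out as zero for this
zone, which matches what I see in the other zones as well.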