From: "azurIt" <azurit@pobox.sk>
To: "Michal Hocko" <mhocko@suse.cz>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, "cgroups mailinglist" <cgroups@vger.kernel.org>
Subject: Re: memory-cgroup bug
Date: Thu, 22 Nov 2012 19:05:26 +0100
Message-ID: <20121122190526.390C7A28@pobox.sk>
In-Reply-To: <20121122152441.GA9609@dhcp22.suse.cz>

>> i'm using memory cgroup for limiting our users and having a really
>> strange problem when a cgroup gets out of its memory limit. It's very
>> strange because it happens only sometimes (about once per week on
>> random user), out of memory is usually handled ok.
>
>What is your memcg configuration? Do you use deeper hierarchies, is
>use_hierarchy enabled? Is the memcg oom (aka memory.oom_control)
>enabled? Do you use soft limit for those groups? Is memcg swap
>accounting enabled and memsw limits in place?
>Is the machine under global memory pressure as well?
>Could you post sysrq+t or sysrq+w?

My cgroups hierarchy:
/cgroups/<user_id>/uid/

where '<user_id>' is the system user id and 'uid' is just the word 'uid'.

Memory limits are set in /cgroups/<user_id>/ and hierarchy is enabled. Processes are inside /cgroups/<user_id>/uid/. I'm using hard limits for memory and swap BUT the system has no swap at all (it has 'only' 16 GB of real RAM). memory.oom_control is set to 'oom_kill_disable 0'. The server has enough free memory when the problem occurs.
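The configuration questions above map onto a handful of files in the memory controller directory. As a rough sketch (not something from this thread), the following Python reads them for one group. The file names are the usual cgroup v1 memory controller knobs; the function name and the idea of passing the directory as a parameter are assumptions made for illustration, and memory.memsw.* only exists when swap accounting is enabled.

```python
from pathlib import Path

# cgroup v1 memory controller files matching the questions asked above.
MEMCG_FILES = [
    "memory.use_hierarchy",
    "memory.oom_control",
    "memory.limit_in_bytes",
    "memory.soft_limit_in_bytes",
    "memory.memsw.limit_in_bytes",  # present only with swap accounting enabled
    "memory.usage_in_bytes",
    "memory.failcnt",
]


def read_memcg_config(cgroup_dir):
    """Return {file name: stripped contents, or None if unreadable}."""
    base = Path(cgroup_dir)
    config = {}
    for name in MEMCG_FILES:
        try:
            config[name] = (base / name).read_text().strip()
        except OSError:
            # Missing file (e.g. no swap accounting) or no permission.
            config[name] = None
    return config


if __name__ == "__main__":
    import sys

    # Directories to inspect are given on the command line.
    for path in sys.argv[1:]:
        print(path)
        for name, value in read_memcg_config(path).items():
            shown = value if value is not None else "(not available)"
            print(f"  {name}: {shown}")
```

With the layout described here it would be pointed at /cgroups/<user_id>/, where the limits are set, and optionally at the uid/ child as well.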
>> This happens when problem occurs:
>> - no new processes can be started for this cgroup
>> - current processes are frozen and taking 100% of CPU
>> - when i try to 'strace' any of current processes, the whole strace
>> freezes until process is killed (strace cannot be terminated by
>> CTRL-c)
>> - problem can be resolved by raising memory limit for cgroup or
>> killing of few processes inside cgroup so some memory is freed
>>
>> I also grabbed the content of /proc/<pid>/stack of frozen process:
>> [<ffffffff8110a9c1>] mem_cgroup_handle_oom+0x241/0x3b0
>> [<ffffffff8110b5ab>] T.1146+0x5ab/0x5c0
>
>Hmm what is this?

I really don't know; I will get the stacks of all frozen processes next time so we can compare them.

>> [<ffffffff8110ba56>] mem_cgroup_charge_common+0x56/0xa0
>> [<ffffffff8110bae5>] mem_cgroup_newpage_charge+0x45/0x50
>> [<ffffffff810ec54e>] do_wp_page+0x14e/0x800
>> [<ffffffff810eda34>] handle_pte_fault+0x264/0x940
>> [<ffffffff810ee248>] handle_mm_fault+0x138/0x260
>> [<ffffffff810270ed>] do_page_fault+0x13d/0x460
>> [<ffffffff815b53ff>] page_fault+0x1f/0x30
>> [<ffffffffffffffff>] 0xffffffffffffffff
>>
>
>How many tasks are hung in mem_cgroup_handle_oom? If there were many
>of them then it'd smell like an issue fixed by 79dfdaccd1d5 (memcg:
>make oom_lock 0 and 1 based rather than counter) and its follow up fix
>23751be00940 (memcg: fix hierarchical oom locking) but you are saying
>that you can reproduce with 3.2 and those went in for 3.1. 2.6.32 would
>make more sense.

Usually at most a few tens of processes, but I will check next time. I had much worse problems in 2.6.32: when the freeze happened, the whole server was affected (I wasn't able to do anything and had to wait until my scripts took care of it and killed apache, so I don't have any detailed info). In 3.2 only the target cgroup is affected.

>> I'm currently using kernel 3.2.34 but i'm having this problem since 2.6.32.
>
>I guess this is a clean vanilla (stable) kernel, right? Are you able to
>reproduce with the latest Linus tree?

Well, no. I'm using, for example, the newest stable grsecurity patch. I'm also using a few of Andrea Righi's cgroup subsystems, but I don't believe these are causing problems:
 - cgroup-uid, which moves processes into cgroups based on UID
 - cgroup-task, which can limit the number of tasks in a cgroup (I already tried disabling this one; it didn't help)

http://www.develer.com/~arighi/linux/patches/

Unfortunately I cannot just install a new and untested kernel version because I'm not able to reproduce this problem at will (it happens randomly in a production environment).

Could it be that OOM cannot start and kill processes because there's no free memory in the cgroup?

Thank you!

azur

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
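The message asks how many tasks are hung in mem_cgroup_handle_oom, and the reply promises the stacks of all frozen processes. One way to collect both in a single pass is sketched below in Python. This is an illustration under stated assumptions, not a tool used in this thread: the function names are made up, the cgroup directory is passed in by the caller, the cgroup v1 'tasks' file is assumed to be present, and reading /proc/<pid>/stack normally requires root.

```python
from pathlib import Path


def read_pids(cgroup_dir):
    """IDs listed in the cgroup's 'tasks' file (thread IDs in cgroup v1)."""
    text = (Path(cgroup_dir) / "tasks").read_text()
    return [int(token) for token in text.split()]


def read_stack(pid, proc_root="/proc"):
    """Kernel stack text of one task, or None if it exited or is unreadable."""
    try:
        return (Path(proc_root) / str(pid) / "stack").read_text()
    except OSError:
        return None


def tasks_in_function(cgroup_dir, symbol="mem_cgroup_handle_oom",
                      proc_root="/proc"):
    """Map task ID -> stack text for tasks whose stack mentions `symbol`."""
    hits = {}
    for pid in read_pids(cgroup_dir):
        stack = read_stack(pid, proc_root)
        if stack is not None and symbol in stack:
            hits[pid] = stack
    return hits


if __name__ == "__main__":
    import sys

    # Each argument is a cgroup directory, e.g. the uid/ directory of a user.
    for cgroup_dir in sys.argv[1:]:
        hung = tasks_in_function(cgroup_dir)
        print(f"{cgroup_dir}: {len(hung)} task(s) in mem_cgroup_handle_oom")
        for pid, stack in sorted(hung.items()):
            print(f"--- {pid} ---")
            print(stack, end="")
```

Keeping the full stack text per task, rather than only a count, makes it possible to compare the traces afterwards, for instance to see whether every hung task arrived through the same page fault path.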