From: Michal Hocko <mhocko@kernel.org>
To: Roman Gushchin <guro@fb.com>
Cc: linux-mm@kvack.org, Vladimir Davydov <vdavydov.dev@gmail.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
David Rientjes <rientjes@google.com>, Tejun Heo <tj@kernel.org>,
kernel-team@fb.com, cgroups@vger.kernel.org,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [v4 2/4] mm, oom: cgroup-aware OOM killer
Date: Tue, 1 Aug 2017 16:54:35 +0200
Message-ID: <20170801145435.GN15774@dhcp22.suse.cz>
In-Reply-To: <20170726132718.14806-3-guro@fb.com>

On Wed 26-07-17 14:27:16, Roman Gushchin wrote:
[...]
> +static long memcg_oom_badness(struct mem_cgroup *memcg,
> +			      const nodemask_t *nodemask)
> +{
> +	long points = 0;
> +	int nid;
> +
> +	for_each_node_state(nid, N_MEMORY) {
> +		if (nodemask && !node_isset(nid, *nodemask))
> +			continue;
> +
> +		points += mem_cgroup_node_nr_lru_pages(memcg, nid,
> +				LRU_ALL_ANON | BIT(LRU_UNEVICTABLE));
> +	}
> +
> +	points += memcg_page_state(memcg, MEMCG_KERNEL_STACK_KB) /
> +		(PAGE_SIZE / 1024);
> +	points += memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE);
> +	points += memcg_page_state(memcg, MEMCG_SOCK);
> +	points += memcg_page_state(memcg, MEMCG_SWAP);
> +
> +	return points;

I am wondering why you are diverging from the global oom_badness
behavior here. Per-NUMA-node accounting sounds like a better idea, but
you then end up mixing it with non-NUMA numbers, and the whole thing
becomes harder to understand without any great advantage.

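To illustrate the mixing, here is a rough userspace model of the scoring above. It is only a sketch: struct memcg_model, model_badness() and the bitmask representation of the node mask are made up for illustration and are not kernel interfaces, and the kernel stack counter is simplified to pages.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4

/* Hypothetical, simplified stand-in for the counters of one memcg. */
struct memcg_model {
	long lru_pages[MAX_NODES];	/* anon + unevictable pages, per node */
	long kernel_stack_pages;	/* memcg-wide, no per-node breakdown */
	long slab_unreclaimable;	/* memcg-wide */
	long sock;			/* memcg-wide */
	long swap;			/* memcg-wide */
};

/*
 * Same structure as the quoted memcg_oom_badness(): only the LRU part
 * honours the node mask (bit n set means node n is allowed, 0 means
 * "no mask"); the remaining counters are always added in full.
 */
static long model_badness(const struct memcg_model *m, unsigned int nodemask)
{
	long points = 0;
	int nid;

	assert(m != NULL);

	for (nid = 0; nid < MAX_NODES; nid++) {
		if (nodemask && !(nodemask & (1u << nid)))
			continue;
		points += m->lru_pages[nid];
	}

	/* These do not depend on nodemask at all. */
	points += m->kernel_stack_pages;
	points += m->slab_unreclaimable;
	points += m->sock;
	points += m->swap;

	return points;
}
```

With such a model, restricting the mask to a single node shrinks only the LRU contribution, so the resulting number is neither a per-node nor a memcg-wide footprint.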
> +static void select_victim_memcg(struct mem_cgroup *root, struct oom_control *oc)
> +{
> +	struct mem_cgroup *iter, *parent;
> +
> +	for_each_mem_cgroup_tree(iter, root) {
> +		if (memcg_has_children(iter)) {
> +			iter->oom_score = 0;
> +			continue;
> +		}
> +
> +		iter->oom_score = oom_evaluate_memcg(iter, oc->nodemask);
> +		if (iter->oom_score == -1) {
> +			oc->chosen_memcg = (void *)-1UL;
> +			mem_cgroup_iter_break(root, iter);
> +			return;
> +		}
> +
> +		if (!iter->oom_score)
> +			continue;
> +
> +		for (parent = parent_mem_cgroup(iter); parent && parent != root;
> +		     parent = parent_mem_cgroup(parent))
> +			parent->oom_score += iter->oom_score;
> +	}
> +
> +	for (;;) {
> +		struct cgroup_subsys_state *css;
> +		struct mem_cgroup *memcg = NULL;
> +		long score = LONG_MIN;
> +
> +		css_for_each_child(css, &root->css) {
> +			struct mem_cgroup *iter = mem_cgroup_from_css(css);
> +
> +			if (iter->oom_score > score) {
> +				memcg = iter;
> +				score = iter->oom_score;
> +			}
> +		}
> +
> +		if (!memcg) {
> +			if (oc->memcg && root == oc->memcg) {
> +				oc->chosen_memcg = oc->memcg;
> +				css_get(&oc->chosen_memcg->css);
> +				oc->chosen_points = oc->memcg->oom_score;
> +			}
> +			break;
> +		}
> +
> +		if (memcg->oom_kill_all_tasks || !memcg_has_children(memcg)) {
> +			oc->chosen_memcg = memcg;
> +			css_get(&oc->chosen_memcg->css);
> +			oc->chosen_points = score;
> +			break;
> +		}
> +
> +		root = memcg;
> +	}
> +}

This and the rest of the victim selection code is really hairy and hard
to follow.

I would rip the oom_kill_process changes out into a separate patch.

> -static void oom_kill_process(struct oom_control *oc, const char *message)
> +static void __oom_kill_process(struct task_struct *victim)

As for the rest of the patch, I have to say I do not quite like how it
is implemented. I was hoping for something much simpler which would hook
into oom_evaluate_task. If a task belongs to a memcg with the kill-all
flag, then we would update the cumulative memcg badness (more
specifically, the badness of the topmost parent with the kill-all flag).
The memcg will then compete with existing self-contained tasks
(oom_badness will have to tell whether the points belong to a task or a
memcg to allow the caller to deal with it). But it shouldn't be much
more complex than that.

Or is there something that I am missing and that would prevent such a
simple approach?
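
To make the idea concrete, a rough userspace model could look like the following. This is only a sketch and not kernel code: struct group_model, struct task_model, struct victim, topmost_kill_all() and evaluate_task() are all hypothetical names, and locking, reference counting and oom_score_adj handling are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg. */
struct group_model {
	struct group_model *parent;
	bool kill_all;		/* models the kill-all flag */
	long badness;		/* cumulative badness of the tasks below */
};

/* Hypothetical stand-in for a task with a precomputed badness. */
struct task_model {
	struct group_model *group;
	long badness;
};

/* Current best candidate: exactly one of task/group is set. */
struct victim {
	const struct task_model *task;
	const struct group_model *group;
	long points;
};

/* Return the topmost ancestor (or g itself) with kill_all set, or NULL. */
static struct group_model *topmost_kill_all(struct group_model *g)
{
	struct group_model *top = NULL;

	for (; g; g = g->parent)
		if (g->kill_all)
			top = g;
	return top;
}

/*
 * Called once per task, in the spirit of oom_evaluate_task(): a task
 * under a kill-all group adds to that group's cumulative badness, and
 * the group competes with self-contained tasks for the top score.
 */
static void evaluate_task(const struct task_model *t, struct victim *best)
{
	struct group_model *g;

	assert(t != NULL && best != NULL);

	g = topmost_kill_all(t->group);
	if (g) {
		g->badness += t->badness;
		if (g->badness > best->points) {
			best->task = NULL;
			best->group = g;
			best->points = g->badness;
		}
	} else if (t->badness > best->points) {
		best->task = t;
		best->group = NULL;
		best->points = t->badness;
	}
}
```

The caller would then look at which of best->task and best->group is set to decide whether to kill a single task or everything in the group.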
--
Michal Hocko
SUSE Labs