linux-mm.kvack.org archive mirror
From: Johannes Weiner <hannes@cmpxchg.org>
To: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@kernel.org>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	David Rientjes <rientjes@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm: oom: deduplicate victim selection code for memcg and global oom
Date: Thu, 21 Jul 2016 08:41:44 -0400
Message-ID: <20160721124144.GB21806@cmpxchg.org>
In-Reply-To: <1467045594-20990-1-git-send-email-vdavydov@virtuozzo.com>

Hi Vladimir,

Sorry for getting to this only now.

On Mon, Jun 27, 2016 at 07:39:54PM +0300, Vladimir Davydov wrote:
> When selecting an oom victim, we use the same heuristic for both memory
> cgroup and global oom. The only difference is the scope of tasks to
> select the victim from. So we could just export an iterator over all
> memcg tasks and keep all oom related logic in oom_kill.c, but instead we
> duplicate pieces of it in memcontrol.c reusing some initially private
> functions of oom_kill.c in order to not duplicate all of it. That looks
> ugly and error prone, because any modification of select_bad_process
> should also be propagated to mem_cgroup_out_of_memory.
> 
> Let's rework this as follows: keep all oom heuristic related code
> private to oom_kill.c and make oom_kill.c use exported memcg functions
> when it's really necessary (like in case of iterating over memcg tasks).

This approach, with the control flow in the OOM code, makes a lot of
sense to me. I think it's particularly useful in preparation for
supporting cgroup-aware OOM killing, where not just individual tasks
but entire cgroups are evaluated and killed as opaque memory units.
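
As a rough userspace illustration of the split described in the quoted
changelog (the memcg side exports only a task iterator, the OOM side keeps
the heuristic private), here is a minimal sketch. Every name in it
(struct task, struct group, group_scan_tasks, badness, select_victim) is a
hypothetical stand-in chosen for this example, not the kernel's types or
the interface the patch adds, and the scoring rule is a placeholder.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's task_struct or mem_cgroup. */
struct task {
        const char *comm;
        unsigned long rss_pages;
};

struct group {
        struct task *tasks;
        size_t nr_tasks;
};

/*
 * "memcg side": the only thing exported is an iterator. It knows how to
 * walk the group's tasks but nothing about OOM policy. A non-zero return
 * from fn stops the walk early and is passed back to the caller.
 */
int group_scan_tasks(struct group *g, int (*fn)(struct task *, void *),
                     void *arg)
{
        for (size_t i = 0; i < g->nr_tasks; i++) {
                int ret = fn(&g->tasks[i], arg);

                if (ret)
                        return ret;
        }
        return 0;
}

/* "oom side": the heuristic and the selection state stay private here. */
struct selection {
        struct task *chosen;
        unsigned long chosen_points;
};

/* Placeholder heuristic (assumption): a larger footprint scores higher. */
static unsigned long badness(const struct task *t)
{
        return t->rss_pages;
}

/* Callback handed to the iterator; remembers the highest-scoring task. */
static int evaluate_task(struct task *t, void *arg)
{
        struct selection *sel = arg;
        unsigned long points = badness(t);

        if (points > sel->chosen_points) {
                sel->chosen = t;
                sel->chosen_points = points;
        }
        return 0; /* keep scanning */
}

/* Returns the highest-scoring task, or NULL if nothing scored above zero. */
struct task *select_victim(struct group *g)
{
        struct selection sel = { .chosen = NULL, .chosen_points = 0 };

        group_scan_tasks(g, evaluate_task, &sel);
        return sel.chosen;
}
```

The point of this shape is that a change to the heuristic touches only the
"oom side" functions; the iterator never needs to learn about it.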

I'm thinking about doing something like the following, which should
work regardless of the cgroup level (root, intermediate, or leaf node)
at which the OOM killer is invoked, and this patch works toward it:

struct oom_victim {
        bool is_memcg;
        union {
                struct task_struct *task;
                struct mem_cgroup *memcg;
        } entity;
        unsigned long badness;
};

oom_evaluate_memcg(oc, memcg, victim)
{
        if (memcg == root) {
                for_each_memcg_process(p, memcg) {
                        badness = oom_badness(oc, memcg, p);
                        if (badness == some_special_value) {
                                ...
                        } else if (badness > victim->badness) {
                                victim->is_memcg = false;
                                victim->entity.task = p;
                                victim->badness = badness;
                        }
                }
        } else {
                badness = 0;
                for_each_memcg_process(p, memcg) {
                        b = oom_badness(oc, memcg, p);
                        if (b == some_special_value)
                                ...
                        else
                                badness += b;
                }
                if (badness > victim->badness) {
                        victim->is_memcg = true;
                        victim->entity.memcg = memcg;
                        victim->badness = badness;
                }
        }
}

oom()
{
        struct oom_victim victim = {
                .badness = 0,
        };

        for_each_mem_cgroup_tree(memcg, oc->memcg)
                oom_evaluate_memcg(oc, memcg, &victim);

        if (!victim.badness && !is_sysrq_oom(oc)) {
                dump_header(oc, NULL);
                panic("Out of memory and no killable processes...\n");
        }

        if (victim.badness != -1) {
                oom_kill_victim(oc, &victim);
                schedule_timeout_killable(1);
        }

        return true;
}
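
For readers who want to experiment with the selection rule, the sketch
above can be modeled in plain userspace C. This is only an approximation
under stated assumptions: struct task and struct group are hypothetical
stand-ins for task_struct and mem_cgroup, each task carries a precomputed
score in place of a call to oom_badness(), the cgroup tree is flattened
into an array, and the special badness values elided with "..." above are
left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for task_struct and mem_cgroup. */
struct task {
        unsigned long points;   /* precomputed score, stands in for oom_badness() */
};

struct group {
        bool is_root;
        struct task *tasks;
        size_t nr_tasks;
};

/* Tagged union: is_memcg says which member of entity is valid. */
struct oom_victim {
        bool is_memcg;
        union {
                struct task *task;
                struct group *memcg;
        } entity;
        unsigned long badness;
};

void evaluate_group(struct group *g, struct oom_victim *victim)
{
        if (g->is_root) {
                /* Tasks in the root group compete individually. */
                for (size_t i = 0; i < g->nr_tasks; i++) {
                        struct task *t = &g->tasks[i];

                        if (t->points > victim->badness) {
                                victim->is_memcg = false;
                                victim->entity.task = t;
                                victim->badness = t->points;
                        }
                }
        } else {
                /* Any other group competes as one unit: the sum of its tasks. */
                unsigned long sum = 0;

                for (size_t i = 0; i < g->nr_tasks; i++)
                        sum += g->tasks[i].points;

                if (sum > victim->badness) {
                        victim->is_memcg = true;
                        victim->entity.memcg = g;
                        victim->badness = sum;
                }
        }
}

/* Flat array in place of the cgroup tree walk (assumption of this model). */
struct oom_victim select_victim(struct group *groups, size_t nr_groups)
{
        struct oom_victim victim = { .badness = 0 };

        for (size_t i = 0; i < nr_groups; i++)
                evaluate_group(&groups[i], &victim);
        return victim;
}
```

In this model, a root group whose largest task scores 70 loses to a
non-root group of three tasks scoring 30 each, because the group is
compared by its total of 90. A badness of 0 in the returned victim means
nothing was selected.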

But even without that, with the unification of two identical control
flows and the privatization of a good amount of oom killer internals,
the patch speaks for itself.
	
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
