linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Rientjes <rientjes@google.com>
To: Roman Gushchin <guro@fb.com>
Cc: linux-mm@kvack.org, Michal Hocko <mhocko@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Tejun Heo <tj@kernel.org>,
	kernel-team@fb.com, cgroups@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [v5 2/4] mm, oom: cgroup-aware OOM killer
Date: Mon, 14 Aug 2017 15:42:54 -0700 (PDT)	[thread overview]
Message-ID: <alpine.DEB.2.10.1708141532300.63207@chino.kir.corp.google.com> (raw)
In-Reply-To: <20170814183213.12319-3-guro@fb.com>

On Mon, 14 Aug 2017, Roman Gushchin wrote:

> diff --git a/include/linux/oom.h b/include/linux/oom.h
> index 8a266e2be5a6..b7ec3bd441be 100644
> --- a/include/linux/oom.h
> +++ b/include/linux/oom.h
> @@ -39,6 +39,7 @@ struct oom_control {
>  	unsigned long totalpages;
>  	struct task_struct *chosen;
>  	unsigned long chosen_points;
> +	struct mem_cgroup *chosen_memcg;
>  };
>  
>  extern struct mutex oom_lock;
> @@ -79,6 +80,8 @@ extern void oom_killer_enable(void);
>  
>  extern struct task_struct *find_lock_task_mm(struct task_struct *p);
>  
> +extern int oom_evaluate_task(struct task_struct *task, void *arg);
> +
>  /* sysctls */
>  extern int sysctl_oom_dump_tasks;
>  extern int sysctl_oom_kill_allocating_task;
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index df6f63ee95d6..0b81dc55c6ac 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2639,6 +2639,181 @@ static inline bool memcg_has_children(struct mem_cgroup *memcg)
>  	return ret;
>  }
>  
> +static long memcg_oom_badness(struct mem_cgroup *memcg,
> +			      const nodemask_t *nodemask)
> +{
> +	long points = 0;
> +	int nid;
> +
> +	for_each_node_state(nid, N_MEMORY) {
> +		if (nodemask && !node_isset(nid, *nodemask))
> +			continue;
> +
> +		points += mem_cgroup_node_nr_lru_pages(memcg, nid,
> +				LRU_ALL_ANON | BIT(LRU_UNEVICTABLE));
> +	}
> +
> +	points += memcg_page_state(memcg, MEMCG_KERNEL_STACK_KB) /
> +		(PAGE_SIZE / 1024);
> +	points += memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE);
> +	points += memcg_page_state(memcg, MEMCG_SOCK);
> +	points += memcg_page_state(memcg, MEMCG_SWAP);
> +
> +	return points;
> +}

I'm indifferent to the memcg evaluation criteria used to determine which 
memcg should be selected over others with the same priority, others may 
feel differently.

> +
> +static long oom_evaluate_memcg(struct mem_cgroup *memcg,
> +			       const nodemask_t *nodemask)
> +{
> +	struct css_task_iter it;
> +	struct task_struct *task;
> +	int elegible = 0;
> +
> +	css_task_iter_start(&memcg->css, 0, &it);
> +	while ((task = css_task_iter_next(&it))) {
> +		/*
> +		 * If there are no tasks, or all tasks have oom_score_adj set
> +		 * to OOM_SCORE_ADJ_MIN and oom_kill_all_tasks is not set,
> +		 * don't select this memory cgroup.
> +		 */
> +		if (!elegible &&
> +		    (memcg->oom_kill_all_tasks ||
> +		     task->signal->oom_score_adj != OOM_SCORE_ADJ_MIN))
> +			elegible = 1;

I'm curious about the decision made in this conditional and how 
oom_kill_memcg_member() ignores task->signal->oom_score_adj.  It means 
that memory.oom_kill_all_tasks overrides /proc/pid/oom_score_adj if it 
should otherwise be disabled.

It's undocumented in the changelog, but I'm questioning whether it's the 
right decision.  Doesn't it make sense to kill all tasks that are not oom 
disabled, and allow the user to still protect certain processes by their 
/proc/pid/oom_score_adj setting?  Otherwise, there's no way to do that 
protection without a sibling memcg and its own reservation of memory.  I'm 
thinking about a process that governs jobs inside the memcg and if there 
is an oom kill, it wants to do logging and any cleanup necessary before 
exiting itself.  It seems like a powerful combination if coupled with oom 
notification.

Also, s/elegible/eligible/

Otherwise, looks good!

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-08-14 22:42 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-14 18:32 [v5 1/4] mm, oom: refactor the oom_kill_process() function Roman Gushchin
2017-08-14 18:32 ` [v5 0/4] cgroup-aware OOM killer Roman Gushchin
2017-08-14 18:32 ` [v5 2/4] mm, oom: " Roman Gushchin
2017-08-14 22:42   ` David Rientjes [this message]
2017-08-15 12:15     ` Roman Gushchin
2017-08-15 12:20       ` Aleksa Sarai
2017-08-15 12:57         ` Roman Gushchin
2017-08-15 21:47       ` David Rientjes
2017-08-16 15:43         ` Roman Gushchin
2017-08-21  0:50           ` David Rientjes
2017-08-21  9:46             ` Roman Gushchin
2017-08-22 17:03   ` Johannes Weiner
2017-08-23 16:20     ` Roman Gushchin
2017-08-23 17:24       ` Johannes Weiner
2017-08-23 18:04         ` Roman Gushchin
2017-08-23 23:13           ` David Rientjes
2017-08-14 18:32 ` [v5 3/4] mm, oom: introduce oom_priority for memory cgroups Roman Gushchin
2017-08-14 22:44   ` David Rientjes
2017-08-14 18:32 ` [v5 4/4] mm, oom, docs: describe the cgroup-aware OOM killer Roman Gushchin
2017-08-14 22:52   ` David Rientjes
2017-08-15 14:13     ` Roman Gushchin
2017-08-15 20:56       ` David Rientjes
2017-08-16 14:43         ` Roman Gushchin
2017-08-17 12:16         ` Roman Gushchin
2017-08-21  0:41           ` David Rientjes
2017-08-14 22:00 ` [v5 1/4] mm, oom: refactor the oom_kill_process() function David Rientjes
2017-08-22 17:06 ` Johannes Weiner
2017-08-23 12:30   ` Roman Gushchin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.10.1708141532300.63207@chino.kir.corp.google.com \
    --to=rientjes@google.com \
    --cc=cgroups@vger.kernel.org \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@fb.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=tj@kernel.org \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox