linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [patch] mm, memcg: store memcg name for oom kill log consistency
@ 2013-08-29  6:03 David Rientjes
  2013-08-29 13:30 ` Michal Hocko
  0 siblings, 1 reply; 5+ messages in thread
From: David Rientjes @ 2013-08-29  6:03 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, Johannes Weiner, Michal Hocko, cgroups,
	linux-kernel, linux-mm

A shared buffer is currently used for the name of the oom memcg and the
memcg of the killed process.  There is no serialization of memcg oom
kills, so this buffer can easily be overwritten if there is a concurrent
oom kill in another memcg.

This patch stores the names of the memcgs directly in struct mem_cgroup.
This allows it to be printed anytime during the oom kill without fearing
that it will change or become corrupted.  It also ensures that the name
of the memcg that is oom and the memcg of the killed process are the same
even if renamed at the same time.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/memcontrol.c | 49 +++++++++++++++++++++++--------------------------
 1 file changed, 23 insertions(+), 26 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -83,6 +83,8 @@ static int really_do_swap_account __initdata = 0;
 #define do_swap_account		0
 #endif
 
+/* First 128 bytes of memcg name should be unique */
+#define MEM_CGROUP_STORE_NAME_LEN	128
 
 static const char * const mem_cgroup_stat_names[] = {
 	"cache",
@@ -247,6 +249,9 @@ struct mem_cgroup {
 	/* OOM-Killer disable */
 	int		oom_kill_disable;
 
+	/* name of memcg for display purposes only */
+	char		name[MEM_CGROUP_STORE_NAME_LEN];
+
 	/* set when res.limit == memsw.limit */
 	bool		memsw_is_minimum;
 
@@ -1538,27 +1543,22 @@ static void move_unlock_mem_cgroup(struct mem_cgroup *memcg,
  */
 void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
 {
+	struct mem_cgroup *task_memcg;
+	struct mem_cgroup *iter;
 	struct cgroup *task_cgrp;
 	struct cgroup *mem_cgrp;
-	/*
-	 * Need a buffer in BSS, can't rely on allocations. The code relies
-	 * on the assumption that OOM is serialized for memory controller.
-	 * If this assumption is broken, revisit this code.
-	 */
-	static char memcg_name[PATH_MAX];
 	int ret;
-	struct mem_cgroup *iter;
 	unsigned int i;
 
 	if (!p)
 		return;
 
 	rcu_read_lock();
-
-	mem_cgrp = memcg->css.cgroup;
+	task_memcg = mem_cgroup_from_task(p);
 	task_cgrp = task_cgroup(p, mem_cgroup_subsys_id);
+	mem_cgrp = memcg->css.cgroup;
 
-	ret = cgroup_path(task_cgrp, memcg_name, PATH_MAX);
+	ret = cgroup_path(task_cgrp, task_memcg->name, MEM_CGROUP_STORE_NAME_LEN);
 	if (ret < 0) {
 		/*
 		 * Unfortunately, we are unable to convert to a useful name
@@ -1567,24 +1567,20 @@ void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
 		rcu_read_unlock();
 		goto done;
 	}
-	rcu_read_unlock();
 
-	pr_info("Task in %s killed", memcg_name);
+	if (task_memcg != memcg) {
+		ret = cgroup_path(mem_cgrp, memcg->name, MEM_CGROUP_STORE_NAME_LEN);
+		if (ret < 0) {
+			rcu_read_unlock();
+			goto done;
+		}
+	} else
+		strncpy(memcg->name, task_memcg->name, MEM_CGROUP_STORE_NAME_LEN);
 
-	rcu_read_lock();
-	ret = cgroup_path(mem_cgrp, memcg_name, PATH_MAX);
-	if (ret < 0) {
-		rcu_read_unlock();
-		goto done;
-	}
+	pr_info("Task in %s killed as a result of limit of %s\n",
+		task_memcg->name, memcg->name);
 	rcu_read_unlock();
-
-	/*
-	 * Continues from above, so we don't need an KERN_ level
-	 */
-	pr_cont(" as a result of limit of %s\n", memcg_name);
 done:
-
 	pr_info("memory: usage %llukB, limit %llukB, failcnt %llu\n",
 		res_counter_read_u64(&memcg->res, RES_USAGE) >> 10,
 		res_counter_read_u64(&memcg->res, RES_LIMIT) >> 10,
@@ -1602,9 +1598,10 @@ done:
 		pr_info("Memory cgroup stats");
 
 		rcu_read_lock();
-		ret = cgroup_path(iter->css.cgroup, memcg_name, PATH_MAX);
+		ret = cgroup_path(iter->css.cgroup, iter->name,
+				  MEM_CGROUP_STORE_NAME_LEN);
 		if (!ret)
-			pr_cont(" for %s", memcg_name);
+			pr_cont(" for %s", iter->name);
 		rcu_read_unlock();
 		pr_cont(":");
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] mm, memcg: store memcg name for oom kill log consistency
  2013-08-29  6:03 [patch] mm, memcg: store memcg name for oom kill log consistency David Rientjes
@ 2013-08-29 13:30 ` Michal Hocko
  2013-09-05 13:52   ` Michal Hocko
  0 siblings, 1 reply; 5+ messages in thread
From: Michal Hocko @ 2013-08-29 13:30 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, KAMEZAWA Hiroyuki, Johannes Weiner, cgroups,
	linux-kernel, linux-mm

On Wed 28-08-13 23:03:54, David Rientjes wrote:
> A shared buffer is currently used for the name of the oom memcg and the
> memcg of the killed process.  There is no serialization of memcg oom
> kills, so this buffer can easily be overwritten if there is a concurrent
> oom kill in another memcg.

Right.

> This patch stores the names of the memcgs directly in struct mem_cgroup.

I do not like to make every mem_cgroup larger even if it never sees an
OOM.

Wouldn't it be much easier to add a new lock (memcg_oom_info_lock) inside
mem_cgroup_print_oom_info instead? This would have a nice side effect
that parallel memcg oom kill messages wouldn't interleave.

> This allows it to be printed anytime during the oom kill without fearing
> that it will change or become corrupted.  It also ensures that the name
> of the memcg that is oom and the memcg of the killed process are the same
> even if renamed at the same time.
> 
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
>  mm/memcontrol.c | 49 +++++++++++++++++++++++--------------------------
>  1 file changed, 23 insertions(+), 26 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -83,6 +83,8 @@ static int really_do_swap_account __initdata = 0;
>  #define do_swap_account		0
>  #endif
>  
> +/* First 128 bytes of memcg name should be unique */
> +#define MEM_CGROUP_STORE_NAME_LEN	128
>  
>  static const char * const mem_cgroup_stat_names[] = {
>  	"cache",
> @@ -247,6 +249,9 @@ struct mem_cgroup {
>  	/* OOM-Killer disable */
>  	int		oom_kill_disable;
>  
> +	/* name of memcg for display purposes only */
> +	char		name[MEM_CGROUP_STORE_NAME_LEN];
> +
>  	/* set when res.limit == memsw.limit */
>  	bool		memsw_is_minimum;
>  
> @@ -1538,27 +1543,22 @@ static void move_unlock_mem_cgroup(struct mem_cgroup *memcg,
>   */
>  void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
>  {
> +	struct mem_cgroup *task_memcg;
> +	struct mem_cgroup *iter;
>  	struct cgroup *task_cgrp;
>  	struct cgroup *mem_cgrp;
> -	/*
> -	 * Need a buffer in BSS, can't rely on allocations. The code relies
> -	 * on the assumption that OOM is serialized for memory controller.
> -	 * If this assumption is broken, revisit this code.
> -	 */
> -	static char memcg_name[PATH_MAX];
>  	int ret;
> -	struct mem_cgroup *iter;
>  	unsigned int i;
>  
>  	if (!p)
>  		return;
>  
>  	rcu_read_lock();
> -
> -	mem_cgrp = memcg->css.cgroup;
> +	task_memcg = mem_cgroup_from_task(p);
>  	task_cgrp = task_cgroup(p, mem_cgroup_subsys_id);
> +	mem_cgrp = memcg->css.cgroup;
>  
> -	ret = cgroup_path(task_cgrp, memcg_name, PATH_MAX);
> +	ret = cgroup_path(task_cgrp, task_memcg->name, MEM_CGROUP_STORE_NAME_LEN);
>  	if (ret < 0) {
>  		/*
>  		 * Unfortunately, we are unable to convert to a useful name
> @@ -1567,24 +1567,20 @@ void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
>  		rcu_read_unlock();
>  		goto done;
>  	}
> -	rcu_read_unlock();
>  
> -	pr_info("Task in %s killed", memcg_name);
> +	if (task_memcg != memcg) {
> +		ret = cgroup_path(mem_cgrp, memcg->name, MEM_CGROUP_STORE_NAME_LEN);
> +		if (ret < 0) {
> +			rcu_read_unlock();
> +			goto done;
> +		}
> +	} else
> +		strncpy(memcg->name, task_memcg->name, MEM_CGROUP_STORE_NAME_LEN);
>  
> -	rcu_read_lock();
> -	ret = cgroup_path(mem_cgrp, memcg_name, PATH_MAX);
> -	if (ret < 0) {
> -		rcu_read_unlock();
> -		goto done;
> -	}
> +	pr_info("Task in %s killed as a result of limit of %s\n",
> +		task_memcg->name, memcg->name);
>  	rcu_read_unlock();
> -
> -	/*
> -	 * Continues from above, so we don't need an KERN_ level
> -	 */
> -	pr_cont(" as a result of limit of %s\n", memcg_name);
>  done:
> -
>  	pr_info("memory: usage %llukB, limit %llukB, failcnt %llu\n",
>  		res_counter_read_u64(&memcg->res, RES_USAGE) >> 10,
>  		res_counter_read_u64(&memcg->res, RES_LIMIT) >> 10,
> @@ -1602,9 +1598,10 @@ done:
>  		pr_info("Memory cgroup stats");
>  
>  		rcu_read_lock();
> -		ret = cgroup_path(iter->css.cgroup, memcg_name, PATH_MAX);
> +		ret = cgroup_path(iter->css.cgroup, iter->name,
> +				  MEM_CGROUP_STORE_NAME_LEN);
>  		if (!ret)
> -			pr_cont(" for %s", memcg_name);
> +			pr_cont(" for %s", iter->name);
>  		rcu_read_unlock();
>  		pr_cont(":");
>  

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] mm, memcg: store memcg name for oom kill log consistency
  2013-08-29 13:30 ` Michal Hocko
@ 2013-09-05 13:52   ` Michal Hocko
  2013-09-09  9:00     ` David Rientjes
  0 siblings, 1 reply; 5+ messages in thread
From: Michal Hocko @ 2013-09-05 13:52 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, KAMEZAWA Hiroyuki, Johannes Weiner, cgroups,
	linux-kernel, linux-mm

On Thu 29-08-13 15:30:32, Michal Hocko wrote:
> On Wed 28-08-13 23:03:54, David Rientjes wrote:
> > A shared buffer is currently used for the name of the oom memcg and the
> > memcg of the killed process.  There is no serialization of memcg oom
> > kills, so this buffer can easily be overwritten if there is a concurrent
> > oom kill in another memcg.
> 
> Right.
> 
> > This patch stores the names of the memcgs directly in struct mem_cgroup.
> 
> I do not like to make every mem_cgroup larger even if it never sees an
> OOM.
> 
> Wouldn't it be much easier to add a new lock (memcg_oom_info_lock) inside
> mem_cgroup_print_oom_info instead? This would have a nice side effect
> that parallel memcg oom kill messages wouldn't interleave.

What about the following?
---

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] mm, memcg: store memcg name for oom kill log consistency
  2013-09-05 13:52   ` Michal Hocko
@ 2013-09-09  9:00     ` David Rientjes
  2013-09-09 11:13       ` Michal Hocko
  0 siblings, 1 reply; 5+ messages in thread
From: David Rientjes @ 2013-09-09  9:00 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, KAMEZAWA Hiroyuki, Johannes Weiner, cgroups,
	linux-kernel, linux-mm

On Thu, 5 Sep 2013, Michal Hocko wrote:

> From 4cee36f56100f5689fe1ae22f468016ce5a0cbae Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Thu, 5 Sep 2013 15:39:20 +0200
> Subject: [PATCH] memcg, oom: lock mem_cgroup_print_oom_info
> 
> mem_cgroup_print_oom_info uses a static buffer (memcg_name) to store the
> name of the cgroup. This is not safe as pointed out by David Rientjes
> because although memcg oom is locked for its hierarchy nothing prevents
> another parallel hierarchy to trigger oom as well and overwrite the
> already in-use buffer.
> 
> This patch introduces oom_info_lock hidden inside mem_cgroup_print_oom_info
> which is held throughout the function. It make access to memcg_name safe
> and as a bonus it also prevents parallel memcg ooms to interleave their
> statistics which would make the printed data hard to analyze otherwise.
> 
> Using the spinlock is OK here because this path is not hot and
> meaningful data is much more important.
> 
> Reported-by: David Rientjes <rientjes@google.com>

Remove this.

> Signed-off-by: Michal Hocko <mhocko@suse.cz>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] mm, memcg: store memcg name for oom kill log consistency
  2013-09-09  9:00     ` David Rientjes
@ 2013-09-09 11:13       ` Michal Hocko
  0 siblings, 0 replies; 5+ messages in thread
From: Michal Hocko @ 2013-09-09 11:13 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, KAMEZAWA Hiroyuki, Johannes Weiner, cgroups,
	linux-kernel, linux-mm

On Mon 09-09-13 02:00:26, David Rientjes wrote:
> On Thu, 5 Sep 2013, Michal Hocko wrote:
[...]
> > Reported-by: David Rientjes <rientjes@google.com>
> 
> Remove this.

OK. Is there any other way how to give you a credit for
discovering/reporting this issue?

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2013-09-09 11:13 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-29  6:03 [patch] mm, memcg: store memcg name for oom kill log consistency David Rientjes
2013-08-29 13:30 ` Michal Hocko
2013-09-05 13:52   ` Michal Hocko
2013-09-09  9:00     ` David Rientjes
2013-09-09 11:13       ` Michal Hocko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox