linux-mm.kvack.org archive mirror
* [PATCH RFC v0 0/3] cgroup notifications API and memory thresholds
@ 2009-11-26 16:27 Kirill A. Shutemov
  2009-11-26 16:27 ` [PATCH RFC v0 1/3] cgroup: implement eventfd-based generic API for notifications Kirill A. Shutemov
  2009-11-26 17:02 ` [PATCH RFC v0 0/3] cgroup notifications API and " Daniel Lezcano
  0 siblings, 2 replies; 9+ messages in thread
From: Kirill A. Shutemov @ 2009-11-26 16:27 UTC (permalink / raw)
  To: containers, linux-mm
  Cc: Paul Menage, Li Zefan, Andrew Morton, KAMEZAWA Hiroyuki,
	Balbir Singh, Pavel Emelyanov, linux-kernel, Kirill A. Shutemov

This is my first attempt to implement a cgroup notifications API and
memory thresholds on top of it. The idea for the API was proposed by
Paul Menage.

It lacks some important features and needs more testing, but I want to
publish it as soon as possible to get feedback from the community.

TODO:
 - memory thresholds on root cgroup;
 - memsw support;
 - documentation.

Kirill A. Shutemov (3):
  cgroup: implement eventfd-based generic API for notifications
  res_counter: implement thresholds
  memcg: implement memory thresholds

 include/linux/cgroup.h      |    8 ++
 include/linux/res_counter.h |   44 +++++++++++
 kernel/cgroup.c             |  181 ++++++++++++++++++++++++++++++++++++++++++-
 kernel/res_counter.c        |    4 +
 mm/memcontrol.c             |  149 +++++++++++++++++++++++++++++++++++
 5 files changed, 385 insertions(+), 1 deletions(-)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* [PATCH RFC v0 1/3] cgroup: implement eventfd-based generic API for notifications
  2009-11-26 16:27 [PATCH RFC v0 0/3] cgroup notifications API and memory thresholds Kirill A. Shutemov
@ 2009-11-26 16:27 ` Kirill A. Shutemov
  2009-11-26 16:27   ` [PATCH RFC v0 2/3] res_counter: implement thresholds Kirill A. Shutemov
  2009-11-26 17:02 ` [PATCH RFC v0 0/3] cgroup notifications API and " Daniel Lezcano
  1 sibling, 1 reply; 9+ messages in thread
From: Kirill A. Shutemov @ 2009-11-26 16:27 UTC (permalink / raw)
  To: containers, linux-mm
  Cc: Paul Menage, Li Zefan, Andrew Morton, KAMEZAWA Hiroyuki,
	Balbir Singh, Pavel Emelyanov, linux-kernel, Kirill A. Shutemov

This patch introduces a write-only file "cgroup.event_control" in every
cgroup.

To register a new notification handler you need to:
- create an eventfd;
- open the control file to be monitored. Callbacks register_event() and
  unregister_event() must be defined for the control file;
- write "<event_fd> <control_fd> <args>" to cgroup.event_control.
  Interpretation of args is defined by the control file implementation.

The eventfd will be signaled by the control file implementation or when
the cgroup is removed.

To unregister a notification handler, just close the eventfd.

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>

-- 
1.6.5.3
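
The registration steps described in this message can be sketched from
userspace roughly as follows. This is a minimal illustration, not code
from the patch: the helper names format_event_control() and
cgroup_register_event(), the cgroup_dir and control_name parameters, and
the choice to keep the control file open after registration are all
assumptions made for the example. Only the "<event_fd> <control_fd>
<args>" line format and the cgroup.event_control file name come from
the description above.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Build the "<event_fd> <control_fd> <args>" line.
 * Returns the line length, or -1 if it does not fit in buf.
 */
int format_event_control(char *buf, size_t len, int event_fd,
			 int control_fd, const char *args)
{
	int n;

	assert(buf != NULL && args != NULL);
	n = snprintf(buf, len, "%d %d %s", event_fd, control_fd, args);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Hypothetical helper: register a notification on
 * <cgroup_dir>/<control_name>. On success returns the eventfd and
 * stores the control file descriptor in *control_fd; returns -1 on
 * any failure. The control fd is left open because the RFC does not
 * say whether it may be closed once the event is registered.
 */
int cgroup_register_event(const char *cgroup_dir, const char *control_name,
			  const char *args, int *control_fd)
{
	char path[4096], line[256];
	int efd, cfd = -1, ecfd, n, ret = -1;

	efd = eventfd(0, 0);			/* step 1: create an eventfd */
	if (efd < 0)
		return -1;

	snprintf(path, sizeof(path), "%s/%s", cgroup_dir, control_name);
	cfd = open(path, O_RDONLY);		/* step 2: open the control file */
	if (cfd < 0)
		goto out;

	snprintf(path, sizeof(path), "%s/cgroup.event_control", cgroup_dir);
	ecfd = open(path, O_WRONLY);		/* step 3: write the request */
	if (ecfd >= 0) {
		n = format_event_control(line, sizeof(line), efd, cfd, args);
		if (n > 0 && write(ecfd, line, (size_t)n) == n)
			ret = 0;
		close(ecfd);
	}
	if (ret < 0)
		close(cfd);
out:
	if (ret < 0) {
		close(efd);
		return -1;
	}
	*control_fd = cfd;
	return efd;
}
```

Closing the returned eventfd would then unregister the handler, as the
message states.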


* [PATCH RFC v0 2/3] res_counter: implement thresholds
  2009-11-26 16:27 ` [PATCH RFC v0 1/3] cgroup: implement eventfd-based generic API for notifications Kirill A. Shutemov
@ 2009-11-26 16:27   ` Kirill A. Shutemov
  2009-11-26 16:27     ` [PATCH RFC v0 3/3] memcg: implement memory thresholds Kirill A. Shutemov
  0 siblings, 1 reply; 9+ messages in thread
From: Kirill A. Shutemov @ 2009-11-26 16:27 UTC (permalink / raw)
  To: containers, linux-mm
  Cc: Paul Menage, Li Zefan, Andrew Morton, KAMEZAWA Hiroyuki,
	Balbir Singh, Pavel Emelyanov, linux-kernel, Kirill A. Shutemov

It allows setting up two thresholds: one above the current usage and one
below. The callback threshold_notifier() will be called if a threshold
is crossed.

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>

-- 
1.6.5.3
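
Since the res_counter diff itself is not included in this message, the
following is only a userspace model of the behaviour described: a
counter with one threshold above and one at or below the current usage,
and a notifier invoked when either is crossed. The struct counter type
and the counter_*() and record_crossing() names are hypothetical. The
crossing conditions (usage >= above, usage < below) and the -1 return
for thresholds that do not bracket the current usage are assumptions
inferred from how patch 3/3 calls res_counter_set_thresholds() and
threshold_notifier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a res_counter with thresholds. */
struct counter {
	uint64_t usage;
	uint64_t above;		/* nearest threshold greater than usage */
	uint64_t below;		/* nearest threshold at or below usage */
	void (*threshold_notifier)(struct counter *c, uint64_t usage,
				   uint64_t threshold);
};

/* Arm both thresholds; they must bracket the current usage. */
int counter_set_thresholds(struct counter *c, uint64_t above, uint64_t below)
{
	assert(c != NULL);
	if (below > c->usage || above <= c->usage)
		return -1;
	c->above = above;
	c->below = below;
	return 0;
}

/* Call the notifier if usage has left the [below, above) window. */
static void counter_check(struct counter *c)
{
	if (!c->threshold_notifier)
		return;
	if (c->usage >= c->above)
		c->threshold_notifier(c, c->usage, c->above);
	else if (c->usage < c->below)
		c->threshold_notifier(c, c->usage, c->below);
}

void counter_charge(struct counter *c, uint64_t val)
{
	c->usage += val;
	counter_check(c);
}

void counter_uncharge(struct counter *c, uint64_t val)
{
	c->usage -= (val > c->usage) ? c->usage : val;
	counter_check(c);
}

/* Sample notifier for illustration: records the last crossing. */
uint64_t last_threshold;
int crossings;

void record_crossing(struct counter *c, uint64_t usage, uint64_t threshold)
{
	(void)c;
	(void)usage;
	last_threshold = threshold;
	crossings++;
}
```

In this model the notifier is expected to re-arm the window with
counter_set_thresholds() after each crossing; the sample notifier only
records the event, so the caller has to re-arm explicitly.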


* [PATCH RFC v0 3/3] memcg: implement memory thresholds
  2009-11-26 16:27   ` [PATCH RFC v0 2/3] res_counter: implement thresholds Kirill A. Shutemov
@ 2009-11-26 16:27     ` Kirill A. Shutemov
  2009-11-26 17:03       ` Balbir Singh
  0 siblings, 1 reply; 9+ messages in thread
From: Kirill A. Shutemov @ 2009-11-26 16:27 UTC (permalink / raw)
  To: containers, linux-mm
  Cc: Paul Menage, Li Zefan, Andrew Morton, KAMEZAWA Hiroyuki,
	Balbir Singh, Pavel Emelyanov, linux-kernel, Kirill A. Shutemov

It allows registering multiple memory thresholds and getting
notifications when usage crosses them.

To register a threshold, an application needs to:
- create an eventfd;
- open the file memory.usage_in_bytes of a cgroup;
- write the string "<event_fd> <memory.usage_in_bytes> <threshold>" to
  cgroup.event_control, where <memory.usage_in_bytes> is the file
  descriptor of the opened file.

The application will be notified through the eventfd when memory usage
crosses the threshold in either direction.

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>

-- 
1.6.5.3
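
From the application side, consuming such a notification could look
roughly like the sketch below. The helper names memcg_threshold_line()
and wait_for_threshold() are hypothetical and the threshold is passed
as a plain byte count. The sketch relies only on standard eventfd
semantics (a read returns the accumulated 64-bit counter and resets
it), not on anything specific to this RFC.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Build the line to write to cgroup.event_control:
 * "<event_fd> <fd of memory.usage_in_bytes> <threshold>".
 * Returns the line length, or -1 if it does not fit in buf.
 */
int memcg_threshold_line(char *buf, size_t len, int event_fd,
			 int usage_fd, uint64_t threshold_bytes)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "%d %d %" PRIu64, event_fd, usage_fd,
		     threshold_bytes);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Block until the eventfd is signaled. Returns the number of signals
 * accumulated since the previous read, or 0 on error.
 */
uint64_t wait_for_threshold(int event_fd)
{
	uint64_t count = 0;

	if (read(event_fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return 0;
	return count;
}
```

Because the notification fires on a crossing in either direction, an
application would typically re-read memory.usage_in_bytes after
wait_for_threshold() returns to find out on which side of the threshold
the usage now sits.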


* Re: [PATCH RFC v0 0/3] cgroup notifications API and memory thresholds
  2009-11-26 16:27 [PATCH RFC v0 0/3] cgroup notifications API and memory thresholds Kirill A. Shutemov
  2009-11-26 16:27 ` [PATCH RFC v0 1/3] cgroup: implement eventfd-based generic API for notifications Kirill A. Shutemov
@ 2009-11-26 17:02 ` Daniel Lezcano
  2009-11-26 18:38   ` Kirill A. Shutemov
  1 sibling, 1 reply; 9+ messages in thread
From: Daniel Lezcano @ 2009-11-26 17:02 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: containers, linux-mm, linux-kernel, Paul Menage, Balbir Singh,
	Andrew Morton, Pavel Emelyanov

Kirill A. Shutemov wrote:
> It's my first attempt to implement cgroup notifications API and memory
> thresholds on top of it. The idea of API was proposed by Paul Menage.
>
> It lacks some important features and need more testing, but I want publish
> it as soon as possible to get feedback from community.
>
> TODO:
>  - memory thresholds on root cgroup;
>  - memsw support;
>  - documentation.
>   
Maybe it would be interesting to do that for /cgroup/<name>/tasks as
well, by sending the number of tasks in the cgroup in the event whenever
it changes, so that it is easier to detect the zero-process event and
then remove the cgroup directory, no?


* Re: [PATCH RFC v0 3/3] memcg: implement memory thresholds
  2009-11-26 16:27     ` [PATCH RFC v0 3/3] memcg: implement memory thresholds Kirill A. Shutemov
@ 2009-11-26 17:03       ` Balbir Singh
  2009-11-26 17:11         ` Kirill A. Shutemov
  0 siblings, 1 reply; 9+ messages in thread
From: Balbir Singh @ 2009-11-26 17:03 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: containers, linux-mm, Paul Menage, Li Zefan, Andrew Morton,
	KAMEZAWA Hiroyuki, Pavel Emelyanov, linux-kernel

On Thu, Nov 26, 2009 at 9:57 PM, Kirill A. Shutemov
<kirill@shutemov.name> wrote:
> It allows to register multiple memory thresholds and gets notifications
> when it crosses.
>
> To register a threshold application need:
> - create an eventfd;
> - open file memory.usage_in_bytes of a cgroup
> - write string "<event_fd> <memory.usage_in_bytes> <threshold>" to
>  cgroup.event_control.
>
> Application will be notified through eventfd when memory usage crosses
> threshold in any direction.
>
> Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
>

I don't see the patches attached or inlined in the emails that follow

Balbir


* Re: [PATCH RFC v0 3/3] memcg: implement memory thresholds
  2009-11-26 17:03       ` Balbir Singh
@ 2009-11-26 17:11         ` Kirill A. Shutemov
  0 siblings, 0 replies; 9+ messages in thread
From: Kirill A. Shutemov @ 2009-11-26 17:11 UTC (permalink / raw)
  To: Balbir Singh
  Cc: containers, linux-mm, Paul Menage, Li Zefan, Andrew Morton,
	KAMEZAWA Hiroyuki, Pavel Emelyanov, linux-kernel

On Thu, Nov 26, 2009 at 7:03 PM, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> On Thu, Nov 26, 2009 at 9:57 PM, Kirill A. Shutemov
> <kirill@shutemov.name> wrote:
>> It allows to register multiple memory thresholds and gets notifications
>> when it crosses.
>>
>> To register a threshold application need:
>> - create an eventfd;
>> - open file memory.usage_in_bytes of a cgroup
>> - write string "<event_fd> <memory.usage_in_bytes> <threshold>" to
>>  cgroup.event_control.
>>
>> Application will be notified through eventfd when memory usage crosses
>> threshold in any direction.
>>
>> Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
>>
>
> I don't see the patches attached or inlined in the emails that follow

Sorry. Resent.


* Re: [PATCH RFC v0 0/3] cgroup notifications API and memory thresholds
  2009-11-26 17:02 ` [PATCH RFC v0 0/3] cgroup notifications API and " Daniel Lezcano
@ 2009-11-26 18:38   ` Kirill A. Shutemov
  0 siblings, 0 replies; 9+ messages in thread
From: Kirill A. Shutemov @ 2009-11-26 18:38 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: containers, linux-mm, linux-kernel, Paul Menage, Balbir Singh,
	Andrew Morton, Pavel Emelyanov

On Thu, Nov 26, 2009 at 7:02 PM, Daniel Lezcano <daniel.lezcano@free.fr> wrote:
> Kirill A. Shutemov wrote:
>>
>> It's my first attempt to implement cgroup notifications API and memory
>> thresholds on top of it. The idea of API was proposed by Paul Menage.
>>
>> It lacks some important features and need more testing, but I want publish
>> it as soon as possible to get feedback from community.
>>
>> TODO:
>>  - memory thresholds on root cgroup;
>>  - memsw support;
>>  - documentation.
>>
>
> Maybe it would be interesting to do that for the /cgroup/<name>/tasks by
> sending in the event the number of tasks in the cgroup when it changes, so
> it more easy to detect 0 process event and then remove the cgroup directory,
> no ?

I'll do it later.


* [PATCH RFC v0 3/3] memcg: implement memory thresholds
  2009-11-26 17:11   ` [PATCH RFC v0 2/3] res_counter: implement thresholds Kirill A. Shutemov
@ 2009-11-26 17:11     ` Kirill A. Shutemov
  0 siblings, 0 replies; 9+ messages in thread
From: Kirill A. Shutemov @ 2009-11-26 17:11 UTC (permalink / raw)
  To: containers, linux-mm
  Cc: Paul Menage, Li Zefan, Andrew Morton, KAMEZAWA Hiroyuki,
	Balbir Singh, Pavel Emelyanov, linux-kernel, Kirill A. Shutemov

It allows registering multiple memory thresholds and getting
notifications when usage crosses them.

To register a threshold, an application needs to:
- create an eventfd;
- open the file memory.usage_in_bytes of a cgroup;
- write the string "<event_fd> <memory.usage_in_bytes> <threshold>" to
  cgroup.event_control, where <memory.usage_in_bytes> is the file
  descriptor of the opened file.

The application will be notified through the eventfd when memory usage
crosses the threshold in either direction.

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
---
 mm/memcontrol.c |  149 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 149 insertions(+), 0 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f99f599..af1af0b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6,6 +6,10 @@
  * Copyright 2007 OpenVZ SWsoft Inc
  * Author: Pavel Emelianov <xemul@openvz.org>
  *
+ * Memory thresholds
+ * Copyright (C) 2009 Nokia Corporation
+ * Author: Kirill A. Shutemov
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
  * the Free Software Foundation; either version 2 of the License, or
@@ -38,6 +42,7 @@
 #include <linux/vmalloc.h>
 #include <linux/mm_inline.h>
 #include <linux/page_cgroup.h>
+#include <linux/eventfd.h>
 #include "internal.h"
 
 #include <asm/uaccess.h>
@@ -174,6 +179,12 @@ struct mem_cgroup_tree {
 
 static struct mem_cgroup_tree soft_limit_tree __read_mostly;
 
+struct mem_cgroup_threshold {
+	struct list_head list;
+	struct eventfd_ctx *eventfd;
+	u64 threshold;
+};
+
 /*
  * The memory controller data structure. The memory controller controls both
  * page cache and RSS per cgroup. We would eventually like to provide
@@ -225,6 +236,9 @@ struct mem_cgroup {
 	/* set when res.limit == memsw.limit */
 	bool		memsw_is_minimum;
 
+	struct list_head thresholds;
+	struct mem_cgroup_threshold *current_threshold;
+
 	/*
 	 * statistics. This must be placed at the end of memcg.
 	 */
@@ -2839,12 +2853,119 @@ static int mem_cgroup_swappiness_write(struct cgroup *cgrp, struct cftype *cft,
 	return 0;
 }
 
+static inline void mem_cgroup_set_thresholds(struct res_counter *counter,
+		u64 above, u64 below)
+{
+	BUG_ON(res_counter_set_thresholds(counter, above, below));
+}
+
+static void mem_cgroup_threshold(struct res_counter *counter, u64 usage,
+		u64 threshold)
+{
+	struct mem_cgroup *memcg = container_of(counter,
+			struct mem_cgroup,res);
+	struct mem_cgroup_threshold *above, *below;
+
+	above = below = memcg->current_threshold;
+
+	if (threshold <= usage) {
+		list_for_each_entry_continue(above, &memcg->thresholds,
+				list) {
+			if (above->threshold > usage)
+				break;
+			below = above;
+			eventfd_signal(below->eventfd, 1);
+		}
+	} else {
+		list_for_each_entry_continue_reverse(below,
+				&memcg->thresholds, list) {
+			eventfd_signal(above->eventfd, 1);
+			if (below->threshold <= usage)
+				break;
+			above = below;
+		}
+	}
+
+	mem_cgroup_set_thresholds(&memcg->res, above->threshold,
+			below->threshold);
+	memcg->current_threshold = below;
+}
+
+static void mem_cgroup_invalidate_thresholds(struct cgroup *cgrp)
+{
+	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+	struct mem_cgroup_threshold *tmp, *prev = NULL;
+	u64 usage = memcg->res.usage;
+
+	list_for_each_entry(tmp, &memcg->thresholds, list) {
+		if (tmp->threshold > usage) {
+			BUG_ON(!prev);
+			memcg->current_threshold = prev;
+			break;
+		}
+		prev = tmp;
+	}
+
+	mem_cgroup_set_thresholds(&memcg->res, tmp->threshold,
+			prev->threshold);
+}
+
+static int mem_cgroup_register_event(struct cgroup *cgrp, struct cftype *cft,
+		struct eventfd_ctx *eventfd, const char *args)
+{
+	u64 threshold;
+	struct mem_cgroup_threshold *new, *tmp;
+	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+	int ret;
+
+	/* TODO: Root cgroup is a special case */
+	if (mem_cgroup_is_root(memcg))
+		return -ENOSYS;
+
+	ret = res_counter_memparse_write_strategy(args, &threshold);
+	if (ret)
+		return ret;
+
+	new = kmalloc(sizeof(*new), GFP_KERNEL);
+	if (!new)
+		return -ENOMEM;
+	INIT_LIST_HEAD(&new->list);
+	new->eventfd = eventfd;
+	new->threshold = threshold;
+
+	list_for_each_entry(tmp, &memcg->thresholds, list)
+		if (new->threshold < tmp->threshold) {
+			list_add_tail(&new->list, &tmp->list);
+			break;
+		}
+	mem_cgroup_invalidate_thresholds(cgrp);
+
+	return 0;
+}
+
+static int mem_cgroup_unregister_event(struct cgroup *cgrp, struct cftype *cft,
+		struct eventfd_ctx *eventfd)
+{
+	struct mem_cgroup_threshold *threshold, *tmp;
+	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+
+	list_for_each_entry_safe(threshold, tmp, &memcg->thresholds, list)
+		if (threshold->eventfd == eventfd) {
+			list_del(&threshold->list);
+			kfree(threshold);
+		}
+	mem_cgroup_invalidate_thresholds(cgrp);
+
+	return 0;
+}
 
 static struct cftype mem_cgroup_files[] = {
 	{
 		.name = "usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_MEM, RES_USAGE),
 		.read_u64 = mem_cgroup_read,
+		.register_event = mem_cgroup_register_event,
+		.unregister_event = mem_cgroup_unregister_event,
 	},
 	{
 		.name = "max_usage_in_bytes",
@@ -3080,6 +3201,32 @@ static int mem_cgroup_soft_limit_tree_init(void)
 	return 0;
 }
 
+static int mem_cgroup_thresholds_init(struct mem_cgroup *mem)
+{
+	struct mem_cgroup_threshold *new;
+
+	mem->res.threshold_notifier = mem_cgroup_threshold;
+	INIT_LIST_HEAD(&mem->thresholds);
+
+	new = kmalloc(sizeof(*new), GFP_KERNEL);
+	if (!new)
+		return -ENOMEM;
+	INIT_LIST_HEAD(&new->list);
+	new->threshold = 0ULL;
+	list_add(&new->list, &mem->thresholds);
+
+	mem->current_threshold = new;
+
+	new = kmalloc(sizeof(*new), GFP_KERNEL);
+	if (!new)
+		return -ENOMEM;
+	INIT_LIST_HEAD(&new->list);
+	new->threshold = RESOURCE_MAX;
+	list_add_tail(&new->list, &mem->thresholds);
+
+	return 0;
+}
+
 static struct cgroup_subsys_state * __ref
 mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont)
 {
@@ -3125,6 +3272,8 @@ mem_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont)
 	mem->last_scanned_child = 0;
 	spin_lock_init(&mem->reclaim_param_lock);
 
+	mem_cgroup_thresholds_init(mem);
+
 	if (parent)
 		mem->swappiness = get_swappiness(parent);
 	atomic_set(&mem->refcnt, 1);
-- 
1.6.5.3
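
The list walk in mem_cgroup_threshold() can be modeled in userspace to
make the intended behaviour easier to follow. The sketch below is a
simplification under stated assumptions, not the patch's code: it uses
a sorted array instead of a linked list, records "signals" in a buffer
instead of calling eventfd_signal(), and never signals the two sentinel
entries (0 and the maximum value) that mem_cgroup_thresholds_init()
adds. All names (struct thresholds, thresholds_update(), MAX_SIGNALS)
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SIGNALS 16

struct thresholds {
	const uint64_t *values;	/* sorted ascending, with sentinels:      */
				/* values[0] == 0, values[n-1] == maximum */
	size_t n;
	size_t current;		/* index of the largest value <= usage    */
	uint64_t signaled[MAX_SIGNALS];	/* crossed thresholds, in order   */
	size_t nsignaled;
};

/* Stand-in for eventfd_signal(): remember which threshold fired. */
static void signal_threshold(struct thresholds *t, size_t i)
{
	if (t->nsignaled < MAX_SIGNALS)
		t->signaled[t->nsignaled++] = t->values[i];
}

/* Signal every threshold crossed while moving to the new usage. */
void thresholds_update(struct thresholds *t, uint64_t usage)
{
	assert(t != NULL && t->n >= 2 && t->current < t->n - 1);

	/* Usage grew: walk up while the next real threshold is <= usage. */
	while (t->current + 1 < t->n - 1 &&
	       t->values[t->current + 1] <= usage)
		signal_threshold(t, ++t->current);

	/* Usage shrank: walk down while the current threshold is > usage.
	 * values[0] == 0 is never greater than usage, so this stops there. */
	while (t->values[t->current] > usage)
		signal_threshold(t, t->current--);
}
```

With thresholds at 100, 200 and 300, a jump in usage from 0 to 250
signals 100 and then 200, and a later drop to 50 signals 200 and then
100, which matches the "crosses threshold in any direction" wording of
the commit message.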



Thread overview: 9+ messages
2009-11-26 16:27 [PATCH RFC v0 0/3] cgroup notifications API and memory thresholds Kirill A. Shutemov
2009-11-26 16:27 ` [PATCH RFC v0 1/3] cgroup: implement eventfd-based generic API for notifications Kirill A. Shutemov
2009-11-26 16:27   ` [PATCH RFC v0 2/3] res_counter: implement thresholds Kirill A. Shutemov
2009-11-26 16:27     ` [PATCH RFC v0 3/3] memcg: implement memory thresholds Kirill A. Shutemov
2009-11-26 17:03       ` Balbir Singh
2009-11-26 17:11         ` Kirill A. Shutemov
2009-11-26 17:02 ` [PATCH RFC v0 0/3] cgroup notifications API and " Daniel Lezcano
2009-11-26 18:38   ` Kirill A. Shutemov
2009-11-26 17:11 Kirill A. Shutemov
2009-11-26 17:11 ` [PATCH RFC v0 1/3] cgroup: implement eventfd-based generic API for notifications Kirill A. Shutemov
2009-11-26 17:11   ` [PATCH RFC v0 2/3] res_counter: implement thresholds Kirill A. Shutemov
2009-11-26 17:11     ` [PATCH RFC v0 3/3] memcg: implement memory thresholds Kirill A. Shutemov
