From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ying Han <yinghan@google.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"kosaki.motohiro@jp.fujitsu.com" <kosaki.motohiro@jp.fujitsu.com>,
"balbir@linux.vnet.ibm.com" <balbir@linux.vnet.ibm.com>,
"nishimura@mxp.nes.nec.co.jp" <nishimura@mxp.nes.nec.co.jp>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
Johannes Weiner <jweiner@redhat.com>,
"minchan.kim@gmail.com" <minchan.kim@gmail.com>,
Michal Hocko <mhocko@suse.cz>
Subject: [PATCH 2/7] memcg high watermark interface
Date: Mon, 25 Apr 2011 18:29:53 +0900
Message-ID: <20110425182953.fd33f261.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20110425182529.c7c37bb4.kamezawa.hiroyu@jp.fujitsu.com>
Add memory.high_wmark_distance and memory.reclaim_wmarks APIs per memcg.
The former adjusts the internal low/high wmark calculation, and
the latter exports the current values of the watermarks.
low_wmark is calculated automatically.
$ echo 500m >/dev/cgroup/A/memory.limit_in_bytes
$ cat /dev/cgroup/A/memory.limit_in_bytes
524288000
$ echo 50m >/dev/cgroup/A/memory.high_wmark_distance
$ cat /dev/cgroup/A/memory.reclaim_wmarks
low_wmark 476053504
high_wmark 471859200
Changes since v7 (v8a)
1. removed low_wmark_distance; it is now automatic.
2. added Documentation.
Signed-off-by: Ying Han <yinghan@google.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
Documentation/cgroups/memory.txt | 43 ++++++++++++++++++++++++++++
mm/memcontrol.c | 58 +++++++++++++++++++++++++++++++++++++++
2 files changed, 100 insertions(+), 1 deletion(-)
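
As a rough illustration of the arithmetic in the changelog example, here is a
small userspace model of the write path. It is a sketch under assumptions, not
the kernel code: setup_per_memcg_wmarks() is not part of this patch, so the
rule high_wmark = limit - distance is inferred from the example output
(524288000 - 52428800 = 471859200). struct wmark_model and
model_set_distance() are hypothetical names. low_wmark is left out because its
rule is not given here; in the example it sits 4194304 bytes above high_wmark.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Userspace model only. The formula for high_wmark is an assumption
 * inferred from the changelog example, not taken from the kernel helper.
 */
struct wmark_model {
	uint64_t limit;			/* memory.limit_in_bytes */
	uint64_t high_wmark_distance;	/* memory.high_wmark_distance */
	uint64_t high_wmark;		/* derived: limit - distance */
};

/* Mirrors the range check in mem_cgroup_high_wmark_distance_write(). */
static int model_set_distance(struct wmark_model *m, uint64_t val)
{
	if (val >= m->limit)
		return -EINVAL;	/* distance must be smaller than the limit */

	m->high_wmark_distance = val;
	m->high_wmark = m->limit - val;
	return 0;
}
```

With limit set to 500 MiB, a distance of 50 MiB is accepted, while a distance
equal to the limit is rejected and leaves the previous values untouched.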
Index: memcg/mm/memcontrol.c
===================================================================
--- memcg.orig/mm/memcontrol.c
+++ memcg/mm/memcontrol.c
@@ -4074,6 +4074,40 @@ static int mem_cgroup_swappiness_write(s
return 0;
}
+static u64 mem_cgroup_high_wmark_distance_read(struct cgroup *cgrp,
+					       struct cftype *cft)
+{
+	struct mem_cgroup *memcg = mem_cgroup_from_cont(cgrp);
+
+	return memcg->high_wmark_distance;
+}
+
+static int mem_cgroup_high_wmark_distance_write(struct cgroup *cont,
+						struct cftype *cft,
+						const char *buffer)
+{
+	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
+	unsigned long long val;
+	u64 limit;
+	int ret;
+
+	if (!cont->parent)
+		return -EINVAL;
+
+	ret = res_counter_memparse_write_strategy(buffer, &val);
+	if (ret)
+		return -EINVAL;
+
+	limit = res_counter_read_u64(&memcg->res, RES_LIMIT);
+	if (val >= limit)
+		return -EINVAL;
+
+	memcg->high_wmark_distance = val;
+
+	setup_per_memcg_wmarks(memcg);
+	return 0;
+}
+
static void __mem_cgroup_threshold(struct mem_cgroup *memcg, bool swap)
{
struct mem_cgroup_threshold_ary *t;
@@ -4365,6 +4399,21 @@ static void mem_cgroup_oom_unregister_ev
mutex_unlock(&memcg_oom_mutex);
}
+static int mem_cgroup_wmark_read(struct cgroup *cgrp,
+	struct cftype *cft, struct cgroup_map_cb *cb)
+{
+	struct mem_cgroup *mem = mem_cgroup_from_cont(cgrp);
+	u64 low_wmark, high_wmark;
+
+	low_wmark = res_counter_read_u64(&mem->res, RES_LOW_WMARK_LIMIT);
+	high_wmark = res_counter_read_u64(&mem->res, RES_HIGH_WMARK_LIMIT);
+
+	cb->fill(cb, "low_wmark", low_wmark);
+	cb->fill(cb, "high_wmark", high_wmark);
+
+	return 0;
+}
+
static int mem_cgroup_oom_control_read(struct cgroup *cgrp,
struct cftype *cft, struct cgroup_map_cb *cb)
{
@@ -4468,6 +4517,15 @@ static struct cftype mem_cgroup_files[]
.unregister_event = mem_cgroup_oom_unregister_event,
.private = MEMFILE_PRIVATE(_OOM_TYPE, OOM_CONTROL),
},
+	{
+		.name = "high_wmark_distance",
+		.write_string = mem_cgroup_high_wmark_distance_write,
+		.read_u64 = mem_cgroup_high_wmark_distance_read,
+	},
+	{
+		.name = "reclaim_wmarks",
+		.read_map = mem_cgroup_wmark_read,
+	},
};
#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
Index: memcg/Documentation/cgroups/memory.txt
===================================================================
--- memcg.orig/Documentation/cgroups/memory.txt
+++ memcg/Documentation/cgroups/memory.txt
@@ -68,6 +68,8 @@ Brief summary of control files.
(See sysctl's vm.swappiness)
memory.move_charge_at_immigrate # set/show controls of moving charges
memory.oom_control # set/show oom controls.
+ memory.high_wmark_distance # set/show watermark control
+ memory.reclaim_wmarks # show watermark details.
1. History
@@ -501,6 +503,7 @@ NOTE2: When panic_on_oom is set to "2",
case of an OOM event in any cgroup.
7. Soft limits
+(See also 11. Watermarks.)
Soft limits allow for greater sharing of memory. The idea behind soft limits
is to allow control groups to use as much of the memory as needed, provided
@@ -649,7 +652,45 @@ At reading, current status of OOM is sho
under_oom 0 or 1 (if 1, the memory cgroup is under OOM, tasks may
be stopped.)
-11. TODO
+11. Watermarks
+
+A task incurs heavy overhead when it hits the memory limit because it must
+scan memory and free pages itself. Background memory freeing by the kernel
+helps avoid that. The memory cgroup supports background freeing driven by
+thresholds called watermarks, which can be used for fuzzy memory limiting.
+
+For example, with a 1G limit (taken as 1000M here for simplicity), if you set
+ - high_watermark ....980M
+ - low_watermark ....984M
+the kernel starts freeing memory when usage goes over 984M and continues until
+usage drops to 980M. Of course, this consumes CPU, so the kernel limits
+this work to avoid hogging the CPU.
+
+11.1 memory.high_wmark_distance
+
+This is the interface for high_wmark. It specifies the distance between the
+memory limit and high_watermark. For example, in a memory cgroup with a 1G
+limit,
+ # echo 20M > memory.high_wmark_distance
+sets high_watermark to 980M. low_watermark is determined _automatically_
+because a large gap between the high and low watermarks tends to use too much
+CPU, and it is difficult for users to choose a good low_watermark.
+
+With this, memory usage will be reduced to 980M over time.
+If, after setting memory.high_wmark_distance to 20M, you update
+memory.limit_in_bytes to 2G, high_watermark becomes 1980M.
+
+As another use, assume memory.limit_in_bytes is 1G and you set
+memory.high_wmark_distance to 300M. Memory usage is then kept under 700M
+in a moderate way by background freeing, while the hard limit still caps it
+at 1G.
+
+11.2 memory.reclaim_wmarks
+
+This interface shows high_watermark and low_watermark in bytes. It may be
+useful for comparing usage against the watermarks.
+
+12. TODO
1. Add support for accounting huge pages (as a separate controller)
2. Make per-cgroup scanner reclaim not-shared pages first
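
For comparing usage against the watermarks from userspace, something like the
following sketch may help. It assumes memory.reclaim_wmarks prints the two
"name value" lines shown in the changelog example, and it models the start/stop
behaviour described in section 11 (start above low_watermark, stop at
high_watermark). struct wmarks, parse_reclaim_wmarks() and reclaim_should_run()
are hypothetical helper names, and the real trigger logic lives in later
patches of this series, so treat the second function as a reading of the
documentation rather than the kernel's implementation.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct wmarks {
	uint64_t low;	/* low_wmark in bytes */
	uint64_t high;	/* high_wmark in bytes */
};

/* Parse "low_wmark <bytes>\nhigh_wmark <bytes>\n"; returns 0 on success. */
static int parse_reclaim_wmarks(const char *text, struct wmarks *w)
{
	const char *low = strstr(text, "low_wmark ");
	const char *high = strstr(text, "high_wmark ");

	if (!low || !high)
		return -1;
	if (sscanf(low, "low_wmark %" SCNu64, &w->low) != 1)
		return -1;
	if (sscanf(high, "high_wmark %" SCNu64, &w->high) != 1)
		return -1;
	return 0;
}

/*
 * Hysteresis as described in section 11: background freeing starts once
 * usage exceeds low_wmark and continues until usage is down to high_wmark.
 */
static bool reclaim_should_run(const struct wmarks *w, uint64_t usage,
			       bool running)
{
	if (usage > w->low)
		return true;
	if (usage <= w->high)
		return false;
	return running;	/* between the marks: keep the current state */
}
```

The text passed to parse_reclaim_wmarks() would come from reading
memory.reclaim_wmarks, and the usage value from memory.usage_in_bytes.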