From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"kosaki.motohiro@jp.fujitsu.com" <kosaki.motohiro@jp.fujitsu.com>
Subject: Re: [RFC][PATCH 4/9] soft limit queue and priority
Date: Mon, 6 Apr 2009 16:35:34 +0530
Message-ID: <20090406110534.GJ7082@balbir.in.ibm.com>
In-Reply-To: <20090403171248.df3e1b03.kamezawa.hiroyu@jp.fujitsu.com>
* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-04-03 17:12:48]:
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> Softlimitq. for memcg.
>
> Implements an array of queues to list memcgs; the array index is determined
> by the amount by which memory usage exceeds the soft limit.
>
> While Balbir's one uses an RB-tree and my old one used a per-zone queue
> (with round-robin), this is a mixture of them.
> (I'd like to use rotation of queue in later patches)
>
> Priority is determined as follows.
> Assume unit = total pages/1024. (the code uses different value)
> if excess is...
> < unit, priority = 0,
> < unit*2, priority = 1,
> < unit*2*2, priority = 2,
> ...
> < unit*2^9, priority = 9,
> < unit*2^10, priority = 10, (> 50% of total mem)
>
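To make the bucketing above concrete, here is a minimal userspace sketch of the mapping from excess to priority. It is not the patch code: fls_ul() is a hypothetical portable stand-in for the kernel's fls(), total_pages is passed in explicitly, and the clamp to the last queue is an assumption of this sketch.

```c
#include <assert.h>

#define SLQ_MAXPRIO	11	/* priorities 0..10, as in the patch */
#define SLQ_PRIO_FACTOR	1024	/* unit = total pages / 1024 */

/*
 * Hypothetical stand-in for the kernel's fls(): 1-based position of
 * the highest set bit, or 0 when x == 0.
 */
static int fls_ul(unsigned long x)
{
	int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/* Map an excess (in pages) to a queue index in [0, SLQ_MAXPRIO). */
static int soft_limit_prio(unsigned long excess, unsigned long total_pages)
{
	unsigned long unit = total_pages / SLQ_PRIO_FACTOR;
	int prio;

	assert(unit > 0);	/* sketch only: needs at least 1024 pages */
	prio = fls_ul(excess / unit);
	/* Clamping is an assumption of this sketch, not part of the patch. */
	return prio < SLQ_MAXPRIO ? prio : SLQ_MAXPRIO - 1;
}
```

With total_pages = 1024 * 1024 the unit is 1024 pages, so an excess just under 1024 pages lands in queue 0 and an excess of half of memory lands in queue 10.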
> This patch includes only the queue management part, not the logic for
> selecting from the queue. Some trick will be used to select victims at
> soft limit in an efficient way.
>
> And this provides 2 queues, for anon and file. Insert/Delete on both lists is
> done at once but scans will be independent. (These 2 queues are used later.)
>
> The major difference from Balbir's one, other than the RB-tree, is the
> behavior under hierarchy. This one adds all children to the queue by checking
> hierarchical priority. This helps the per-zone usage check in the
> victim-selection logic.
>
> Changelog: v1->v2
> - fixed comments.
> - change base size to exponent.
> - some micro optimization to reduce code size.
> - considering memory hotplug, recording a value calculated from
> totalram_pages at boot and using it later is bad manners. Fixed it.
> - removed soft_limit_lock (spinlock)
> - added soft_limit_update counter to avoid multiple updates at once.
>
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
> mm/memcontrol.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 117 insertions(+), 1 deletion(-)
>
> Index: softlimit-test2/mm/memcontrol.c
> ===================================================================
> --- softlimit-test2.orig/mm/memcontrol.c
> +++ softlimit-test2/mm/memcontrol.c
> @@ -192,7 +192,14 @@ struct mem_cgroup {
> atomic_t refcnt;
>
> unsigned int swappiness;
> -
> + /*
> + * For soft limit.
> + */
> + int soft_limit_priority;
> + struct list_head soft_limit_list[2];
> +#define SL_ANON (0)
> +#define SL_FILE (1)
Please add comments for these #defines.
> + atomic_t soft_limit_update;
> /*
> * statistics. This must be placed at the end of memcg.
> */
> @@ -938,11 +945,115 @@ static bool mem_cgroup_soft_limit_check(
> return ret;
> }
>
> +/*
> + * Assume "base_amount", and excess = usage - soft limit.
> + *
> + * 0...... if excess < base_amount
> + * 1...... if excess < base_amount * 2
> + * 2...... if excess < base_amount * 2^2
> + * 3.......if excess < base_amount * 2^3
> + * ....
> + * 9.......if excess < base_amount * 2^9
> + * 10 .....if excess < base_amount * 2^10
> + *
> + * base_amount is determined from total pages in the system.
> + */
> +
> +#define SLQ_MAXPRIO (11)
> +static struct {
> + spinlock_t lock;
> + struct list_head queue[SLQ_MAXPRIO][2]; /* 0:anon 1:file */
> +} softlimitq;
> +
> +#define SLQ_PRIO_FACTOR (1024) /* 2^10 */
> +
> +static int __calc_soft_limit_prio(unsigned long excess)
> +{
> + unsigned long factor = totalram_pages /SLQ_PRIO_FACTOR;
I would prefer to use global_lru_pages()
> +
> + return fls(excess/factor);
> +}
> +
> +static int mem_cgroup_soft_limit_prio(struct mem_cgroup *mem)
> +{
> + unsigned long excess, max_excess = 0;
> + struct res_counter *c = &mem->res;
> +
> + do {
> + excess = res_counter_soft_limit_excess(c) >> PAGE_SHIFT;
> + if (max_excess < excess)
> + max_excess = excess;
max_excess = max(max_excess, excess)
> + c = c->parent;
> + } while (c);
> +
> + return __calc_soft_limit_prio(max_excess);
> +}
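The loop above keeps the largest excess found on the path from a group to the root. A minimal userspace sketch of that walk, assuming a simplified, hypothetical struct counter in place of struct res_counter (the field and function names are for illustration only):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct res_counter. */
struct counter {
	unsigned long usage;		/* pages in use */
	unsigned long soft_limit;	/* soft limit, in pages */
	struct counter *parent;		/* NULL at the root */
};

/* Pages above the soft limit, or 0 when usage is within it. */
static unsigned long soft_limit_excess(const struct counter *c)
{
	return c->usage > c->soft_limit ? c->usage - c->soft_limit : 0;
}

/* Largest excess seen on the path from c up to the root. */
static unsigned long max_hier_excess(const struct counter *c)
{
	unsigned long max_excess = 0;

	assert(c != NULL);
	for (; c; c = c->parent) {
		unsigned long excess = soft_limit_excess(c);

		if (excess > max_excess)
			max_excess = excess;
	}
	return max_excess;
}
```

A child that is under its own soft limit still inherits a non-zero value when an ancestor is over its limit, which is what puts all children of an over-limit parent on a queue.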
> +
> +static void __mem_cgroup_requeue(struct mem_cgroup *mem, int prio)
> +{
> + /* enqueue to softlimit queue */
> + int i;
> +
> + spin_lock(&softlimitq.lock);
> + if (prio != mem->soft_limit_priority) {
> + mem->soft_limit_priority = prio;
> + for (i = 0; i < 2; i++) {
> + list_del_init(&mem->soft_limit_list[i]);
> + list_add_tail(&mem->soft_limit_list[i],
> + &softlimitq.queue[prio][i]);
> + }
> + }
> + spin_unlock(&softlimitq.lock);
> +}
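For readers less familiar with the list helpers, the requeue step can be sketched in userspace as below. The list functions are small re-implementations written for this example, struct group and queue[][] are hypothetical stand-ins for struct mem_cgroup and softlimitq, and locking is left out entirely.

```c
#include <assert.h>

struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Unlink e and leave it self-linked, so a later delete is harmless. */
static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);
}

static void list_add_tail(struct list_head *e, struct list_head *head)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

#define NR_PRIO 11
enum { SL_ANON, SL_FILE, NR_SL };

/* Hypothetical stand-in for the soft limit fields of struct mem_cgroup. */
struct group {
	int prio;
	struct list_head link[NR_SL];
};

static struct list_head queue[NR_PRIO][NR_SL];

static void queues_init(void)
{
	int p, i;

	for (p = 0; p < NR_PRIO; p++)
		for (i = 0; i < NR_SL; i++)
			init_list_head(&queue[p][i]);
}

static void group_init(struct group *g)
{
	int i;

	g->prio = 0;
	for (i = 0; i < NR_SL; i++)
		init_list_head(&g->link[i]);
}

/* Move both links of g to the queues for prio (no locking in this sketch). */
static void requeue(struct group *g, int prio)
{
	int i;

	assert(prio >= 0 && prio < NR_PRIO);
	if (prio == g->prio)
		return;
	g->prio = prio;
	for (i = 0; i < NR_SL; i++) {
		list_del_init(&g->link[i]);
		list_add_tail(&g->link[i], &queue[prio][i]);
	}
}
```

Because list_del_init() leaves the entry self-linked, the same code path works both for a group that is already queued and for one that has never been queued.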
> +
> +static void __mem_cgroup_dequeue(struct mem_cgroup *mem)
> +{
> + int i;
> +
> + spin_lock(&softlimitq.lock);
> + for (i = 0; i < 2; i++)
> + list_del_init(&mem->soft_limit_list[i]);
> + spin_unlock(&softlimitq.lock);
> +}
> +
> +static int
> +__mem_cgroup_update_soft_limit_cb(struct mem_cgroup *mem, void *data)
> +{
> + int priority;
> + /* If someone updates, we don't need more */
> + priority = mem_cgroup_soft_limit_prio(mem);
> +
> + if (priority != mem->soft_limit_priority)
> + __mem_cgroup_requeue(mem, priority);
> + return 0;
> +}
> +
> static void mem_cgroup_update_soft_limit(struct mem_cgroup *mem)
> {
> + int priority;
> +
> + /* check status change */
> + priority = mem_cgroup_soft_limit_prio(mem);
> + if (priority != mem->soft_limit_priority &&
> + atomic_inc_return(&mem->soft_limit_update) > 1) {
> + mem_cgroup_walk_tree(mem, NULL,
> + __mem_cgroup_update_soft_limit_cb);
> + atomic_set(&mem->soft_limit_update, 0);
> + }
> return;
> }
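The changelog says the soft_limit_update counter is there to avoid multiple updates at once. One common way to write such a guard is sketched below with C11 atomics in userspace. Note that this is a "first caller wins" form, which is my assumption about the intent and is not the same test as the `> 1` check in the quoted hunk; try_update(), count_walk() and the variable names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int update_in_progress;	/* 0 when nobody is updating */
static int walks;			/* how many times the walk ran */

/* Example callback standing in for the tree walk. */
static void count_walk(void)
{
	walks++;
}

/*
 * Run walk() only if no other caller is already doing so.
 * Returns 1 if this caller did the update, 0 if it was skipped.
 */
static int try_update(void (*walk)(void))
{
	assert(walk != 0);
	/* atomic_fetch_add() returns the value before the increment. */
	if (atomic_fetch_add(&update_in_progress, 1) != 0)
		return 0;	/* someone else is updating; skip */
	walk();
	/* Reset, which also discards increments made by skipped callers. */
	atomic_store(&update_in_progress, 0);
	return 1;
}
```

The trade-off of this form is that a caller who is skipped relies on the running updater (or a later event) to pick up its change.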
>
> +static void softlimitq_init(void)
> +{
> + int i;
> +
> + spin_lock_init(&softlimitq.lock);
> + for (i = 0; i < SLQ_MAXPRIO; i++) {
> + INIT_LIST_HEAD(&softlimitq.queue[i][SL_ANON]);
> + INIT_LIST_HEAD(&softlimitq.queue[i][SL_FILE]);
> + }
> +}
> +
> /*
> * Unlike exported interface, "oom" parameter is added. if oom==true,
> * oom-killer can be invoked.
> @@ -2512,6 +2623,7 @@ mem_cgroup_create(struct cgroup_subsys *
> if (cont->parent == NULL) {
> enable_swap_cgroup();
> parent = NULL;
> + softlimitq_init();
> } else {
> parent = mem_cgroup_from_cont(cont->parent);
> mem->use_hierarchy = parent->use_hierarchy;
> @@ -2532,6 +2644,9 @@ mem_cgroup_create(struct cgroup_subsys *
> res_counter_init(&mem->memsw, NULL);
> }
> mem->last_scanned_child = 0;
> + mem->soft_limit_priority = 0;
> + INIT_LIST_HEAD(&mem->soft_limit_list[SL_ANON]);
> + INIT_LIST_HEAD(&mem->soft_limit_list[SL_FILE]);
> spin_lock_init(&mem->reclaim_param_lock);
>
> if (parent)
> @@ -2556,6 +2671,7 @@ static void mem_cgroup_destroy(struct cg
> {
> struct mem_cgroup *mem = mem_cgroup_from_cont(cont);
>
> + __mem_cgroup_dequeue(mem);
> mem_cgroup_put(mem);
> }
>
>
>
--
Balbir