linux-mm.kvack.org archive mirror
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: linux-mm@kvack.org, YAMAMOTO Takashi <yamamoto@valinux.co.jp>,
	lizf@cn.fujitsu.com,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Rik van Riel <riel@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 3/5] Memory controller soft limit organize cgroups (v7)
Date: Wed, 25 Mar 2009 13:59:00 +0900
Message-ID: <20090325135900.dc82f133.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20090319165735.27274.96091.sendpatchset@localhost.localdomain>

On Thu, 19 Mar 2009 22:27:35 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> Feature: Organize cgroups over soft limit in a RB-Tree
> 
> From: Balbir Singh <balbir@linux.vnet.ibm.com>
> 
> Changelog v7...v6
> 1. Refactor the check and update logic. The goal is to allow the
>    check logic to be modular, so that it can be revisited in the future
>    if something more appropriate is found to be useful.
> 
> Changelog v6...v5
> 1. Update the key before inserting into RB tree. Without the current change
>    it could take an additional iteration to get the key correct.
> 
> Changelog v5...v4
> 1. res_counter_uncharge has an additional parameter to indicate if the
>    counter was over its soft limit, before uncharge.
> 
> Changelog v4...v3
> 1. Optimizations to ensure we don't unnecessarily get res_counter values
> 2. Fixed a bug in usage of time_after()
> 
> Changelog v3...v2
> 1. Add only the ancestor to the RB-Tree
> 2. Use css_tryget/css_put instead of mem_cgroup_get/mem_cgroup_put
> 
> Changelog v2...v1
> 1. Add support for hierarchies
> 2. The res_counter that is highest in the hierarchy is returned on soft
>    limit being exceeded. Since we do hierarchical reclaim and add all
>    groups exceeding their soft limits, this approach seems to work well
>    in practice.
> 
> This patch introduces a RB-Tree for storing memory cgroups that are over their
> soft limit. The overall goal is to
> 
> 1. Add a memory cgroup to the RB-Tree when the soft limit is exceeded.
>    We are careful about updates, updates take place only after a particular
>    time interval has passed
> 2. We remove the node from the RB-Tree when the usage goes below the soft
>    limit
> 
> The next set of patches will exploit the RB-Tree to get the group that is
> over its soft limit by the largest amount and reclaim from it, when we
> face memory contention.
> 
> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
> ---
> 
>  include/linux/res_counter.h |    6 +-
>  kernel/res_counter.c        |   18 +++++
>  mm/memcontrol.c             |  149 ++++++++++++++++++++++++++++++++++++++-----
>  3 files changed, 151 insertions(+), 22 deletions(-)
> 
> 
> diff --git a/include/linux/res_counter.h b/include/linux/res_counter.h
> index 5c821fd..5bbf8b1 100644
> --- a/include/linux/res_counter.h
> +++ b/include/linux/res_counter.h
> @@ -112,7 +112,8 @@ void res_counter_init(struct res_counter *counter, struct res_counter *parent);
>  int __must_check res_counter_charge_locked(struct res_counter *counter,
>  		unsigned long val);
>  int __must_check res_counter_charge(struct res_counter *counter,
> -		unsigned long val, struct res_counter **limit_fail_at);
> +		unsigned long val, struct res_counter **limit_fail_at,
> +		struct res_counter **soft_limit_at);
>  
>  /*
>   * uncharge - tell that some portion of the resource is released
> @@ -125,7 +126,8 @@ int __must_check res_counter_charge(struct res_counter *counter,
>   */
>  
>  void res_counter_uncharge_locked(struct res_counter *counter, unsigned long val);
> -void res_counter_uncharge(struct res_counter *counter, unsigned long val);
> +void res_counter_uncharge(struct res_counter *counter, unsigned long val,
> +				bool *was_soft_limit_excess);
>  
>  static inline bool res_counter_limit_check_locked(struct res_counter *cnt)
>  {
> diff --git a/kernel/res_counter.c b/kernel/res_counter.c
> index 4e6dafe..51ec438 100644
> --- a/kernel/res_counter.c
> +++ b/kernel/res_counter.c
> @@ -37,17 +37,27 @@ int res_counter_charge_locked(struct res_counter *counter, unsigned long val)
>  }
>  
>  int res_counter_charge(struct res_counter *counter, unsigned long val,
> -			struct res_counter **limit_fail_at)
> +			struct res_counter **limit_fail_at,
> +			struct res_counter **soft_limit_fail_at)
>  {
>  	int ret;
>  	unsigned long flags;
>  	struct res_counter *c, *u;
>  
>  	*limit_fail_at = NULL;
> +	if (soft_limit_fail_at)
> +		*soft_limit_fail_at = NULL;
>  	local_irq_save(flags);
>  	for (c = counter; c != NULL; c = c->parent) {
>  		spin_lock(&c->lock);
>  		ret = res_counter_charge_locked(c, val);
> +		/*
> +		 * With soft limits, we return the highest ancestor
> +		 * that exceeds its soft limit
> +		 */
> +		if (soft_limit_fail_at &&
> +			!res_counter_soft_limit_check_locked(c))
> +			*soft_limit_fail_at = c;
>  		spin_unlock(&c->lock);

I'm not sure whether this works as intended. Could you clarify? (See below.)

    In the following hierarchy:

         A/   soft_limit=1G, usage=1.2G
           B  soft_limit=200M, usage=1G
           C  soft_limit=800M, usage=200M

   this function returns only "A", so memory will be reclaimed from
   both B and C at first, although C is still under its soft limit.
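
To make the question concrete, here is a minimal userspace sketch of the
charge loop applied to the hierarchy above. It is a simplified model, not
the kernel code: struct counter, charge() and reported_for() are
hypothetical names, locking and the hard-limit check are left out, sizes
are in MB (1G taken as 1024, 1.2G as roughly 1229), and "over the soft
limit" is assumed to mean usage > soft_limit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct res_counter (hypothetical, sizes in MB). */
struct counter {
	const char *name;
	unsigned long usage;
	unsigned long soft_limit;
	struct counter *parent;
};

/*
 * Walk from the charged counter up to the root, like the patched
 * res_counter_charge(). Each ancestor that is over its soft limit
 * overwrites the previous match, so the highest one wins.
 */
static struct counter *charge(struct counter *counter, unsigned long val)
{
	struct counter *c, *soft_limit_fail_at = NULL;

	assert(counter != NULL);
	for (c = counter; c != NULL; c = c->parent) {
		c->usage += val;
		if (c->usage > c->soft_limit)	/* assumed comparison */
			soft_limit_fail_at = c;
	}
	return soft_limit_fail_at;
}

/* Build the A/B/C example, charge 1MB to the named child, report the result. */
static const char *reported_for(const char *child)
{
	struct counter a = { "A", 1229, 1024, NULL };
	struct counter b = { "B", 1024, 200, &a };
	struct counter c = { "C", 200, 800, &a };
	struct counter *hit = charge(strcmp(child, "B") == 0 ? &b : &c, 1);

	return hit ? hit->name : "none";
}
```

In this model, reported_for("B") and reported_for("C") both give "A": B
is recorded first when charging through B, but is then overwritten by A,
and C is never recorded at all.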
   


>  		if (ret < 0) {
>  			*limit_fail_at = c;
> @@ -75,7 +85,8 @@ void res_counter_uncharge_locked(struct res_counter *counter, unsigned long val)
>  	counter->usage -= val;
>  }
>  
> -void res_counter_uncharge(struct res_counter *counter, unsigned long val)
> +void res_counter_uncharge(struct res_counter *counter, unsigned long val,
> +				bool *was_soft_limit_excess)
>  {
>  	unsigned long flags;
>  	struct res_counter *c;
> @@ -83,6 +94,9 @@ void res_counter_uncharge(struct res_counter *counter, unsigned long val)
>  	local_irq_save(flags);
>  	for (c = counter; c != NULL; c = c->parent) {
>  		spin_lock(&c->lock);
> +		if (c == counter && was_soft_limit_excess)
> +			*was_soft_limit_excess =
> +				!res_counter_soft_limit_check_locked(c);
>  		res_counter_uncharge_locked(c, val);
>  		spin_unlock(&c->lock);
>  	}
Does this work as intended?
Assume the following hierarchy:

   A/  softlimit=1G usage=300M
     B/ softlimit=200M usage=300M
     C/ softlimit=800M usage=0M

*was_soft_limit_excess will be false, so the tree is never updated.

Hmm?


Thanks,
-Kame




