linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Ying Han <yinghan@google.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mel@csn.ul.ie>, Johannes Weiner <hannes@cmpxchg.org>,
	Christoph Lameter <cl@linux.com>,
	Wu Fengguang <fengguang.wu@intel.com>,
	Andi Kleen <ak@linux.intel.com>, Hugh Dickins <hughd@google.com>,
	Rik van Riel <riel@redhat.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Tejun Heo <tj@kernel.org>,
	linux-mm@kvack.org
Subject: Re: [PATCH 2/5] Add per cgroup reclaim watermarks.
Date: Tue, 18 Jan 2011 12:02:51 -0800	[thread overview]
Message-ID: <AANLkTimo7c3pwFoQvE140o6uFDOaRvxdq6+r3tQnfuPe@mail.gmail.com> (raw)
In-Reply-To: <20110114091119.2f11b3b9.kamezawa.hiroyu@jp.fujitsu.com>

On Thu, Jan 13, 2011 at 4:11 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> On Thu, 13 Jan 2011 14:00:32 -0800
> Ying Han <yinghan@google.com> wrote:
>
>> The per cgroup kswapd is invoked when the cgroup's free memory (limit - usage)
>> is less than a threshold--low_wmark. Then the kswapd thread starts to reclaim
>> pages in a priority loop similar to global algorithm. The kswapd is done if the
>> free memory is above a threshold--high_wmark.
>>
>> The per cgroup background reclaim is based on the per cgroup LRU and also adds
>> per cgroup watermarks. There are two watermarks including "low_wmark" and
>> "high_wmark", and they are calculated based on the limit_in_bytes(hard_limit)
>> for each cgroup. Each time the hard_limit is changed, the corresponding wmarks
>> are re-calculated. Since memory controller charges only user pages, there is
>> no need for a "min_wmark". The current calculation of wmarks is a function of
>> "memory.min_free_kbytes" which could be adjusted by writing different values
>> into the new api. This is added mainly for debugging purpose.
>>
>> Change log v2...v1:
>> 1. Remove the res_counter_charge on wmark due to performance concern.
>> 2. Move the new APIs min_free_kbytes, reclaim_wmarks into seperate commit.
>> 3. Calculate the min_free_kbytes automatically based on the limit_in_bytes.
>> 4. make the wmark to be consistant with core VM which checks the free pages
>> instead of usage.
>> 5. changed wmark to be boolean
>>
>> Signed-off-by: Ying Han <yinghan@google.com>
>

Thank you  KAMEZAWA for your comments.

> Hmm, I don't think using the same algorithm as min_free_kbytes is good.
>
> Why it's bad to have 2 interfaces as low_wmark and high_wmark ?


>
> And in this patch, min_free_kbytes can be [256...65536]...I think this
> '256' is not good because it should be able to be set to '0'.
>
> IIUC, in enterprise systems, there are users who want to keep a fixed amount
> of free memory always. This interface will not allow such use case.

>
> I think we should have 2 interfaces as low_wmark and high_wmark. But as default
> value, the same value as to the alogorithm with min_free_kbytes will make sense.

I agree that "min_free_kbytes" concept doesn't apply well since there
is no notion of "reserved pool" in memcg. I borrowed it at the
beginning is to add a tunable to the per-memcg watermarks besides the
hard_limit. I read the
patch posted from Satoru Moriya "Tunable watermarks", and introducing
the per-memcg-per-watermark tunable
sounds good to me. Might consider adding it to the next post.

>
> BTW, please divide res_counter part and memcg part in the next post.

Will do.

>
> Please explain your handling of 'hierarchy' in description.
I haven't thought through the 'hierarchy' handling in this patchset
which I will probably put more thoughts in the following
posts. Do you have recommendations on handing the 'hierarchy' ?

--Ying

>
> Thanks,
> -Kame
>
>
>> ---
>>  include/linux/memcontrol.h  |    1 +
>>  include/linux/res_counter.h |   83 +++++++++++++++++++++++++++++++++++++++++++
>>  kernel/res_counter.c        |    6 +++
>>  mm/memcontrol.c             |   73 +++++++++++++++++++++++++++++++++++++
>>  4 files changed, 163 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
>> index 3433784..80a605f 100644
>> --- a/include/linux/memcontrol.h
>> +++ b/include/linux/memcontrol.h
>> @@ -93,6 +93,7 @@ int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *mem);
>>
>>  extern struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page);
>>  extern struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p);
>> +extern int mem_cgroup_watermark_ok(struct mem_cgroup *mem, int charge_flags);
>>
>>  static inline
>>  int mm_match_cgroup(const struct mm_struct *mm, const struct mem_cgroup *cgroup)
>> diff --git a/include/linux/res_counter.h b/include/linux/res_counter.h
>> index fcb9884..10b7e59 100644
>> --- a/include/linux/res_counter.h
>> +++ b/include/linux/res_counter.h
>> @@ -39,6 +39,15 @@ struct res_counter {
>>        */
>>       unsigned long long soft_limit;
>>       /*
>> +      * the limit that reclaim triggers. it is the free count
>> +      * (limit - usage)
>> +      */
>> +     unsigned long long low_wmark_limit;
>> +     /*
>> +      * the limit that reclaim stops. it is the free count
>> +      */
>> +     unsigned long long high_wmark_limit;
>> +     /*
>>        * the number of unsuccessful attempts to consume the resource
>>        */
>>       unsigned long long failcnt;
>> @@ -55,6 +64,9 @@ struct res_counter {
>>
>>  #define RESOURCE_MAX (unsigned long long)LLONG_MAX
>>
>> +#define CHARGE_WMARK_LOW     0x02
>> +#define CHARGE_WMARK_HIGH    0x04
>> +
>>  /**
>>   * Helpers to interact with userspace
>>   * res_counter_read_u64() - returns the value of the specified member

  reply	other threads:[~2011-01-18 20:03 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-01-13 22:00 [PATCH 0/5] memcg: per cgroup background reclaim Ying Han
2011-01-13 22:00 ` [PATCH 1/5] Add kswapd descriptor Ying Han
2011-01-13 22:00 ` [PATCH 2/5] Add per cgroup reclaim watermarks Ying Han
2011-01-14  0:11   ` KAMEZAWA Hiroyuki
2011-01-18 20:02     ` Ying Han [this message]
2011-01-18 20:36       ` David Rientjes
2011-01-18 21:10         ` Ying Han
2011-01-19  0:56           ` KAMEZAWA Hiroyuki
2011-01-19  2:38             ` David Rientjes
2011-01-19  2:47               ` KAMEZAWA Hiroyuki
2011-01-19 10:03                 ` David Rientjes
2011-01-19  0:44       ` KAMEZAWA Hiroyuki
2011-01-13 22:00 ` [PATCH 3/5] New APIs to adjust per cgroup wmarks Ying Han
2011-01-13 22:00 ` [PATCH 4/5] Per cgroup background reclaim Ying Han
2011-01-14  0:52   ` KAMEZAWA Hiroyuki
2011-01-19  2:12     ` Ying Han
2011-01-13 22:00 ` [PATCH 5/5] Add more per memcg stats Ying Han

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AANLkTimo7c3pwFoQvE140o6uFDOaRvxdq6+r3tQnfuPe@mail.gmail.com \
    --to=yinghan@google.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=balbir@linux.vnet.ibm.com \
    --cc=cl@linux.com \
    --cc=fengguang.wu@intel.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    --cc=nishimura@mxp.nes.nec.co.jp \
    --cc=riel@redhat.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox