From: Ying Han <yinghan@google.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>,
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
Andrew Morton <akpm@linux-foundation.org>,
Mel Gorman <mel@csn.ul.ie>, Johannes Weiner <hannes@cmpxchg.org>,
Christoph Lameter <cl@linux.com>,
Wu Fengguang <fengguang.wu@intel.com>,
Andi Kleen <ak@linux.intel.com>, Hugh Dickins <hughd@google.com>,
Rik van Riel <riel@redhat.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Tejun Heo <tj@kernel.org>,
linux-mm@kvack.org
Subject: Re: [PATCH 2/5] Add per cgroup reclaim watermarks.
Date: Tue, 18 Jan 2011 12:02:51 -0800 [thread overview]
Message-ID: <AANLkTimo7c3pwFoQvE140o6uFDOaRvxdq6+r3tQnfuPe@mail.gmail.com> (raw)
In-Reply-To: <20110114091119.2f11b3b9.kamezawa.hiroyu@jp.fujitsu.com>
On Thu, Jan 13, 2011 at 4:11 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> On Thu, 13 Jan 2011 14:00:32 -0800
> Ying Han <yinghan@google.com> wrote:
>
>> The per cgroup kswapd is invoked when the cgroup's free memory (limit - usage)
>> is less than a threshold--low_wmark. Then the kswapd thread starts to reclaim
>> pages in a priority loop similar to global algorithm. The kswapd is done if the
>> free memory is above a threshold--high_wmark.
>>
>> The per cgroup background reclaim is based on the per cgroup LRU and also adds
>> per cgroup watermarks. There are two watermarks including "low_wmark" and
>> "high_wmark", and they are calculated based on the limit_in_bytes(hard_limit)
>> for each cgroup. Each time the hard_limit is changed, the corresponding wmarks
>> are re-calculated. Since memory controller charges only user pages, there is
>> no need for a "min_wmark". The current calculation of wmarks is a function of
>> "memory.min_free_kbytes" which could be adjusted by writing different values
>> into the new api. This is added mainly for debugging purpose.
>>
>> Change log v2...v1:
>> 1. Remove the res_counter_charge on wmark due to performance concern.
>> 2. Move the new APIs min_free_kbytes, reclaim_wmarks into seperate commit.
>> 3. Calculate the min_free_kbytes automatically based on the limit_in_bytes.
>> 4. make the wmark to be consistant with core VM which checks the free pages
>> instead of usage.
>> 5. changed wmark to be boolean
>>
>> Signed-off-by: Ying Han <yinghan@google.com>
>
Thank you KAMEZAWA for your comments.
> Hmm, I don't think using the same algorithm as min_free_kbytes is good.
>
> Why it's bad to have 2 interfaces as low_wmark and high_wmark ?
>
> And in this patch, min_free_kbytes can be [256...65536]...I think this
> '256' is not good because it should be able to be set to '0'.
>
> IIUC, in enterprise systems, there are users who want to keep a fixed amount
> of free memory always. This interface will not allow such use case.
>
> I think we should have 2 interfaces as low_wmark and high_wmark. But as default
> value, the same value as to the alogorithm with min_free_kbytes will make sense.
I agree that "min_free_kbytes" concept doesn't apply well since there
is no notion of "reserved pool" in memcg. I borrowed it at the
beginning is to add a tunable to the per-memcg watermarks besides the
hard_limit. I read the
patch posted from Satoru Moriya "Tunable watermarks", and introducing
the per-memcg-per-watermark tunable
sounds good to me. Might consider adding it to the next post.
>
> BTW, please divide res_counter part and memcg part in the next post.
Will do.
>
> Please explain your handling of 'hierarchy' in description.
I haven't thought through the 'hierarchy' handling in this patchset
which I will probably put more thoughts in the following
posts. Do you have recommendations on handing the 'hierarchy' ?
--Ying
>
> Thanks,
> -Kame
>
>
>> ---
>> include/linux/memcontrol.h | 1 +
>> include/linux/res_counter.h | 83 +++++++++++++++++++++++++++++++++++++++++++
>> kernel/res_counter.c | 6 +++
>> mm/memcontrol.c | 73 +++++++++++++++++++++++++++++++++++++
>> 4 files changed, 163 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
>> index 3433784..80a605f 100644
>> --- a/include/linux/memcontrol.h
>> +++ b/include/linux/memcontrol.h
>> @@ -93,6 +93,7 @@ int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *mem);
>>
>> extern struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page);
>> extern struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p);
>> +extern int mem_cgroup_watermark_ok(struct mem_cgroup *mem, int charge_flags);
>>
>> static inline
>> int mm_match_cgroup(const struct mm_struct *mm, const struct mem_cgroup *cgroup)
>> diff --git a/include/linux/res_counter.h b/include/linux/res_counter.h
>> index fcb9884..10b7e59 100644
>> --- a/include/linux/res_counter.h
>> +++ b/include/linux/res_counter.h
>> @@ -39,6 +39,15 @@ struct res_counter {
>> */
>> unsigned long long soft_limit;
>> /*
>> + * the limit that reclaim triggers. it is the free count
>> + * (limit - usage)
>> + */
>> + unsigned long long low_wmark_limit;
>> + /*
>> + * the limit that reclaim stops. it is the free count
>> + */
>> + unsigned long long high_wmark_limit;
>> + /*
>> * the number of unsuccessful attempts to consume the resource
>> */
>> unsigned long long failcnt;
>> @@ -55,6 +64,9 @@ struct res_counter {
>>
>> #define RESOURCE_MAX (unsigned long long)LLONG_MAX
>>
>> +#define CHARGE_WMARK_LOW 0x02
>> +#define CHARGE_WMARK_HIGH 0x04
>> +
>> /**
>> * Helpers to interact with userspace
>> * res_counter_read_u64() - returns the value of the specified member
next prev parent reply other threads:[~2011-01-18 20:03 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-01-13 22:00 [PATCH 0/5] memcg: per cgroup background reclaim Ying Han
2011-01-13 22:00 ` [PATCH 1/5] Add kswapd descriptor Ying Han
2011-01-13 22:00 ` [PATCH 2/5] Add per cgroup reclaim watermarks Ying Han
2011-01-14 0:11 ` KAMEZAWA Hiroyuki
2011-01-18 20:02 ` Ying Han [this message]
2011-01-18 20:36 ` David Rientjes
2011-01-18 21:10 ` Ying Han
2011-01-19 0:56 ` KAMEZAWA Hiroyuki
2011-01-19 2:38 ` David Rientjes
2011-01-19 2:47 ` KAMEZAWA Hiroyuki
2011-01-19 10:03 ` David Rientjes
2011-01-19 0:44 ` KAMEZAWA Hiroyuki
2011-01-13 22:00 ` [PATCH 3/5] New APIs to adjust per cgroup wmarks Ying Han
2011-01-13 22:00 ` [PATCH 4/5] Per cgroup background reclaim Ying Han
2011-01-14 0:52 ` KAMEZAWA Hiroyuki
2011-01-19 2:12 ` Ying Han
2011-01-13 22:00 ` [PATCH 5/5] Add more per memcg stats Ying Han
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=AANLkTimo7c3pwFoQvE140o6uFDOaRvxdq6+r3tQnfuPe@mail.gmail.com \
--to=yinghan@google.com \
--cc=ak@linux.intel.com \
--cc=akpm@linux-foundation.org \
--cc=balbir@linux.vnet.ibm.com \
--cc=cl@linux.com \
--cc=fengguang.wu@intel.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=kosaki.motohiro@jp.fujitsu.com \
--cc=linux-mm@kvack.org \
--cc=mel@csn.ul.ie \
--cc=nishimura@mxp.nes.nec.co.jp \
--cc=riel@redhat.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox