From: Ying Han <yinghan@google.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: Glauber Costa <glommer@parallels.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
cgroups@vger.kernel.org, devel@openvz.org,
Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>,
kamezawa.hiroyu@jp.fujitsu.com, Christoph Lameter <cl@linux.com>,
David Rientjes <rientjes@google.com>,
Pekka Enberg <penberg@kernel.org>
Subject: Re: [PATCH v2 04/11] kmem accounting basic infrastructure
Date: Wed, 15 Aug 2012 12:50:55 -0700
Message-ID: <CALWz4iwgnqwq5k_zhpsiiwrj8Y=OkCUg7H96khJWPZScSQE=nw@mail.gmail.com>
In-Reply-To: <20120814162144.GC6905@dhcp22.suse.cz>
On Tue, Aug 14, 2012 at 9:21 AM, Michal Hocko <mhocko@suse.cz> wrote:
> On Thu 09-08-12 17:01:12, Glauber Costa wrote:
>> This patch adds the basic infrastructure for the accounting of the slab
>> caches. To control that, the following files are created:
>>
>> * memory.kmem.usage_in_bytes
>> * memory.kmem.limit_in_bytes
>> * memory.kmem.failcnt
>> * memory.kmem.max_usage_in_bytes
>>
>> They have the same meaning as their user memory counterparts. They
>> reflect the state of the "kmem" res_counter.
>>
>> The code is not enabled until a limit is set. This can be tested by the
>> flag "kmem_accounted". This means that after the patch is applied, no
>> behavioral change exists for whoever is still using memcg to control
>> their memory usage.
>>
>> We always account to both user and kernel resource_counters. This
>> effectively means that an independent kernel limit is in place when the
>> limit is set to a lower value than the user memory. An equal or higher
>> value means that the user limit will always hit first, meaning that kmem
>> is effectively unlimited.
>
> Well, it contributes to the user limit so it is not unlimited. It just
> falls under a different limit and it tends to contribute less. This can
> be quite confusing. I am still not sure whether we should mix the two
> things together. If somebody wants to limit the kernel memory he has to
> touch the other limit anyway. Do you have a strong reason to mix the
> user and kernel counters?

The reason to mix the two together is that it is a compromise between
the two use cases we've heard so far. At Google, we only need one limit
that covers both user and kernel memory (u & k), and reclaim kicks in
when the total usage hits that limit.

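To make the "always account to both counters" model from the patch description concrete, here is a minimal userspace sketch. It is not the kernel's res_counter API: struct counter, struct group, counter_try_charge() and charge_kernel() are hypothetical names made up for illustration, and where the real code would start reclaim on a total-limit hit, the sketch only reports which limit was reached.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a res_counter: a usage value and a limit. */
struct counter {
	unsigned long long usage;
	unsigned long long limit;
};

/* Hypothetical stand-in for a memcg with the two counters discussed here. */
struct group {
	struct counter res;	/* user + kernel memory (u & k) */
	struct counter kmem;	/* kernel memory only */
};

enum charge_result {
	CHARGE_OK = 0,
	CHARGE_KMEM_LIMIT,	/* the kernel-only limit was hit */
	CHARGE_TOTAL_LIMIT,	/* the u & k limit was hit; reclaim would start here */
};

/* Add n to the counter unless that would exceed its limit. */
static bool counter_try_charge(struct counter *c, unsigned long long n)
{
	if (c->usage + n > c->limit)
		return false;
	c->usage += n;
	return true;
}

/* Charge a kernel allocation to both counters, rolling back on failure. */
static enum charge_result charge_kernel(struct group *g, unsigned long long n)
{
	if (!counter_try_charge(&g->kmem, n))
		return CHARGE_KMEM_LIMIT;
	if (!counter_try_charge(&g->res, n)) {
		g->kmem.usage -= n;	/* undo the kmem charge */
		return CHARGE_TOTAL_LIMIT;
	}
	return CHARGE_OK;
}
```

With kmem.limit below res.limit the kernel limit acts independently; with kmem.limit equal to or above res.limit, only CHARGE_TOTAL_LIMIT can ever be returned, which is the single-limit behavior described above.
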
> My impression was that kernel allocation should simply fail while user
> allocations might reclaim as well. Why should we reclaim just because of
> the kernel allocation (which is unreclaimable from hard limit reclaim
> point of view)?

Some kernel objects are reclaimable if we have a per-memcg shrinker.

> I also think that the whole thing would get much simpler if those two
> are split. Anyway if this is really a must then this should be
> documented here.

What is the use case on your end?

--Ying

> One nit below.
>
>> People who want to track kernel memory but not limit it, can set this
>> limit to a very high number (like RESOURCE_MAX - 1page - that no one
>> will ever hit, or equal to the user memory)
>>
>> Signed-off-by: Glauber Costa <glommer@parallels.com>
>> CC: Michal Hocko <mhocko@suse.cz>
>> CC: Johannes Weiner <hannes@cmpxchg.org>
>> Reviewed-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>> ---
>> mm/memcontrol.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 68 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index b0e29f4..54e93de 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
> [...]
>> @@ -4046,8 +4059,23 @@ static int mem_cgroup_write(struct cgroup *cont, struct cftype *cft,
>>  			break;
>>  		if (type == _MEM)
>>  			ret = mem_cgroup_resize_limit(memcg, val);
>> -		else
>> +		else if (type == _MEMSWAP)
>>  			ret = mem_cgroup_resize_memsw_limit(memcg, val);
>> +		else if (type == _KMEM) {
>> +			ret = res_counter_set_limit(&memcg->kmem, val);
>> +			if (ret)
>> +				break;
>> +			/*
>> +			 * Once enabled, can't be disabled. We could in theory
>> +			 * disable it if we haven't yet created any caches, or
>> +			 * if we can shrink them all to death.
>> +			 *
>> +			 * But it is not worth the trouble
>> +			 */
>> +			if (!memcg->kmem_accounted && val != RESOURCE_MAX)
>> +				memcg->kmem_accounted = true;
>> +		} else
>> +			return -EINVAL;
>>  		break;
>
> This doesn't check for the hierarchy so kmem_accounted might not be in
> sync with its parents. mem_cgroup_create (below) needs to copy
> kmem_accounted down from the parent and the above needs to check if this
> is a similar dance like mem_cgroup_oom_control_write.
>
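One way to read the suggestion above is sketched below in plain userspace C. This is only an illustration under assumptions, not the patch's code: struct group, group_init() and group_enable_kmem() are hypothetical names, the child count stands in for the cgroup children list, and the refusal rule is modeled loosely on a flag that belongs to the top of a sub-hierarchy.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct mem_cgroup. */
struct group {
	struct group *parent;
	bool use_hierarchy;
	bool kmem_accounted;
	int nr_children;
};

/* On creation, copy the hierarchy state down from the parent. */
static void group_init(struct group *g, struct group *parent)
{
	g->parent = parent;
	g->use_hierarchy = parent ? parent->use_hierarchy : false;
	g->nr_children = 0;
	/* Inherit kmem_accounted only when the parent is hierarchical. */
	g->kmem_accounted = parent && parent->use_hierarchy &&
			    parent->kmem_accounted;
	if (parent)
		parent->nr_children++;
}

/*
 * Enable kmem accounting. Returns 0 on success, -1 when the change
 * would leave the group out of sync with its hierarchy.
 */
static int group_enable_kmem(struct group *g)
{
	if (g->kmem_accounted)
		return 0;	/* already on; it cannot be turned off */
	if (g->parent && g->parent->use_hierarchy)
		return -1;	/* the state is owned by the hierarchy's top */
	if (g->use_hierarchy && g->nr_children > 0)
		return -1;	/* existing children would miss the change */
	g->kmem_accounted = true;
	return 0;
}
```

In this sketch the flag can only be set on the top of a hierarchy before it has children, so every child created afterwards starts in sync with its parent.
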
> [...]
>
>> @@ -5033,6 +5098,7 @@ mem_cgroup_create(struct cgroup *cont)
>>  	if (parent && parent->use_hierarchy) {
>>  		res_counter_init(&memcg->res, &parent->res);
>>  		res_counter_init(&memcg->memsw, &parent->memsw);
>> +		res_counter_init(&memcg->kmem, &parent->kmem);
>>  		/*
>>  		 * We increment refcnt of the parent to ensure that we can
>>  		 * safely access it on res_counter_charge/uncharge.
>> @@ -5043,6 +5109,7 @@ mem_cgroup_create(struct cgroup *cont)
>>  	} else {
>>  		res_counter_init(&memcg->res, NULL);
>>  		res_counter_init(&memcg->memsw, NULL);
>> +		res_counter_init(&memcg->kmem, NULL);
>>  	}
>>  	memcg->last_scanned_node = MAX_NUMNODES;
>>  	INIT_LIST_HEAD(&memcg->oom_notify);
>> --
>> 1.7.11.2
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe cgroups" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> --
> Michal Hocko
> SUSE Labs
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>