From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Paul Menage <menage@google.com>
Cc: Vladislav Buzov <vbuzov@embeddedalley.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Linux Containers Mailing List
<containers@lists.linux-foundation.org>,
Linux memory management list <linux-mm@kvack.org>,
Dan Malek <dan@embeddedalley.com>,
Andrew Morton <akpm@linux-foundation.org>,
Balbir Singh <balbir@linux.vnet.ibm.com>
Subject: Re: [PATCH 1/2] Resource usage threshold notification addition to res_counter (v3)
Date: Tue, 14 Jul 2009 09:47:29 +0900
Message-ID: <20090714094729.45d4dff4.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <6599ad830907131736w4397d336xad733f274c812690@mail.gmail.com>
On Mon, 13 Jul 2009 17:36:40 -0700
Paul Menage <menage@google.com> wrote:
> As I mentioned in another thread, I think that associating the
> threshold with the res_counter rather than with each individual waiter
> is a mistake, since it creates global state and makes it hard to have
> multiple waiters on the same cgroup.
>
Ah, hmm... maybe yes.
But the problem is "hierarchy" (even if this usage notifier doesn't handle it).
When we charge along a res_counter hierarchy as follows,
  res_counter_A + PAGE_SIZE
  res_counter_B + PAGE_SIZE
  res_counter_C + PAGE_SIZE
checking "where we exceed" in a smart way is not very easy. Balbir's soft limit does
a similar check, but I don't think it is very smart, either.
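To make the difficulty concrete, here is a minimal user-space sketch, not the kernel code. It assumes a simplified struct counter with a parent pointer (a hypothetical stand-in for struct res_counter, without locking) and a hypothetical helper charge_hierarchy() that charges every level and reports the first level found at or past its threshold. The threshold test itself is the one from the quoted patch.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct res_counter (no locking). */
struct counter {
	unsigned long long usage;
	unsigned long long limit;
	unsigned long long threshold;	/* distance from limit that triggers a notify */
	struct counter *parent;		/* NULL at the top of the hierarchy */
};

/* Same test as in the quoted patch: true while usage is still below the mark. */
static int under_threshold(const struct counter *c)
{
	return c->usage + c->threshold < c->limit;
}

/*
 * Charge val to c and to every ancestor.
 * Returns 0 on success, or -1 (charging nothing) if any level would
 * exceed its limit. On success, *crossed points at the first level,
 * walking from c upward, that is at or past its threshold, or is NULL.
 */
static int charge_hierarchy(struct counter *c, unsigned long long val,
			    struct counter **crossed)
{
	struct counter *it;

	*crossed = NULL;

	/* First pass: refuse the charge if any level would go over its limit. */
	for (it = c; it; it = it->parent)
		if (it->usage + val > it->limit)
			return -1;

	/* Second pass: charge each level and remember where we crossed. */
	for (it = c; it; it = it->parent) {
		it->usage += val;
		if (!*crossed && !under_threshold(it))
			*crossed = it;
	}
	return 0;
}
```

Even this simple version has to visit every ancestor on each charge just to answer "where did we cross?", and it only reports one level per charge.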
If there are plural thresholds (notifier, softlimit, etc...), this is worth
trying. Hmm... if not, the size of res_counter exceeds 128 bytes and we'll see a terrible counter.
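As a rough illustration only, plural thresholds could live outside the counter so that the counter itself grows by a single pointer. Everything below is an assumption for the sake of the sketch: struct waiter, add_waiter() and charge() are hypothetical names, there is no locking, and each threshold keeps the meaning used in the quoted patch (distance between limit and usage).

```c
#include <stddef.h>

/* Hypothetical per-waiter state, kept outside the counter itself. */
struct waiter {
	unsigned long long threshold;	/* fire when limit - usage <= threshold */
	int fired;			/* set once this waiter has been notified */
	struct waiter *next;
};

/* Hypothetical, simplified counter: one pointer covers any number of waiters. */
struct counter {
	unsigned long long usage;
	unsigned long long limit;
	struct waiter *waiters;
};

/* Register w on c with its own threshold. */
static void add_waiter(struct counter *c, struct waiter *w,
		       unsigned long long threshold)
{
	w->threshold = threshold;
	w->fired = 0;
	w->next = c->waiters;
	c->waiters = w;
}

/*
 * Charge val to c. Returns -1 if the limit would be exceeded, otherwise
 * the number of waiters whose threshold was reached by this charge.
 */
static int charge(struct counter *c, unsigned long long val)
{
	struct waiter *w;
	int notified = 0;

	if (c->usage + val > c->limit)
		return -1;
	c->usage += val;

	for (w = c->waiters; w; w = w->next) {
		if (!w->fired && c->usage + w->threshold >= c->limit) {
			w->fired = 1;
			notified++;
		}
	}
	return notified;
}
```

The price is a list walk on every charge, so this trades counter size for work in the charge path.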
Any ideas?
Thanks,
-Kame
> Paul
>
> On Mon, Jul 13, 2009 at 5:16 PM, Vladislav Buzov <vbuzov@embeddedalley.com> wrote:
> > This patch updates the Resource Counter to add a configurable resource usage
> > threshold notification mechanism.
> >
> > Signed-off-by: Vladislav Buzov <vbuzov@embeddedalley.com>
> > Signed-off-by: Dan Malek <dan@embeddedalley.com>
> > ---
> >  Documentation/cgroups/resource_counter.txt |   21 ++++++++-
> >  include/linux/res_counter.h                |   69 ++++++++++++++++++++++++++++
> >  kernel/res_counter.c                       |    7 +++
> >  3 files changed, 95 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/cgroups/resource_counter.txt b/Documentation/cgroups/resource_counter.txt
> > index 95b24d7..1369dff 100644
> > --- a/Documentation/cgroups/resource_counter.txt
> > +++ b/Documentation/cgroups/resource_counter.txt
> > @@ -39,7 +39,20 @@ to work with it.
> >         The failcnt stands for "failures counter". This is the number of
> >         resource allocation attempts that failed.
> >
> > - c. spinlock_t lock
> > + e. unsigned long long threshold
> > +
> > +       The resource usage threshold to notify the resouce controller. This is
> > +       the minimal difference between the resource limit and current usage
> > +       to fire a notification.
> > +
> > + f. void (*threshold_notifier)(struct res_counter *counter)
> > +
> > +       The threshold notification callback installed by the resource
> > +       controller. Called when the usage reaches or exceeds the threshold.
> > +       Should be fast and not sleep because called when interrupts are
> > +       disabled.
> > +
> > + g. spinlock_t lock
> >
> >         Protects changes of the above values.
> >
> > @@ -140,6 +153,7 @@ counter fields. They are recommended to adhere to the following rules:
> >         usage           usage_in_<unit_of_measurement>
> >         max_usage       max_usage_in_<unit_of_measurement>
> >         limit           limit_in_<unit_of_measurement>
> > +       threshold       notify_threshold_in_<unit_of_measurement>
> >         failcnt         failcnt
> >         lock            no file :)
> >
> > @@ -153,9 +167,12 @@ counter fields. They are recommended to adhere to the following rules:
> >         usage           prohibited
> >         max_usage       reset to usage
> >         limit           set the limit
> > +       threshold       set the threshold
> >         failcnt         reset to zero
> >
> > -
> > + d. Notification is enabled by installing the threshold notifier callback. It
> > +    is up to the resouce controller to communicate the notification to user
> > +    space tasks.
> >
> >  5. Usage example
> >
> > diff --git a/include/linux/res_counter.h b/include/linux/res_counter.h
> > index 511f42f..5ec98d7 100644
> > --- a/include/linux/res_counter.h
> > +++ b/include/linux/res_counter.h
> > @@ -9,6 +9,11 @@
> >   *
> >   * Author: Pavel Emelianov <xemul@openvz.org>
> >   *
> > + * Resouce usage threshold notification update
> > + * Copyright 2009 CE Linux Forum and Embedded Alley Solutions, Inc.
> > + * Author: Dan Malek <dan@embeddedalley.com>
> > + * Author: Vladislav Buzov <vbuzov@embeddedalley.com>
> > + *
> >   * See Documentation/cgroups/resource_counter.txt for more
> >   * info about what this counter is.
> >   */
> > @@ -35,6 +40,19 @@ struct res_counter {
> >          */
> >         unsigned long long limit;
> >         /*
> > +        * the resource usage threshold to notify the resouce controller. This
> > +        * is the minimal difference between the resource limit and current
> > +        * usage to fire a notification.
> > +        */
> > +       unsigned long long threshold;
> > +       /*
> > +        * the threshold notification callback installed by the resource
> > +        * controller. Called when the usage reaches or exceeds the threshold.
> > +        * Should be fast and not sleep because called when interrupts are
> > +        * disabled.
> > +        */
> > +       void (*threshold_notifier)(struct res_counter *counter);
> > +       /*
> >          * the number of unsuccessful attempts to consume the resource
> >          */
> >         unsigned long long failcnt;
> > @@ -87,6 +105,7 @@ enum {
> >         RES_MAX_USAGE,
> >         RES_LIMIT,
> >         RES_FAILCNT,
> > +       RES_THRESHOLD,
> >  };
> >
> >  /*
> > @@ -132,6 +151,21 @@ static inline bool res_counter_limit_check_locked(struct res_counter *cnt)
> >         return false;
> >  }
> >
> > +static inline bool res_counter_threshold_check_locked(struct res_counter *cnt)
> > +{
> > +       if (cnt->usage + cnt->threshold < cnt->limit)
> > +               return true;
> > +
> > +       return false;
> > +}
> > +
> > +static inline void res_counter_threshold_notify_locked(struct res_counter *cnt)
> > +{
> > +       if (!res_counter_threshold_check_locked(cnt) &&
> > +           cnt->threshold_notifier)
> > +               cnt->threshold_notifier(cnt);
> > +}
> > +
> >  /*
> >   * Helper function to detect if the cgroup is within it's limit or
> >   * not. It's currently called from cgroup_rss_prepare()
> > @@ -147,6 +181,21 @@ static inline bool res_counter_check_under_limit(struct res_counter *cnt)
> >         return ret;
> >  }
> >
> > +/*
> > + * Helper function to detect if the cgroup usage is under it's threshold or
> > + * not.
> > + */
> > +static inline bool res_counter_check_under_threshold(struct res_counter *cnt)
> > +{
> > +       bool ret;
> > +       unsigned long flags;
> > +
> > +       spin_lock_irqsave(&cnt->lock, flags);
> > +       ret = res_counter_threshold_check_locked(cnt);
> > +       spin_unlock_irqrestore(&cnt->lock, flags);
> > +       return ret;
> > +}
> > +
> >  static inline void res_counter_reset_max(struct res_counter *cnt)
> >  {
> >         unsigned long flags;
> > @@ -174,6 +223,26 @@ static inline int res_counter_set_limit(struct res_counter *cnt,
> >         spin_lock_irqsave(&cnt->lock, flags);
> >         if (cnt->usage <= limit) {
> >                 cnt->limit = limit;
> > +               if (limit <= cnt->threshold)
> > +                       cnt->threshold = 0;
> > +               else
> > +                       res_counter_threshold_notify_locked(cnt);
> > +               ret = 0;
> > +       }
> > +       spin_unlock_irqrestore(&cnt->lock, flags);
> > +       return ret;
> > +}
> > +
> > +static inline int res_counter_set_threshold(struct res_counter *cnt,
> > +               unsigned long long threshold)
> > +{
> > +       unsigned long flags;
> > +       int ret = -EINVAL;
> > +
> > +       spin_lock_irqsave(&cnt->lock, flags);
> > +       if (cnt->limit > threshold) {
> > +               cnt->threshold = threshold;
> > +               res_counter_threshold_notify_locked(cnt);
> >                 ret = 0;
> >         }
> >         spin_unlock_irqrestore(&cnt->lock, flags);
> > diff --git a/kernel/res_counter.c b/kernel/res_counter.c
> > index e1338f0..9b36748 100644
> > --- a/kernel/res_counter.c
> > +++ b/kernel/res_counter.c
> > @@ -5,6 +5,10 @@
> >   *
> >   * Author: Pavel Emelianov <xemul@openvz.org>
> >   *
> > + * Resouce usage threshold notification update
> > + * Copyright 2009 CE Linux Forum and Embedded Alley Solutions, Inc.
> > + * Author: Dan Malek <dan@embeddedalley.com>
> > + * Author: Vladislav Buzov <vbuzov@embeddedalley.com>
> >   */
> >
> >  #include <linux/types.h>
> > @@ -32,6 +36,7 @@ int res_counter_charge_locked(struct res_counter *counter, unsigned long val)
> >         counter->usage += val;
> >         if (counter->usage > counter->max_usage)
> >                 counter->max_usage = counter->usage;
> > +       res_counter_threshold_notify_locked(counter);
> >         return 0;
> >  }
> >
> > @@ -101,6 +106,8 @@ res_counter_member(struct res_counter *counter, int member)
> >                 return &counter->limit;
> >         case RES_FAILCNT:
> >                 return &counter->failcnt;
> > +       case RES_THRESHOLD:
> > +               return &counter->threshold;
> >         };
> >
> >         BUG();
> > --
> > 1.5.6.3
> >
> >
>