* [RFC] Reduce the resource counter lock overhead
@ 2009-06-24 17:05 Balbir Singh
2009-06-24 19:40 ` Paul Menage
2009-06-24 23:10 ` Andrew Morton
0 siblings, 2 replies; 14+ messages in thread
From: Balbir Singh @ 2009-06-24 17:05 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki, nishimura; +Cc: Andrew Morton, menage, xemul, linux-mm, lizf
Hi, All,
I've been experimenting with reduction of resource counter locking
overhead. My benchmarks show a marginal improvement; /proc/lock_stat,
however, shows that the lock contention time and hold time are reduced
by quite an amount after this patch.
Before the patch, I see
lock_stat version 0.3
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
class name con-bounces contentions
waittime-min waittime-max waittime-total acq-bounces
acquisitions holdtime-min holdtime-max holdtime-total
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
&counter->lock: 1534627 1575341
0.57 18.39 675713.23 43330446 138524248
0.43 148.13 54133607.05
--------------
&counter->lock 809559
[<ffffffff810810c5>] res_counter_charge+0x3f/0xed
&counter->lock 765782
[<ffffffff81081045>] res_counter_uncharge+0x2c/0x6d
--------------
&counter->lock 653284
[<ffffffff81081045>] res_counter_uncharge+0x2c/0x6d
&counter->lock 922057
[<ffffffff810810c5>] res_counter_charge+0x3f/0xed
After the patch I see
lock_stat version 0.3
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
class name con-bounces contentions
waittime-min waittime-max waittime-total acq-bounces
acquisitions holdtime-min holdtime-max holdtime-total
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
&(&counter->lock)->lock: 962193 976349
0.60 14.07 465926.04 21364165 66041988
0.45 88.31 25395513.12
-----------------------
&(&counter->lock)->lock 495468
[<ffffffff8108106e>] res_counter_uncharge+0x2c/0x77
&(&counter->lock)->lock 480881
[<ffffffff810810f7>] res_counter_charge+0x3e/0xfb
-----------------------
&(&counter->lock)->lock 564419
[<ffffffff810810f7>] res_counter_charge+0x3e/0xfb
&(&counter->lock)->lock 411930
[<ffffffff8108106e>] res_counter_uncharge+0x2c/0x77
Please review and comment on the usefulness of this approach. I do have
another approach in mind for reducing res_counter lock overhead, but
this one seems the most straightforward.
Feature: Change locking of res_counter
From: Balbir Singh <balbir@linux.vnet.ibm.com>
Resource Counters today use spin_lock_irq* variants for locking.
This patch converts the lock to a seqlock_t
---
include/linux/res_counter.h | 24 +++++++++++++-----------
kernel/res_counter.c | 18 +++++++++---------
2 files changed, 22 insertions(+), 20 deletions(-)
diff --git a/include/linux/res_counter.h b/include/linux/res_counter.h
index 511f42f..4c61757 100644
--- a/include/linux/res_counter.h
+++ b/include/linux/res_counter.h
@@ -14,6 +14,7 @@
*/
#include <linux/cgroup.h>
+#include <linux/seqlock.h>
/*
* The core object. the cgroup that wishes to account for some
@@ -42,7 +43,7 @@ struct res_counter {
* the lock to protect all of the above.
* the routines below consider this to be IRQ-safe
*/
- spinlock_t lock;
+ seqlock_t lock;
/*
* Parent counter, used for hierarchial resource accounting
*/
@@ -139,11 +140,12 @@ static inline bool res_counter_limit_check_locked(struct res_counter *cnt)
static inline bool res_counter_check_under_limit(struct res_counter *cnt)
{
bool ret;
- unsigned long flags;
+ unsigned long flags, seq;
- spin_lock_irqsave(&cnt->lock, flags);
- ret = res_counter_limit_check_locked(cnt);
- spin_unlock_irqrestore(&cnt->lock, flags);
+ do {
+ seq = read_seqbegin_irqsave(&cnt->lock, flags);
+ ret = res_counter_limit_check_locked(cnt);
+ } while (read_seqretry_irqrestore(&cnt->lock, seq, flags));
return ret;
}
@@ -151,18 +153,18 @@ static inline void res_counter_reset_max(struct res_counter *cnt)
{
unsigned long flags;
- spin_lock_irqsave(&cnt->lock, flags);
+ write_seqlock_irqsave(&cnt->lock, flags);
cnt->max_usage = cnt->usage;
- spin_unlock_irqrestore(&cnt->lock, flags);
+ write_sequnlock_irqrestore(&cnt->lock, flags);
}
static inline void res_counter_reset_failcnt(struct res_counter *cnt)
{
unsigned long flags;
- spin_lock_irqsave(&cnt->lock, flags);
+ write_seqlock_irqsave(&cnt->lock, flags);
cnt->failcnt = 0;
- spin_unlock_irqrestore(&cnt->lock, flags);
+ write_sequnlock_irqrestore(&cnt->lock, flags);
}
static inline int res_counter_set_limit(struct res_counter *cnt,
@@ -171,12 +173,12 @@ static inline int res_counter_set_limit(struct res_counter *cnt,
unsigned long flags;
int ret = -EBUSY;
- spin_lock_irqsave(&cnt->lock, flags);
+ write_seqlock_irqsave(&cnt->lock, flags);
if (cnt->usage <= limit) {
cnt->limit = limit;
ret = 0;
}
- spin_unlock_irqrestore(&cnt->lock, flags);
+ write_sequnlock_irqrestore(&cnt->lock, flags);
return ret;
}
diff --git a/kernel/res_counter.c b/kernel/res_counter.c
index e1338f0..9830c00 100644
--- a/kernel/res_counter.c
+++ b/kernel/res_counter.c
@@ -17,7 +17,7 @@
void res_counter_init(struct res_counter *counter, struct res_counter *parent)
{
- spin_lock_init(&counter->lock);
+ seqlock_init(&counter->lock);
counter->limit = RESOURCE_MAX;
counter->parent = parent;
}
@@ -45,9 +45,9 @@ int res_counter_charge(struct res_counter *counter, unsigned long val,
*limit_fail_at = NULL;
local_irq_save(flags);
for (c = counter; c != NULL; c = c->parent) {
- spin_lock(&c->lock);
+ write_seqlock(&c->lock);
ret = res_counter_charge_locked(c, val);
- spin_unlock(&c->lock);
+ write_sequnlock(&c->lock);
if (ret < 0) {
*limit_fail_at = c;
goto undo;
@@ -57,9 +57,9 @@ int res_counter_charge(struct res_counter *counter, unsigned long val,
goto done;
undo:
for (u = counter; u != c; u = u->parent) {
- spin_lock(&u->lock);
+ write_seqlock(&u->lock);
res_counter_uncharge_locked(u, val);
- spin_unlock(&u->lock);
+ write_sequnlock(&u->lock);
}
done:
local_irq_restore(flags);
@@ -81,9 +81,9 @@ void res_counter_uncharge(struct res_counter *counter, unsigned long val)
local_irq_save(flags);
for (c = counter; c != NULL; c = c->parent) {
- spin_lock(&c->lock);
+ write_seqlock(&c->lock);
res_counter_uncharge_locked(c, val);
- spin_unlock(&c->lock);
+ write_sequnlock(&c->lock);
}
local_irq_restore(flags);
}
@@ -167,9 +167,9 @@ int res_counter_write(struct res_counter *counter, int member,
if (*end != '\0')
return -EINVAL;
}
- spin_lock_irqsave(&counter->lock, flags);
+ write_seqlock_irqsave(&counter->lock, flags);
val = res_counter_member(counter, member);
*val = tmp;
- spin_unlock_irqrestore(&counter->lock, flags);
+ write_sequnlock_irqrestore(&counter->lock, flags);
return 0;
}
--
Balbir
* Re: [RFC] Reduce the resource counter lock overhead
2009-06-24 17:05 [RFC] Reduce the resource counter lock overhead Balbir Singh
@ 2009-06-24 19:40 ` Paul Menage
2009-06-24 23:10 ` Andrew Morton
1 sibling, 0 replies; 14+ messages in thread
From: Paul Menage @ 2009-06-24 19:40 UTC (permalink / raw)
To: balbir; +Cc: KAMEZAWA Hiroyuki, nishimura, Andrew Morton, xemul, linux-mm, lizf
Looks like a sensible change.
Paul
On Wed, Jun 24, 2009 at 10:05 AM, Balbir Singh<balbir@linux.vnet.ibm.com> wrote:
> Hi, All,
>
> I've been experimenting with reduction of resource counter locking
> overhead. My benchmarks show a marginal improvement, /proc/lock_stat
> however shows that the lock contention time and held time reduce
> by quite an amount after this patch.
>
> Before the patch, I see
>
> lock_stat version 0.3
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> class name con-bounces contentions
> waittime-min waittime-max waittime-total acq-bounces
> acquisitions holdtime-min holdtime-max holdtime-total
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> &counter->lock: 1534627 1575341
> 0.57 18.39 675713.23 43330446 138524248
> 0.43 148.13 54133607.05
> --------------
> &counter->lock 809559
> [<ffffffff810810c5>] res_counter_charge+0x3f/0xed
> &counter->lock 765782
> [<ffffffff81081045>] res_counter_uncharge+0x2c/0x6d
> --------------
> &counter->lock 653284
> [<ffffffff81081045>] res_counter_uncharge+0x2c/0x6d
> &counter->lock 922057
> [<ffffffff810810c5>] res_counter_charge+0x3f/0xed
>
>
> After the patch I see
>
> lock_stat version 0.3
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> class name con-bounces contentions
> waittime-min waittime-max waittime-total acq-bounces
> acquisitions holdtime-min holdtime-max holdtime-total
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> &(&counter->lock)->lock: 962193 976349
> 0.60 14.07 465926.04 21364165 66041988
> 0.45 88.31 25395513.12
> -----------------------
> &(&counter->lock)->lock 495468
> [<ffffffff8108106e>] res_counter_uncharge+0x2c/0x77
> &(&counter->lock)->lock 480881
> [<ffffffff810810f7>] res_counter_charge+0x3e/0xfb
> -----------------------
> &(&counter->lock)->lock 564419
> [<ffffffff810810f7>] res_counter_charge+0x3e/0xfb
> &(&counter->lock)->lock 411930
> [<ffffffff8108106e>] res_counter_uncharge+0x2c/0x77
>
> Please review, comment on the usefulness of this approach. I do have
> another approach in mind for reducing res_counter lock overhead, but
> this one seems the most straight forward
>
>
> Feature: Change locking of res_counter
>
> From: Balbir Singh <balbir@linux.vnet.ibm.com>
>
> Resource Counters today use spin_lock_irq* variants for locking.
> This patch converts the lock to a seqlock_t
> ---
>
> include/linux/res_counter.h | 24 +++++++++++++-----------
> kernel/res_counter.c | 18 +++++++++---------
> 2 files changed, 22 insertions(+), 20 deletions(-)
>
>
> diff --git a/include/linux/res_counter.h b/include/linux/res_counter.h
> index 511f42f..4c61757 100644
> --- a/include/linux/res_counter.h
> +++ b/include/linux/res_counter.h
> @@ -14,6 +14,7 @@
> */
>
> #include <linux/cgroup.h>
> +#include <linux/seqlock.h>
>
> /*
> * The core object. the cgroup that wishes to account for some
> @@ -42,7 +43,7 @@ struct res_counter {
> * the lock to protect all of the above.
> * the routines below consider this to be IRQ-safe
> */
> - spinlock_t lock;
> + seqlock_t lock;
> /*
> * Parent counter, used for hierarchial resource accounting
> */
> @@ -139,11 +140,12 @@ static inline bool res_counter_limit_check_locked(struct res_counter *cnt)
> static inline bool res_counter_check_under_limit(struct res_counter *cnt)
> {
> bool ret;
> - unsigned long flags;
> + unsigned long flags, seq;
>
> - spin_lock_irqsave(&cnt->lock, flags);
> - ret = res_counter_limit_check_locked(cnt);
> - spin_unlock_irqrestore(&cnt->lock, flags);
> + do {
> + seq = read_seqbegin_irqsave(&cnt->lock, flags);
> + ret = res_counter_limit_check_locked(cnt);
> + } while (read_seqretry_irqrestore(&cnt->lock, seq, flags));
> return ret;
> }
>
> @@ -151,18 +153,18 @@ static inline void res_counter_reset_max(struct res_counter *cnt)
> {
> unsigned long flags;
>
> - spin_lock_irqsave(&cnt->lock, flags);
> + write_seqlock_irqsave(&cnt->lock, flags);
> cnt->max_usage = cnt->usage;
> - spin_unlock_irqrestore(&cnt->lock, flags);
> + write_sequnlock_irqrestore(&cnt->lock, flags);
> }
>
> static inline void res_counter_reset_failcnt(struct res_counter *cnt)
> {
> unsigned long flags;
>
> - spin_lock_irqsave(&cnt->lock, flags);
> + write_seqlock_irqsave(&cnt->lock, flags);
> cnt->failcnt = 0;
> - spin_unlock_irqrestore(&cnt->lock, flags);
> + write_sequnlock_irqrestore(&cnt->lock, flags);
> }
>
> static inline int res_counter_set_limit(struct res_counter *cnt,
> @@ -171,12 +173,12 @@ static inline int res_counter_set_limit(struct res_counter *cnt,
> unsigned long flags;
> int ret = -EBUSY;
>
> - spin_lock_irqsave(&cnt->lock, flags);
> + write_seqlock_irqsave(&cnt->lock, flags);
> if (cnt->usage <= limit) {
> cnt->limit = limit;
> ret = 0;
> }
> - spin_unlock_irqrestore(&cnt->lock, flags);
> + write_sequnlock_irqrestore(&cnt->lock, flags);
> return ret;
> }
>
> diff --git a/kernel/res_counter.c b/kernel/res_counter.c
> index e1338f0..9830c00 100644
> --- a/kernel/res_counter.c
> +++ b/kernel/res_counter.c
> @@ -17,7 +17,7 @@
>
> void res_counter_init(struct res_counter *counter, struct res_counter *parent)
> {
> - spin_lock_init(&counter->lock);
> + seqlock_init(&counter->lock);
> counter->limit = RESOURCE_MAX;
> counter->parent = parent;
> }
> @@ -45,9 +45,9 @@ int res_counter_charge(struct res_counter *counter, unsigned long val,
> *limit_fail_at = NULL;
> local_irq_save(flags);
> for (c = counter; c != NULL; c = c->parent) {
> - spin_lock(&c->lock);
> + write_seqlock(&c->lock);
> ret = res_counter_charge_locked(c, val);
> - spin_unlock(&c->lock);
> + write_sequnlock(&c->lock);
> if (ret < 0) {
> *limit_fail_at = c;
> goto undo;
> @@ -57,9 +57,9 @@ int res_counter_charge(struct res_counter *counter, unsigned long val,
> goto done;
> undo:
> for (u = counter; u != c; u = u->parent) {
> - spin_lock(&u->lock);
> + write_seqlock(&u->lock);
> res_counter_uncharge_locked(u, val);
> - spin_unlock(&u->lock);
> + write_sequnlock(&u->lock);
> }
> done:
> local_irq_restore(flags);
> @@ -81,9 +81,9 @@ void res_counter_uncharge(struct res_counter *counter, unsigned long val)
>
> local_irq_save(flags);
> for (c = counter; c != NULL; c = c->parent) {
> - spin_lock(&c->lock);
> + write_seqlock(&c->lock);
> res_counter_uncharge_locked(c, val);
> - spin_unlock(&c->lock);
> + write_sequnlock(&c->lock);
> }
> local_irq_restore(flags);
> }
> @@ -167,9 +167,9 @@ int res_counter_write(struct res_counter *counter, int member,
> if (*end != '\0')
> return -EINVAL;
> }
> - spin_lock_irqsave(&counter->lock, flags);
> + write_seqlock_irqsave(&counter->lock, flags);
> val = res_counter_member(counter, member);
> *val = tmp;
> - spin_unlock_irqrestore(&counter->lock, flags);
> + write_sequnlock_irqrestore(&counter->lock, flags);
> return 0;
> }
>
> --
> Balbir
>
* Re: [RFC] Reduce the resource counter lock overhead
2009-06-24 17:05 [RFC] Reduce the resource counter lock overhead Balbir Singh
2009-06-24 19:40 ` Paul Menage
@ 2009-06-24 23:10 ` Andrew Morton
2009-06-24 23:53 ` KAMEZAWA Hiroyuki
2009-06-25 3:04 ` Balbir Singh
1 sibling, 2 replies; 14+ messages in thread
From: Andrew Morton @ 2009-06-24 23:10 UTC (permalink / raw)
To: balbir; +Cc: kamezawa.hiroyu, nishimura, menage, xemul, linux-mm, lizf
On Wed, 24 Jun 2009 22:35:16 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> Hi, All,
>
> I've been experimenting with reduction of resource counter locking
> overhead. My benchmarks show a marginal improvement, /proc/lock_stat
> however shows that the lock contention time and held time reduce
> by quite an amount after this patch.
That looks sane.
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> class name con-bounces contentions
> waittime-min waittime-max waittime-total acq-bounces
> acquisitions holdtime-min holdtime-max holdtime-total
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> &counter->lock: 1534627 1575341
> 0.57 18.39 675713.23 43330446 138524248
> 0.43 148.13 54133607.05
> --------------
> &counter->lock 809559
> [<ffffffff810810c5>] res_counter_charge+0x3f/0xed
> &counter->lock 765782
> [<ffffffff81081045>] res_counter_uncharge+0x2c/0x6d
> --------------
> &counter->lock 653284
> [<ffffffff81081045>] res_counter_uncharge+0x2c/0x6d
> &counter->lock 922057
> [<ffffffff810810c5>] res_counter_charge+0x3f/0xed
Please turn off the wordwrapping before sending the signed-off version.
> static inline bool res_counter_check_under_limit(struct res_counter *cnt)
> {
> bool ret;
> - unsigned long flags;
> + unsigned long flags, seq;
>
> - spin_lock_irqsave(&cnt->lock, flags);
> - ret = res_counter_limit_check_locked(cnt);
> - spin_unlock_irqrestore(&cnt->lock, flags);
> + do {
> + seq = read_seqbegin_irqsave(&cnt->lock, flags);
> + ret = res_counter_limit_check_locked(cnt);
> + } while (read_seqretry_irqrestore(&cnt->lock, seq, flags));
> return ret;
> }
This change makes the inlining of these functions even more
inappropriate than it already was.
This function should be static in memcontrol.c anyway?
Which function is calling mem_cgroup_check_under_limit() so much?
__mem_cgroup_try_charge()? If so, I'm a bit surprised because
inefficiencies of this nature in page reclaim rarely are demonstrable -
reclaim just doesn't get called much. Perhaps this is a sign that
reclaim is scanning the same pages over and over again and is being
inefficient at a higher level?
Do we really need to call mem_cgroup_hierarchical_reclaim() as
frequently as we apparently are doing?
* Re: [RFC] Reduce the resource counter lock overhead
2009-06-24 23:10 ` Andrew Morton
@ 2009-06-24 23:53 ` KAMEZAWA Hiroyuki
2009-06-25 3:27 ` Balbir Singh
2009-06-25 3:04 ` Balbir Singh
1 sibling, 1 reply; 14+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-06-24 23:53 UTC (permalink / raw)
To: Andrew Morton; +Cc: balbir, nishimura, menage, xemul, linux-mm, lizf
On Wed, 24 Jun 2009 16:10:28 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:
> On Wed, 24 Jun 2009 22:35:16 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>
> > Hi, All,
> >
> > I've been experimenting with reduction of resource counter locking
> > overhead. My benchmarks show a marginal improvement, /proc/lock_stat
> > however shows that the lock contention time and held time reduce
> > by quite an amount after this patch.
>
> That looks sane.
>
I was surprised to see that seq_lock can reduce the overhead here.
> > -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > class name con-bounces contentions
> > waittime-min waittime-max waittime-total acq-bounces
> > acquisitions holdtime-min holdtime-max holdtime-total
> > -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> >
> > &counter->lock: 1534627 1575341
> > 0.57 18.39 675713.23 43330446 138524248
> > 0.43 148.13 54133607.05
> > --------------
> > &counter->lock 809559
> > [<ffffffff810810c5>] res_counter_charge+0x3f/0xed
> > &counter->lock 765782
> > [<ffffffff81081045>] res_counter_uncharge+0x2c/0x6d
> > --------------
> > &counter->lock 653284
> > [<ffffffff81081045>] res_counter_uncharge+0x2c/0x6d
> > &counter->lock 922057
> > [<ffffffff810810c5>] res_counter_charge+0x3f/0xed
>
> Please turn off the wordwrapping before sending the signed-off version.
>
> > static inline bool res_counter_check_under_limit(struct res_counter *cnt)
> > {
> > bool ret;
> > - unsigned long flags;
> > + unsigned long flags, seq;
> >
> > - spin_lock_irqsave(&cnt->lock, flags);
> > - ret = res_counter_limit_check_locked(cnt);
> > - spin_unlock_irqrestore(&cnt->lock, flags);
> > + do {
> > + seq = read_seqbegin_irqsave(&cnt->lock, flags);
> > + ret = res_counter_limit_check_locked(cnt);
> > + } while (read_seqretry_irqrestore(&cnt->lock, seq, flags));
> > return ret;
> > }
>
> This change makes the inlining of these functions even more
> inappropriate than it already was.
>
> This function should be static in memcontrol.c anyway?
>
> Which function is calling mem_cgroup_check_under_limit() so much?
> __mem_cgroup_try_charge()? If so, I'm a bit surprised because
> inefficiencies of this nature in page reclaim rarely are demonstrable -
> reclaim just doesn't get called much. Perhaps this is a sign that
> reclaim is scanning the same pages over and over again and is being
> inefficient at a higher level?
>
> Do we really need to call mem_cgroup_hierarchical_reclaim() as
> frequently as we apparently are doing?
>
Most modifications to res_counter are
- charge
- uncharge
and not
- read
What kind of workload can be much improved?
IIUC, in general, using a seqlock for a frequently modified counter just
makes it slower.
Could you show improved kernbench or unixbench scores?
Thanks,
-Kame
* Re: [RFC] Reduce the resource counter lock overhead
2009-06-24 23:53 ` KAMEZAWA Hiroyuki
@ 2009-06-25 3:27 ` Balbir Singh
2009-06-25 3:44 ` Andrew Morton
2009-06-25 4:37 ` KAMEZAWA Hiroyuki
0 siblings, 2 replies; 14+ messages in thread
From: Balbir Singh @ 2009-06-25 3:27 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: Andrew Morton, nishimura, menage, xemul, linux-mm, lizf
* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-06-25 08:53:47]:
> On Wed, 24 Jun 2009 16:10:28 -0700
> Andrew Morton <akpm@linux-foundation.org> wrote:
>
> > On Wed, 24 Jun 2009 22:35:16 +0530
> > Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> >
> > > Hi, All,
> > >
> > > I've been experimenting with reduction of resource counter locking
> > > overhead. My benchmarks show a marginal improvement, /proc/lock_stat
> > > however shows that the lock contention time and held time reduce
> > > by quite an amount after this patch.
> >
> > That looks sane.
> >
> I suprized to see seq_lock here can reduce the overhead.
>
I am not too surprised, given that we do frequent read-writes. We do a
read every time before we charge.
>
> > > -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > > class name con-bounces contentions
> > > waittime-min waittime-max waittime-total acq-bounces
> > > acquisitions holdtime-min holdtime-max holdtime-total
> > > -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > >
> > > &counter->lock: 1534627 1575341
> > > 0.57 18.39 675713.23 43330446 138524248
> > > 0.43 148.13 54133607.05
> > > --------------
> > > &counter->lock 809559
> > > [<ffffffff810810c5>] res_counter_charge+0x3f/0xed
> > > &counter->lock 765782
> > > [<ffffffff81081045>] res_counter_uncharge+0x2c/0x6d
> > > --------------
> > > &counter->lock 653284
> > > [<ffffffff81081045>] res_counter_uncharge+0x2c/0x6d
> > > &counter->lock 922057
> > > [<ffffffff810810c5>] res_counter_charge+0x3f/0xed
> >
> > Please turn off the wordwrapping before sending the signed-off version.
> >
> > > static inline bool res_counter_check_under_limit(struct res_counter *cnt)
> > > {
> > > bool ret;
> > > - unsigned long flags;
> > > + unsigned long flags, seq;
> > >
> > > - spin_lock_irqsave(&cnt->lock, flags);
> > > - ret = res_counter_limit_check_locked(cnt);
> > > - spin_unlock_irqrestore(&cnt->lock, flags);
> > > + do {
> > > + seq = read_seqbegin_irqsave(&cnt->lock, flags);
> > > + ret = res_counter_limit_check_locked(cnt);
> > > + } while (read_seqretry_irqrestore(&cnt->lock, seq, flags));
> > > return ret;
> > > }
> >
> > This change makes the inlining of these functions even more
> > inappropriate than it already was.
> >
> > This function should be static in memcontrol.c anyway?
> >
> > Which function is calling mem_cgroup_check_under_limit() so much?
> > __mem_cgroup_try_charge()? If so, I'm a bit surprised because
> > inefficiencies of this nature in page reclaim rarely are demonstrable -
> > reclaim just doesn't get called much. Perhaps this is a sign that
> > reclaim is scanning the same pages over and over again and is being
> > inefficient at a higher level?
> >
> > Do we really need to call mem_cgroup_hierarchical_reclaim() as
> > frequently as we apparently are doing?
> >
>
> Most of modification to res_counter is
> - charge
> - uncharge
> and not
> - read
>
> What kind of workload can be much improved ?
> IIUC, in general, using seq_lock to frequently modified counter just makes
> it slow.
Why do you think so? I've been looking primarily at do_gettimeofday().
Yes, frequent updates can hurt readers in the worst case. I've also been
meaning to experiment with percpu counters, but we'll need to decide on
the tolerance limit, since batching introduces some fuzziness before all
CPUs see that the limit is exceeded; still, it might be worth
experimenting.
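To make the batching trade-off concrete, here is a purely illustrative
sketch of what a batched per-CPU charge could look like. The type and
function names are hypothetical (not existing res_counter API), "batch"
is the knob that controls how far usage may drift past the limit before
every CPU notices, and it assumes val <= batch.

struct percpu_res_counter {
	unsigned long limit;
	atomic_long_t usage;		/* shared, updated batch-wise */
	unsigned long *stock;		/* from alloc_percpu(unsigned long) */
	unsigned long batch;
};

static int percpu_res_charge(struct percpu_res_counter *c, unsigned long val)
{
	unsigned long *stock = per_cpu_ptr(c->stock, get_cpu());
	int ret = 0;

	if (*stock >= val) {
		*stock -= val;			/* fast path: no shared write */
	} else if (atomic_long_add_return(c->batch, &c->usage) <= c->limit) {
		*stock += c->batch - val;	/* refill the local stock */
	} else {
		atomic_long_sub(c->batch, &c->usage);
		ret = -ENOMEM;			/* over the limit */
	}
	put_cpu();
	return ret;
}

The uncharge side would similarly return pages to the local stock and
only flush to the shared counter once the stock exceeds the batch size.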
>
> Could you show improved kernbench or unixbench score ?
>
I'll start some of these and see if I can get a large machine to test
on. I ran reaim for the current run.
--
Balbir
* Re: [RFC] Reduce the resource counter lock overhead
2009-06-25 3:27 ` Balbir Singh
@ 2009-06-25 3:44 ` Andrew Morton
2009-06-25 4:39 ` KAMEZAWA Hiroyuki
2009-06-25 5:01 ` Balbir Singh
2009-06-25 4:37 ` KAMEZAWA Hiroyuki
1 sibling, 2 replies; 14+ messages in thread
From: Andrew Morton @ 2009-06-25 3:44 UTC (permalink / raw)
To: balbir; +Cc: KAMEZAWA Hiroyuki, nishimura, menage, xemul, linux-mm, lizf
On Thu, 25 Jun 2009 08:57:17 +0530 Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> We do a read everytime before we charge.
See, a good way to fix that is to not do it. Instead of
if (under_limit())
charge_some_more(amount);
else
goto fail;
one can do
if (try_to_charge_some_more(amount) < 0)
goto fail;
which will halve the locking frequency. Which may not be as beneficial
as avoiding the locking altogether on the read side, dunno.
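A rough sketch of such a combined helper, written against the existing
struct res_counter fields; the function name is made up for illustration,
and this is essentially what res_counter_charge_locked() already does
under the caller's lock:

static int res_counter_try_charge(struct res_counter *c, unsigned long val)
{
	unsigned long flags;
	int ret = 0;

	spin_lock_irqsave(&c->lock, flags);
	if (c->usage + val > c->limit) {
		c->failcnt++;
		ret = -ENOMEM;		/* caller takes its "fail" path */
	} else {
		c->usage += val;
		if (c->usage > c->max_usage)
			c->max_usage = c->usage;
	}
	spin_unlock_irqrestore(&c->lock, flags);
	return ret;
}

The caller then checks and charges in one lock round trip instead of two.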
* Re: [RFC] Reduce the resource counter lock overhead
2009-06-25 3:44 ` Andrew Morton
@ 2009-06-25 4:39 ` KAMEZAWA Hiroyuki
2009-06-25 5:40 ` Balbir Singh
2009-06-25 5:01 ` Balbir Singh
1 sibling, 1 reply; 14+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-06-25 4:39 UTC (permalink / raw)
To: Andrew Morton; +Cc: balbir, nishimura, menage, xemul, linux-mm, lizf
On Wed, 24 Jun 2009 20:44:26 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:
> On Thu, 25 Jun 2009 08:57:17 +0530 Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>
> > We do a read everytime before we charge.
>
> See, a good way to fix that is to not do it. Instead of
>
> if (under_limit())
> charge_some_more(amount);
> else
> goto fail;
>
> one can do
>
> if (try_to_charge_some_more(amount) < 0)
> goto fail;
>
> which will halve the locking frequency. Which may not be as beneficial
> as avoiding the locking altogether on the read side, dunno.
>
I don't think we do read-before-write ;)
Thanks,
-Kame
* Re: [RFC] Reduce the resource counter lock overhead
2009-06-25 4:39 ` KAMEZAWA Hiroyuki
@ 2009-06-25 5:40 ` Balbir Singh
2009-06-25 6:30 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 14+ messages in thread
From: Balbir Singh @ 2009-06-25 5:40 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: Andrew Morton, nishimura, menage, xemul, linux-mm, lizf
* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-06-25 13:39:08]:
> On Wed, 24 Jun 2009 20:44:26 -0700
> Andrew Morton <akpm@linux-foundation.org> wrote:
>
> > On Thu, 25 Jun 2009 08:57:17 +0530 Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> >
> > > We do a read everytime before we charge.
> >
> > See, a good way to fix that is to not do it. Instead of
> >
> > if (under_limit())
> > charge_some_more(amount);
> > else
> > goto fail;
> >
> > one can do
> >
> > if (try_to_charge_some_more(amount) < 0)
> > goto fail;
> >
> > which will halve the locking frequency. Which may not be as beneficial
> > as avoiding the locking altogether on the read side, dunno.
> >
> I don't think we do read-before-write ;)
>
I need to figure out the reason for the read contention and why seqlocks
help. Like I said before, I am seeing some strange values for
reclaim_stats on the root cgroup, even though it is not reclaimable and
not used for reclaim. There can be two reasons:
1. Reclaim
2. User space constantly reading the counters
I am not aware of any user-space utilities running on the system that
constantly read the contents of these files.
--
Balbir
* Re: [RFC] Reduce the resource counter lock overhead
2009-06-25 5:40 ` Balbir Singh
@ 2009-06-25 6:30 ` KAMEZAWA Hiroyuki
2009-06-25 16:16 ` Balbir Singh
0 siblings, 1 reply; 14+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-06-25 6:30 UTC (permalink / raw)
To: balbir; +Cc: Andrew Morton, nishimura, menage, xemul, linux-mm, lizf
On Thu, 25 Jun 2009 11:10:42 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-06-25 13:39:08]:
>
> > On Wed, 24 Jun 2009 20:44:26 -0700
> > Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > > On Thu, 25 Jun 2009 08:57:17 +0530 Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> > >
> > > > We do a read everytime before we charge.
> > >
> > > See, a good way to fix that is to not do it. Instead of
> > >
> > > if (under_limit())
> > > charge_some_more(amount);
> > > else
> > > goto fail;
> > >
> > > one can do
> > >
> > > if (try_to_charge_some_more(amount) < 0)
> > > goto fail;
> > >
> > > which will halve the locking frequency. Which may not be as beneficial
> > > as avoiding the locking altogether on the read side, dunno.
> > >
> > I don't think we do read-before-write ;)
> >
>
> I need to figure out the reason for read contention and why seqlock's
> help. Like I said before I am seeing some strange values for
> reclaim_stats on the root cgroup, even though it is not reclaimable or
> not used for reclaim. There can be two reasons
>
I don't remember; does reclaim_stat go bad? A new BUG?
Does reclaim_stat here mean the zone_reclaim_stat obtained via
get_reclaim_stat()?
IIUC, after your ROOT_CGROUP-no-LRU patch, the reclaim_stat of the root
cgroup will never be accessed. Right?
> 1. Reclaim
> 2. User space constantly reading the counters
>
> I have no user space utilities I am aware of running on the system,
> constantly reading the contents of the files.
>
This is from your result.
Before After
class name &counter->lock: &(&counter->lock)->lock
con-bounces 1534627 962193
contentions 1575341 976349
waittime-min 0.57 0.60
waittime-max 18.39 14.07
waittime-total 675713.23 465926.04
acq-bounces 43330446 21364165
acquisitions 138524248 66041988
holdtime-min 0.43 0.45
holdtime-max 148.13 88.31
holdtime-total 54133607.05 25395513.12
From this result, acquisitions changed as
- 138524248 => 66041988
Almost half.
Then, either
- "read" accounts for half of all counter accesses,
or
- did you enable the swap cgroup in the "after" test?
BTW, if this result is against "Root" cgroup, no reclaim by memcg
will happen after your no-ROOT-LRU patch.
Thanks,
-Kame
* Re: [RFC] Reduce the resource counter lock overhead
2009-06-25 6:30 ` KAMEZAWA Hiroyuki
@ 2009-06-25 16:16 ` Balbir Singh
0 siblings, 0 replies; 14+ messages in thread
From: Balbir Singh @ 2009-06-25 16:16 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: Andrew Morton, nishimura, menage, xemul, linux-mm, lizf
* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-06-25 15:30:33]:
> On Thu, 25 Jun 2009 11:10:42 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>
> > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-06-25 13:39:08]:
> >
> > > On Wed, 24 Jun 2009 20:44:26 -0700
> > > Andrew Morton <akpm@linux-foundation.org> wrote:
> > >
> > > > On Thu, 25 Jun 2009 08:57:17 +0530 Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> > > >
> > > > > We do a read everytime before we charge.
> > > >
> > > > See, a good way to fix that is to not do it. Instead of
> > > >
> > > > if (under_limit())
> > > > charge_some_more(amount);
> > > > else
> > > > goto fail;
> > > >
> > > > one can do
> > > >
> > > > if (try_to_charge_some_more(amount) < 0)
> > > > goto fail;
> > > >
> > > > which will halve the locking frequency. Which may not be as beneficial
> > > > as avoiding the locking altogether on the read side, dunno.
> > > >
> > > I don't think we do read-before-write ;)
> > >
> >
> > I need to figure out the reason for read contention and why seqlock's
> > help. Like I said before I am seeing some strange values for
> > reclaim_stats on the root cgroup, even though it is not reclaimable or
> > not used for reclaim. There can be two reasons
> >
> I don't remember but reclaim_stat goes bad ? new BUG ?
> reclaim_stat means zone_recaim_stat gotten by get_reclaim_stat() ?
>
> IIUC, after your ROOT_CGROUP-no-LRU patch, reclaim_stat of root cgroup
> will never be accessed. Right ?
>
Correct!
>
> > 1. Reclaim
> > 2. User space constantly reading the counters
> >
> > I have no user space utilities I am aware of running on the system,
> > constantly reading the contents of the files.
> >
>
> This is from your result.
>
> Before After
> class name &counter->lock: &(&counter->lock)->lock
> con-bounces 1534627 962193
> contentions 1575341 976349
> waittime-min 0.57 0.60
> waittime-max 18.39 14.07
> waittime-total 675713.23 465926.04
> acq-bounces 43330446 21364165
> acquisitions 138524248 66041988
> holdtime-min 0.43 0.45
> holdtime-max 148.13 88.31
> holdtime-total 54133607.05 25395513.12
>
> >From this result, acquisitions is changed as
> - 138524248 => 66041988
> Almost half.
>
Yes, precisely! That is why I thought it was a great result.
> Then,
> - "read" should be half of all counter access.
> or
> - did you enabped swap cgroup in "after" test ?
>
> BTW, if this result is against "Root" cgroup, no reclaim by memcg
> will happen after your no-ROOT-LRU patch.
>
The configuration was the same for both runs. I'll rerun and see why
that is.
--
Balbir
* Re: [RFC] Reduce the resource counter lock overhead
2009-06-25 3:44 ` Andrew Morton
2009-06-25 4:39 ` KAMEZAWA Hiroyuki
@ 2009-06-25 5:01 ` Balbir Singh
1 sibling, 0 replies; 14+ messages in thread
From: Balbir Singh @ 2009-06-25 5:01 UTC (permalink / raw)
To: Andrew Morton; +Cc: KAMEZAWA Hiroyuki, nishimura, menage, xemul, linux-mm, lizf
* Andrew Morton <akpm@linux-foundation.org> [2009-06-24 20:44:26]:
> On Thu, 25 Jun 2009 08:57:17 +0530 Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>
> > We do a read everytime before we charge.
>
> See, a good way to fix that is to not do it. Instead of
>
> if (under_limit())
> charge_some_more(amount);
> else
> goto fail;
>
> one can do
>
> if (try_to_charge_some_more(amount) < 0)
> goto fail;
>
> which will halve the locking frequency. Which may not be as beneficial
> as avoiding the locking altogether on the read side, dunno.
>
My bad, we do it all under one lock. We do a read within the charge
lock. I should get some tea or coffee before responding to emails in
the morning.
--
Balbir
* Re: [RFC] Reduce the resource counter lock overhead
2009-06-25 3:27 ` Balbir Singh
2009-06-25 3:44 ` Andrew Morton
@ 2009-06-25 4:37 ` KAMEZAWA Hiroyuki
1 sibling, 0 replies; 14+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-06-25 4:37 UTC (permalink / raw)
To: balbir; +Cc: Andrew Morton, nishimura, menage, xemul, linux-mm, lizf
On Thu, 25 Jun 2009 08:57:17 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> > What kind of workload can be much improved ?
> > IIUC, in general, using seq_lock to frequently modified counter just makes
> > it slow.
>
> Why do you think so? I've been looking primarily at do_gettimeofday().
IIUC, modifications to xtime are _not_ frequent.
> Yes, frequent updates can hurt readers in the worst case.
You don't understand my point: the write side of a seqlock is itself
heavy. I have no interest in the read side.
What needs to be faster is here.
==
929 while (1) {
930 int ret;
931 bool noswap = false;
932
933 ret = res_counter_charge(&mem->res, PAGE_SIZE, &fail_res);
934 if (likely(!ret)) {
935 if (!do_swap_account)
936 break;
937 ret = res_counter_charge(&mem->memsw, PAGE_SIZE,
938 &fail_res);
939 if (likely(!ret))
940 break;
941 /* mem+swap counter fails */
942 res_counter_uncharge(&mem->res, PAGE_SIZE);
943 noswap = true;
944 mem_over_limit = mem_cgroup_from_res_counter(fail_res,
945 memsw);
946 } else
947 /* mem counter fails */
948 mem_over_limit = mem_cgroup_from_res_counter(fail_res,
949
==
And using a seqlock will add more overhead here.
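For reference, on kernels of this vintage write_seqlock()/write_sequnlock()
expand to roughly the following (simplified, lockdep omitted): the plain
spinlock plus two sequence updates and write barriers, which is where the
extra write-side cost comes from.

static inline void write_seqlock(seqlock_t *sl)
{
	spin_lock(&sl->lock);
	++sl->sequence;		/* odd count: update in progress */
	smp_wmb();
}

static inline void write_sequnlock(seqlock_t *sl)
{
	smp_wmb();
	sl->sequence++;		/* even again: update complete */
	spin_unlock(&sl->lock);
}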
> I've been
> meaning to experiment with percpu counters as well, but we'll need to
> decide what is the tolerance limit, since we can have a batch value
> fuzziness, before all CPUs see that the limit is exceeded, but it
> might be worth experimenting.
>
A per-cpu counter is a choice, but choosing the "batch" value is very
difficult if we never allow the limit to be exceeded. And if the batch is
too small, a percpu counter is slower than the current one.
And if hierarchy is used, the jitter from batching will be very large in
parent nodes.
Thanks,
-Kame
* Re: [RFC] Reduce the resource counter lock overhead
2009-06-24 23:10 ` Andrew Morton
2009-06-24 23:53 ` KAMEZAWA Hiroyuki
@ 2009-06-25 3:04 ` Balbir Singh
2009-06-25 3:40 ` Andrew Morton
1 sibling, 1 reply; 14+ messages in thread
From: Balbir Singh @ 2009-06-25 3:04 UTC (permalink / raw)
To: Andrew Morton; +Cc: kamezawa.hiroyu, nishimura, menage, xemul, linux-mm, lizf
* Andrew Morton <akpm@linux-foundation.org> [2009-06-24 16:10:28]:
> On Wed, 24 Jun 2009 22:35:16 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>
> > Hi, All,
> >
> > I've been experimenting with reduction of resource counter locking
> > overhead. My benchmarks show a marginal improvement, /proc/lock_stat
> > however shows that the lock contention time and held time reduce
> > by quite an amount after this patch.
>
> That looks sane.
>
> > -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > class name con-bounces contentions
> > waittime-min waittime-max waittime-total acq-bounces
> > acquisitions holdtime-min holdtime-max holdtime-total
> > -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> >
> > &counter->lock: 1534627 1575341
> > 0.57 18.39 675713.23 43330446 138524248
> > 0.43 148.13 54133607.05
> > --------------
> > &counter->lock 809559
> > [<ffffffff810810c5>] res_counter_charge+0x3f/0xed
> > &counter->lock 765782
> > [<ffffffff81081045>] res_counter_uncharge+0x2c/0x6d
> > --------------
> > &counter->lock 653284
> > [<ffffffff81081045>] res_counter_uncharge+0x2c/0x6d
> > &counter->lock 922057
> > [<ffffffff810810c5>] res_counter_charge+0x3f/0xed
>
> Please turn off the wordwrapping before sending the signed-off version.
>
I'll need to see what caused the problem here. Thanks for the heads-up.
> > static inline bool res_counter_check_under_limit(struct res_counter *cnt)
> > {
> > bool ret;
> > - unsigned long flags;
> > + unsigned long flags, seq;
> >
> > - spin_lock_irqsave(&cnt->lock, flags);
> > - ret = res_counter_limit_check_locked(cnt);
> > - spin_unlock_irqrestore(&cnt->lock, flags);
> > + do {
> > + seq = read_seqbegin_irqsave(&cnt->lock, flags);
> > + ret = res_counter_limit_check_locked(cnt);
> > + } while (read_seqretry_irqrestore(&cnt->lock, seq, flags));
> > return ret;
> > }
>
> This change makes the inlining of these functions even more
> inappropriate than it already was.
>
> This function should be static in memcontrol.c anyway?
We wanted to modularize resource counters and keep the code isolated
from memcontrol.c; hence it continues to live outside.
>
> Which function is calling mem_cgroup_check_under_limit() so much?
> __mem_cgroup_try_charge()? If so, I'm a bit surprised because
> inefficiencies of this nature in page reclaim rarely are demonstrable -
> reclaim just doesn't get called much. Perhaps this is a sign that
> reclaim is scanning the same pages over and over again and is being
> inefficient at a higher level?
>
We do a check every time before we charge. To answer the reclaim part:
I am currently seeing some interesting data; even with no groups
created, I see the memcg reclaim_stats for root being quite high, even
though we are not reclaiming from root.
I have yet to get to the root cause of the issue.
> Do we really need to call mem_cgroup_hierarchical_reclaim() as
> frequently as we apparently are doing?
>
All our reclaim is now hierarchical; was there anything specific you
saw?
--
Balbir
* Re: [RFC] Reduce the resource counter lock overhead
2009-06-25 3:04 ` Balbir Singh
@ 2009-06-25 3:40 ` Andrew Morton
0 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2009-06-25 3:40 UTC (permalink / raw)
To: balbir; +Cc: kamezawa.hiroyu, nishimura, menage, xemul, linux-mm, lizf
On Thu, 25 Jun 2009 08:34:46 +0530 Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> * Andrew Morton <akpm@linux-foundation.org> [2009-06-24 16:10:28]:
>
> > On Wed, 24 Jun 2009 22:35:16 +0530
> > Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>
> ...
>
> > > static inline bool res_counter_check_under_limit(struct res_counter *cnt)
> > > {
> > > bool ret;
> > > - unsigned long flags;
> > > + unsigned long flags, seq;
> > >
> > > - spin_lock_irqsave(&cnt->lock, flags);
> > > - ret = res_counter_limit_check_locked(cnt);
> > > - spin_unlock_irqrestore(&cnt->lock, flags);
> > > + do {
> > > + seq = read_seqbegin_irqsave(&cnt->lock, flags);
> > > + ret = res_counter_limit_check_locked(cnt);
> > > + } while (read_seqretry_irqrestore(&cnt->lock, seq, flags));
> > > return ret;
> > > }
> >
> > This change makes the inlining of these functions even more
> > inappropriate than it already was.
> >
> > This function should be static in memcontrol.c anyway?
>
> We wanted to modularize resource counters and keep the code isolated
> from memcontrol.c, hence it continues to live outside
That doesn't mean that it has to be inlined. That function is really,
really big, especially with lockdep enabled.
> >
> > Which function is calling mem_cgroup_check_under_limit() so much?
> > __mem_cgroup_try_charge()? If so, I'm a bit surprised because
> > inefficiencies of this nature in page reclaim rarely are demonstrable -
> > reclaim just doesn't get called much. Perhaps this is a sign that
> > reclaim is scanning the same pages over and over again and is being
> > inefficient at a higher level?
> >
>
> We do a check everytime before we charge. To answer the other part of
> reclaim, I am currently seeing some interesting data, even with no
> groups created, I see memcg reclaim_stats set to root to be quite
> high, even though we are not reclaiming from root.
> I am yet to get to the root cause of the issue
>
>
> > Do we really need to call mem_cgroup_hierarchical_reclaim() as
> > frequently as we apparently are doing?
> >
>
> All our reclaim is now hierarchical, was there anything specific you
> saw?
My point is that when one sees a function high in the profiles,
speeding up that function isn't the only fix. Another (often superior)
fix is to call that function less frequently. Or perhaps to cache its
result in some fashion.
Have you established that this function is being called at the minimum
possible frequency? Is the frequency at which it is being called
reasonable and expected?
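One hypothetical way to "cache its result": keep an under_limit flag that
is recomputed only on the write side (charge/uncharge/set_limit), so
readers test a plain word instead of taking the counter lock. Neither the
field nor the helpers below exist in the real res_counter; this is only a
sketch of the idea.

static inline void res_counter_update_under_limit(struct res_counter *cnt)
{
	/* called with cnt->lock held for writing */
	cnt->under_limit = (cnt->usage < cnt->limit);
}

static inline bool res_counter_check_under_limit_cached(struct res_counter *cnt)
{
	/* racy but cheap; a stale answer only delays or hastens reclaim */
	return cnt->under_limit;
}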
Thread overview: 14+ messages
2009-06-24 17:05 [RFC] Reduce the resource counter lock overhead Balbir Singh
2009-06-24 19:40 ` Paul Menage
2009-06-24 23:10 ` Andrew Morton
2009-06-24 23:53 ` KAMEZAWA Hiroyuki
2009-06-25 3:27 ` Balbir Singh
2009-06-25 3:44 ` Andrew Morton
2009-06-25 4:39 ` KAMEZAWA Hiroyuki
2009-06-25 5:40 ` Balbir Singh
2009-06-25 6:30 ` KAMEZAWA Hiroyuki
2009-06-25 16:16 ` Balbir Singh
2009-06-25 5:01 ` Balbir Singh
2009-06-25 4:37 ` KAMEZAWA Hiroyuki
2009-06-25 3:04 ` Balbir Singh
2009-06-25 3:40 ` Andrew Morton