Re: [RFC] Reduce the resource counter lock overhead

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Paul Menage <menage@google.com>
To: balbir@linux.vnet.ibm.com
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	"nishimura@mxp.nes.nec.co.jp" <nishimura@mxp.nes.nec.co.jp>,
	Andrew Morton <akpm@linux-foundation.org>,
	xemul@openvz.org, "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"lizf@cn.fujitsu.com" <lizf@cn.fujitsu.com>
Subject: Re: [RFC] Reduce the resource counter lock overhead
Date: Wed, 24 Jun 2009 12:40:38 -0700	[thread overview]
Message-ID: <6599ad830906241240o26ab54ffj37a1685f7c7d9e05@mail.gmail.com> (raw)
In-Reply-To: <20090624170516.GT8642@balbir.in.ibm.com>

Looks like a sensible change.

Paul

On Wed, Jun 24, 2009 at 10:05 AM, Balbir Singh<balbir@linux.vnet.ibm.com> wrote:
> Hi, All,
>
> I've been experimenting with reduction of resource counter locking
> overhead. My benchmarks show a marginal improvement, /proc/lock_stat
> however shows that the lock contention time and held time reduce
> by quite an amount after this patch.
>
> Before the patch, I see
>
> lock_stat version 0.3
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>                              class name    con-bounces    contentions
> waittime-min   waittime-max waittime-total    acq-bounces
> acquisitions   holdtime-min   holdtime-max holdtime-total
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                          &counter->lock:       1534627        1575341
> 0.57          18.39      675713.23       43330446      138524248
> 0.43         148.13    54133607.05
>                          --------------
>                          &counter->lock         809559
> [<ffffffff810810c5>] res_counter_charge+0x3f/0xed
>                          &counter->lock         765782
> [<ffffffff81081045>] res_counter_uncharge+0x2c/0x6d
>                          --------------
>                          &counter->lock         653284
> [<ffffffff81081045>] res_counter_uncharge+0x2c/0x6d
>                          &counter->lock         922057
> [<ffffffff810810c5>] res_counter_charge+0x3f/0xed
>
>
> After the patch I see
>
> lock_stat version 0.3
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>                              class name    con-bounces    contentions
> waittime-min   waittime-max waittime-total    acq-bounces
> acquisitions   holdtime-min   holdtime-max holdtime-total
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 &(&counter->lock)->lock:        962193         976349
> 0.60          14.07      465926.04       21364165       66041988
> 0.45          88.31    25395513.12
>                 -----------------------
>                 &(&counter->lock)->lock         495468
> [<ffffffff8108106e>] res_counter_uncharge+0x2c/0x77
>                 &(&counter->lock)->lock         480881
> [<ffffffff810810f7>] res_counter_charge+0x3e/0xfb
>                 -----------------------
>                 &(&counter->lock)->lock         564419
> [<ffffffff810810f7>] res_counter_charge+0x3e/0xfb
>                 &(&counter->lock)->lock         411930
> [<ffffffff8108106e>] res_counter_uncharge+0x2c/0x77
>
> Please review, comment on the usefulness of this approach. I do have
> another approach in mind for reducing res_counter lock overhead, but
> this one seems the most straight forward
>
>
> Feature: Change locking of res_counter
>
> From: Balbir Singh <balbir@linux.vnet.ibm.com>
>
> Resource Counters today use spin_lock_irq* variants for locking.
> This patch converts the lock to a seqlock_t
> ---
>
>  include/linux/res_counter.h |   24 +++++++++++++-----------
>  kernel/res_counter.c        |   18 +++++++++---------
>  2 files changed, 22 insertions(+), 20 deletions(-)
>
>
> diff --git a/include/linux/res_counter.h b/include/linux/res_counter.h
> index 511f42f..4c61757 100644
> --- a/include/linux/res_counter.h
> +++ b/include/linux/res_counter.h
> @@ -14,6 +14,7 @@
>  */
>
>  #include <linux/cgroup.h>
> +#include <linux/seqlock.h>
>
>  /*
>  * The core object. the cgroup that wishes to account for some
> @@ -42,7 +43,7 @@ struct res_counter {
>         * the lock to protect all of the above.
>         * the routines below consider this to be IRQ-safe
>         */
> -       spinlock_t lock;
> +       seqlock_t lock;
>        /*
>         * Parent counter, used for hierarchial resource accounting
>         */
> @@ -139,11 +140,12 @@ static inline bool res_counter_limit_check_locked(struct res_counter *cnt)
>  static inline bool res_counter_check_under_limit(struct res_counter *cnt)
>  {
>        bool ret;
> -       unsigned long flags;
> +       unsigned long flags, seq;
>
> -       spin_lock_irqsave(&cnt->lock, flags);
> -       ret = res_counter_limit_check_locked(cnt);
> -       spin_unlock_irqrestore(&cnt->lock, flags);
> +       do {
> +               seq = read_seqbegin_irqsave(&cnt->lock, flags);
> +               ret = res_counter_limit_check_locked(cnt);
> +       } while (read_seqretry_irqrestore(&cnt->lock, seq, flags));
>        return ret;
>  }
>
> @@ -151,18 +153,18 @@ static inline void res_counter_reset_max(struct res_counter *cnt)
>  {
>        unsigned long flags;
>
> -       spin_lock_irqsave(&cnt->lock, flags);
> +       write_seqlock_irqsave(&cnt->lock, flags);
>        cnt->max_usage = cnt->usage;
> -       spin_unlock_irqrestore(&cnt->lock, flags);
> +       write_sequnlock_irqrestore(&cnt->lock, flags);
>  }
>
>  static inline void res_counter_reset_failcnt(struct res_counter *cnt)
>  {
>        unsigned long flags;
>
> -       spin_lock_irqsave(&cnt->lock, flags);
> +       write_seqlock_irqsave(&cnt->lock, flags);
>        cnt->failcnt = 0;
> -       spin_unlock_irqrestore(&cnt->lock, flags);
> +       write_sequnlock_irqrestore(&cnt->lock, flags);
>  }
>
>  static inline int res_counter_set_limit(struct res_counter *cnt,
> @@ -171,12 +173,12 @@ static inline int res_counter_set_limit(struct res_counter *cnt,
>        unsigned long flags;
>        int ret = -EBUSY;
>
> -       spin_lock_irqsave(&cnt->lock, flags);
> +       write_seqlock_irqsave(&cnt->lock, flags);
>        if (cnt->usage <= limit) {
>                cnt->limit = limit;
>                ret = 0;
>        }
> -       spin_unlock_irqrestore(&cnt->lock, flags);
> +       write_sequnlock_irqrestore(&cnt->lock, flags);
>        return ret;
>  }
>
> diff --git a/kernel/res_counter.c b/kernel/res_counter.c
> index e1338f0..9830c00 100644
> --- a/kernel/res_counter.c
> +++ b/kernel/res_counter.c
> @@ -17,7 +17,7 @@
>
>  void res_counter_init(struct res_counter *counter, struct res_counter *parent)
>  {
> -       spin_lock_init(&counter->lock);
> +       seqlock_init(&counter->lock);
>        counter->limit = RESOURCE_MAX;
>        counter->parent = parent;
>  }
> @@ -45,9 +45,9 @@ int res_counter_charge(struct res_counter *counter, unsigned long val,
>        *limit_fail_at = NULL;
>        local_irq_save(flags);
>        for (c = counter; c != NULL; c = c->parent) {
> -               spin_lock(&c->lock);
> +               write_seqlock(&c->lock);
>                ret = res_counter_charge_locked(c, val);
> -               spin_unlock(&c->lock);
> +               write_sequnlock(&c->lock);
>                if (ret < 0) {
>                        *limit_fail_at = c;
>                        goto undo;
> @@ -57,9 +57,9 @@ int res_counter_charge(struct res_counter *counter, unsigned long val,
>        goto done;
>  undo:
>        for (u = counter; u != c; u = u->parent) {
> -               spin_lock(&u->lock);
> +               write_seqlock(&u->lock);
>                res_counter_uncharge_locked(u, val);
> -               spin_unlock(&u->lock);
> +               write_sequnlock(&u->lock);
>        }
>  done:
>        local_irq_restore(flags);
> @@ -81,9 +81,9 @@ void res_counter_uncharge(struct res_counter *counter, unsigned long val)
>
>        local_irq_save(flags);
>        for (c = counter; c != NULL; c = c->parent) {
> -               spin_lock(&c->lock);
> +               write_seqlock(&c->lock);
>                res_counter_uncharge_locked(c, val);
> -               spin_unlock(&c->lock);
> +               write_sequnlock(&c->lock);
>        }
>        local_irq_restore(flags);
>  }
> @@ -167,9 +167,9 @@ int res_counter_write(struct res_counter *counter, int member,
>                if (*end != '\0')
>                        return -EINVAL;
>        }
> -       spin_lock_irqsave(&counter->lock, flags);
> +       write_seqlock_irqsave(&counter->lock, flags);
>        val = res_counter_member(counter, member);
>        *val = tmp;
> -       spin_unlock_irqrestore(&counter->lock, flags);
> +       write_sequnlock_irqrestore(&counter->lock, flags);
>        return 0;
>  }
>
> --
>        Balbir
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2009-06-24 19:39 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-06-24 17:05 Balbir Singh
2009-06-24 19:40 ` Paul Menage [this message]
2009-06-24 23:10 ` Andrew Morton
2009-06-24 23:53   ` KAMEZAWA Hiroyuki
2009-06-25  3:27     ` Balbir Singh
2009-06-25  3:44       ` Andrew Morton
2009-06-25  4:39         ` KAMEZAWA Hiroyuki
2009-06-25  5:40           ` Balbir Singh
2009-06-25  6:30             ` KAMEZAWA Hiroyuki
2009-06-25 16:16               ` Balbir Singh
2009-06-25  5:01         ` Balbir Singh
2009-06-25  4:37       ` KAMEZAWA Hiroyuki
2009-06-25  3:04   ` Balbir Singh
2009-06-25  3:40     ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6599ad830906241240o26ab54ffj37a1685f7c7d9e05@mail.gmail.com \
    --to=menage@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=balbir@linux.vnet.ibm.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-mm@kvack.org \
    --cc=lizf@cn.fujitsu.com \
    --cc=nishimura@mxp.nes.nec.co.jp \
    --cc=xemul@openvz.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox