From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: 
Received: from mail144.messagelabs.com (mail144.messagelabs.com [216.82.254.51])
	by kanga.kvack.org (Postfix) with ESMTP id D40F66B005D
	for ; Wed, 24 Jun 2009 23:26:40 -0400 (EDT)
Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106])
	by e37.co.us.ibm.com (8.13.1/8.13.1) with ESMTP id n5P3Qlgd003816
	for ; Wed, 24 Jun 2009 21:26:47 -0600
Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168])
	by d03relay04.boulder.ibm.com (8.13.8/8.13.8/NCO v9.2) with ESMTP id n5P3RTp8206764
	for ; Wed, 24 Jun 2009 21:27:29 -0600
Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1])
	by d03av02.boulder.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id n5P3RTCR011418
	for ; Wed, 24 Jun 2009 21:27:29 -0600
Date: Thu, 25 Jun 2009 08:57:17 +0530
From: Balbir Singh
Subject: Re: [RFC] Reduce the resource counter lock overhead
Message-ID: <20090625032717.GX8642@balbir.in.ibm.com>
Reply-To: balbir@linux.vnet.ibm.com
References: <20090624170516.GT8642@balbir.in.ibm.com>
	<20090624161028.b165a61a.akpm@linux-foundation.org>
	<20090625085347.a64654a7.kamezawa.hiroyu@jp.fujitsu.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
In-Reply-To: <20090625085347.a64654a7.kamezawa.hiroyu@jp.fujitsu.com>
Sender: owner-linux-mm@kvack.org
To: KAMEZAWA Hiroyuki
Cc: Andrew Morton, nishimura@mxp.nes.nec.co.jp, menage@google.com,
	xemul@openvz.org, linux-mm@kvack.org, lizf@cn.fujitsu.com
List-ID: 

* KAMEZAWA Hiroyuki [2009-06-25 08:53:47]:

> On Wed, 24 Jun 2009 16:10:28 -0700
> Andrew Morton wrote:
> 
> > On Wed, 24 Jun 2009 22:35:16 +0530
> > Balbir Singh wrote:
> > 
> > > Hi, All,
> > > 
> > > I've been experimenting with reducing the resource counter locking
> > > overhead. My benchmarks show a marginal improvement; /proc/lock_stat,
> > > however, shows that the lock contention time and hold time drop by
> > > quite an amount after this patch.
> > 
> > That looks sane.
> 
> I'm surprised to see seq_lock here can reduce the overhead.
> 

I am not too surprised, given that we do frequent read-writes. We do a
read every time before we charge.
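To make that pattern concrete, here is a minimal sketch of the kind of
seqlock usage under discussion. The structure and function names below are
illustrative only, not the actual res_counter code: readers retry only if a
writer raced with them, while writers still serialize on the seqlock's
internal spinlock.

/*
 * Minimal sketch only: a counter whose limit check is a seqlock read
 * and whose charge is a seqlock write.  "demo_counter" and the two
 * functions are illustrative names, not the res_counter API.
 */
#include <linux/seqlock.h>
#include <linux/errno.h>
#include <linux/types.h>

struct demo_counter {
	unsigned long long usage;
	unsigned long long limit;
	seqlock_t lock;	/* seqlock_init() at setup time */
};

/* Read side: never writes to the lock; retries only if a writer raced. */
static bool demo_counter_under_limit(struct demo_counter *cnt)
{
	unsigned long flags, seq;
	bool ret;

	do {
		seq = read_seqbegin_irqsave(&cnt->lock, flags);
		ret = cnt->usage < cnt->limit;
	} while (read_seqretry_irqrestore(&cnt->lock, seq, flags));

	return ret;
}

/* Write side: charges still serialize on the seqlock's spinlock. */
static int demo_counter_charge(struct demo_counter *cnt,
			       unsigned long long val)
{
	unsigned long flags;
	int ret = 0;

	write_seqlock_irqsave(&cnt->lock, flags);
	if (cnt->usage + val > cnt->limit)
		ret = -ENOMEM;
	else
		cnt->usage += val;
	write_sequnlock_irqrestore(&cnt->lock, flags);

	return ret;
}

The trade-off Kamezawa raises further down is on the write side: every
charge and uncharge still takes the write lock, and each write forces any
concurrent reader to retry.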
> > > ---------------------------------------------------------------------
> > > &counter->lock:
> > >     con-bounces:     1534627       contentions:    1575341
> > >     waittime-min:    0.57          waittime-max:   18.39
> > >     waittime-total:  675713.23
> > >     acq-bounces:     43330446      acquisitions:   138524248
> > >     holdtime-min:    0.43          holdtime-max:   148.13
> > >     holdtime-total:  54133607.05
> > >     --------------
> > >     &counter->lock   809559   [] res_counter_charge+0x3f/0xed
> > >     &counter->lock   765782   [] res_counter_uncharge+0x2c/0x6d
> > >     --------------
> > >     &counter->lock   653284   [] res_counter_uncharge+0x2c/0x6d
> > >     &counter->lock   922057   [] res_counter_charge+0x3f/0xed
> > > ---------------------------------------------------------------------
> > 
> > Please turn off the wordwrapping before sending the signed-off version.
> > 
> > >  static inline bool res_counter_check_under_limit(struct res_counter *cnt)
> > >  {
> > >  	bool ret;
> > > -	unsigned long flags;
> > > +	unsigned long flags, seq;
> > > 
> > > -	spin_lock_irqsave(&cnt->lock, flags);
> > > -	ret = res_counter_limit_check_locked(cnt);
> > > -	spin_unlock_irqrestore(&cnt->lock, flags);
> > > +	do {
> > > +		seq = read_seqbegin_irqsave(&cnt->lock, flags);
> > > +		ret = res_counter_limit_check_locked(cnt);
> > > +	} while (read_seqretry_irqrestore(&cnt->lock, seq, flags));
> > >  	return ret;
> > >  }
> > 
> > This change makes the inlining of these functions even more
> > inappropriate than it already was.
> > 
> > This function should be static in memcontrol.c anyway?
> > 
> > Which function is calling mem_cgroup_check_under_limit() so much?
> > __mem_cgroup_try_charge()?  If so, I'm a bit surprised, because
> > inefficiencies of this nature in page reclaim are rarely demonstrable -
> > reclaim just doesn't get called much.  Perhaps this is a sign that
> > reclaim is scanning the same pages over and over again and is being
> > inefficient at a higher level?
> > 
> > Do we really need to call mem_cgroup_hierarchical_reclaim() as
> > frequently as we apparently are doing?
> > 
> 
> Most modifications to res_counter are
>  - charge
>  - uncharge
> and not
>  - read
> 
> What kind of workload can be much improved?
> IIUC, in general, using a seq_lock for a frequently modified counter just
> makes it slow.

Why do you think so? I've been looking primarily at do_gettimeofday().
Yes, frequent updates can hurt readers in the worst case. I've also been
meaning to experiment with percpu counters, but we'll need to decide what
the tolerance limit is, since the batch value introduces some fuzziness
before all CPUs see that the limit has been exceeded. It might still be
worth experimenting with.

> 
> Could you show an improved kernbench or unixbench score?
> 

I'll start some of these and see if I can get a large machine to test on.
I ran reaim for the current run.

-- 
	Balbir

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org