From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 26 Aug 2004 21:59:27 -0700
From: Andrew Morton
Subject: Re: [Lhms-devel] [RFC] buddy allocator without bitmap [2/4]
Message-Id: <20040826215927.0af2dee9.akpm@osdl.org>
In-Reply-To: <412EBD22.2090508@jp.fujitsu.com>
References: <412DD1AA.8080408@jp.fujitsu.com>
	<1093535402.2984.11.camel@nighthawk>
	<412E6CC3.8060908@jp.fujitsu.com>
	<20040826171840.4a61e80d.akpm@osdl.org>
	<412E8009.3080508@jp.fujitsu.com>
	<412EBD22.2090508@jp.fujitsu.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Hiroyuki KAMEZAWA
Cc: haveblue@us.ibm.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, lhms-devel@lists.sourceforge.net, wli@holomorphy.com
List-ID:

Hiroyuki KAMEZAWA wrote:
>
>
> Hi,
> I tested set_bit()/__set_bit() ops, atomic and non atomic ops, on my Xeon.
> I think this test is not perfect, but shows some aspect of performance of atomic ops.

Oh, atomic ops on a P4 hurt like hell. Try doing this:

	time dd if=/dev/zero of=foo bs=1 count=1M

on an SMP kernel and compare it with a uniproc kernel. The difference is
large.

Certainly, executing an atomic op in a tight loop will show a lot of
difference. But that doesn't mean that making these operations non-atomic
makes a significant difference to overall kernel performance!

But whatever - it all adds up. The microoptimisation is fine - let's go
that way.

> Result:
> [root@kanex2 atomic]# nice -10 ./test-atomics
> score 0 is 64011 note: cache hit, no atomic
> score 1 is 543011 note: cache hit, atomic
> score 2 is 303901 note: cache hit, mixture
> score 3 is 344261 note: cache miss, no atomic
> score 4 is 1131085 note: cache miss, atomic
> score 5 is 593443 note: cache miss, mixture
> score 6 is 118455 note: cache hit, dependency, noatomic
> score 7 is 416195 note: cache hit, dependency, mixture
>
> smaller score is better.
> score 0-2 shows set_bit/__set_bit performance during good cache hit rate.
> score 3-5 shows set_bit/__set_bit performance during bad cache hit rate.
> score 6-7 shows set_bit/__set_bit performance during good cache hit
> but there is data dependency on each access in the tight loop.

I _think_ the above means atomic ops are 10x more costly, yes?
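
As a rough check on the 10x figure, here is a minimal sketch in C that
divides the quoted scores against each other. The scores are copied from
the test-atomics output above; the helper names score_ratio() and
print_ratios() are made up for illustration and are not part of the
original test program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Scores copied from the quoted test-atomics run (smaller is better). */
static const long scores[] = {
    64011,   /* 0: cache hit, no atomic */
    543011,  /* 1: cache hit, atomic */
    303901,  /* 2: cache hit, mixture */
    344261,  /* 3: cache miss, no atomic */
    1131085, /* 4: cache miss, atomic */
    593443,  /* 5: cache miss, mixture */
    118455,  /* 6: cache hit, dependency, noatomic */
    416195,  /* 7: cache hit, dependency, mixture */
};
#define NSCORES (sizeof(scores) / sizeof(scores[0]))

/* Hypothetical helper: how many times larger score `num` is than score `den`. */
double score_ratio(size_t num, size_t den)
{
    assert(num < NSCORES && den < NSCORES);
    return (double)scores[num] / (double)scores[den];
}

/* Hypothetical helper: print the three comparisons the scores allow. */
void print_ratios(void)
{
    printf("cache hit:   atomic / no atomic   = %.1fx\n", score_ratio(1, 0));
    printf("cache miss:  atomic / no atomic   = %.1fx\n", score_ratio(4, 3));
    printf("dependency:  mixture / no atomic  = %.1fx\n", score_ratio(7, 6));
}
```

Calling print_ratios() from a main() gives about 8.5x for the cache-hit
case, about 3.3x for the cache-miss case, and about 3.5x for the
dependency case (mixture against non-atomic). So "10x" is roughly right
only for the cache-hot tight loop, and the gap shrinks once cache misses
dominate.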