Hi,

I tested set_bit()/__set_bit(), the atomic and non-atomic bit ops, on my Xeon.
I think this test is not perfect, but it shows some aspects of the
performance of atomic ops.

Program:
  The program touches memory in a tight loop, using atomic and non-atomic
  set_bit(). The memory size is 512k, which is the L2 cache size.
  I have attached it to this mail, but it is configured for my Xeon and
  looks ugly :).

My CPU: from /proc/cpuinfo
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) XEON(TM) MP CPU 1.90GHz
stepping        : 2
cpu MHz         : 1891.582
cache size      : 512 KB

CPU             : Intel Xeon 1.8GHz

Result:
[root@kanex2 atomic]# nice -10 ./test-atomics
score 0 is 64011 note: cache hit, no atomic
score 1 is 543011 note: cache hit, atomic
score 2 is 303901 note: cache hit, mixture
score 3 is 344261 note: cache miss, no atomic
score 4 is 1131085 note: cache miss, atomic
score 5 is 593443 note: cache miss, mixture
score 6 is 118455 note: cache hit, dependency, noatomic
score 7 is 416195 note: cache hit, dependency, mixture

A smaller score is better.
Scores 0-2 show set_bit/__set_bit performance with a good cache hit rate.
Scores 3-5 show set_bit/__set_bit performance with a bad cache hit rate.
Scores 6-7 show set_bit/__set_bit performance with a good cache hit rate,
but with a data dependency on each access in the tight loop.

To Dave:
The cost of prefetch() is not measured here, because I found it is very
sensitive to what is done in the loop and is difficult to measure in this
program. I found the cost of calling prefetch() is a bit high, so I'll
measure again whether prefetch() in the buddy allocator is good or bad.

I think this result shows I should use non-atomic ops when I can: the
atomic loop is roughly 8.5 times slower on cache hits (score 1 vs. score 0)
and roughly 3.3 times slower on cache misses (score 4 vs. score 3).

Thanks.
Kame

Hiroyuki KAMEZAWA wrote:
>
> Okay, I'll do more tests, and if I find atomic ops are slow,
> I'll add __XXXPagePrivate() macros.
>
> ps. I usually test code on a Xeon 1.8G x 2 server.
>
> -- Kame
>
> Andrew Morton wrote:
>
>> Hiroyuki KAMEZAWA wrote:
>>
>>> In the previous version, I used
>>> SetPagePrivate()/ClearPagePrivate()/PagePrivate().
>>> But these are "atomic" operations and look very slow.
>>> This is why I didn't use these macros in this version.
>>>
>>> My previous version, which used set_bit/test_bit/clear_bit, showed
>>> very bad performance
>>> on my test, and I replaced it.
>>
>> That's surprising.  But if you do intend to use non-atomic bitops then
>> please add __SetPagePrivate() and __ClearPagePrivate()
>
-- 
--the clue is these footmarks leading to the door.--
KAMEZAWA Hiroyuki