From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <476B1677.4020009@hp.com>
Date: Thu, 20 Dec 2007 20:27:19 -0500
From: Mark Seger
MIME-Version: 1.0
Subject: Re: SLUB
References: <476A850A.1080807@hp.com> <476AFC6C.3080903@hp.com> <476B122E.7010108@hp.com>
In-Reply-To: <476B122E.7010108@hp.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Christoph Lameter
Cc: Mark Seger, linux-mm@kvack.org
List-ID:

I just realized I forgot to include an example of the output I was
generating, so here it is:

Slab Name              ObjSize  NumObj  SlabSize  NumSlab    Total
:0000008                     8    2185         8        5    17480
:0000016                    16    1604        16        9    25664
:0000024                    24     409        24        4     9816
:0000032                    32     380        32        5    12160
:0000040                    40     204        40        2     8160
:0000048                    48       0        48        0        0
:0000064                    64     843        64       17    53952
:0000072                    72     167        72        3    12024
:0000088                    88    5549        88      121   488312
:0000096                    96    1400        96       40   134400
:0000112                   112       0       112        0        0
:0000128                   128     385       128       21    49280
:0000136                   136      70       136        4     9520
:0000152                   152      59       152        4     8968
:0000160                   160      46       160        4     7360
:0000176                   176    2071       176       93   364496
:0000192                   192     400       192       24    76800
:0000256                   256    1333       256      100   341248
:0000288                   288      54       288        6    15552
:0000320                   320      53       320        7    16960
:0000384                   384      29       384        5    11136
:0000448                   420      22       448        4     9856
:0000512                   512     150       512       22    76800
:0000704                   696      33       704        3    23232
:0000768                   768      82       768       21    62976
:0000832                   776      98       832       15    81536
:0000896                   896      48       896       14    43008
:0000960                   944      39       960       15    37440
:0001024                  1024     303      1024       80   310272
:0001088                  1048      28      1088        4    30464
:0001608                  1608      34      1608        7    54672
:0001728                  1712      16      1728        5    27648
:0001856                  1856       8      1856        2    14848
:0001904                  1904      87      1904       28   165648
:0002048                  2048     504      2048      131  1032192
:0004096                  4096      49      4096       28   200704
:0008192                  8192       8      8192       12    65536
:0016384                 16384       4     16384        7    65536
:0032768                 32768       3     32768        3    98304
:0065536                 65536       1     65536        1    65536
:0131072                131072       0    131072        0        0
:0262144                262144       0    262144        0        0
:0524288                524288       0    524288        0        0
:1048576               1048576       0   1048576        0        0
:2097152               2097152       0   2097152        0        0
:4194304               4194304       0   4194304        0        0
:a-0000088                  88       0        88        0        0
:a-0000104                 104   13963       104      359  1452152
:a-0000168                 168       0       168        0        0
:a-0000224                 224   11113       224      619  2489312
:a-0000256                 248       0       256        0        0
anon_vma                    40     796        48       12    38208
bdev_cache                 960      32      1024        8    32768
ext2_inode_cache           920       0       928        0        0
ext3_inode_cache           968    4775       976     1194  4660400
file_lock_cache            192      58       200        4    11600
hugetlbfs_inode_cache      752       5       760        1     3800
idr_layer_cache            528      91       536       14    48776
inode_cache                720    3015       728      604  2194920
isofs_inode_cache          768       0       776        0        0
kmem_cache_node             72     232        72        6    16704
mqueue_inode_cache        1040       7      1088        1     7616
nfs_inode_cache           1120     102      1128       15   115056
proc_inode_cache           752     503       760      102   382280
radix_tree_node            552    2666       560      381  1492960
rpc_inode_cache            928      16       960        4    15360
shmem_inode_cache          960     243       968       61   235224
sighand_cache             2120      86      2176       31   187136
sock_inode_cache           816      81       832       11    67392

TOTAL K: 17169

and here's /proc/meminfo

MemTotal:        4040768 kB
MemFree:         3726112 kB
Buffers:           13864 kB
Cached:           196920 kB
SwapCached:            0 kB
Active:           127264 kB
Inactive:         127864 kB
SwapTotal:       4466060 kB
SwapFree:        4466060 kB
Dirty:                60 kB
Writeback:             0 kB
AnonPages:         44364 kB
Mapped:            16124 kB
Slab:              18608 kB
SReclaimable:      11768 kB
SUnreclaim:         6840 kB
PageTables:         2240 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
CommitLimit:     6486444 kB
Committed_AS:      64064 kB
VmallocTotal: 34359738367 kB
VmallocUsed:       32364 kB
VmallocChunk: 34359705775 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
Hugepagesize:       2048 kB
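For anyone who wants to reproduce a table like this, a minimal perl
sketch along the following lines could do it. This is only a sketch
built on the object_size/objects/slab_size/slabs files Christoph
describes below, not what collectl actually runs, and skipping the
symlinked aliases is my own assumption:

#!/usr/bin/perl -w
# Sketch: walk /sys/kernel/slab and print per-cache usage the way the
# table above does (Total = SlabSize x NumObj).  Field names follow
# Christoph's description below; this is NOT collectl's actual code.
use strict;

sub readval {
    my ($path) = @_;
    open(my $fh, '<', $path) or return 0;
    my $line = <$fh>;
    close($fh);
    # 'objects' can look like "49 N0=19 N1=30"; keep the leading total.
    return ($line && $line =~ /^(\d+)/) ? $1 : 0;
}

my $totalKB = 0;
printf "%-22s %8s %7s %9s %8s %9s\n",
       'Slab Name', 'ObjSize', 'NumObj', 'SlabSize', 'NumSlab', 'Total';
foreach my $dir (sort(glob("/sys/kernel/slab/*"))) {
    next if -l $dir;    # assumption: skip aliases symlinked to a base cache
    my $name  = (split(m{/}, $dir))[-1];
    my $nobj  = readval("$dir/objects");
    my $size  = readval("$dir/slab_size");
    my $total = $size * $nobj;
    printf "%-22s %8d %7d %9d %8d %9d\n", $name,
           readval("$dir/object_size"), $nobj, $size,
           readval("$dir/slabs"), $total;
    $totalKB += $total / 1024;
}
printf "TOTAL K: %d\n", $totalKB;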
-mark

Mark Seger wrote:
> I did some preliminary prototyping and I guess I'm not sure of the
> math. If I understand what you're saying, an object has a particular
> size, but given the fact that you may need alignment, the true size is
> really the slab size, and the difference is the overhead. What I
> don't understand is how to calculate how much memory a particular slab
> takes up. If the slab_size is really the size of an object, wouldn't I
> multiply that by the number of objects? But when I do that I get a
> number smaller than the one reported in /proc/meminfo, in my case 15997K
> vs 17388K. Given that memory numbers rarely seem to add up, maybe this
> IS close enough? If so, what's the significance of the number of slabs?
> Would I divide the 15997K by the number of slabs to find out how big a
> single slab is? I would have thought that's what the slab_size is, but
> clearly it isn't.
>
> In any event, here's a table of what I see on my machine. The first 4
> columns come from /sys/slab and the 5th I calculated by just
> multiplying SlabSize X NumObj. If I should be doing something else,
> please tell me. Also be sure to tell me if I should include other
> data. For example, the number of objects is a little misleading since
> when I look at the file I really see something like:
>
> 49 N0=19 N1=30
>
> which I'm guessing may mean 19 objects are allocated to socket 0 and
> 30 to socket 1? This is a dual-core, dual-socket system.
>
> -mark
>
> Mark Seger wrote:
>>
>>>> Perhaps someone would like to take this discussion off-line with me
>>>> and even collaborate with me on enhancements for slub in collectl?
>> sounds good to me, I just didn't want to annoy anyone...
>>>> I think we'd better keep it public (so that it goes into the
>>>> archive). Here is a short description of the fields in
>>>> /sys/kernel/slab/ that you would need:
>>>>
>>>> -r--r--r-- 1 root root 4096 Dec 20 11:41 object_size
>>>>
>>>> The size of an object. Compute slab_size - object_size and you
>>>> have the per-object overhead generated by alignment and slab
>>>> metadata. This does not change, so you only need to read it once.
>>>>
>>>> -r--r--r-- 1 root root 4096 Dec 20 11:41 objects
>>>>
>>>> The number of objects in use. This changes, and you may want to
>>>> monitor it.
>>>>
>>>> -r--r--r-- 1 root root 4096 Dec 20 11:41 slab_size
>>>>
>>>> The total memory used for a single object. Read this only once.
>>>>
>>>> -r--r--r-- 1 root root 4096 Dec 20 11:41 slabs
>>>>
>>>> The number of slab pages in use for this slab cache. May change if
>>>> the slab is extended.
>>>>
>> What I'm not sure about is how this maps to the old slab info.
>> Specifically, I believe in the old model one reported the size
>> taken up by the slabs (number of slabs X number of objects/slab X
>> object size). There was a second size for the actual number of
>> objects in use, so in my report that looked like this:
>>
>> #                    <-----------Objects----------> <---------Slab Allocation------>
>> #Name                InUse   Bytes   Alloc   Bytes   InUse   Bytes   Total   Bytes
>> nfs_direct_cache         0       0       0       0       0       0       0       0
>> nfs_write_data          36   27648      40   30720       8   32768       8   32768
>>
>> The slab allocation was real memory allocated (which should come
>> close to Slab: in /proc/meminfo, right?) for the slabs, while the
>> object bytes were those in use. Is it worth it to continue this
>> model, or do things work differently? It sounds like I can still do
>> this with the numbers you've pointed me to above, and I do now realize
>> I only need to monitor the number of slabs and the number of objects,
>> since the others are constants.
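To make the overhead math concrete: for the :0000448 cache in the table
above, slab_size - object_size = 448 - 420 = 28 bytes of per-object
overhead, and slab_size X objects = 448 X 22 = 9856 bytes, which matches
the Total column. The gap against Slab: in /proc/meminfo is presumably
because meminfo counts whole slab pages, including the free object slots
in partially-filled slabs, while slab_size X objects only counts objects
actually in use. A tiny perl fragment (again just a sketch, not collectl
code) shows one way to split out the per-node counts:

#!/usr/bin/perl -w
use strict;

# An 'objects' line can look like "49 N0=19 N1=30": a total followed
# by per-node (per-socket) counts.  The sample string is hard-coded
# here; a real monitor would read /sys/kernel/slab/<name>/objects.
my ($total, @nodes) = split ' ', "49 N0=19 N1=30";
foreach my $pair (@nodes) {
    my ($node, $count) = $pair =~ /^N(\d+)=(\d+)$/;
    printf "node %d: %d of %d objects\n", $node, $count, $total;
}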
>>
>> To get back to my original question, I'd like to make sure that I'm
>> reporting useful information and not just data for the sake of it.
>> In one of your postings I saw a report you had that showed:
>>
>> slubinfo - version: 1.0
>> # name //
>>
>> How useful are order, cpu, flags and nodes?
>> Do people really care about how much memory is taken up by objects
>> vs. slabs? If not, I could see reporting for each slab:
>> - object size
>> - number of objects
>> - slab size
>> - number of slabs
>> - total memory (slab size X number of slabs)
>> - whatever else people might think to be useful, such as order, cpu,
>>   flags, etc.
>>
>> Another thing I noticed is that a number of the slabs are simply links
>> to the same base name; is it sufficient to just report the base names
>> and not those linked to them? Seems reasonable to me...
>>
>> The interesting thing about collectl is that it's written in perl
>> (but I'm trying to be very careful to keep it efficient, and it tends
>> to use <0.1% cpu when run as a daemon) and the good news is it's
>> pretty easy to get something implemented, depending on my free time.
>> If we can get some level of agreement on what seems useful, I could
>> get a version up fairly quickly for people to start playing with, if
>> there is any interest.
>>
>> -mark
>>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org