* SLUB
From: Mark Seger @ 2007-12-20 15:06 UTC
To: linux-mm, clameter

Forgive me if this is the wrong place to be asking this, but if so could
someone point me to a better place?

This past summer I released a tool on sourceforge called collectl - see
http://collectl.sourceforge.net/ - which does some pretty nifty system
monitoring, one component of which is slabs. I finally got around to trying
it out on a newer kernel, 2.6.23, and lo and behold, it didn't work because
/proc/slabinfo has disappeared, replaced by /sys/slab. I've been looking
around to try to better understand how to map the old slab data onto SLUB
and couldn't find anything written up on the definitions of the fields in
/sys/slab. I also suspect that while some of the information reported by
SLUB may map directly, there could be other useful information worth
tracking.

To back up a few steps: in collectl I can monitor slabs in real time or log
that data to a file for later playback. The display format is modeled after
slabtop, but I simply record data for all slabs (you can supply a filter).
What I think is particularly useful about collectl is a switch that only
shows allocations that have changed. This means if you run the tool with a
monitoring interval of a second (the default interval for slabs is 60
seconds since it is more work to read/process all of slabinfo) you only see
occasional changes as they occur. I've also found this feature very useful
when analyzing longer-term data collected at 60-second intervals.

Here's an example of running it with a 1-second monitoring interval on a
relatively idle system:

#                        <-----------Objects----------><---------Slab Allocation------>
#        Name            InUse  Bytes Alloc  Bytes InUse  Bytes Total  Bytes
09:28:54 sgpool-32          32  32768    36  36864     8  32768     9  36864
09:28:54 blkdev_requests    12   3168    30   7920     1   4096     2   8192
09:28:54 bio               313  40064   372  47616    11  45056    12  49152
09:28:55 sgpool-32          32  32768    32  32768     8  32768     8  32768
09:28:55 blkdev_requests    12   3168    15   3960     1   4096     1   4096
09:28:55 bio               313  40064   341  43648    11  45056    11  45056
09:28:56 bio               287  36736   341  43648    10  40960    11  45056
09:28:56 task_struct       128 253952   140 277760    69 282624    70 286720
09:28:58 sgpool-64          33  67584    34  69632    17  69632    17  69632
09:28:58 bio               403  51584   403  51584    13  53248    13  53248
09:28:58 task_struct       124 246016   140 277760    68 278528    70 286720
09:28:59 journal_handle      0      0     0      0     0      0     0      0
09:28:59 task_struct       124 246016   136 269824    68 278528    68 278528
09:29:00 journal_handle     16    768    81   3888     1   4096     1   4096
09:29:00 scsi_cmd_cache     24  12288    35  17920     5  20480     5  20480
09:29:00 sgpool-64          32  65536    34  69632    16  65536    17  69632
09:29:00 sgpool-8           51  13056    75  19200     5  20480     5  20480

The thing that is especially useful about collectl is that by monitoring
slabs at the same time as cpu, processes, disk, network and more, you can
get a very comprehensive picture of what's going on at any one time.

My main question for this list then becomes: what makes the most sense to
do with slabs under the new SLUB allocator? Should I simply report these
same fields? Are there others that make more sense? Do I need to read all
184 entries in /sys/slab and then all the entries under them? Clearly I
want to do this efficiently and provide meaningful data at the same time.
Perhaps someone would like to take this discussion off-line with me and
even collaborate with me on enhancements for SLUB in collectl?

-mark
* Re: SLUB
From: Christoph Lameter @ 2007-12-20 19:44 UTC
To: Mark Seger; +Cc: linux-mm

On Thu, 20 Dec 2007, Mark Seger wrote:

> This past summer I released a tool on sourceforge called collectl - see
> http://collectl.sourceforge.net/ - which does some pretty nifty system
> monitoring, one component of which is slabs. I finally got around to
> trying it out on a newer kernel, 2.6.23, and lo and behold, it didn't
> work because /proc/slabinfo has disappeared, replaced by /sys/slab.

Yes. The information available about slabs is different now.

> The thing that is especially useful about collectl is that by monitoring
> slabs at the same time as cpu, processes, disk, network and more, you can
> get a very comprehensive picture of what's going on at any one time.

Good idea.

> My main question for this list then becomes: what makes the most sense to
> do with slabs under the new SLUB allocator? Should I simply report these
> same fields? Are there others that make more sense? Do I need to read all
> 184 entries in /sys/slab and then all the entries under them? Clearly I
> want to do this efficiently and provide meaningful data at the same time.

You only need to read the files that carry the information you want to
display.

> Perhaps someone would like to take this discussion off-line with me and
> even collaborate with me on enhancements for SLUB in collectl?

I think we had better keep it public (so that it goes into the archive).
Here is a short description of the fields in /sys/kernel/slab/<slabcache>
that you would need:

-r--r--r-- 1 root root 4096 Dec 20 11:41 object_size

The size of an object. Subtract object_size from slab_size and you have
the per-object overhead generated by alignment and slab metadata. Does not
change; you only need to read this once.

-r--r--r-- 1 root root 4096 Dec 20 11:41 objects

Number of objects in use. This changes and you may want to monitor it.

-r--r--r-- 1 root root 4096 Dec 20 11:41 slab_size

Total memory used for a single object. Read this only once.

-r--r--r-- 1 root root 4096 Dec 20 11:41 slabs

Number of slab pages in use for this slab cache. May change if the slab
cache is extended.
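[A minimal sketch of reading these four fields in Perl (collectl's own
language), assuming the /sys/kernel/slab layout described above; on some
kernels of this era the tree may appear as /sys/slab instead, and the
helper name is illustrative, not collectl's actual code:]

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $base = '/sys/kernel/slab';

    sub read_field {
        my ($cache, $field) = @_;
        open(my $fh, '<', "$base/$cache/$field") or return;
        my $line = <$fh>;
        close $fh;
        chomp $line;
        # on NUMA systems "objects" reads like "49 N0=19 N1=30"; keep the total
        my ($val) = split ' ', $line;
        return $val;
    }

    opendir(my $dh, $base) or die "cannot open $base: $!";
    for my $cache (sort grep { !/^\./ } readdir $dh) {
        # object_size and slab_size never change: read them once at startup
        my $obj_size = read_field($cache, 'object_size');
        next unless defined $obj_size;
        my $slab_size = read_field($cache, 'slab_size');
        # objects and slabs do change: re-read these every sample interval
        my $objects = read_field($cache, 'objects');
        my $slabs   = read_field($cache, 'slabs');
        printf "%-24s objsize=%-7s slabsize=%-7s objects=%-8s slabs=%s\n",
            $cache, $obj_size, $slab_size, $objects, $slabs;
    }
    closedir $dh;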
* Re: SLUB
From: Mark Seger @ 2007-12-20 23:36 UTC
To: Christoph Lameter; +Cc: linux-mm

>> Perhaps someone would like to take this discussion off-line with me and
>> even collaborate with me on enhancements for SLUB in collectl?

sounds good to me, I just didn't want to annoy anyone...

> I think we had better keep it public (so that it goes into the archive).
> Here is a short description of the fields in /sys/kernel/slab/<slabcache>
> that you would need:
>
> object_size - The size of an object. Subtract object_size from slab_size
> and you have the per-object overhead generated by alignment and slab
> metadata. Does not change; you only need to read this once.
>
> objects - Number of objects in use. This changes and you may want to
> monitor it.
>
> slab_size - Total memory used for a single object. Read this only once.
>
> slabs - Number of slab pages in use for this slab cache. May change if
> the slab cache is extended.

What I'm not sure about is how this maps to the old slab info.
Specifically, I believe in the old model one reported the size taken up by
the slabs (number of slabs X number of objects/slab X object size). There
was a second size for the actual number of objects in use, so in my report
that looked like this:

#                 <-----------Objects----------><---------Slab Allocation------>
#Name             InUse  Bytes Alloc  Bytes InUse  Bytes Total  Bytes
nfs_direct_cache      0      0     0      0     0      0     0      0
nfs_write_data       36  27648    40  30720     8  32768     8  32768

The slab allocation was real memory allocated for the slabs (which should
come close to Slab: in /proc/meminfo, right?) while the object bytes were
those in use. Is it worth continuing this model or do things work
differently now? It sounds like I can still do this with the numbers you've
pointed me to above, and I now realize I only need to monitor the number of
slabs and the number of objects since the others are constants.

To get back to my original question, I'd like to make sure that I'm
reporting useful information and not just data for the sake of it. In one
of your postings I saw a report that showed:

slubinfo - version: 1.0
# name <objects> <order> <objsize> <slabs>/<partial>/<cpu> <flags> <nodes>

How useful are order, cpu, flags and nodes? Do people really care about how
much memory is taken up by objects vs slabs? If not, I could see reporting
for each slab:
- object size
- number of objects
- slab size
- number of slabs
- total memory (slab size X number of slabs)
- whatever else people might think useful, such as order, cpu, flags, etc.

Another thing I noticed is that a number of the slabs are simply links to
the same base name; is it sufficient to just report the base names and not
those linked to them? Seems reasonable to me...

The interesting thing about collectl is that it's written in perl (but I'm
trying to be very careful to keep it efficient and it tends to use <0.1%
cpu when run as a daemon) and the good news is it's pretty easy to get
something implemented, depending on my free time.
If we can get some level of agreement on what seems useful I could get a
version up fairly quickly for people to start playing with if there is any
interest.

-mark
* Re: SLUB
From: Mark Seger @ 2007-12-21 1:09 UTC
To: Christoph Lameter; +Cc: linux-mm

I did some preliminary prototyping and I guess I'm not sure of the math. If
I understand what you're saying, an object has a particular size, but given
the fact that you may need alignment, the true size is really the
slab_size, and the difference is the overhead. What I don't understand is
how to calculate how much memory a particular slab cache takes up. If the
slab_size is really the size of an object, wouldn't I multiply that by the
number of objects? But when I do that I get a number smaller than that
reported in /proc/meminfo, in my case 15997K vs 17388K. Given that memory
numbers rarely seem to add up, maybe this IS close enough? If so, what's
the significance of the number of slabs? Would I divide the 15997K by the
number of slabs to find out how big a single slab is? I would have thought
that's what the slab_size is, but clearly it isn't.

In any event, here's a table of what I see on my machine. The first 4
columns come from /sys/slab and the 5th I calculated by just multiplying
SlabSize X NumObj. If I should be doing something else, please tell me.
Also be sure to tell me if I should include other data. For example, the
number of objects is a little misleading since when I look at the file I
really see something like:

49 N0=19 N1=30

which I'm guessing may mean 19 objects are allocated to socket 0 and 30 to
socket 1? This is a dual-core, dual-socket system.

-mark
* Re: SLUB
From: Mark Seger @ 2007-12-21 1:27 UTC
To: Christoph Lameter; +Cc: linux-mm

I just realized I forgot to include an example of the output I was
generating, so here it is:

Slab Name              ObjSize   NumObj  SlabSize  NumSlab      Total
:0000008                     8     2185         8        5      17480
:0000016                    16     1604        16        9      25664
:0000024                    24      409        24        4       9816
:0000032                    32      380        32        5      12160
:0000040                    40      204        40        2       8160
:0000048                    48        0        48        0          0
:0000064                    64      843        64       17      53952
:0000072                    72      167        72        3      12024
:0000088                    88     5549        88      121     488312
:0000096                    96     1400        96       40     134400
:0000112                   112        0       112        0          0
:0000128                   128      385       128       21      49280
:0000136                   136       70       136        4       9520
:0000152                   152       59       152        4       8968
:0000160                   160       46       160        4       7360
:0000176                   176     2071       176       93     364496
:0000192                   192      400       192       24      76800
:0000256                   256     1333       256      100     341248
:0000288                   288       54       288        6      15552
:0000320                   320       53       320        7      16960
:0000384                   384       29       384        5      11136
:0000448                   420       22       448        4       9856
:0000512                   512      150       512       22      76800
:0000704                   696       33       704        3      23232
:0000768                   768       82       768       21      62976
:0000832                   776       98       832       15      81536
:0000896                   896       48       896       14      43008
:0000960                   944       39       960       15      37440
:0001024                  1024      303      1024       80     310272
:0001088                  1048       28      1088        4      30464
:0001608                  1608       34      1608        7      54672
:0001728                  1712       16      1728        5      27648
:0001856                  1856        8      1856        2      14848
:0001904                  1904       87      1904       28     165648
:0002048                  2048      504      2048      131    1032192
:0004096                  4096       49      4096       28     200704
:0008192                  8192        8      8192       12      65536
:0016384                 16384        4     16384        7      65536
:0032768                 32768        3     32768        3      98304
:0065536                 65536        1     65536        1      65536
:0131072                131072        0    131072        0          0
:0262144                262144        0    262144        0          0
:0524288                524288        0    524288        0          0
:1048576               1048576        0   1048576        0          0
:2097152               2097152        0   2097152        0          0
:4194304               4194304        0   4194304        0          0
:a-0000088                  88        0        88        0          0
:a-0000104                 104    13963       104      359    1452152
:a-0000168                 168        0       168        0          0
:a-0000224                 224    11113       224      619    2489312
:a-0000256                 248        0       256        0          0
anon_vma                    40      796        48       12      38208
bdev_cache                 960       32      1024        8      32768
ext2_inode_cache           920        0       928        0          0
ext3_inode_cache           968     4775       976     1194    4660400
file_lock_cache            192       58       200        4      11600
hugetlbfs_inode_cache      752        5       760        1       3800
idr_layer_cache            528       91       536       14      48776
inode_cache                720     3015       728      604    2194920
isofs_inode_cache          768        0       776        0          0
kmem_cache_node             72      232        72        6      16704
mqueue_inode_cache        1040        7      1088        1       7616
nfs_inode_cache           1120      102      1128       15     115056
proc_inode_cache           752      503       760      102     382280
radix_tree_node            552     2666       560      381    1492960
rpc_inode_cache            928       16       960        4      15360
shmem_inode_cache          960      243       968       61     235224
sighand_cache             2120       86      2176       31     187136
sock_inode_cache           816       81       832       11      67392

TOTAL K: 17169

and here's /proc/meminfo:

MemTotal:      4040768 kB
MemFree:       3726112 kB
Buffers:         13864 kB
Cached:         196920 kB
SwapCached:          0 kB
Active:         127264 kB
Inactive:       127864 kB
SwapTotal:     4466060 kB
SwapFree:      4466060 kB
Dirty:              60 kB
Writeback:           0 kB
AnonPages:       44364 kB
Mapped:          16124 kB
Slab:            18608 kB
SReclaimable:    11768 kB
SUnreclaim:       6840 kB
PageTables:       2240 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   6486444 kB
Committed_AS:    64064 kB
VmallocTotal: 34359738367 kB
VmallocUsed:     32364 kB
VmallocChunk: 34359705775 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

-mark
* Re: SLUB
From: Christoph Lameter @ 2007-12-21 21:41 UTC
To: Mark Seger; +Cc: linux-mm

On Thu, 20 Dec 2007, Mark Seger wrote:

> I did some preliminary prototyping and I guess I'm not sure of the math.
> If I understand what you're saying, an object has a particular size, but
> given the fact that you may need alignment, the true size is really the
> slab_size, and the difference is the overhead. What I don't understand is
> how to calculate how much memory a particular slab cache takes up.

If you want the usage in terms of pages allocated from the page allocator,
then you do

	slabs << order

If you want the usage in actual bytes of allocated objects by the users of
a slab cache, then you can do

	objects * obj_size

> maybe this IS close enough? If so, what's the significance of the number
> of slabs?

It is the number of pages that were taken from the page allocator.

> Would I divide the 15997K by the number of slabs to find out how big a
> single slab is? I would have thought that's what the slab_size is, but
> clearly it isn't.

The size of a single slab that contains multiple objects is

	PAGE_SIZE << order

> 49 N0=19 N1=30
>
> which I'm guessing may mean 19 objects are allocated to socket 0 and 30
> to socket 1? This is a dual-core, dual-socket system.

Right. There are 49 objects in use: 19 of those are on node 0 and 30 on
node 1. The Nx values only show up on NUMA systems; otherwise they are
omitted.
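[In Perl, the two calculations Christoph gives, plus a parse of the NUMA
form of "objects", might look like the following sketch; $order, $slabs,
$objects and $obj_size are assumed to have been read from the sysfs files
shown earlier, and $objects_line is the raw contents of the objects file:]

    use POSIX qw(sysconf _SC_PAGESIZE);
    my $page_size = sysconf(_SC_PAGESIZE) || 4096;

    # pages taken from the page allocator, converted to bytes
    my $total_bytes = ($slabs << $order) * $page_size;

    # bytes actually handed out to the users of this cache
    my $used_bytes = $objects * $obj_size;

    # the NUMA form, e.g. "49 N0=19 N1=30": total first, then per-node counts
    my ($total_objs, @rest) = split ' ', $objects_line;
    my %per_node = map { /^N(\d+)=(\d+)$/ ? ($1 => $2) : () } @rest;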
* Re: SLUB
From: Mark Seger @ 2007-12-27 14:22 UTC
To: Christoph Lameter; +Cc: linux-mm

Now that I've had some more time to think about this and play around with
the slabinfo tool, I fear my problem was getting my head wrapped around the
terminology, but that's my problem. Since there are entries called
object_size, objs_per_slab and slab_size, I would have thought that
object_size * objs_per_slab = slab_size, but that clearly isn't the case.
Since slabs are allocated in pages, the actual size of the slabs is always
a power-of-two multiple of the page size, and that's why I see calculations
in slabinfo like page_size << order, but I guess I'm still not sure what
the actual definition of 'order' is.

Anyhow, when I run slabinfo I see the following entry:

Slabcache: skbuff_fclone_cache  Aliases: 0  Order: 0  Objects: 25
** Hardware cacheline aligned

Sizes (bytes)     Slabs              Debug                 Memory
------------------------------------------------------------------------
Object :    420   Total  :      4    Sanity Checks : Off   Total: 16384
SlabObj:    448   Full   :      0    Redzoning     : Off   Used : 10500
SlabSiz:   4096   Partial:      0    Poisoning     : Off   Loss :  5884
Loss   :     28   CpuSlab:      4    Tracking      : Off   Lalig:   700
Align  :      0   Objects:      9    Tracing       : Off   Lpadd:   256

According to the entries under /sys/slab/skbuff_fclone_cache, it looks like
the slab_size field is being reported above as 'SlabObj', objs_per_slab is
being reported as 'Objects', and as I mentioned above, SlabSiz is based on
'order'.

Anyhow, as I understand what's going on at a very high level, memory is
reserved for use as slabs (which themselves are multiples of pages) and
processes allocate objects from within slabs as they need them. Therefore
the 2 high-level numbers that seem of interest from a memory usage
perspective are the memory allocated and the amount in use. I think these
are the "Total" and "Used" fields in slabinfo:

Total = page_size << order

As for 'Used', that looks to be a straight calculation of objects *
object_size.

The Slab field in /proc/meminfo is the total of the individual 'Total's...

Stay tuned: at some point I'll have support for reporting total/allocated
usage by slab in collectl, though perhaps I'll post a 'proposal' first in
the hopes of getting some constructive feedback, as I want to present
useful information rather than just columns of numbers.

-mark
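[One way to sanity-check that understanding, sketched in Perl: sum each
cache's Total and compare it against the Slab: line in /proc/meminfo,
skipping the symlinked aliases so nothing is counted twice. This reuses
$base, $page_size and read_field from the earlier sketches; as the thread
notes, the numbers come out close rather than identical:]

    my $sum_bytes = 0;
    opendir(my $dh, $base) or die "cannot open $base: $!";
    for my $cache (grep { !/^\./ } readdir $dh) {
        next if -l "$base/$cache";   # aliases point at the same cache
        my $order = read_field($cache, 'order');
        my $slabs = read_field($cache, 'slabs');
        next unless defined $order && defined $slabs;
        $sum_bytes += $slabs * ($page_size << $order);
    }
    closedir $dh;

    open(my $mi, '<', '/proc/meminfo') or die $!;
    my ($slab_kb) = map { /^Slab:\s+(\d+)\s+kB/ ? $1 : () } <$mi>;
    close $mi;
    printf "sysfs total: %dK   meminfo Slab: %dK\n",
        $sum_bytes / 1024, $slab_kb;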
* Re: SLUB
From: Mark Seger @ 2007-12-27 15:59 UTC
To: Christoph Lameter; +Cc: linux-mm

I now have a 'prototype' of something I think makes sense, at least from my
collectl tool's perspective. Keep in mind the philosophy behind collectl is
to have a tool you can run both interactively and as a daemon that will
give you enough information to paint a picture of what's happening on your
system, and in this case I'm focused on slabs. This is not intended to be a
highly analytical tool but rather a starting point to identify areas
potentially requiring a deeper dive. For example, with the current version
that's driven off /proc/slabinfo, it's been possible to look at the
long-term changes to individual slabs to get a picture of how memory is
being allocated, and when there are memory issues it can be useful to see
which slabs (if any) are growing at an unexpected rate.

That said, I'm thinking of reporting something like the following:

           <-------- objects -------><---- slabs ----><---- memory ---->
Slab Name   Size  In Use   Avail     Size   Number     Used      Total
:0000008       8    2164    2560     4096        5    17312      20480
:0000016      16    1448    2816     4096       11    23168      45056
:0000024      24     460     680     4096        4    11040      16384
:0000032      32     384    1152     4096        9    12288      36864
:0000040      40     306     306     4096        3    12240      12288

The idea here is that for each slab, in the 'objects' section one can see
how many objects are 'in use' and how many are 'available', the point being
one can look at the difference to see how many more objects are available
before the system needs to allocate another slab. Under the 'slabs' section
you can see how big the individual slabs are and how many of them there
are, and finally under 'memory' you can see how much has been used by
processes vs how much is still allocated as slabs.

There are all sorts of other ways to present the data such as percentages,
differences, etc., but this is more or less the way I did it in the past
and the information was useful. One could also argue that the real key
information here is Used/Total and the rest is just window dressing, and I
couldn't disagree with that either, but I do think it helps paint a more
complete picture.

-mark
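[A sketch in Perl of how one line of that proposed layout might be
produced; the format widths and variable names are illustrative, not
collectl's actual code, and the values are assumed to come from the sysfs
reads shown earlier:]

    # total object slots available: slabs * objs_per_slab
    # (objs_per_slab is another constant sysfs file, mentioned earlier)
    my $avail = $slabs * $objs_per_slab;

    # one report line per cache in the proposed column layout
    printf "%-20s %6d %7d %7d %7d %7d %9d %9d\n",
        $name, $obj_size, $objects, $avail,
        $page_size << $order, $slabs,
        $objects * $obj_size, $slabs * ($page_size << $order);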
* Re: SLUB
From: Christoph Lameter @ 2007-12-27 19:43 UTC
To: Mark Seger; +Cc: linux-mm

On Thu, 27 Dec 2007, Mark Seger wrote:

>            <-------- objects -------><---- slabs ----><---- memory ---->
> Slab Name   Size  In Use   Avail     Size   Number     Used      Total
> :0000008       8    2164    2560     4096        5    17312      20480

The right hand side is okay. Could you list all the slab names that are
covered by :0000008 on the left side (maybe separated by commas)? Having
the :0000008 there is ugly. slabinfo can show you a way to get the names.

> There are all sorts of other ways to present the data such as
> percentages, differences, etc., but this is more or less the way I did it
> in the past and the information was useful. One could also argue that the
> real key information here is Used/Total and the rest is just window
> dressing, and I couldn't disagree with that either, but I do think it
> helps paint a more complete picture.

I agree.
* Re: SLUB
From: Mark Seger @ 2007-12-27 19:57 UTC
To: Christoph Lameter; +Cc: linux-mm

Christoph Lameter wrote:
> The right hand side is okay. Could you list all the slab names that are
> covered by :0000008 on the left side (maybe separated by commas)? Having
> the :0000008 there is ugly. slabinfo can show you a way to get the names.

Here's the challenge - I only want to use a single line per entry AND I
want all the columns to line up for easy reading (I don't want much, do
I?). I'll have to do some experiments to see what might look better. One
thought is to list a 'primary' name (whatever that might mean) in the
left-hand column and perhaps line up the rest of the names to the right of
the total. Another option could be to just repeat the line with each slab
entry, but that generates a lot of output, and one of the other notions
behind collectl is to make it real easy to see what's going on; repeating
information can be confusing.

I'm assuming the way slabinfo gets the names (or at least the way I can
think of doing it) is to just look for entries in /sys/slab that are links.

> I agree.

The neat thing about collectl is it's written in perl and contains lots of
switches and print statements. I can easily see additional switches that
might control how the information is printed, such as the 'node' level
allocations, but I figure that can come later.

-mark
* Re: SLUB
From: Christoph Lameter @ 2007-12-27 19:58 UTC
To: Mark Seger; +Cc: linux-mm

On Thu, 27 Dec 2007, Mark Seger wrote:

> Here's the challenge - I only want to use a single line per entry AND I
> want all the columns to line up for easy reading (I don't want much, do
> I?). I'll have to do some experiments to see what might look better. One
> thought is to list a 'primary' name (whatever that might mean) in the
> left-hand column and perhaps line up the rest of the names to the right
> of the total.

slabinfo has the concept of the "first" name of a slab. See the -f option.

> Another option could be to just repeat the line with each slab entry, but
> that generates a lot of output, and one of the other notions behind
> collectl is to make it real easy to see what's going on; repeating
> information can be confusing.

I'd say just pack as much as fits into the space and then create a new line
if there are too many aliases of the slab.

> I'm assuming the way slabinfo gets the names (or at least the way I can
> think of doing it) is to just look for entries in /sys/slab that are
> links.

It scans for symlinks pointing to that strange name. Source code for
slabinfo is in Documentation/vm/slabinfo.c.
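[That symlink scan might look like the following Perl sketch, grouping
every alias under the cache it points at; $base is the sysfs directory
from the earlier sketches, and readlink targets are assumed to end in the
real cache name:]

    my %aliases;   # real cache name => list of alias names
    opendir(my $dh, $base) or die "cannot open $base: $!";
    for my $entry (grep { !/^\./ } readdir $dh) {
        if (-l "$base/$entry") {
            my $target = readlink "$base/$entry";
            $target =~ s{.*/}{};   # keep only the final path component
            push @{ $aliases{$target} }, $entry;
        } else {
            $aliases{$entry} ||= [];   # real cache, possibly unaliased
        }
    }
    closedir $dh;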
* Re: SLUB
From: Mark Seger @ 2007-12-27 20:17 UTC
To: Christoph Lameter; +Cc: linux-mm

Christoph Lameter wrote:
> slabinfo has the concept of the "first" name of a slab. See the -f
> option.

slick!

> I'd say just pack as much as fits into the space and then create a new
> line if there are too many aliases of the slab.

lemme play with it some

> It scans for symlinks pointing to that strange name. Source code for
> slabinfo is in Documentation/vm/slabinfo.c.

gotcha...

-mark
* Re: SLUB
From: Mark Seger @ 2007-12-27 20:55 UTC
To: Christoph Lameter; +Cc: linux-mm

OK, here's a dumb question... I've been looking at slabinfo and see a
routine called find_one_alias which returns the alias that gets printed
with the -f switch. The only thing is, the leading comment says "Find the
shortest alias of a slab" but it looks like it returns the longest name.
Did you change the functionality after you wrote the comment? That'll teach
you for commenting your code! 8-)

I'm also not sure why it would stop the search when it finds an alias that
starts with 'kmall'. Is there some reason you wouldn't want to use any of
those names as potential candidates? Does it really matter how I choose the
'first' name? It's certainly easy enough to pick the longest, I'm just not
sure about the test for 'kmall'.

-mark
* Re: SLUB
From: Christoph Lameter @ 2007-12-27 20:59 UTC
To: Mark Seger; +Cc: linux-mm

On Thu, 27 Dec 2007, Mark Seger wrote:

> OK, here's a dumb question... I've been looking at slabinfo and see a
> routine called find_one_alias which returns the alias that gets printed
> with the -f switch. The only thing is, the leading comment says "Find the
> shortest alias of a slab" but it looks like it returns the longest name.
> Did you change the functionality after you wrote the comment? That'll
> teach you for commenting your code! 8-)

Yuck.

> I'm also not sure why it would stop the search when it finds an alias
> that starts with 'kmall'. Is there some reason you wouldn't want to use
> any of those names as potential candidates? Does it really matter how I
> choose the 'first' name? It's certainly easy enough to pick the longest,
> I'm just not sure about the test for 'kmall'.

Well, the kmallocs are generic and just give size information. You want a
slab name that is more informative than that.
* collectl and the new slab allocator [slub] statistics
From: Mark Seger @ 2007-12-27 23:49 UTC
To: Christoph Lameter; +Cc: linux-mm

I hope you don't mind, but I changed the subject from the pretty generic
'SLUB'.

My latest thought about handling the multiple aliases is to do something
like slabinfo does - pick a 'primary' name based on a similar criterion,
such as the longest name that isn't 'kmalloc' or that other funky format
with the size in its name. Then provide a second option that shows the
mappings of all the names to the primary ones. That way, if you're
interested in a particular slab you can always look up its mapping. I would
also provide a mechanism for specifying the slabs you want to monitor, and
even a non-'primary' name would work there.

Today's kind of over for me, but perhaps I can send out an updated
prototype format tomorrow.

-mark
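[A sketch of that 'primary' name selection in Perl, following the rule
discussed above - the longest alias that is not a generic kmalloc name,
falling back to the raw cache name when nothing better exists; %aliases is
the hash built in the symlink-scan sketch earlier:]

    sub primary_name {
        my ($cache) = @_;
        # skip the generic kmalloc aliases, which only encode a size
        my @named = grep { !/^kmalloc/ } @{ $aliases{$cache} || [] };
        # prefer the longest remaining name as the most descriptive
        my ($longest) = sort { length($b) <=> length($a) } @named;
        return defined $longest ? $longest : $cache;
    }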
* Re: collectl and the new slab allocator [slub] statistics
From: Christoph Lameter @ 2007-12-27 23:52 UTC
To: Mark Seger; +Cc: linux-mm

On Thu, 27 Dec 2007, Mark Seger wrote:

> That way, if you're interested in a particular slab you can always look
> up its mapping. I would also provide a mechanism for specifying the slabs
> you want to monitor, and even a non-'primary' name would work there.

Sounds good.

> Today's kind of over for me, but perhaps I can send out an updated
> prototype format tomorrow.

Great. But I will only be back next Wednesday.
* Re: collectl and the new slab allocator [slub] statistics
From: Mark Seger @ 2007-12-28 15:10 UTC
To: Christoph Lameter; +Cc: linux-mm

Christoph Lameter wrote:
> Great. But I will only be back next Wednesday.

So here's the latest... I made a couple of tweaks to the format, but I
think it's getting real close, and as you can see I'm now printing the
longest alias associated with a slab, as is done in slabinfo. I'm also
including the time to make it easier to read, but typically this is an
option in case the user doesn't want to use the extra screen real estate.
As a minor point, as I was debugging this and comparing its output to
slabinfo (we don't always get the same aliases if there are multiple
aliases of the same length), I found that slabinfo reports 'kmalloc-1024'
where I report 'biovec-64'. I thought you wanted to print the kmalloc*
names only when there was nothing else, so I suspect a slight bug in
slabinfo...

Note that I decided to print the number of objects in a slab, even though
one could derive that oneself. I also decided to report the size of the
slabs in KB, as well as the used/total memory. I'm still reporting the
objects in-use/avail in bytes since these are often <1K and I really don't
want to report fractions.

                              <----------- objects ----------><-- slabs --><--- memory --->
Time     Slab Name            Size /slab In Use  Avail  SizeK Number  UsedK TotalK
10:25:04 TCP                  1728     4     13     20      8      5     21     40
10:25:04 TCPv6                1856     4     15     20      8      5     27     40
10:25:04 UDP-Lite              896     4     51     64      4     16     44     64
10:25:04 UDPLITEv6            1088     7     28     28      8      4     29     32
10:25:04 anon_vma               48    85    773   1105      4     13     36     52

Anyhow, here's an example of watching the system once a second for any
slabs that change while the system is idle:

                              <----------- objects ----------><-- slabs --><--- memory --->
Time     Slab Name            Size /slab In Use  Avail  SizeK Number  UsedK TotalK
10:25:34 skbuff_fclone_cache   448     9     16     36      4      4      7     16
10:25:34 skbuff_head_cache     256    16   1266   1552      4     97    316    388
10:25:35 skbuff_fclone_cache   448     9     23     36      4      4     10     16
10:25:35 skbuff_head_cache     256    16   1265   1552      4     97    316    388
10:25:36 biovec-64            1024     4    303    320      4     80    303    320
10:25:36 dentry                224    18 215543 215568      4  11976  47150  47904
10:25:36 skbuff_fclone_cache   448     9     19     36      4      4      8     16
10:25:36 skbuff_head_cache     256    16   1269   1552      4     97    317    388

And finally, here's watching a single slab while writing a large file,
noting the I/O started at 10:26:30...

                              <----------- objects ----------><-- slabs --><--- memory --->
Time     Slab Name            Size /slab In Use  Avail  SizeK Number  UsedK TotalK
10:26:25 blkdev_requests       288    14     39     84      4      6     10     24
10:26:30 blkdev_requests       288    14    189    224      4     16     53     64
10:26:31 blkdev_requests       288    14    187    224      4     16     52     64
10:26:32 blkdev_requests       288    14    174    224      4     16     48     64
10:26:33 blkdev_requests       288    14    173    224      4     16     48     64
10:26:34 blkdev_requests       288    14     46     84      4      6     12     24

It shouldn't take too much time to actually implement this in collectl, but
I do need to find a block of time to update the code, man pages, etc.
before releasing it, so if there are any final tweaks, now is the time to
say so...

-mark
* Re: collectl and the new slab allocator [slub] statistics
From: Mark Seger @ 2007-12-31 18:30 UTC
To: Christoph Lameter; +Cc: linux-mm

Even though I know you won't be around for a few days, I found a few more
cycles to put into this and have implemented quite a lot in collectl.
Rather than send along a bunch of output, I started to put together a web
page as part of the collectl web site, though I haven't linked it in yet as
I haven't released the associated version. In any event, I took a shot at
including a few high-level words about slabs in general as well as showing
what some of the different output formats will look like, as I'd much
rather make changes before I release it than after. That said, if you or
anyone else on this list wants to have a look at what I've been up to, you
can see it at http://collectl.sourceforge.net/SlabInfo.html

-mark
* Re: SLUB
From: Christoph Lameter @ 2007-12-27 19:40 UTC
To: Mark Seger; +Cc: linux-mm

On Thu, 27 Dec 2007, Mark Seger wrote:

> Since there are entries called object_size, objs_per_slab and slab_size,
> I would have thought that object_size * objs_per_slab = slab_size, but
> that clearly isn't the case. Since slabs are allocated in pages, the
> actual size of the slabs is always a power-of-two multiple of the page
> size, and that's why I see calculations in slabinfo like page_size <<
> order, but I guess I'm still not sure what the actual definition of
> 'order' is.

order is the shift you apply to PAGE_SIZE to get the allocation size you
want. Order 0 = PAGE_SIZE, order 1 = PAGE_SIZE << 1 (PAGE_SIZE * 2), order
2 = PAGE_SIZE << 2 (PAGE_SIZE * 4), etc.

> Therefore the 2 high-level numbers that seem of interest from a memory
> usage perspective are the memory allocated and the amount in use. I think
> these are the "Total" and "Used" fields in slabinfo.

Total is the total memory allocated from the page allocator. There are 4
slabs allocated with a size of 4096 bytes each; that is 16k.

The Used value is the memory that was actually handed out through kmalloc
and friends.

> Total = page_size << order

Order = 0, so that would give 4096 << 0 = 4096. Wrong value - you still
need to multiply by the number of slabs.

> As for 'Used', that looks to be a straight calculation of objects *
> object_size.

Right.

> The Slab field in /proc/meminfo is the total of the individual 'Total's...

Right.

> Stay tuned: at some point I'll have support for reporting total/allocated
> usage by slab in collectl, though perhaps I'll post a 'proposal' first in
> the hopes of getting some constructive feedback, as I want to present
> useful information rather than just columns of numbers.

Ahh, great. Thanks for all your work.
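[In code form, the correction amounts to multiplying the per-slab size by
the slab count; a small Perl sketch with the skbuff_fclone_cache numbers
quoted above (order 0, 4 slabs, 25 objects of 420 bytes):]

    my $slab_bytes  = $page_size << $order;   # order 0: 4096 << 0 = 4096
    my $total_bytes = $slab_bytes * $slabs;   # 4096 * 4 = 16384, slabinfo's Total
    my $used_bytes  = $objects * $obj_size;   # 25 * 420 = 10500, slabinfo's Used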
* Re: SLUB
  2007-12-27 19:40 ` SLUB Christoph Lameter
@ 2007-12-27 19:51 ` Mark Seger
  2007-12-27 19:53 ` SLUB Christoph Lameter
  0 siblings, 1 reply; 24+ messages in thread
From: Mark Seger @ 2007-12-27 19:51 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-mm

It feels like we're closing in on something, as I'm getting more 'Right's
from you than before. 8-) Just a few more comments/questions on your
comments below...

Christoph Lameter wrote:
> On Thu, 27 Dec 2007, Mark Seger wrote:
>
>> Now that I've had some more time to think about this and play around
>> with the slabinfo tool, I fear my problem was getting my head wrapped
>> around the terminology, but that's my problem. Since there are entries
>> called object_size, objs_per_slab and slab_size, I would have thought
>> that object_size * objs_per_slab = slab_size, but that clearly isn't
>> the case. Since slabs are allocated in pages, the actual size of a slab
>> is always a power-of-2 multiple of the page size, and that's why I see
>> calculations in slabinfo like page_size << order, but I guess I'm still
>> not sure what the actual definition of 'order' is.
>
> order is the shift you apply to PAGE_SIZE to get to the allocation size
> you want. Order 0 = PAGE_SIZE, order 1 = PAGE_SIZE << 1 (PAGE_SIZE * 2),
> order 2 = PAGE_SIZE << 2 (PAGE_SIZE * 4), etc.

I think the thing that was throwing me here for a while was the name
'order'. I thought it meant order in the ordinal sense, but clearly it's
meant in the 'power of' sense.

>> Slabcache: skbuff_fclone_cache   Aliases: 0   Order: 0   Objects: 25
>> ** Hardware cacheline aligned
>>
>> Sizes (bytes)     Slabs              Debug                 Memory
>> ------------------------------------------------------------------------
>> Object :     420  Total  :       4   Sanity Checks : Off   Total: 16384
>> SlabObj:     448  Full   :       0   Redzoning     : Off   Used : 10500
>> SlabSiz:    4096  Partial:       0   Poisoning     : Off   Loss :  5884
>> Loss   :      28  CpuSlab:       4   Tracking      : Off   Lalig:   700
>> Align  :       0  Objects:       9   Tracing       : Off   Lpadd:   256
>>
>> According to the entries under /sys/slab/skbuff_fclone_cache, it looks
>> like the slab_size field is being reported above as 'SlabObj',
>> objs_per_slab is being reported as 'Objects', and, as I mentioned above,
>> SlabSiz is based on 'order'.
>>
>> Anyhow, as I understand what's going on at a very high level, memory is
>> reserved for use as slabs (which themselves are multiples of pages) and
>> processes allocate objects from within slabs as they need them.
>> Therefore the two high-level numbers that seem of interest from a memory
>> usage perspective are the memory allocated and the amount in use. I
>> think these are the "Total" and "Used" fields in slabinfo.
>
> Total is the total memory allocated from the page allocator. There are 4
> slabs allocated, 4096 bytes each. That is 16k.
>
> The used value is the memory that was actually handed out through kmalloc
> and friends.
>
>> Total = page_size << order
>
> Order = 0. So Total would be 4096 << 0 = 4096. Wrong value.

I'm not sure what you mean by 'Wrong value'. I think it's because I said
page_size << order instead of (page_size << order) * number of slabs,
right?

>> As for 'Used', that looks to be a straight calculation of
>> objects * object_size.
>
> Right.
>
>> The Slabs field in /proc/meminfo is the total of the individual
>> 'Total's...
>
> Right.
>
>> Stay tuned, and at some point I'll have support in collectl for
>> reporting total/allocated usage by slab, though perhaps I'll post a
>> 'proposal' first in the hopes of getting some constructive feedback, as
>> I want to present useful information rather than just columns of
>> numbers.
>
> Ahh great. Thanks for all your work.

Now the only assumption is that someone will actually use it! 8-)

One more thing - can I assume order is a constant for a particular type of
slab and only need to read it at initialization time?

-mark
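Plugging the skbuff_fclone_cache numbers quoted above into the corrected
formula bears this out; a quick sanity check in the same sketch style, using
only figures already shown in the thread:

    # skbuff_fclone_cache, per the slabinfo output above:
    # order = 0, slabs = 4, objects = 25, object_size = 420
    total = (4096 << 0) * 4   # = 16384, matching slabinfo's "Total: 16384"
    used = 25 * 420           # = 10500, matching slabinfo's "Used : 10500"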
* Re: SLUB
  2007-12-27 19:51 ` SLUB Mark Seger
@ 2007-12-27 19:53 ` Christoph Lameter
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2007-12-27 19:53 UTC (permalink / raw)
To: Mark Seger; +Cc: linux-mm

On Thu, 27 Dec 2007, Mark Seger wrote:

>> Order = 0. So Total would be 4096 << 0 = 4096. Wrong value.
>
> I'm not sure what you mean by 'Wrong value'. I think it's because I said
> page_size << order instead of (page_size << order) * number of slabs,
> right?

Right.

> One more thing - can I assume order is a constant for a particular type
> of slab and only need to read it at initialization time?

Correct. Only the number of slabs and the number of objects change.
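That constancy suggests the obvious sampling structure: read the fixed
attributes once at startup and poll only the changing ones on each interval.
A rough sketch under the same assumptions as before (illustrative cache
names, assumed 4096-byte page size and /sys/slab layout):

    import time

    CACHES = ["shmem_inode_cache", "skbuff_fclone_cache"]  # examples
    PAGE_SIZE = 4096

    def read_int(cache, field):
        with open("/sys/slab/%s/%s" % (cache, field)) as f:
            return int(f.read().split()[0])

    # Constant per cache: read once at initialization time.
    const = {}
    for c in CACHES:
        const[c] = (read_int(c, "order"), read_int(c, "object_size"))

    while True:
        for c in CACHES:
            order, objsize = const[c]
            slabs = read_int(c, "slabs")      # changes as the cache grows
            objects = read_int(c, "objects")  # changes constantly
            print(c, (PAGE_SIZE << order) * slabs, objects * objsize)
        time.sleep(60)  # collectl's default slab interval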
* Re: SLUB
  2007-12-20 23:36 ` SLUB Mark Seger
  2007-12-21 1:09 ` SLUB Mark Seger
@ 2007-12-21 21:32 ` Christoph Lameter
  1 sibling, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2007-12-21 21:32 UTC (permalink / raw)
To: Mark Seger; +Cc: linux-mm

On Thu, 20 Dec 2007, Mark Seger wrote:

> What I'm not sure about is how this maps to the old slab info.
> Specifically, I believe in the old model one reported on the size taken
> up by the slabs (number of slabs X number of objects/slab X object size).
> There was a second size for the actual number of objects in use, so in my
> report that looked like this:
>
> #                 <-----------Objects----------><---------Slab Allocation------>
> #Name             InUse  Bytes  Alloc  Bytes    InUse  Bytes  Total  Bytes
> nfs_direct_cache      0      0      0      0        0      0      0      0
> nfs_write_data       36  27648     40  30720        8  32768      8  32768
>
> the slab allocation was real memory allocated (which should come close to
> Slab: in /proc/meminfo, right?) for the slabs while the object bytes were

The real memory allocated can be deduced from the "slabs" field. Multiply
it by the slab size (page_size << order) and you have the total. The
"objects" are the actual objects in current use.

> To get back to my original question, I'd like to make sure that I'm
> reporting useful information and not just data for the sake of it. In one
> of your postings I saw a report you had that showed:
>
> slubinfo - version: 1.0
> # name <objects> <order> <objsize> <slabs>/<partial>/<cpu> <flags> <nodes>

That report can be had using the slabinfo tool. See
Documentation/vm/slabinfo.c.

> How useful are order, cpu, flags and nodes?
> Do people really care about how much memory is taken up by objects vs.
> slabs? If not, I could see reporting for each slab:
> - object size
> - number of objects
> - slab size
> - number of slabs
> - total memory (slab size X number of slabs)
> - whatever else people might think to be useful, such as order, cpu,
>   flags, etc.

Sounds fine.

> Another thing I noticed is that a number of the slabs are simply links to
> the same base name; is it sufficient to just report the base names and
> not those linked to them?

Seems reasonable to me... slabinfo reports it like that.
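Since the aliased entries Mark mentions appear in sysfs as links to a
shared base cache, a tool can filter them out before monitoring. A hedged
sketch; treating aliases as symlinks matches the thread's description of
the layout, but is an assumption rather than a documented guarantee:

    import os

    SLAB_DIR = "/sys/slab"  # /sys/kernel/slab on some kernels

    # Keep only real caches; aliases are just symlinks to a base cache.
    caches = []
    for name in sorted(os.listdir(SLAB_DIR)):
        if not os.path.islink(os.path.join(SLAB_DIR, name)):
            caches.append(name)
    print("%d caches to monitor" % len(caches))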
* Re: SLUB
  2007-12-20 19:44 ` SLUB Christoph Lameter
  2007-12-20 23:36 ` SLUB Mark Seger
@ 2007-12-21 16:59 ` Mark Seger
  2007-12-21 21:37 ` SLUB Christoph Lameter
  1 sibling, 1 reply; 24+ messages in thread
From: Mark Seger @ 2007-12-21 16:59 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-mm

> I think we better keep it public (so that it goes into the archive).
> Here is a short description of the fields in /sys/kernel/slab/<slabcache>
> that you would need:
>
> -r--r--r-- 1 root root 4096 Dec 20 11:41 object_size
>
> The size of an object. Subtract object_size from slab_size and you have
> the per-object overhead generated by alignment and slab metadata. Does
> not change; you only need to read this once.
>
> -r--r--r-- 1 root root 4096 Dec 20 11:41 objects
>
> Number of objects in use. This changes and you may want to monitor it.
>
> -r--r--r-- 1 root root 4096 Dec 20 11:41 slab_size
>
> Total memory used for a single object. Read this only once.
>
> -r--r--r-- 1 root root 4096 Dec 20 11:41 slabs
>
> Number of slab pages in use for this slab cache. May change if the slab
> is extended.

Sorry for being confused, but I thought that a slab was made up of a number
of objects, and above you're saying slab_size is the size of a single
object. Furthermore, looking at /sys/slab/shmem_inode_cache I see:

object_size = 960
objs_per_slab = 4

which implies a slab is made up of more than one object, so which is it?
Could it be a simple matter of clearer names? I also see

slab_size = 968

which certainly supports your statement about this being the size of an
object, and it looks like there are 8 bytes of overhead. Finally, I also
see

objects = 242

and objects * objs_per_slab = slab_size. Is that a coincidence?

-mark
* Re: SLUB
  2007-12-21 16:59 ` SLUB Mark Seger
@ 2007-12-21 21:37 ` Christoph Lameter
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2007-12-21 21:37 UTC (permalink / raw)
To: Mark Seger; +Cc: linux-mm

On Fri, 21 Dec 2007, Mark Seger wrote:

> Sorry for being confused, but I thought that a slab was made up of a
> number of objects, and above you're saying slab_size is the size of a
> single object. Furthermore, looking at /sys/slab/shmem_inode_cache I see:
>
> object_size = 960
> objs_per_slab = 4
>
> which implies a slab is made up of more than one object, so which is it?
> Could it be a simple matter of clearer names? I also see

Yes, a slab holds "objs_per_slab" objects.

> slab_size = 968
>
> which certainly supports your statement about this being the size of an
> object, and it looks like there are 8 bytes of overhead. Finally, I also
> see
>
> objects = 242
>
> and objects * objs_per_slab = slab_size. Is that a coincidence?

This means that the slab cache contains 242 active objects. From the
"slabs" field you can deduce how many objects the cache could hold:
slabs * objs_per_slab. If you subtract "objects" from this, you have the
number of unused objects in the slabs.
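Those two relationships, per-object overhead and unused capacity, are easy
to compute together; a small sketch under the same illustrative assumptions
about the /sys/slab file layout as the earlier ones:

    def read_int(cache, field):
        with open("/sys/slab/%s/%s" % (cache, field)) as f:
            return int(f.read().split()[0])

    cache = "shmem_inode_cache"
    overhead = read_int(cache, "slab_size") - read_int(cache, "object_size")
    capacity = read_int(cache, "slabs") * read_int(cache, "objs_per_slab")
    unused = capacity - read_int(cache, "objects")
    print("%s: %d bytes overhead/object, %d unused object slots"
          % (cache, overhead, unused))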
end of thread, other threads:[~2007-12-31 18:30 UTC | newest]

Thread overview: 24+ messages
2007-12-20 15:06 SLUB Mark Seger
2007-12-20 19:44 ` SLUB Christoph Lameter
2007-12-20 23:36 ` SLUB Mark Seger
2007-12-21  1:09 ` SLUB Mark Seger
2007-12-21  1:27 ` SLUB Mark Seger
2007-12-21 21:41 ` SLUB Christoph Lameter
2007-12-27 14:22 ` SLUB Mark Seger
2007-12-27 15:59 ` SLUB Mark Seger
2007-12-27 19:43 ` SLUB Christoph Lameter
2007-12-27 19:57 ` SLUB Mark Seger
2007-12-27 19:58 ` SLUB Christoph Lameter
2007-12-27 20:17 ` SLUB Mark Seger
2007-12-27 20:55 ` SLUB Mark Seger
2007-12-27 20:59 ` SLUB Christoph Lameter
2007-12-27 23:49 ` collectl and the new slab allocator [slub] statistics Mark Seger
2007-12-27 23:52 ` Christoph Lameter
2007-12-28 15:10 ` Mark Seger
2007-12-31 18:30 ` Mark Seger
2007-12-27 19:40 ` SLUB Christoph Lameter
2007-12-27 19:51 ` SLUB Mark Seger
2007-12-27 19:53 ` SLUB Christoph Lameter
2007-12-21 21:32 ` SLUB Christoph Lameter
2007-12-21 16:59 ` SLUB Mark Seger
2007-12-21 21:37 ` SLUB Christoph Lameter