* SLUB
@ 2007-12-20 15:06 Mark Seger
2007-12-20 19:44 ` SLUB Christoph Lameter
0 siblings, 1 reply; 24+ messages in thread
From: Mark Seger @ 2007-12-20 15:06 UTC (permalink / raw)
To: linux-mm, clameter
Forgive me if this is the wrong place to be asking this, but if so could
someone point me to a better place?
This past summer I released a tool on sourceforge called collectl - see
http://collectl.sourceforge.net/ which does some pretty nifty system
monitoring, one component of which is slabs. I finally got around to
trying it out on a newer kernel and I picked 2.6.23 and lo and behold,
it didn't work because /proc/slabinfo has disappeared to be replaced by
/sys/slab. I've been looking around to try to better understand how to
map slubs to slabs and couldn't find anything written up about the
definitions of the fields in /sys/slab. I also suspect that while some
of the information reported by slub might map directly, there could be
other useful information worth tracking.
To back up a few steps, in my collectl tool I can monitor slabs both in
real time or log that data to a file for later playback. The format I
use for display is modeled after slabtop, but I simply record data for
all slabs (you can supply a filter). What I think is particularly
useful about collectl is a switch that only shows allocations that have
changed. This means if you run my tool with a monitoring interval of a
second (the default interval I use for slabs is 60 seconds since it is
more work to read/process all of slabinfo) you only see occasional
changes as they occur. I've also found this feature very useful when
analyzing longer term data that was collected at the 60 second
intervals. Here's an example of running it with a 1 second monitoring
interval on a relatively idle system:
#                         <-----------Objects----------><---------Slab Allocation------>
#         Name            InUse   Bytes  Alloc   Bytes   InUse   Bytes  Total   Bytes
09:28:54 sgpool-32           32   32768     36   36864       8   32768      9   36864
09:28:54 blkdev_requests     12    3168     30    7920       1    4096      2    8192
09:28:54 bio                313   40064    372   47616      11   45056     12   49152
09:28:55 sgpool-32           32   32768     32   32768       8   32768      8   32768
09:28:55 blkdev_requests     12    3168     15    3960       1    4096      1    4096
09:28:55 bio                313   40064    341   43648      11   45056     11   45056
09:28:56 bio                287   36736    341   43648      10   40960     11   45056
09:28:56 task_struct        128  253952    140  277760      69  282624     70  286720
09:28:58 sgpool-64           33   67584     34   69632      17   69632     17   69632
09:28:58 bio                403   51584    403   51584      13   53248     13   53248
09:28:58 task_struct        124  246016    140  277760      68  278528     70  286720
09:28:59 journal_handle       0       0      0       0       0       0      0       0
09:28:59 task_struct        124  246016    136  269824      68  278528     68  278528
09:29:00 journal_handle      16     768     81    3888       1    4096      1    4096
09:29:00 scsi_cmd_cache      24   12288     35   17920       5   20480      5   20480
09:29:00 sgpool-64           32   65536     34   69632      16   65536     17   69632
09:29:00 sgpool-8            51   13056     75   19200       5   20480      5   20480
The thing that is especially useful with collectl is that by monitoring
slabs at the same time as monitoring cpu, processes, disk, network and
more, you can get a very comprehensive picture of what's going on at any
one time.
My main purpose for writing to this list, then, is to ask: what would make
the most sense to do with slabs under the new slub allocator? Should I
simply report on these same fields? Are there others that make more
sense? Do I need to read all 184 entries in /sys/slab and then all the
entries under them? Clearly I want to do this efficiently and provide
meaningful data at the same time. Perhaps someone would like to take
this discussion off-line with me and even collaborate with me on
enhancements for slub in collectl?
-mark
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
* Re: SLUB
2007-12-20 15:06 SLUB Mark Seger
@ 2007-12-20 19:44 ` Christoph Lameter
2007-12-20 23:36 ` SLUB Mark Seger
2007-12-21 16:59 ` SLUB Mark Seger
0 siblings, 2 replies; 24+ messages in thread
From: Christoph Lameter @ 2007-12-20 19:44 UTC (permalink / raw)
To: Mark Seger; +Cc: linux-mm
On Thu, 20 Dec 2007, Mark Seger wrote:
> This past summer I released a tool on sourceforge called collectl - see
> http://collectl.sourceforge.net/ which does some pretty nifty system
> monitoring, one component of which is slabs. I finally got around to trying
> it out on a newer kernel and I picked 2.6.23 and lo and behold, it didn't work
> because /proc/slabinfo has disappeared to be replaced by /sys/slab. I've been
Yes. The information available about slabs is different now.
> The thing that is especially useful with collectl is that by monitoring slabs
> at the same time as monitoring cpu, processes, disk, network and more, you can
> get a very comprehensive picture of what's going on at any one time.
Good idea.
> My main purpose for writing to this list then becomes what would make the most
> sense to do with slabs with the new slub allocator? Should I simply report on
> these same fields? Are there others that make more sense? Do I need to read
> all 184 entries in /sys/slab and then all the entries under them? Clearly I
> want to do this efficiently and provide meaningful data at the same time.
You only need to read the files that contain the information you want to
display.
> Perhaps someone would like to take this discussion off-line with me and even
> collaborate with me on enhancements for slub in collectl?
I think we had better keep it public (so that it goes into the archive).
Here is a short description of the fields in /sys/kernel/slab/<slabcache>
that you would need:
-r--r--r-- 1 root root 4096 Dec 20 11:41 object_size

The size of an object. Subtract object_size from slab_size and you have
the per-object overhead generated by alignment and slab metadata. This
does not change, so you only need to read it once.

-r--r--r-- 1 root root 4096 Dec 20 11:41 objects

Number of objects in use. This changes, so you may want to monitor it.

-r--r--r-- 1 root root 4096 Dec 20 11:41 slab_size

Total memory used for a single object. Read this only once.

-r--r--r-- 1 root root 4096 Dec 20 11:41 slabs

Number of slab pages in use for this slab cache. May change if the slab
cache is extended.
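As a sketch of how a monitoring tool might poll these attributes (Python for illustration; `read_field` and `snapshot` are hypothetical names, not part of collectl):

```python
import os

def read_field(root, cache, field):
    """Read one sysfs attribute and return its leading integer
    (the 'objects' file may carry extra N<node>=<count> tokens)."""
    with open(os.path.join(root, cache, field)) as f:
        return int(f.read().split()[0])

def snapshot(root="/sys/kernel/slab"):
    """Gather the four fields for every cache. object_size and slab_size
    are constants; objects and slabs are the ones worth polling."""
    fields = ("object_size", "objects", "slab_size", "slabs")
    return {cache: {f: read_field(root, cache, f) for f in fields}
            for cache in os.listdir(root)}
```

A real collector would read object_size and slab_size once per cache and then poll only objects and slabs on each interval.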
* Re: SLUB
2007-12-20 19:44 ` SLUB Christoph Lameter
@ 2007-12-20 23:36 ` Mark Seger
2007-12-21 1:09 ` SLUB Mark Seger
2007-12-21 21:32 ` SLUB Christoph Lameter
2007-12-21 16:59 ` SLUB Mark Seger
1 sibling, 2 replies; 24+ messages in thread
From: Mark Seger @ 2007-12-20 23:36 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-mm
>> Perhaps someone would like to take this discussion off-line with me and even
>> collaborate with me on enhancements for slub in collectl?
sounds good to me, I just didn't want to annoy anyone...
>> I think we better keep it public (so that it goes into the archive). Here
>> a short description of the field in /sys/kernel/slab/<slabcache> that you
>> would need
>>
>> -r--r--r-- 1 root root 4096 Dec 20 11:41 object_size
>>
>> The size of an object. Subtract slab_size - object_size and you have the
>> per object overhead generated by alignements and slab metadata. Does not
>> change you only need to read this once.
>>
>> -r--r--r-- 1 root root 4096 Dec 20 11:41 objects
>>
>> Number of objects in use. This changes and you may want to monitor it.
>>
>> -r--r--r-- 1 root root 4096 Dec 20 11:41 slab_size
>>
>> Total memory used for a single object. Read this only once.
>>
>> -r--r--r-- 1 root root 4096 Dec 20 11:41 slabs
>>
>> Number of slab pages in use for this slab cache. May change if slab is
>> extended.
>>
What I'm not sure about is how this maps to the old slab info.
Specifically, I believe in the old model one reported on the size taken
up by the slabs (number of slabs X number of objects/slab X object
size). There was a second size for the actual number of objects in use,
so in my report that looked like this:
#                  <-----------Objects----------><---------Slab Allocation------>
#Name              InUse   Bytes  Alloc   Bytes   InUse   Bytes  Total   Bytes
nfs_direct_cache       0       0      0       0       0       0      0       0
nfs_write_data        36   27648     40   30720       8   32768      8   32768
the slab allocation was real memory allocated for the slabs (which should
come close to Slab: in /proc/meminfo, right?) while the object bytes
were those in use. Is it worth it to continue this model, or do things
work differently? It sounds like I can still do this with the numbers
you've pointed me to above, and I now realize I only need to monitor
the number of slabs and the number of objects since the others are
constants.
To get back to my original question, I'd like to make sure that I'm
reporting useful information and not just data for the sake of it. In
one of your postings I saw a report you had that showed:
slubinfo - version: 1.0
# name <objects> <order> <objsize> <slabs>/<partial>/<cpu> <flags> <nodes>
How useful are order, cpu, flags and nodes?
Do people really care about how much memory is taken up by objects vs
slabs? If not, I could see reporting for each slab:
- object size
- number objects
- slab size
- number of slabs
- total memory (slab size X number of slabs)
- whatever else people might think to be useful such as order, cpu,
flags, etc
Another thing I noticed is that a number of the slabs are simply links to
the same base name; is it sufficient to just report the base names and
not those linked to them? Seems reasonable to me...
The interesting thing about collectl is that it's written in perl (but
I'm trying to be very careful to keep it efficient and it tends to use
<0.1% cpu when run as a daemon) and the good news is it's pretty easy to
get something implemented, depending on my free time. If we can get
some level of agreement on what seems useful I could get a version up
fairly quickly for people to start playing with if there is any interest.
-mark
* Re: SLUB
2007-12-20 23:36 ` SLUB Mark Seger
@ 2007-12-21 1:09 ` Mark Seger
2007-12-21 1:27 ` SLUB Mark Seger
2007-12-21 21:41 ` SLUB Christoph Lameter
2007-12-21 21:32 ` SLUB Christoph Lameter
1 sibling, 2 replies; 24+ messages in thread
From: Mark Seger @ 2007-12-21 1:09 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Mark Seger, linux-mm
I did some preliminary prototyping and I guess I'm not sure of the
math. If I understand what you're saying, an object has a particular
size, but given the fact that you may need alignment, the true size is
really the slab size, and the difference is the overhead. What I don't
understand is how to calculate how much memory a particular slab takes
up. If the slabsize is really the size of an object, wouldn't I
multiply that by the number of objects? But when I do that I get a
number smaller than that reported in /proc/meminfo, in my case 15997K vs
17388K. Given memory numbers rarely seem to add up, maybe this IS close
enough? If so, what's the significance of the number of slabs? Would I
divide the 15997K by the number of slabs to find out how big a single
slab is? I would have thought that's what the slab_size is but clearly
it isn't.
In any event, here's a table of what I see on my machine. The first 4
columns come from /sys/slab and the 5th I calculated by just multiplying
SlabSize X NumObj. If I should be doing something else, please tell
me. Also be sure to tell me if I should include other data. For
example, the number of objects is a little misleading since when I look
at the file I really see something like:
49 N0=19 N1=30
which I'm guessing may mean 19 objects are allocated to socket 0 and 30
to socket 1? this is a dual-core, dual-socket system.
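A per-node breakdown like the one above could be split apart with a small helper (a sketch; `parse_objects` is a hypothetical name):

```python
def parse_objects(text):
    """Parse a SLUB 'objects' file: a total count, then optional
    'N<node>=<count>' pairs that appear only on NUMA systems."""
    tokens = text.split()
    total = int(tokens[0])
    per_node = {int(t[1:].split("=")[0]): int(t.split("=")[1])
                for t in tokens[1:]}
    return total, per_node
```

On a non-NUMA system the file contains only the total, so the per-node dict comes back empty.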
-mark
Mark Seger wrote:
>
>>> Perhaps someone would like to take this discussion off-line with me
>>> and even
>>> collaborate with me on enhancements for slub in collectl?
> sounds good to me, I just didn't want to annoy anyone...
>>> I think we better keep it public (so that it goes into the archive).
>>> Here a short description of the field in
>>> /sys/kernel/slab/<slabcache> that you would need
>>>
>>> -r--r--r-- 1 root root 4096 Dec 20 11:41 object_size
>>>
>>> The size of an object. Subtract slab_size - object_size and you have
>>> the per object overhead generated by alignements and slab metadata.
>>> Does not change you only need to read this once.
>>>
>>> -r--r--r-- 1 root root 4096 Dec 20 11:41 objects
>>>
>>> Number of objects in use. This changes and you may want to monitor it.
>>>
>>> -r--r--r-- 1 root root 4096 Dec 20 11:41 slab_size
>>>
>>> Total memory used for a single object. Read this only once.
>>>
>>> -r--r--r-- 1 root root 4096 Dec 20 11:41 slabs
>>>
>>> Number of slab pages in use for this slab cache. May change if slab
>>> is extended.
>>>
> What I'm not sure about is how this maps to the old slab info.
> Specifically, I believe in the old model one reported on the size
> taken up by the slabs (number of slabs X number of objects/slab X
> object size). There was a second size for the actual number of
> objects in use, so in my report that looked like this:
>
> #                  <-----------Objects----------><---------Slab Allocation------>
> #Name              InUse   Bytes  Alloc   Bytes   InUse   Bytes  Total   Bytes
> nfs_direct_cache       0       0      0       0       0       0      0       0
> nfs_write_data        36   27648     40   30720       8   32768      8   32768
>
> the slab allocation was real memory allocated (which should come close
> to Slab: in /proc/meminfo, right?) for the slabs while the object
> bytes were those in use. Is it worth it to continue this model or do
> thing work differently. It sounds like I can still do this with the
> numbers you've pointed me to above and I do now realize I only need to
> monitor the number of slabs and the number of objects since the others
> are constants.
>
> To get back to my original question, I'd like to make sure that I'm
> reporting useful information and not just data for the sake of it. In
> one of your postings I saw a report you had that showed:
>
> slubinfo - version: 1.0
> # name <objects> <order> <objsize> <slabs>/<partial>/<cpu>
> <flags> <nodes>
>
> How useful is order, cpu, flags and nodes?
> Do people really care about how much memory is taken up by objects vs
> slabs? If not, I could see reporting for each slab:
> - object size
> - number objects
> - slab size
> - number of slabs
> - total memory (slab size X number of slabs)
> - whatever else people might think to be useful such as order, cpu,
> flags, etc
>
> Another thing I noticed is a number of the slabs are simply links to
> the same base name and is it sufficient to just report the base names
> and not those linked to it? Seems reasonable to me...
>
> The interesting thing about collectl is that it's written in perl (but
> I'm trying to be very careful to keep it efficient and it tends to use
> <0.1% cpu when run as a daemon) and the good news is it's pretty easy
> to get something implemented, depending on my free time. If we can
> get some level of agreement on what seems useful I could get a version
> up fairly quickly for people to start playing with if there is any
> interest.
>
> -mark
>
* Re: SLUB
2007-12-21 1:09 ` SLUB Mark Seger
@ 2007-12-21 1:27 ` Mark Seger
2007-12-21 21:41 ` SLUB Christoph Lameter
1 sibling, 0 replies; 24+ messages in thread
From: Mark Seger @ 2007-12-21 1:27 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Mark Seger, linux-mm
I just realized I forgot to include an example of the output I was
generating so here it is:
Slab Name ObjSize NumObj SlabSize NumSlab Total
:0000008 8 2185 8 5 17480
:0000016 16 1604 16 9 25664
:0000024 24 409 24 4 9816
:0000032 32 380 32 5 12160
:0000040 40 204 40 2 8160
:0000048 48 0 48 0 0
:0000064 64 843 64 17 53952
:0000072 72 167 72 3 12024
:0000088 88 5549 88 121 488312
:0000096 96 1400 96 40 134400
:0000112 112 0 112 0 0
:0000128 128 385 128 21 49280
:0000136 136 70 136 4 9520
:0000152 152 59 152 4 8968
:0000160 160 46 160 4 7360
:0000176 176 2071 176 93 364496
:0000192 192 400 192 24 76800
:0000256 256 1333 256 100 341248
:0000288 288 54 288 6 15552
:0000320 320 53 320 7 16960
:0000384 384 29 384 5 11136
:0000448 420 22 448 4 9856
:0000512 512 150 512 22 76800
:0000704 696 33 704 3 23232
:0000768 768 82 768 21 62976
:0000832 776 98 832 15 81536
:0000896 896 48 896 14 43008
:0000960 944 39 960 15 37440
:0001024 1024 303 1024 80 310272
:0001088 1048 28 1088 4 30464
:0001608 1608 34 1608 7 54672
:0001728 1712 16 1728 5 27648
:0001856 1856 8 1856 2 14848
:0001904 1904 87 1904 28 165648
:0002048 2048 504 2048 131 1032192
:0004096 4096 49 4096 28 200704
:0008192 8192 8 8192 12 65536
:0016384 16384 4 16384 7 65536
:0032768 32768 3 32768 3 98304
:0065536 65536 1 65536 1 65536
:0131072 131072 0 131072 0 0
:0262144 262144 0 262144 0 0
:0524288 524288 0 524288 0 0
:1048576 1048576 0 1048576 0 0
:2097152 2097152 0 2097152 0 0
:4194304 4194304 0 4194304 0 0
:a-0000088 88 0 88 0 0
:a-0000104 104 13963 104 359 1452152
:a-0000168 168 0 168 0 0
:a-0000224 224 11113 224 619 2489312
:a-0000256 248 0 256 0 0
anon_vma 40 796 48 12 38208
bdev_cache 960 32 1024 8 32768
ext2_inode_cache 920 0 928 0 0
ext3_inode_cache 968 4775 976 1194 4660400
file_lock_cache 192 58 200 4 11600
hugetlbfs_inode_cache 752 5 760 1 3800
idr_layer_cache 528 91 536 14 48776
inode_cache 720 3015 728 604 2194920
isofs_inode_cache 768 0 776 0 0
kmem_cache_node 72 232 72 6 16704
mqueue_inode_cache 1040 7 1088 1 7616
nfs_inode_cache 1120 102 1128 15 115056
proc_inode_cache 752 503 760 102 382280
radix_tree_node 552 2666 560 381 1492960
rpc_inode_cache 928 16 960 4 15360
shmem_inode_cache 960 243 968 61 235224
sighand_cache 2120 86 2176 31 187136
sock_inode_cache 816 81 832 11 67392
TOTAL K: 17169
and here's /proc/meminfo
MemTotal: 4040768 kB
MemFree: 3726112 kB
Buffers: 13864 kB
Cached: 196920 kB
SwapCached: 0 kB
Active: 127264 kB
Inactive: 127864 kB
SwapTotal: 4466060 kB
SwapFree: 4466060 kB
Dirty: 60 kB
Writeback: 0 kB
AnonPages: 44364 kB
Mapped: 16124 kB
Slab: 18608 kB
SReclaimable: 11768 kB
SUnreclaim: 6840 kB
PageTables: 2240 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
CommitLimit: 6486444 kB
Committed_AS: 64064 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 32364 kB
VmallocChunk: 34359705775 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
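The computed total can be cross-checked against the kernel's own accounting by pulling the Slab: line out of /proc/meminfo (a sketch; `meminfo_slab_kb` is a hypothetical helper):

```python
def meminfo_slab_kb(text):
    """Extract the kernel's total slab usage (in kB) from /proc/meminfo text."""
    for line in text.splitlines():
        if line.startswith("Slab:"):
            return int(line.split()[1])
    raise ValueError("no Slab: line in meminfo")
```

With the numbers above this would return 18608, against the roughly 17 MB summed from the per-cache table.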
-mark
Mark Seger wrote:
> I did some preliminary prototyping and I guess I'm not sure of the
> math. If I understand what you're saying, an object has a particular
> size, but given the fact that you may need alignment, the true size is
> really the slab size, and the difference is the overhead. What I
> don't understand is how to calculate how much memory a particular slab
> takes up. If the slabsize is really the size of an object, wouldn't I
> multiple that times the number of objects? But when I do that I get a
> number smaller than that reported in /proc/meminfo, in my case 15997K
> vs 17388K. Given memory numbers rarely seem to add up maybe this IS
> close enough? If so, what's the significance of the number of slabs?
> Would I divide the 15997K by the number of slabs to find out how big a
> single slab is? I would have thought that's what the slab_size is but
> clearly it isn't.
>
> In any event, here's a table of what I see on my machine. The first 4
> columns come from /sys/slab and the 5th I calculated by just
> multiplying SlabSize X NumObj. If I should be doing something else,
> please tell me. Also be sure to tell me if I should include other
> data. For example, the number of objects is a little misleading since
> when I look at the file I really see something like:
>
> 49 N0=19 N1=30
>
> which I'm guessing may mean 19 objects are allocated to socket 0 and
> 30 to socket 1? this is a dual-core, dual-socket system.
>
> -mark
>
> Mark Seger wrote:
>>
>>>> Perhaps someone would like to take this discussion off-line with me
>>>> and even
>>>> collaborate with me on enhancements for slub in collectl?
>> sounds good to me, I just didn't want to annoy anyone...
>>>> I think we better keep it public (so that it goes into the
>>>> archive). Here a short description of the field in
>>>> /sys/kernel/slab/<slabcache> that you would need
>>>>
>>>> -r--r--r-- 1 root root 4096 Dec 20 11:41 object_size
>>>>
>>>> The size of an object. Subtract slab_size - object_size and you
>>>> have the per object overhead generated by alignements and slab
>>>> metadata. Does not change you only need to read this once.
>>>>
>>>> -r--r--r-- 1 root root 4096 Dec 20 11:41 objects
>>>>
>>>> Number of objects in use. This changes and you may want to monitor it.
>>>>
>>>> -r--r--r-- 1 root root 4096 Dec 20 11:41 slab_size
>>>>
>>>> Total memory used for a single object. Read this only once.
>>>>
>>>> -r--r--r-- 1 root root 4096 Dec 20 11:41 slabs
>>>>
>>>> Number of slab pages in use for this slab cache. May change if slab
>>>> is extended.
>>>>
>> What I'm not sure about is how this maps to the old slab info.
>> Specifically, I believe in the old model one reported on the size
>> taken up by the slabs (number of slabs X number of objects/slab X
>> object size). There was a second size for the actual number of
>> objects in use, so in my report that looked like this:
>>
>> #                  <-----------Objects----------><---------Slab Allocation------>
>> #Name              InUse   Bytes  Alloc   Bytes   InUse   Bytes  Total   Bytes
>> nfs_direct_cache       0       0      0       0       0       0      0       0
>> nfs_write_data        36   27648     40   30720       8   32768      8   32768
>>
>> the slab allocation was real memory allocated (which should come
>> close to Slab: in /proc/meminfo, right?) for the slabs while the
>> object bytes were those in use. Is it worth it to continue this
>> model or do thing work differently. It sounds like I can still do
>> this with the numbers you've pointed me to above and I do now realize
>> I only need to monitor the number of slabs and the number of objects
>> since the others are constants.
>>
>> To get back to my original question, I'd like to make sure that I'm
>> reporting useful information and not just data for the sake of it.
>> In one of your postings I saw a report you had that showed:
>>
>> slubinfo - version: 1.0
>> # name <objects> <order> <objsize> <slabs>/<partial>/<cpu>
>> <flags> <nodes>
>>
>> How useful is order, cpu, flags and nodes?
>> Do people really care about how much memory is taken up by objects vs
>> slabs? If not, I could see reporting for each slab:
>> - object size
>> - number objects
>> - slab size
>> - number of slabs
>> - total memory (slab size X number of slabs)
>> - whatever else people might think to be useful such as order, cpu,
>> flags, etc
>>
>> Another thing I noticed is a number of the slabs are simply links to
>> the same base name and is it sufficient to just report the base names
>> and not those linked to it? Seems reasonable to me...
>>
>> The interesting thing about collectl is that it's written in perl
>> (but I'm trying to be very careful to keep it efficient and it tends
>> to use <0.1% cpu when run as a daemon) and the good news is it's
>> pretty easy to get something implemented, depending on my free time.
>> If we can get some level of agreement on what seems useful I could
>> get a version up fairly quickly for people to start playing with if
>> there is any interest.
>>
>> -mark
>>
* Re: SLUB
2007-12-20 19:44 ` SLUB Christoph Lameter
2007-12-20 23:36 ` SLUB Mark Seger
@ 2007-12-21 16:59 ` Mark Seger
2007-12-21 21:37 ` SLUB Christoph Lameter
1 sibling, 1 reply; 24+ messages in thread
From: Mark Seger @ 2007-12-21 16:59 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-mm
> I think we better keep it public (so that it goes into the archive). Here
> a short description of the field in /sys/kernel/slab/<slabcache> that you
> would need
>
> -r--r--r-- 1 root root 4096 Dec 20 11:41 object_size
>
> The size of an object. Subtract slab_size - object_size and you have the
> per object overhead generated by alignements and slab metadata. Does not
> change you only need to read this once.
>
> -r--r--r-- 1 root root 4096 Dec 20 11:41 objects
>
> Number of objects in use. This changes and you may want to monitor it.
>
> -r--r--r-- 1 root root 4096 Dec 20 11:41 slab_size
>
> Total memory used for a single object. Read this only once.
>
> -r--r--r-- 1 root root 4096 Dec 20 11:41 slabs
>
> Number of slab pages in use for this slab cache. May change if slab is
> extended.
>
Sorry for being confused, but I thought that a slab was made up of a
number of objects, and above you're saying slab_size is the size of a
single object. Furthermore, looking at /sys/slab/shmem_inode_cache I see:

object_size = 960
objs_per_slab = 4

which implies a slab is made up of more than one object, so which is it?
Could it be a simple matter of clearer names? I also see

slab_size = 968

which certainly supports your statement about this being the size of an
object, and it looks like there are 8 bytes of overhead. Finally, I also see

objects = 242

and objects * objs_per_slab = slab_size. Is that a coincidence?
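Plugging the shmem_inode_cache numbers above into the earlier definitions gives (a sketch; note slab_size is the per-object footprint, not the size of a whole slab):

```python
# Values quoted from /sys/slab/shmem_inode_cache above
object_size = 960    # payload bytes per object
slab_size = 968      # per-object footprint including alignment/metadata
objects = 242        # objects currently in use

overhead = slab_size - object_size   # per-object overhead in bytes
payload = objects * object_size      # bytes of useful data in live objects
footprint = objects * slab_size      # bytes those live objects occupy
```

On these numbers the overhead is 8 bytes per object, so 242 * 4 landing on 968 really is a coincidence of this particular cache's geometry.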
-mark
* Re: SLUB
2007-12-20 23:36 ` SLUB Mark Seger
2007-12-21 1:09 ` SLUB Mark Seger
@ 2007-12-21 21:32 ` Christoph Lameter
1 sibling, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2007-12-21 21:32 UTC (permalink / raw)
To: Mark Seger; +Cc: linux-mm
On Thu, 20 Dec 2007, Mark Seger wrote:
> What I'm not sure about is how this maps to the old slab info. Specifically,
> I believe in the old model one reported on the size taken up by the slabs
> (number of slabs X number of objects/slab X object size). There was a second
> size for the actual number of objects in use, so in my report that looked like
> this:
>
> #                  <-----------Objects----------><---------Slab Allocation------>
> #Name              InUse   Bytes  Alloc   Bytes   InUse   Bytes  Total   Bytes
> nfs_direct_cache       0       0      0       0       0       0      0       0
> nfs_write_data        36   27648     40   30720       8   32768      8   32768
>
> the slab allocation was real memory allocated (which should come close to
> Slab: in /proc/meminfo, right?) for the slabs while the object bytes were
The real memory allocated can be deduced from the "slabs" field.
Multiply that by the size of one slab (PAGE_SIZE << order) and you have
the total size.
The "objects" field gives the actual objects in current use.
> To get back to my original question, I'd like to make sure that I'm reporting
> useful information and not just data for the sake of it. In one of your
> postings I saw a report you had that showed:
>
> slubinfo - version: 1.0
> # name <objects> <order> <objsize> <slabs>/<partial>/<cpu> <flags>
> <nodes>
That report can be had using the slabinfo tool. See
Documentation/vm/slabinfo.c
> How useful is order, cpu, flags and nodes?
> Do people really care about how much memory is taken up by objects vs slabs?
> If not, I could see reporting for each slab:
> - object size
> - number objects
> - slab size
> - number of slabs
> - total memory (slab size X number of slabs)
> - whatever else people might think to be useful such as order, cpu, flags, etc
Sounds fine.
> Another thing I noticed is a number of the slabs are simply links to the same
> base name and is it sufficient to just report the base names and not those
> linked to it? Seems reasonable to me...
slabinfo reports it like that.
* Re: SLUB
2007-12-21 16:59 ` SLUB Mark Seger
@ 2007-12-21 21:37 ` Christoph Lameter
0 siblings, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2007-12-21 21:37 UTC (permalink / raw)
To: Mark Seger; +Cc: linux-mm
On Fri, 21 Dec 2007, Mark Seger wrote:
> Sorry for being confused, but I thought that a slab was made up of a number of
> objects and above you're saying slab_size is the size of single object.
> Furthermore, looking at /sys/slab/shmem_inode_cache I see:
>
> object_size = 960
> objs_per_slab = 4
>
> which implies a slab is made up more than one object, so which is it? could
> it be a simple matter of clearer names? I also see
Yes, a slab holds "objs_per_slab" objects.
>
> slab_size = 968
>
> which certainly supports your statement about this being the size of an object
> and it looks like there is 8 bytes of overhead. finally, I also see
>
> objects = 242
>
> and objects * obj_per_slab = slabsize. is that a coincidence?
This means that the slab contains 242 active objects. From the "slabs"
field you can deduce how many objects the slab could hold:
slabs * objs_per_slab
If you subtract "objects" from this then you have the number of unused
objects in the slabs.
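That arithmetic can be sketched as (numbers roughly matching the shmem_inode_cache example earlier in the thread; `slab_capacity` is a hypothetical helper):

```python
def slab_capacity(slabs, objs_per_slab, objects):
    """Total object slots across all slabs of a cache, and how many
    of those slots are currently unused."""
    capacity = slabs * objs_per_slab
    return capacity, capacity - objects
```

For example, 61 slabs of 4 objects each with 242 objects in use leaves 2 unused slots.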
* Re: SLUB
2007-12-21 1:09 ` SLUB Mark Seger
2007-12-21 1:27 ` SLUB Mark Seger
@ 2007-12-21 21:41 ` Christoph Lameter
2007-12-27 14:22 ` SLUB Mark Seger
1 sibling, 1 reply; 24+ messages in thread
From: Christoph Lameter @ 2007-12-21 21:41 UTC (permalink / raw)
To: Mark Seger; +Cc: linux-mm
On Thu, 20 Dec 2007, Mark Seger wrote:
> I did some preliminary prototyping and I guess I'm not sure of the math. If I
> understand what you're saying, an object has a particular size, but given the
> fact that you may need alignment, the true size is really the slab size, and
> the difference is the overhead. What I don't understand is how to calculate
> how much memory a particular slab takes up. If the slabsize is really the
If you want the use in terms of pages allocated from the page allocator
then you do
slabs << order
If you want to use in actual bytes in allocated objects by the user of
a slab cache then you can do
objects * obj_size
> this IS close enough? If so, what's the significance of the number of slabs?
Its the amount of pages that were taken from the page allocator.
> Would I divide the 15997K by the number of slabs to find out how big a single
> slab is? I would have thought that's what the slab_size is but clearly it
> isn't.
The size of a single slab that contains multiple objects is
PAGE_SIZE << order
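The two views can be computed together (a sketch; PAGE_SIZE is assumed to be 4096 here, and `cache_memory` is a hypothetical helper):

```python
PAGE_SIZE = 4096  # assumed; the real value comes from the running kernel

def cache_memory(slabs, order, objects, obj_size):
    """Pages taken from the page allocator vs. bytes in live objects."""
    allocated = (slabs << order) * PAGE_SIZE  # total footprint in bytes
    in_use = objects * obj_size               # bytes in allocated objects
    return allocated, in_use
```

Run against the skbuff_fclone_cache slabinfo output quoted later in the thread (4 order-0 slabs, 25 objects of 420 bytes), this reproduces its Total of 16384 and Used of 10500 bytes.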
> 49 N0=19 N1=30
>
> which I'm guessing may mean 19 objects are allocated to socket 0 and 30 to
> socket 1? this is a dual-core, dual-socket system.
Right. There are 49 objects in use; 19 of those are on node 0 and 30 on
node 1. The Nx values only show up on NUMA systems; otherwise they are
omitted.
* Re: SLUB
2007-12-21 21:41 ` SLUB Christoph Lameter
@ 2007-12-27 14:22 ` Mark Seger
2007-12-27 15:59 ` SLUB Mark Seger
2007-12-27 19:40 ` SLUB Christoph Lameter
0 siblings, 2 replies; 24+ messages in thread
From: Mark Seger @ 2007-12-27 14:22 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-mm
Now that I've had some more time to think about this and play around
with the slabinfo tool I fear my problem was getting my head wrapped
around the terminology, but that's my problem. Since there are entries
called object_size, objs_per_slab and slab_size I would have thought
that object_size * objs_per_slab = slab_size, but that clearly isn't the
case. Since slabs are allocated in pages, the actual size of a slab
is always a multiple of the page_size (by a power of 2, in fact), which
is why I see calculations in slabinfo like page_size << order, but I
guess I'm still not sure what the definition of 'order' actually is.
Anyhow, when I run slabinfo and see the following entry
Slabcache: skbuff_fclone_cache Aliases: 0 Order : 0 Objects: 25
** Hardware cacheline aligned
Sizes (bytes)   Slabs        Debug                 Memory
------------------------------------------------------------------------
Object : 420    Total  : 4   Sanity Checks : Off   Total: 16384
SlabObj: 448    Full   : 0   Redzoning     : Off   Used : 10500
SlabSiz: 4096   Partial: 0   Poisoning     : Off   Loss : 5884
Loss   : 28     CpuSlab: 4   Tracking      : Off   Lalig: 700
Align  : 0      Objects: 9   Tracing       : Off   Lpadd: 256
according to the entries under /sys/slab/skbuff_fclone_cache it looks
like the slab_size field is being reported above as 'SlabObj',
objs_per_slab is being reported as 'Objects', and, as I mentioned above,
SlabSiz is based on 'order'.
Anyhow, as I understand what's going on at a very high level, memory is
reserved for use as slabs (which themselves are multiples of pages) and
processes allocate objects from within slabs as they need them.
Therefore the two high-level numbers that seem of interest from a memory
usage perspective are the memory allocated and the amount in use. I
think these are the "Total" and "Used" fields in slabinfo.
Total = page_size << order
As for 'Used' that looks to be a straight calculation of objects *
object_size
The Slabs field in /proc/meminfo is the total of the individual 'Total's...
Stay tuned: at some point I'll have support in collectl for reporting
total/allocated usage by slab, though perhaps I'll post a
'proposal' first in the hopes of getting some constructive feedback, as I
want to present useful information rather than just columns of numbers.
-mark
* Re: SLUB
2007-12-27 14:22 ` SLUB Mark Seger
@ 2007-12-27 15:59 ` Mark Seger
2007-12-27 19:43 ` SLUB Christoph Lameter
2007-12-27 19:40 ` SLUB Christoph Lameter
1 sibling, 1 reply; 24+ messages in thread
From: Mark Seger @ 2007-12-27 15:59 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Mark Seger, linux-mm
I now have a 'prototype' of something I think makes sense, at least from
my collectl tool's perspective. Keep in mind the philosophy behind
collectl is to have a tool you can run both interactively and as a
daemon that will give you enough information to paint a picture of
what's happening on your system and in this case I'm focused on slabs.
This is not intended to be a highly analytical tool but rather a
starting point to identify areas potentially requiring a deeper dive.
For example, with the current version that's driven off /proc/slabinfo,
it's been possible to look at the long-term changes to individual slabs
to get a picture of how memory is being allocated, and when there are
memory issues it can be useful to see which slabs (if any) are growing
at an unexpected rate. That said, I'm thinking of reporting something
like the following:
          <-------- objects --------><----- slabs -----><------ memory ------>
Slab Name   Size  In Use  Avail     Size  Number     Used    Total
:0000008       8    2164   2560     4096       5    17312    20480
:0000016      16    1448   2816     4096      11    23168    45056
:0000024      24     460    680     4096       4    11040    16384
:0000032      32     384   1152     4096       9    12288    36864
:0000040      40     306    306     4096       3    12240    12288
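One way such a line could be derived from the per-cache sysfs values, as a hedged sketch (the field and function names are illustrative, and a 4096-byte page is assumed):

```python
PAGE_SIZE = 4096  # assumed

def slab_row(name, obj_size, in_use, objs_per_slab, order, slabs):
    """Format one line of the proposed report from sysfs-style values."""
    avail = objs_per_slab * slabs      # objects available before another slab is needed
    slab_size = PAGE_SIZE << order     # size of one slab
    used = in_use * obj_size           # bytes actually handed out
    total = slab_size * slabs          # bytes held by the cache
    return "%-12s %6d %7d %7d %7d %7d %8d %8d" % (
        name, obj_size, in_use, avail, slab_size, slabs, used, total)

# The :0000008 row above: 8-byte objects, 2164 in use, 512 objects/slab,
# order 0, 5 slabs -> avail 2560, used 17312, total 20480
print(slab_row(":0000008", 8, 2164, 512, 0, 5))
```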
The idea here is that for each slab in the 'objects' section one can
see how many objects are 'in use' and how many are 'available', the
point being one can look at the difference to see how many more objects
are available before the system needs to allocate another slab. Under
the 'slabs' section you can see how big the individual slabs are and how
many of them there are and finally under 'memory' you can see how much
has been used by processes vs how much is still allocated as slabs.
There are all sorts of other ways to present the data such as
percentages, differences, etc. but this is more-or-less the way I did it
in the past and the information was useful. One could also argue that
the real key information here is Used/Total and the rest is just window
dressing and I couldn't disagree with that either, but I do think it
helps paint a more complete picture.
-mark
* Re: SLUB
2007-12-27 14:22 ` SLUB Mark Seger
2007-12-27 15:59 ` SLUB Mark Seger
@ 2007-12-27 19:40 ` Christoph Lameter
2007-12-27 19:51 ` SLUB Mark Seger
1 sibling, 1 reply; 24+ messages in thread
From: Christoph Lameter @ 2007-12-27 19:40 UTC (permalink / raw)
To: Mark Seger; +Cc: linux-mm
On Thu, 27 Dec 2007, Mark Seger wrote:
> Now that I've had some more time to think about this and play around with the
> slabinfo tool I fear my problem had getting my head wrapped around the
> terminology, but that's my problem. Since there are entries called
> object_size, objs_per_slab and slab_size I would have thought that
> object_size*objects_per_slab=slab_size but that clearly isn't the case. Since
> slabs are allocated in pages, the actual size of the slabs is always a
> multiple of the page_size (actually by a power of 2) and that's why I see
> calculations in slabinfo like page_size << order, but I guess I'm still not
> sure what the actual definition of 'order' actually is.
order is the shift you apply to PAGE_SIZE to get to the allocation size
you want: order 0 = PAGE_SIZE, order 1 = PAGE_SIZE << 1 (PAGE_SIZE * 2),
order 2 = PAGE_SIZE << 2 (PAGE_SIZE * 4), etc.
> Slabcache: skbuff_fclone_cache Aliases: 0 Order : 0 Objects: 25
> ** Hardware cacheline aligned
>
> Sizes (bytes) Slabs Debug Memory
> ------------------------------------------------------------------------
> Object : 420 Total : 4 Sanity Checks : Off Total: 16384
> SlabObj: 448 Full : 0 Redzoning : Off Used : 10500
> SlabSiz: 4096 Partial: 0 Poisoning : Off Loss : 5884
> Loss : 28 CpuSlab: 4 Tracking : Off Lalig: 700
> Align : 0 Objects: 9 Tracing : Off Lpadd: 256
>
> according to the entries under /sys/slabs/skbuff_fclone_cache it looks like
> the slab_size field is being reported above as 'SlabObj' and objs_per_slab is
> being reported as 'Objects' and as I mentioned above, SlabSiz is based on
> 'order'.
>
> Anyhow, as I understand what's going on at a very high level, memory is
> reserved for use as slabs (which themselves are multiples of pages) and
> processes allocate objects from within slabs as they need them. Therefore the
> 2 high-level numbers that seem of interest from a memory usage perspective are
> the memory allocated and the amount in use. I think these are the "Total" and
> "Used" fields in slabinfo.
Total is the total memory allocated from the page allocator. There are 4
slabs allocated of 4096 bytes each, which is 16k.
The used value is the memory that was actually handed out through kmalloc
and friends.
> Total = page_size << order
Order = 0. So Total would be 4096 << 0 = 4096. Wrong value.
> As for 'Used' that looks to be a straight calculation of objects * object_size
Right.
> The Slabs field in /proc/meminfo is the total of the individual 'Total's...
Right.
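Folding the corrections above into one place, a small illustrative sketch (PAGE_SIZE and all names are assumptions): per-cache Total is (page_size << order) multiplied by the number of slabs, Used is objects * object_size, and the meminfo Slab figure is the sum of the per-cache Totals:

```python
PAGE_SIZE = 4096  # assumed

def cache_totals(caches):
    """Compute (used, total) per cache and the meminfo-style grand total."""
    report = {}
    grand_total = 0
    for name, c in caches.items():
        total = (PAGE_SIZE << c["order"]) * c["slabs"]  # not just page_size << order
        used = c["objects"] * c["object_size"]
        report[name] = (used, total)
        grand_total += total
    return report, grand_total

# skbuff_fclone_cache from the thread: order 0, 4 slabs, 25 objects of 420 bytes
caches = {"skbuff_fclone_cache":
          {"order": 0, "slabs": 4, "objects": 25, "object_size": 420}}
print(cache_totals(caches))
```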
> Stay tuned and at some point I'll have support in collectl for reporting
> total/allocated usage by slab in collectl, though perhaps I'll post a
> 'proposal' first in the hopes of getting some constructive feedback as I want
> to present useful information rather than that columns of numbers.
Ahh Great. Thanks for all your work.
* Re: SLUB
2007-12-27 15:59 ` SLUB Mark Seger
@ 2007-12-27 19:43 ` Christoph Lameter
2007-12-27 19:57 ` SLUB Mark Seger
0 siblings, 1 reply; 24+ messages in thread
From: Christoph Lameter @ 2007-12-27 19:43 UTC (permalink / raw)
To: Mark Seger; +Cc: linux-mm
On Thu, 27 Dec 2007, Mark Seger wrote:
> <-------- objects --------><----- slabs -----><------ memory ------>
> Slab Name Size In Use Avail Size Number Used Total
> :0000008 8 2164 2560 4096 5 17312 20480
The right hand side is okay. Could you list all the slab names that are
covered by :00008 on the left side (maybe separated by commas)? Having the
:00008 there is ugly. slabinfo can show you how to get the names.
> There are all sorts of other ways to present the data such as percentages,
> differences, etc. but this is more-or-less the way I did it in the past and
> the information was useful. One could also argue that the real key
> information here is Uses/Total and the rest is just window dressing and I
> couldn't disagree with that either, but I do think it helps paint a more
> complete picture.
I agree.
* Re: SLUB
2007-12-27 19:40 ` SLUB Christoph Lameter
@ 2007-12-27 19:51 ` Mark Seger
2007-12-27 19:53 ` SLUB Christoph Lameter
0 siblings, 1 reply; 24+ messages in thread
From: Mark Seger @ 2007-12-27 19:51 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-mm
It feels like we're closing in on something, as I'm getting more 'Right's
from you than before. 8-)
Just a few more comments/questions to your comments below...
Christoph Lameter wrote:
> On Thu, 27 Dec 2007, Mark Seger wrote:
>
>
>> Now that I've had some more time to think about this and play around with the
>> slabinfo tool I fear my problem had getting my head wrapped around the
>> terminology, but that's my problem. Since there are entries called
>> object_size, objs_per_slab and slab_size I would have thought that
>> object_size*objects_per_slab=slab_size but that clearly isn't the case. Since
>> slabs are allocated in pages, the actual size of the slabs is always a
>> multiple of the page_size (actually by a power of 2) and that's why I see
>> calculations in slabinfo like page_size << order, but I guess I'm still not
>> sure what the actual definition of 'order' actually is.
>>
>
> order is the shift you apply to PAGE_SIZE to get to the allocation size
> you want. Order 0 = PAGE_SIZE, order 1 = PAGE_SIZE << 1 (PAGE_SIZE *2),
> order 2 = PAGE_SIZE << 2 (PAGE_SIZE * 4) etc.
>
I think the thing that was throwing me here for a while was the name
'order'. I thought it meant order in the ordinal sense, but clearly it's
intended more in the 'power of' sense.
>> Slabcache: skbuff_fclone_cache Aliases: 0 Order : 0 Objects: 25
>> ** Hardware cacheline aligned
>>
>> Sizes (bytes) Slabs Debug Memory
>> ------------------------------------------------------------------------
>> Object : 420 Total : 4 Sanity Checks : Off Total: 16384
>> SlabObj: 448 Full : 0 Redzoning : Off Used : 10500
>> SlabSiz: 4096 Partial: 0 Poisoning : Off Loss : 5884
>> Loss : 28 CpuSlab: 4 Tracking : Off Lalig: 700
>> Align : 0 Objects: 9 Tracing : Off Lpadd: 256
>>
>> according to the entries under /sys/slabs/skbuff_fclone_cache it looks like
>> the slab_size field is being reported above as 'SlabObj' and objs_per_slab is
>> being reported as 'Objects' and as I mentioned above, SlabSiz is based on
>> 'order'.
>>
>> Anyhow, as I understand what's going on at a very high level, memory is
>> reserved for use as slabs (which themselves are multiples of pages) and
>> processes allocate objects from within slabs as they need them. Therefore the
>> 2 high-level numbers that seem of interest from a memory usage perspective are
>> the memory allocated and the amount in use. I think these are the "Total" and
>> "Used" fields in slabinfo.
>>
>
> Total is the total memory allocated from the page allocator. There are 4
> slab allocated with the size of 4096 bytes each. This is 16k.
>
> The used value is the memory that was actually handed out through kmalloc
> and friends.
>
>
>> Total = page_size << order
>>
>
> Order = 0. So Total would be 4096 << 0 = 4096. Wrong value.
>
I'm not sure what you meant by 'wrong value'. I think it's because I said
page_size << order instead of (page_size << order) * number of slabs,
right?
>> As for 'Used' that looks to be a straight calculation of objects * object_size
>>
>
> Right.
>
>
>> The Slabs field in /proc/meminfo is the total of the individual 'Total's...
>>
>
> Right.
>
>
>> Stay tuned and at some point I'll have support in collectl for reporting
>> total/allocated usage by slab in collectl, though perhaps I'll post a
>> 'proposal' first in the hopes of getting some constructive feedback as I want
>> to present useful information rather than that columns of numbers.
>>
>
> Ahh Great. Thanks for all your work.
>
now the only assumption is that someone will actually use it! 8-)
one more thing - can I assume order is a constant for a particular type
of slab, so that I only need to read it at initialization time?
-mark
* Re: SLUB
2007-12-27 19:51 ` SLUB Mark Seger
@ 2007-12-27 19:53 ` Christoph Lameter
0 siblings, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2007-12-27 19:53 UTC (permalink / raw)
To: Mark Seger; +Cc: linux-mm
On Thu, 27 Dec 2007, Mark Seger wrote:
> > Order = 0. So Total would be 4096 << 0 = 4096. Wrong value.
> >
> I'm not sure what you meant by 'wrong value'. I think it's because I said page_size <<
> order instead of (page_size << order) * number of slabs, right?
Right.
> one more thing - can I assume order is a constant for a particular type of a
> slab and only need to read it at initialization time?
Correct. Only the number of slabs and the number of objects changes.
* Re: SLUB
2007-12-27 19:43 ` SLUB Christoph Lameter
@ 2007-12-27 19:57 ` Mark Seger
2007-12-27 19:58 ` SLUB Christoph Lameter
0 siblings, 1 reply; 24+ messages in thread
From: Mark Seger @ 2007-12-27 19:57 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-mm
Christoph Lameter wrote:
> On Thu, 27 Dec 2007, Mark Seger wrote:
>
>
>> <-------- objects --------><----- slabs
>> -----><------ memory ------>
>> Slab Name Size In Use Avail Size Number Used Total
>> :0000008 8 2164 2560 4096 5 17312 20480
>>
>
> The right hand side is okay. Could you list all the slab names that are
> covered by :00008 on the left side (maybe separated by commas?) Having the
> :00008 there is ugly. slabinfo can show you a way how to get the names.
>
here's the challenge - I only want to use a single line per entry AND I
want all the columns to line up for easy reading (I don't want much do
I?). I'll have to do some experiments to see what might look better.
One thought is to list a 'primary' name (whatever that might mean) in
the left-hand column and perhaps line up the rest of the other names to
the right of the total. Another option could be to just repeat the line
for each slab entry, but that also generates a lot of output, and one of
the other notions behind collectl is to make it really easy to see what's
going on; repeating information can be confusing.
I'm assuming the way slabinfo gets the names (or at least the way I can
think of doing it) is to just look for entries in /sys/slab that are symlinks.
>> There are all sorts of other ways to present the data such as percentages,
>> differences, etc. but this is more-or-less the way I did it in the past and
>> the information was useful. One could also argue that the real key
>> information here is Uses/Total and the rest is just window dressing and I
>> couldn't disagree with that either, but I do think it helps paint a more
>> complete picture.
>>
>
> I agree.
>
The neat thing about collectl is it's written in perl and contains lots
of switches and print statements. I can easily see additional switches
that might control how the information is printed, such as the 'node'
level allocations, but I figure that can come later.
-mark
* Re: SLUB
2007-12-27 19:57 ` SLUB Mark Seger
@ 2007-12-27 19:58 ` Christoph Lameter
2007-12-27 20:17 ` SLUB Mark Seger
2007-12-27 20:55 ` SLUB Mark Seger
0 siblings, 2 replies; 24+ messages in thread
From: Christoph Lameter @ 2007-12-27 19:58 UTC (permalink / raw)
To: Mark Seger; +Cc: linux-mm
On Thu, 27 Dec 2007, Mark Seger wrote:
> > The right hand side is okay. Could you list all the slab names that are
> > covered by :00008 on the left side (maybe separated by commas?) Having the
> > :00008 there is ugly. slabinfo can show you a way how to get the names.
> >
> here's the challenge - I only want to use a single line per entry AND I want
> all the columns to line up for easy reading (I don't want much do I?). I'll
> have to do some experiments to see what might look better. One thought is to
> list a 'primary' name (whatever that might mean) in the left-hand column and
> perhaps line up the rest of the other names to the right of the total.
slabinfo has the concept of the "first" name of a slab. See the -f option.
> Another option could be to just repeat the line with each slab entry but that
> also generates a lot of output and one of the other notions behind collectl is
> to make it real easy to see what's going on and repeating information can be
> confusing.
I'd say just pack as much as fits into the space and then create a new line
if there are too many aliases of the slab.
> I'm assuming the way slabinfo gets the names (or at least the way I can think
> of doing it) it so just look for entries in /sys/slab that are links.
It scans for symlinks pointing to that strange name. Source code for
slabinfo is in Documentation/vm/slabinfo.c.
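A rough sketch of that symlink scan, exercised against a throwaway directory rather than the real /sys/slab (the function name and the alias names below are made up for illustration):

```python
import os
import tempfile

def alias_map(slab_dir):
    """Group /sys/slab-style entries: each symlink is an alias pointing at
    the real cache directory (the ':0000008'-style name)."""
    aliases = {}
    for entry in os.listdir(slab_dir):
        path = os.path.join(slab_dir, entry)
        if os.path.islink(path):
            target = os.path.basename(os.path.realpath(path))
            aliases.setdefault(target, []).append(entry)
    return aliases

# Build a fake /sys/slab with one cache and two made-up aliases.
tmp = tempfile.mkdtemp()
os.mkdir(os.path.join(tmp, ":0000008"))
for name in ("kmalloc-8", "scsi_sense_cache"):
    os.symlink(os.path.join(tmp, ":0000008"), os.path.join(tmp, name))
print(sorted(alias_map(tmp)[":0000008"]))
```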
* Re: SLUB
2007-12-27 19:58 ` SLUB Christoph Lameter
@ 2007-12-27 20:17 ` Mark Seger
2007-12-27 20:55 ` SLUB Mark Seger
1 sibling, 0 replies; 24+ messages in thread
From: Mark Seger @ 2007-12-27 20:17 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-mm
Christoph Lameter wrote:
> On Thu, 27 Dec 2007, Mark Seger wrote:
>
>
>>> The right hand side is okay. Could you list all the slab names that are
>>> covered by :00008 on the left side (maybe separated by commas?) Having the
>>> :00008 there is ugly. slabinfo can show you a way how to get the names.
>>>
>>>
>> here's the challenge - I only want to use a single line per entry AND I want
>> all the columns to line up for easy reading (I don't want much do I?). I'll
>> have to do some experiments to see what might look better. One thought is to
>> list a 'primary' name (whatever that might mean) in the left-hand column and
>> perhaps line up the rest of the other names to the right of the total.
>>
>
> slabinfo has the concept of the "first" name of a slab. See the -f option.
>
slick!
>> Another option could be to just repeat the line with each slab entry but that
>> also generates a lot of output and one of the other notions behind collectl is
>> to make it real easy to see what's going on and repeating information can be
>> confusing.
>>
>
> I'd say just pack as much as fit into the space and then create a new line
> if there are too many aliases of the slab.
>
lemme play with it some
>> I'm assuming the way slabinfo gets the names (or at least the way I can think
>> of doing it) it so just look for entries in /sys/slab that are links.
>>
>
> It scans for symlinks pointing to that strange name. Source code for
> slabinfo is in Documentation/vm/slabinfo.c.
>
gotcha...
-mark
* Re: SLUB
2007-12-27 19:58 ` SLUB Christoph Lameter
2007-12-27 20:17 ` SLUB Mark Seger
@ 2007-12-27 20:55 ` Mark Seger
2007-12-27 20:59 ` SLUB Christoph Lameter
1 sibling, 1 reply; 24+ messages in thread
From: Mark Seger @ 2007-12-27 20:55 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-mm
ok, here's a dumb question... I've been looking at slabinfo and see a
routine called find_one_alias which returns the alias that gets printed
with the -f switch. The only thing is, the leading comment says "Find
the shortest alias of a slab" but it looks like it returns the longest
name. Did you change the functionality after you wrote the comment?
That'll teach you for commenting your code! 8-)
I'm also not sure why it would stop the search when it finds an alias
that started with 'kmall'. Is there some reason you wouldn't want to
use any of those names as potential candidates? Does it really matter
how I choose the 'first' name? It's certainly easy enough to pick the
longest, I'm just not sure about the test for 'kmall'.
-mark
* Re: SLUB
2007-12-27 20:55 ` SLUB Mark Seger
@ 2007-12-27 20:59 ` Christoph Lameter
2007-12-27 23:49 ` collectl and the new slab allocator [slub] statistics Mark Seger
0 siblings, 1 reply; 24+ messages in thread
From: Christoph Lameter @ 2007-12-27 20:59 UTC (permalink / raw)
To: Mark Seger; +Cc: linux-mm
On Thu, 27 Dec 2007, Mark Seger wrote:
> ok, here's a dumb question... I've been looking at slabinfo and see a routine
> called find_one_alias which returns the alias that gets printed with the -f
> switch. the only thing is the leading comment says "Find the shortest alias
> of a slab" but it looks like it returns the longest name. Did you change the
> functionality after your wrote the comment? that'll teach you for commenting
> your code! 8-)
Yuck.
> I'm also not sure why it would stop the search when it finds an alias that
> started with 'kmall'. Is there some reason you wouldn't want to use any of
> those names as potential candidates? Does it really matter how I choose the
> 'first' name? It's certainly easy enough to pick the longest, I'm just not
> sure about the test for 'kmall'.
Well the kmallocs are generic and just give size information. You want a
slab name that is more informative than that.
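A minimal sketch of that selection rule, i.e. prefer the longest alias that isn't a generic kmalloc name (the function name is hypothetical; this mirrors the intent described here, not slabinfo's exact code):

```python
def primary_name(aliases):
    """Pick a display name for a cache: the longest alias that is not a
    generic kmalloc name, falling back to a kmalloc name if that is all
    there is."""
    informative = [a for a in aliases if not a.startswith("kmall")]
    candidates = informative or list(aliases)
    return max(candidates, key=len)

print(primary_name(["kmalloc-1024", "biovec-64"]))  # biovec-64
print(primary_name(["kmalloc-8"]))                  # kmalloc-8
```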
* collectl and the new slab allocator [slub] statistics
2007-12-27 20:59 ` SLUB Christoph Lameter
@ 2007-12-27 23:49 ` Mark Seger
2007-12-27 23:52 ` Christoph Lameter
0 siblings, 1 reply; 24+ messages in thread
From: Mark Seger @ 2007-12-27 23:49 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-mm
I hope you don't mind, but I changed the subject from the pretty generic
one of 'SLUB'.
My latest thought about handling the multiple aliases is to do something
like slabinfo does: pick a 'primary' one based on a similar
criterion, such as the longest name that isn't 'kmalloc' or that other
funky format with the size in its name. Then provide a second option
that shows the mappings of all the names to the primary ones. That way,
if you're interested in a particular slab you can always look up its
mapping. I would also provide a mechanism for specifying the slabs
you want to monitor, and even if one isn't a 'primary' name it would use that name.
Today's kind of over for me but perhaps I can send out an updated
prototype format tomorrow.
-mark
* Re: collectl and the new slab allocator [slub] statistics
2007-12-27 23:49 ` collectl and the new slab allocator [slub] statistics Mark Seger
@ 2007-12-27 23:52 ` Christoph Lameter
2007-12-28 15:10 ` Mark Seger
0 siblings, 1 reply; 24+ messages in thread
From: Christoph Lameter @ 2007-12-27 23:52 UTC (permalink / raw)
To: Mark Seger; +Cc: linux-mm
On Thu, 27 Dec 2007, Mark Seger wrote:
> particular slab you can always look up its mapping. I would also provide a
> mechanism for specifying those slabs you want to monitor and even if not a
> 'primary' name it would use that name.
Sounds good.
> Today's kind of over for me but perhaps I can send out an updated prototype
> format tomorrow.
Great. But I will only be back next Wednesday.
* Re: collectl and the new slab allocator [slub] statistics
2007-12-27 23:52 ` Christoph Lameter
@ 2007-12-28 15:10 ` Mark Seger
2007-12-31 18:30 ` Mark Seger
0 siblings, 1 reply; 24+ messages in thread
From: Mark Seger @ 2007-12-28 15:10 UTC (permalink / raw)
To: Christoph Lameter; +Cc: linux-mm
Christoph Lameter wrote:
> On Thu, 27 Dec 2007, Mark Seger wrote:
>
>
>> particular slab you can always look up its mapping. I would also provide a
>> mechanism for specifying those slabs you want to monitor and even if not a
>> 'primary' name it would use that name.
>>
>
> Sounds good.
>
>
>> Today's kind of over for me but perhaps I can send out an updated prototype
>> format tomorrow.
>>
>
> Great. But I will only be back next Wednesday.
>
So here's the latest... I made a couple of tweaks to the format but I
think it's getting really close, and as you can see, I'm now printing the
longest alias associated with a slab, as is done in slabinfo. I'm also
including the time to make it easier to read, though typically this is an
option in case the user doesn't want to give up the extra screen
real estate. As a minor point, while debugging this and comparing its
output to slabinfo (we don't always get the same alias when there are
multiple aliases of the same length), I found that slabinfo reports
'kmalloc-1024' where I'm reporting 'biovec-64'. I thought you wanted to
print the kmalloc* names only when there was nothing else, so I suspect
a slight bug in slabinfo...
Note that I decided to print the number of objects in a slab, even
though one could derive it. I also decided to report the size of the
slabs in KB, as well as the used/total memory. I'm still reporting
object sizes in bytes since they are often <1K and I really don't want
to report fractions.
                                <------ objects ------->  <-- slabs -->  <-- memory -->
Time     Slab Name              Size /slab  In Use  Avail  SizeK Number   UsedK  TotalK
10:25:04 TCP                    1728     4      13     20      8      5      21      40
10:25:04 TCPv6                  1856     4      15     20      8      5      27      40
10:25:04 UDP-Lite                896     4      51     64      4     16      44      64
10:25:04 UDPLITEv6              1088     7      28     28      8      4      29      32
10:25:04 anon_vma                 48    85     773   1105      4     13      36      52
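The derived columns in this layout follow from a handful of raw per-cache values; here's a rough sketch of the arithmetic, using the TCP row above as a check (the function and parameter names are mine, not sysfs attribute names):

```python
def slab_row(obj_size, objs_per_slab, slabs, inuse, slab_size_k):
    """Derive the display columns from raw SLUB cache values:
    obj_size in bytes, per-slab allocation size in KB (depends on
    the cache's page order), plus object and slab counts."""
    avail = objs_per_slab * slabs       # total object slots
    used_k = inuse * obj_size // 1024   # object memory in use, KB (truncated)
    total_k = slabs * slab_size_k       # memory held by the cache, KB
    return avail, used_k, total_k

# The TCP line above: Size=1728, /slab=4, 5 slabs of 8 KB, 13 in use
print(slab_row(1728, 4, 5, 13, 8))  # (20, 21, 40)
```

Truncating rather than rounding matches the stated preference for whole numbers over fractions.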
Anyhow, here's an example of watching the system once a second for any
slabs that change while the system is idle:
                                <------ objects ------->  <-- slabs -->  <-- memory -->
Time     Slab Name              Size /slab  In Use  Avail  SizeK Number   UsedK  TotalK
10:25:34 skbuff_fclone_cache     448     9      16     36      4      4       7      16
10:25:34 skbuff_head_cache       256    16    1266   1552      4     97     316     388
10:25:35 skbuff_fclone_cache     448     9      23     36      4      4      10      16
10:25:35 skbuff_head_cache       256    16    1265   1552      4     97     316     388
10:25:36 biovec-64              1024     4     303    320      4     80     303     320
10:25:36 dentry                  224    18  215543 215568      4  11976   47150   47904
10:25:36 skbuff_fclone_cache     448     9      19     36      4      4       8      16
10:25:36 skbuff_head_cache       256    16    1269   1552      4     97     317     388
And finally, here's watching a single slab while writing a large file,
noting the I/O started at 10:26:30...
                                <------ objects ------->  <-- slabs -->  <-- memory -->
Time     Slab Name              Size /slab  In Use  Avail  SizeK Number   UsedK  TotalK
10:26:25 blkdev_requests         288    14      39     84      4      6      10      24
10:26:30 blkdev_requests         288    14     189    224      4     16      53      64
10:26:31 blkdev_requests         288    14     187    224      4     16      52      64
10:26:32 blkdev_requests         288    14     174    224      4     16      48      64
10:26:33 blkdev_requests         288    14     173    224      4     16      48      64
10:26:34 blkdev_requests         288    14      46     84      4      6      12      24
It shouldn't take too much time to actually implement this in collectl,
but I do need to find a block of time to update the code, man pages,
etc. before releasing it, so if there are any final tweaks, now is the
time to say so...
-mark
* Re: collectl and the new slab allocator [slub] statistics
2007-12-28 15:10 ` Mark Seger
@ 2007-12-31 18:30 ` Mark Seger
0 siblings, 0 replies; 24+ messages in thread
From: Mark Seger @ 2007-12-31 18:30 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Mark Seger, linux-mm
Even though I know you won't be around for a few days, I found a few more
cycles to put into this and have implemented quite a lot in collectl.
Rather than send along a bunch of output, I started to put together a
web page as part of the collectl web site, though I haven't linked it in
yet since I haven't released the associated version. In any event, I took
a shot at including a few high-level words about slabs in general, as
well as showing what some of the different output formats will look like,
since I'd much rather make changes before I release it than after.
That said, if you or anyone else on this list wants to have a look at
what I've been up to, you can see it at
http://collectl.sourceforge.net/SlabInfo.html
-mark
end of thread, other threads:[~2007-12-31 18:30 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-12-20 15:06 SLUB Mark Seger
2007-12-20 19:44 ` SLUB Christoph Lameter
2007-12-20 23:36 ` SLUB Mark Seger
2007-12-21 1:09 ` SLUB Mark Seger
2007-12-21 1:27 ` SLUB Mark Seger
2007-12-21 21:41 ` SLUB Christoph Lameter
2007-12-27 14:22 ` SLUB Mark Seger
2007-12-27 15:59 ` SLUB Mark Seger
2007-12-27 19:43 ` SLUB Christoph Lameter
2007-12-27 19:57 ` SLUB Mark Seger
2007-12-27 19:58 ` SLUB Christoph Lameter
2007-12-27 20:17 ` SLUB Mark Seger
2007-12-27 20:55 ` SLUB Mark Seger
2007-12-27 20:59 ` SLUB Christoph Lameter
2007-12-27 23:49 ` collectl and the new slab allocator [slub] statistics Mark Seger
2007-12-27 23:52 ` Christoph Lameter
2007-12-28 15:10 ` Mark Seger
2007-12-31 18:30 ` Mark Seger
2007-12-27 19:40 ` SLUB Christoph Lameter
2007-12-27 19:51 ` SLUB Mark Seger
2007-12-27 19:53 ` SLUB Christoph Lameter
2007-12-21 21:32 ` SLUB Christoph Lameter
2007-12-21 16:59 ` SLUB Mark Seger
2007-12-21 21:37 ` SLUB Christoph Lameter