Hello,

Sorry, I misspelled the linux-mm ML address, so I am sending the patch again.

I would like to propose several tracepoints for tracing pagecache behavior.
By using the tracepoints, we can monitor pagecache usage with high resolution.

-----------------------------------------------------------------------------
# tracer: nop
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
      postmaster-7293  [002] 104039.093744: find_get_page: s_dev=8:2 i_ino=1967242 offset=22302 page_found
      postmaster-7293  [000] 104047.138110: add_to_page_cache: s_dev=8:2 i_ino=1967242 offset=5672
      postmaster-7293  [000] 104072.590885: remove_from_page_cache: s_dev=8:2 i_ino=5016146 offset=1
-----------------------------------------------------------------------------

Today we can see system-wide pagecache usage in /proc/meminfo, but we have
no way to get finer-grained information such as per-file or per-process
usage. A process may share pagecache pages, add pages to the pagecache, or
remove pages from it. If the pagecache miss ratio rises, it can cause extra
I/O and hurt system performance.

By using the tracepoints we can get the following information:
 1. how many pagecache pages each process has for each file
 2. how many pages are cached for each file
 3. how many pagecache pages each process shares
 4. how often each process adds/removes pagecache pages
 5. how long a page stays in the pagecache
 6. the pagecache hit rate per file

In particular, monitoring pagecache usage per file would help us tune
applications such as databases.

I attach a sample script for counting file-by-file pagecache usage per
process. The script processes raw data from /tracing/trace to produce
human-readable output.

You can run it as:
# echo 1 > /tracing/events/filemap/enable
# cat /tracing/trace | python trace-pagecache-postprocess.py

The script implements items 1, 2 and 3 above.

o script output format
[file list]
  < pagecache usage on a file basis >
  ...
[process list]
  process: < pagecache usage of this process >
    dev: < pagecache usage of above process on this file >
    ...
  ...

For example:

The output below shows pagecache usage while pgbench (a benchmark test for
PostgreSQL) runs.
Inode 1967121 is part of a file (75M) of the PostgreSQL database.
Inode 5019039 is part of the PostgreSQL executable (2.9M),
"/usr/bin/postgres".

- if "added"(L8) > "cached"(L2), then
  pagecache pages are being added and removed repeatedly.
  => Bad case for pagecache usage

- if ("cached"(L3) >= "added"(L9)) && "cached"(L6) > 0, then
  there are no unnecessary I/O operations.
  => Good case for pagecache usage

("L2" means the second line of the output, "2: dev:8:2, ...".)

-----------------------------------------------------------------------------
1: [file list]
2:   dev:8:2, inode:1967121, cached: 13M
3:   dev:8:2, inode:5019039, cached: 1M
4: [process list]
5:   process: kswapd0-369 (cached:0K, added:0K, removed:0K, indirect removed:10M)
6:     dev:8:2, inode:1967121, cached:0K, added:0K, removed:0K, indirect removed:10M
7:   process: postmaster-5025 (cached:23M, added:26M, removed:616K, indirect removed:176K)
8:     dev:8:2, inode:1967121, cached:22M, added:26M, removed:616K, indirect removed:0K
9:     dev:8:2, inode:5019039, cached:1M, added:64K, removed:0K, indirect removed:176K
10:  process: dd-5028 (cached:0K, added:0K, removed:0K, indirect removed:1M)
11:    dev:8:2, inode:1967121, cached:0K, added:0K, removed:0K, indirect removed:848K
12:    dev:8:2, inode:5019039, cached:0K, added:0K, removed:0K, indirect removed:396K
-----------------------------------------------------------------------------

Any comments are welcome.
--
Keiichi Kii
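
For readers who want to try the counting idea without the attachment, here
is a minimal sketch of how the per-file and per-process counts could be
derived from the trace text. It is not the attached
trace-pagecache-postprocess.py: the regular expression is an assumption
based only on the sample trace lines in this mail, the names TRACE_RE and
count_pagecache are hypothetical, and "indirect removed" accounting is left
out.

```python
import re
from collections import defaultdict

# Assumed line layout, copied from the sample trace output in this mail.
TRACE_RE = re.compile(
    r"^\s*(?P<task>\S+)\s+\[\d+\]\s+[\d.]+:\s+"
    r"(?P<event>add_to_page_cache|remove_from_page_cache):\s+"
    r"s_dev=(?P<dev>\d+:\d+)\s+i_ino=(?P<ino>\d+)\s+offset=(?P<offset>\d+)"
)


def count_pagecache(lines):
    """Return (cached, added, removed) page counts from trace lines.

    cached:  {(dev, ino): number of pages currently cached}
    added:   {(task, dev, ino): number of pages this task added}
    removed: {(task, dev, ino): number of pages this task removed}
    """
    offsets = defaultdict(set)   # (dev, ino) -> offsets currently cached
    added = defaultdict(int)
    removed = defaultdict(int)
    for line in lines:
        m = TRACE_RE.match(line)
        if m is None:
            continue  # header, find_get_page, or unrelated line
        file_key = (m.group("dev"), int(m.group("ino")))
        task_key = (m.group("task"),) + file_key
        offset = int(m.group("offset"))
        if m.group("event") == "add_to_page_cache":
            offsets[file_key].add(offset)
            added[task_key] += 1
        else:
            offsets[file_key].discard(offset)
            removed[task_key] += 1
    # Drop files whose pages have all been removed again.
    cached = {key: len(pages) for key, pages in offsets.items() if pages}
    return cached, dict(added), dict(removed)
```

Calling count_pagecache(open("/tracing/trace")) would return page counts;
multiplying by the page size (4K on many systems, which is also an
assumption here) gives sizes comparable to the "cached" and "added" columns
above.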