From: Badari Pulavarty <pbadari@us.ibm.com>
To: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Cc: linux-mm <linux-mm@kvack.org>, lkml <linux-kernel@vger.kernel.org>
Subject: Re: Better pagecache statistics ?
Date: Thu, 01 Dec 2005 09:31:49 -0800
Message-ID: <1133458309.21429.36.camel@localhost.localdomain>
In-Reply-To: <20051201171938.GB16235@dmt.cnet>

On Thu, 2005-12-01 at 15:19 -0200, Marcelo Tosatti wrote:
> > Hi Marcelo,
> > 
> > Let me give you background on why I am looking at this.
> > 
> > I have been involved in various database customer situations.
> > Most times, the machine is either extremely sluggish or dying.
> > The only hints we get from /proc/meminfo, /proc/slabinfo, vmstat
> > etc. are lots of stuff in "Cache" and a system that is heavily swapping.
> > I want to find out what's getting swapped out, what's eating up
> > all the pagecache, what's getting into the cache, what's getting out
> > of the cache, etc. I find no easy way to get this kind of information.
> 
> Someone recently wrote a patch to record such information (pagecache
> insertion/eviction, etc.); I don't remember who, though. Rik?
> 
> > Database folks complain that the filecache causes them the most trouble.
> > Even when they use DIO on their tables & stuff, random apps (ftp,
> > scp, tar etc.) bloat the pagecache and kick out database
> > pools, shared mem, malloc etc., causing lots of trouble for them.
> 
> LRU lacks frequency information, which is crucial for avoiding
> this kind of problem.
> 
> http://www.linux-mm.org/AdvancedPageReplacement
> 
> Peter Zijlstra is working on implementing CLOCK-Pro, which uses the
> inter-reference distance between accesses to a page instead of a "least
> recently used" metric for page replacement decisions. He just published
> results of the "mdb" (mini-db) benchmark at http://www.linux-mm.org/PeterZClockPro2.
> 
> Read more about the "mdb" benchmark at
> http://www.linux-mm.org/PageReplacementTesting. 
> 
> But that's off-topic :)
> 
> > I want to understand more before I try to fix it. The first step would
> > be to get better stats from the pagecache and evaluate what's happening
> > to get a better handle on the problem.
> > 
> > BTW, I am very familiar with kprobes/jprobes & systemtap.
> > I have been playing with them for at least 8 months :) There is
> > no easy way to do this, unless the stats are already in the kernel.
> 
> I thought that it would be easy to use SystemTap for such
> a purpose?
> 
> The sys_read/sys_write example at 
> http://www.redhat.com/magazine/011sep05/features/systemtap/ sounds
> interesting.
> 
> What am I missing?

Well, a few things:

1) We have to have those probes present in the system all the time,
collecting the information whenever a read/write happens, maintaining it
and spitting it out. Since it's a kernel probe, all this data will be
in the kernel.

2) If we want to do this accounting (and we don't have those probes
installed already), we can't capture what happened earlier.

3) Probing sys_read()/sys_write() is going to tell you how much
data a process read or wrote, but it's not going to tell you
how much is in the cache (now or 10 minutes later).
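
Point 3 can be illustrated with a toy model. The sketch below is purely
hypothetical: ToyPageCache, its fixed capacity, the strict LRU eviction and
the file names are all assumptions made for illustration, not how the kernel
pagecache works. It only shows that a per-file byte counter (what a
sys_read() probe would accumulate) and the number of pages still resident
can diverge completely.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes


class ToyPageCache:
    """Hypothetical strict-LRU page cache; an illustration, not kernel behaviour."""

    def __init__(self, capacity_pages):
        self.capacity_pages = capacity_pages
        self.pages = OrderedDict()  # (file name, page index), oldest first
        self.bytes_read = {}        # what a sys_read() probe would add up

    def read(self, name, offset, length):
        """Account the read and touch every page it covers."""
        if length <= 0:
            return
        self.bytes_read[name] = self.bytes_read.get(name, 0) + length
        first = offset // PAGE_SIZE
        last = (offset + length - 1) // PAGE_SIZE
        for index in range(first, last + 1):
            key = (name, index)
            if key in self.pages:
                self.pages.move_to_end(key)         # mark as most recently used
            else:
                self.pages[key] = True
                if len(self.pages) > self.capacity_pages:
                    self.pages.popitem(last=False)  # evict least recently used

    def resident_bytes(self, name):
        """Bytes of the named file that are still cached right now."""
        return PAGE_SIZE * sum(1 for (f, _) in self.pages if f == name)


cache = ToyPageCache(capacity_pages=8)
cache.read("db.dat", 0, 4 * PAGE_SIZE)      # database file read twice
cache.read("db.dat", 0, 4 * PAGE_SIZE)
cache.read("backup.tar", 0, 8 * PAGE_SIZE)  # one streaming pass by another app

for name in ("db.dat", "backup.tar"):
    print(name, "read:", cache.bytes_read[name],
          "resident:", cache.resident_bytes(name))
```

In this model both files show the same number of bytes read, yet after the
streaming pass none of db.dat is resident any more. The read counters alone
cannot tell the two situations apart.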

> 
> > My final goal is to get stats like ..
> > 
> > Out of "Cached" value - to get details like
> > 
> > 	<mmap> - xxx KB
> > 	<shared mem> - xxx KB
> > 	<text, data, bss, malloc, heap, stacks> - xxx KB
> > 	<filecache pages total> -- xxx KB
> > 		(filename1 or <dev>, <ino>) -- #of pages
> > 		(filename2 or <dev>, <ino>) -- #of pages
> > 		
> > This would be really powerful for understanding the system better.
> > 
> > Don't you think?
> 
> Yep... /proc/<pid>/smaps provides that information on a per-process
> basis already.

/proc/pid/smaps will give me information about text, data, shared libs,
malloc etc. It does not give the filecache information for the files a
process opened, i.e. which of the pages it read or wrote are currently
in the pagecache. Isn't that so?
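
As a rough sketch of what smaps does cover, the snippet below sums the Rss
field per mapping from smaps-style text. The function name rss_by_mapping
and the SAMPLE values are made up for illustration; on a live system the
text would come from reading /proc/<pid>/smaps. Only mapped regions appear
in such output, so pages that entered the pagecache through plain
read()/write() on an open file are not listed.

```python
import re
from collections import defaultdict

# Mapping header: "start-end perms offset dev inode [pathname]"
HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+\S+\s+\S+\s+\d+\s*(.*)$")


def rss_by_mapping(smaps_text):
    """Sum the Rss values (in kB) per mapping name in smaps-style text."""
    totals = defaultdict(int)
    current = None
    for line in smaps_text.splitlines():
        match = HEADER.match(line)
        if match:
            # Mappings without a pathname are grouped as anonymous memory.
            current = match.group(1).strip() or "[anon]"
            continue
        if current is not None and line.startswith("Rss:"):
            totals[current] += int(line.split()[1])
    return dict(totals)


# Made-up sample in the smaps layout; the numbers are illustrative only.
SAMPLE = """\
08048000-080bc000 r-xp 00000000 03:02 13130      /bin/bash
Size:               464 kB
Rss:                424 kB
080bc000-080c2000 rw-p 00074000 03:02 13130      /bin/bash
Size:                24 kB
Rss:                 24 kB
080c2000-080e3000 rw-p 00000000 00:00 0
Size:               132 kB
Rss:                 96 kB
"""

print(rss_by_mapping(SAMPLE))
```

The result is keyed by mapping, which is exactly the limitation: a file that
was only read with read() never shows up as a key.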

Thanks,
Badari

