Subject: Re: Better pagecache statistics ?
From: Badari Pulavarty
To: Marcelo Tosatti
Cc: Arjan van de Ven, linux-mm, lkml
Date: Thu, 01 Dec 2005 09:15:15 -0800
Message-Id: <1133457315.21429.29.camel@localhost.localdomain>
In-Reply-To: <20051201170850.GA16235@dmt.cnet>
References: <1133377029.27824.90.camel@localhost.localdomain>
 <20051201152029.GA14499@dmt.cnet>
 <1133452790.27824.117.camel@localhost.localdomain>
 <1133453411.2853.67.camel@laptopd505.fenrus.org>
 <20051201170850.GA16235@dmt.cnet>
Sender: owner-linux-mm@kvack.org

On Thu, 2005-12-01 at 15:08 -0200, Marcelo Tosatti wrote:
> On Thu, Dec 01, 2005 at 05:10:11PM +0100, Arjan van de Ven wrote:
> > > Out of "Cached" value - to get details like
> > >
> > > - xxx KB
> > > - xxx KB
> > > - xxx KB
> > > -- xxx KB
> > > (filename1 or , ) -- #of pages
> > > (filename2 or , ) -- #of pages
> > >
> > > This would be really powerful on understanding system better.
> >
> > to some extend it might be useful.
> > I have a few concerns though
> > 1) If we make these stats into an ABI then it becomes harder to change
> > the architecture of the VM radically since such concepts may not even
> > exist in the new architecture. As long as this is some sort of advisory,
> > humans-only file I think this isn't too much of a big deal though.
> >
> > 2) not all the concepts you mention really exist as far as the kernel is
> > concerned. I mean.. a mmap file is file cache is .. etc.
> > malloc/heap/stacks are also not differentiated too much and are mostly
> > userspace policy (especially thread stacks).
> >
> > A split in
> > * non-file backed
> >   - mapped once
> >   - mapped more than once
> > * file backed
> >   - mapped at least once
> >   - not mapped
> > I can see as being meaningful. Assigning meaning to it beyond this is
> > dangerous; that is more an interpretation of the policy userspace
> > happens to use for things and I think coding that into the kernel is a
> > mistake.
> >
> > Knowing which files are in memory how much is, as debug feature,
> > potentially quite useful for VM hackers to see how well the various VM
> > algorithms work. I'm concerned about the performance impact (eg you can
> > do it only once a day or so, not every 10 seconds) and about how to get
> > this data out in a consistent way (after all, spewing this amount of
> > debug info will in itself impact the vm balances)
>
> Most of the issues you mention are null if you move the stats
> maintenance burden to userspace.
>
> The performance impact is also minimized since the hooks
> (read: overhead) can be loaded on-demand as needed.
>

The overhead is going through each mapping/inode in the system and
dumping out "nrpages" to get per-file statistics. This is going to be
expensive, needs locking, and there is no single list we can traverse
to get it. I am not sure how to do this.

Thanks,
Badari
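
For a single, known file, a rough per-file page count can be obtained
from userspace with mmap() plus mincore(), without walking any kernel
inode list. This is a different technique from the on-demand hooks
discussed above and is only a minimal sketch: the helper name
resident_pages() is hypothetical, it does not solve the "enumerate every
cached file" problem, and newer kernels may limit what mincore() reports
for files the caller cannot write.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): count how many pages of
 * `path` are currently resident in memory, as reported by mincore().
 * Returns -1 on any error and 0 for an empty file.
 */
long resident_pages(const char *path)
{
    long resident = -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == 0) {
        if (st.st_size == 0) {
            resident = 0;               /* nothing to map */
        } else {
            size_t page = (size_t)sysconf(_SC_PAGESIZE);
            size_t len = (size_t)st.st_size;
            size_t npages = (len + page - 1) / page;

            /* Mapping the file does not by itself read it into memory. */
            void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
            unsigned char *vec = malloc(npages);

            if (map != MAP_FAILED && vec != NULL &&
                mincore(map, len, vec) == 0) {
                resident = 0;
                /* Bit 0 of each vector byte is set if the page is resident. */
                for (size_t i = 0; i < npages; i++)
                    resident += vec[i] & 1;
            }
            free(vec);
            if (map != MAP_FAILED)
                munmap(map, len);
        }
    }
    close(fd);
    return resident;
}
```

Calling resident_pages() on a path returns a snapshot only; the count
can change as soon as the call returns, and the cost grows with file
size, so it illustrates the per-file idea rather than a system-wide
statistic.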