* [patch 8/9] LTTng instrumentation - filemap
       [not found] <20090324155625.420966314@polymtl.ca>
@ 2009-03-24 15:56 ` Mathieu Desnoyers
  2009-03-24 18:39   ` Ingo Molnar
  2009-03-24 15:56 ` [patch 9/9] LTTng instrumentation - swap Mathieu Desnoyers
  1 sibling, 1 reply; 4+ messages in thread
From: Mathieu Desnoyers @ 2009-03-24 15:56 UTC (permalink / raw)
  To: akpm, Ingo Molnar, linux-kernel, ltt-dev
  Cc: Mathieu Desnoyers, linux-mm, Dave Hansen, Masami Hiramatsu,
	Peter Zijlstra, Frank Ch. Eigler, Frederic Weisbecker,
	Hideo AOKI, Takashi Nishiie, Steven Rostedt,
	Eduard - Gabriel Munteanu

[-- Attachment #1: lttng-instrumentation-filemap.patch --]
[-- Type: text/plain, Size: 3043 bytes --]

Instrumentation of waits caused by memory accesses on mmap regions.

Those tracepoints are used by LTTng.

About the performance impact of tracepoints (which is comparable to markers),
even without immediate values optimizations, tests done by Hideo Aoki on ia64
show no regression. His test case was using hackbench on a kernel where
scheduler instrumentation (about 5 events in the scheduler code) was added.
See the "Tracepoints" patch header for performance result detail.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: linux-mm@kvack.org
CC: Dave Hansen <haveblue@us.ibm.com>
CC: Masami Hiramatsu <mhiramat@redhat.com>
CC: 'Peter Zijlstra' <peterz@infradead.org>
CC: "Frank Ch. Eigler" <fche@redhat.com>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: 'Hideo AOKI' <haoki@redhat.com>
CC: Takashi Nishiie <t-nishiie@np.css.fujitsu.com>
CC: 'Steven Rostedt' <rostedt@goodmis.org>
CC: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 include/trace/filemap.h |   13 +++++++++++++
 mm/filemap.c            |    5 +++++
 2 files changed, 18 insertions(+)

Index: linux-2.6-lttng/mm/filemap.c
===================================================================
--- linux-2.6-lttng.orig/mm/filemap.c	2009-03-24 09:09:52.000000000 -0400
+++ linux-2.6-lttng/mm/filemap.c	2009-03-24 09:32:05.000000000 -0400
@@ -34,6 +34,7 @@
 #include <linux/hardirq.h> /* for BUG_ON(!in_atomic()) only */
 #include <linux/memcontrol.h>
 #include <linux/mm_inline.h> /* for page_is_file_cache() */
+#include <trace/filemap.h>
 #include "internal.h"
 
 /*
@@ -43,6 +44,8 @@
 
 #include <asm/mman.h>
 
+DEFINE_TRACE(wait_on_page_start);
+DEFINE_TRACE(wait_on_page_end);
 
 /*
  * Shared mappings implemented 30.11.1994. It's not fully working yet,
@@ -558,9 +561,11 @@ void wait_on_page_bit(struct page *page,
 {
 	DEFINE_WAIT_BIT(wait, &page->flags, bit_nr);
 
+	trace_wait_on_page_start(page, bit_nr);
 	if (test_bit(bit_nr, &page->flags))
 		__wait_on_bit(page_waitqueue(page), &wait, sync_page,
 							TASK_UNINTERRUPTIBLE);
+	trace_wait_on_page_end(page, bit_nr);
 }
 EXPORT_SYMBOL(wait_on_page_bit);
 
Index: linux-2.6-lttng/include/trace/filemap.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6-lttng/include/trace/filemap.h	2009-03-24 09:32:13.000000000 -0400
@@ -0,0 +1,13 @@
+#ifndef _TRACE_FILEMAP_H
+#define _TRACE_FILEMAP_H
+
+#include <linux/tracepoint.h>
+
+DECLARE_TRACE(wait_on_page_start,
+	TPPROTO(struct page *page, int bit_nr),
+		TPARGS(page, bit_nr));
+DECLARE_TRACE(wait_on_page_end,
+	TPPROTO(struct page *page, int bit_nr),
+		TPARGS(page, bit_nr));
+
+#endif

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* [patch 9/9] LTTng instrumentation - swap
       [not found] <20090324155625.420966314@polymtl.ca>
  2009-03-24 15:56 ` [patch 8/9] LTTng instrumentation - filemap Mathieu Desnoyers
@ 2009-03-24 15:56 ` Mathieu Desnoyers
  2009-03-24 18:51   ` Ingo Molnar
  1 sibling, 1 reply; 4+ messages in thread
From: Mathieu Desnoyers @ 2009-03-24 15:56 UTC (permalink / raw)
  To: akpm, Ingo Molnar, linux-kernel, ltt-dev
  Cc: Mathieu Desnoyers, linux-mm, Dave Hansen, Masami Hiramatsu,
	Peter Zijlstra, Frank Ch. Eigler, Frederic Weisbecker,
	Hideo AOKI, Takashi Nishiie, Steven Rostedt,
	Eduard - Gabriel Munteanu

[-- Attachment #1: lttng-instrumentation-swap.patch --]
[-- Type: text/plain, Size: 5091 bytes --]

Instrumentation of waits caused by swap activity. Also instruments
swapon/swapoff events to keep track of active swap partitions.

Those tracepoints are used by LTTng.

About the performance impact of tracepoints (which is comparable to markers),
even without immediate values optimizations, tests done by Hideo Aoki on ia64
show no regression. His test case was using hackbench on a kernel where
scheduler instrumentation (about 5 events in the scheduler code) was added.
See the "Tracepoints" patch header for performance result detail.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: linux-mm@kvack.org
CC: Dave Hansen <haveblue@us.ibm.com>
CC: Masami Hiramatsu <mhiramat@redhat.com>
CC: 'Peter Zijlstra' <peterz@infradead.org>
CC: "Frank Ch. Eigler" <fche@redhat.com>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: 'Hideo AOKI' <haoki@redhat.com>
CC: Takashi Nishiie <t-nishiie@np.css.fujitsu.com>
CC: 'Steven Rostedt' <rostedt@goodmis.org>
CC: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 include/trace/swap.h |   20 ++++++++++++++++++++
 mm/memory.c          |    4 ++++
 mm/page_io.c         |    4 ++++
 mm/swapfile.c        |    6 ++++++
 4 files changed, 34 insertions(+)

Index: linux-2.6-lttng/mm/memory.c
===================================================================
--- linux-2.6-lttng.orig/mm/memory.c	2009-03-24 09:09:55.000000000 -0400
+++ linux-2.6-lttng/mm/memory.c	2009-03-24 09:32:15.000000000 -0400
@@ -55,6 +55,7 @@
 #include <linux/kallsyms.h>
 #include <linux/swapops.h>
 #include <linux/elf.h>
+#include <trace/swap.h>
 
 #include <asm/pgalloc.h>
 #include <asm/uaccess.h>
@@ -64,6 +65,8 @@
 
 #include "internal.h"
 
+DEFINE_TRACE(swap_in);
+
 #ifndef CONFIG_NEED_MULTIPLE_NODES
 /* use the per-pgdat data instead for discontigmem - mbligh */
 unsigned long max_mapnr;
@@ -2431,6 +2434,7 @@ static int do_swap_page(struct mm_struct
 		/* Had to read the page from swap area: Major fault */
 		ret = VM_FAULT_MAJOR;
 		count_vm_event(PGMAJFAULT);
+		trace_swap_in(page, entry);
 	}
 
 	mark_page_accessed(page);
Index: linux-2.6-lttng/mm/page_io.c
===================================================================
--- linux-2.6-lttng.orig/mm/page_io.c	2009-03-24 09:09:52.000000000 -0400
+++ linux-2.6-lttng/mm/page_io.c	2009-03-24 09:32:15.000000000 -0400
@@ -17,8 +17,11 @@
 #include <linux/bio.h>
 #include <linux/swapops.h>
 #include <linux/writeback.h>
+#include <trace/swap.h>
 #include <asm/pgtable.h>
 
+DEFINE_TRACE(swap_out);
+
 static struct bio *get_swap_bio(gfp_t gfp_flags, pgoff_t index,
 				struct page *page, bio_end_io_t end_io)
 {
@@ -114,6 +117,7 @@ int swap_writepage(struct page *page, st
 		rw |= (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG);
 	count_vm_event(PSWPOUT);
 	set_page_writeback(page);
+	trace_swap_out(page);
 	unlock_page(page);
 	submit_bio(rw, bio);
 out:
Index: linux-2.6-lttng/mm/swapfile.c
===================================================================
--- linux-2.6-lttng.orig/mm/swapfile.c	2009-03-24 09:09:52.000000000 -0400
+++ linux-2.6-lttng/mm/swapfile.c	2009-03-24 09:32:15.000000000 -0400
@@ -29,12 +29,16 @@
 #include <linux/capability.h>
 #include <linux/syscalls.h>
 #include <linux/memcontrol.h>
+#include <trace/swap.h>
 
 #include <asm/pgtable.h>
 #include <asm/tlbflush.h>
 #include <linux/swapops.h>
 #include <linux/page_cgroup.h>
 
+DEFINE_TRACE(swap_file_open);
+DEFINE_TRACE(swap_file_close);
+
 static DEFINE_SPINLOCK(swap_lock);
 static unsigned int nr_swapfiles;
 long nr_swap_pages;
@@ -1497,6 +1501,7 @@ SYSCALL_DEFINE1(swapoff, const char __us
 	swap_map = p->swap_map;
 	p->swap_map = NULL;
 	p->flags = 0;
+	trace_swap_file_close(swap_file);
 	spin_unlock(&swap_lock);
 	mutex_unlock(&swapon_mutex);
 	vfree(swap_map);
@@ -1886,6 +1891,7 @@ SYSCALL_DEFINE2(swapon, const char __use
 	} else {
 		swap_info[prev].next = p - swap_info;
 	}
+	trace_swap_file_open(swap_file, name);
 	spin_unlock(&swap_lock);
 	mutex_unlock(&swapon_mutex);
 	error = 0;
Index: linux-2.6-lttng/include/trace/swap.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6-lttng/include/trace/swap.h	2009-03-24 09:32:26.000000000 -0400
@@ -0,0 +1,20 @@
+#ifndef _TRACE_SWAP_H
+#define _TRACE_SWAP_H
+
+#include <linux/swap.h>
+#include <linux/tracepoint.h>
+
+DECLARE_TRACE(swap_in,
+	TPPROTO(struct page *page, swp_entry_t entry),
+		TPARGS(page, entry));
+DECLARE_TRACE(swap_out,
+	TPPROTO(struct page *page),
+		TPARGS(page));
+DECLARE_TRACE(swap_file_open,
+	TPPROTO(struct file *file, char *filename),
+		TPARGS(file, filename));
+DECLARE_TRACE(swap_file_close,
+	TPPROTO(struct file *file),
+		TPARGS(file));
+
+#endif

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [patch 8/9] LTTng instrumentation - filemap
  2009-03-24 15:56 ` [patch 8/9] LTTng instrumentation - filemap Mathieu Desnoyers
@ 2009-03-24 18:39   ` Ingo Molnar
  0 siblings, 0 replies; 4+ messages in thread
From: Ingo Molnar @ 2009-03-24 18:39 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: akpm, linux-kernel, ltt-dev, linux-mm, Dave Hansen,
	Masami Hiramatsu, Peter Zijlstra, Frank Ch. Eigler,
	Frederic Weisbecker, Hideo AOKI, Takashi Nishiie, Steven Rostedt,
	Eduard - Gabriel Munteanu


* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> Index: linux-2.6-lttng/mm/filemap.c

> +DEFINE_TRACE(wait_on_page_start);
> +DEFINE_TRACE(wait_on_page_end);

These are extremely incomplete - to the level of being useless.

To understand the lifetime of the pagecache, the following basic 
events have to be observed and instrumented:

 - create a new page
 - fill in a new page
 - dirty a page [when we know this]
 - request writeout of a page
 - clean a page / complete writeout
 - free a page due to MM pressure
 - free a page due to truncation/delete

The following additional events are useful as well:

 - mmap a page to a user-space address
 - copy a page to a user-space address (read)
 - write to a page from a user-space address (write)
 - unmap a page from a user-space address
 - fault in a user-space mapped pagecache page

optional:
   - shmem attach/detach events
   - shmem map/unmap events
   - hugetlb map/unmap events

I'm sure i haven't listed them all. Have a look at the function-graph 
tracer output to see what kind of basic events can happen to a 
pagecache page.

	Ingo


* Re: [patch 9/9] LTTng instrumentation - swap
  2009-03-24 15:56 ` [patch 9/9] LTTng instrumentation - swap Mathieu Desnoyers
@ 2009-03-24 18:51   ` Ingo Molnar
  0 siblings, 0 replies; 4+ messages in thread
From: Ingo Molnar @ 2009-03-24 18:51 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: akpm, linux-kernel, ltt-dev, linux-mm, Dave Hansen,
	Masami Hiramatsu, Peter Zijlstra, Frank Ch. Eigler,
	Frederic Weisbecker, Hideo AOKI, Takashi Nishiie, Steven Rostedt,
	Eduard - Gabriel Munteanu


* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> +DECLARE_TRACE(swap_in,
> +	TPPROTO(struct page *page, swp_entry_t entry),
> +		TPARGS(page, entry));
> +DECLARE_TRACE(swap_out,
> +	TPPROTO(struct page *page),
> +		TPARGS(page));
> +DECLARE_TRACE(swap_file_open,
> +	TPPROTO(struct file *file, char *filename),
> +		TPARGS(file, filename));
> +DECLARE_TRACE(swap_file_close,
> +	TPPROTO(struct file *file),
> +		TPARGS(file));

These are more complete than the pagecache tracepoints, but still 
too incomplete to give a comprehensive picture of swap activity.

Firstly, the swap_file_open/close events seem quite pointless. Most 
systems enable swap during bootup and never close it. These 
tracepoints just won't be exercised in practice.

Also, to _really_ help with debugging VM pressure problems, the 
whole LRU state-machine should be instrumented, and linked up with 
pagecache instrumentation via page frame numbers and (inode,offset) 
[file] and (pgd,addr) [anon] pairs.

Not just the fact that something got swapped out is interesting, but 
also the whole decision chain that leads up to it: the lifetime of a 
page, and how it moves between the various stages of eviction and LRU 
scores.

a minor nit:

> +DECLARE_TRACE(swap_file_open,
> +	TPPROTO(struct file *file, char *filename),
> +		TPARGS(file, filename));

there's no need to pass in the filename - it can be deduced in the 
probe from struct file.
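
A minimal userspace sketch of that idea, assuming stand-in types: the
probe receives only the file and derives the name itself. The names
dentry_model, file_model, model_file_name() and probe_swap_file_open()
are hypothetical and exist only for this illustration; an in-kernel
probe would more likely resolve the path from file->f_path (for
example with d_path()) rather than read a plain string.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace stand-ins for the kernel objects; hypothetical, illustration only. */
struct dentry_model { const char *name; };
struct file_model { const struct dentry_model *dentry; };

/* Derive the name from the file itself, so the caller need not pass it. */
static const char *model_file_name(const struct file_model *file)
{
	assert(file && file->dentry);
	return file->dentry->name;
}

/* Hypothetical probe body: its only argument besides the buffer is the file. */
static int probe_swap_file_open(const struct file_model *file,
				char *buf, size_t len)
{
	return snprintf(buf, len, "swap_file_open %s", model_file_name(file));
}
```

With this shape the tracepoint prototype shrinks to a single
struct file * argument, matching swap_file_close.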

a small inconsistency:

> +DECLARE_TRACE(swap_in,
> +	TPPROTO(struct page *page, swp_entry_t entry),
> +		TPARGS(page, entry));
> +DECLARE_TRACE(swap_out,
> +	TPPROTO(struct page *page),
> +		TPARGS(page));

you pass in swp_entry to trace_swap_in(), which encodes the offset - 
but that parameter is not needed, the page already represents the 
offset at that stage in do_swap_page(). (the actual data is not read 
in yet from swap, but the page is already linked up in the 
swap-cache and has the offset available - which a probe can 
recover.)

So this suffices:

 DECLARE_TRACE(swap_in,
	TPPROTO(struct page *page),
		TPARGS(page));

 DECLARE_TRACE(swap_out,
	TPPROTO(struct page *page),
		TPARGS(page));
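
A rough userspace sketch of why the page alone can suffice: for a page
that is already in the swap cache, the swap entry is kept in the
page's private field, so a probe can decode the type and offset from
there. Everything below is a model, not kernel code: page_model,
swp_entry_model_t and the model_* helpers are hypothetical names, and
the 5-bit type field is an assumption patterned on <linux/swapops.h>.

```c
#include <assert.h>
#include <limits.h>

/* Stand-in types; the bit layout is an assumption, not the kernel's definition. */
typedef struct { unsigned long val; } swp_entry_model_t;

struct page_model {
	int in_swap_cache;	/* models PageSwapCache(page) */
	unsigned long priv;	/* models page_private(page) */
};

#define MODEL_TYPE_BITS   5
#define MODEL_TYPE_SHIFT  (sizeof(unsigned long) * CHAR_BIT - MODEL_TYPE_BITS)
#define MODEL_OFFSET_MASK ((1UL << MODEL_TYPE_SHIFT) - 1)

/* Pack a swap type and offset into one word (type in the top bits). */
static swp_entry_model_t model_swp_entry(unsigned long type, unsigned long offset)
{
	swp_entry_model_t e;

	e.val = (type << MODEL_TYPE_SHIFT) | (offset & MODEL_OFFSET_MASK);
	return e;
}

static unsigned long model_swp_type(swp_entry_model_t e)
{
	return e.val >> MODEL_TYPE_SHIFT;
}

static unsigned long model_swp_offset(swp_entry_model_t e)
{
	return e.val & MODEL_OFFSET_MASK;
}

/* Hypothetical probe helper: recover the swap entry from the page alone. */
static swp_entry_model_t probe_entry_from_page(const struct page_model *page)
{
	swp_entry_model_t e;

	assert(page && page->in_swap_cache);
	e.val = page->priv;
	return e;
}
```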

And here again i'd like to see actual meaningful probe contents via 
a TRACE_EVENT() construct. That shows and proves that it's all part 
of a comprehensive framework, and the data that is recovered is 
understood and put into a coherent whole - upstream. That makes it 
immediately useful to the built-in tracers, and will also cause 
fewer surprises downstream.
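
To make "meaningful probe contents" concrete, here is a userspace
sketch of the three pieces such an event definition spells out: a
record layout, an assignment step that fills it at the trace site, and
a print format. It only loosely mirrors what a TRACE_EVENT()
definition describes; swap_out_record, swap_out_assign() and
swap_out_print() are hypothetical names, and the choice of pfn and
swap offset as fields is an assumption made for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record layout for a swap_out event. */
struct swap_out_record {
	unsigned long pfn;		/* page frame number of the page */
	unsigned long swap_offset;	/* offset within the swap area */
};

/* Assignment step: copy the interesting values into the record. */
static void swap_out_assign(struct swap_out_record *rec,
			    unsigned long pfn, unsigned long swap_offset)
{
	assert(rec);
	rec->pfn = pfn;
	rec->swap_offset = swap_offset;
}

/* Print step: render the record in a fixed, documented format. */
static int swap_out_print(const struct swap_out_record *rec,
			  char *buf, size_t len)
{
	return snprintf(buf, len, "swap_out: pfn=%lu offset=%lu",
			rec->pfn, rec->swap_offset);
}
```

Keeping the record and its format next to the tracepoint declaration
is what lets the built-in tracers consume the event without a
separate out-of-tree probe.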

	Ingo
