* [patch 8/9] LTTng instrumentation - filemap
[not found] <20090324155625.420966314@polymtl.ca>
@ 2009-03-24 15:56 ` Mathieu Desnoyers
2009-03-24 18:39 ` Ingo Molnar
2009-03-24 15:56 ` [patch 9/9] LTTng instrumentation - swap Mathieu Desnoyers
1 sibling, 1 reply; 4+ messages in thread
From: Mathieu Desnoyers @ 2009-03-24 15:56 UTC (permalink / raw)
To: akpm, Ingo Molnar, linux-kernel, ltt-dev
Cc: Mathieu Desnoyers, linux-mm, Dave Hansen, Masami Hiramatsu,
Peter Zijlstra, Frank Ch. Eigler, Frederic Weisbecker,
Hideo AOKI, Takashi Nishiie, Steven Rostedt,
Eduard - Gabriel Munteanu
[-- Attachment #1: lttng-instrumentation-filemap.patch --]
[-- Type: text/plain, Size: 3043 bytes --]
Instrumentation of waits caused by memory accesses on mmap regions.
Those tracepoints are used by LTTng.
About the performance impact of tracepoints (which is comparable to markers),
even without immediate values optimizations, tests done by Hideo Aoki on ia64
show no regression. His test case was using hackbench on a kernel where
scheduler instrumentation (about 5 events in the scheduler code) was added.
See the "Tracepoints" patch header for detailed performance results.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: linux-mm@kvack.org
CC: Dave Hansen <haveblue@us.ibm.com>
CC: Masami Hiramatsu <mhiramat@redhat.com>
CC: 'Peter Zijlstra' <peterz@infradead.org>
CC: "Frank Ch. Eigler" <fche@redhat.com>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: 'Hideo AOKI' <haoki@redhat.com>
CC: Takashi Nishiie <t-nishiie@np.css.fujitsu.com>
CC: 'Steven Rostedt' <rostedt@goodmis.org>
CC: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
include/trace/filemap.h | 13 +++++++++++++
mm/filemap.c | 5 +++++
2 files changed, 18 insertions(+)
Index: linux-2.6-lttng/mm/filemap.c
===================================================================
--- linux-2.6-lttng.orig/mm/filemap.c 2009-03-24 09:09:52.000000000 -0400
+++ linux-2.6-lttng/mm/filemap.c 2009-03-24 09:32:05.000000000 -0400
@@ -34,6 +34,7 @@
#include <linux/hardirq.h> /* for BUG_ON(!in_atomic()) only */
#include <linux/memcontrol.h>
#include <linux/mm_inline.h> /* for page_is_file_cache() */
+#include <trace/filemap.h>
#include "internal.h"
/*
@@ -43,6 +44,8 @@
#include <asm/mman.h>
+DEFINE_TRACE(wait_on_page_start);
+DEFINE_TRACE(wait_on_page_end);
/*
* Shared mappings implemented 30.11.1994. It's not fully working yet,
@@ -558,9 +561,11 @@ void wait_on_page_bit(struct page *page,
{
DEFINE_WAIT_BIT(wait, &page->flags, bit_nr);
+ trace_wait_on_page_start(page, bit_nr);
if (test_bit(bit_nr, &page->flags))
__wait_on_bit(page_waitqueue(page), &wait, sync_page,
TASK_UNINTERRUPTIBLE);
+ trace_wait_on_page_end(page, bit_nr);
}
EXPORT_SYMBOL(wait_on_page_bit);
Index: linux-2.6-lttng/include/trace/filemap.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6-lttng/include/trace/filemap.h 2009-03-24 09:32:13.000000000 -0400
@@ -0,0 +1,13 @@
+#ifndef _TRACE_FILEMAP_H
+#define _TRACE_FILEMAP_H
+
+#include <linux/tracepoint.h>
+
+DECLARE_TRACE(wait_on_page_start,
+ TPPROTO(struct page *page, int bit_nr),
+ TPARGS(page, bit_nr));
+DECLARE_TRACE(wait_on_page_end,
+ TPPROTO(struct page *page, int bit_nr),
+ TPARGS(page, bit_nr));
+
+#endif
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [patch 9/9] LTTng instrumentation - swap
[not found] <20090324155625.420966314@polymtl.ca>
2009-03-24 15:56 ` [patch 8/9] LTTng instrumentation - filemap Mathieu Desnoyers
@ 2009-03-24 15:56 ` Mathieu Desnoyers
2009-03-24 18:51 ` Ingo Molnar
1 sibling, 1 reply; 4+ messages in thread
From: Mathieu Desnoyers @ 2009-03-24 15:56 UTC (permalink / raw)
To: akpm, Ingo Molnar, linux-kernel, ltt-dev
Cc: Mathieu Desnoyers, linux-mm, Dave Hansen, Masami Hiramatsu,
Peter Zijlstra, Frank Ch. Eigler, Frederic Weisbecker,
Hideo AOKI, Takashi Nishiie, Steven Rostedt,
Eduard - Gabriel Munteanu
[-- Attachment #1: lttng-instrumentation-swap.patch --]
[-- Type: text/plain, Size: 5091 bytes --]
Instrumentation of waits caused by swap activity. Also instruments
swapon/swapoff events to keep track of active swap partitions.
Those tracepoints are used by LTTng.
About the performance impact of tracepoints (which is comparable to markers),
even without immediate values optimizations, tests done by Hideo Aoki on ia64
show no regression. His test case was using hackbench on a kernel where
scheduler instrumentation (about 5 events in the scheduler code) was added.
See the "Tracepoints" patch header for detailed performance results.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: linux-mm@kvack.org
CC: Dave Hansen <haveblue@us.ibm.com>
CC: Masami Hiramatsu <mhiramat@redhat.com>
CC: 'Peter Zijlstra' <peterz@infradead.org>
CC: "Frank Ch. Eigler" <fche@redhat.com>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: 'Hideo AOKI' <haoki@redhat.com>
CC: Takashi Nishiie <t-nishiie@np.css.fujitsu.com>
CC: 'Steven Rostedt' <rostedt@goodmis.org>
CC: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
include/trace/swap.h | 20 ++++++++++++++++++++
mm/memory.c | 4 ++++
mm/page_io.c | 4 ++++
mm/swapfile.c | 6 ++++++
4 files changed, 34 insertions(+)
Index: linux-2.6-lttng/mm/memory.c
===================================================================
--- linux-2.6-lttng.orig/mm/memory.c 2009-03-24 09:09:55.000000000 -0400
+++ linux-2.6-lttng/mm/memory.c 2009-03-24 09:32:15.000000000 -0400
@@ -55,6 +55,7 @@
#include <linux/kallsyms.h>
#include <linux/swapops.h>
#include <linux/elf.h>
+#include <trace/swap.h>
#include <asm/pgalloc.h>
#include <asm/uaccess.h>
@@ -64,6 +65,8 @@
#include "internal.h"
+DEFINE_TRACE(swap_in);
+
#ifndef CONFIG_NEED_MULTIPLE_NODES
/* use the per-pgdat data instead for discontigmem - mbligh */
unsigned long max_mapnr;
@@ -2431,6 +2434,7 @@ static int do_swap_page(struct mm_struct
/* Had to read the page from swap area: Major fault */
ret = VM_FAULT_MAJOR;
count_vm_event(PGMAJFAULT);
+ trace_swap_in(page, entry);
}
mark_page_accessed(page);
Index: linux-2.6-lttng/mm/page_io.c
===================================================================
--- linux-2.6-lttng.orig/mm/page_io.c 2009-03-24 09:09:52.000000000 -0400
+++ linux-2.6-lttng/mm/page_io.c 2009-03-24 09:32:15.000000000 -0400
@@ -17,8 +17,11 @@
#include <linux/bio.h>
#include <linux/swapops.h>
#include <linux/writeback.h>
+#include <trace/swap.h>
#include <asm/pgtable.h>
+DEFINE_TRACE(swap_out);
+
static struct bio *get_swap_bio(gfp_t gfp_flags, pgoff_t index,
struct page *page, bio_end_io_t end_io)
{
@@ -114,6 +117,7 @@ int swap_writepage(struct page *page, st
rw |= (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG);
count_vm_event(PSWPOUT);
set_page_writeback(page);
+ trace_swap_out(page);
unlock_page(page);
submit_bio(rw, bio);
out:
Index: linux-2.6-lttng/mm/swapfile.c
===================================================================
--- linux-2.6-lttng.orig/mm/swapfile.c 2009-03-24 09:09:52.000000000 -0400
+++ linux-2.6-lttng/mm/swapfile.c 2009-03-24 09:32:15.000000000 -0400
@@ -29,12 +29,16 @@
#include <linux/capability.h>
#include <linux/syscalls.h>
#include <linux/memcontrol.h>
+#include <trace/swap.h>
#include <asm/pgtable.h>
#include <asm/tlbflush.h>
#include <linux/swapops.h>
#include <linux/page_cgroup.h>
+DEFINE_TRACE(swap_file_open);
+DEFINE_TRACE(swap_file_close);
+
static DEFINE_SPINLOCK(swap_lock);
static unsigned int nr_swapfiles;
long nr_swap_pages;
@@ -1497,6 +1501,7 @@ SYSCALL_DEFINE1(swapoff, const char __us
swap_map = p->swap_map;
p->swap_map = NULL;
p->flags = 0;
+ trace_swap_file_close(swap_file);
spin_unlock(&swap_lock);
mutex_unlock(&swapon_mutex);
vfree(swap_map);
@@ -1886,6 +1891,7 @@ SYSCALL_DEFINE2(swapon, const char __use
} else {
swap_info[prev].next = p - swap_info;
}
+ trace_swap_file_open(swap_file, name);
spin_unlock(&swap_lock);
mutex_unlock(&swapon_mutex);
error = 0;
Index: linux-2.6-lttng/include/trace/swap.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6-lttng/include/trace/swap.h 2009-03-24 09:32:26.000000000 -0400
@@ -0,0 +1,20 @@
+#ifndef _TRACE_SWAP_H
+#define _TRACE_SWAP_H
+
+#include <linux/swap.h>
+#include <linux/tracepoint.h>
+
+DECLARE_TRACE(swap_in,
+ TPPROTO(struct page *page, swp_entry_t entry),
+ TPARGS(page, entry));
+DECLARE_TRACE(swap_out,
+ TPPROTO(struct page *page),
+ TPARGS(page));
+DECLARE_TRACE(swap_file_open,
+ TPPROTO(struct file *file, char *filename),
+ TPARGS(file, filename));
+DECLARE_TRACE(swap_file_close,
+ TPPROTO(struct file *file),
+ TPARGS(file));
+
+#endif
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* Re: [patch 8/9] LTTng instrumentation - filemap
2009-03-24 15:56 ` [patch 8/9] LTTng instrumentation - filemap Mathieu Desnoyers
@ 2009-03-24 18:39 ` Ingo Molnar
0 siblings, 0 replies; 4+ messages in thread
From: Ingo Molnar @ 2009-03-24 18:39 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: akpm, linux-kernel, ltt-dev, linux-mm, Dave Hansen,
Masami Hiramatsu, Peter Zijlstra, Frank Ch. Eigler,
Frederic Weisbecker, Hideo AOKI, Takashi Nishiie, Steven Rostedt,
Eduard - Gabriel Munteanu
* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> Index: linux-2.6-lttng/mm/filemap.c
> +DEFINE_TRACE(wait_on_page_start);
> +DEFINE_TRACE(wait_on_page_end);
These are extremely incomplete - to the level of being useless.
To understand the lifetime of the pagecache, the following basic
events have to be observed and instrumented:
- create a new page
- fill in a new page
- dirty a page [when we know this]
- request writeout of a page
- clean a page / complete writeout
- free a page due to MM pressure
- free a page due to truncation/delete
The following additional events are useful as well:
- mmap a page to a user-space address
- copy a page to a user-space address (read)
- write to a page from a user-space address (write)
- unmap a page from a user-space address
- fault in a user-space mapped pagecache page
optional:
- shmem attach/detach events
- shmem map/unmap events
- hugetlb map/unmap events
I'm sure I haven't listed them all. Have a look at the function-graph
tracer output to see what kind of basic events can happen to a
pagecache page.
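
As a rough illustration of the basic lifecycle list above, here is a
minimal userspace sketch of how those seven events could be enumerated.
Every identifier in it (pagecache_event, PC_EV_*, and the two helpers)
is hypothetical and exists only for this example; none of it comes from
the patch or from the kernel.

```c
/* Hypothetical event IDs, one per basic lifecycle step in the list above. */
enum pagecache_event {
	PC_EV_CREATE,            /* create a new page */
	PC_EV_FILL,              /* fill in a new page */
	PC_EV_DIRTY,             /* dirty a page */
	PC_EV_WRITEOUT_REQUEST,  /* request writeout of a page */
	PC_EV_WRITEOUT_COMPLETE, /* clean a page / complete writeout */
	PC_EV_FREE_PRESSURE,     /* free a page due to MM pressure */
	PC_EV_FREE_TRUNCATE,     /* free a page due to truncation/delete */
	PC_EV_COUNT
};

static const char *const pagecache_event_names[PC_EV_COUNT] = {
	"create", "fill", "dirty", "writeout_request",
	"writeout_complete", "free_pressure", "free_truncate",
};

/* Map an event ID to a printable name; out-of-range IDs give "unknown". */
const char *pagecache_event_name(enum pagecache_event ev)
{
	if ((unsigned int)ev >= PC_EV_COUNT)
		return "unknown";
	return pagecache_event_names[ev];
}

/* Return 1 if the event ends the page's lifetime in the pagecache. */
int pagecache_event_is_free(enum pagecache_event ev)
{
	return ev == PC_EV_FREE_PRESSURE || ev == PC_EV_FREE_TRUNCATE;
}
```

A real implementation would place one tracepoint at each corresponding
site in mm/; the enum only shows the coverage being asked for.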
Ingo
* Re: [patch 9/9] LTTng instrumentation - swap
2009-03-24 15:56 ` [patch 9/9] LTTng instrumentation - swap Mathieu Desnoyers
@ 2009-03-24 18:51 ` Ingo Molnar
0 siblings, 0 replies; 4+ messages in thread
From: Ingo Molnar @ 2009-03-24 18:51 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: akpm, linux-kernel, ltt-dev, linux-mm, Dave Hansen,
Masami Hiramatsu, Peter Zijlstra, Frank Ch. Eigler,
Frederic Weisbecker, Hideo AOKI, Takashi Nishiie, Steven Rostedt,
Eduard - Gabriel Munteanu
* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> +DECLARE_TRACE(swap_in,
> + TPPROTO(struct page *page, swp_entry_t entry),
> + TPARGS(page, entry));
> +DECLARE_TRACE(swap_out,
> + TPPROTO(struct page *page),
> + TPARGS(page));
> +DECLARE_TRACE(swap_file_open,
> + TPPROTO(struct file *file, char *filename),
> + TPARGS(file, filename));
> +DECLARE_TRACE(swap_file_close,
> + TPPROTO(struct file *file),
> + TPARGS(file));
These are more complete than the pagecache tracepoints, but still
too incomplete to give a comprehensive picture of swap activity.
Firstly, the swap_file_open/close events seem quite pointless. Most
systems enable swap during bootup and never close it. These
tracepoints just won't be exercised in practice.
Also, to _really_ help with debugging VM pressure problems, the
whole LRU state-machine should be instrumented, and linked up with
pagecache instrumentation via page frame numbers and (inode,offset)
[file] and (pgd,addr) [anon] pairs.
Not just the fact that something got swapped out is interesting, but
also the whole decision chain that leads up to it: the lifetime of a
page, and how it moves between the various stages of eviction and LRU
scores.
a minor nit:
> +DECLARE_TRACE(swap_file_open,
> + TPPROTO(struct file *file, char *filename),
> + TPARGS(file, filename));
there's no need to pass in the filename - it can be deduced in the
probe from struct file.
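
To illustrate the point, here is a minimal userspace sketch of a probe
that takes only the file and derives the name itself. The demo_* types
are mock stand-ins that merely mirror the shape of the kernel's
file->f_path.dentry->d_name.name chain, and probe_swap_file_open() is a
hypothetical probe body, not code from the patch.

```c
#include <stdio.h>

/* Mock stand-ins mirroring the shape of file->f_path.dentry->d_name.name. */
struct demo_qstr   { const char *name; };
struct demo_dentry { struct demo_qstr d_name; };
struct demo_path   { struct demo_dentry *dentry; };
struct demo_file   { struct demo_path f_path; };

/* Recover the last path component from the file object alone. */
const char *demo_file_name(const struct demo_file *file)
{
	if (!file || !file->f_path.dentry || !file->f_path.dentry->d_name.name)
		return "(unknown)";
	return file->f_path.dentry->d_name.name;
}

/*
 * Hypothetical probe body: it receives only the file, so the tracepoint
 * prototype needs no separate filename argument.
 * Returns the snprintf() result (length of the formatted record).
 */
int probe_swap_file_open(const struct demo_file *file, char *buf, size_t len)
{
	return snprintf(buf, len, "swap_file_open name=%s",
			demo_file_name(file));
}
```

Note that d_name.name only holds the last path component; in the kernel
a full path would need something like d_path(), which this sketch does
not attempt to model.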
a small inconsistency:
> +DECLARE_TRACE(swap_in,
> + TPPROTO(struct page *page, swp_entry_t entry),
> + TPARGS(page, entry));
> +DECLARE_TRACE(swap_out,
> + TPPROTO(struct page *page),
> + TPARGS(page));
you pass in swp_entry to trace_swap_in(), which encodes the offset -
but that parameter is not needed, the page already represents the
offset at that stage in do_swap_page(). (the actual data is not read
in yet from swap, but the page is already linked up in the
swap-cache and has the offset available - which a probe can
recover.)
So this suffices:
DECLARE_TRACE(swap_in,
TPPROTO(struct page *page),
TPARGS(page));
DECLARE_TRACE(swap_out,
TPPROTO(struct page *page),
TPARGS(page));
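
A minimal userspace sketch of why the page alone is enough: for a page
in the swap cache, the entry value is stored with the page, so a probe
can decode the offset without a separate argument. The demo_* names and
the 5-bit type field are assumptions made for this example; the real
kernel encoding is architecture dependent and uses swp_entry(),
swp_type(), swp_offset() and page_private().

```c
#include <stdint.h>

/* Assumed layout for the demo: swap type in the top 5 bits, offset below. */
#define DEMO_SWP_TYPE_SHIFT  59
#define DEMO_SWP_OFFSET_MASK ((((uint64_t)1) << DEMO_SWP_TYPE_SHIFT) - 1)

typedef struct { uint64_t val; } demo_swp_entry_t;

/* Mock page: private_data plays the role of page_private(page). */
struct demo_page {
	uint64_t private_data;
	int in_swap_cache;
};

/* Pack a swap type and offset into one entry value. */
demo_swp_entry_t demo_swp_entry(unsigned int type, uint64_t offset)
{
	demo_swp_entry_t e;

	e.val = ((uint64_t)type << DEMO_SWP_TYPE_SHIFT) |
		(offset & DEMO_SWP_OFFSET_MASK);
	return e;
}

unsigned int demo_swp_type(demo_swp_entry_t e)
{
	return (unsigned int)(e.val >> DEMO_SWP_TYPE_SHIFT);
}

uint64_t demo_swp_offset(demo_swp_entry_t e)
{
	return e.val & DEMO_SWP_OFFSET_MASK;
}

/*
 * Probe-side helper: recover the swap offset from the page itself.
 * Returns 0 on success, -1 if the page is not in the swap cache.
 */
int demo_page_swap_offset(const struct demo_page *page, uint64_t *offset)
{
	demo_swp_entry_t e;

	if (!page || !page->in_swap_cache)
		return -1;
	e.val = page->private_data;
	*offset = demo_swp_offset(e);
	return 0;
}
```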
And here again I'd like to see actual meaningful probe contents via
a TRACE_EVENT() construct. That shows and proves that it's all part
of a comprehensive framework, and the data that is recovered is
understood and put into a coherent whole - upstream. That makes it
immediately useful to the built-in tracers, and will also cause
fewer surprises downstream.
Ingo