* Re: [PATCH] Add tracepoints to track pagecache transition
[not found] <499A7CAD.9030409@bk.jp.nec.com>
@ 2009-02-17 9:33 ` Peter Zijlstra
2009-02-17 11:04 ` Atsushi Tsuji
0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2009-02-17 9:33 UTC (permalink / raw)
To: Atsushi Tsuji
Cc: linux-kernel, Jason Baron, Ingo Molnar, Mathieu Desnoyers,
Frank Ch. Eigler, Kazuto Miyoshi, rostedt, linux-mm,
Andrew Morton, Nick Piggin, Hugh Dickins
On Tue, 2009-02-17 at 18:00 +0900, Atsushi Tsuji wrote:
> The below patch adds instrumentation for pagecache.
And somehow you forgot to CC any of the mm people.. ;-)
> I thought it would be useful to trace pagecache behavior for problem
> analysis (performance bottlenecks, behavior differences between stable
> time and trouble time).
>
> By using those tracepoints, we can describe and visualize pagecache
> transition (file-by-file basis) in kernel and pagecache
> consumes most of the memory in running system and pagecache hit rate
> and writeback behavior will influence system load and performance.
> Signed-off-by: Atsushi Tsuji <a-tsuji@bk.jp.nec.com>
> ---
> diff --git a/include/trace/filemap.h b/include/trace/filemap.h
> new file mode 100644
> index 0000000..196955e
> --- /dev/null
> +++ b/include/trace/filemap.h
> @@ -0,0 +1,13 @@
> +#ifndef _TRACE_FILEMAP_H
> +#define _TRACE_FILEMAP_H
> +
> +#include <linux/tracepoint.h>
> +
> +DECLARE_TRACE(filemap_add_to_page_cache,
> + TPPROTO(struct address_space *mapping, pgoff_t offset),
> + TPARGS(mapping, offset));
> +DECLARE_TRACE(filemap_remove_from_page_cache,
> + TPPROTO(struct address_space *mapping),
> + TPARGS(mapping));
This is rather asymmetric, why don't we care about the offset for the
removed page?
> +#endif
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 23acefe..76a6887 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -34,6 +34,7 @@
> #include <linux/hardirq.h> /* for BUG_ON(!in_atomic()) only */
> #include <linux/memcontrol.h>
> #include <linux/mm_inline.h> /* for page_is_file_cache() */
> +#include <trace/filemap.h>
> #include "internal.h"
>
> /*
> @@ -43,6 +44,8 @@
>
> #include <asm/mman.h>
>
> +DEFINE_TRACE(filemap_add_to_page_cache);
> +DEFINE_TRACE(filemap_remove_from_page_cache);
>
> /*
> * Shared mappings implemented 30.11.1994. It's not fully working yet,
> @@ -120,6 +123,7 @@ void __remove_from_page_cache(struct page *page)
> page->mapping = NULL;
> mapping->nrpages--;
> __dec_zone_page_state(page, NR_FILE_PAGES);
> + trace_filemap_remove_from_page_cache(mapping);
> BUG_ON(page_mapped(page));
> mem_cgroup_uncharge_cache_page(page);
>
> @@ -475,6 +479,7 @@ int add_to_page_cache_locked(struct page *page, struct address_space *mapping,
> if (likely(!error)) {
> mapping->nrpages++;
> __inc_zone_page_state(page, NR_FILE_PAGES);
> + trace_filemap_add_to_page_cache(mapping, offset);
> } else {
> page->mapping = NULL;
> mem_cgroup_uncharge_cache_page(page);
>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Add tracepoints to track pagecache transition
2009-02-17 9:33 ` [PATCH] Add tracepoints to track pagecache transition Peter Zijlstra
@ 2009-02-17 11:04 ` Atsushi Tsuji
2009-02-17 11:38 ` KOSAKI Motohiro
2009-02-17 15:24 ` Steven Rostedt
0 siblings, 2 replies; 13+ messages in thread
From: Atsushi Tsuji @ 2009-02-17 11:04 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux-kernel, Jason Baron, Ingo Molnar, Mathieu Desnoyers,
Frank Ch. Eigler, Kazuto Miyoshi, rostedt, linux-mm,
Andrew Morton, Nick Piggin, Hugh Dickins
Peter Zijlstra wrote:
> On Tue, 2009-02-17 at 18:00 +0900, Atsushi Tsuji wrote:
>
>> The below patch adds instrumentation for pagecache.
>
> And somehow you forgot to CC any of the mm people.. ;-)
Hi Peter,
Ah, sorry.
Thank you for adding to CC list.
>> +DECLARE_TRACE(filemap_add_to_page_cache,
>> + TPPROTO(struct address_space *mapping, pgoff_t offset),
>> + TPARGS(mapping, offset));
>> +DECLARE_TRACE(filemap_remove_from_page_cache,
>> + TPPROTO(struct address_space *mapping),
>> + TPARGS(mapping));
>
> This is rather asymmetric, why don't we care about the offset for the
> removed page?
>
Indeed.
I added the offset to the argument for the removed page and resend fixed patch.
Signed-off-by: Atsushi Tsuji <a-tsuji@bk.jp.nec.com>
---
diff --git a/include/trace/filemap.h b/include/trace/filemap.h
new file mode 100644
index 0000000..a17dc92
--- /dev/null
+++ b/include/trace/filemap.h
@@ -0,0 +1,13 @@
+#ifndef _TRACE_FILEMAP_H
+#define _TRACE_FILEMAP_H
+
+#include <linux/tracepoint.h>
+
+DECLARE_TRACE(filemap_add_to_page_cache,
+ TPPROTO(struct address_space *mapping, pgoff_t offset),
+ TPARGS(mapping, offset));
+DECLARE_TRACE(filemap_remove_from_page_cache,
+ TPPROTO(struct address_space *mapping, pgoff_t offset),
+ TPARGS(mapping, offset));
+
+#endif
diff --git a/mm/filemap.c b/mm/filemap.c
index 23acefe..23f75d2 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -34,6 +34,7 @@
#include <linux/hardirq.h> /* for BUG_ON(!in_atomic()) only */
#include <linux/memcontrol.h>
#include <linux/mm_inline.h> /* for page_is_file_cache() */
+#include <trace/filemap.h>
#include "internal.h"
/*
@@ -43,6 +44,8 @@
#include <asm/mman.h>
+DEFINE_TRACE(filemap_add_to_page_cache);
+DEFINE_TRACE(filemap_remove_from_page_cache);
/*
* Shared mappings implemented 30.11.1994. It's not fully working yet,
@@ -120,6 +123,7 @@ void __remove_from_page_cache(struct page *page)
page->mapping = NULL;
mapping->nrpages--;
__dec_zone_page_state(page, NR_FILE_PAGES);
+ trace_filemap_remove_from_page_cache(mapping, page->index);
BUG_ON(page_mapped(page));
mem_cgroup_uncharge_cache_page(page);
@@ -475,6 +479,7 @@ int add_to_page_cache_locked(struct page *page, struct address_space *mapping,
if (likely(!error)) {
mapping->nrpages++;
__inc_zone_page_state(page, NR_FILE_PAGES);
+ trace_filemap_add_to_page_cache(mapping, offset);
} else {
page->mapping = NULL;
mem_cgroup_uncharge_cache_page(page);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Add tracepoints to track pagecache transition
2009-02-17 11:04 ` Atsushi Tsuji
@ 2009-02-17 11:38 ` KOSAKI Motohiro
2009-02-17 11:54 ` Peter Zijlstra
` (2 more replies)
2009-02-17 15:24 ` Steven Rostedt
1 sibling, 3 replies; 13+ messages in thread
From: KOSAKI Motohiro @ 2009-02-17 11:38 UTC (permalink / raw)
To: Atsushi Tsuji
Cc: kosaki.motohiro, Peter Zijlstra, linux-kernel, Jason Baron,
Ingo Molnar, Mathieu Desnoyers, Frank Ch. Eigler, Kazuto Miyoshi,
rostedt, linux-mm, Andrew Morton, Nick Piggin, Hugh Dickins
Hi
In my 1st impression, this patch description is a bit strange.
> The below patch adds instrumentation for pagecache.
>
> I thought it would be useful to trace pagecache behavior for problem
> analysis (performance bottlenecks, behavior differences between stable
> time and trouble time).
>
> By using those tracepoints, we can describe and visualize pagecache
> transition (file-by-file basis) in kernel and pagecache
> consumes most of the memory in running system and pagecache hit rate
> and writeback behavior will influence system load and performance.
Why do you think this tracepoint describe pagecache hit rate?
and, why describe writeback behavior?
>
> I attached an example which is visualization of pagecache status using
> SystemTap.
it seems no attached. and SystemTap isn't used kernel developer at all.
I don't think it's enough explanation.
Can you make seekwatcher liked completed comsumer program?
(if you don't know seekwatcher, see http://oss.oracle.com/~mason/seekwatcher/)
> That graph describes pagecache transition of File A and File B
> on a file-by-file basis with the situation where regular I/O to File A
> is delayed because of other I/O to File B.
If you want to see I/O activity, you need to add tracepoint into block layer.
> We visually understand
> pagecache for File A is narrowed down due to I/O pressure from File B.
confused. Can we assume the number of anon pages/files pages ratio don't chage?
> Peter Zijlstra wrote:
> > On Tue, 2009-02-17 at 18:00 +0900, Atsushi Tsuji wrote:
> >
> >> The below patch adds instrumentation for pagecache.
> >
> > And somehow you forgot to CC any of the mm people.. ;-)
>
> Hi Peter,
>
> Ah, sorry.
> Thank you for adding to CC list.
>
> >> +DECLARE_TRACE(filemap_add_to_page_cache,
> >> + TPPROTO(struct address_space *mapping, pgoff_t offset),
> >> + TPARGS(mapping, offset));
> >> +DECLARE_TRACE(filemap_remove_from_page_cache,
> >> + TPPROTO(struct address_space *mapping),
> >> + TPARGS(mapping));
> >
> > This is rather asymmetric, why don't we care about the offset for the
> > removed page?
> >
>
> Indeed.
> I added the offset to the argument for the removed page and resend fixed patch.
>
> Signed-off-by: Atsushi Tsuji <a-tsuji@bk.jp.nec.com>
> ---
> diff --git a/include/trace/filemap.h b/include/trace/filemap.h
please add diffstat.
> new file mode 100644
> index 0000000..a17dc92
> --- /dev/null
> +++ b/include/trace/filemap.h
> @@ -0,0 +1,13 @@
> +#ifndef _TRACE_FILEMAP_H
> +#define _TRACE_FILEMAP_H
> +
> +#include <linux/tracepoint.h>
> +
> +DECLARE_TRACE(filemap_add_to_page_cache,
> + TPPROTO(struct address_space *mapping, pgoff_t offset),
> + TPARGS(mapping, offset));
> +DECLARE_TRACE(filemap_remove_from_page_cache,
> + TPPROTO(struct address_space *mapping, pgoff_t offset),
> + TPARGS(mapping, offset));
> +
> +#endif
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 23acefe..23f75d2 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -34,6 +34,7 @@
> #include <linux/hardirq.h> /* for BUG_ON(!in_atomic()) only */
> #include <linux/memcontrol.h>
> #include <linux/mm_inline.h> /* for page_is_file_cache() */
> +#include <trace/filemap.h>
> #include "internal.h"
>
> /*
> @@ -43,6 +44,8 @@
>
> #include <asm/mman.h>
>
> +DEFINE_TRACE(filemap_add_to_page_cache);
> +DEFINE_TRACE(filemap_remove_from_page_cache);
>
> /*
> * Shared mappings implemented 30.11.1994. It's not fully working yet,
> @@ -120,6 +123,7 @@ void __remove_from_page_cache(struct page *page)
> page->mapping = NULL;
> mapping->nrpages--;
> __dec_zone_page_state(page, NR_FILE_PAGES);
> + trace_filemap_remove_from_page_cache(mapping, page->index);
__remove_from_page_cache() is passed struct page.
Why don't you use struct page
and, this mean
- the page have been removed from mapping.
- vmstate have been decremented.
- but, the page haven't been uncharged from memcg.
Why?
> BUG_ON(page_mapped(page));
> mem_cgroup_uncharge_cache_page(page);
>
> @@ -475,6 +479,7 @@ int add_to_page_cache_locked(struct page *page, struct address_space *mapping,
> if (likely(!error)) {
> mapping->nrpages++;
> __inc_zone_page_state(page, NR_FILE_PAGES);
> + trace_filemap_add_to_page_cache(mapping, offset);
Why do you select this line?
In general, trace point calling under spin lock grabbing is a bit problematic.
> } else {
> page->mapping = NULL;
> mem_cgroup_uncharge_cache_page(page);
>
And, both function is freqentlly called one.
I worry about performance issue. can you prove no degression?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Add tracepoints to track pagecache transition
2009-02-17 11:38 ` KOSAKI Motohiro
@ 2009-02-17 11:54 ` Peter Zijlstra
2009-02-17 12:33 ` KOSAKI Motohiro
2009-02-17 14:33 ` Frederic Weisbecker
2009-02-19 4:41 ` Atsushi Tsuji
2 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2009-02-17 11:54 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: Atsushi Tsuji, linux-kernel, Jason Baron, Ingo Molnar,
Mathieu Desnoyers, Frank Ch. Eigler, Kazuto Miyoshi, rostedt,
linux-mm, Andrew Morton, Nick Piggin, Hugh Dickins,
Arnaldo Carvalho de Melo
On Tue, 2009-02-17 at 20:38 +0900, KOSAKI Motohiro wrote:
> If you want to see I/O activity, you need to add tracepoint into block
> layer.
We already have that, Arnaldo converted blktrace to use tracepoints and
wrote an ftrace tracer for it.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Add tracepoints to track pagecache transition
2009-02-17 11:54 ` Peter Zijlstra
@ 2009-02-17 12:33 ` KOSAKI Motohiro
0 siblings, 0 replies; 13+ messages in thread
From: KOSAKI Motohiro @ 2009-02-17 12:33 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Atsushi Tsuji, linux-kernel, Jason Baron, Ingo Molnar,
Mathieu Desnoyers, Frank Ch. Eigler, Kazuto Miyoshi, rostedt,
linux-mm, Andrew Morton, Nick Piggin, Hugh Dickins,
Arnaldo Carvalho de Melo
>> If you want to see I/O activity, you need to add tracepoint into block
>> layer.
>
> We already have that, Arnaldo converted blktrace to use tracepoints and
> wrote an ftrace tracer for it.
Yup. I agree I selected wrong word.
My point is, if he want to know I/O delaying reason, the number of the
cache pages seems unrelated thing.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Add tracepoints to track pagecache transition
2009-02-17 11:38 ` KOSAKI Motohiro
2009-02-17 11:54 ` Peter Zijlstra
@ 2009-02-17 14:33 ` Frederic Weisbecker
2009-02-18 0:29 ` KOSAKI Motohiro
2009-02-19 4:41 ` Atsushi Tsuji
2 siblings, 1 reply; 13+ messages in thread
From: Frederic Weisbecker @ 2009-02-17 14:33 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: Atsushi Tsuji, Peter Zijlstra, linux-kernel, Jason Baron,
Ingo Molnar, Mathieu Desnoyers, Frank Ch. Eigler, Kazuto Miyoshi,
rostedt, linux-mm, Andrew Morton, Nick Piggin, Hugh Dickins
On Tue, Feb 17, 2009 at 08:38:20PM +0900, KOSAKI Motohiro wrote:
> Hi
>
>
> In my 1st impression, this patch description is a bit strange.
>
> > The below patch adds instrumentation for pagecache.
> >
> > I thought it would be useful to trace pagecache behavior for problem
> > analysis (performance bottlenecks, behavior differences between stable
> > time and trouble time).
> >
> > By using those tracepoints, we can describe and visualize pagecache
> > transition (file-by-file basis) in kernel and pagecache
> > consumes most of the memory in running system and pagecache hit rate
> > and writeback behavior will influence system load and performance.
>
> Why do you think this tracepoint describe pagecache hit rate?
> and, why describe writeback behavior?
>
> >
> > I attached an example which is visualization of pagecache status using
> > SystemTap.
>
> it seems no attached. and SystemTap isn't used kernel developer at all.
> I don't think it's enough explanation.
> Can you make seekwatcher liked completed comsumer program?
> (if you don't know seekwatcher, see http://oss.oracle.com/~mason/seekwatcher/)
>
> > That graph describes pagecache transition of File A and File B
> > on a file-by-file basis with the situation where regular I/O to File A
> > is delayed because of other I/O to File B.
>
> If you want to see I/O activity, you need to add tracepoint into block layer.
>
> > We visually understand
> > pagecache for File A is narrowed down due to I/O pressure from File B.
>
> confused. Can we assume the number of anon pages/files pages ratio don't chage?
>
>
> > Peter Zijlstra wrote:
> > > On Tue, 2009-02-17 at 18:00 +0900, Atsushi Tsuji wrote:
> > >
> > >> The below patch adds instrumentation for pagecache.
> > >
> > > And somehow you forgot to CC any of the mm people.. ;-)
> >
> > Hi Peter,
> >
> > Ah, sorry.
> > Thank you for adding to CC list.
> >
> > >> +DECLARE_TRACE(filemap_add_to_page_cache,
> > >> + TPPROTO(struct address_space *mapping, pgoff_t offset),
> > >> + TPARGS(mapping, offset));
> > >> +DECLARE_TRACE(filemap_remove_from_page_cache,
> > >> + TPPROTO(struct address_space *mapping),
> > >> + TPARGS(mapping));
> > >
> > > This is rather asymmetric, why don't we care about the offset for the
> > > removed page?
> > >
> >
> > Indeed.
> > I added the offset to the argument for the removed page and resend fixed patch.
> >
> > Signed-off-by: Atsushi Tsuji <a-tsuji@bk.jp.nec.com>
> > ---
> > diff --git a/include/trace/filemap.h b/include/trace/filemap.h
>
> please add diffstat.
>
>
> > new file mode 100644
> > index 0000000..a17dc92
> > --- /dev/null
> > +++ b/include/trace/filemap.h
> > @@ -0,0 +1,13 @@
> > +#ifndef _TRACE_FILEMAP_H
> > +#define _TRACE_FILEMAP_H
> > +
> > +#include <linux/tracepoint.h>
> > +
> > +DECLARE_TRACE(filemap_add_to_page_cache,
> > + TPPROTO(struct address_space *mapping, pgoff_t offset),
> > + TPARGS(mapping, offset));
> > +DECLARE_TRACE(filemap_remove_from_page_cache,
> > + TPPROTO(struct address_space *mapping, pgoff_t offset),
> > + TPARGS(mapping, offset));
> > +
> > +#endif
> > diff --git a/mm/filemap.c b/mm/filemap.c
> > index 23acefe..23f75d2 100644
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -34,6 +34,7 @@
> > #include <linux/hardirq.h> /* for BUG_ON(!in_atomic()) only */
> > #include <linux/memcontrol.h>
> > #include <linux/mm_inline.h> /* for page_is_file_cache() */
> > +#include <trace/filemap.h>
> > #include "internal.h"
> >
> > /*
> > @@ -43,6 +44,8 @@
> >
> > #include <asm/mman.h>
> >
> > +DEFINE_TRACE(filemap_add_to_page_cache);
> > +DEFINE_TRACE(filemap_remove_from_page_cache);
> >
> > /*
> > * Shared mappings implemented 30.11.1994. It's not fully working yet,
> > @@ -120,6 +123,7 @@ void __remove_from_page_cache(struct page *page)
> > page->mapping = NULL;
> > mapping->nrpages--;
> > __dec_zone_page_state(page, NR_FILE_PAGES);
> > + trace_filemap_remove_from_page_cache(mapping, page->index);
>
> __remove_from_page_cache() is passed struct page.
> Why don't you use struct page
>
> and, this mean
> - the page have been removed from mapping.
> - vmstate have been decremented.
> - but, the page haven't been uncharged from memcg.
>
> Why?
>
>
> > BUG_ON(page_mapped(page));
> > mem_cgroup_uncharge_cache_page(page);
> >
> > @@ -475,6 +479,7 @@ int add_to_page_cache_locked(struct page *page, struct address_space *mapping,
> > if (likely(!error)) {
> > mapping->nrpages++;
> > __inc_zone_page_state(page, NR_FILE_PAGES);
> > + trace_filemap_add_to_page_cache(mapping, offset);
>
> Why do you select this line?
> In general, trace point calling under spin lock grabbing is a bit problematic.
>
>
> > } else {
> > page->mapping = NULL;
> > mem_cgroup_uncharge_cache_page(page);
> >
>
> And, both function is freqentlly called one.
> I worry about performance issue. can you prove no degression?
It would be very hard to prove. Tracepoints are very cheap in that they only
add the overhead of a single branch check while off.
But are there some plans about writing a tracer or so for pagecache?
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Add tracepoints to track pagecache transition
2009-02-17 11:04 ` Atsushi Tsuji
2009-02-17 11:38 ` KOSAKI Motohiro
@ 2009-02-17 15:24 ` Steven Rostedt
2009-02-19 18:51 ` Christoph Hellwig
1 sibling, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2009-02-17 15:24 UTC (permalink / raw)
To: Atsushi Tsuji
Cc: Peter Zijlstra, linux-kernel, Jason Baron, Ingo Molnar,
Mathieu Desnoyers, Frank Ch. Eigler, Kazuto Miyoshi, linux-mm,
Andrew Morton, Nick Piggin, Hugh Dickins
On Tue, 17 Feb 2009, Atsushi Tsuji wrote:
> >
> > This is rather asymmetric, why don't we care about the offset for the
> > removed page?
> >
>
> Indeed.
> I added the offset to the argument for the removed page and resend fixed patch.
>
> Signed-off-by: Atsushi Tsuji <a-tsuji@bk.jp.nec.com>
Could you package it up in one patch again and resend with [PATCH v2].
Also make sure to Cc the memory folks, and ask for an Acked-by from them.
Thanks,
-- Steve
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Add tracepoints to track pagecache transition
2009-02-17 14:33 ` Frederic Weisbecker
@ 2009-02-18 0:29 ` KOSAKI Motohiro
0 siblings, 0 replies; 13+ messages in thread
From: KOSAKI Motohiro @ 2009-02-18 0:29 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: kosaki.motohiro, Atsushi Tsuji, Peter Zijlstra, linux-kernel,
Jason Baron, Ingo Molnar, Mathieu Desnoyers, Frank Ch. Eigler,
Kazuto Miyoshi, rostedt, linux-mm, Andrew Morton, Nick Piggin,
Hugh Dickins
Hi Frederic,
> > And, both function is freqentlly called one.
> > I worry about performance issue. can you prove no degression?
>
> It would be very hard to prove. Tracepoints are very cheap in that they only
> add the overhead of a single branch check while off.
this is typical reviewing comment.
Memory folks adage says,
Don't believe theory, you believe benchmark result.
I don't oppose your theorical background opinion :)
> But are there some plans about writing a tracer or so for pagecache?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Add tracepoints to track pagecache transition
2009-02-17 11:38 ` KOSAKI Motohiro
2009-02-17 11:54 ` Peter Zijlstra
2009-02-17 14:33 ` Frederic Weisbecker
@ 2009-02-19 4:41 ` Atsushi Tsuji
2009-02-19 13:12 ` KOSAKI Motohiro
2 siblings, 1 reply; 13+ messages in thread
From: Atsushi Tsuji @ 2009-02-19 4:41 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: Peter Zijlstra, linux-kernel, Jason Baron, Ingo Molnar,
Mathieu Desnoyers, Frank Ch. Eigler, Kazuto Miyoshi, rostedt,
linux-mm, Andrew Morton, Nick Piggin, Hugh Dickins
Hi Kosaki-san,
Thank you for your comment.
KOSAKI Motohiro wrote:
> Hi
>
>
> In my 1st impression, this patch description is a bit strange.
>
>> The below patch adds instrumentation for pagecache.
>>
>> I thought it would be useful to trace pagecache behavior for problem
>> analysis (performance bottlenecks, behavior differences between stable
>> time and trouble time).
>>
>> By using those tracepoints, we can describe and visualize pagecache
>> transition (file-by-file basis) in kernel and pagecache
>> consumes most of the memory in running system and pagecache hit rate
>> and writeback behavior will influence system load and performance.
>
> Why do you think this tracepoint describe pagecache hit rate?
> and, why describe writeback behavior?
I mean, we can describe file-by-file basis pagecache usage by using
these tracepoints and it is important for analyzing process I/O behavior.
Currently, we can understand the amount of pagecache from "Cached"
in /proc/meminfo. So I'd like to understand which files are using pagecache.
>
>> I attached an example which is visualization of pagecache status using
>> SystemTap.
>
> it seems no attached. and SystemTap isn't used kernel developer at all.
> I don't think it's enough explanation.
> Can you make seekwatcher liked completed comsumer program?
> (if you don't know seekwatcher, see http://oss.oracle.com/~mason/seekwatcher/)
I understand a tracer using these tracepoints need to be implemented.
What I want to do is counting pagecache per file. We can retrieve inode
from mapping and count pagecache per inode in these tracepoints.
>> That graph describes pagecache transition of File A and File B
>> on a file-by-file basis with the situation where regular I/O to File A
>> is delayed because of other I/O to File B.
>
> If you want to see I/O activity, you need to add tracepoint into block layer.
I think tracking pagecache is useful for understanding process I/O activity,
because whether process I/O completes by accessing memory or HDD is determined by
accessed files on pagecache or not.
>
>> We visually understand
>> pagecache for File A is narrowed down due to I/O pressure from File B.
>
> confused. Can we assume the number of anon pages/files pages ratio don't chage?
>
>
>> Peter Zijlstra wrote:
>>> On Tue, 2009-02-17 at 18:00 +0900, Atsushi Tsuji wrote:
>>>
>>>> The below patch adds instrumentation for pagecache.
>>> And somehow you forgot to CC any of the mm people.. ;-)
>> Hi Peter,
>>
>> Ah, sorry.
>> Thank you for adding to CC list.
>>
>>>> +DECLARE_TRACE(filemap_add_to_page_cache,
>>>> + TPPROTO(struct address_space *mapping, pgoff_t offset),
>>>> + TPARGS(mapping, offset));
>>>> +DECLARE_TRACE(filemap_remove_from_page_cache,
>>>> + TPPROTO(struct address_space *mapping),
>>>> + TPARGS(mapping));
>>> This is rather asymmetric, why don't we care about the offset for the
>>> removed page?
>>>
>> Indeed.
>> I added the offset to the argument for the removed page and resend fixed patch.
>>
>> Signed-off-by: Atsushi Tsuji <a-tsuji@bk.jp.nec.com>
>> ---
>> diff --git a/include/trace/filemap.h b/include/trace/filemap.h
>
> please add diffstat.
>
>
>> new file mode 100644
>> index 0000000..a17dc92
>> --- /dev/null
>> +++ b/include/trace/filemap.h
>> @@ -0,0 +1,13 @@
>> +#ifndef _TRACE_FILEMAP_H
>> +#define _TRACE_FILEMAP_H
>> +
>> +#include <linux/tracepoint.h>
>> +
>> +DECLARE_TRACE(filemap_add_to_page_cache,
>> + TPPROTO(struct address_space *mapping, pgoff_t offset),
>> + TPARGS(mapping, offset));
>> +DECLARE_TRACE(filemap_remove_from_page_cache,
>> + TPPROTO(struct address_space *mapping, pgoff_t offset),
>> + TPARGS(mapping, offset));
>> +
>> +#endif
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index 23acefe..23f75d2 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -34,6 +34,7 @@
>> #include <linux/hardirq.h> /* for BUG_ON(!in_atomic()) only */
>> #include <linux/memcontrol.h>
>> #include <linux/mm_inline.h> /* for page_is_file_cache() */
>> +#include <trace/filemap.h>
>> #include "internal.h"
>>
>> /*
>> @@ -43,6 +44,8 @@
>>
>> #include <asm/mman.h>
>>
>> +DEFINE_TRACE(filemap_add_to_page_cache);
>> +DEFINE_TRACE(filemap_remove_from_page_cache);
>>
>> /*
>> * Shared mappings implemented 30.11.1994. It's not fully working yet,
>> @@ -120,6 +123,7 @@ void __remove_from_page_cache(struct page *page)
>> page->mapping = NULL;
>> mapping->nrpages--;
>> __dec_zone_page_state(page, NR_FILE_PAGES);
>> + trace_filemap_remove_from_page_cache(mapping, page->index);
>
> __remove_from_page_cache() is passed struct page.
> Why don't you use struct page
>
> and, this mean
> - the page have been removed from mapping.
> - vmstate have been decremented.
> - but, the page haven't been uncharged from memcg.
>
> Why?
>
>
>> BUG_ON(page_mapped(page));
>> mem_cgroup_uncharge_cache_page(page);
>>
>> @@ -475,6 +479,7 @@ int add_to_page_cache_locked(struct page *page, struct address_space *mapping,
>> if (likely(!error)) {
>> mapping->nrpages++;
>> __inc_zone_page_state(page, NR_FILE_PAGES);
>> + trace_filemap_add_to_page_cache(mapping, offset);
>
> Why do you select this line?
> In general, trace point calling under spin lock grabbing is a bit problematic.
I understand my patch is wrong. I will fix and send it later.
>
>> } else {
>> page->mapping = NULL;
>> mem_cgroup_uncharge_cache_page(page);
>>
>
> And, both function is freqentlly called one.
> I worry about performance issue. can you prove no degression?
I will try to probe that.
Thanks,
-Atsushi
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Add tracepoints to track pagecache transition
2009-02-19 4:41 ` Atsushi Tsuji
@ 2009-02-19 13:12 ` KOSAKI Motohiro
2009-02-19 14:21 ` Lee Schermerhorn
0 siblings, 1 reply; 13+ messages in thread
From: KOSAKI Motohiro @ 2009-02-19 13:12 UTC (permalink / raw)
To: Atsushi Tsuji
Cc: Peter Zijlstra, linux-kernel, Jason Baron, Ingo Molnar,
Mathieu Desnoyers, Frank Ch. Eigler, Kazuto Miyoshi, rostedt,
linux-mm, Andrew Morton, Nick Piggin, Hugh Dickins
> Hi Kosaki-san,
>
> Thank you for your comment.
>
> KOSAKI Motohiro wrote:
>> Hi
>>
>>
>> In my 1st impression, this patch description is a bit strange.
>>
>>> The below patch adds instrumentation for pagecache.
>>>
>>> I thought it would be useful to trace pagecache behavior for problem
>>> analysis (performance bottlenecks, behavior differences between stable
>>> time and trouble time).
>>>
>>> By using those tracepoints, we can describe and visualize pagecache
>>> transition (file-by-file basis) in kernel and pagecache
>>> consumes most of the memory in running system and pagecache hit rate
>>> and writeback behavior will influence system load and performance.
>>
>> Why do you think this tracepoint describe pagecache hit rate?
>> and, why describe writeback behavior?
>
> I mean, we can describe file-by-file basis pagecache usage by using
> these tracepoints and it is important for analyzing process I/O behavior.
More confusing.
Your page cache tracepoint don't have any per-process information.
> Currently, we can understand the amount of pagecache from "Cached"
> in /proc/meminfo. So I'd like to understand which files are using pagecache.
There is one meta question, Why do you think file-by-file pagecache
infomartion is valueable?
>>> I attached an example which is visualization of pagecache status using
>>> SystemTap.
>>
>> it seems no attached. and SystemTap isn't used kernel developer at all.
>> I don't think it's enough explanation.
>> Can you make seekwatcher liked completed comsumer program?
>> (if you don't know seekwatcher, see http://oss.oracle.com/~mason/seekwatcher/)
>
> I understand a tracer using these tracepoints need to be implemented.
> What I want to do is counting pagecache per file. We can retrieve inode
> from mapping and count pagecache per inode in these tracepoints.
>
>
>>> That graph describes pagecache transition of File A and File B
>>> on a file-by-file basis with the situation where regular I/O to File A
>>> is delayed because of other I/O to File B.
>>
>> If you want to see I/O activity, you need to add tracepoint into block layer.
>
> I think tracking pagecache is useful for understanding process I/O activity,
> because whether process I/O completes by accessing memory or HDD is determined by
> accessed files on pagecache or not.
I don't know your opinion is right or not.
However, your opinion don't get consensus yet.
Perhaps, you need to make demonstrate programs, I think.
>> And, both function is freqentlly called one.
>> I worry about performance issue. can you prove no degression?
>
> I will try to probe that.
Perhaps, this is not needed yet.
Generally, worthless patch is never merged although it's no performance penalty.
So, I think you should explain the patch worth at first :)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Add tracepoints to track pagecache transition
2009-02-19 13:12 ` KOSAKI Motohiro
@ 2009-02-19 14:21 ` Lee Schermerhorn
2009-02-20 1:13 ` KOSAKI Motohiro
0 siblings, 1 reply; 13+ messages in thread
From: Lee Schermerhorn @ 2009-02-19 14:21 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: Atsushi Tsuji, Peter Zijlstra, linux-kernel, Jason Baron,
Ingo Molnar, Mathieu Desnoyers, Frank Ch. Eigler, Kazuto Miyoshi,
rostedt, linux-mm, Andrew Morton, Nick Piggin, Hugh Dickins
On Thu, 2009-02-19 at 22:12 +0900, KOSAKI Motohiro wrote:
> > Hi Kosaki-san,
> >
> > Thank you for your comment.
> >
> > KOSAKI Motohiro wrote:
> >> Hi
> >>
> >>
> >> In my 1st impression, this patch description is a bit strange.
> >>
> >>> The below patch adds instrumentation for pagecache.
> >>>
> >>> I thought it would be useful to trace pagecache behavior for problem
> >>> analysis (performance bottlenecks, behavior differences between stable
> >>> time and trouble time).
> >>>
> >>> By using those tracepoints, we can describe and visualize pagecache
> >>> transition (file-by-file basis) in kernel and pagecache
> >>> consumes most of the memory in running system and pagecache hit rate
> >>> and writeback behavior will influence system load and performance.
> >>
> >> Why do you think this tracepoint describe pagecache hit rate?
> >> and, why describe writeback behavior?
> >
> > I mean, we can describe file-by-file basis pagecache usage by using
> > these tracepoints and it is important for analyzing process I/O behavior.
>
> More confusing.
> Your page cache tracepoint don't have any per-process information.
>
>
> > Currently, we can understand the amount of pagecache from "Cached"
> > in /proc/meminfo. So I'd like to understand which files are using pagecache.
>
> There is one meta question, Why do you think file-by-file pagecache
> infomartion is valueable?
>
One might take a look at Marcello Tosatti's old 'vmtrace' patch. It
contains it's own data store/transport via relayfs, but the trace points
could be ported to the current kernel tracing infrastructure.
Here's a starting point: http://linux-mm.org/VmTrace
Quoting from that page:
>From the previous email to linux-mm:
>"The sequence of pages which a given process or workload accesses
>during its lifetime, a.k.a. "reference trace", is very important
>information. It has been used in the past for comparison of page
>replacement algorithms and other optimizations..."
Lee
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Add tracepoints to track pagecache transition
2009-02-17 15:24 ` Steven Rostedt
@ 2009-02-19 18:51 ` Christoph Hellwig
0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2009-02-19 18:51 UTC (permalink / raw)
To: Steven Rostedt
Cc: Atsushi Tsuji, Peter Zijlstra, linux-kernel, Jason Baron,
Ingo Molnar, Mathieu Desnoyers, Frank Ch. Eigler, Kazuto Miyoshi,
linux-mm, Andrew Morton, Nick Piggin, Hugh Dickins
On Tue, Feb 17, 2009 at 10:24:46AM -0500, Steven Rostedt wrote:
>
> On Tue, 17 Feb 2009, Atsushi Tsuji wrote:
> > >
> > > This is rather asymmetric, why don't we care about the offset for the
> > > removed page?
> > >
> >
> > Indeed.
> > I added the offset to the argument for the removed page and resend fixed patch.
> >
> > Signed-off-by: Atsushi Tsuji <a-tsuji@bk.jp.nec.com>
>
> Could you package it up in one patch again and resend with [PATCH v2].
> Also make sure to Cc the memory folks, and ask for an Acked-by from them.
Well, until we actually get a consumer of those tracepoints, e.g. a
ftrace pluging into the tree strong NACK for me.
(p.s. I really don't get it why people keep trying to push dead code
into the tree)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] Add tracepoints to track pagecache transition
2009-02-19 14:21 ` Lee Schermerhorn
@ 2009-02-20 1:13 ` KOSAKI Motohiro
0 siblings, 0 replies; 13+ messages in thread
From: KOSAKI Motohiro @ 2009-02-20 1:13 UTC (permalink / raw)
To: Lee Schermerhorn
Cc: kosaki.motohiro, Atsushi Tsuji, Peter Zijlstra, linux-kernel,
Jason Baron, Ingo Molnar, Mathieu Desnoyers, Frank Ch. Eigler,
Kazuto Miyoshi, rostedt, linux-mm, Andrew Morton, Nick Piggin,
Hugh Dickins
Hi
> > > Currently, we can understand the amount of pagecache from "Cached"
> > > in /proc/meminfo. So I'd like to understand which files are using pagecache.
> >
> > There is one meta question, Why do you think file-by-file pagecache
> > infomartion is valueable?
>
> One might take a look at Marcello Tosatti's old 'vmtrace' patch. It
> contains it's own data store/transport via relayfs, but the trace points
> could be ported to the current kernel tracing infrastructure.
>
> Here's a starting point: http://linux-mm.org/VmTrace
>
> Quoting from that page:
>
> >From the previous email to linux-mm:
> >"The sequence of pages which a given process or workload accesses
> >during its lifetime, a.k.a. "reference trace", is very important
> >information. It has been used in the past for comparison of page
> >replacement algorithms and other optimizations..."
Sure.
but strong difference exist.
vmtrace
- can run standalone
- reviewer can confirm its output result is useful or not.
Christoph also explained reason more kindly.
I think we need useful consumer.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2009-02-20 1:17 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <499A7CAD.9030409@bk.jp.nec.com>
2009-02-17 9:33 ` [PATCH] Add tracepoints to track pagecache transition Peter Zijlstra
2009-02-17 11:04 ` Atsushi Tsuji
2009-02-17 11:38 ` KOSAKI Motohiro
2009-02-17 11:54 ` Peter Zijlstra
2009-02-17 12:33 ` KOSAKI Motohiro
2009-02-17 14:33 ` Frederic Weisbecker
2009-02-18 0:29 ` KOSAKI Motohiro
2009-02-19 4:41 ` Atsushi Tsuji
2009-02-19 13:12 ` KOSAKI Motohiro
2009-02-19 14:21 ` Lee Schermerhorn
2009-02-20 1:13 ` KOSAKI Motohiro
2009-02-17 15:24 ` Steven Rostedt
2009-02-19 18:51 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox