From: Curt Wohlgemuth <curtw@google.com>
To: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>,
Christoph Hellwig <hch@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
Dave Chinner <david@fromorbit.com>,
Michael Rubin <mrubin@google.com>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH 2/2 v2] writeback: Add writeback stats for pages written
Date: Mon, 15 Aug 2011 11:56:08 -0700
Message-ID: <CAO81RMbpK4ZE=4c5khSrGpzDrXbyynWp8QoFbUjMuHFeJtbDDw@mail.gmail.com>
In-Reply-To: <20110815184023.GA16369@quack.suse.cz>

Hi Jan:

On Mon, Aug 15, 2011 at 11:40 AM, Jan Kara <jack@suse.cz> wrote:
> On Mon 15-08-11 10:16:38, Curt Wohlgemuth wrote:
>> On Mon, Aug 15, 2011 at 6:48 AM, Wu Fengguang <fengguang.wu@intel.com> wrote:
>> > Curt,
>> >
>> > Some thoughts about the interface..before dipping into the code.
>> >
>> > On Sat, Aug 13, 2011 at 06:47:25AM +0800, Curt Wohlgemuth wrote:
>> >> Add a new file, /proc/writeback/stats, which displays
>> >
>> > That's creating a new top directory in /proc. Do you have plans for
>> > adding more files under it?
>>
>> Good question. We have several files under /proc/writeback in our
>> kernels that we created at various times, some of which are probably
>> no longer useful, but others seem to be. For example:
>> - congestion: prints # of calls, # of jiffies slept in
>> congestion_wait() / io_schedule_timeout() from various call points
>> - threshold_dirty : prints the current global FG threshold
>> - threshold_bg : prints the current global BG threshold
>> - pages_cleaned : prints the # pages sent to writeback -- same as
>> 'nr_written' in /proc/vmstat (ours was earlier :-( )
>> - pages_dirtied (same as nr_dirtied in /proc/vmstat)
>> - prop_vm_XXX : print shift/events from vm_completions and vm_dirties
>>
>> I'm not sure right now if global FG/BG thresholds appear anywhere in a
>> 3.1 kernel; if so, the two threshold files above are superfluous. So
>> are the pages_cleaned/dirtied. The prop_vm files have not proven
>> useful to me. I think the congestion file has a lot of value,
>> especially in an IO-less throttling world...
> /sys/kernel/debug/bdi/<dev>/stats has BdiDirtyThresh, DirtyThresh, and
> BackgroundThresh. So we should already expose all you have in the threshold
> files.

Ah, right, I knew that and overlooked it. I get confused looking at
lots of kernel versions and patches at the same time :-).

> Regarding congestion_wait() statistics - do I get right that the numbers
> gathered actually depend on the number of threads using the congested
> device? They are something like
> \sum_{over threads} time_waited_for_bdi
> How do you interpret the resulting numbers then?

I don't have it by thread; just stupidly as totals, like this:

  calls: ttfp                                 11290
  time:  ttfp                                 558191
  calls: shrink_inactive_list isolated        xxx
  time:  shrink_inactive_list isolated        xxx
  calls: shrink_inactive_list lumpy reclaim   xxx
  time:  shrink_inactive_list lumpy reclaim   xxx
  calls: balance_pgdat                        xxx
  time:  balance_pgdat                        xxx
  calls: alloc_pages_high_priority            xxx
  time:  alloc_pages_high_priority            xxx
  calls: alloc_pages_slowpath                 xxx
  time:  alloc_pages_slowpath                 xxx
  calls: throttle_vm_writeout                 xxx
  time:  throttle_vm_writeout                 xxx
  calls: balance_dirty_pages                  xxx
  time:  balance_dirty_pages                  xxx
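
Totals in this shape have a variable number of tokens per line, so a
script cannot rely on fixed columns. As a rough sketch only (the
function name parse_totals is made up for illustration, and the input
is just the sample lines above, not real kernel output), the last
token can be taken as the value and everything between the colon and
that token as the call point:

```python
def parse_totals(text):
    """Parse lines such as 'calls: ttfp 11290' into {(kind, site): value}."""
    totals = {}
    for line in text.splitlines():
        kind, sep, rest = line.partition(":")
        tokens = rest.split()
        if not sep or len(tokens) < 2:
            continue  # not a 'kind: site value' line
        *site, value = tokens
        # Placeholder values such as 'xxx' are kept as strings.
        parsed = int(value) if value.isdigit() else value
        totals[(kind.strip(), " ".join(site))] = parsed
    return totals


# Sample lines copied from the totals shown above.
sample = """\
calls: ttfp 11290
time: ttfp 558191
calls: shrink_inactive_list isolated xxx
time : shrink_inactive_list isolated xxx
"""

totals = parse_totals(sample)
print(totals[("calls", "ttfp")], totals[("time", "ttfp")])
```

This is the same idea as taking the last field with awk, with the call
point name kept alongside the value.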

Note that the "call" points above are from a very old (2.6.34 +
backports) kernel, but you get the idea. We just wrap
congestion_wait() in a routine that takes a 'type' parameter, calls
congestion_wait(), increments the matching 'calls' stat, and adds the
return value of congestion_wait() to the matching 'time' stat.
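
To make the bookkeeping concrete, here is a small userspace model of
that wrapper. It is only a sketch and not the kernel code: the names
counted_wait and fake_congestion_wait are hypothetical, and the
stand-in wait simply returns the timeout it is given so that there is
something to accumulate.

```python
from collections import defaultdict

calls = defaultdict(int)      # per-type count of wrapped waits
wait_time = defaultdict(int)  # per-type sum of the wait's return values


def counted_wait(wait_type, wait_fn, *args):
    """Call wait_fn(*args), then charge the call and its return value to wait_type."""
    ret = wait_fn(*args)
    calls[wait_type] += 1
    wait_time[wait_type] += ret
    return ret


def fake_congestion_wait(timeout):
    # Hypothetical stand-in for the real wait: returns the timeout unchanged.
    return timeout


counted_wait("balance_dirty_pages", fake_congestion_wait, 10)
counted_wait("balance_dirty_pages", fake_congestion_wait, 5)
counted_wait("throttle_vm_writeout", fake_congestion_wait, 3)

# Print the totals in the same two-lines-per-type shape as the sample above.
for name in sorted(calls):
    print(f"calls: {name} {calls[name]}")
    print(f"time : {name} {wait_time[name]}")
```

Because the counters are keyed only by type, the result is a total
over all callers of that type, not a per-thread figure.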

For a given workload, you can get an idea of where congestion is
adding to delays. I really think that for IO-less
balance_dirty_pages(), we need some insight into how long writer
threads are being throttled. Tracepoints are great, but not
sufficient for that, IMHO.

Thanks,
Curt
>
> Honza
>
>> >> machine global data for how many pages were cleaned for
>> >> which reasons. It also displays some additional counts for
>> >> various writeback events.
>> >>
>> >> These data are also available for each BDI, in
>> >> /sys/block/<device>/bdi/writeback_stats .
>> >
>> >> Sample output:
>> >>
>> >> page: balance_dirty_pages 2561544
>> >> page: background_writeout 5153
>> >> page: try_to_free_pages 0
>> >> page: sync 0
>> >> page: kupdate 102723
>> >> page: fdatawrite 1228779
>> >> page: laptop_periodic 0
>> >> page: free_more_memory 0
>> >> page: fs_free_space 0
>> >> periodic writeback 377
>> >> single inode wait 0
>> >> writeback_wb wait 1
>> >
>> > That's already useful data, and could be further extended (in
>> > future patches) to answer questions like "what's the writeback
>> > efficiency in terms of effective chunk size?"
>> >
>> > So in future there could be lines like
>> >
>> > pages: balance_dirty_pages 2561544
>> > chunks: balance_dirty_pages XXXXXXX
>> > works: balance_dirty_pages XXXXXXX
>> >
>> > or even derived lines like
>> >
>> > pages_per_chunk: balance_dirty_pages XXXXXXX
>> > pages_per_work: balance_dirty_pages XXXXXXX
>> >
>> > Another question is, how can the display format be script friendly?
>> > The current form looks not easily parse-able at least for "cut"..
>>
>> I suppose you mean because of the variable number of tokens. Yeah,
>> this can be hard. Of course, I always just use "awk '{print $NF}'"
>> and it works for me :-) . But I'd be happy to change these to use a
>> consistent # of args.
>>
>> Thanks,
>> Curt
>>
>>
>> > Thanks,
>> > Fengguang
>> >
>