linux-mm.kvack.org archive mirror
* Re: [PATCH] readahead:add blk_run_backing_dev
       [not found] <6.0.0.20.2.20090518183752.0581fdc0@172.19.0.2>
@ 2009-05-20  1:07 ` KOSAKI Motohiro
  2009-05-20  1:43   ` Hisashi Hifumi
       [not found] ` <20090518175259.GL4140@kernel.dk>
  1 sibling, 1 reply; 35+ messages in thread
From: KOSAKI Motohiro @ 2009-05-20  1:07 UTC (permalink / raw)
  To: Hisashi Hifumi
  Cc: kosaki.motohiro, Andrew Morton, linux-kernel, linux-fsdevel,
	linux-mm, Wu Fengguang

(cc to Wu and linux-mm)

> Hi.
> 
> I wrote a patch that adds blk_run_backing_dev on page_cache_async_readahead
> so readahead I/O is unplugged to improve throughput.
> 
> Following is the test result with dd.
> 
> #dd if=testdir/testfile of=/dev/null bs=16384
> 
> -2.6.30-rc6
> 1048576+0 records in
> 1048576+0 records out
> 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
> 
> -2.6.30-rc6-patched
> 1048576+0 records in
> 1048576+0 records out
> 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
> 
> Sequential read performance on a big file was improved.
> Please merge my patch.

I guess the improvement depends on the readahead window size.
Have you measured a random access workload?

> 
> Thanks.
> 
> Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
> 
> diff -Nrup linux-2.6.30-rc6.org/mm/readahead.c linux-2.6.30-rc6.unplug/mm/readahead.c
> --- linux-2.6.30-rc6.org/mm/readahead.c	2009-05-18 10:46:15.000000000 +0900
> +++ linux-2.6.30-rc6.unplug/mm/readahead.c	2009-05-18 13:00:42.000000000 +0900
> @@ -490,5 +490,7 @@ page_cache_async_readahead(struct addres
>  
>  	/* do read-ahead */
>  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
> +
> +	blk_run_backing_dev(mapping->backing_dev_info, NULL);
>  }
>  EXPORT_SYMBOL_GPL(page_cache_async_readahead);
> 



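For context: blk_run_backing_dev() is, in kernels of this era, a thin
inline wrapper around the backing device's unplug callback. A minimal
sketch, paraphrased from the 2.6.30-era include/linux/blkdev.h
(simplified, not the verbatim kernel source):

	/* kick the request queue behind a backing_dev_info */
	static inline void blk_run_backing_dev(struct backing_dev_info *bdi,
					       struct page *page)
	{
		if (bdi && bdi->unplug_io_fn)
			bdi->unplug_io_fn(bdi, page);
	}

So the patch forces the queue to be unplugged as soon as the
asynchronous readahead has been submitted, instead of waiting for the
unplug timer or for a reader to block on a locked page.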

* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-20  1:07 ` [PATCH] readahead:add blk_run_backing_dev KOSAKI Motohiro
@ 2009-05-20  1:43   ` Hisashi Hifumi
  2009-05-20  2:52     ` Wu Fengguang
  0 siblings, 1 reply; 35+ messages in thread
From: Hisashi Hifumi @ 2009-05-20  1:43 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, linux-mm, Wu Fengguang


At 10:07 09/05/20, KOSAKI Motohiro wrote:
>(cc to Wu and linux-mm)
>
>> Hi.
>> 
>> I wrote a patch that adds blk_run_backing_dev on page_cache_async_readahead
>> so readahead I/O is unplugged to improve throughput.
>> 
>> Following is the test result with dd.
>> 
>> #dd if=testdir/testfile of=/dev/null bs=16384
>> 
>> -2.6.30-rc6
>> 1048576+0 records in
>> 1048576+0 records out
>> 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
>> 
>> -2.6.30-rc6-patched
>> 1048576+0 records in
>> 1048576+0 records out
>> 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
>> 
>> Sequential read performance on a big file was improved.
>> Please merge my patch.
>
>I guess the improvement depends on the readahead window size.
>Have you measured a random access workload?

I tried with iozone. But there was no difference.

>
>> 
>> Thanks.
>> 
>> Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
>> 
>> diff -Nrup linux-2.6.30-rc6.org/mm/readahead.c linux-2.6.30-rc6.unplug/mm/readahead.c
>> --- linux-2.6.30-rc6.org/mm/readahead.c	2009-05-18 10:46:15.000000000 +0900
>> +++ linux-2.6.30-rc6.unplug/mm/readahead.c	2009-05-18 13:00:42.000000000 +0900
>> @@ -490,5 +490,7 @@ page_cache_async_readahead(struct addres
>>  
>>  	/* do read-ahead */
>>  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
>> +
>> +	blk_run_backing_dev(mapping->backing_dev_info, NULL);
>>  }
>>  EXPORT_SYMBOL_GPL(page_cache_async_readahead);
>> 

* Re: [PATCH] readahead:add blk_run_backing_dev
       [not found] ` <20090518175259.GL4140@kernel.dk>
@ 2009-05-20  2:51   ` Wu Fengguang
  2009-05-21  6:01     ` Hisashi Hifumi
  0 siblings, 1 reply; 35+ messages in thread
From: Wu Fengguang @ 2009-05-20  2:51 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Hisashi Hifumi, Andrew Morton, linux-kernel, linux-fsdevel,
	KOSAKI Motohiro, linux-mm

On Mon, May 18, 2009 at 07:53:00PM +0200, Jens Axboe wrote:
> On Mon, May 18 2009, Hisashi Hifumi wrote:
> > Hi.
> > 
> > I wrote a patch that adds blk_run_backing_dev on page_cache_async_readahead
> > so readahead I/O is unplugged to improve throughput.
> > 
> > Following is the test result with dd.
> > 
> > #dd if=testdir/testfile of=/dev/null bs=16384
> > 
> > -2.6.30-rc6
> > 1048576+0 records in
> > 1048576+0 records out
> > 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
> > 
> > -2.6.30-rc6-patched
> > 1048576+0 records in
> > 1048576+0 records out
> > 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
> > 
> > Sequential read performance on a big file was improved.
> > Please merge my patch.
> > 
> > Thanks.
> > 
> > Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
> > 
> > diff -Nrup linux-2.6.30-rc6.org/mm/readahead.c linux-2.6.30-rc6.unplug/mm/readahead.c
> > --- linux-2.6.30-rc6.org/mm/readahead.c	2009-05-18 10:46:15.000000000 +0900
> > +++ linux-2.6.30-rc6.unplug/mm/readahead.c	2009-05-18 13:00:42.000000000 +0900
> > @@ -490,5 +490,7 @@ page_cache_async_readahead(struct addres
> >  
> >  	/* do read-ahead */
> >  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
> > +
> > +	blk_run_backing_dev(mapping->backing_dev_info, NULL);
> >  }
> >  EXPORT_SYMBOL_GPL(page_cache_async_readahead);
> 
> I'm surprised this makes much of a difference. It seems correct to me to
> NOT unplug the device, since it will get unplugged when someone ends up
> actually waiting for a page. And that will then kick off the remaining
> IO as well. For this dd case, you'll be hitting lock_page() for the
> readahead page really soon, definitely not long enough to warrant such a
> big difference in speed.

The possible timing change of this patch is (assuming readahead size=100):

T0   read(100), which triggers readahead(200, 100)
T1   read(101)
T2   read(102)
...
T100 read(200), find_get_page(200) => readahead(300, 100)
                lock_page(200) => implicit unplug

The readahead(200, 100) submitted at time T0 *might* be delayed to the
unplug time of T100.

But that is only a possibility. In normal cases, the read(200) would
be blocking and there will be a lock_page(200) that will immediately
unplug device for readahead(300, 100).

Thanks,
Fengguang

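For reference, the "implicit unplug" in the timeline above happens
through the mapping's sync_page hook when a reader blocks on a page
still under IO. A simplified sketch of the 2.6.30-era path, assuming a
block-backed mapping whose a_ops->sync_page is block_sync_page()
(paraphrased, not verbatim):

	/* mm/filemap.c: the wait action used by lock_page()/__lock_page() */
	static int sync_page(void *word)
	{
		struct page *page = container_of((unsigned long *)word,
						 struct page, flags);
		struct address_space *mapping = page_mapping(page);

		if (mapping && mapping->a_ops && mapping->a_ops->sync_page)
			mapping->a_ops->sync_page(page); /* block_sync_page() */
		io_schedule();
		return 0;
	}

	/* fs/buffer.c: for block-backed files this ends in an unplug */
	void block_sync_page(struct page *page)
	{
		struct address_space *mapping = page_mapping(page);

		if (mapping)
			blk_run_backing_dev(mapping->backing_dev_info, page);
	}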

* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-20  1:43   ` Hisashi Hifumi
@ 2009-05-20  2:52     ` Wu Fengguang
  0 siblings, 0 replies; 35+ messages in thread
From: Wu Fengguang @ 2009-05-20  2:52 UTC (permalink / raw)
  To: Hisashi Hifumi
  Cc: KOSAKI Motohiro, Andrew Morton, linux-kernel, linux-fsdevel,
	linux-mm, Jens Axboe

On Wed, May 20, 2009 at 09:43:18AM +0800, Hisashi Hifumi wrote:
> 
> At 10:07 09/05/20, KOSAKI Motohiro wrote:
> >(cc to Wu and linux-mm)
> >
> >> Hi.
> >> 
> >> I wrote a patch that adds blk_run_backing_dev on page_cache_async_readahead
> >> so readahead I/O is unplugged to improve throughput.
> >> 
> >> Following is the test result with dd.
> >> 
> >> #dd if=testdir/testfile of=/dev/null bs=16384
> >> 
> >> -2.6.30-rc6
> >> 1048576+0 records in
> >> 1048576+0 records out
> >> 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
> >> 
> >> -2.6.30-rc6-patched
> >> 1048576+0 records in
> >> 1048576+0 records out
> >> 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
> >> 
> >> Sequential read performance on a big file was improved.
> >> Please merge my patch.
> >
> >I guess the improvement depends on the readahead window size.
> >Have you measured a random access workload?
> 
> I tried with iozone. But there was no difference.

It does not impact random IO because the patch only modified the
*async* readahead path, and random IO obviously takes the *sync* path.

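To illustrate the distinction, here is a simplified fragment of the
2.6.30-era read path in mm/filemap.c (paraphrased, not verbatim): a
page-cache miss takes the synchronous readahead path, while only a hit
on a PG_readahead-marked page reaches the function this patch touches.

	page = find_get_page(mapping, index);
	if (!page) {
		/* cache miss -- the random-IO case goes here */
		page_cache_sync_readahead(mapping, ra, filp,
					  index, last_index - index);
		page = find_get_page(mapping, index);
	}
	if (page && PageReadahead(page)) {
		/* hit on the readahead marker -- the sequential case */
		page_cache_async_readahead(mapping, ra, filp, page,
					   index, last_index - index);
	}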

* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-20  2:51   ` Wu Fengguang
@ 2009-05-21  6:01     ` Hisashi Hifumi
  2009-05-22  1:05       ` Wu Fengguang
  0 siblings, 1 reply; 35+ messages in thread
From: Hisashi Hifumi @ 2009-05-21  6:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-fsdevel, KOSAKI Motohiro, linux-mm,
	Wu Fengguang, Jens Axboe


At 11:51 09/05/20, Wu Fengguang wrote:
>On Mon, May 18, 2009 at 07:53:00PM +0200, Jens Axboe wrote:
>> On Mon, May 18 2009, Hisashi Hifumi wrote:
>> > Hi.
>> > 
>> > I wrote a patch that adds blk_run_backing_dev on page_cache_async_readahead
>> > so readahead I/O is unplugged to improve throughput.
>> > 
>> > Following is the test result with dd.
>> > 
>> > #dd if=testdir/testfile of=/dev/null bs=16384
>> > 
>> > -2.6.30-rc6
>> > 1048576+0 records in
>> > 1048576+0 records out
>> > 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
>> > 
>> > -2.6.30-rc6-patched
>> > 1048576+0 records in
>> > 1048576+0 records out
>> > 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
>> > 
>> > Sequential read performance on a big file was improved.
>> > Please merge my patch.
>> > 
>> > Thanks.
>> > 
>> > Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
>> > 
>> > diff -Nrup linux-2.6.30-rc6.org/mm/readahead.c linux-2.6.30-rc6.unplug/mm/readahead.c
>> > --- linux-2.6.30-rc6.org/mm/readahead.c	2009-05-18 10:46:15.000000000 +0900
>> > +++ linux-2.6.30-rc6.unplug/mm/readahead.c	2009-05-18 13:00:42.000000000 +0900
>> > @@ -490,5 +490,7 @@ page_cache_async_readahead(struct addres
>> >  
>> >  	/* do read-ahead */
>> >  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
>> > +
>> > +	blk_run_backing_dev(mapping->backing_dev_info, NULL);
>> >  }
>> >  EXPORT_SYMBOL_GPL(page_cache_async_readahead);
>> 
>> I'm surprised this makes much of a difference. It seems correct to me to
>> NOT unplug the device, since it will get unplugged when someone ends up
>> actually waiting for a page. And that will then kick off the remaining
>> IO as well. For this dd case, you'll be hitting lock_page() for the
>> readahead page really soon, definitely not long enough to warrant such a
>> big difference in speed.
>
>The possible timing change of this patch is (assuming readahead size=100):
>
>T0   read(100), which triggers readahead(200, 100)
>T1   read(101)
>T2   read(102)
>...
>T100 read(200), find_get_page(200) => readahead(300, 100)
>                lock_page(200) => implicit unplug
>
>The readahead(200, 100) submitted at time T0 *might* be delayed to the
>unplug time of T100.
>
>But that is only a possibility. In normal cases, the read(200) would
>be blocking and there will be a lock_page(200) that will immediately
>unplug device for readahead(300, 100).


Hi Andrew.
The following patch improves sequential read performance and does not harm
other performance.
Please merge my patch.
Comments?
Thanks.

#dd if=testdir/testfile of=/dev/null bs=16384
-2.6.30-rc6
1048576+0 records in
1048576+0 records out
17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s

-2.6.30-rc6-patched
1048576+0 records in
1048576+0 records out
17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s

Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>

diff -Nrup linux-2.6.30-rc6.org/mm/readahead.c linux-2.6.30-rc6.unplug/mm/readahead.c
--- linux-2.6.30-rc6.org/mm/readahead.c	2009-05-18 10:46:15.000000000 +0900
+++ linux-2.6.30-rc6.unplug/mm/readahead.c	2009-05-18 13:00:42.000000000 +0900
@@ -490,5 +490,7 @@ page_cache_async_readahead(struct addres
 
 	/* do read-ahead */
 	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
+
+	blk_run_backing_dev(mapping->backing_dev_info, NULL);
 }
 EXPORT_SYMBOL_GPL(page_cache_async_readahead);




* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-21  6:01     ` Hisashi Hifumi
@ 2009-05-22  1:05       ` Wu Fengguang
  2009-05-22  1:44         ` Hisashi Hifumi
  0 siblings, 1 reply; 35+ messages in thread
From: Wu Fengguang @ 2009-05-22  1:05 UTC (permalink / raw)
  To: Hisashi Hifumi
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, KOSAKI Motohiro,
	linux-mm, Jens Axboe

On Thu, May 21, 2009 at 02:01:47PM +0800, Hisashi Hifumi wrote:
> 
> At 11:51 09/05/20, Wu Fengguang wrote:
> >On Mon, May 18, 2009 at 07:53:00PM +0200, Jens Axboe wrote:
> >> On Mon, May 18 2009, Hisashi Hifumi wrote:
> >> > Hi.
> >> > 
> >> > I wrote a patch that adds blk_run_backing_dev on page_cache_async_readahead
> >> > so readahead I/O is unplugged to improve throughput.
> >> > 
> >> > Following is the test result with dd.
> >> > 
> >> > #dd if=testdir/testfile of=/dev/null bs=16384
> >> > 
> >> > -2.6.30-rc6
> >> > 1048576+0 records in
> >> > 1048576+0 records out
> >> > 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
> >> > 
> >> > -2.6.30-rc6-patched
> >> > 1048576+0 records in
> >> > 1048576+0 records out
> >> > 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
> >> > 
> >> > Sequential read performance on a big file was improved.
> >> > Please merge my patch.
> >> > 
> >> > Thanks.
> >> > 
> >> > Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
> >> > 
> >> > diff -Nrup linux-2.6.30-rc6.org/mm/readahead.c linux-2.6.30-rc6.unplug/mm/readahead.c
> >> > --- linux-2.6.30-rc6.org/mm/readahead.c	2009-05-18 10:46:15.000000000 +0900
> >> > +++ linux-2.6.30-rc6.unplug/mm/readahead.c	2009-05-18 13:00:42.000000000 +0900
> >> > @@ -490,5 +490,7 @@ page_cache_async_readahead(struct addres
> >> >  
> >> >  	/* do read-ahead */
> >> >  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
> >> > +
> >> > +	blk_run_backing_dev(mapping->backing_dev_info, NULL);
> >> >  }
> >> >  EXPORT_SYMBOL_GPL(page_cache_async_readahead);
> >> 
> >> I'm surprised this makes much of a difference. It seems correct to me to
> >> NOT unplug the device, since it will get unplugged when someone ends up
> >> actually waiting for a page. And that will then kick off the remaining
> >> IO as well. For this dd case, you'll be hitting lock_page() for the
> >> readahead page really soon, definitely not long enough to warrant such a
> >> big difference in speed.
> >
> >The possible timing change of this patch is (assuming readahead size=100):
> >
> >T0   read(100), which triggers readahead(200, 100)
> >T1   read(101)
> >T2   read(102)
> >...
> >T100 read(200), find_get_page(200) => readahead(300, 100)
> >                lock_page(200) => implicit unplug
> >
> >The readahead(200, 100) submitted at time T0 *might* be delayed to the
> >unplug time of T100.
> >
> >But that is only a possibility. In normal cases, the read(200) would
> >be blocking and there will be a lock_page(200) that will immediately
> >unplug device for readahead(300, 100).
> 
> 
> Hi Andrew.
> The following patch improves sequential read performance and does not harm
> other performance.
> Please merge my patch.
> Comments?
> Thanks.
> 
> #dd if=testdir/testfile of=/dev/null bs=16384
> -2.6.30-rc6
> 1048576+0 records in
> 1048576+0 records out
> 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
> 
> -2.6.30-rc6-patched
> 1048576+0 records in
> 1048576+0 records out
> 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
> 
> Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
> 
> diff -Nrup linux-2.6.30-rc6.org/mm/readahead.c linux-2.6.30-rc6.unplug/mm/readahead.c
> --- linux-2.6.30-rc6.org/mm/readahead.c	2009-05-18 10:46:15.000000000 +0900
> +++ linux-2.6.30-rc6.unplug/mm/readahead.c	2009-05-18 13:00:42.000000000 +0900
> @@ -490,5 +490,7 @@ page_cache_async_readahead(struct addres
>  
>  	/* do read-ahead */
>  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
> +
> +	blk_run_backing_dev(mapping->backing_dev_info, NULL);
>  }
>  EXPORT_SYMBOL_GPL(page_cache_async_readahead);
> 
> 

Hi Hisashi,

I wonder if the following updated patch can achieve the same
performance.  Can you try testing this out?

Thanks,
Fengguang
---

diff --git a/mm/readahead.c b/mm/readahead.c
index 133b6d5..fd3df66 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -490,5 +490,8 @@ page_cache_async_readahead(struct address_space *mapping,
 
 	/* do read-ahead */
 	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
+
+	if (PageUptodate(page))
+		blk_run_backing_dev(mapping->backing_dev_info, NULL);		
 }
 EXPORT_SYMBOL_GPL(page_cache_async_readahead);

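For context: page here is the PG_readahead-marked page that the caller
just found in the cache, so PageUptodate(page) asks whether that page's
IO has already completed. A simplified sketch of the surrounding 2.6.30
function with the proposed hunk folded in (paraphrased, not verbatim):

	void page_cache_async_readahead(struct address_space *mapping,
					struct file_ra_state *ra,
					struct file *filp, struct page *page,
					pgoff_t offset, unsigned long req_size)
	{
		if (!ra->ra_pages)
			return;
		/* the same page flag bit serves PG_readahead and PG_reclaim */
		if (PageWriteback(page))
			return;
		ClearPageReadahead(page);
		/* defer asynchronous read-ahead on IO congestion */
		if (bdi_read_congested(mapping->backing_dev_info))
			return;
		/* do read-ahead */
		ondemand_readahead(mapping, ra, filp, true, offset, req_size);
		/* proposed: unplug only if the marked page's IO has finished */
		if (PageUptodate(page))
			blk_run_backing_dev(mapping->backing_dev_info, NULL);
	}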

* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-22  1:05       ` Wu Fengguang
@ 2009-05-22  1:44         ` Hisashi Hifumi
  2009-05-22  2:33           ` Wu Fengguang
  0 siblings, 1 reply; 35+ messages in thread
From: Hisashi Hifumi @ 2009-05-22  1:44 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, KOSAKI Motohiro,
	linux-mm, Jens Axboe


At 10:05 09/05/22, Wu Fengguang wrote:
>On Thu, May 21, 2009 at 02:01:47PM +0800, Hisashi Hifumi wrote:
>> 
>> At 11:51 09/05/20, Wu Fengguang wrote:
>> >On Mon, May 18, 2009 at 07:53:00PM +0200, Jens Axboe wrote:
>> >> On Mon, May 18 2009, Hisashi Hifumi wrote:
>> >> > Hi.
>> >> > 
>> >> > I wrote a patch that adds blk_run_backing_dev on page_cache_async_readahead
>> >> > so readahead I/O is unplugged to improve throughput.
>> >> > 
>> >> > Following is the test result with dd.
>> >> > 
>> >> > #dd if=testdir/testfile of=/dev/null bs=16384
>> >> > 
>> >> > -2.6.30-rc6
>> >> > 1048576+0 records in
>> >> > 1048576+0 records out
>> >> > 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
>> >> > 
>> >> > -2.6.30-rc6-patched
>> >> > 1048576+0 records in
>> >> > 1048576+0 records out
>> >> > 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
>> >> > 
>> >> > Sequential read performance on a big file was improved.
>> >> > Please merge my patch.
>> >> > 
>> >> > Thanks.
>> >> > 
>> >> > Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
>> >> > 
>> >> > diff -Nrup linux-2.6.30-rc6.org/mm/readahead.c linux-2.6.30-rc6.unplug/mm/readahead.c
>> >> > --- linux-2.6.30-rc6.org/mm/readahead.c	2009-05-18 10:46:15.000000000 +0900
>> >> > +++ linux-2.6.30-rc6.unplug/mm/readahead.c	2009-05-18 13:00:42.000000000 +0900
>> >> > @@ -490,5 +490,7 @@ page_cache_async_readahead(struct addres
>> >> >  
>> >> >  	/* do read-ahead */
>> >> >  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
>> >> > +
>> >> > +	blk_run_backing_dev(mapping->backing_dev_info, NULL);
>> >> >  }
>> >> >  EXPORT_SYMBOL_GPL(page_cache_async_readahead);
>> >> 
>> >> I'm surprised this makes much of a difference. It seems correct to me to
>> >> NOT unplug the device, since it will get unplugged when someone ends up
>> >> actually waiting for a page. And that will then kick off the remaining
>> >> IO as well. For this dd case, you'll be hitting lock_page() for the
>> >> readahead page really soon, definitely not long enough to warrant such a
>> >> big difference in speed.
>> >
>> >The possible timing change of this patch is (assuming readahead size=100):
>> >
>> >T0   read(100), which triggers readahead(200, 100)
>> >T1   read(101)
>> >T2   read(102)
>> >...
>> >T100 read(200), find_get_page(200) => readahead(300, 100)
>> >                lock_page(200) => implicit unplug
>> >
>> >The readahead(200, 100) submitted at time T0 *might* be delayed to the
>> >unplug time of T100.
>> >
>> >But that is only a possibility. In normal cases, the read(200) would
>> >be blocking and there will be a lock_page(200) that will immediately
>> >unplug device for readahead(300, 100).
>> 
>> 
>> Hi Andrew.
>> The following patch improves sequential read performance and does not harm
>> other performance.
>> Please merge my patch.
>> Comments?
>> Thanks.
>> 
>> #dd if=testdir/testfile of=/dev/null bs=16384
>> -2.6.30-rc6
>> 1048576+0 records in
>> 1048576+0 records out
>> 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
>> 
>> -2.6.30-rc6-patched
>> 1048576+0 records in
>> 1048576+0 records out
>> 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
>> 
>> Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
>> 
>> diff -Nrup linux-2.6.30-rc6.org/mm/readahead.c linux-2.6.30-rc6.unplug/mm/readahead.c
>> --- linux-2.6.30-rc6.org/mm/readahead.c	2009-05-18 10:46:15.000000000 +0900
>> +++ linux-2.6.30-rc6.unplug/mm/readahead.c	2009-05-18 13:00:42.000000000 +0900
>> @@ -490,5 +490,7 @@ page_cache_async_readahead(struct addres
>>  
>>  	/* do read-ahead */
>>  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
>> +
>> +	blk_run_backing_dev(mapping->backing_dev_info, NULL);
>>  }
>>  EXPORT_SYMBOL_GPL(page_cache_async_readahead);
>> 
>> 
>
>Hi Hisashi,
>
>I wonder if the following updated patch can achieve the same
>performance.  Can you try testing this out?
>
>Thanks,
>Fengguang
>---
>
>diff --git a/mm/readahead.c b/mm/readahead.c
>index 133b6d5..fd3df66 100644
>--- a/mm/readahead.c
>+++ b/mm/readahead.c
>@@ -490,5 +490,8 @@ page_cache_async_readahead(struct address_space *mapping,
> 
> 	/* do read-ahead */
> 	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
>+
>+	if (PageUptodate(page))
>+		blk_run_backing_dev(mapping->backing_dev_info, NULL);		
> }
> EXPORT_SYMBOL_GPL(page_cache_async_readahead);

Hi.
I tested the above patch, and I got the same performance number.
I wonder why the if (PageUptodate(page)) check is there...



* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-22  1:44         ` Hisashi Hifumi
@ 2009-05-22  2:33           ` Wu Fengguang
  2009-05-26 23:42             ` Andrew Morton
  0 siblings, 1 reply; 35+ messages in thread
From: Wu Fengguang @ 2009-05-22  2:33 UTC (permalink / raw)
  To: Hisashi Hifumi
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, KOSAKI Motohiro,
	linux-mm, Jens Axboe

On Fri, May 22, 2009 at 09:44:59AM +0800, Hisashi Hifumi wrote:
> 
> At 10:05 09/05/22, Wu Fengguang wrote:
> >On Thu, May 21, 2009 at 02:01:47PM +0800, Hisashi Hifumi wrote:
> >> 
> >> At 11:51 09/05/20, Wu Fengguang wrote:
> >> >On Mon, May 18, 2009 at 07:53:00PM +0200, Jens Axboe wrote:
> >> >> On Mon, May 18 2009, Hisashi Hifumi wrote:
> >> >> > Hi.
> >> >> > 
> >> >> > I wrote a patch that adds blk_run_backing_dev on page_cache_async_readahead
> >> >> > so readahead I/O is unplugged to improve throughput.
> >> >> > 
> >> >> > Following is the test result with dd.
> >> >> > 
> >> >> > #dd if=testdir/testfile of=/dev/null bs=16384
> >> >> > 
> >> >> > -2.6.30-rc6
> >> >> > 1048576+0 records in
> >> >> > 1048576+0 records out
> >> >> > 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
> >> >> > 
> >> >> > -2.6.30-rc6-patched
> >> >> > 1048576+0 records in
> >> >> > 1048576+0 records out
> >> >> > 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
> >> >> > 
> >> >> > Sequential read performance on a big file was improved.
> >> >> > Please merge my patch.
> >> >> > 
> >> >> > Thanks.
> >> >> > 
> >> >> > Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
> >> >> > 
> >> >> > diff -Nrup linux-2.6.30-rc6.org/mm/readahead.c linux-2.6.30-rc6.unplug/mm/readahead.c
> >> >> > --- linux-2.6.30-rc6.org/mm/readahead.c	2009-05-18 10:46:15.000000000 +0900
> >> >> > +++ linux-2.6.30-rc6.unplug/mm/readahead.c	2009-05-18 13:00:42.000000000 +0900
> >> >> > @@ -490,5 +490,7 @@ page_cache_async_readahead(struct addres
> >> >> >  
> >> >> >  	/* do read-ahead */
> >> >> >  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
> >> >> > +
> >> >> > +	blk_run_backing_dev(mapping->backing_dev_info, NULL);
> >> >> >  }
> >> >> >  EXPORT_SYMBOL_GPL(page_cache_async_readahead);
> >> >> 
> >> >> I'm surprised this makes much of a difference. It seems correct to me to
> >> >> NOT unplug the device, since it will get unplugged when someone ends up
> >> >> actually waiting for a page. And that will then kick off the remaining
> >> >> IO as well. For this dd case, you'll be hitting lock_page() for the
> >> >> readahead page really soon, definitely not long enough to warrant such a
> >> >> big difference in speed.
> >> >
> >> >The possible timing change of this patch is (assuming readahead size=100):
> >> >
> >> >T0   read(100), which triggers readahead(200, 100)
> >> >T1   read(101)
> >> >T2   read(102)
> >> >...
> >> >T100 read(200), find_get_page(200) => readahead(300, 100)
> >> >                lock_page(200) => implicit unplug
> >> >
> >> >The readahead(200, 100) submitted at time T0 *might* be delayed to the
> >> >unplug time of T100.
> >> >
> >> >But that is only a possibility. In normal cases, the read(200) would
> >> >be blocking and there will be a lock_page(200) that will immediately
> >> >unplug device for readahead(300, 100).
> >> 
> >> 
> >> Hi Andrew.
> >> The following patch improves sequential read performance and does not harm
> >> other performance.
> >> Please merge my patch.
> >> Comments?
> >> Thanks.
> >> 
> >> #dd if=testdir/testfile of=/dev/null bs=16384
> >> -2.6.30-rc6
> >> 1048576+0 records in
> >> 1048576+0 records out
> >> 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
> >> 
> >> -2.6.30-rc6-patched
> >> 1048576+0 records in
> >> 1048576+0 records out
> >> 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
> >> 
> >> Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
> >> 
> >> diff -Nrup linux-2.6.30-rc6.org/mm/readahead.c linux-2.6.30-rc6.unplug/mm/readahead.c
> >> --- linux-2.6.30-rc6.org/mm/readahead.c	2009-05-18 10:46:15.000000000 +0900
> >> +++ linux-2.6.30-rc6.unplug/mm/readahead.c	2009-05-18 13:00:42.000000000 +0900
> >> @@ -490,5 +490,7 @@ page_cache_async_readahead(struct addres
> >>  
> >>  	/* do read-ahead */
> >>  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
> >> +
> >> +	blk_run_backing_dev(mapping->backing_dev_info, NULL);
> >>  }
> >>  EXPORT_SYMBOL_GPL(page_cache_async_readahead);
> >> 
> >> 
> >
> >Hi Hisashi,
> >
> >I wonder if the following updated patch can achieve the same
> >performance.  Can you try testing this out?
> >
> >Thanks,
> >Fengguang
> >---
> >
> >diff --git a/mm/readahead.c b/mm/readahead.c
> >index 133b6d5..fd3df66 100644
> >--- a/mm/readahead.c
> >+++ b/mm/readahead.c
> >@@ -490,5 +490,8 @@ page_cache_async_readahead(struct address_space *mapping,
> > 
> > 	/* do read-ahead */
> > 	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
> >+
> >+	if (PageUptodate(page))
> >+		blk_run_backing_dev(mapping->backing_dev_info, NULL);		
> > }
> > EXPORT_SYMBOL_GPL(page_cache_async_readahead);
> 
> Hi.
> I tested the above patch, and I got the same performance number.
> I wonder why the if (PageUptodate(page)) check is there...

Thanks!  This is an interesting micro timing behavior that
demands some research work.  The above check is to confirm whether it's
the PageUptodate() case that makes the difference. So why does that case
happen so frequently as to impact performance? Will it also
happen in NFS?

The problem is that the readahead IO pipeline is not running smoothly,
which is undesirable and not well understood for now.

Thanks,
Fengguang


* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-22  2:33           ` Wu Fengguang
@ 2009-05-26 23:42             ` Andrew Morton
  2009-05-27  0:25               ` Hisashi Hifumi
  2009-05-27  2:07               ` Wu Fengguang
  0 siblings, 2 replies; 35+ messages in thread
From: Andrew Morton @ 2009-05-26 23:42 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: hifumi.hisashi, linux-kernel, linux-fsdevel, kosaki.motohiro,
	linux-mm, jens.axboe

On Fri, 22 May 2009 10:33:23 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:

> > I tested the above patch, and I got the same performance number.
> > I wonder why the if (PageUptodate(page)) check is there...
> 
> Thanks!  This is an interesting micro timing behavior that
> demands some research work.  The above check is to confirm whether it's
> the PageUptodate() case that makes the difference. So why does that case
> happen so frequently as to impact performance? Will it also
> happen in NFS?
> 
> The problem is that the readahead IO pipeline is not running smoothly,
> which is undesirable and not well understood for now.

The patch causes a remarkably large performance increase.  A 9%
reduction in time for a linear read?  I'd be surprised if the workload
even consumed 9% of a CPU, so where on earth has the kernel gone to?

Have you been able to reproduce this in your testing?

Thanks.


* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-26 23:42             ` Andrew Morton
@ 2009-05-27  0:25               ` Hisashi Hifumi
  2009-05-27  2:09                 ` Wu Fengguang
  2009-05-27  2:07               ` Wu Fengguang
  1 sibling, 1 reply; 35+ messages in thread
From: Hisashi Hifumi @ 2009-05-27  0:25 UTC (permalink / raw)
  To: Andrew Morton, Wu Fengguang
  Cc: linux-kernel, linux-fsdevel, kosaki.motohiro, linux-mm, jens.axboe


At 08:42 09/05/27, Andrew Morton wrote:
>On Fri, 22 May 2009 10:33:23 +0800
>Wu Fengguang <fengguang.wu@intel.com> wrote:
>
>> > I tested the above patch, and I got the same performance number.
>> > I wonder why the if (PageUptodate(page)) check is there...
>> 
>> Thanks!  This is an interesting micro timing behavior that
>> demands some research work.  The above check is to confirm whether it's
>> the PageUptodate() case that makes the difference. So why does that case
>> happen so frequently as to impact performance? Will it also
>> happen in NFS?
>> 
>> The problem is that the readahead IO pipeline is not running smoothly,
>> which is undesirable and not well understood for now.
>
>The patch causes a remarkably large performance increase.  A 9%
>reduction in time for a linear read? I'd be surprised if the workload

Hi Andrew.
Yes, I tested this with dd.

>even consumed 9% of a CPU, so where on earth has the kernel gone to?
>
>Have you been able to reproduce this in your testing?

Yes, this test on my environment is reproducible.

Thanks.


* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-26 23:42             ` Andrew Morton
  2009-05-27  0:25               ` Hisashi Hifumi
@ 2009-05-27  2:07               ` Wu Fengguang
  1 sibling, 0 replies; 35+ messages in thread
From: Wu Fengguang @ 2009-05-27  2:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: hifumi.hisashi, linux-kernel, linux-fsdevel, kosaki.motohiro,
	linux-mm, jens.axboe

On Wed, May 27, 2009 at 07:42:52AM +0800, Andrew Morton wrote:
> On Fri, 22 May 2009 10:33:23 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
> 
> > > I tested the above patch, and I got the same performance number.
> > > I wonder why the if (PageUptodate(page)) check is there...
> > 
> > Thanks!  This is an interesting micro timing behavior that
> > demands some research work.  The above check is to confirm whether it's
> > the PageUptodate() case that makes the difference. So why does that case
> > happen so frequently as to impact performance? Will it also
> > happen in NFS?
> > 
> > The problem is that the readahead IO pipeline is not running smoothly,
> > which is undesirable and not well understood for now.
> 
> The patch causes a remarkably large performance increase.  A 9%
> reduction in time for a linear read?  I'd be surprised if the workload
> even consumed 9% of a CPU, so where on earth has the kernel gone to?
> 
> Have you been able to reproduce this in your testing?

No, I cannot reproduce it on a raw partition or on ext4fs.

The commands I run:

        # echo 1 > /proc/sys/vm/drop_caches
        # dd if=/dev/sda1 of=/dev/null bs=16384 count=100000 # sda1 is not mounted

The results are almost identical:

before:
        1638400000 bytes (1.6 GB) copied, 31.3073 s, 52.3 MB/s
        1638400000 bytes (1.6 GB) copied, 31.3393 s, 52.3 MB/s
after:
        1638400000 bytes (1.6 GB) copied, 31.3216 s, 52.3 MB/s
        1638400000 bytes (1.6 GB) copied, 31.3762 s, 52.2 MB/s

My kernel is
        Linux hp 2.6.30-rc6 #281 SMP Wed May 27 09:32:37 CST 2009 x86_64 GNU/Linux

The readahead size is the default one:
        # blockdev --getra  /dev/sda    
        256

I tried another ext4 directory with many ~100MB files (vmlinux-2.6.*) in it:

        # time tar cf - /hp/boot | cat > /dev/null

before:
        tar cf - /hp/boot  0.22s user 5.63s system 21% cpu 26.750 total
        tar cf - /hp/boot  0.26s user 5.53s system 21% cpu 26.620 total
after:
        tar cf - /hp/boot  0.18s user 5.57s system 21% cpu 26.719 total
        tar cf - /hp/boot  0.22s user 5.32s system 21% cpu 26.321 total

Another round with 1MB readahead size:

before:
        tar cf - /hp/boot  0.24s user 4.70s system 19% cpu 25.689 total
        tar cf - /hp/boot  0.22s user 4.99s system 20% cpu 25.634 total
after:
        tar cf - /hp/boot  0.18s user 4.89s system 19% cpu 25.599 total
        tar cf - /hp/boot  0.18s user 4.97s system 20% cpu 25.645 total

Thanks,
Fengguang

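A side note on the methodology above: writing 1 to
/proc/sys/vm/drop_caches frees only the page cache; 2 drops dentry and
inode caches, and 3 drops both, e.g.:

        # echo 3 > /proc/sys/vm/drop_caches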

* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-27  0:25               ` Hisashi Hifumi
@ 2009-05-27  2:09                 ` Wu Fengguang
  2009-05-27  2:21                   ` Hisashi Hifumi
  0 siblings, 1 reply; 35+ messages in thread
From: Wu Fengguang @ 2009-05-27  2:09 UTC (permalink / raw)
  To: Hisashi Hifumi
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, kosaki.motohiro,
	linux-mm, jens.axboe

On Wed, May 27, 2009 at 08:25:04AM +0800, Hisashi Hifumi wrote:
> 
> At 08:42 09/05/27, Andrew Morton wrote:
> >On Fri, 22 May 2009 10:33:23 +0800
> >Wu Fengguang <fengguang.wu@intel.com> wrote:
> >
> >> > I tested the above patch, and I got the same performance number.
> >> > I wonder why the if (PageUptodate(page)) check is there...
> >> 
> >> Thanks!  This is an interesting micro timing behavior that
> >> demands some research work.  The above check is to confirm whether it's
> >> the PageUptodate() case that makes the difference. So why does that case
> >> happen so frequently as to impact performance? Will it also
> >> happen in NFS?
> >> 
> >> The problem is that the readahead IO pipeline is not running smoothly,
> >> which is undesirable and not well understood for now.
> >
> >The patch causes a remarkably large performance increase.  A 9%
> >reduction in time for a linear read? I'd be surprised if the workload
> 
> Hi Andrew.
> Yes, I tested this with dd.
> 
> >even consumed 9% of a CPU, so where on earth has the kernel gone to?
> >
> >Have you been able to reproduce this in your testing?
> 
> Yes, this test on my environment is reproducible.

Hisashi, does your environment have some special configurations?

Thanks,
Fengguang


* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-27  2:09                 ` Wu Fengguang
@ 2009-05-27  2:21                   ` Hisashi Hifumi
  2009-05-27  2:35                     ` KOSAKI Motohiro
                                       ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Hisashi Hifumi @ 2009-05-27  2:21 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, kosaki.motohiro,
	linux-mm, jens.axboe


At 11:09 09/05/27, Wu Fengguang wrote:
>On Wed, May 27, 2009 at 08:25:04AM +0800, Hisashi Hifumi wrote:
>> 
>> At 08:42 09/05/27, Andrew Morton wrote:
>> >On Fri, 22 May 2009 10:33:23 +0800
>> >Wu Fengguang <fengguang.wu@intel.com> wrote:
>> >
>> >> > I tested the above patch, and I got the same performance number.
>> >> > I wonder why the if (PageUptodate(page)) check is there...
>> >> 
>> >> Thanks!  This is an interesting micro timing behavior that
>> >> demands some research work.  The above check is to confirm whether it's
>> >> the PageUptodate() case that makes the difference. So why does that case
>> >> happen so frequently as to impact performance? Will it also
>> >> happen in NFS?
>> >> 
>> >> The problem is that the readahead IO pipeline is not running smoothly,
>> >> which is undesirable and not well understood for now.
>> >
>> >The patch causes a remarkably large performance increase.  A 9%
>> >reduction in time for a linear read? I'd be surprised if the workload
>> 
>> Hi Andrew.
>> Yes, I tested this with dd.
>> 
>> >even consumed 9% of a CPU, so where on earth has the kernel gone to?
>> >
>> >Have you been able to reproduce this in your testing?
>> 
>> Yes, this test on my environment is reproducible.
>
>Hisashi, does your environment have some special configurations?

Hi.
My testing environment is as follows:
Hardware: HP DL580 
CPU:Xeon 3.2GHz *4 HT enabled
Memory:8GB
Storage: Dothill SANNet2 FC (7Disks RAID-0 Array)

I ran dd against this disk array and got the improved performance number.

I noticed that when the disk is just a single HDD, the performance
improvement is very small.

Thanks.





* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-27  2:21                   ` Hisashi Hifumi
@ 2009-05-27  2:35                     ` KOSAKI Motohiro
  2009-05-27  2:36                     ` Andrew Morton
  2009-05-27  2:36                     ` Wu Fengguang
  2 siblings, 0 replies; 35+ messages in thread
From: KOSAKI Motohiro @ 2009-05-27  2:35 UTC (permalink / raw)
  To: Hisashi Hifumi
  Cc: kosaki.motohiro, Wu Fengguang, Andrew Morton, linux-kernel,
	linux-fsdevel, linux-mm, jens.axboe

> >> >even consumed 9% of a CPU, so where on earth has the kernel gone to?
> >> >
> >> >Have you been able to reproduce this in your testing?
> >> 
> >> Yes, this test on my environment is reproducible.
> >
> >Hisashi, does your environment have some special configurations?
> 
> Hi.
> My testing environment is as follows:
> Hardware: HP DL580 
> CPU:Xeon 3.2GHz *4 HT enabled
> Memory:8GB
> Storage: Dothill SANNet2 FC (7Disks RAID-0 Array)
> 
> I did dd to this disk-array and got improved performance number.
> 
> I noticed that when a disk is just one HDD, performance improvement
> is very small.

That's odd.

Why does your patch depend on the transfer rate difference?






* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-27  2:21                   ` Hisashi Hifumi
  2009-05-27  2:35                     ` KOSAKI Motohiro
@ 2009-05-27  2:36                     ` Andrew Morton
  2009-05-27  2:38                       ` Hisashi Hifumi
  2009-05-27  3:55                       ` Wu Fengguang
  2009-05-27  2:36                     ` Wu Fengguang
  2 siblings, 2 replies; 35+ messages in thread
From: Andrew Morton @ 2009-05-27  2:36 UTC (permalink / raw)
  To: Hisashi Hifumi
  Cc: Wu Fengguang, linux-kernel, linux-fsdevel, kosaki.motohiro,
	linux-mm, jens.axboe

On Wed, 27 May 2009 11:21:53 +0900 Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> wrote:

> 
> At 11:09 09/05/27, Wu Fengguang wrote:
> >On Wed, May 27, 2009 at 08:25:04AM +0800, Hisashi Hifumi wrote:
> >> 
> >> At 08:42 09/05/27, Andrew Morton wrote:
> >> >On Fri, 22 May 2009 10:33:23 +0800
> >> >Wu Fengguang <fengguang.wu@intel.com> wrote:
> >> >
> >> >> > I tested the above patch, and I got the same performance number.
> >> >> > I wonder why the if (PageUptodate(page)) check is there...
> >> >> 
> >> >> Thanks!  This is an interesting micro timing behavior that
> >> >> demands some research work.  The above check is to confirm whether it's
> >> >> the PageUptodate() case that makes the difference. So why does that case
> >> >> happen so frequently as to impact performance? Will it also
> >> >> happen in NFS?
> >> >> 
> >> >> The problem is that the readahead IO pipeline is not running smoothly,
> >> >> which is undesirable and not well understood for now.
> >> >
> >> >The patch causes a remarkably large performance increase.  A 9%
> >> >reduction in time for a linear read? I'd be surprised if the workload
> >> 
> >> Hi Andrew.
> >> Yes, I tested this with dd.
> >> 
> >> >even consumed 9% of a CPU, so where on earth has the kernel gone to?
> >> >
> >> >Have you been able to reproduce this in your testing?
> >> 
> >> Yes, this test on my environment is reproducible.
> >
> >Hisashi, does your environment have some special configurations?
> 
> Hi.
> My testing environment is as follows:
> Hardware: HP DL580 
> CPU:Xeon 3.2GHz *4 HT enabled
> Memory:8GB
> Storage: Dothill SANNet2 FC (7Disks RAID-0 Array)
> 
> I ran dd against this disk array and got the improved performance number.
> 
> I noticed that when the disk is just a single HDD, the performance
> improvement is very small.
> 

Ah.  So it's likely to be some strange interaction with the RAID setup.

I assume that you're using the SANNet 2's "hardware raid"?  Or is the
array set up as jbod and you're using kernel raid0?


* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-27  2:21                   ` Hisashi Hifumi
  2009-05-27  2:35                     ` KOSAKI Motohiro
  2009-05-27  2:36                     ` Andrew Morton
@ 2009-05-27  2:36                     ` Wu Fengguang
  2009-05-27  2:47                       ` Hisashi Hifumi
  2 siblings, 1 reply; 35+ messages in thread
From: Wu Fengguang @ 2009-05-27  2:36 UTC (permalink / raw)
  To: Hisashi Hifumi
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, kosaki.motohiro,
	linux-mm, jens.axboe

On Wed, May 27, 2009 at 10:21:53AM +0800, Hisashi Hifumi wrote:
>
> At 11:09 09/05/27, Wu Fengguang wrote:
> >On Wed, May 27, 2009 at 08:25:04AM +0800, Hisashi Hifumi wrote:
> >>
> >> At 08:42 09/05/27, Andrew Morton wrote:
> >> >On Fri, 22 May 2009 10:33:23 +0800
> >> >Wu Fengguang <fengguang.wu@intel.com> wrote:
> >> >
> >> >> > I tested the above patch, and I got the same performance number.
> >> >> > I wonder why the if (PageUptodate(page)) check is there...
> >> >>
> >> >> Thanks!  This is an interesting micro timing behavior that
> >> >> demands some research work.  The above check is to confirm whether it's
> >> >> the PageUptodate() case that makes the difference. So why does that case
> >> >> happen so frequently as to impact performance? Will it also
> >> >> happen in NFS?
> >> >>
> >> >> The problem is that the readahead IO pipeline is not running smoothly,
> >> >> which is undesirable and not well understood for now.
> >> >
> >> >The patch causes a remarkably large performance increase.  A 9%
> >> >reduction in time for a linear read? I'd be surprised if the workload
> >>
> >> Hi Andrew.
> >> Yes, I tested this with dd.
> >>
> >> >even consumed 9% of a CPU, so where on earth has the kernel gone to?
> >> >
> >> >Have you been able to reproduce this in your testing?
> >>
> >> Yes, this test on my environment is reproducible.
> >
> >Hisashi, does your environment have some special configurations?
>
> Hi.
> My testing environment is as follows:
> Hardware: HP DL580
> CPU:Xeon 3.2GHz *4 HT enabled
> Memory:8GB
> Storage: Dothill SANNet2 FC (7Disks RAID-0 Array)

This is a big hardware RAID. What's the readahead size?

The numbers look too small for a 7 disk RAID:

        > #dd if=testdir/testfile of=/dev/null bs=16384
        >
        > -2.6.30-rc6
        > 1048576+0 records in
        > 1048576+0 records out
        > 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
        >
        > -2.6.30-rc6-patched
        > 1048576+0 records in
        > 1048576+0 records out
        > 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s

I'd suggest you configure the array properly before coming back to
measure the impact of this patch.

> I ran dd against this disk array and got the improved performance number.
>
> I noticed that when the disk is just a single HDD, the performance
> improvement is very small.

OK. You should have mentioned the single-disk vs. RAID performance earlier.

Thanks,
Fengguang

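As a concrete example of the tuning being suggested here, the
per-device readahead window can be inspected and raised with blockdev;
the unit is 512-byte sectors, and the device name below is only a
placeholder:

        # blockdev --getra /dev/sdX
        256
        # blockdev --setra 4096 /dev/sdX    # ~2MB readahead window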

* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-27  2:36                     ` Andrew Morton
@ 2009-05-27  2:38                       ` Hisashi Hifumi
  2009-05-27  3:55                       ` Wu Fengguang
  1 sibling, 0 replies; 35+ messages in thread
From: Hisashi Hifumi @ 2009-05-27  2:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Wu Fengguang, linux-kernel, linux-fsdevel, kosaki.motohiro,
	linux-mm, jens.axboe


At 11:36 09/05/27, Andrew Morton wrote:
>On Wed, 27 May 2009 11:21:53 +0900 Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> wrote:
>
>> 
>> At 11:09 09/05/27, Wu Fengguang wrote:
>> >On Wed, May 27, 2009 at 08:25:04AM +0800, Hisashi Hifumi wrote:
>> >> 
>> >> At 08:42 09/05/27, Andrew Morton wrote:
>> >> >On Fri, 22 May 2009 10:33:23 +0800
>> >> >Wu Fengguang <fengguang.wu@intel.com> wrote:
>> >> >
>> >> >> > I tested the above patch, and I got the same performance number.
>> >> >> > I wonder why the if (PageUptodate(page)) check is there...
>> >> >> 
>> >> >> Thanks!  This is an interesting micro timing behavior that
>> >> >> demands some research work.  The above check is to confirm whether it's
>> >> >> the PageUptodate() case that makes the difference. So why does that case
>> >> >> happen so frequently as to impact performance? Will it also
>> >> >> happen in NFS?
>> >> >> 
>> >> >> The problem is that the readahead IO pipeline is not running smoothly,
>> >> >> which is undesirable and not well understood for now.
>> >> >
>> >> >The patch causes a remarkably large performance increase.  A 9%
>> >> >reduction in time for a linear read? I'd be surprised if the workload
>> >> 
>> >> Hi Andrew.
>> >> Yes, I tested this with dd.
>> >> 
>> >> >even consumed 9% of a CPU, so where on earth has the kernel gone to?
>> >> >
>> >> >Have you been able to reproduce this in your testing?
>> >> 
>> >> Yes, this test on my environment is reproducible.
>> >
>> >Hisashi, does your environment have some special configurations?
>> 
>> Hi.
>> My testing environment is as follows:
>> Hardware: HP DL580 
>> CPU:Xeon 3.2GHz *4 HT enabled
>> Memory:8GB
>> Storage: Dothill SANNet2 FC (7Disks RAID-0 Array)
>> 
>> I ran dd against this disk array and got the improved performance number.
>> 
>> I noticed that when the disk is just a single HDD, the performance
>> improvement is very small.
>> 
>
>Ah.  So it's likely to be some strange interaction with the RAID setup.
>
>I assume that you're using the SANNet 2's "hardware raid"?  Or is the
>array set up as jbod and you're using kernel raid0?

I used SANNet 2's "hardware raid".


* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-27  2:36                     ` Wu Fengguang
@ 2009-05-27  2:47                       ` Hisashi Hifumi
  2009-05-27  2:57                         ` Wu Fengguang
  0 siblings, 1 reply; 35+ messages in thread
From: Hisashi Hifumi @ 2009-05-27  2:47 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, kosaki.motohiro,
	linux-mm, jens.axboe


At 11:36 09/05/27, Wu Fengguang wrote:
>On Wed, May 27, 2009 at 10:21:53AM +0800, Hisashi Hifumi wrote:
>>
>> At 11:09 09/05/27, Wu Fengguang wrote:
>> >On Wed, May 27, 2009 at 08:25:04AM +0800, Hisashi Hifumi wrote:
>> >>
>> >> At 08:42 09/05/27, Andrew Morton wrote:
>> >> >On Fri, 22 May 2009 10:33:23 +0800
>> >> >Wu Fengguang <fengguang.wu@intel.com> wrote:
>> >> >
> >> >> >> > I tested the above patch, and I got the same performance number.
> >> >> >> > I wonder why the if (PageUptodate(page)) check is there...
> >> >> >>
> >> >> >> Thanks!  This is an interesting micro timing behavior that
> >> >> >> demands some research work.  The above check is to confirm whether it's
> >> >> >> the PageUptodate() case that makes the difference. So why does that case
> >> >> >> happen so frequently as to impact performance? Will it also
> >> >> >> happen in NFS?
> >> >> >>
> >> >> >> The problem is that the readahead IO pipeline is not running smoothly,
> >> >> >> which is undesirable and not well understood for now.
>> >> >
>> >> >The patch causes a remarkably large performance increase.  A 9%
>> >> >reduction in time for a linear read? I'd be surprised if the workload
>> >>
>> >> Hi Andrew.
>> >> Yes, I tested this with dd.
>> >>
>> >> >even consumed 9% of a CPU, so where on earth has the kernel gone to?
>> >> >
>> >> >Have you been able to reproduce this in your testing?
>> >>
>> >> Yes, this test on my environment is reproducible.
>> >
>> >Hisashi, does your environment have some special configurations?
>>
>> Hi.
>> My testing environment is as follows:
>> Hardware: HP DL580
>> CPU:Xeon 3.2GHz *4 HT enabled
>> Memory:8GB
>> Storage: Dothill SANNet2 FC (7Disks RAID-0 Array)
>
>This is a big hardware RAID. What's the readahead size?
>
>The numbers look too small for a 7 disk RAID:
>
>        > #dd if=testdir/testfile of=/dev/null bs=16384
>        >
>        > -2.6.30-rc6
>        > 1048576+0 records in
>        > 1048576+0 records out
>        > 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
>        >
>        > -2.6.30-rc6-patched
>        > 1048576+0 records in
>        > 1048576+0 records out
>        > 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
>
>I'd suggest you configure the array properly before coming back to
>measure the impact of this patch.


I created a 16GB file on this disk array, mounted it at testdir, and ran dd on this directory.

Thanks.


* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-27  2:47                       ` Hisashi Hifumi
@ 2009-05-27  2:57                         ` Wu Fengguang
  2009-05-27  3:06                           ` Hisashi Hifumi
  0 siblings, 1 reply; 35+ messages in thread
From: Wu Fengguang @ 2009-05-27  2:57 UTC (permalink / raw)
  To: Hisashi Hifumi
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, kosaki.motohiro,
	linux-mm, jens.axboe

On Wed, May 27, 2009 at 10:47:47AM +0800, Hisashi Hifumi wrote:
> 
> At 11:36 09/05/27, Wu Fengguang wrote:
> >On Wed, May 27, 2009 at 10:21:53AM +0800, Hisashi Hifumi wrote:
> >>
> >> At 11:09 09/05/27, Wu Fengguang wrote:
> >> >On Wed, May 27, 2009 at 08:25:04AM +0800, Hisashi Hifumi wrote:
> >> >>
> >> >> At 08:42 09/05/27, Andrew Morton wrote:
> >> >> >On Fri, 22 May 2009 10:33:23 +0800
> >> >> >Wu Fengguang <fengguang.wu@intel.com> wrote:
> >> >> >
> >> >> >> > I tested the above patch, and I got the same performance number.
> >> >> >> > I wonder why the if (PageUptodate(page)) check is there...
> >> >> >>
> >> >> >> Thanks!  This is an interesting micro timing behavior that
> >> >> >> demands some research work.  The above check is to confirm whether it's
> >> >> >> the PageUptodate() case that makes the difference. So why does that case
> >> >> >> happen so frequently as to impact performance? Will it also
> >> >> >> happen in NFS?
> >> >> >>
> >> >> >> The problem is that the readahead IO pipeline is not running smoothly,
> >> >> >> which is undesirable and not well understood for now.
> >> >> >
> >> >> >The patch causes a remarkably large performance increase.  A 9%
> >> >> >reduction in time for a linear read? I'd be surprised if the workload
> >> >>
> >> >> Hi Andrew.
> >> >> Yes, I tested this with dd.
> >> >>
> >> >> >even consumed 9% of a CPU, so where on earth has the kernel gone to?
> >> >> >
> >> >> >Have you been able to reproduce this in your testing?
> >> >>
> >> >> Yes, this test on my environment is reproducible.
> >> >
> >> >Hisashi, does your environment have some special configurations?
> >>
> >> Hi.
> >> My testing environment is as follows:
> >> Hardware: HP DL580
> >> CPU:Xeon 3.2GHz *4 HT enabled
> >> Memory:8GB
> >> Storage: Dothill SANNet2 FC (7Disks RAID-0 Array)
> >
> >This is a big hardware RAID. What's the readahead size?
> >
> >The numbers look too small for a 7 disk RAID:
> >
> >        > #dd if=testdir/testfile of=/dev/null bs=16384
> >        >
> >        > -2.6.30-rc6
> >        > 1048576+0 records in
> >        > 1048576+0 records out
> >        > 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
> >        >
> >        > -2.6.30-rc6-patched
> >        > 1048576+0 records in
> >        > 1048576+0 records out
> >        > 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
> >
> >I'd suggest you configure the array properly before coming back to
> >measure the impact of this patch.
> 
> 
> I created a 16GB file on this disk array, mounted it at testdir, and ran dd on this directory.

I mean, you should get >300MB/s throughput with 7 disks, and you
should seek ways to achieve that before testing out this patch :-)

Thanks,
Fengguang


* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-27  2:57                         ` Wu Fengguang
@ 2009-05-27  3:06                           ` Hisashi Hifumi
  2009-05-27  3:26                             ` KOSAKI Motohiro
  2009-06-01  2:37                             ` Wu Fengguang
  0 siblings, 2 replies; 35+ messages in thread
From: Hisashi Hifumi @ 2009-05-27  3:06 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, kosaki.motohiro,
	linux-mm, jens.axboe


At 11:57 09/05/27, Wu Fengguang wrote:
>On Wed, May 27, 2009 at 10:47:47AM +0800, Hisashi Hifumi wrote:
>> 
>> At 11:36 09/05/27, Wu Fengguang wrote:
>> >On Wed, May 27, 2009 at 10:21:53AM +0800, Hisashi Hifumi wrote:
>> >>
>> >> At 11:09 09/05/27, Wu Fengguang wrote:
>> >> >On Wed, May 27, 2009 at 08:25:04AM +0800, Hisashi Hifumi wrote:
>> >> >>
>> >> >> At 08:42 09/05/27, Andrew Morton wrote:
>> >> >> >On Fri, 22 May 2009 10:33:23 +0800
>> >> >> >Wu Fengguang <fengguang.wu@intel.com> wrote:
>> >> >> >
>> >> >> >> > I tested the above patch, and I got the same performance number.
>> >> >> >> > I wonder why the if (PageUptodate(page)) check is there...
>> >> >> >>
>> >> >> >> Thanks!  This is an interesting micro timing behavior that
>> >> >> >> demands some research work.  The above check is to confirm if it's
>> >> >> >> the PageUptodate() case that makes the difference. So why does that
>> >> >> >> case happen so frequently as to impact the performance? Will it also
>> >> >> >> happen in NFS?
>> >> >> >>
>> >> >> >> The problem is that the readahead IO pipeline is not running smoothly,
>> >> >> >> which is undesirable and not well understood for now.
>> >> >> >
>> >> >> >The patch causes a remarkably large performance increase.  A 9%
>> >> >> >reduction in time for a linear read? I'd be surprised if the workload
>> >> >>
>> >> >> Hi Andrew.
>> >> >> Yes, I tested this with dd.
>> >> >>
>> >> >> >even consumed 9% of a CPU, so where on earth has the kernel gone to?
>> >> >> >
>> >> >> >Have you been able to reproduce this in your testing?
>> >> >>
>> >> >> Yes, this test is reproducible on my environment.
>> >> >
>> >> >Hisashi, does your environment have some special configurations?
>> >>
>> >> Hi.
>> >> My testing environment is as follows:
>> >> Hardware: HP DL580
>> >> CPU:Xeon 3.2GHz *4 HT enabled
>> >> Memory:8GB
>> >> Storage: Dothill SANNet2 FC (7Disks RAID-0 Array)
>> >
>> >This is a big hardware RAID. What's the readahead size?
>> >
>> >The numbers look too small for a 7 disk RAID:
>> >
>> >        > #dd if=testdir/testfile of=/dev/null bs=16384
>> >        >
>> >        > -2.6.30-rc6
>> >        > 1048576+0 records in
>> >        > 1048576+0 records out
>> >        > 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
>> >        >
>> >        > -2.6.30-rc6-patched
>> >        > 1048576+0 records in
>> >        > 1048576+0 records out
>> >        > 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
>> >
>> >I'd suggest you configure the array properly before coming back to
>> >measuring the impact of this patch.
>> 
>> 
>> I created a 16GB file on this disk array, mounted it at testdir, and
>> ran dd against that directory.
>
>I mean, you should get >300MB/s throughput with 7 disks, and you
>should seek ways to achieve that before testing out this patch :-)

Throughput numbers for storage arrays vary from one product to another.
On my hardware environment I think this number is valid and
my patch is effective.


* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-27  3:06                           ` Hisashi Hifumi
@ 2009-05-27  3:26                             ` KOSAKI Motohiro
  2009-06-01  2:37                             ` Wu Fengguang
  1 sibling, 0 replies; 35+ messages in thread
From: KOSAKI Motohiro @ 2009-05-27  3:26 UTC (permalink / raw)
  To: Hisashi Hifumi
  Cc: kosaki.motohiro, Wu Fengguang, Andrew Morton, linux-kernel,
	linux-fsdevel, linux-mm, jens.axboe

> >> >The numbers look too small for a 7 disk RAID:
> >> >
> >> >        > #dd if=testdir/testfile of=/dev/null bs=16384
> >> >        >
> >> >        > -2.6.30-rc6
> >> >        > 1048576+0 records in
> >> >        > 1048576+0 records out
> >> >        > 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
> >> >        >
> >> >        > -2.6.30-rc6-patched
> >> >        > 1048576+0 records in
> >> >        > 1048576+0 records out
> >> >        > 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
> >> >
> >> >I'd suggest you configure the array properly before coming back to
> >> >measuring the impact of this patch.
> >> 
> >> 
> >> I created a 16GB file on this disk array, mounted it at testdir, and
> >> ran dd against that directory.
> >
> >I mean, you should get >300MB/s throughput with 7 disks, and you
> >should seek ways to achieve that before testing out this patch :-)
> 
> Throughput numbers for storage arrays vary from one product to another.
> On my hardware environment I think this number is valid and
> my patch is effective.

Hifumi-san, if you really want this merged, you should reproduce this
issue on typical hardware, I think.




* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-27  2:36                     ` Andrew Morton
  2009-05-27  2:38                       ` Hisashi Hifumi
@ 2009-05-27  3:55                       ` Wu Fengguang
  2009-05-27  4:06                         ` KOSAKI Motohiro
  1 sibling, 1 reply; 35+ messages in thread
From: Wu Fengguang @ 2009-05-27  3:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Hisashi Hifumi, linux-kernel, linux-fsdevel, kosaki.motohiro,
	linux-mm, jens.axboe

On Wed, May 27, 2009 at 10:36:01AM +0800, Andrew Morton wrote:
> On Wed, 27 May 2009 11:21:53 +0900 Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> wrote:
> 
> > 
> > At 11:09 09/05/27, Wu Fengguang wrote:
> > >On Wed, May 27, 2009 at 08:25:04AM +0800, Hisashi Hifumi wrote:
> > >> 
> > >> At 08:42 09/05/27, Andrew Morton wrote:
> > >> >On Fri, 22 May 2009 10:33:23 +0800
> > >> >Wu Fengguang <fengguang.wu@intel.com> wrote:
> > >> >
> > >> >> >> > I tested the above patch, and I got the same performance number.
> > >> >> >> > I wonder why the if (PageUptodate(page)) check is there...
> > >> >> 
> > >> >> Thanks!  This is an interesting micro timing behavior that
> > >> >> demands some research work.  The above check is to confirm if it's
> > >> >> the PageUptodate() case that makes the difference. So why does that
> > >> >> case happen so frequently as to impact the performance? Will it also
> > >> >> happen in NFS?
> > >> >> 
> > >> >> The problem is that the readahead IO pipeline is not running smoothly,
> > >> >> which is undesirable and not well understood for now.
> > >> >
> > >> >The patch causes a remarkably large performance increase.  A 9%
> > >> >reduction in time for a linear read? I'd be surprised if the workload
> > >> 
> > >> Hi Andrew.
> > >> Yes, I tested this with dd.
> > >> 
> > >> >even consumed 9% of a CPU, so where on earth has the kernel gone to?
> > >> >
> > >> >Have you been able to reproduce this in your testing?
> > >> 
> > >> Yes, this test is reproducible on my environment.
> > >
> > >Hisashi, does your environment have some special configurations?
> > 
> > Hi.
> > My testing environment is as follows:
> > Hardware: HP DL580 
> > CPU:Xeon 3.2GHz *4 HT enabled
> > Memory:8GB
> > Storage: Dothill SANNet2 FC (7Disks RAID-0 Array)
> > 
> > I ran dd against this disk array and got improved performance numbers.
> > 
> > I noticed that when the disk is just a single HDD, the performance
> > improvement is very small.
> > 
> 
> Ah.  So it's likely to be some strange interaction with the RAID setup.

The normal case is, if page N becomes uptodate at time T(N), then
T(N) <= T(N+1) holds. But for RAID, the data arrival time depends on the
runtime status of individual disks, which breaks that formula. So
in do_generic_file_read(), just after submitting the async readahead IO
request, the current page may well be uptodate, so the page won't be locked,
and the block device won't be implicitly unplugged:

               if (PageReadahead(page))
                        page_cache_async_readahead()
                if (!PageUptodate(page))
                                goto page_not_up_to_date;
                //...
page_not_up_to_date:
                lock_page_killable(page);
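
(For reference, the implicit unplug works roughly like this in 2.6.30-era
sources: lock_page_killable() on a !uptodate page ends up waiting in
sync_page(), which invokes the mapping's ->sync_page() op; for
block-backed mappings that is block_sync_page(), simplified below.)

	/* fs/buffer.c (simplified sketch): the wait path of lock_page()
	 * ends up here, which kicks the request queue backing the
	 * page's mapping. */
	void block_sync_page(struct page *page)
	{
		struct address_space *mapping;

		smp_mb();
		mapping = page_mapping(page);
		if (mapping)
			blk_run_backing_dev(mapping->backing_dev_info, page);
	}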


Therefore explicit unplugging can help, so

        Acked-by: Wu Fengguang <fengguang.wu@intel.com> 

The only question is, shall we avoid the double unplug by doing this?

---
 mm/readahead.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

--- linux.orig/mm/readahead.c
+++ linux/mm/readahead.c
@@ -490,5 +490,15 @@ page_cache_async_readahead(struct addres
 
 	/* do read-ahead */
 	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
+
+	/*
+	 * Normally the current page is !uptodate and lock_page() will be
+	 * immediately called to implicitly unplug the device. However this
+	 * is not always true for RAID configurations, where data does not
+	 * arrive strictly in submission order. In this case we need to
+	 * explicitly kick off the IO.
+	 */
+	if (PageUptodate(page))
+		blk_run_backing_dev(mapping->backing_dev_info, NULL);
 }
 EXPORT_SYMBOL_GPL(page_cache_async_readahead);


* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-27  3:55                       ` Wu Fengguang
@ 2009-05-27  4:06                         ` KOSAKI Motohiro
  2009-05-27  4:36                           ` Wu Fengguang
  0 siblings, 1 reply; 35+ messages in thread
From: KOSAKI Motohiro @ 2009-05-27  4:06 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: kosaki.motohiro, Andrew Morton, Hisashi Hifumi, linux-kernel,
	linux-fsdevel, linux-mm, jens.axboe

> > Ah.  So it's likely to be some strange interaction with the RAID setup.
> 
> > The normal case is, if page N becomes uptodate at time T(N), then
> > T(N) <= T(N+1) holds. But for RAID, the data arrival time depends on the
> > runtime status of individual disks, which breaks that formula. So
> in do_generic_file_read(), just after submitting the async readahead IO
> request, the current page may well be uptodate, so the page won't be locked,
> and the block device won't be implicitly unplugged:

Hifumi-san, Can you get blktrace data and confirm Wu's assumption?


> 
>                if (PageReadahead(page))
>                         page_cache_async_readahead()
>                 if (!PageUptodate(page))
>                                 goto page_not_up_to_date;
>                 //...
> page_not_up_to_date:
>                 lock_page_killable(page);
> 
> 
> Therefore explicit unplugging can help, so
> 
>         Acked-by: Wu Fengguang <fengguang.wu@intel.com> 
> 
> The only question is, shall we avoid the double unplug by doing this?
> 
> ---
>  mm/readahead.c |   10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> --- linux.orig/mm/readahead.c
> +++ linux/mm/readahead.c
> @@ -490,5 +490,15 @@ page_cache_async_readahead(struct addres
>  
>  	/* do read-ahead */
>  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
> +
> +	/*
> +	 * Normally the current page is !uptodate and lock_page() will be
> +	 * immediately called to implicitly unplug the device. However this
> +	 * is not always true for RAID configurations, where data does not
> +	 * arrive strictly in submission order. In this case we need to
> +	 * explicitly kick off the IO.
> +	 */
> +	if (PageUptodate(page))
> +		blk_run_backing_dev(mapping->backing_dev_info, NULL);
>  }
>  EXPORT_SYMBOL_GPL(page_cache_async_readahead);




* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-27  4:06                         ` KOSAKI Motohiro
@ 2009-05-27  4:36                           ` Wu Fengguang
  2009-05-27  6:20                             ` Hisashi Hifumi
  2009-05-28  1:20                             ` Hisashi Hifumi
  0 siblings, 2 replies; 35+ messages in thread
From: Wu Fengguang @ 2009-05-27  4:36 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Andrew Morton, Hisashi Hifumi, linux-kernel, linux-fsdevel,
	linux-mm, jens.axboe

On Wed, May 27, 2009 at 12:06:12PM +0800, KOSAKI Motohiro wrote:
> > > Ah.  So it's likely to be some strange interaction with the RAID setup.
> > 
> > The normal case is, if page N becomes uptodate at time T(N), then
> > T(N) <= T(N+1) holds. But for RAID, the data arrival time depends on the
> > runtime status of individual disks, which breaks that formula. So
> > in do_generic_file_read(), just after submitting the async readahead IO
> > request, the current page may well be uptodate, so the page won't be locked,
> > and the block device won't be implicitly unplugged:
> 
> Hifumi-san, Can you get blktrace data and confirm Wu's assumption?

To make the reasoning more obvious:

Assume we just submitted a readahead IO request for pages N ~ N+M; then
        T(N) <= T(N+1)
        T(N) <= T(N+2)
        T(N) <= T(N+3)
        ...
        T(N) <= T(N+M)   (M = readahead size)
So if the reader is going to block on any page in the above chunk,
it is going to first block on page N.

With RAID (and NFS to some degree), there is no strict ordering,
so the reader is more likely to block on some random pages.

In the first case the effective async_size = M; in the second case the
effective async_size <= M. The larger the async_size, the deeper the
readahead pipeline, and hence the more low-level IO latency is hidden
from the application.
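
(A concrete example, under the assumptions above: with M = 32 and
in-order completion, the reader sleeps at most briefly on page N while
pages N+1 ~ N+31 keep arriving behind it; with random completion order
it may block on a page whose IO the array happened to service last, so
nearly the whole window's latency is exposed instead of being overlapped
with consuming the earlier pages.)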

Thanks,
Fengguang

> 
> > 
> >                if (PageReadahead(page))
> >                         page_cache_async_readahead()
> >                 if (!PageUptodate(page))
> >                                 goto page_not_up_to_date;
> >                 //...
> > page_not_up_to_date:
> >                 lock_page_killable(page);
> > 
> > 
> > Therefore explicit unplugging can help, so
> > 
> >         Acked-by: Wu Fengguang <fengguang.wu@intel.com> 
> > 
> > The only question is, shall we avoid the double unplug by doing this?
> > 
> > ---
> >  mm/readahead.c |   10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > --- linux.orig/mm/readahead.c
> > +++ linux/mm/readahead.c
> > @@ -490,5 +490,15 @@ page_cache_async_readahead(struct addres
> >  
> >  	/* do read-ahead */
> >  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
> > +
> > +	/*
> > +	 * Normally the current page is !uptodate and lock_page() will be
> > +	 * immediately called to implicitly unplug the device. However this
> > +	 * is not always true for RAID configurations, where data does not
> > +	 * arrive strictly in submission order. In this case we need to
> > +	 * explicitly kick off the IO.
> > +	 */
> > +	if (PageUptodate(page))
> > +		blk_run_backing_dev(mapping->backing_dev_info, NULL);
> >  }
> >  EXPORT_SYMBOL_GPL(page_cache_async_readahead);
> 
> 


* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-27  4:36                           ` Wu Fengguang
@ 2009-05-27  6:20                             ` Hisashi Hifumi
  2009-05-28  1:20                             ` Hisashi Hifumi
  1 sibling, 0 replies; 35+ messages in thread
From: Hisashi Hifumi @ 2009-05-27  6:20 UTC (permalink / raw)
  To: Wu Fengguang, KOSAKI Motohiro
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, linux-mm, jens.axboe


At 13:36 09/05/27, Wu Fengguang wrote:
>On Wed, May 27, 2009 at 12:06:12PM +0800, KOSAKI Motohiro wrote:
>> > > Ah.  So it's likely to be some strange interaction with the RAID setup.
>> > 
>> > The normal case is, if page N becomes uptodate at time T(N), then
>> > T(N) <= T(N+1) holds. But for RAID, the data arrival time depends on the
>> > runtime status of individual disks, which breaks that formula. So
>> > in do_generic_file_read(), just after submitting the async readahead IO
>> > request, the current page may well be uptodate, so the page won't be locked,
>> > and the block device won't be implicitly unplugged:
>> 
>> Hifumi-san, Can you get blktrace data and confirm Wu's assumption?
>
>To make the reasoning more obvious:
>
>Assume we just submitted a readahead IO request for pages N ~ N+M; then
>        T(N) <= T(N+1)
>        T(N) <= T(N+2)
>        T(N) <= T(N+3)
>        ...
>        T(N) <= T(N+M)   (M = readahead size)
>So if the reader is going to block on any page in the above chunk,
>it is going to first block on page N.
>
>With RAID (and NFS to some degree), there is no strict ordering,
>so the reader is more likely to block on some random pages.
>
>In the first case the effective async_size = M; in the second case the
>effective async_size <= M. The larger the async_size, the deeper the
>readahead pipeline, and hence the more low-level IO latency is hidden
>from the application.

I understand your explanation, especially the RAID-specific behavior.

>
>Thanks,
>Fengguang
>
>> 
>> > 
>> >                if (PageReadahead(page))
>> >                         page_cache_async_readahead()
>> >                 if (!PageUptodate(page))
>> >                                 goto page_not_up_to_date;
>> >                 //...
>> > page_not_up_to_date:
>> >                 lock_page_killable(page);
>> > 
>> > 
>> > Therefore explicit unplugging can help, so
>> > 
>> >         Acked-by: Wu Fengguang <fengguang.wu@intel.com> 
>> > 
>> > The only question is, shall we avoid the double unplug by doing this?
>> > 
>> > ---
>> >  mm/readahead.c |   10 ++++++++++
>> >  1 file changed, 10 insertions(+)
>> > 
>> > --- linux.orig/mm/readahead.c
>> > +++ linux/mm/readahead.c
>> > @@ -490,5 +490,15 @@ page_cache_async_readahead(struct addres
>> >  
>> >  	/* do read-ahead */
>> >  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
>> > +
>> > +	/*
>> > +	 * Normally the current page is !uptodate and lock_page() will be
>> > +	 * immediately called to implicitly unplug the device. However this
>> > +	 * is not always true for RAID configurations, where data does not
>> > +	 * arrive strictly in submission order. In this case we need to
>> > +	 * explicitly kick off the IO.
>> > +	 */
>> > +	if (PageUptodate(page))
>> > +		blk_run_backing_dev(mapping->backing_dev_info, NULL);
>> >  }
>> >  EXPORT_SYMBOL_GPL(page_cache_async_readahead);

I am in favor of this, to avoid the double unplug.
Thanks.


* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-27  4:36                           ` Wu Fengguang
  2009-05-27  6:20                             ` Hisashi Hifumi
@ 2009-05-28  1:20                             ` Hisashi Hifumi
  2009-05-28  2:23                               ` KOSAKI Motohiro
  1 sibling, 1 reply; 35+ messages in thread
From: Hisashi Hifumi @ 2009-05-28  1:20 UTC (permalink / raw)
  To: Andrew Morton, Wu Fengguang
  Cc: linux-kernel, linux-fsdevel, linux-mm, jens.axboe, KOSAKI Motohiro



>To make the reasoning more obvious:
>
>Assume we just submitted a readahead IO request for pages N ~ N+M; then
>        T(N) <= T(N+1)
>        T(N) <= T(N+2)
>        T(N) <= T(N+3)
>        ...
>        T(N) <= T(N+M)   (M = readahead size)
>So if the reader is going to block on any page in the above chunk,
>it is going to first block on page N.
>
>With RAID (and NFS to some degree), there is no strict ordering,
>so the reader is more likely to block on some random pages.
>
>In the first case the effective async_size = M; in the second case the
>effective async_size <= M. The larger the async_size, the deeper the
>readahead pipeline, and hence the more low-level IO latency is hidden
>from the application.
>
>Thanks,
>Fengguang
>
>> 
>> > 
>> >                if (PageReadahead(page))
>> >                         page_cache_async_readahead()
>> >                 if (!PageUptodate(page))
>> >                                 goto page_not_up_to_date;
>> >                 //...
>> > page_not_up_to_date:
>> >                 lock_page_killable(page);
>> > 
>> > 
>> > Therefore explicit unplugging can help, so
>> > 
>> >         Acked-by: Wu Fengguang <fengguang.wu@intel.com> 
>> > 
>> > The only question is, shall we avoid the double unplug by doing this?
>> > 


Hi Andrew.
Please merge the following patch.
Thanks.

---

I added blk_run_backing_dev() to page_cache_async_readahead()
so readahead I/O is unplugged, to improve throughput
especially on RAID environments.

Following is the test result with dd.

#dd if=testdir/testfile of=/dev/null bs=16384

-2.6.30-rc6
1048576+0 records in
1048576+0 records out
17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s

-2.6.30-rc6-patched
1048576+0 records in
1048576+0 records out
17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s

My testing environment is as follows:
Hardware: HP DL580 
CPU:Xeon 3.2GHz *4 HT enabled
Memory:8GB
Storage: Dothill SANNet2 FC (7Disks RAID-0 Array)

The normal case is, if page N becomes uptodate at time T(N), then
T(N) <= T(N+1) holds. With RAID (and NFS to some degree), there
is no strict ordering: the data arrival time depends on the
runtime status of individual disks, which breaks that formula. So
in do_generic_file_read(), just after submitting the async readahead IO
request, the current page may well be uptodate, so the page won't be locked,
and the block device won't be implicitly unplugged:

               if (PageReadahead(page))
                        page_cache_async_readahead()
                if (!PageUptodate(page))
                                goto page_not_up_to_date;
                //...
page_not_up_to_date:
                lock_page_killable(page);

Therefore explicit unplugging can help.

Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
Acked-by: Wu Fengguang <fengguang.wu@intel.com> 


 mm/readahead.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

--- linux.orig/mm/readahead.c
+++ linux/mm/readahead.c
@@ -490,5 +490,15 @@ page_cache_async_readahead(struct addres
 
 	/* do read-ahead */
 	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
+
+	/*
+	 * Normally the current page is !uptodate and lock_page() will be
+	 * immediately called to implicitly unplug the device. However this
+	 * is not always true for RAID configurations, where data does not
+	 * arrive strictly in submission order. In this case we need to
+	 * explicitly kick off the IO.
+	 */
+	if (PageUptodate(page))
+		blk_run_backing_dev(mapping->backing_dev_info, NULL);
 }
 EXPORT_SYMBOL_GPL(page_cache_async_readahead); 
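
(For reference, blk_run_backing_dev() itself is a thin helper; in
2.6.30-era include/linux/backing-dev.h it reads, roughly:

	static inline void blk_run_backing_dev(struct backing_dev_info *bdi,
					       struct page *page)
	{
		/* kick the queue's unplug callback, if one is registered */
		if (bdi && bdi->unplug_io_fn)
			bdi->unplug_io_fn(bdi, page);
	}

so the added call simply invokes whatever unplug callback the block
layer registered for the queue behind the mapping.)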


* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-28  1:20                             ` Hisashi Hifumi
@ 2009-05-28  2:23                               ` KOSAKI Motohiro
  2009-06-01  1:39                                 ` Hisashi Hifumi
  0 siblings, 1 reply; 35+ messages in thread
From: KOSAKI Motohiro @ 2009-05-28  2:23 UTC (permalink / raw)
  To: Hisashi Hifumi
  Cc: kosaki.motohiro, Andrew Morton, Wu Fengguang, linux-kernel,
	linux-fsdevel, linux-mm, jens.axboe

> Hi Andrew.
> Please merge the following patch.
> Thanks.
> 
> ---
> 
> I added blk_run_backing_dev() to page_cache_async_readahead()
> so readahead I/O is unplugged, to improve throughput
> especially on RAID environments.
> 
> Following is the test result with dd.
> 
> #dd if=testdir/testfile of=/dev/null bs=16384
> 
> -2.6.30-rc6
> 1048576+0 records in
> 1048576+0 records out
> 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
> 
> -2.6.30-rc6-patched
> 1048576+0 records in
> 1048576+0 records out
> 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
> 
> My testing environment is as follows:
> Hardware: HP DL580 
> CPU:Xeon 3.2GHz *4 HT enabled
> Memory:8GB
> Storage: Dothill SANNet2 FC (7Disks RAID-0 Array)
> 
> The normal case is, if page N becomes uptodate at time T(N), then
> T(N) <= T(N+1) holds. With RAID (and NFS to some degree), there
> is no strict ordering: the data arrival time depends on the
> runtime status of individual disks, which breaks that formula. So
> in do_generic_file_read(), just after submitting the async readahead IO
> request, the current page may well be uptodate, so the page won't be locked,
> and the block device won't be implicitly unplugged:

Please attach blktrace analysis ;)


> 
>                if (PageReadahead(page))
>                         page_cache_async_readahead()
>                 if (!PageUptodate(page))
>                                 goto page_not_up_to_date;
>                 //...
> page_not_up_to_date:
>                 lock_page_killable(page);
> 
> Therefore explicit unplugging can help.
> 
> Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
> Acked-by: Wu Fengguang <fengguang.wu@intel.com> 
> 
> 
>  mm/readahead.c |   10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> --- linux.orig/mm/readahead.c
> +++ linux/mm/readahead.c
> @@ -490,5 +490,15 @@ page_cache_async_readahead(struct addres
>  
>  	/* do read-ahead */
>  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
> +
> +	/*
> +	 * Normally the current page is !uptodate and lock_page() will be
> +	 * immediately called to implicitly unplug the device. However this
> +	 * is not always true for RAID configurations, where data does not
> +	 * arrive strictly in submission order. In this case we need to
> +	 * explicitly kick off the IO.
> +	 */
> +	if (PageUptodate(page))
> +		blk_run_backing_dev(mapping->backing_dev_info, NULL);
>  }
>  EXPORT_SYMBOL_GPL(page_cache_async_readahead); 
> 




* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-28  2:23                               ` KOSAKI Motohiro
@ 2009-06-01  1:39                                 ` Hisashi Hifumi
  2009-06-01  2:23                                   ` KOSAKI Motohiro
  0 siblings, 1 reply; 35+ messages in thread
From: Hisashi Hifumi @ 2009-06-01  1:39 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Andrew Morton, Wu Fengguang, linux-kernel, linux-fsdevel, linux-mm


At 11:23 09/05/28, KOSAKI Motohiro wrote:
>> Hi Andrew.
>> Please merge the following patch.
>> Thanks.
>> 
>> ---
>> 
>> I added blk_run_backing_dev() to page_cache_async_readahead()
>> so readahead I/O is unplugged, to improve throughput
>> especially on RAID environments.
>> 
>> Following is the test result with dd.
>> 
>> #dd if=testdir/testfile of=/dev/null bs=16384
>> 
>> -2.6.30-rc6
>> 1048576+0 records in
>> 1048576+0 records out
>> 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
>> 
>> -2.6.30-rc6-patched
>> 1048576+0 records in
>> 1048576+0 records out
>> 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
>> 
>> My testing environment is as follows:
>> Hardware: HP DL580 
>> CPU:Xeon 3.2GHz *4 HT enabled
>> Memory:8GB
>> Storage: Dothill SANNet2 FC (7Disks RAID-0 Array)
>> 
>> The normal case is, if page N becomes uptodate at time T(N), then
>> T(N) <= T(N+1) holds. With RAID (and NFS to some degree), there
>> is no strict ordering: the data arrival time depends on the
>> runtime status of individual disks, which breaks that formula. So
>> in do_generic_file_read(), just after submitting the async readahead IO
>> request, the current page may well be uptodate, so the page won't be locked,
>> and the block device won't be implicitly unplugged:
>
>Please attach blktrace analysis ;)

Hi, Motohiro.

I've got blktrace output both with and without the patch,
but I could not pin down the reason for the throughput improvement
from this result.

I do not notice any difference except around the unplug behavior of dd.
Comments?
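
(A quick legend for the blkparse output below: A = remap, Q = queued,
G = get request, P = plug, I = inserted into the queue, M = back merge,
U = unplug, D = issued to the driver, C = completed; "8717567 + 512"
means start sector plus size in 512-byte sectors.)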

-2.6.30-rc6
  8,0    3   177784    50.001437357     0  C   R 8717567 + 512 [0]
  8,0    3   177785    50.001635405  4148  A   R 8718079 + 256 <- (8,1) 8718016
  8,0    3   177786    50.001635675  4148  Q   R 8718079 + 256 [dd]
  8,0    3   177787    50.001637517  4148  G   R 8718079 + 256 [dd]
  8,0    3   177788    50.001638954  4148  P   N [dd]
  8,0    3   177789    50.001639290  4148  I   R 8718079 + 256 [dd]
  8,0    3   177790    50.001765339  4148  A   R 8718335 + 256 <- (8,1) 8718272
  8,0    3   177791    50.001765699  4148  Q   R 8718335 + 256 [dd]
  8,0    3   177792    50.001766971  4148  M   R 8718335 + 256 [dd]
  8,0    3   177793    50.001768243  4148  U   N [dd] 1
  8,0    3   177794    50.001769464  4148  D   R 8718079 + 512 [dd]
  8,0    3   177795    50.003815034     0  C   R 8718079 + 512 [0]
  8,0    3   177796    50.004008636  4148  A   R 8718591 + 256 <- (8,1) 8718528
  8,0    3   177797    50.004008951  4148  Q   R 8718591 + 256 [dd]
  8,0    3   177798    50.004010787  4148  G   R 8718591 + 256 [dd]
  8,0    3   177799    50.004012089  4148  P   N [dd]
  8,0    3   177800    50.004012641  4148  I   R 8718591 + 256 [dd]
  8,0    3   177801    50.004139944  4148  A   R 8718847 + 256 <- (8,1) 8718784
  8,0    3   177802    50.004140298  4148  Q   R 8718847 + 256 [dd]
  8,0    3   177803    50.004141393  4148  M   R 8718847 + 256 [dd]
  8,0    3   177804    50.004142815  4148  U   N [dd] 1
  8,0    3   177805    50.004144003  4148  D   R 8718591 + 512 [dd]
  8,0    3   177806    50.007151480     0  C   R 8718591 + 512 [0]
  8,0    3   177807    50.007344467  4148  A   R 8719103 + 256 <- (8,1) 8719040
  8,0    3   177808    50.007344779  4148  Q   R 8719103 + 256 [dd]
  8,0    3   177809    50.007346636  4148  G   R 8719103 + 256 [dd]
  8,0    3   177810    50.007347821  4148  P   N [dd]
  8,0    3   177811    50.007348346  4148  I   R 8719103 + 256 [dd]
  8,0    3   177812    50.007480827  4148  A   R 8719359 + 256 <- (8,1) 8719296
  8,0    3   177813    50.007481187  4148  Q   R 8719359 + 256 [dd]
  8,0    3   177814    50.007482669  4148  M   R 8719359 + 256 [dd]
  8,0    3   177815    50.007483965  4148  U   N [dd] 1
  8,0    3   177816    50.007485171  4148  D   R 8719103 + 512 [dd]
  8,0    3   177817    50.009885672     0  C   R 8719103 + 512 [0]
  8,0    3   177818    50.010077696  4148  A   R 8719615 + 256 <- (8,1) 8719552
  8,0    3   177819    50.010078008  4148  Q   R 8719615 + 256 [dd]
  8,0    3   177820    50.010079841  4148  G   R 8719615 + 256 [dd]
  8,0    3   177821    50.010081227  4148  P   N [dd]
  8,0    3   177822    50.010081560  4148  I   R 8719615 + 256 [dd]
  8,0    3   177823    50.010208686  4148  A   R 8719871 + 256 <- (8,1) 8719808
  8,0    3   177824    50.010209046  4148  Q   R 8719871 + 256 [dd]
  8,0    3   177825    50.010210366  4148  M   R 8719871 + 256 [dd]
  8,0    3   177826    50.010211686  4148  U   N [dd] 1
  8,0    3   177827    50.010212916  4148  D   R 8719615 + 512 [dd]
  8,0    3   177828    50.013880081     0  C   R 8719615 + 512 [0]
  8,0    3   177829    50.014071235  4148  A   R 8720127 + 256 <- (8,1) 8720064
  8,0    3   177830    50.014071544  4148  Q   R 8720127 + 256 [dd]
  8,0    3   177831    50.014073332  4148  G   R 8720127 + 256 [dd]
  8,0    3   177832    50.014074517  4148  P   N [dd]
  8,0    3   177833    50.014075084  4148  I   R 8720127 + 256 [dd]
  8,0    3   177834    50.014201763  4148  A   R 8720383 + 256 <- (8,1) 8720320
  8,0    3   177835    50.014202123  4148  Q   R 8720383 + 256 [dd]
  8,0    3   177836    50.014203608  4148  M   R 8720383 + 256 [dd]
  8,0    3   177837    50.014204889  4148  U   N [dd] 1
  8,0    3   177838    50.014206095  4148  D   R 8720127 + 512 [dd]
  8,0    3   177839    50.017545281     0  C   R 8720127 + 512 [0]
  8,0    3   177840    50.017741679  4148  A   R 8720639 + 256 <- (8,1) 8720576
  8,0    3   177841    50.017742006  4148  Q   R 8720639 + 256 [dd]
  8,0    3   177842    50.017743848  4148  G   R 8720639 + 256 [dd]
  8,0    3   177843    50.017745318  4148  P   N [dd]
  8,0    3   177844    50.017745672  4148  I   R 8720639 + 256 [dd]
  8,0    3   177845    50.017876956  4148  A   R 8720895 + 256 <- (8,1) 8720832
  8,0    3   177846    50.017877286  4148  Q   R 8720895 + 256 [dd]
  8,0    3   177847    50.017878615  4148  M   R 8720895 + 256 [dd]
  8,0    3   177848    50.017880082  4148  U   N [dd] 1
  8,0    3   177849    50.017881339  4148  D   R 8720639 + 512 [dd]
  8,0    3   177850    50.020674534     0  C   R 8720639 + 512 [0]
  8,0    3   177851    50.020864689  4148  A   R 8721151 + 256 <- (8,1) 8721088
  8,0    3   177852    50.020865007  4148  Q   R 8721151 + 256 [dd]
  8,0    3   177853    50.020866900  4148  G   R 8721151 + 256 [dd]
  8,0    3   177854    50.020868283  4148  P   N [dd]
  8,0    3   177855    50.020868628  4148  I   R 8721151 + 256 [dd]
  8,0    3   177856    50.020997302  4148  A   R 8721407 + 256 <- (8,1) 8721344
  8,0    3   177857    50.020997662  4148  Q   R 8721407 + 256 [dd]
  8,0    3   177858    50.020998976  4148  M   R 8721407 + 256 [dd]
  8,0    3   177859    50.021000305  4148  U   N [dd] 1
  8,0    3   177860    50.021001520  4148  D   R 8721151 + 512 [dd]
  8,0    3   177861    50.024269136     0  C   R 8721151 + 512 [0]
  8,0    3   177862    50.024460931  4148  A   R 8721663 + 256 <- (8,1) 8721600
  8,0    3   177863    50.024461337  4148  Q   R 8721663 + 256 [dd]
  8,0    3   177864    50.024463175  4148  G   R 8721663 + 256 [dd]
  8,0    3   177865    50.024464537  4148  P   N [dd]
  8,0    3   177866    50.024464871  4148  I   R 8721663 + 256 [dd]
  8,0    3   177867    50.024597943  4148  A   R 8721919 + 256 <- (8,1) 8721856
  8,0    3   177868    50.024598213  4148  Q   R 8721919 + 256 [dd]
  8,0    3   177869    50.024599323  4148  M   R 8721919 + 256 [dd]
  8,0    3   177870    50.024600751  4148  U   N [dd] 1
  8,0    3   177871    50.024602104  4148  D   R 8721663 + 512 [dd]
  8,0    3   177872    50.026966145     0  C   R 8721663 + 512 [0]
  8,0    3   177873    50.027157245  4148  A   R 8722175 + 256 <- (8,1) 8722112
  8,0    3   177874    50.027157563  4148  Q   R 8722175 + 256 [dd]
  8,0    3   177875    50.027159351  4148  G   R 8722175 + 256 [dd]
  8,0    3   177876    50.027160731  4148  P   N [dd]
  8,0    3   177877    50.027161064  4148  I   R 8722175 + 256 [dd]
  8,0    3   177878    50.027288745  4148  A   R 8722431 + 256 <- (8,1) 8722368
  8,0    3   177879    50.027289105  4148  Q   R 8722431 + 256 [dd]
  8,0    3   177880    50.027290206  4148  M   R 8722431 + 256 [dd]
  8,0    3   177881    50.027291697  4148  U   N [dd] 1
  8,0    3   177882    50.027293119  4148  D   R 8722175 + 512 [dd]
  8,0    3   177883    50.030406105     0  C   R 8722175 + 512 [0]
  8,0    3   177884    50.030600613  4148  A   R 8722687 + 256 <- (8,1) 8722624
  8,0    3   177885    50.030601199  4148  Q   R 8722687 + 256 [dd]
  8,0    3   177886    50.030603269  4148  G   R 8722687 + 256 [dd]
  8,0    3   177887    50.030604463  4148  P   N [dd]
  8,0    3   177888    50.030604799  4148  I   R 8722687 + 256 [dd]
  8,0    3   177889    50.030731757  4148  A   R 8722943 + 256 <- (8,1) 8722880
  8,0    3   177890    50.030732117  4148  Q   R 8722943 + 256 [dd]
  8,0    3   177891    50.030733397  4148  M   R 8722943 + 256 [dd]
  8,0    3   177892    50.030734882  4148  U   N [dd] 1
  8,0    3   177893    50.030736109  4148  D   R 8722687 + 512 [dd]
  8,0    3   177894    50.032916699     0  C   R 8722687 + 512 [0]
  8,0    3   177895    50.033176618  4148  A   R 8723199 + 256 <- (8,1) 8723136
  8,0    3   177896    50.033177218  4148  Q   R 8723199 + 256 [dd]
  8,0    3   177897    50.033181433  4148  G   R 8723199 + 256 [dd]
  8,0    3   177898    50.033184757  4148  P   N [dd]
  8,0    3   177899    50.033185642  4148  I   R 8723199 + 256 [dd]
  8,0    3   177900    50.033371264  4148  A   R 8723455 + 256 <- (8,1) 8723392
  8,0    3   177901    50.033371717  4148  Q   R 8723455 + 256 [dd]
  8,0    3   177902    50.033374015  4148  M   R 8723455 + 256 [dd]
  8,0    3   177903    50.033376814  4148  U   N [dd] 1
  8,0    3   177904    50.033380126  4148  D   R 8723199 + 512 [dd]
  8,0    3   177905    50.036715133     0  C   R 8723199 + 512 [0]
  8,0    3   177906    50.036971296  4148  A   R 8723711 + 256 <- (8,1) 8723648
  8,0    3   177907    50.036972136  4148  Q   R 8723711 + 256 [dd]
  8,0    3   177908    50.036975673  4148  G   R 8723711 + 256 [dd]
  8,0    3   177909    50.036978277  4148  P   N [dd]
  8,0    3   177910    50.036979450  4148  I   R 8723711 + 256 [dd]
  8,0    3   177911    50.037162429  4148  A   R 8723967 + 256 <- (8,1) 8723904
  8,0    3   177912    50.037162840  4148  Q   R 8723967 + 256 [dd]
  8,0    3   177913    50.037164967  4148  M   R 8723967 + 256 [dd]
  8,0    3   177914    50.037167223  4148  U   N [dd] 1
  8,0    3   177915    50.037170001  4148  D   R 8723711 + 512 [dd]
  8,0    3   177916    50.040521790     0  C   R 8723711 + 512 [0]
  8,0    3   177917    50.040729738  4148  A   R 8724223 + 256 <- (8,1) 8724160
  8,0    3   177918    50.040730200  4148  Q   R 8724223 + 256 [dd]
  8,0    3   177919    50.040732060  4148  G   R 8724223 + 256 [dd]
  8,0    3   177920    50.040733551  4148  P   N [dd]
  8,0    3   177921    50.040734109  4148  I   R 8724223 + 256 [dd]
  8,0    3   177922    50.040860173  4148  A   R 8724479 + 160 <- (8,1) 8724416
  8,0    3   177923    50.040860536  4148  Q   R 8724479 + 160 [dd]
  8,0    3   177924    50.040861517  4148  M   R 8724479 + 160 [dd]
  8,0    3   177925    50.040872542  4148  A   R 1055943 + 8 <- (8,1) 1055880
  8,0    3   177926    50.040872800  4148  Q   R 1055943 + 8 [dd]
  8,0    3   177927    50.040874849  4148  G   R 1055943 + 8 [dd]
  8,0    3   177928    50.040875485  4148  I   R 1055943 + 8 [dd]
  8,0    3   177929    50.040877045  4148  U   N [dd] 2
  8,0    3   177930    50.040878625  4148  D   R 8724223 + 416 [dd]
  8,0    3   177931    50.040895335  4148  D   R 1055943 + 8 [dd]
  8,0    3   177932    50.044383267     0  C   R 8724223 + 416 [0]
  8,0    3   177933    50.044704725     0  C   R 1055943 + 8 [0]
  8,0    3   177934    50.044749068  4148  A   R 8724639 + 96 <- (8,1) 8724576
  8,0    3   177935    50.044749472  4148  Q   R 8724639 + 96 [dd]
  8,0    3   177936    50.044752184  4148  G   R 8724639 + 96 [dd]
  8,0    3   177937    50.044753552  4148  P   N [dd]
  8,0    3   177938    50.044754032  4148  I   R 8724639 + 96 [dd]
  8,0    3   177939    50.044896095  4148  A   R 8724735 + 256 <- (8,1) 8724672
  8,0    3   177940    50.044896443  4148  Q   R 8724735 + 256 [dd]
  8,0    3   177941    50.044897538  4148  M   R 8724735 + 256 [dd]
  8,0    3   177942    50.044948546  4148  U   N [dd] 1
  8,0    3   177943    50.044950001  4148  D   R 8724639 + 352 [dd]
  8,0    3   177944    50.047150137     0  C   R 8724639 + 352 [0]
  8,0    3   177945    50.047294824  4148  A   R 8724991 + 256 <- (8,1) 8724928
  8,0    3   177946    50.047295142  4148  Q   R 8724991 + 256 [dd]
  8,0    3   177947    50.047296978  4148  G   R 8724991 + 256 [dd]
  8,0    3   177948    50.047298301  4148  P   N [dd]
  8,0    3   177949    50.047298637  4148  I   R 8724991 + 256 [dd]
  8,0    3   177950    50.047429027  4148  A   R 8725247 + 256 <- (8,1) 8725184
  8,0    3   177951    50.047429387  4148  Q   R 8725247 + 256 [dd]
  8,0    3   177952    50.047430479  4148  M   R 8725247 + 256 [dd]
  8,0    3   177953    50.047431736  4148  U   N [dd] 1
  8,0    3   177954    50.047432951  4148  D   R 8724991 + 512 [dd]
  8,0    3   177955    50.050313976     0  C   R 8724991 + 512 [0]
  8,0    3   177956    50.050507961  4148  A   R 8725503 + 256 <- (8,1) 8725440
  8,0    3   177957    50.050508273  4148  Q   R 8725503 + 256 [dd]
  8,0    3   177958    50.050510139  4148  G   R 8725503 + 256 [dd]
  8,0    3   177959    50.050511522  4148  P   N [dd]
  8,0    3   177960    50.050512062  4148  I   R 8725503 + 256 [dd]
  8,0    3   177961    50.050645393  4148  A   R 8725759 + 256 <- (8,1) 8725696
  8,0    3   177962    50.050645867  4148  Q   R 8725759 + 256 [dd]
  8,0    3   177963    50.050647171  4148  M   R 8725759 + 256 [dd]
  8,0    3   177964    50.050648593  4148  U   N [dd] 1
  8,0    3   177965    50.050649985  4148  D   R 8725503 + 512 [dd]
  8,0    3   177966    50.053380250     0  C   R 8725503 + 512 [0]
  8,0    3   177967    50.053576324  4148  A   R 8726015 + 256 <- (8,1) 8725952
  8,0    3   177968    50.053576615  4148  Q   R 8726015 + 256 [dd]
  8,0    3   177969    50.053578994  4148  G   R 8726015 + 256 [dd]
  8,0    3   177970    50.053580173  4148  P   N [dd]
  8,0    3   177971    50.053580509  4148  I   R 8726015 + 256 [dd]
  8,0    3   177972    50.053711503  4148  A   R 8726271 + 256 <- (8,1) 8726208
  8,0    3   177973    50.053712001  4148  Q   R 8726271 + 256 [dd]
  8,0    3   177974    50.053713332  4148  M   R 8726271 + 256 [dd]
  8,0    3   177975    50.053714583  4148  U   N [dd] 1
  8,0    3   177976    50.053715768  4148  D   R 8726015 + 512 [dd]
  8,0    3   177977    50.056970395     0  C   R 8726015 + 512 [0]
  8,0    3   177978    50.057161408  4148  A   R 8726527 + 256 <- (8,1) 8726464
  8,0    3   177979    50.057161726  4148  Q   R 8726527 + 256 [dd]
  8,0    3   177980    50.057163718  4148  G   R 8726527 + 256 [dd]
  8,0    3   177981    50.057165098  4148  P   N [dd]
  8,0    3   177982    50.057165431  4148  I   R 8726527 + 256 [dd]
  8,0    3   177983    50.057294630  4148  A   R 8726783 + 256 <- (8,1) 8726720
  8,0    3   177984    50.057294990  4148  Q   R 8726783 + 256 [dd]
  8,0    3   177985    50.057296070  4148  M   R 8726783 + 256 [dd]
  8,0    3   177986    50.057297402  4148  U   N [dd] 1
  8,0    3   177987    50.057298899  4148  D   R 8726527 + 512 [dd]
  8,0    3   177988    50.060326743     0  C   R 8726527 + 512 [0]
  8,0    3   177989    50.060523768  4148  A   R 8727039 + 256 <- (8,1) 8726976
  8,0    3   177990    50.060524095  4148  Q   R 8727039 + 256 [dd]
  8,0    3   177991    50.060525910  4148  G   R 8727039 + 256 [dd]
  8,0    3   177992    50.060527239  4148  P   N [dd]
  8,0    3   177993    50.060527575  4148  I   R 8727039 + 256 [dd]
  8,0    3   177994    50.060662280  4148  A   R 8727295 + 256 <- (8,1) 8727232
  8,0    3   177995    50.060662778  4148  Q   R 8727295 + 256 [dd]
  8,0    3   177996    50.060663993  4148  M   R 8727295 + 256 [dd]
  8,0    3   177997    50.060665403  4148  U   N [dd] 1
  8,0    3   177998    50.060666999  4148  D   R 8727039 + 512 [dd]
  8,0    3   177999    50.063922341     0  C   R 8727039 + 512 [0]
  8,0    3   178000    50.064113177  4148  A   R 8727551 + 256 <- (8,1) 8727488
  8,0    3   178001    50.064113492  4148  Q   R 8727551 + 256 [dd]
  8,0    3   178002    50.064115373  4148  G   R 8727551 + 256 [dd]

-2.6.30-rc6-patched
  8,0    3   257297    50.000760847     0  C   R 9480703 + 256 [0]
  8,0    3   257298    50.000944399  4139  A   R 9481215 + 256 <- (8,1) 9481152
  8,0    3   257299    50.000944693  4139  Q   R 9481215 + 256 [dd]
  8,0    3   257300    50.000946541  4139  G   R 9481215 + 256 [dd]
  8,0    3   257301    50.000947954  4139  P   N [dd]
  8,0    3   257302    50.000948368  4139  I   R 9481215 + 256 [dd]
  8,0    3   257303    50.000948920  4139  U   N [dd] 2
  8,0    3   257304    50.000950003  4139  D   R 9481215 + 256 [dd]
  8,0    3   257305    50.000962541  4139  U   N [dd] 2
  8,0    3   257306    50.003034240     0  C   R 9480959 + 256 [0]
  8,0    3   257307    50.003076338     0  C   R 9481215 + 256 [0]
  8,0    3   257308    50.003258111  4139  A   R 9481471 + 256 <- (8,1) 9481408
  8,0    3   257309    50.003258402  4139  Q   R 9481471 + 256 [dd]
  8,0    3   257310    50.003260190  4139  G   R 9481471 + 256 [dd]
  8,0    3   257311    50.003261399  4139  P   N [dd]
  8,0    3   257312    50.003261768  4139  I   R 9481471 + 256 [dd]
  8,0    3   257313    50.003262335  4139  U   N [dd] 1
  8,0    3   257314    50.003263406  4139  D   R 9481471 + 256 [dd]
  8,0    3   257315    50.003430472  4139  A   R 9481727 + 256 <- (8,1) 9481664
  8,0    3   257316    50.003430748  4139  Q   R 9481727 + 256 [dd]
  8,0    3   257317    50.003433065  4139  G   R 9481727 + 256 [dd]
  8,0    3   257318    50.003434343  4139  P   N [dd]
  8,0    3   257319    50.003434658  4139  I   R 9481727 + 256 [dd]
  8,0    3   257320    50.003435138  4139  U   N [dd] 2
  8,0    3   257321    50.003436083  4139  D   R 9481727 + 256 [dd]
  8,0    3   257322    50.003447795  4139  U   N [dd] 2
  8,0    3   257323    50.004774693     0  C   R 9481471 + 256 [0]
  8,0    3   257324    50.004959499  4139  A   R 9481983 + 256 <- (8,1) 9481920
  8,0    3   257325    50.004959790  4139  Q   R 9481983 + 256 [dd]
  8,0    3   257326    50.004961590  4139  G   R 9481983 + 256 [dd]
  8,0    3   257327    50.004962793  4139  P   N [dd]
  8,0    3   257328    50.004963153  4139  I   R 9481983 + 256 [dd]
  8,0    3   257329    50.004964098  4139  U   N [dd] 2
  8,0    3   257330    50.004965184  4139  D   R 9481983 + 256 [dd]
  8,0    3   257331    50.004978967  4139  U   N [dd] 2
  8,0    3   257332    50.006865854     0  C   R 9481727 + 256 [0]
  8,0    3   257333    50.007052043  4139  A   R 9482239 + 256 <- (8,1) 9482176
  8,0    3   257334    50.007052331  4139  Q   R 9482239 + 256 [dd]
  8,0    3   257335    50.007054146  4139  G   R 9482239 + 256 [dd]
  8,0    3   257336    50.007055355  4139  P   N [dd]
  8,0    3   257337    50.007055724  4139  I   R 9482239 + 256 [dd]
  8,0    3   257338    50.007056438  4139  U   N [dd] 2
  8,0    3   257339    50.007057605  4139  D   R 9482239 + 256 [dd]
  8,0    3   257340    50.007069963  4139  U   N [dd] 2
  8,0    3   257341    50.008250294     0  C   R 9481983 + 256 [0]
  8,0    3   257342    50.008431589  4139  A   R 9482495 + 256 <- (8,1) 9482432
  8,0    3   257343    50.008431881  4139  Q   R 9482495 + 256 [dd]
  8,0    3   257344    50.008433921  4139  G   R 9482495 + 256 [dd]
  8,0    3   257345    50.008435097  4139  P   N [dd]
  8,0    3   257346    50.008435466  4139  I   R 9482495 + 256 [dd]
  8,0    3   257347    50.008436213  4139  U   N [dd] 2
  8,0    3   257348    50.008437296  4139  D   R 9482495 + 256 [dd]
  8,0    3   257349    50.008450034  4139  U   N [dd] 2
  8,0    3   257350    50.010008843     0  C   R 9482239 + 256 [0]
  8,0    3   257351    50.010135287  4139  C   R 9482495 + 256 [0]
  8,0    3   257352    50.010226816  4139  A   R 9482751 + 256 <- (8,1) 9482688
  8,0    3   257353    50.010227107  4139  Q   R 9482751 + 256 [dd]
  8,0    3   257354    50.010229363  4139  G   R 9482751 + 256 [dd]
  8,0    3   257355    50.010230728  4139  P   N [dd]
  8,0    3   257356    50.010231097  4139  I   R 9482751 + 256 [dd]
  8,0    3   257357    50.010231655  4139  U   N [dd] 1
  8,0    3   257358    50.010232696  4139  D   R 9482751 + 256 [dd]
  8,0    3   257359    50.010380946  4139  A   R 9483007 + 256 <- (8,1) 9482944
  8,0    3   257360    50.010381264  4139  Q   R 9483007 + 256 [dd]
  8,0    3   257361    50.010383358  4139  G   R 9483007 + 256 [dd]
  8,0    3   257362    50.010384429  4139  P   N [dd]
  8,0    3   257363    50.010384741  4139  I   R 9483007 + 256 [dd]
  8,0    3   257364    50.010385395  4139  U   N [dd] 2
  8,0    3   257365    50.010386364  4139  D   R 9483007 + 256 [dd]
  8,0    3   257366    50.010397869  4139  U   N [dd] 2
  8,0    3   257367    50.014210132     0  C   R 9482751 + 256 [0]
  8,0    3   257368    50.014252938     0  C   R 9483007 + 256 [0]
  8,0    3   257369    50.014430811  4139  A   R 9483263 + 256 <- (8,1) 9483200
  8,0    3   257370    50.014431105  4139  Q   R 9483263 + 256 [dd]
  8,0    3   257371    50.014433139  4139  G   R 9483263 + 256 [dd]
  8,0    3   257372    50.014434520  4139  P   N [dd]
  8,0    3   257373    50.014435110  4139  I   R 9483263 + 256 [dd]
  8,0    3   257374    50.014435674  4139  U   N [dd] 1
  8,0    3   257375    50.014436770  4139  D   R 9483263 + 256 [dd]
  8,0    3   257376    50.014592117  4139  A   R 9483519 + 256 <- (8,1) 9483456
  8,0    3   257377    50.014592573  4139  Q   R 9483519 + 256 [dd]
  8,0    3   257378    50.014594391  4139  G   R 9483519 + 256 [dd]
  8,0    3   257379    50.014595504  4139  P   N [dd]
  8,0    3   257380    50.014595876  4139  I   R 9483519 + 256 [dd]
  8,0    3   257381    50.014596366  4139  U   N [dd] 2
  8,0    3   257382    50.014597368  4139  D   R 9483519 + 256 [dd]
  8,0    3   257383    50.014609521  4139  U   N [dd] 2
  8,0    3   257384    50.015937813     0  C   R 9483263 + 256 [0]
  8,0    3   257385    50.016124825  4139  A   R 9483775 + 256 <- (8,1) 9483712
  8,0    3   257386    50.016125116  4139  Q   R 9483775 + 256 [dd]
  8,0    3   257387    50.016127162  4139  G   R 9483775 + 256 [dd]
  8,0    3   257388    50.016128569  4139  P   N [dd]
  8,0    3   257389    50.016128983  4139  I   R 9483775 + 256 [dd]
  8,0    3   257390    50.016129538  4139  U   N [dd] 2
  8,0    3   257391    50.016130627  4139  D   R 9483775 + 256 [dd]
  8,0    3   257392    50.016143077  4139  U   N [dd] 2
  8,0    3   257393    50.016925304     0  C   R 9483519 + 256 [0]
  8,0    3   257394    50.017111307  4139  A   R 9484031 + 256 <- (8,1) 9483968
  8,0    3   257395    50.017111598  4139  Q   R 9484031 + 256 [dd]
  8,0    3   257396    50.017113410  4139  G   R 9484031 + 256 [dd]
  8,0    3   257397    50.017114835  4139  P   N [dd]
  8,0    3   257398    50.017115213  4139  I   R 9484031 + 256 [dd]
  8,0    3   257399    50.017115765  4139  U   N [dd] 2
  8,0    3   257400    50.017116839  4139  D   R 9484031 + 256 [dd]
  8,0    3   257401    50.017129023  4139  U   N [dd] 2
  8,0    3   257402    50.017396693     0  C   R 9483775 + 256 [0]
  8,0    3   257403    50.017584595  4139  A   R 9484287 + 256 <- (8,1) 9484224
  8,0    3   257404    50.017585018  4139  Q   R 9484287 + 256 [dd]
  8,0    3   257405    50.017586866  4139  G   R 9484287 + 256 [dd]
  8,0    3   257406    50.017587997  4139  P   N [dd]
  8,0    3   257407    50.017588393  4139  I   R 9484287 + 256 [dd]
  8,0    3   257408    50.017589105  4139  U   N [dd] 2
  8,0    3   257409    50.017590173  4139  D   R 9484287 + 256 [dd]
  8,0    3   257410    50.017602614  4139  U   N [dd] 2
  8,0    3   257411    50.020578876     0  C   R 9484031 + 256 [0]
  8,0    3   257412    50.020721857  4139  C   R 9484287 + 256 [0]
  8,0    3   257413    50.020803183  4139  A   R 9484543 + 256 <- (8,1) 9484480
  8,0    3   257414    50.020803507  4139  Q   R 9484543 + 256 [dd]
  8,0    3   257415    50.020805256  4139  G   R 9484543 + 256 [dd]
  8,0    3   257416    50.020806672  4139  P   N [dd]
  8,0    3   257417    50.020807065  4139  I   R 9484543 + 256 [dd]
  8,0    3   257418    50.020807668  4139  U   N [dd] 1
  8,0    3   257419    50.020808733  4139  D   R 9484543 + 256 [dd]
  8,0    3   257420    50.020957132  4139  A   R 9484799 + 256 <- (8,1) 9484736
  8,0    3   257421    50.020957423  4139  Q   R 9484799 + 256 [dd]
  8,0    3   257422    50.020959205  4139  G   R 9484799 + 256 [dd]
  8,0    3   257423    50.020960276  4139  P   N [dd]
  8,0    3   257424    50.020960594  4139  I   R 9484799 + 256 [dd]
  8,0    3   257425    50.020961062  4139  U   N [dd] 2
  8,0    3   257426    50.020961959  4139  D   R 9484799 + 256 [dd]
  8,0    3   257427    50.020974191  4139  U   N [dd] 2
  8,0    3   257428    50.023987847     0  C   R 9484543 + 256 [0]
  8,0    3   257429    50.024093062  4139  C   R 9484799 + 256 [0]
  8,0    3   257430    50.024207161  4139  A   R 9485055 + 256 <- (8,1) 9484992
  8,0    3   257431    50.024207434  4139  Q   R 9485055 + 256 [dd]
  8,0    3   257432    50.024209567  4139  G   R 9485055 + 256 [dd]
  8,0    3   257433    50.024210728  4139  P   N [dd]
  8,0    3   257434    50.024211097  4139  I   R 9485055 + 256 [dd]
  8,0    3   257435    50.024211661  4139  U   N [dd] 1
  8,0    3   257436    50.024212693  4139  D   R 9485055 + 256 [dd]
  8,0    3   257437    50.024359266  4139  A   R 9485311 + 256 <- (8,1) 9485248
  8,0    3   257438    50.024359584  4139  Q   R 9485311 + 256 [dd]
  8,0    3   257439    50.024361720  4139  G   R 9485311 + 256 [dd]
  8,0    3   257440    50.024362794  4139  P   N [dd]
  8,0    3   257441    50.024363106  4139  I   R 9485311 + 256 [dd]
  8,0    3   257442    50.024363760  4139  U   N [dd] 2
  8,0    3   257443    50.024364759  4139  D   R 9485311 + 256 [dd]
  8,0    3   257444    50.024376535  4139  U   N [dd] 2
  8,0    3   257445    50.026532544     0  C   R 9485055 + 256 [0]
  8,0    3   257446    50.026714236  4139  A   R 9485567 + 256 <- (8,1) 9485504
  8,0    3   257447    50.026714524  4139  Q   R 9485567 + 256 [dd]
  8,0    3   257448    50.026716354  4139  G   R 9485567 + 256 [dd]
  8,0    3   257449    50.026717791  4139  P   N [dd]
  8,0    3   257450    50.026718175  4139  I   R 9485567 + 256 [dd]
  8,0    3   257451    50.026718778  4139  U   N [dd] 2
  8,0    3   257452    50.026719876  4139  D   R 9485567 + 256 [dd]
  8,0    3   257453    50.026736383  4139  U   N [dd] 2
  8,0    3   257454    50.028531879     0  C   R 9485311 + 256 [0]
  8,0    3   257455    50.028684347  4139  C   R 9485567 + 256 [0]
  8,0    3   257456    50.028758787  4139  A   R 9485823 + 256 <- (8,1) 9485760
  8,0    3   257457    50.028759069  4139  Q   R 9485823 + 256 [dd]
  8,0    3   257458    50.028760884  4139  G   R 9485823 + 256 [dd]
  8,0    3   257459    50.028762099  4139  P   N [dd]
  8,0    3   257460    50.028762447  4139  I   R 9485823 + 256 [dd]
  8,0    3   257461    50.028763038  4139  U   N [dd] 1
  8,0    3   257462    50.028764268  4139  D   R 9485823 + 256 [dd]
  8,0    3   257463    50.028909841  4139  A   R 9486079 + 256 <- (8,1) 9486016
  8,0    3   257464    50.028910156  4139  Q   R 9486079 + 256 [dd]
  8,0    3   257465    50.028911896  4139  G   R 9486079 + 256 [dd]
  8,0    3   257466    50.028912964  4139  P   N [dd]
  8,0    3   257467    50.028913270  4139  I   R 9486079 + 256 [dd]
  8,0    3   257468    50.028913912  4139  U   N [dd] 2
  8,0    3   257469    50.028914878  4139  D   R 9486079 + 256 [dd]
  8,0    3   257470    50.028927497  4139  U   N [dd] 2
  8,0    3   257471    50.031158357     0  C   R 9485823 + 256 [0]
  8,0    3   257472    50.031292365  4139  C   R 9486079 + 256 [0]
  8,0    3   257473    50.031369697  4139  A   R 9486335 + 160 <- (8,1) 9486272
  8,0    3   257474    50.031369988  4139  Q   R 9486335 + 160 [dd]
  8,0    3   257475    50.031371779  4139  G   R 9486335 + 160 [dd]
  8,0    3   257476    50.031372850  4139  P   N [dd]
  8,0    3   257477    50.031373198  4139  I   R 9486335 + 160 [dd]
  8,0    3   257478    50.031384931  4139  A   R 1056639 + 8 <- (8,1) 1056576
  8,0    3   257479    50.031385201  4139  Q   R 1056639 + 8 [dd]
  8,0    3   257480    50.031388480  4139  G   R 1056639 + 8 [dd]
  8,0    3   257481    50.031388904  4139  I   R 1056639 + 8 [dd]
  8,0    3   257482    50.031390362  4139  U   N [dd] 2
  8,0    3   257483    50.031391523  4139  D   R 9486335 + 160 [dd]
  8,0    3   257484    50.031403403  4139  D   R 1056639 + 8 [dd]
  8,0    3   257485    50.033630747     0  C   R 1056639 + 8 [0]
  8,0    3   257486    50.033690300  4139  A   R 9486495 + 96 <- (8,1) 9486432
  8,0    3   257487    50.033690810  4139  Q   R 9486495 + 96 [dd]
  8,0    3   257488    50.033694581  4139  G   R 9486495 + 96 [dd]
  8,0    3   257489    50.033696739  4139  P   N [dd]
  8,0    3   257490    50.033697357  4139  I   R 9486495 + 96 [dd]
  8,0    3   257491    50.033698611  4139  U   N [dd] 2
  8,0    3   257492    50.033700945  4139  D   R 9486495 + 96 [dd]
  8,0    3   257493    50.033727763  4139  C   R 9486335 + 160 [0]
  8,0    3   257494    50.033996024  4139  A   R 9486591 + 256 <- (8,1) 9486528
  8,0    3   257495    50.033996396  4139  Q   R 9486591 + 256 [dd]
  8,0    3   257496    50.034000030  4139  G   R 9486591 + 256 [dd]
  8,0    3   257497    50.034002268  4139  P   N [dd]
  8,0    3   257498    50.034002820  4139  I   R 9486591 + 256 [dd]
  8,0    3   257499    50.034003924  4139  U   N [dd] 2
  8,0    3   257500    50.034006201  4139  D   R 9486591 + 256 [dd]
  8,0    3   257501    50.034091438  4139  U   N [dd] 2
  8,0    3   257502    50.034637372     0  C   R 9486495 + 96 [0]
  8,0    3   257503    50.034841508  4139  A   R 9486847 + 256 <- (8,1) 9486784
  8,0    3   257504    50.034842072  4139  Q   R 9486847 + 256 [dd]
  8,0    3   257505    50.034846117  4139  G   R 9486847 + 256 [dd]
  8,0    3   257506    50.034848676  4139  P   N [dd]
  8,0    3   257507    50.034849384  4139  I   R 9486847 + 256 [dd]
  8,0    3   257508    50.034850545  4139  U   N [dd] 2
  8,0    3   257509    50.034852795  4139  D   R 9486847 + 256 [dd]
  8,0    3   257510    50.034875503  4139  U   N [dd] 2
  8,0    3   257511    50.035370009     0  C   R 9486591 + 256 [0]
  8,0    3   257512    50.035622315  4139  A   R 9487103 + 256 <- (8,1) 9487040
  8,0    3   257513    50.035622954  4139  Q   R 9487103 + 256 [dd]
  8,0    3   257514    50.035627101  4139  G   R 9487103 + 256 [dd]
  8,0    3   257515    50.035629510  4139  P   N [dd]
  8,0    3   257516    50.035630143  4139  I   R 9487103 + 256 [dd]
  8,0    3   257517    50.035631058  4139  U   N [dd] 2
  8,0    3   257518    50.035632657  4139  D   R 9487103 + 256 [dd]
  8,0    3   257519    50.035656358  4139  U   N [dd] 2
  8,0    3   257520    50.036703329     0  C   R 9486847 + 256 [0]
  8,0    3   257521    50.036963604  4139  A   R 9487359 + 256 <- (8,1) 9487296
  8,0    3   257522    50.036964057  4139  Q   R 9487359 + 256 [dd]
  8,0    3   257523    50.036967636  4139  G   R 9487359 + 256 [dd]
  8,0    3   257524    50.036969710  4139  P   N [dd]
  8,0    3   257525    50.036970586  4139  I   R 9487359 + 256 [dd]
  8,0    3   257526    50.036971684  4139  U   N [dd] 2
  8,0    3   257527    50.036973631  4139  D   R 9487359 + 256 [dd]
  8,0    3   257528    50.036995034  4139  U   N [dd] 2
  8,0    3   257529    50.038904428     0  C   R 9487103 + 256 [0]
  8,0    3   257530    50.039161508  4139  A   R 9487615 + 256 <- (8,1) 9487552
  8,0    3   257531    50.039161934  4139  Q   R 9487615 + 256 [dd]
  8,0    3   257532    50.039165834  4139  G   R 9487615 + 256 [dd]
  8,0    3   257533    50.039168561  4139  P   N [dd]
  8,0    3   257534    50.039169353  4139  I   R 9487615 + 256 [dd]
  8,0    3   257535    50.039170343  4139  U   N [dd] 2
  8,0    3   257536    50.039171645  4139  D   R 9487615 + 256 [dd]
  8,0    3   257537    50.039193195  4139  U   N [dd] 2
  8,0    3   257538    50.040570003     0  C   R 9487359 + 256 [0]
  8,0    3   257539    50.040842161  4139  A   R 9487871 + 256 <- (8,1) 9487808
  8,0    3   257540    50.040842827  4139  Q   R 9487871 + 256 [dd]
  8,0    3   257541    50.040846803  4139  G   R 9487871 + 256 [dd]
  8,0    3   257542    50.040849902  4139  P   N [dd]
  8,0    3   257543    50.040850715  4139  I   R 9487871 + 256 [dd]
  8,0    3   257544    50.040851642  4139  U   N [dd] 2
  8,0    3   257545    50.040853658  4139  D   R 9487871 + 256 [dd]
  8,0    3   257546    50.040876270  4139  U   N [dd] 2
  8,0    3   257547    50.042081391     0  C   R 9487615 + 256 [0]
  8,0    3   257548    50.042215837  4139  C   R 9487871 + 256 [0]
  8,0    3   257549    50.042316192  4139  A   R 9488127 + 256 <- (8,1) 9488064
  8,0    3   257550    50.042316633  4139  Q   R 9488127 + 256 [dd]
  8,0    3   257551    50.042319213  4139  G   R 9488127 + 256 [dd]
  8,0    3   257552    50.042320803  4139  P   N [dd]
  8,0    3   257553    50.042321412  4139  I   R 9488127 + 256 [dd]
  8,0    3   257554    50.042322219  4139  U   N [dd] 1
  8,0    3   257555    50.042323362  4139  D   R 9488127 + 256 [dd]
  8,0    3   257556    50.042484350  4139  A   R 9488383 + 256 <- (8,1) 9488320
  8,0    3   257557    50.042484602  4139  Q   R 9488383 + 256 [dd]
  8,0    3   257558    50.042486744  4139  G   R 9488383 + 256 [dd]
  8,0    3   257559    50.042487908  4139  P   N [dd]
  8,0    3   257560    50.042488223  4139  I   R 9488383 + 256 [dd]
  8,0    3   257561    50.042488754  4139  U   N [dd] 2
  8,0    3   257562    50.042489927  4139  D   R 9488383 + 256 [dd]
  8,0    3   257563    50.042502678  4139  U   N [dd] 2
  8,0    3   257564    50.045166592     0  C   R 9488127 + 256 [0]
  8,0    3   257565    50.045355163  4139  A   R 9488639 + 256 <- (8,1) 9488576
  8,0    3   257566    50.045355493  4139  Q   R 9488639 + 256 [dd]
  8,0    3   257567    50.045357497  4139  G   R 9488639 + 256 [dd]
  8,0    3   257568    50.045358673  4139  P   N [dd]
  8,0    3   257569    50.045359267  4139  I   R 9488639 + 256 [dd]
  8,0    3   257570    50.045359831  4139  U   N [dd] 2
  8,0    3   257571    50.045360911  4139  D   R 9488639 + 256 [dd]
  8,0    3   257572    50.045373959  4139  U   N [dd] 2
  8,0    3   257573    50.046450730     0  C   R 9488383 + 256 [0]
  8,0    3   257574    50.046641639  4139  A   R 9488895 + 256 <- (8,1) 9488832
  8,0    3   257575    50.046642086  4139  Q   R 9488895 + 256 [dd]
  8,0    3   257576    50.046643937  4139  G   R 9488895 + 256 [dd]
  8,0    3   257577    50.046645092  4139  P   N [dd]
  8,0    3   257578    50.046645527  4139  I   R 9488895 + 256 [dd]
  8,0    3   257579    50.046646244  4139  U   N [dd] 2
  8,0    3   257580    50.046647327  4139  D   R 9488895 + 256 [dd]
  8,0    3   257581    50.046660234  4139  U   N [dd] 2
  8,0    3   257582    50.047826305     0  C   R 9488639 + 256 [0]
  8,0    3   257583    50.048011468  4139  A   R 9489151 + 256 <- (8,1) 9489088
  8,0    3   257584    50.048011762  4139  Q   R 9489151 + 256 [dd]
  8,0    3   257585    50.048013793  4139  G   R 9489151 + 256 [dd]
  8,0    3   257586    50.048014966  4139  P   N [dd]
  8,0    3   257587    50.048015380  4139  I   R 9489151 + 256 [dd]
  8,0    3   257588    50.048016112  4139  U   N [dd] 2
  8,0    3   257589    50.048017202  4139  D   R 9489151 + 256 [dd]
  8,0    3   257590    50.048029553  4139  U   N [dd] 2
  8,0    3   257591    50.049319830     0  C   R 9488895 + 256 [0]
  8,0    3   257592    50.049446089  4139  C   R 9489151 + 256 [0]
  8,0    3   257593    50.049545199  4139  A   R 9489407 + 256 <- (8,1) 9489344
  8,0    3   257594    50.049545628  4139  Q   R 9489407 + 256 [dd]
  8,0    3   257595    50.049547512  4139  G   R 9489407 + 256 [dd]
  8,0    3   257596    50.049548886  4139  P   N [dd]
  8,0    3   257597    50.049549318  4139  I   R 9489407 + 256 [dd]
  8,0    3   257598    50.049550047  4139  U   N [dd] 1
  8,0    3   257599    50.049551241  4139  D   R 9489407 + 256 [dd]
  8,0    3   257600    50.049699283  4139  A   R 9489663 + 256 <- (8,1) 9489600
  8,0    3   257601    50.049699556  4139  Q   R 9489663 + 256 [dd]
  8,0    3   257602    50.049701266  4139  G   R 9489663 + 256 [dd]
  8,0    3   257603    50.049702310  4139  P   N [dd]
  8,0    3   257604    50.049702656  4139  I   R 9489663 + 256 [dd]
  8,0    3   257605    50.049703118  4139  U   N [dd] 2
  8,0    3   257606    50.049704020  4139  D   R 9489663 + 256 [dd]
  8,0    3   257607    50.049715940  4139  U   N [dd] 2
  8,0    3   257608    50.052662150     0  C   R 9489407 + 256 [0]
  8,0    3   257609    50.052853688  4139  A   R 9489919 + 256 <- (8,1) 9489856
  8,0    3   257610    50.052853985  4139  Q   R 9489919 + 256 [dd]
  8,0    3   257611    50.052855869  4139  G   R 9489919 + 256 [dd]
  8,0    3   257612    50.052857057  4139  P   N [dd]
  8,0    3   257613    50.052857423  4139  I   R 9489919 + 256 [dd]
  8,0    3   257614    50.052858065  4139  U   N [dd] 2
  8,0    3   257615    50.052859164  4139  D   R 9489919 + 256 [dd]
  8,0    3   257616    50.052871806  4139  U   N [dd] 2
  8,0    3   257617    50.053470795     0  C   R 9489663 + 256 [0]
  8,0    3   257618    50.053661719  4139  A   R 9490175 + 256 <- (8,1) 9490112
  8,0    3   257619    50.053662097  4139  Q   R 9490175 + 256 [dd]
  8,0    3   257620    50.053663891  4139  G   R 9490175 + 256 [dd]
  8,0    3   257621    50.053665034  4139  P   N [dd]
  8,0    3   257622    50.053665436  4139  I   R 9490175 + 256 [dd]
  8,0    3   257623    50.053665982  4139  U   N [dd] 2
  8,0    3   257624    50.053667077  4139  D   R 9490175 + 256 [dd]
  8,0    3   257625    50.053679732  4139  U   N [dd] 2
  8,0    3   257626    50.055776383     0  C   R 9489919 + 256 [0]
  8,0    3   257627    50.055915017  4139  C   R 9490175 + 256 [0]
  8,0    3   257628    50.055997812  4139  A   R 9490431 + 256 <- (8,1) 9490368
  8,0    3   257629    50.055998085  4139  Q   R 9490431 + 256 [dd]
  8,0    3   257630    50.055999867  4139  G   R 9490431 + 256 [dd]
  8,0    3   257631    50.056001049  4139  P   N [dd]
  8,0    3   257632    50.056001451  4139  I   R 9490431 + 256 [dd]
  8,0    3   257633    50.056002189  4139  U   N [dd] 1
  8,0    3   257634    50.056003197  4139  D   R 9490431 + 256 [dd]
  8,0    3   257635    50.056149977  4139  A   R 9490687 + 256 <- (8,1) 9490624
  8,0    3   257636    50.056150279  4139  Q   R 9490687 + 256 [dd]
  8,0    3   257637    50.056152047  4139  G   R 9490687 + 256 [dd]
  8,0    3   257638    50.056153109  4139  P   N [dd]
  8,0    3   257639    50.056153442  4139  I   R 9490687 + 256 [dd]
  8,0    3   257640    50.056153904  4139  U   N [dd] 2
  8,0    3   257641    50.056154852  4139  D   R 9490687 + 256 [dd]
  8,0    3   257642    50.056166948  4139  U   N [dd] 2
  8,0    3   257643    50.057600660     0  C   R 9490431 + 256 [0]
  8,0    3   257644    50.057786753  4139  A   R 9490943 + 256 <- (8,1) 9490880
  8,0    3   257645    50.057787050  4139  Q   R 9490943 + 256 [dd]
  8,0    3   257646    50.057788865  4139  G   R 9490943 + 256 [dd]
  8,0    3   257647    50.057790236  4139  P   N [dd]
  8,0    3   257648    50.057790614  4139  I   R 9490943 + 256 [dd]
  8,0    3   257649    50.057791169  4139  U   N [dd] 2
  8,0    3   257650    50.057792246  4139  D   R 9490943 + 256 [dd]
  8,0    3   257651    50.057804469  4139  U   N [dd] 2
  8,0    3   257652    50.060322995     0  C   R 9490687 + 256 [0]
  8,0    3   257653    50.060464005  4139  C   R 9490943 + 256 [0]
  8,0    3   257654    50.060548216  4139  A   R 9491199 + 256 <- (8,1) 9491136
  8,0    3   257655    50.060548696  4139  Q   R 9491199 + 256 [dd]
  8,0    3   257656    50.060550922  4139  G   R 9491199 + 256 [dd]
  8,0    3   257657    50.060552096  4139  P   N [dd]
  8,0    3   257658    50.060552531  4139  I   R 9491199 + 256 [dd]
  8,0    3   257659    50.060553101  4139  U   N [dd] 1
  8,0    3   257660    50.060554100  4139  D   R 9491199 + 256 [dd]
  8,0    3   257661    50.060701569  4139  A   R 9491455 + 256 <- (8,1) 9491392
  8,0    3   257662    50.060701890  4139  Q   R 9491455 + 256 [dd]
  8,0    3   257663    50.060703993  4139  G   R 9491455 + 256 [dd]
  8,0    3   257664    50.060705070  4139  P   N [dd]
  8,0    3   257665    50.060705385  4139  I   R 9491455 + 256 [dd]
  8,0    3   257666    50.060706012  4139  U   N [dd] 2
  8,0    3   257667    50.060706987  4139  D   R 9491455 + 256 [dd]
  8,0    3   257668    50.060718784  4139  U   N [dd] 2
  8,0    3   257669    50.062964966     0  C   R 9491199 + 256 [0]
  8,0    3   257670    50.063102772  4139  C   R 9491455 + 256 [0]
  8,0    3   257671    50.063182666  4139  A   R 9491711 + 256 <- (8,1) 9491648
  8,0    3   257672    50.063182939  4139  Q   R 9491711 + 256 [dd]
  8,0    3   257673    50.063184889  4139  G   R 9491711 + 256 [dd]
  8,0    3   257674    50.063186074  4139  P   N [dd]
  8,0    3   257675    50.063186440  4139  I   R 9491711 + 256 [dd]
  8,0    3   257676    50.063187271  4139  U   N [dd] 1
  8,0    3   257677    50.063188312  4139  D   R 9491711 + 256 [dd]
  8,0    3   257678    50.063340467  4139  A   R 9491967 + 256 <- (8,1) 9491904
  8,0    3   257679    50.063340749  4139  Q   R 9491967 + 256 [dd]
  8,0    3   257680    50.063342529  4139  G   R 9491967 + 256 [dd]
  8,0    3   257681    50.063343597  4139  P   N [dd]
  8,0    3   257682    50.063343915  4139  I   R 9491967 + 256 [dd]
  8,0    3   257683    50.063344374  4139  U   N [dd] 2
  8,0    3   257684    50.063345313  4139  D   R 9491967 + 256 [dd]
  8,0    3   257685    50.063357370  4139  U   N [dd] 2
  8,0    3   257686    50.066605011     0  C   R 9491711 + 256 [0]
  8,0    3   257687    50.066643587     0  C   R 9491967 + 256 [0]
  8,0    3   257688    50.066821310  4139  A   R 9492223 + 256 <- (8,1) 9492160
  8,0    3   257689    50.066821601  4139  Q   R 9492223 + 256 [dd]
  8,0    3   257690    50.066823605  4139  G   R 9492223 + 256 [dd]
  8,0    3   257691    50.066825063  4139  P   N [dd]



>
>
>> 
>>                 if (PageReadahead(page))
>>                         page_cache_async_readahead()
>>                 if (!PageUptodate(page))
>>                         goto page_not_up_to_date;
>>                 //...
>> page_not_up_to_date:
>>                 lock_page_killable(page);
>> 
>> Therefore explicit unplugging can help.
>> 
>> Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
>> Acked-by: Wu Fengguang <fengguang.wu@intel.com> 
>> 
>> 
>>  mm/readahead.c |   10 ++++++++++
>>  1 file changed, 10 insertions(+)
>> 
>> --- linux.orig/mm/readahead.c
>> +++ linux/mm/readahead.c
>> @@ -490,5 +490,15 @@ page_cache_async_readahead(struct addres
>>  
>>  	/* do read-ahead */
>>  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
>> +
>> +	/*
>> +	 * Normally the current page is !uptodate and lock_page() will be
>> +	 * immediately called to implicitly unplug the device. However this
>> +	 * is not always true for RAID configurations, where data does not
>> +	 * always arrive in submission order. In this case we need to
>> +	 * explicitly kick off the IO.
>> +	 */
>> +	if (PageUptodate(page))
>> +		blk_run_backing_dev(mapping->backing_dev_info, NULL);
>>  }
>>  EXPORT_SYMBOL_GPL(page_cache_async_readahead); 
>> 
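
For reference, blk_run_backing_dev() is just a thin wrapper around the
backing device's unplug hook; as of 2.6.30 it is essentially the
following inline from include/linux/backing-dev.h (a sketch from
memory, not quoted from this thread):

	static inline void blk_run_backing_dev(struct backing_dev_info *bdi,
					       struct page *page)
	{
		/* kick the queue's unplug function, forcing queued IO out */
		if (bdi && bdi->unplug_io_fn)
			bdi->unplug_io_fn(bdi, page);
	}

So the PageUptodate() test above only decides whether to kick the queue
explicitly; when the current page is !uptodate, the subsequent
lock_page_killable() performs the same unplug implicitly.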


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-06-01  1:39                                 ` Hisashi Hifumi
@ 2009-06-01  2:23                                   ` KOSAKI Motohiro
  0 siblings, 0 replies; 35+ messages in thread
From: KOSAKI Motohiro @ 2009-06-01  2:23 UTC (permalink / raw)
  To: Hisashi Hifumi
  Cc: kosaki.motohiro, Andrew Morton, Wu Fengguang, linux-kernel,
	linux-fsdevel, linux-mm

> 
> At 11:23 09/05/28, KOSAKI Motohiro wrote:
> >> Hi Andrew.
> >> Please merge following patch.
> >> Thanks.
> >> 
> >> ---
> >> 
> >> I added blk_run_backing_dev on page_cache_async_readahead
> >> so readahead I/O is unplugged to improve throughput,
> >> especially on RAID environments.
> >> 
> >> Following is the test result with dd.
> >> 
> >> #dd if=testdir/testfile of=/dev/null bs=16384
> >> 
> >> -2.6.30-rc6
> >> 1048576+0 records in
> >> 1048576+0 records out
> >> 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
> >> 
> >> -2.6.30-rc6-patched
> >> 1048576+0 records in
> >> 1048576+0 records out
> >> 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
> >> 
> >> My testing environment is as follows:
> >> Hardware: HP DL580 
> >> CPU:Xeon 3.2GHz *4 HT enabled
> >> Memory:8GB
> >> Storage: Dothill SANNet2 FC (7Disks RAID-0 Array)
> >> 
> >> The normal case is: if page N becomes uptodate at time T(N), then
> >> T(N) <= T(N+1) holds. With RAID (and NFS to some degree), there
> >> is no strict ordering: the data arrival time depends on the
> >> runtime status of the individual disks, which breaks that formula. So
> >> in do_generic_file_read(), just after submitting the async readahead IO
> >> request, the current page may well be uptodate, so the page won't be locked,
> >> and the block device won't be implicitly unplugged:
> >
> >Please attach blktrace analysis ;)
> 
> Hi, Motohiro.
> 
> I've got blktrace output both with and without the patch,
> but I could not identify the reason for the throughput improvement
> from these results.
> 
> I do not notice any difference except in the unplug behavior of dd.
> Comments?

Please make the analysis yourself.
The following is a summarized log of the IO completions.

It doesn't contain any IO completion inversion.
Why do you think this is RAID-specific behavior?

  8,0    3   177784    50.001437357     0  C   R 8717567 + 512 [0]
  8,0    3   177795    50.003815034     0  C   R 8718079 + 512 [0]
  8,0    3   177806    50.007151480     0  C   R 8718591 + 512 [0]
  8,0    3   177817    50.009885672     0  C   R 8719103 + 512 [0]
  8,0    3   177828    50.013880081     0  C   R 8719615 + 512 [0]
  8,0    3   177839    50.017545281     0  C   R 8720127 + 512 [0]
  8,0    3   177850    50.020674534     0  C   R 8720639 + 512 [0]
  8,0    3   177861    50.024269136     0  C   R 8721151 + 512 [0]
  8,0    3   177872    50.026966145     0  C   R 8721663 + 512 [0]
  8,0    3   177883    50.030406105     0  C   R 8722175 + 512 [0]
  8,0    3   177894    50.032916699     0  C   R 8722687 + 512 [0]
  8,0    3   177905    50.036715133     0  C   R 8723199 + 512 [0]
  8,0    3   177916    50.040521790     0  C   R 8723711 + 512 [0]
  8,0    3   177932    50.044383267     0  C   R 8724223 + 416 [0]
  8,0    3   177955    50.050313976     0  C   R 8724991 + 512 [0]
  8,0    3   177966    50.053380250     0  C   R 8725503 + 512 [0]
  8,0    3   177977    50.056970395     0  C   R 8726015 + 512 [0]
  8,0    3   177988    50.060326743     0  C   R 8726527 + 512 [0]
  8,0    3   177999    50.063922341     0  C   R 8727039 + 512 [0]




> 
> -2.6.30-rc6
>   8,0    3   177784    50.001437357     0  C   R 8717567 + 512 [0]
>   8,0    3   177785    50.001635405  4148  A   R 8718079 + 256 <- (8,1) 8718016
>   8,0    3   177786    50.001635675  4148  Q   R 8718079 + 256 [dd]
>   8,0    3   177787    50.001637517  4148  G   R 8718079 + 256 [dd]
>   8,0    3   177788    50.001638954  4148  P   N [dd]
>   8,0    3   177789    50.001639290  4148  I   R 8718079 + 256 [dd]
>   8,0    3   177790    50.001765339  4148  A   R 8718335 + 256 <- (8,1) 8718272
>   8,0    3   177791    50.001765699  4148  Q   R 8718335 + 256 [dd]
>   8,0    3   177792    50.001766971  4148  M   R 8718335 + 256 [dd]
>   8,0    3   177793    50.001768243  4148  U   N [dd] 1
>   8,0    3   177794    50.001769464  4148  D   R 8718079 + 512 [dd]
>   8,0    3   177795    50.003815034     0  C   R 8718079 + 512 [0]
>   8,0    3   177796    50.004008636  4148  A   R 8718591 + 256 <- (8,1) 8718528
>   8,0    3   177797    50.004008951  4148  Q   R 8718591 + 256 [dd]
>   8,0    3   177798    50.004010787  4148  G   R 8718591 + 256 [dd]
>   8,0    3   177799    50.004012089  4148  P   N [dd]
>   8,0    3   177800    50.004012641  4148  I   R 8718591 + 256 [dd]
>   8,0    3   177801    50.004139944  4148  A   R 8718847 + 256 <- (8,1) 8718784
>   8,0    3   177802    50.004140298  4148  Q   R 8718847 + 256 [dd]
>   8,0    3   177803    50.004141393  4148  M   R 8718847 + 256 [dd]
>   8,0    3   177804    50.004142815  4148  U   N [dd] 1
>   8,0    3   177805    50.004144003  4148  D   R 8718591 + 512 [dd]
>   8,0    3   177806    50.007151480     0  C   R 8718591 + 512 [0]
>   8,0    3   177807    50.007344467  4148  A   R 8719103 + 256 <- (8,1) 8719040
>   8,0    3   177808    50.007344779  4148  Q   R 8719103 + 256 [dd]
>   8,0    3   177809    50.007346636  4148  G   R 8719103 + 256 [dd]
>   8,0    3   177810    50.007347821  4148  P   N [dd]
>   8,0    3   177811    50.007348346  4148  I   R 8719103 + 256 [dd]
>   8,0    3   177812    50.007480827  4148  A   R 8719359 + 256 <- (8,1) 8719296
>   8,0    3   177813    50.007481187  4148  Q   R 8719359 + 256 [dd]
>   8,0    3   177814    50.007482669  4148  M   R 8719359 + 256 [dd]
>   8,0    3   177815    50.007483965  4148  U   N [dd] 1
>   8,0    3   177816    50.007485171  4148  D   R 8719103 + 512 [dd]
>   8,0    3   177817    50.009885672     0  C   R 8719103 + 512 [0]
>   8,0    3   177818    50.010077696  4148  A   R 8719615 + 256 <- (8,1) 8719552
>   8,0    3   177819    50.010078008  4148  Q   R 8719615 + 256 [dd]
>   8,0    3   177820    50.010079841  4148  G   R 8719615 + 256 [dd]
>   8,0    3   177821    50.010081227  4148  P   N [dd]
>   8,0    3   177822    50.010081560  4148  I   R 8719615 + 256 [dd]
>   8,0    3   177823    50.010208686  4148  A   R 8719871 + 256 <- (8,1) 8719808
>   8,0    3   177824    50.010209046  4148  Q   R 8719871 + 256 [dd]
>   8,0    3   177825    50.010210366  4148  M   R 8719871 + 256 [dd]
>   8,0    3   177826    50.010211686  4148  U   N [dd] 1
>   8,0    3   177827    50.010212916  4148  D   R 8719615 + 512 [dd]
>   8,0    3   177828    50.013880081     0  C   R 8719615 + 512 [0]
>   8,0    3   177829    50.014071235  4148  A   R 8720127 + 256 <- (8,1) 8720064
>   8,0    3   177830    50.014071544  4148  Q   R 8720127 + 256 [dd]
>   8,0    3   177831    50.014073332  4148  G   R 8720127 + 256 [dd]
>   8,0    3   177832    50.014074517  4148  P   N [dd]
>   8,0    3   177833    50.014075084  4148  I   R 8720127 + 256 [dd]
>   8,0    3   177834    50.014201763  4148  A   R 8720383 + 256 <- (8,1) 8720320
>   8,0    3   177835    50.014202123  4148  Q   R 8720383 + 256 [dd]
>   8,0    3   177836    50.014203608  4148  M   R 8720383 + 256 [dd]
>   8,0    3   177837    50.014204889  4148  U   N [dd] 1
>   8,0    3   177838    50.014206095  4148  D   R 8720127 + 512 [dd]
>   8,0    3   177839    50.017545281     0  C   R 8720127 + 512 [0]
>   8,0    3   177840    50.017741679  4148  A   R 8720639 + 256 <- (8,1) 8720576
>   8,0    3   177841    50.017742006  4148  Q   R 8720639 + 256 [dd]
>   8,0    3   177842    50.017743848  4148  G   R 8720639 + 256 [dd]
>   8,0    3   177843    50.017745318  4148  P   N [dd]
>   8,0    3   177844    50.017745672  4148  I   R 8720639 + 256 [dd]
>   8,0    3   177845    50.017876956  4148  A   R 8720895 + 256 <- (8,1) 8720832
>   8,0    3   177846    50.017877286  4148  Q   R 8720895 + 256 [dd]
>   8,0    3   177847    50.017878615  4148  M   R 8720895 + 256 [dd]
>   8,0    3   177848    50.017880082  4148  U   N [dd] 1
>   8,0    3   177849    50.017881339  4148  D   R 8720639 + 512 [dd]
>   8,0    3   177850    50.020674534     0  C   R 8720639 + 512 [0]
>   8,0    3   177851    50.020864689  4148  A   R 8721151 + 256 <- (8,1) 8721088
>   8,0    3   177852    50.020865007  4148  Q   R 8721151 + 256 [dd]
>   8,0    3   177853    50.020866900  4148  G   R 8721151 + 256 [dd]
>   8,0    3   177854    50.020868283  4148  P   N [dd]
>   8,0    3   177855    50.020868628  4148  I   R 8721151 + 256 [dd]
>   8,0    3   177856    50.020997302  4148  A   R 8721407 + 256 <- (8,1) 8721344
>   8,0    3   177857    50.020997662  4148  Q   R 8721407 + 256 [dd]
>   8,0    3   177858    50.020998976  4148  M   R 8721407 + 256 [dd]
>   8,0    3   177859    50.021000305  4148  U   N [dd] 1
>   8,0    3   177860    50.021001520  4148  D   R 8721151 + 512 [dd]
>   8,0    3   177861    50.024269136     0  C   R 8721151 + 512 [0]
>   8,0    3   177862    50.024460931  4148  A   R 8721663 + 256 <- (8,1) 8721600
>   8,0    3   177863    50.024461337  4148  Q   R 8721663 + 256 [dd]
>   8,0    3   177864    50.024463175  4148  G   R 8721663 + 256 [dd]
>   8,0    3   177865    50.024464537  4148  P   N [dd]
>   8,0    3   177866    50.024464871  4148  I   R 8721663 + 256 [dd]
>   8,0    3   177867    50.024597943  4148  A   R 8721919 + 256 <- (8,1) 8721856
>   8,0    3   177868    50.024598213  4148  Q   R 8721919 + 256 [dd]
>   8,0    3   177869    50.024599323  4148  M   R 8721919 + 256 [dd]
>   8,0    3   177870    50.024600751  4148  U   N [dd] 1
>   8,0    3   177871    50.024602104  4148  D   R 8721663 + 512 [dd]
>   8,0    3   177872    50.026966145     0  C   R 8721663 + 512 [0]
>   8,0    3   177873    50.027157245  4148  A   R 8722175 + 256 <- (8,1) 8722112
>   8,0    3   177874    50.027157563  4148  Q   R 8722175 + 256 [dd]
>   8,0    3   177875    50.027159351  4148  G   R 8722175 + 256 [dd]
>   8,0    3   177876    50.027160731  4148  P   N [dd]
>   8,0    3   177877    50.027161064  4148  I   R 8722175 + 256 [dd]
>   8,0    3   177878    50.027288745  4148  A   R 8722431 + 256 <- (8,1) 8722368
>   8,0    3   177879    50.027289105  4148  Q   R 8722431 + 256 [dd]
>   8,0    3   177880    50.027290206  4148  M   R 8722431 + 256 [dd]
>   8,0    3   177881    50.027291697  4148  U   N [dd] 1
>   8,0    3   177882    50.027293119  4148  D   R 8722175 + 512 [dd]
>   8,0    3   177883    50.030406105     0  C   R 8722175 + 512 [0]
>   8,0    3   177884    50.030600613  4148  A   R 8722687 + 256 <- (8,1) 8722624
>   8,0    3   177885    50.030601199  4148  Q   R 8722687 + 256 [dd]
>   8,0    3   177886    50.030603269  4148  G   R 8722687 + 256 [dd]
>   8,0    3   177887    50.030604463  4148  P   N [dd]
>   8,0    3   177888    50.030604799  4148  I   R 8722687 + 256 [dd]
>   8,0    3   177889    50.030731757  4148  A   R 8722943 + 256 <- (8,1) 8722880
>   8,0    3   177890    50.030732117  4148  Q   R 8722943 + 256 [dd]
>   8,0    3   177891    50.030733397  4148  M   R 8722943 + 256 [dd]
>   8,0    3   177892    50.030734882  4148  U   N [dd] 1
>   8,0    3   177893    50.030736109  4148  D   R 8722687 + 512 [dd]
>   8,0    3   177894    50.032916699     0  C   R 8722687 + 512 [0]
>   8,0    3   177895    50.033176618  4148  A   R 8723199 + 256 <- (8,1) 8723136
>   8,0    3   177896    50.033177218  4148  Q   R 8723199 + 256 [dd]
>   8,0    3   177897    50.033181433  4148  G   R 8723199 + 256 [dd]
>   8,0    3   177898    50.033184757  4148  P   N [dd]
>   8,0    3   177899    50.033185642  4148  I   R 8723199 + 256 [dd]
>   8,0    3   177900    50.033371264  4148  A   R 8723455 + 256 <- (8,1) 8723392
>   8,0    3   177901    50.033371717  4148  Q   R 8723455 + 256 [dd]
>   8,0    3   177902    50.033374015  4148  M   R 8723455 + 256 [dd]
>   8,0    3   177903    50.033376814  4148  U   N [dd] 1
>   8,0    3   177904    50.033380126  4148  D   R 8723199 + 512 [dd]
>   8,0    3   177905    50.036715133     0  C   R 8723199 + 512 [0]
>   8,0    3   177906    50.036971296  4148  A   R 8723711 + 256 <- (8,1) 8723648
>   8,0    3   177907    50.036972136  4148  Q   R 8723711 + 256 [dd]
>   8,0    3   177908    50.036975673  4148  G   R 8723711 + 256 [dd]
>   8,0    3   177909    50.036978277  4148  P   N [dd]
>   8,0    3   177910    50.036979450  4148  I   R 8723711 + 256 [dd]
>   8,0    3   177911    50.037162429  4148  A   R 8723967 + 256 <- (8,1) 8723904
>   8,0    3   177912    50.037162840  4148  Q   R 8723967 + 256 [dd]
>   8,0    3   177913    50.037164967  4148  M   R 8723967 + 256 [dd]
>   8,0    3   177914    50.037167223  4148  U   N [dd] 1
>   8,0    3   177915    50.037170001  4148  D   R 8723711 + 512 [dd]
>   8,0    3   177916    50.040521790     0  C   R 8723711 + 512 [0]
>   8,0    3   177917    50.040729738  4148  A   R 8724223 + 256 <- (8,1) 8724160
>   8,0    3   177918    50.040730200  4148  Q   R 8724223 + 256 [dd]
>   8,0    3   177919    50.040732060  4148  G   R 8724223 + 256 [dd]
>   8,0    3   177920    50.040733551  4148  P   N [dd]
>   8,0    3   177921    50.040734109  4148  I   R 8724223 + 256 [dd]
>   8,0    3   177922    50.040860173  4148  A   R 8724479 + 160 <- (8,1) 8724416
>   8,0    3   177923    50.040860536  4148  Q   R 8724479 + 160 [dd]
>   8,0    3   177924    50.040861517  4148  M   R 8724479 + 160 [dd]
>   8,0    3   177925    50.040872542  4148  A   R 1055943 + 8 <- (8,1) 1055880
>   8,0    3   177926    50.040872800  4148  Q   R 1055943 + 8 [dd]
>   8,0    3   177927    50.040874849  4148  G   R 1055943 + 8 [dd]
>   8,0    3   177928    50.040875485  4148  I   R 1055943 + 8 [dd]
>   8,0    3   177929    50.040877045  4148  U   N [dd] 2
>   8,0    3   177930    50.040878625  4148  D   R 8724223 + 416 [dd]
>   8,0    3   177931    50.040895335  4148  D   R 1055943 + 8 [dd]
>   8,0    3   177932    50.044383267     0  C   R 8724223 + 416 [0]
>   8,0    3   177933    50.044704725     0  C   R 1055943 + 8 [0]
>   8,0    3   177934    50.044749068  4148  A   R 8724639 + 96 <- (8,1) 8724576
>   8,0    3   177935    50.044749472  4148  Q   R 8724639 + 96 [dd]
>   8,0    3   177936    50.044752184  4148  G   R 8724639 + 96 [dd]
>   8,0    3   177937    50.044753552  4148  P   N [dd]
>   8,0    3   177938    50.044754032  4148  I   R 8724639 + 96 [dd]
>   8,0    3   177939    50.044896095  4148  A   R 8724735 + 256 <- (8,1) 8724672
>   8,0    3   177940    50.044896443  4148  Q   R 8724735 + 256 [dd]
>   8,0    3   177941    50.044897538  4148  M   R 8724735 + 256 [dd]
>   8,0    3   177942    50.044948546  4148  U   N [dd] 1
>   8,0    3   177943    50.044950001  4148  D   R 8724639 + 352 [dd]
>   8,0    3   177944    50.047150137     0  C   R 8724639 + 352 [0]
>   8,0    3   177945    50.047294824  4148  A   R 8724991 + 256 <- (8,1) 8724928
>   8,0    3   177946    50.047295142  4148  Q   R 8724991 + 256 [dd]
>   8,0    3   177947    50.047296978  4148  G   R 8724991 + 256 [dd]
>   8,0    3   177948    50.047298301  4148  P   N [dd]
>   8,0    3   177949    50.047298637  4148  I   R 8724991 + 256 [dd]
>   8,0    3   177950    50.047429027  4148  A   R 8725247 + 256 <- (8,1) 8725184
>   8,0    3   177951    50.047429387  4148  Q   R 8725247 + 256 [dd]
>   8,0    3   177952    50.047430479  4148  M   R 8725247 + 256 [dd]
>   8,0    3   177953    50.047431736  4148  U   N [dd] 1
>   8,0    3   177954    50.047432951  4148  D   R 8724991 + 512 [dd]
>   8,0    3   177955    50.050313976     0  C   R 8724991 + 512 [0]
>   8,0    3   177956    50.050507961  4148  A   R 8725503 + 256 <- (8,1) 8725440
>   8,0    3   177957    50.050508273  4148  Q   R 8725503 + 256 [dd]
>   8,0    3   177958    50.050510139  4148  G   R 8725503 + 256 [dd]
>   8,0    3   177959    50.050511522  4148  P   N [dd]
>   8,0    3   177960    50.050512062  4148  I   R 8725503 + 256 [dd]
>   8,0    3   177961    50.050645393  4148  A   R 8725759 + 256 <- (8,1) 8725696
>   8,0    3   177962    50.050645867  4148  Q   R 8725759 + 256 [dd]
>   8,0    3   177963    50.050647171  4148  M   R 8725759 + 256 [dd]
>   8,0    3   177964    50.050648593  4148  U   N [dd] 1
>   8,0    3   177965    50.050649985  4148  D   R 8725503 + 512 [dd]
>   8,0    3   177966    50.053380250     0  C   R 8725503 + 512 [0]
>   8,0    3   177967    50.053576324  4148  A   R 8726015 + 256 <- (8,1) 8725952
>   8,0    3   177968    50.053576615  4148  Q   R 8726015 + 256 [dd]
>   8,0    3   177969    50.053578994  4148  G   R 8726015 + 256 [dd]
>   8,0    3   177970    50.053580173  4148  P   N [dd]
>   8,0    3   177971    50.053580509  4148  I   R 8726015 + 256 [dd]
>   8,0    3   177972    50.053711503  4148  A   R 8726271 + 256 <- (8,1) 8726208
>   8,0    3   177973    50.053712001  4148  Q   R 8726271 + 256 [dd]
>   8,0    3   177974    50.053713332  4148  M   R 8726271 + 256 [dd]
>   8,0    3   177975    50.053714583  4148  U   N [dd] 1
>   8,0    3   177976    50.053715768  4148  D   R 8726015 + 512 [dd]
>   8,0    3   177977    50.056970395     0  C   R 8726015 + 512 [0]
>   8,0    3   177978    50.057161408  4148  A   R 8726527 + 256 <- (8,1) 8726464
>   8,0    3   177979    50.057161726  4148  Q   R 8726527 + 256 [dd]
>   8,0    3   177980    50.057163718  4148  G   R 8726527 + 256 [dd]
>   8,0    3   177981    50.057165098  4148  P   N [dd]
>   8,0    3   177982    50.057165431  4148  I   R 8726527 + 256 [dd]
>   8,0    3   177983    50.057294630  4148  A   R 8726783 + 256 <- (8,1) 8726720
>   8,0    3   177984    50.057294990  4148  Q   R 8726783 + 256 [dd]
>   8,0    3   177985    50.057296070  4148  M   R 8726783 + 256 [dd]
>   8,0    3   177986    50.057297402  4148  U   N [dd] 1
>   8,0    3   177987    50.057298899  4148  D   R 8726527 + 512 [dd]
>   8,0    3   177988    50.060326743     0  C   R 8726527 + 512 [0]
>   8,0    3   177989    50.060523768  4148  A   R 8727039 + 256 <- (8,1) 8726976
>   8,0    3   177990    50.060524095  4148  Q   R 8727039 + 256 [dd]
>   8,0    3   177991    50.060525910  4148  G   R 8727039 + 256 [dd]
>   8,0    3   177992    50.060527239  4148  P   N [dd]
>   8,0    3   177993    50.060527575  4148  I   R 8727039 + 256 [dd]
>   8,0    3   177994    50.060662280  4148  A   R 8727295 + 256 <- (8,1) 8727232
>   8,0    3   177995    50.060662778  4148  Q   R 8727295 + 256 [dd]
>   8,0    3   177996    50.060663993  4148  M   R 8727295 + 256 [dd]
>   8,0    3   177997    50.060665403  4148  U   N [dd] 1
>   8,0    3   177998    50.060666999  4148  D   R 8727039 + 512 [dd]
>   8,0    3   177999    50.063922341     0  C   R 8727039 + 512 [0]
>   8,0    3   178000    50.064113177  4148  A   R 8727551 + 256 <- (8,1) 8727488
>   8,0    3   178001    50.064113492  4148  Q   R 8727551 + 256 [dd]
>   8,0    3   178002    50.064115373  4148  G   R 8727551 + 256 [dd]
> 
> -2.6.30-rc6-patched
>   8,0    3   257297    50.000760847     0  C   R 9480703 + 256 [0]
>   8,0    3   257298    50.000944399  4139  A   R 9481215 + 256 <- (8,1) 9481152
>   8,0    3   257299    50.000944693  4139  Q   R 9481215 + 256 [dd]
>   8,0    3   257300    50.000946541  4139  G   R 9481215 + 256 [dd]
>   8,0    3   257301    50.000947954  4139  P   N [dd]
>   8,0    3   257302    50.000948368  4139  I   R 9481215 + 256 [dd]
>   8,0    3   257303    50.000948920  4139  U   N [dd] 2
>   8,0    3   257304    50.000950003  4139  D   R 9481215 + 256 [dd]
>   8,0    3   257305    50.000962541  4139  U   N [dd] 2
>   8,0    3   257306    50.003034240     0  C   R 9480959 + 256 [0]
>   8,0    3   257307    50.003076338     0  C   R 9481215 + 256 [0]
>   8,0    3   257308    50.003258111  4139  A   R 9481471 + 256 <- (8,1) 9481408
>   8,0    3   257309    50.003258402  4139  Q   R 9481471 + 256 [dd]
>   8,0    3   257310    50.003260190  4139  G   R 9481471 + 256 [dd]
>   8,0    3   257311    50.003261399  4139  P   N [dd]
>   8,0    3   257312    50.003261768  4139  I   R 9481471 + 256 [dd]
>   8,0    3   257313    50.003262335  4139  U   N [dd] 1
>   8,0    3   257314    50.003263406  4139  D   R 9481471 + 256 [dd]
>   8,0    3   257315    50.003430472  4139  A   R 9481727 + 256 <- (8,1) 9481664
>   8,0    3   257316    50.003430748  4139  Q   R 9481727 + 256 [dd]
>   8,0    3   257317    50.003433065  4139  G   R 9481727 + 256 [dd]
>   8,0    3   257318    50.003434343  4139  P   N [dd]
>   8,0    3   257319    50.003434658  4139  I   R 9481727 + 256 [dd]
>   8,0    3   257320    50.003435138  4139  U   N [dd] 2
>   8,0    3   257321    50.003436083  4139  D   R 9481727 + 256 [dd]
>   8,0    3   257322    50.003447795  4139  U   N [dd] 2
>   8,0    3   257323    50.004774693     0  C   R 9481471 + 256 [0]
>   8,0    3   257324    50.004959499  4139  A   R 9481983 + 256 <- (8,1) 9481920
>   8,0    3   257325    50.004959790  4139  Q   R 9481983 + 256 [dd]
>   8,0    3   257326    50.004961590  4139  G   R 9481983 + 256 [dd]
>   8,0    3   257327    50.004962793  4139  P   N [dd]
>   8,0    3   257328    50.004963153  4139  I   R 9481983 + 256 [dd]
>   8,0    3   257329    50.004964098  4139  U   N [dd] 2
>   8,0    3   257330    50.004965184  4139  D   R 9481983 + 256 [dd]
>   8,0    3   257331    50.004978967  4139  U   N [dd] 2
>   8,0    3   257332    50.006865854     0  C   R 9481727 + 256 [0]
>   8,0    3   257333    50.007052043  4139  A   R 9482239 + 256 <- (8,1) 9482176
>   8,0    3   257334    50.007052331  4139  Q   R 9482239 + 256 [dd]
>   8,0    3   257335    50.007054146  4139  G   R 9482239 + 256 [dd]
>   8,0    3   257336    50.007055355  4139  P   N [dd]
>   8,0    3   257337    50.007055724  4139  I   R 9482239 + 256 [dd]
>   8,0    3   257338    50.007056438  4139  U   N [dd] 2
>   8,0    3   257339    50.007057605  4139  D   R 9482239 + 256 [dd]
>   8,0    3   257340    50.007069963  4139  U   N [dd] 2
>   8,0    3   257341    50.008250294     0  C   R 9481983 + 256 [0]
>   8,0    3   257342    50.008431589  4139  A   R 9482495 + 256 <- (8,1) 9482432
>   8,0    3   257343    50.008431881  4139  Q   R 9482495 + 256 [dd]
>   8,0    3   257344    50.008433921  4139  G   R 9482495 + 256 [dd]
>   8,0    3   257345    50.008435097  4139  P   N [dd]
>   8,0    3   257346    50.008435466  4139  I   R 9482495 + 256 [dd]
>   8,0    3   257347    50.008436213  4139  U   N [dd] 2
>   8,0    3   257348    50.008437296  4139  D   R 9482495 + 256 [dd]
>   8,0    3   257349    50.008450034  4139  U   N [dd] 2
>   8,0    3   257350    50.010008843     0  C   R 9482239 + 256 [0]
>   8,0    3   257351    50.010135287  4139  C   R 9482495 + 256 [0]
>   8,0    3   257352    50.010226816  4139  A   R 9482751 + 256 <- (8,1) 9482688
>   8,0    3   257353    50.010227107  4139  Q   R 9482751 + 256 [dd]
>   8,0    3   257354    50.010229363  4139  G   R 9482751 + 256 [dd]
>   8,0    3   257355    50.010230728  4139  P   N [dd]
>   8,0    3   257356    50.010231097  4139  I   R 9482751 + 256 [dd]
>   8,0    3   257357    50.010231655  4139  U   N [dd] 1
>   8,0    3   257358    50.010232696  4139  D   R 9482751 + 256 [dd]
>   8,0    3   257359    50.010380946  4139  A   R 9483007 + 256 <- (8,1) 9482944
>   8,0    3   257360    50.010381264  4139  Q   R 9483007 + 256 [dd]
>   8,0    3   257361    50.010383358  4139  G   R 9483007 + 256 [dd]
>   8,0    3   257362    50.010384429  4139  P   N [dd]
>   8,0    3   257363    50.010384741  4139  I   R 9483007 + 256 [dd]
>   8,0    3   257364    50.010385395  4139  U   N [dd] 2
>   8,0    3   257365    50.010386364  4139  D   R 9483007 + 256 [dd]
>   8,0    3   257366    50.010397869  4139  U   N [dd] 2
>   8,0    3   257367    50.014210132     0  C   R 9482751 + 256 [0]
>   8,0    3   257368    50.014252938     0  C   R 9483007 + 256 [0]
>   8,0    3   257369    50.014430811  4139  A   R 9483263 + 256 <- (8,1) 9483200
>   8,0    3   257370    50.014431105  4139  Q   R 9483263 + 256 [dd]
>   8,0    3   257371    50.014433139  4139  G   R 9483263 + 256 [dd]
>   8,0    3   257372    50.014434520  4139  P   N [dd]
>   8,0    3   257373    50.014435110  4139  I   R 9483263 + 256 [dd]
>   8,0    3   257374    50.014435674  4139  U   N [dd] 1
>   8,0    3   257375    50.014436770  4139  D   R 9483263 + 256 [dd]
>   8,0    3   257376    50.014592117  4139  A   R 9483519 + 256 <- (8,1) 9483456
>   8,0    3   257377    50.014592573  4139  Q   R 9483519 + 256 [dd]
>   8,0    3   257378    50.014594391  4139  G   R 9483519 + 256 [dd]
>   8,0    3   257379    50.014595504  4139  P   N [dd]
>   8,0    3   257380    50.014595876  4139  I   R 9483519 + 256 [dd]
>   8,0    3   257381    50.014596366  4139  U   N [dd] 2
>   8,0    3   257382    50.014597368  4139  D   R 9483519 + 256 [dd]
>   8,0    3   257383    50.014609521  4139  U   N [dd] 2
>   8,0    3   257384    50.015937813     0  C   R 9483263 + 256 [0]
>   8,0    3   257385    50.016124825  4139  A   R 9483775 + 256 <- (8,1) 9483712
>   8,0    3   257386    50.016125116  4139  Q   R 9483775 + 256 [dd]
>   8,0    3   257387    50.016127162  4139  G   R 9483775 + 256 [dd]
>   8,0    3   257388    50.016128569  4139  P   N [dd]
>   8,0    3   257389    50.016128983  4139  I   R 9483775 + 256 [dd]
>   8,0    3   257390    50.016129538  4139  U   N [dd] 2
>   8,0    3   257391    50.016130627  4139  D   R 9483775 + 256 [dd]
>   8,0    3   257392    50.016143077  4139  U   N [dd] 2
>   8,0    3   257393    50.016925304     0  C   R 9483519 + 256 [0]
>   8,0    3   257394    50.017111307  4139  A   R 9484031 + 256 <- (8,1) 9483968
>   8,0    3   257395    50.017111598  4139  Q   R 9484031 + 256 [dd]
>   8,0    3   257396    50.017113410  4139  G   R 9484031 + 256 [dd]
>   8,0    3   257397    50.017114835  4139  P   N [dd]
>   8,0    3   257398    50.017115213  4139  I   R 9484031 + 256 [dd]
>   8,0    3   257399    50.017115765  4139  U   N [dd] 2
>   8,0    3   257400    50.017116839  4139  D   R 9484031 + 256 [dd]
>   8,0    3   257401    50.017129023  4139  U   N [dd] 2
>   8,0    3   257402    50.017396693     0  C   R 9483775 + 256 [0]
>   8,0    3   257403    50.017584595  4139  A   R 9484287 + 256 <- (8,1) 9484224
>   8,0    3   257404    50.017585018  4139  Q   R 9484287 + 256 [dd]
>   8,0    3   257405    50.017586866  4139  G   R 9484287 + 256 [dd]
>   8,0    3   257406    50.017587997  4139  P   N [dd]
>   8,0    3   257407    50.017588393  4139  I   R 9484287 + 256 [dd]
>   8,0    3   257408    50.017589105  4139  U   N [dd] 2
>   8,0    3   257409    50.017590173  4139  D   R 9484287 + 256 [dd]
>   8,0    3   257410    50.017602614  4139  U   N [dd] 2
>   8,0    3   257411    50.020578876     0  C   R 9484031 + 256 [0]
>   8,0    3   257412    50.020721857  4139  C   R 9484287 + 256 [0]
>   8,0    3   257413    50.020803183  4139  A   R 9484543 + 256 <- (8,1) 9484480
>   8,0    3   257414    50.020803507  4139  Q   R 9484543 + 256 [dd]
>   8,0    3   257415    50.020805256  4139  G   R 9484543 + 256 [dd]
>   8,0    3   257416    50.020806672  4139  P   N [dd]
>   8,0    3   257417    50.020807065  4139  I   R 9484543 + 256 [dd]
>   8,0    3   257418    50.020807668  4139  U   N [dd] 1
>   8,0    3   257419    50.020808733  4139  D   R 9484543 + 256 [dd]
>   8,0    3   257420    50.020957132  4139  A   R 9484799 + 256 <- (8,1) 9484736
>   8,0    3   257421    50.020957423  4139  Q   R 9484799 + 256 [dd]
>   8,0    3   257422    50.020959205  4139  G   R 9484799 + 256 [dd]
>   8,0    3   257423    50.020960276  4139  P   N [dd]
>   8,0    3   257424    50.020960594  4139  I   R 9484799 + 256 [dd]
>   8,0    3   257425    50.020961062  4139  U   N [dd] 2
>   8,0    3   257426    50.020961959  4139  D   R 9484799 + 256 [dd]
>   8,0    3   257427    50.020974191  4139  U   N [dd] 2
>   8,0    3   257428    50.023987847     0  C   R 9484543 + 256 [0]
>   8,0    3   257429    50.024093062  4139  C   R 9484799 + 256 [0]
>   8,0    3   257430    50.024207161  4139  A   R 9485055 + 256 <- (8,1) 9484992
>   8,0    3   257431    50.024207434  4139  Q   R 9485055 + 256 [dd]
>   8,0    3   257432    50.024209567  4139  G   R 9485055 + 256 [dd]
>   8,0    3   257433    50.024210728  4139  P   N [dd]
>   8,0    3   257434    50.024211097  4139  I   R 9485055 + 256 [dd]
>   8,0    3   257435    50.024211661  4139  U   N [dd] 1
>   8,0    3   257436    50.024212693  4139  D   R 9485055 + 256 [dd]
>   8,0    3   257437    50.024359266  4139  A   R 9485311 + 256 <- (8,1) 9485248
>   8,0    3   257438    50.024359584  4139  Q   R 9485311 + 256 [dd]
>   8,0    3   257439    50.024361720  4139  G   R 9485311 + 256 [dd]
>   8,0    3   257440    50.024362794  4139  P   N [dd]
>   8,0    3   257441    50.024363106  4139  I   R 9485311 + 256 [dd]
>   8,0    3   257442    50.024363760  4139  U   N [dd] 2
>   8,0    3   257443    50.024364759  4139  D   R 9485311 + 256 [dd]
>   8,0    3   257444    50.024376535  4139  U   N [dd] 2
>   8,0    3   257445    50.026532544     0  C   R 9485055 + 256 [0]
>   8,0    3   257446    50.026714236  4139  A   R 9485567 + 256 <- (8,1) 9485504
>   8,0    3   257447    50.026714524  4139  Q   R 9485567 + 256 [dd]
>   8,0    3   257448    50.026716354  4139  G   R 9485567 + 256 [dd]
>   8,0    3   257449    50.026717791  4139  P   N [dd]
>   8,0    3   257450    50.026718175  4139  I   R 9485567 + 256 [dd]
>   8,0    3   257451    50.026718778  4139  U   N [dd] 2
>   8,0    3   257452    50.026719876  4139  D   R 9485567 + 256 [dd]
>   8,0    3   257453    50.026736383  4139  U   N [dd] 2
>   8,0    3   257454    50.028531879     0  C   R 9485311 + 256 [0]
>   8,0    3   257455    50.028684347  4139  C   R 9485567 + 256 [0]
>   8,0    3   257456    50.028758787  4139  A   R 9485823 + 256 <- (8,1) 9485760
>   8,0    3   257457    50.028759069  4139  Q   R 9485823 + 256 [dd]
>   8,0    3   257458    50.028760884  4139  G   R 9485823 + 256 [dd]
>   8,0    3   257459    50.028762099  4139  P   N [dd]
>   8,0    3   257460    50.028762447  4139  I   R 9485823 + 256 [dd]
>   8,0    3   257461    50.028763038  4139  U   N [dd] 1
>   8,0    3   257462    50.028764268  4139  D   R 9485823 + 256 [dd]
>   8,0    3   257463    50.028909841  4139  A   R 9486079 + 256 <- (8,1) 9486016
>   8,0    3   257464    50.028910156  4139  Q   R 9486079 + 256 [dd]
>   8,0    3   257465    50.028911896  4139  G   R 9486079 + 256 [dd]
>   8,0    3   257466    50.028912964  4139  P   N [dd]
>   8,0    3   257467    50.028913270  4139  I   R 9486079 + 256 [dd]
>   8,0    3   257468    50.028913912  4139  U   N [dd] 2
>   8,0    3   257469    50.028914878  4139  D   R 9486079 + 256 [dd]
>   8,0    3   257470    50.028927497  4139  U   N [dd] 2
>   8,0    3   257471    50.031158357     0  C   R 9485823 + 256 [0]
>   8,0    3   257472    50.031292365  4139  C   R 9486079 + 256 [0]
>   8,0    3   257473    50.031369697  4139  A   R 9486335 + 160 <- (8,1) 9486272
>   8,0    3   257474    50.031369988  4139  Q   R 9486335 + 160 [dd]
>   8,0    3   257475    50.031371779  4139  G   R 9486335 + 160 [dd]
>   8,0    3   257476    50.031372850  4139  P   N [dd]
>   8,0    3   257477    50.031373198  4139  I   R 9486335 + 160 [dd]
>   8,0    3   257478    50.031384931  4139  A   R 1056639 + 8 <- (8,1) 1056576
>   8,0    3   257479    50.031385201  4139  Q   R 1056639 + 8 [dd]
>   8,0    3   257480    50.031388480  4139  G   R 1056639 + 8 [dd]
>   8,0    3   257481    50.031388904  4139  I   R 1056639 + 8 [dd]
>   8,0    3   257482    50.031390362  4139  U   N [dd] 2
>   8,0    3   257483    50.031391523  4139  D   R 9486335 + 160 [dd]
>   8,0    3   257484    50.031403403  4139  D   R 1056639 + 8 [dd]
>   8,0    3   257485    50.033630747     0  C   R 1056639 + 8 [0]
>   8,0    3   257486    50.033690300  4139  A   R 9486495 + 96 <- (8,1) 9486432
>   8,0    3   257487    50.033690810  4139  Q   R 9486495 + 96 [dd]
>   8,0    3   257488    50.033694581  4139  G   R 9486495 + 96 [dd]
>   8,0    3   257489    50.033696739  4139  P   N [dd]
>   8,0    3   257490    50.033697357  4139  I   R 9486495 + 96 [dd]
>   8,0    3   257491    50.033698611  4139  U   N [dd] 2
>   8,0    3   257492    50.033700945  4139  D   R 9486495 + 96 [dd]
>   8,0    3   257493    50.033727763  4139  C   R 9486335 + 160 [0]
>   8,0    3   257494    50.033996024  4139  A   R 9486591 + 256 <- (8,1) 9486528
>   8,0    3   257495    50.033996396  4139  Q   R 9486591 + 256 [dd]
>   8,0    3   257496    50.034000030  4139  G   R 9486591 + 256 [dd]
>   8,0    3   257497    50.034002268  4139  P   N [dd]
>   8,0    3   257498    50.034002820  4139  I   R 9486591 + 256 [dd]
>   8,0    3   257499    50.034003924  4139  U   N [dd] 2
>   8,0    3   257500    50.034006201  4139  D   R 9486591 + 256 [dd]
>   8,0    3   257501    50.034091438  4139  U   N [dd] 2
>   8,0    3   257502    50.034637372     0  C   R 9486495 + 96 [0]
>   8,0    3   257503    50.034841508  4139  A   R 9486847 + 256 <- (8,1) 9486784
>   8,0    3   257504    50.034842072  4139  Q   R 9486847 + 256 [dd]
>   8,0    3   257505    50.034846117  4139  G   R 9486847 + 256 [dd]
>   8,0    3   257506    50.034848676  4139  P   N [dd]
>   8,0    3   257507    50.034849384  4139  I   R 9486847 + 256 [dd]
>   8,0    3   257508    50.034850545  4139  U   N [dd] 2
>   8,0    3   257509    50.034852795  4139  D   R 9486847 + 256 [dd]
>   8,0    3   257510    50.034875503  4139  U   N [dd] 2
>   8,0    3   257511    50.035370009     0  C   R 9486591 + 256 [0]
>   8,0    3   257512    50.035622315  4139  A   R 9487103 + 256 <- (8,1) 9487040
>   8,0    3   257513    50.035622954  4139  Q   R 9487103 + 256 [dd]
>   8,0    3   257514    50.035627101  4139  G   R 9487103 + 256 [dd]
>   8,0    3   257515    50.035629510  4139  P   N [dd]
>   8,0    3   257516    50.035630143  4139  I   R 9487103 + 256 [dd]
>   8,0    3   257517    50.035631058  4139  U   N [dd] 2
>   8,0    3   257518    50.035632657  4139  D   R 9487103 + 256 [dd]
>   8,0    3   257519    50.035656358  4139  U   N [dd] 2
>   8,0    3   257520    50.036703329     0  C   R 9486847 + 256 [0]
>   8,0    3   257521    50.036963604  4139  A   R 9487359 + 256 <- (8,1) 9487296
>   8,0    3   257522    50.036964057  4139  Q   R 9487359 + 256 [dd]
>   8,0    3   257523    50.036967636  4139  G   R 9487359 + 256 [dd]
>   8,0    3   257524    50.036969710  4139  P   N [dd]
>   8,0    3   257525    50.036970586  4139  I   R 9487359 + 256 [dd]
>   8,0    3   257526    50.036971684  4139  U   N [dd] 2
>   8,0    3   257527    50.036973631  4139  D   R 9487359 + 256 [dd]
>   8,0    3   257528    50.036995034  4139  U   N [dd] 2
>   8,0    3   257529    50.038904428     0  C   R 9487103 + 256 [0]
>   8,0    3   257530    50.039161508  4139  A   R 9487615 + 256 <- (8,1) 9487552
>   8,0    3   257531    50.039161934  4139  Q   R 9487615 + 256 [dd]
>   8,0    3   257532    50.039165834  4139  G   R 9487615 + 256 [dd]
>   8,0    3   257533    50.039168561  4139  P   N [dd]
>   8,0    3   257534    50.039169353  4139  I   R 9487615 + 256 [dd]
>   8,0    3   257535    50.039170343  4139  U   N [dd] 2
>   8,0    3   257536    50.039171645  4139  D   R 9487615 + 256 [dd]
>   8,0    3   257537    50.039193195  4139  U   N [dd] 2
>   8,0    3   257538    50.040570003     0  C   R 9487359 + 256 [0]
>   8,0    3   257539    50.040842161  4139  A   R 9487871 + 256 <- (8,1) 9487808
>   8,0    3   257540    50.040842827  4139  Q   R 9487871 + 256 [dd]
>   8,0    3   257541    50.040846803  4139  G   R 9487871 + 256 [dd]
>   8,0    3   257542    50.040849902  4139  P   N [dd]
>   8,0    3   257543    50.040850715  4139  I   R 9487871 + 256 [dd]
>   8,0    3   257544    50.040851642  4139  U   N [dd] 2
>   8,0    3   257545    50.040853658  4139  D   R 9487871 + 256 [dd]
>   8,0    3   257546    50.040876270  4139  U   N [dd] 2
>   8,0    3   257547    50.042081391     0  C   R 9487615 + 256 [0]
>   8,0    3   257548    50.042215837  4139  C   R 9487871 + 256 [0]
>   8,0    3   257549    50.042316192  4139  A   R 9488127 + 256 <- (8,1) 9488064
>   8,0    3   257550    50.042316633  4139  Q   R 9488127 + 256 [dd]
>   8,0    3   257551    50.042319213  4139  G   R 9488127 + 256 [dd]
>   8,0    3   257552    50.042320803  4139  P   N [dd]
>   8,0    3   257553    50.042321412  4139  I   R 9488127 + 256 [dd]
>   8,0    3   257554    50.042322219  4139  U   N [dd] 1
>   8,0    3   257555    50.042323362  4139  D   R 9488127 + 256 [dd]
>   8,0    3   257556    50.042484350  4139  A   R 9488383 + 256 <- (8,1) 9488320
>   8,0    3   257557    50.042484602  4139  Q   R 9488383 + 256 [dd]
>   8,0    3   257558    50.042486744  4139  G   R 9488383 + 256 [dd]
>   8,0    3   257559    50.042487908  4139  P   N [dd]
>   8,0    3   257560    50.042488223  4139  I   R 9488383 + 256 [dd]
>   8,0    3   257561    50.042488754  4139  U   N [dd] 2
>   8,0    3   257562    50.042489927  4139  D   R 9488383 + 256 [dd]
>   8,0    3   257563    50.042502678  4139  U   N [dd] 2
>   8,0    3   257564    50.045166592     0  C   R 9488127 + 256 [0]
>   8,0    3   257565    50.045355163  4139  A   R 9488639 + 256 <- (8,1) 9488576
>   8,0    3   257566    50.045355493  4139  Q   R 9488639 + 256 [dd]
>   8,0    3   257567    50.045357497  4139  G   R 9488639 + 256 [dd]
>   8,0    3   257568    50.045358673  4139  P   N [dd]
>   8,0    3   257569    50.045359267  4139  I   R 9488639 + 256 [dd]
>   8,0    3   257570    50.045359831  4139  U   N [dd] 2
>   8,0    3   257571    50.045360911  4139  D   R 9488639 + 256 [dd]
>   8,0    3   257572    50.045373959  4139  U   N [dd] 2
>   8,0    3   257573    50.046450730     0  C   R 9488383 + 256 [0]
>   8,0    3   257574    50.046641639  4139  A   R 9488895 + 256 <- (8,1) 9488832
>   8,0    3   257575    50.046642086  4139  Q   R 9488895 + 256 [dd]
>   8,0    3   257576    50.046643937  4139  G   R 9488895 + 256 [dd]
>   8,0    3   257577    50.046645092  4139  P   N [dd]
>   8,0    3   257578    50.046645527  4139  I   R 9488895 + 256 [dd]
>   8,0    3   257579    50.046646244  4139  U   N [dd] 2
>   8,0    3   257580    50.046647327  4139  D   R 9488895 + 256 [dd]
>   8,0    3   257581    50.046660234  4139  U   N [dd] 2
>   8,0    3   257582    50.047826305     0  C   R 9488639 + 256 [0]
>   8,0    3   257583    50.048011468  4139  A   R 9489151 + 256 <- (8,1) 9489088
>   8,0    3   257584    50.048011762  4139  Q   R 9489151 + 256 [dd]
>   8,0    3   257585    50.048013793  4139  G   R 9489151 + 256 [dd]
>   8,0    3   257586    50.048014966  4139  P   N [dd]
>   8,0    3   257587    50.048015380  4139  I   R 9489151 + 256 [dd]
>   8,0    3   257588    50.048016112  4139  U   N [dd] 2
>   8,0    3   257589    50.048017202  4139  D   R 9489151 + 256 [dd]
>   8,0    3   257590    50.048029553  4139  U   N [dd] 2
>   8,0    3   257591    50.049319830     0  C   R 9488895 + 256 [0]
>   8,0    3   257592    50.049446089  4139  C   R 9489151 + 256 [0]
>   8,0    3   257593    50.049545199  4139  A   R 9489407 + 256 <- (8,1) 9489344
>   8,0    3   257594    50.049545628  4139  Q   R 9489407 + 256 [dd]
>   8,0    3   257595    50.049547512  4139  G   R 9489407 + 256 [dd]
>   8,0    3   257596    50.049548886  4139  P   N [dd]
>   8,0    3   257597    50.049549318  4139  I   R 9489407 + 256 [dd]
>   8,0    3   257598    50.049550047  4139  U   N [dd] 1
>   8,0    3   257599    50.049551241  4139  D   R 9489407 + 256 [dd]
>   8,0    3   257600    50.049699283  4139  A   R 9489663 + 256 <- (8,1) 9489600
>   8,0    3   257601    50.049699556  4139  Q   R 9489663 + 256 [dd]
>   8,0    3   257602    50.049701266  4139  G   R 9489663 + 256 [dd]
>   8,0    3   257603    50.049702310  4139  P   N [dd]
>   8,0    3   257604    50.049702656  4139  I   R 9489663 + 256 [dd]
>   8,0    3   257605    50.049703118  4139  U   N [dd] 2
>   8,0    3   257606    50.049704020  4139  D   R 9489663 + 256 [dd]
>   8,0    3   257607    50.049715940  4139  U   N [dd] 2
>   8,0    3   257608    50.052662150     0  C   R 9489407 + 256 [0]
>   8,0    3   257609    50.052853688  4139  A   R 9489919 + 256 <- (8,1) 9489856
>   8,0    3   257610    50.052853985  4139  Q   R 9489919 + 256 [dd]
>   8,0    3   257611    50.052855869  4139  G   R 9489919 + 256 [dd]
>   8,0    3   257612    50.052857057  4139  P   N [dd]
>   8,0    3   257613    50.052857423  4139  I   R 9489919 + 256 [dd]
>   8,0    3   257614    50.052858065  4139  U   N [dd] 2
>   8,0    3   257615    50.052859164  4139  D   R 9489919 + 256 [dd]
>   8,0    3   257616    50.052871806  4139  U   N [dd] 2
>   8,0    3   257617    50.053470795     0  C   R 9489663 + 256 [0]
>   8,0    3   257618    50.053661719  4139  A   R 9490175 + 256 <- (8,1) 9490112
>   8,0    3   257619    50.053662097  4139  Q   R 9490175 + 256 [dd]
>   8,0    3   257620    50.053663891  4139  G   R 9490175 + 256 [dd]
>   8,0    3   257621    50.053665034  4139  P   N [dd]
>   8,0    3   257622    50.053665436  4139  I   R 9490175 + 256 [dd]
>   8,0    3   257623    50.053665982  4139  U   N [dd] 2
>   8,0    3   257624    50.053667077  4139  D   R 9490175 + 256 [dd]
>   8,0    3   257625    50.053679732  4139  U   N [dd] 2
>   8,0    3   257626    50.055776383     0  C   R 9489919 + 256 [0]
>   8,0    3   257627    50.055915017  4139  C   R 9490175 + 256 [0]
>   8,0    3   257628    50.055997812  4139  A   R 9490431 + 256 <- (8,1) 9490368
>   8,0    3   257629    50.055998085  4139  Q   R 9490431 + 256 [dd]
>   8,0    3   257630    50.055999867  4139  G   R 9490431 + 256 [dd]
>   8,0    3   257631    50.056001049  4139  P   N [dd]
>   8,0    3   257632    50.056001451  4139  I   R 9490431 + 256 [dd]
>   8,0    3   257633    50.056002189  4139  U   N [dd] 1
>   8,0    3   257634    50.056003197  4139  D   R 9490431 + 256 [dd]
>   8,0    3   257635    50.056149977  4139  A   R 9490687 + 256 <- (8,1) 9490624
>   8,0    3   257636    50.056150279  4139  Q   R 9490687 + 256 [dd]
>   8,0    3   257637    50.056152047  4139  G   R 9490687 + 256 [dd]
>   8,0    3   257638    50.056153109  4139  P   N [dd]
>   8,0    3   257639    50.056153442  4139  I   R 9490687 + 256 [dd]
>   8,0    3   257640    50.056153904  4139  U   N [dd] 2
>   8,0    3   257641    50.056154852  4139  D   R 9490687 + 256 [dd]
>   8,0    3   257642    50.056166948  4139  U   N [dd] 2
>   8,0    3   257643    50.057600660     0  C   R 9490431 + 256 [0]
>   8,0    3   257644    50.057786753  4139  A   R 9490943 + 256 <- (8,1) 9490880
>   8,0    3   257645    50.057787050  4139  Q   R 9490943 + 256 [dd]
>   8,0    3   257646    50.057788865  4139  G   R 9490943 + 256 [dd]
>   8,0    3   257647    50.057790236  4139  P   N [dd]
>   8,0    3   257648    50.057790614  4139  I   R 9490943 + 256 [dd]
>   8,0    3   257649    50.057791169  4139  U   N [dd] 2
>   8,0    3   257650    50.057792246  4139  D   R 9490943 + 256 [dd]
>   8,0    3   257651    50.057804469  4139  U   N [dd] 2
>   8,0    3   257652    50.060322995     0  C   R 9490687 + 256 [0]
>   8,0    3   257653    50.060464005  4139  C   R 9490943 + 256 [0]
>   8,0    3   257654    50.060548216  4139  A   R 9491199 + 256 <- (8,1) 9491136
>   8,0    3   257655    50.060548696  4139  Q   R 9491199 + 256 [dd]
>   8,0    3   257656    50.060550922  4139  G   R 9491199 + 256 [dd]
>   8,0    3   257657    50.060552096  4139  P   N [dd]
>   8,0    3   257658    50.060552531  4139  I   R 9491199 + 256 [dd]
>   8,0    3   257659    50.060553101  4139  U   N [dd] 1
>   8,0    3   257660    50.060554100  4139  D   R 9491199 + 256 [dd]
>   8,0    3   257661    50.060701569  4139  A   R 9491455 + 256 <- (8,1) 9491392
>   8,0    3   257662    50.060701890  4139  Q   R 9491455 + 256 [dd]
>   8,0    3   257663    50.060703993  4139  G   R 9491455 + 256 [dd]
>   8,0    3   257664    50.060705070  4139  P   N [dd]
>   8,0    3   257665    50.060705385  4139  I   R 9491455 + 256 [dd]
>   8,0    3   257666    50.060706012  4139  U   N [dd] 2
>   8,0    3   257667    50.060706987  4139  D   R 9491455 + 256 [dd]
>   8,0    3   257668    50.060718784  4139  U   N [dd] 2
>   8,0    3   257669    50.062964966     0  C   R 9491199 + 256 [0]
>   8,0    3   257670    50.063102772  4139  C   R 9491455 + 256 [0]
>   8,0    3   257671    50.063182666  4139  A   R 9491711 + 256 <- (8,1) 9491648
>   8,0    3   257672    50.063182939  4139  Q   R 9491711 + 256 [dd]
>   8,0    3   257673    50.063184889  4139  G   R 9491711 + 256 [dd]
>   8,0    3   257674    50.063186074  4139  P   N [dd]
>   8,0    3   257675    50.063186440  4139  I   R 9491711 + 256 [dd]
>   8,0    3   257676    50.063187271  4139  U   N [dd] 1
>   8,0    3   257677    50.063188312  4139  D   R 9491711 + 256 [dd]
>   8,0    3   257678    50.063340467  4139  A   R 9491967 + 256 <- (8,1) 9491904
>   8,0    3   257679    50.063340749  4139  Q   R 9491967 + 256 [dd]
>   8,0    3   257680    50.063342529  4139  G   R 9491967 + 256 [dd]
>   8,0    3   257681    50.063343597  4139  P   N [dd]
>   8,0    3   257682    50.063343915  4139  I   R 9491967 + 256 [dd]
>   8,0    3   257683    50.063344374  4139  U   N [dd] 2
>   8,0    3   257684    50.063345313  4139  D   R 9491967 + 256 [dd]
>   8,0    3   257685    50.063357370  4139  U   N [dd] 2
>   8,0    3   257686    50.066605011     0  C   R 9491711 + 256 [0]
>   8,0    3   257687    50.066643587     0  C   R 9491967 + 256 [0]
>   8,0    3   257688    50.066821310  4139  A   R 9492223 + 256 <- (8,1) 9492160
>   8,0    3   257689    50.066821601  4139  Q   R 9492223 + 256 [dd]
>   8,0    3   257690    50.066823605  4139  G   R 9492223 + 256 [dd]
>   8,0    3   257691    50.066825063  4139  P   N [dd]
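
(For reference, a live trace like the one above can be captured with
blktrace piped straight into blkparse; the device name here is only an
example, substitute the block device under test:

        blktrace -d /dev/sda -o - | blkparse -i -

The action letters are A=remap, Q=queued, G=get request, P=plug,
I=inserted, U=unplug, D=issued to driver, C=completed.)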
> 
> 
> 
> >
> >
> >> 
> >>                 if (PageReadahead(page))
> >>                         page_cache_async_readahead();
> >>                 if (!PageUptodate(page))
> >>                         goto page_not_up_to_date;
> >>                 /* ... */
> >> page_not_up_to_date:
> >>                 lock_page_killable(page);
> >> 
> >> Therefore explicit unplugging can help.
> >> 
> >> Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
> >> Acked-by: Wu Fengguang <fengguang.wu@intel.com> 
> >> 
> >> 
> >>  mm/readahead.c |   10 ++++++++++
> >>  1 file changed, 10 insertions(+)
> >> 
> >> --- linux.orig/mm/readahead.c
> >> +++ linux/mm/readahead.c
> >> @@ -490,5 +490,15 @@ page_cache_async_readahead(struct addres
> >>  
> >>  	/* do read-ahead */
> >>  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
> >> +
> >> +	/*
> >> +	 * Normally the current page is !uptodate and lock_page() will be
> >> +	 * immediately called to implicitly unplug the device. However this
> >> +	 * is not always true for RAID configurations, where data does not
> >> +	 * arrive strictly in submission order. In this case we need to
> >> +	 * explicitly kick off the IO.
> >> +	 */
> >> +	if (PageUptodate(page))
> >> +		blk_run_backing_dev(mapping->backing_dev_info, NULL);
> >>  }
> >>  EXPORT_SYMBOL_GPL(page_cache_async_readahead); 
> >> 
> 
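
A rough way to observe the stall the explicit unplug above is meant to
avoid -- the device going idle while a submitted readahead window waits
to be kicked -- is to watch queue depth and utilization while the
sequential read runs (again, the device name is just an example):

        iostat -x 1 /dev/sda

If avgqu-sz repeatedly drops to 0 and %util falls well below 100% during
a purely sequential read, the readahead pipeline is not keeping the disk
busy.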




^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-05-27  3:06                           ` Hisashi Hifumi
  2009-05-27  3:26                             ` KOSAKI Motohiro
@ 2009-06-01  2:37                             ` Wu Fengguang
  2009-06-01  2:51                               ` Hisashi Hifumi
  1 sibling, 1 reply; 35+ messages in thread
From: Wu Fengguang @ 2009-06-01  2:37 UTC (permalink / raw)
  To: Hisashi Hifumi
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, kosaki.motohiro,
	linux-mm, jens.axboe

On Wed, May 27, 2009 at 11:06:37AM +0800, Hisashi Hifumi wrote:
> 
> At 11:57 09/05/27, Wu Fengguang wrote:
> >On Wed, May 27, 2009 at 10:47:47AM +0800, Hisashi Hifumi wrote:
> >> 
> >> At 11:36 09/05/27, Wu Fengguang wrote:
> >> >On Wed, May 27, 2009 at 10:21:53AM +0800, Hisashi Hifumi wrote:
> >> >>
> >> >> At 11:09 09/05/27, Wu Fengguang wrote:
> >> >> >On Wed, May 27, 2009 at 08:25:04AM +0800, Hisashi Hifumi wrote:
> >> >> >>
> >> >> >> At 08:42 09/05/27, Andrew Morton wrote:
> >> >> >> >On Fri, 22 May 2009 10:33:23 +0800
> >> >> >> >Wu Fengguang <fengguang.wu@intel.com> wrote:
> >> >> >> >
> >> >> >> >> > I tested above patch, and I got same performance number.
> >> >> >> >> > I wonder why if (PageUptodate(page)) check is there...
> >> >> >> >>
> >> >> >> >> Thanks!  This is an interesting micro timing behavior that
> >> >> >> >> demands some research work.  The above check is to confirm if it's
> >> >> >> >> the PageUptodate() case that makes the difference. So why that case
> >> >> >> >> happens so frequently so as to impact the performance? Will it also
> >> >> >> >> happen in NFS?
> >> >> >> >>
> >> >> >> >> The problem is readahead IO pipeline is not running smoothly, which is
> >> >> >> >> undesirable and not well understood for now.
> >> >> >> >
> >> >> >> >The patch causes a remarkably large performance increase.  A 9%
> >> >> >> >reduction in time for a linear read? I'd be surprised if the workload
> >> >> >>
> >> >> >> Hi Andrew.
> >> >> >> Yes, I tested this with dd.
> >> >> >>
> >> >> >> >even consumed 9% of a CPU, so where on earth has the kernel gone to?
> >> >> >> >
> >> >> >> >Have you been able to reproduce this in your testing?
> >> >> >>
> >> >> >> Yes, this test on my environment is reproducible.
> >> >> >
> >> >> >Hisashi, does your environment have some special configurations?
> >> >>
> >> >> Hi.
> >> >> My testing environment is as follows:
> >> >> Hardware: HP DL580
> >> >> CPU:Xeon 3.2GHz *4 HT enabled
> >> >> Memory:8GB
> >> >> Storage: Dothill SANNet2 FC (7Disks RAID-0 Array)
> >> >
> >> >This is a big hardware RAID. What's the readahead size?
> >> >
> >> >The numbers look too small for a 7 disk RAID:
> >> >
> >> >        > #dd if=testdir/testfile of=/dev/null bs=16384
> >> >        >
> >> >        > -2.6.30-rc6
> >> >        > 1048576+0 records in
> >> >        > 1048576+0 records out
> >> >        > 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
> >> >        >
> >> >        > -2.6.30-rc6-patched
> >> >        > 1048576+0 records in
> >> >        > 1048576+0 records out
> >> >        > 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
> >> >
> >> >I'd suggest you to configure the array properly before coming back to
> >> >measuring the impact of this patch.
> >> 
> >> 
> >> I created a 16GB file on this disk array, mounted it at testdir,
> >> and ran dd against that directory.
> >
> >I mean, you should get >300MB/s throughput with 7 disks, and you
> >should seek ways to achieve that before testing out this patch :-)
> 
> >> Throughput numbers of storage arrays vary from one product to another.
> >> On my hardware I think this number is valid and my patch is effective.

What's your readahead size? Is it large enough to cover the stripe width?

Thanks,
Fengguang


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-06-01  2:37                             ` Wu Fengguang
@ 2009-06-01  2:51                               ` Hisashi Hifumi
  2009-06-01  3:02                                 ` Wu Fengguang
  0 siblings, 1 reply; 35+ messages in thread
From: Hisashi Hifumi @ 2009-06-01  2:51 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, kosaki.motohiro,
	linux-mm, jens.axboe


At 11:37 09/06/01, Wu Fengguang wrote:
>[...]
>What's your readahead size? Is it large enough to cover the stripe width?

Do you mean strage's readahead size?


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-06-01  2:51                               ` Hisashi Hifumi
@ 2009-06-01  3:02                                 ` Wu Fengguang
  2009-06-01  3:06                                   ` KOSAKI Motohiro
  2009-06-01  3:07                                   ` Hisashi Hifumi
  0 siblings, 2 replies; 35+ messages in thread
From: Wu Fengguang @ 2009-06-01  3:02 UTC (permalink / raw)
  To: Hisashi Hifumi
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, kosaki.motohiro,
	linux-mm, jens.axboe

On Mon, Jun 01, 2009 at 10:51:56AM +0800, Hisashi Hifumi wrote:
> 
> At 11:37 09/06/01, Wu Fengguang wrote:
> >[...]
> >What's your readahead size? Is it large enough to cover the stripe width?
> 
> Do you mean strage's readahead size?

What's strage? I mean if your RAID's block device file is /dev/sda, then

        blockdev --getra /dev/sda

will tell its readahead size in unit of 512 bytes.
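
As a back-of-the-envelope sketch -- the 64KB chunk size here is only an
assumed figure, the real value depends on how the array was built:

        blockdev --getra /dev/sda   # readahead window, in 512-byte sectors
        # a 7-disk RAID-0 with 64KB chunks has a 7 * 64KB = 448KB stripe,
        # which needs at least 448KB / 512B = 896 sectors of readahead
        # for a single window to span the full stripe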

Thanks,
Fengguang


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-06-01  3:02                                 ` Wu Fengguang
@ 2009-06-01  3:06                                   ` KOSAKI Motohiro
  2009-06-01  3:07                                   ` Hisashi Hifumi
  1 sibling, 0 replies; 35+ messages in thread
From: KOSAKI Motohiro @ 2009-06-01  3:06 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: kosaki.motohiro, Hisashi Hifumi, Andrew Morton, linux-kernel,
	linux-fsdevel, linux-mm, jens.axboe

> > >> >I mean, you should get >300MB/s throughput with 7 disks, and you
> > >> >should seek ways to achieve that before testing out this patch :-)
> > >> 
> > >> Throughput number of storage array is very from one product to another.
> > >> On my hardware environment I think this number is valid and
> > >> my patch is effective.
> > >
> > >What's your readahead size? Is it large enough to cover the stripe width?
> > 
> > Do you mean strage's readahead size?
> 
> What's strage? I mean if your RAID's block device file is /dev/sda, then

I guess it's a typo :-)
but I recommend he use a sane test environment...


> 
>         blockdev --getra /dev/sda
> 
> will tell its readahead size in unit of 512 bytes.
> 
> Thanks,
> Fengguang
> 




^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-06-01  3:02                                 ` Wu Fengguang
  2009-06-01  3:06                                   ` KOSAKI Motohiro
@ 2009-06-01  3:07                                   ` Hisashi Hifumi
  2009-06-01  4:30                                     ` Wu Fengguang
  1 sibling, 1 reply; 35+ messages in thread
From: Hisashi Hifumi @ 2009-06-01  3:07 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, kosaki.motohiro,
	linux-mm, jens.axboe


At 12:02 09/06/01, Wu Fengguang wrote:
>On Mon, Jun 01, 2009 at 10:51:56AM +0800, Hisashi Hifumi wrote:
>>[...]
>> Do you mean strage's readahead size?
>
>What's strage? I mean if your RAID's block device file is /dev/sda, then
>
>        blockdev --getra /dev/sda
>
>will tell its readahead size in unit of 512 bytes.

256 sectors.



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH] readahead:add blk_run_backing_dev
  2009-06-01  3:07                                   ` Hisashi Hifumi
@ 2009-06-01  4:30                                     ` Wu Fengguang
  0 siblings, 0 replies; 35+ messages in thread
From: Wu Fengguang @ 2009-06-01  4:30 UTC (permalink / raw)
  To: Hisashi Hifumi
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, kosaki.motohiro,
	linux-mm, jens.axboe

On Mon, Jun 01, 2009 at 11:07:42AM +0800, Hisashi Hifumi wrote:
> 
> At 12:02 09/06/01, Wu Fengguang wrote:
> >On Mon, Jun 01, 2009 at 10:51:56AM +0800, Hisashi Hifumi wrote:
> >>[...]
> >> Do you mean strage's readahead size?
> >
> >What's strage? I mean if your RAID's block device file is /dev/sda, then
> >
> >        blockdev --getra /dev/sda
> >
> >will tell its readahead size in unit of 512 bytes.
> 
> 256 sectors.

That's too small! Try this:

        blockdev --setra 8192 /dev/sda
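
8192 sectors is 8192 * 512B = 4MB, which spans the 448KB stripe assumed
above several times over. One way to verify and re-test (same example
device name):

        blockdev --setra 8192 /dev/sda
        blockdev --getra /dev/sda                # should now report 8192
        dd if=testdir/testfile of=/dev/null bs=16384

The setting does not survive a reboot, so it normally belongs in a
boot-time script.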

Thanks,
Fengguang


^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread

Thread overview: 35+ messages
-- links below jump to the message on this page --
     [not found] <6.0.0.20.2.20090518183752.0581fdc0@172.19.0.2>
2009-05-20  1:07 ` [PATCH] readahead:add blk_run_backing_dev KOSAKI Motohiro
2009-05-20  1:43   ` Hisashi Hifumi
2009-05-20  2:52     ` Wu Fengguang
     [not found] ` <20090518175259.GL4140@kernel.dk>
2009-05-20  2:51   ` Wu Fengguang
2009-05-21  6:01     ` Hisashi Hifumi
2009-05-22  1:05       ` Wu Fengguang
2009-05-22  1:44         ` Hisashi Hifumi
2009-05-22  2:33           ` Wu Fengguang
2009-05-26 23:42             ` Andrew Morton
2009-05-27  0:25               ` Hisashi Hifumi
2009-05-27  2:09                 ` Wu Fengguang
2009-05-27  2:21                   ` Hisashi Hifumi
2009-05-27  2:35                     ` KOSAKI Motohiro
2009-05-27  2:36                     ` Andrew Morton
2009-05-27  2:38                       ` Hisashi Hifumi
2009-05-27  3:55                       ` Wu Fengguang
2009-05-27  4:06                         ` KOSAKI Motohiro
2009-05-27  4:36                           ` Wu Fengguang
2009-05-27  6:20                             ` Hisashi Hifumi
2009-05-28  1:20                             ` Hisashi Hifumi
2009-05-28  2:23                               ` KOSAKI Motohiro
2009-06-01  1:39                                 ` Hisashi Hifumi
2009-06-01  2:23                                   ` KOSAKI Motohiro
2009-05-27  2:36                     ` Wu Fengguang
2009-05-27  2:47                       ` Hisashi Hifumi
2009-05-27  2:57                         ` Wu Fengguang
2009-05-27  3:06                           ` Hisashi Hifumi
2009-05-27  3:26                             ` KOSAKI Motohiro
2009-06-01  2:37                             ` Wu Fengguang
2009-06-01  2:51                               ` Hisashi Hifumi
2009-06-01  3:02                                 ` Wu Fengguang
2009-06-01  3:06                                   ` KOSAKI Motohiro
2009-06-01  3:07                                   ` Hisashi Hifumi
2009-06-01  4:30                                     ` Wu Fengguang
2009-05-27  2:07               ` Wu Fengguang
