* ext3 fsync being starved for a long time by cp and cronjob
@ 2006-08-25 11:53 Andi Kleen
2006-08-25 12:07 ` Jens Axboe
0 siblings, 1 reply; 9+ messages in thread
From: Andi Kleen @ 2006-08-25 11:53 UTC (permalink / raw)
To: axboe, akpm, linux-mm, ext2-devel
My vim has now been stalled in an fsync for over a minute
(several minutes overall):
vi D ffff810077879d98 0 13905 13900 (NOTLB)
ffff810077879d98 ffffffff804d1c4e 000000000000008f ffff810009256240
ffff81007be8e080 ffff810009256418 0000000000000001 0000000000000246
0000000000000003 0000000000000000 000000008022284e ffff81007bd02024
Call Trace:
[<ffffffff804d1c4e>] thread_return+0x0/0xd3
[<ffffffff802db658>] log_wait_commit+0xa3/0xf5
[<ffffffff8023b05c>] autoremove_wake_function+0x0/0x2e
[<ffffffff802d4cee>] journal_stop+0x1d2/0x202
[<ffffffff80284f13>] __writeback_single_inode+0x1ec/0x372
[<ffffffff8023b05c>] autoremove_wake_function+0x0/0x2e
[<ffffffff802850ba>] sync_inode+0x21/0x30
[<ffffffff802c5bd9>] ext3_sync_file+0xb1/0xc4
[<ffffffff8026763b>] do_fsync+0x4f/0x85
[<ffffffff80267694>] __do_fsync+0x23/0x36
[<ffffffff802094ee>] system_call+0x7e/0x83
Background load is a large cp from the same fs to a tmpfs, plus a cron job
doing random cron job stuff. All on a single SATA disk with a 28G partition.
While I write this, other windows keep stalling too, such as my
mailer, and I have to wait to continue. I'm not sure whether it did an
fsync or not.
Kernel is 2.6.18-rc3. Elevator is CFQ2.
Is such long starvation expected? Will ext4 fix that?
cp D ffff81003f041bd8 0 13873 13872 (NOTLB)
ffff81003f041bd8 ffff81005d6937c0 0000000000002578 ffff8100186c89e0
ffff81007be8e080 ffff8100186c8bb8 ffff8100551ff710 ffff81003f041cb8
ffffffff802c6f2d 0000000000000000 0000000000000046 ffff81007b2f9968
Call Trace:
[<ffffffff802c6f2d>] __ext3_get_inode_loc+0x156/0x317
[<ffffffff802485a2>] sync_page+0x0/0x41
[<ffffffff804d1d47>] io_schedule+0x26/0x32
[<ffffffff80433dc1>] dm_unplug_all+0x0/0x28
[<ffffffff802485de>] sync_page+0x3c/0x41
[<ffffffff804d255a>] __wait_on_bit_lock+0x37/0x64
[<ffffffff802486b2>] __lock_page+0x5e/0x64
[<ffffffff8023b16c>] wake_bit_function+0x0/0x23
[<ffffffff80249cff>] do_generic_mapping_read+0x1c6/0x3f4
[<ffffffff80248b20>] file_read_actor+0x0/0xfe
[<ffffffff8024a67d>] __generic_file_aio_read+0x14e/0x19b
[<ffffffff8024a86b>] generic_file_aio_read+0x34/0x39
[<ffffffff8026625a>] do_sync_read+0xc7/0x104
[<ffffffff80272edf>] may_open+0x59/0x1bf
[<ffffffff8023b05c>] autoremove_wake_function+0x0/0x2e
[<ffffffff802664b2>] vfs_read+0xa8/0x14d
[<ffffffff80266e78>] sys_read+0x45/0x6e
[<ffffffff802094ee>] system_call+0x7e/0x83
kjournald S ffff81007aa3be98 0 1369 11 7279 910 (L-TLB)
ffff81007aa3be98 0000000000000fb4 0000000000000510 ffff81007b330ae0
ffff81007be5b5e0 ffff81007b330cb8 0000000000000001 0000000000000246
0000000000000003 ffff81007aa3be98 ffffffff8022284e 0000000000000000
Call Trace:
[<ffffffff8022284e>] __wake_up+0x36/0x4d
[<ffffffff802da225>] kjournald+0x192/0x213
[<ffffffff8023b05c>] autoremove_wake_function+0x0/0x2e
[<ffffffff8023acfc>] keventd_create_kthread+0x0/0x5e
[<ffffffff802da093>] kjournald+0x0/0x213
[<ffffffff8023acfc>] keventd_create_kthread+0x0/0x5e
[<ffffffff8023aef7>] kthread+0xcb/0xf5
[<ffffffff8020a3d6>] child_rip+0x8/0x12
[<ffffffff8023acfc>] keventd_create_kthread+0x0/0x5e
[<ffffffff8023ae2c>] kthread+0x0/0xf5
[<ffffffff8020a3ce>] child_rip+0x0/0x12
-Andi
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: ext3 fsync being starved for a long time by cp and cronjob
2006-08-25 11:53 ext3 fsync being starved for a long time by cp and cronjob Andi Kleen
@ 2006-08-25 12:07 ` Jens Axboe
2006-08-25 12:22 ` Andi Kleen
0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2006-08-25 12:07 UTC (permalink / raw)
To: Andi Kleen; +Cc: akpm, linux-mm, ext2-devel
On Fri, Aug 25 2006, Andi Kleen wrote:
> My vim is right now sitting for over a minute being stalled in a fsync
> (it was several minutes overall):
>
> vi D ffff810077879d98 0 13905 13900 (NOTLB)
> ffff810077879d98 ffffffff804d1c4e 000000000000008f ffff810009256240
> ffff81007be8e080 ffff810009256418 0000000000000001 0000000000000246
> 0000000000000003 0000000000000000 000000008022284e ffff81007bd02024
> Call Trace:
> [<ffffffff804d1c4e>] thread_return+0x0/0xd3
> [<ffffffff802db658>] log_wait_commit+0xa3/0xf5
> [<ffffffff8023b05c>] autoremove_wake_function+0x0/0x2e
> [<ffffffff802d4cee>] journal_stop+0x1d2/0x202
> [<ffffffff80284f13>] __writeback_single_inode+0x1ec/0x372
> [<ffffffff8023b05c>] autoremove_wake_function+0x0/0x2e
> [<ffffffff802850ba>] sync_inode+0x21/0x30
> [<ffffffff802c5bd9>] ext3_sync_file+0xb1/0xc4
> [<ffffffff8026763b>] do_fsync+0x4f/0x85
> [<ffffffff80267694>] __do_fsync+0x23/0x36
> [<ffffffff802094ee>] system_call+0x7e/0x83
>
> Background load is a large cp from the same fs to a tmpfs and a cron job
> doing random cron job stuff. All on a single sata disk with a 28G partition.
>
> While I write this other windows keep stalling too, like my
> mailer and I have to wait to continue. I'm not sure it did fsync or not.
The problem with fsync() is that it's disconnected from the previously
submitted IO (which was async). The fsync() really wants to say "the IO
I'm submitting now and submitted previously is now sync", but we don't
do that well enough. A stall of more than a minute is pretty nasty,
though. I'm not quite sure what the best way to fix this would be, but
it's certainly on my TODO list.
Does deadline do better?
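
For anyone wanting to try that comparison, here is a minimal sketch (not
part of the original mail) that reports which elevator a disk is
currently using. It assumes the usual sysfs file
/sys/block/<dev>/queue/scheduler, where the active scheduler is shown
in square brackets; the device name "sda" and both function names are
only illustrative.

```python
def active_scheduler(sysfs_text):
    """Return the bracketed (active) scheduler name, or None if absent.

    Example input: "noop anticipatory deadline [cfq]"
    """
    for name in sysfs_text.split():
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    return None


def read_scheduler(device="sda"):
    """Read the active elevator for a block device (device name is assumed)."""
    path = f"/sys/block/{device}/queue/scheduler"
    with open(path) as f:
        return active_scheduler(f.read())
```

Switching is done by writing the desired name (for example "deadline")
into the same file as root. On a device-mapper setup the scheduler
belongs to the underlying disk, not to the dm device.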
--
Jens Axboe
* Re: ext3 fsync being starved for a long time by cp and cronjob
2006-08-25 12:07 ` Jens Axboe
@ 2006-08-25 12:22 ` Andi Kleen
2006-08-25 12:26 ` Jens Axboe
0 siblings, 1 reply; 9+ messages in thread
From: Andi Kleen @ 2006-08-25 12:22 UTC (permalink / raw)
To: Jens Axboe; +Cc: akpm, linux-mm, ext2-devel
> Does deadline do better?
It's not really a repeatable workload. It's just my workstation, which
got into this unpleasant state while I was trying to get work done.
I can change it to deadline and see whether it happens again, but it
might take some time.
-Andi
* Re: ext3 fsync being starved for a long time by cp and cronjob
2006-08-25 12:22 ` Andi Kleen
@ 2006-08-25 12:26 ` Jens Axboe
2006-08-25 12:30 ` Andi Kleen
0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2006-08-25 12:26 UTC (permalink / raw)
To: Andi Kleen; +Cc: akpm, linux-mm, ext2-devel
On Fri, Aug 25 2006, Andi Kleen wrote:
>
> > Does deadline do better?
>
> It's not really repeatable workload. It's just my workstation which
> got into this unpleasant state while me trying to get work done.
>
> I can change it to deadline and see if I see this still again, but it might
> take some time.
Yeah, a test case might be simpler to write and test with. I'll see if I
can come up with something.
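
A rough sketch of what such a test case could look like (an
illustration added here, not the test that was actually written): it
appends a small block to a file, calls fsync(), and records how long
the fsync() took. The function names, the 4 KiB payload, and the
one-second pause are assumptions; the background load (for example a
large cp from the same filesystem) has to be started separately.

```python
import os
import time


def time_fsync(path, payload=b"x" * 4096):
    """Append payload to path and return the seconds spent inside fsync()."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, payload)
        start = time.monotonic()
        os.fsync(fd)  # this is the call that stalled in the report above
        return time.monotonic() - start
    finally:
        os.close(fd)


def sample_latencies(path, rounds=10, pause=1.0):
    """Collect fsync latencies over several rounds, sleeping in between."""
    latencies = []
    for _ in range(rounds):
        latencies.append(time_fsync(path))
        time.sleep(pause)
    return latencies
```

Running sample_latencies() once with cfq and once with deadline under
the same background load would give a first comparison; the worst-case
value is more interesting here than the average.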
--
Jens Axboe
* Re: ext3 fsync being starved for a long time by cp and cronjob
2006-08-25 12:26 ` Jens Axboe
@ 2006-08-25 12:30 ` Andi Kleen
2006-08-25 12:34 ` Jens Axboe
2006-08-26 4:14 ` [Ext2-devel] " Theodore Tso
0 siblings, 2 replies; 9+ messages in thread
From: Andi Kleen @ 2006-08-25 12:30 UTC (permalink / raw)
To: Jens Axboe; +Cc: akpm, linux-mm, ext2-devel
On Friday 25 August 2006 14:26, Jens Axboe wrote:
> On Fri, Aug 25 2006, Andi Kleen wrote:
> >
> > > Does deadline do better?
> >
> > It's not really repeatable workload. It's just my workstation which
> > got into this unpleasant state while me trying to get work done.
> >
> > I can change it to deadline and see if I see this still again, but it might
> > take some time.
>
> Yeah, a test case might be simpler to write and test with. I'll see if I
> can come up with something.
So you think it's the elevator? I was about to blame JBD.
-Andi
* Re: ext3 fsync being starved for a long time by cp and cronjob
2006-08-25 12:30 ` Andi Kleen
@ 2006-08-25 12:34 ` Jens Axboe
2006-08-25 12:51 ` Andi Kleen
2006-08-26 4:14 ` [Ext2-devel] " Theodore Tso
1 sibling, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2006-08-25 12:34 UTC (permalink / raw)
To: Andi Kleen; +Cc: akpm, linux-mm, ext2-devel
On Fri, Aug 25 2006, Andi Kleen wrote:
> On Friday 25 August 2006 14:26, Jens Axboe wrote:
> > On Fri, Aug 25 2006, Andi Kleen wrote:
> > >
> > > > Does deadline do better?
> > >
> > > It's not really repeatable workload. It's just my workstation which
> > > got into this unpleasant state while me trying to get work done.
> > >
> > > I can change it to deadline and see if I see this still again, but it might
> > > take some time.
> >
> > Yeah, a test case might be simpler to write and test with. I'll see if I
> > can come up with something.
>
> So you think it's the elevator? I was about to blame JBD.
Not sure, it might be ext3. Hence the deadline test would be useful. All
I know for sure is that the io scheduling for fsync() can be improved.
Did you try data=writeback?
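
To check which journalling mode a filesystem is mounted with before
changing anything, a small sketch like the following could be used
(added for illustration). It parses text in /proc/mounts format and
looks for a data= option; whether the kernel lists the default mode
there is an assumption, so a None result only means the option was not
shown. The function name is hypothetical.

```python
def journal_data_mode(mounts_text, mount_point):
    """Return the data= mount option for mount_point, or None if not listed."""
    mode = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, ...
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        for opt in fields[3].split(","):
            if opt.startswith("data="):
                mode = opt[len("data="):]
    return mode


def current_data_mode(mount_point="/"):
    """Look up the data= option for a mount point on the running system."""
    with open("/proc/mounts") as f:
        return journal_data_mode(f.read(), mount_point)
```

The mode itself is selected with the data= mount option (journal,
ordered, or writeback) when the filesystem is mounted.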
--
Jens Axboe
* Re: ext3 fsync being starved for a long time by cp and cronjob
2006-08-25 12:34 ` Jens Axboe
@ 2006-08-25 12:51 ` Andi Kleen
0 siblings, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2006-08-25 12:51 UTC (permalink / raw)
To: Jens Axboe; +Cc: akpm, linux-mm, ext2-devel
On Friday 25 August 2006 14:34, Jens Axboe wrote:
> On Fri, Aug 25 2006, Andi Kleen wrote:
> > On Friday 25 August 2006 14:26, Jens Axboe wrote:
> > > On Fri, Aug 25 2006, Andi Kleen wrote:
> > > >
> > > > > Does deadline do better?
> > > >
> > > > It's not really repeatable workload. It's just my workstation which
> > > > got into this unpleasant state while me trying to get work done.
> > > >
> > > > I can change it to deadline and see if I see this still again, but it might
> > > > take some time.
> > >
> > > Yeah, a test case might be simpler to write and test with. I'll see if I
> > > can come up with something.
> >
> > So you think it's the elevator? I was about to blame JBD.
>
> Not sure, it might be ext3. Hence the deadline test would be useful. All
> I know for sure is that the io scheduling for fsync() can be improved.
> Did you try data=writeback?
No. And I would prefer to not try that because I would have to run
my workstation with it for a long time, and ordered seems somewhat
safer for that.
-Andi
* Re: [Ext2-devel] ext3 fsync being starved for a long time by cp and cronjob
2006-08-25 12:30 ` Andi Kleen
2006-08-25 12:34 ` Jens Axboe
@ 2006-08-26 4:14 ` Theodore Tso
2006-08-26 10:04 ` Andi Kleen
1 sibling, 1 reply; 9+ messages in thread
From: Theodore Tso @ 2006-08-26 4:14 UTC (permalink / raw)
To: Andi Kleen; +Cc: Jens Axboe, akpm, linux-mm, ext2-devel
On Fri, Aug 25, 2006 at 02:30:56PM +0200, Andi Kleen wrote:
> So you think it's the elevator? I was about to blame JBD.
Earlier in the thread, you said:
>Background load is a large cp from the same fs to a tmpfs and a cron job
>doing random cron job stuff. All on a single sata disk with a 28G partition.
That doesn't sound like you are doing anything that would result in a
lot of ext3 journal activity (unless there's something strange running
out of your cron scripts).
As such, it's hard to see how this would be a JBD issue. Ext3 might
have been in the middle of doing a synchronous write of a commit
block, which might have been getting starved by an elevator which
prioritizes read traffic ahead of write traffic, but it doesn't sound
like it's due to excessive journal traffic.
So if you're focused on allocating blame :-), it's probably both ext3
and the elevator code equally at fault. I suspect what we need is a
way of telling the elevator that, when ext3 is writing commit records
or doing other writes that block filesystem I/O, these synchronous
writes should be prioritized above other (asynchronous) write traffic.
This hint would have to be passed through the buffer cache layer,
since the jbd layer is still using buffer heads.
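
As a toy illustration of that idea only (added here; this is not how
any real elevator or the jbd code is written), the sketch below models
a dispatch queue in which requests flagged as synchronous are served
ahead of asynchronous ones, while submission order is kept within each
class. The class name, the sync flag, and the request labels are all
made up for the example.

```python
import heapq
from itertools import count


class ToyElevator:
    """Dispatch sync-flagged requests before async ones, FIFO within a class."""

    def __init__(self):
        self._queue = []
        self._seq = count()  # tie-breaker that preserves submission order

    def submit(self, name, sync=False):
        rank = 0 if sync else 1  # lower rank is dispatched first
        heapq.heappush(self._queue, (rank, next(self._seq), name))

    def dispatch(self):
        """Return the next request to issue."""
        return heapq.heappop(self._queue)[2]


elevator = ToyElevator()
elevator.submit("async data write 1")
elevator.submit("async data write 2")
elevator.submit("journal commit block", sync=True)
order = [elevator.dispatch() for _ in range(3)]
```

Here the commit block is dispatched first even though it was submitted
last. A real scheduler would also need some bound on how long the
asynchronous requests can be passed over.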
Regards,
- Ted
* Re: [Ext2-devel] ext3 fsync being starved for a long time by cp and cronjob
2006-08-26 4:14 ` [Ext2-devel] " Theodore Tso
@ 2006-08-26 10:04 ` Andi Kleen
0 siblings, 0 replies; 9+ messages in thread
From: Andi Kleen @ 2006-08-26 10:04 UTC (permalink / raw)
To: Theodore Tso; +Cc: Jens Axboe, akpm, linux-mm, ext2-devel
On Saturday 26 August 2006 06:14, Theodore Tso wrote:
> >Background load is a large cp from the same fs to a tmpfs and a cron job
> >doing random cron job stuff. All on a single sata disk with a 28G partition.
>
> That doesn't sound like you are doing anything that would result in a
> lot of ext3 journal activity (unless there's something strange running
> out of your cron scripts).
(looking through the process list again)
kmail was doing some write IO, not sure how much.
So yes.
> So if you're focused on allocating blame :-),
I would be mostly interested in a solution.
(Sorry, "assigning blame" is a tongue-in-cheek SUSE expression and just
means the first step in debugging, when you try to figure out which
subsystem to look at. It wasn't meant to accuse anyone of wrongdoing.)
> it's probably both ext3
> and the elevator code equally at fault. I suspect what we need is a
> way of informing the elevator that when ext3 is writing commit records
> or other writes that block filesystem I/O, that these synchronous
> writes should be prioritized about other (asynchronous) write traffic.
> This hint would have to be passed through the buffer cache layer,
> since the jbd layer is still using buffer heads.
Hmm, I thought CFQ currently only looked at processes for priority,
but maybe it's possible to add temporary boosts. Jens?
Or maybe just run kjournald always with a high io priority? I assume
it mostly does journal IO and not much else, right?
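
A sketch of how that experiment could be tried from user space (an
illustration added here, not something tested in this thread): the
ioprio_set() system call takes a priority value that packs an I/O class
and a level 0..7, with 0 the highest. The class shift and constants
follow the kernel's ioprio definitions; the syscall number 251 is
assumed to be the x86-64 one, and the function names are hypothetical.

```python
import ctypes
import os

IOPRIO_CLASS_SHIFT = 13
IOPRIO_CLASS_RT, IOPRIO_CLASS_BE, IOPRIO_CLASS_IDLE = 1, 2, 3
IOPRIO_WHO_PROCESS = 1
SYS_IOPRIO_SET_X86_64 = 251  # assumption: x86-64 only, differs elsewhere


def ioprio_value(io_class, level):
    """Pack an I/O class and a level (0 = highest, 7 = lowest)."""
    if not 0 <= level <= 7:
        raise ValueError("level must be in 0..7")
    return (io_class << IOPRIO_CLASS_SHIFT) | level


def set_io_priority(pid, io_class, level):
    """Call ioprio_set() for one process; usually needs root for RT class."""
    libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.syscall(SYS_IOPRIO_SET_X86_64, IOPRIO_WHO_PROCESS,
                      pid, ioprio_value(io_class, level))
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

With kjournald's pid (1369 in the trace at the start of the thread),
set_io_priority(1369, IOPRIO_CLASS_RT, 0) would be the experiment.
Whether the priority of the submitting thread is what the elevator
actually looks at for journal writes is exactly the open question.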
-Andi