Re: xfs bug in 2.6.26-rc9

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* Re: xfs bug in 2.6.26-rc9
       [not found]     ` <487B019B.9090401@sgi.com>
@ 2008-07-14 12:13       ` Dave Chinner
  2008-07-15  2:12         ` Lachlan McIlroy
  2008-07-15  6:17         ` Nick Piggin
  0 siblings, 2 replies; 6+ messages in thread
From: Dave Chinner @ 2008-07-14 12:13 UTC (permalink / raw)
  To: Lachlan McIlroy; +Cc: Mikael Abrahamsson, linux-kernel, xfs, linux-mm

On Mon, Jul 14, 2008 at 05:34:51PM +1000, Lachlan McIlroy wrote:
> Mikael Abrahamsson wrote:
>> On Fri, 11 Jul 2008, Dave Chinner wrote:
>>
>>> That aside, what was the assert failure reported prior to the oops?  
>>> i.e. paste the lines in the log before the ---[ cut here ]--- line?  
>>> One of them will start with 'Assertion failed:', I think....
>>
>> These ones?
>>
>> Jul  8 04:44:56 via kernel: [554197.888008] Assertion failed: whichfork 
>> == XFS_ATTR_FORK || ip->i_delayed_blks == 0, file: fs/xfs/xfs_bmap.c,  
>> line: 5879
>> Jul  9 03:25:21 via kernel: [42940.748007] Assertion failed: whichfork  
>> == XFS_ATTR_FORK || ip->i_delayed_blks == 0, file: fs/xfs/xfs_bmap.c,  
>> line: 5879
>
> 	xfs_ilock(ip, XFS_IOLOCK_SHARED);
>
> 	if (whichfork == XFS_DATA_FORK &&
> 		(ip->i_delayed_blks || ip->i_size > ip->i_d.di_size)) {
> 		/* xfs_fsize_t last_byte = xfs_file_last_byte(ip); */
> 		error = xfs_flush_pages(ip, (xfs_off_t)0,
> 					       -1, 0, FI_REMAPF);
> 		if (error) {
> 			xfs_iunlock(ip, XFS_IOLOCK_SHARED);
> 		return error;
> 		}
> 	}
>
> 	ASSERT(whichfork == XFS_ATTR_FORK || ip->i_delayed_blks == 0);
>
> This is a race between xfs_fsr and a mmap write. xfs_fsr acquires the
> iolock and then flushes the file and because it has the iolock it doesn't
> expect any new delayed allocations to occur.  A mmap write can allocate
> delayed allocations without acquiring the iolock so is able to get in
> after the flush but before the ASSERT.

Christoph and I were contemplating this problem with ->page_mkwrite
reecently. The problem is that we can't, right now, return an
EAGAIN-like error to ->page_mkwrite() and have it retry the
page fault. Other parts of the page faulting code can do this,
so it seems like a solvable problem.

The basic concept is that if we can return a EAGAIN result we can
try-lock the inode and hold the locks necessary to avoid this race
or prevent the page fault from dirtying the page until the
filesystem is unfrozen.

Added linux-mm to the cc list for discussion.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: xfs bug in 2.6.26-rc9
  2008-07-14 12:13       ` xfs bug in 2.6.26-rc9 Dave Chinner
@ 2008-07-15  2:12         ` Lachlan McIlroy
  2008-07-15  3:18           ` Dave Chinner
  2008-07-15  6:17         ` Nick Piggin
  1 sibling, 1 reply; 6+ messages in thread
From: Lachlan McIlroy @ 2008-07-15  2:12 UTC (permalink / raw)
  To: Lachlan McIlroy, Mikael Abrahamsson, linux-kernel, xfs, linux-mm

Dave Chinner wrote:
> On Mon, Jul 14, 2008 at 05:34:51PM +1000, Lachlan McIlroy wrote:
>> Mikael Abrahamsson wrote:
>>> On Fri, 11 Jul 2008, Dave Chinner wrote:
>>>
>>>> That aside, what was the assert failure reported prior to the oops?  
>>>> i.e. paste the lines in the log before the ---[ cut here ]--- line?  
>>>> One of them will start with 'Assertion failed:', I think....
>>> These ones?
>>>
>>> Jul  8 04:44:56 via kernel: [554197.888008] Assertion failed: whichfork 
>>> == XFS_ATTR_FORK || ip->i_delayed_blks == 0, file: fs/xfs/xfs_bmap.c,  
>>> line: 5879
>>> Jul  9 03:25:21 via kernel: [42940.748007] Assertion failed: whichfork  
>>> == XFS_ATTR_FORK || ip->i_delayed_blks == 0, file: fs/xfs/xfs_bmap.c,  
>>> line: 5879
>> 	xfs_ilock(ip, XFS_IOLOCK_SHARED);
>>
>> 	if (whichfork == XFS_DATA_FORK &&
>> 		(ip->i_delayed_blks || ip->i_size > ip->i_d.di_size)) {
>> 		/* xfs_fsize_t last_byte = xfs_file_last_byte(ip); */
>> 		error = xfs_flush_pages(ip, (xfs_off_t)0,
>> 					       -1, 0, FI_REMAPF);
>> 		if (error) {
>> 			xfs_iunlock(ip, XFS_IOLOCK_SHARED);
>> 		return error;
>> 		}
>> 	}
>>
>> 	ASSERT(whichfork == XFS_ATTR_FORK || ip->i_delayed_blks == 0);
>>
>> This is a race between xfs_fsr and a mmap write. xfs_fsr acquires the
>> iolock and then flushes the file and because it has the iolock it doesn't
>> expect any new delayed allocations to occur.  A mmap write can allocate
>> delayed allocations without acquiring the iolock so is able to get in
>> after the flush but before the ASSERT.
> 
> Christoph and I were contemplating this problem with ->page_mkwrite
> reecently. The problem is that we can't, right now, return an
> EAGAIN-like error to ->page_mkwrite() and have it retry the
> page fault. Other parts of the page faulting code can do this,
> so it seems like a solvable problem.
> 
> The basic concept is that if we can return a EAGAIN result we can
> try-lock the inode and hold the locks necessary to avoid this race
> or prevent the page fault from dirtying the page until the
> filesystem is unfrozen.
Why do we need to try-lock the inode?  Will we have an ABBA deadlock
if we block on the iolock in ->page_mkwrite()?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: xfs bug in 2.6.26-rc9
  2008-07-15  2:12         ` Lachlan McIlroy
@ 2008-07-15  3:18           ` Dave Chinner
  0 siblings, 0 replies; 6+ messages in thread
From: Dave Chinner @ 2008-07-15  3:18 UTC (permalink / raw)
  To: Lachlan McIlroy; +Cc: Mikael Abrahamsson, linux-kernel, xfs, linux-mm

On Tue, Jul 15, 2008 at 12:12:52PM +1000, Lachlan McIlroy wrote:
> Dave Chinner wrote:
>> On Mon, Jul 14, 2008 at 05:34:51PM +1000, Lachlan McIlroy wrote:
>>> This is a race between xfs_fsr and a mmap write. xfs_fsr acquires the
>>> iolock and then flushes the file and because it has the iolock it doesn't
>>> expect any new delayed allocations to occur.  A mmap write can allocate
>>> delayed allocations without acquiring the iolock so is able to get in
>>> after the flush but before the ASSERT.
>>
>> Christoph and I were contemplating this problem with ->page_mkwrite
>> reecently. The problem is that we can't, right now, return an
>> EAGAIN-like error to ->page_mkwrite() and have it retry the
>> page fault. Other parts of the page faulting code can do this,
>> so it seems like a solvable problem.
>>
>> The basic concept is that if we can return a EAGAIN result we can
>> try-lock the inode and hold the locks necessary to avoid this race
>> or prevent the page fault from dirtying the page until the
>> filesystem is unfrozen.
> Why do we need to try-lock the inode?  Will we have an ABBA deadlock
> if we block on the iolock in ->page_mkwrite()?

Yes. With the mmap_sem. Look at the rules in mm/filemap.c
and replace i_mutex with iolock....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: xfs bug in 2.6.26-rc9
  2008-07-14 12:13       ` xfs bug in 2.6.26-rc9 Dave Chinner
  2008-07-15  2:12         ` Lachlan McIlroy
@ 2008-07-15  6:17         ` Nick Piggin
  2008-07-15 12:22           ` Christoph Hellwig
  1 sibling, 1 reply; 6+ messages in thread
From: Nick Piggin @ 2008-07-15  6:17 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Lachlan McIlroy, Mikael Abrahamsson, linux-kernel, xfs, linux-mm

On Monday 14 July 2008 22:13, Dave Chinner wrote:

> Christoph and I were contemplating this problem with ->page_mkwrite
> reecently. The problem is that we can't, right now, return an
> EAGAIN-like error to ->page_mkwrite() and have it retry the
> page fault. Other parts of the page faulting code can do this,
> so it seems like a solvable problem.
>
> The basic concept is that if we can return a EAGAIN result we can
> try-lock the inode and hold the locks necessary to avoid this race
> or prevent the page fault from dirtying the page until the
> filesystem is unfrozen.
>
> Added linux-mm to the cc list for discussion.

It would be easily possible to do, yes.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: xfs bug in 2.6.26-rc9
  2008-07-15  6:17         ` Nick Piggin
@ 2008-07-15 12:22           ` Christoph Hellwig
  2008-07-16  4:12             ` Nick Piggin
  0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2008-07-15 12:22 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Dave Chinner, Lachlan McIlroy, Mikael Abrahamsson, linux-kernel,
	xfs, linux-mm

> It would be easily possible to do, yes.

What happened to your plans to merge ->nopfn into ->fault?  Beeing
able to restart page based faults would be a logical fallout from that.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: xfs bug in 2.6.26-rc9
  2008-07-15 12:22           ` Christoph Hellwig
@ 2008-07-16  4:12             ` Nick Piggin
  0 siblings, 0 replies; 6+ messages in thread
From: Nick Piggin @ 2008-07-16  4:12 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Dave Chinner, Lachlan McIlroy, Mikael Abrahamsson, linux-kernel,
	xfs, linux-mm

On Tuesday 15 July 2008 22:22, Christoph Hellwig wrote:
> > It would be easily possible to do, yes.
>
> What happened to your plans to merge ->nopfn into ->fault?  Beeing
> able to restart page based faults would be a logical fallout from that.

Yeah I guess I should really do that, and you're right it would
work nicely for this.

Actually I have some code but it is not quite as nice as I'd like.
The problem is that we have the generic file fault handler, but not
a generic page mkwrite handler. So we still need some kind of
page_mkwrite aop which the file fault handler can then call if it
exists. It isn't a big problem AFAIKS, but just a bit irritating.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2008-07-16  4:12 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <alpine.DEB.1.10.0807110939520.30192@uplift.swm.pp.se>
     [not found] ` <20080711084248.GU29319@disturbed>
     [not found]   ` <alpine.DEB.1.10.0807111215040.30192@uplift.swm.pp.se>
     [not found]     ` <487B019B.9090401@sgi.com>
2008-07-14 12:13       ` xfs bug in 2.6.26-rc9 Dave Chinner
2008-07-15  2:12         ` Lachlan McIlroy
2008-07-15  3:18           ` Dave Chinner
2008-07-15  6:17         ` Nick Piggin
2008-07-15 12:22           ` Christoph Hellwig
2008-07-16  4:12             ` Nick Piggin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox