linux-mm.kvack.org archive mirror
From: Dmitriy Monakhov <dmonakhov@sw.ru>
To: Andrew Morton <akpm@osdl.org>
Cc: Dmitriy Monakhov <dmonakhov@openvz.org>,
	linux-kernel@vger.kernel.org,
	Linux Memory Management <linux-mm@kvack.org>,
	devel@openvz.org, xfs@oss.sgi.com
Subject: Re: [PATCH]  incorrect error handling inside generic_file_direct_write
Date: Tue, 12 Dec 2006 12:22:14 +0300
Message-ID: <87lkld31vd.fsf@sw.ru>
In-Reply-To: <20061211124052.144e69a0.akpm@osdl.org> (Andrew Morton's message of "Mon, 11 Dec 2006 12:40:52 -0800")

Andrew Morton <akpm@osdl.org> writes:

> On Mon, 11 Dec 2006 16:34:27 +0300
> Dmitriy Monakhov <dmonakhov@openvz.org> wrote:
>
>> The OpenVZ team has discovered an error in generic_file_direct_write().
>> If generic_file_direct_IO() fails (ENOSPC condition), it may already have
>> instantiated a few blocks outside i_size, and fsck will then complain about
>> a wrong i_size (ext2, ext3 and reiserfs treat a mismatch between i_size and
>> the biggest block as an error). After fsck fixes the error, i_size is
>> increased to cover the biggest block, but these blocks contain garbage from
>> the failed write attempt. This is not an information leak, but it is silent
>> file data corruption.
>> We need to truncate any blocks beyond i_size after a write has failed, as is
>> done in the generic_file_buffered_write() error path.
>> 
>> Example:
>> open("mnt2/FILE3", O_WRONLY|O_CREAT|O_DIRECT, 0666) = 3
>> write(3, "aaaaaa"..., 4096) = -1 ENOSPC (No space left on device)
>> 
>> stat mnt2/FILE3
>> File: `mnt2/FILE3'
>> Size: 0               Blocks: 4          IO Block: 4096   regular empty file
>> >>>>>>>>>>>>>>>>>>>>>>^^^^^^^^^^ file size is less than biggest block idx
>> Device: 700h/1792d      Inode: 14          Links: 1
>> Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
>> 
>> fsck.ext2 -f -n  mnt1/fs_img
>> Pass 1: Checking inodes, blocks, and sizes
>> Inode 14, i_size is 0, should be 2048.  Fix? no
>> 
>> Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org>
>> ----------
>>
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index 7b84dc8..bf7cf6c 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -2041,6 +2041,14 @@ generic_file_direct_write(struct kiocb *
>>  			mark_inode_dirty(inode);
>>  		}
>>  		*ppos = end;
>> +	} else if (written < 0) {
>> +		loff_t isize = i_size_read(inode);
>> +		/*
>> +		 * generic_file_direct_IO() may have instantiated a few blocks
>> +		 * outside i_size.  Trim these off again.
>> +		 */
>> +		if (pos + count > isize)
>> +			vmtruncate(inode, isize);
>>  	}
>>  
>
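The size/blocks mismatch in the stat output quoted above can also be checked
from userspace. A minimal sketch, assuming st_blocks is counted in 512-byte
units (as on Linux) and using st_blksize as a rough stand-in for the
filesystem block size; both function names are made up for illustration.
Filesystems that charge metadata or preallocated blocks to st_blocks may give
false positives, so treat a hit as a hint rather than proof:

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Pure check: returns 1 if more bytes are allocated (blocks512 * 512)
 * than the file size, rounded up to blksz, can account for.
 * Hypothetical helper, not a kernel or libc interface.
 */
static int size_blocks_mismatch(long long size, long long blocks512,
				long long blksz)
{
	long long rounded = (size + blksz - 1) / blksz * blksz;

	return blocks512 * 512 > rounded;
}

/*
 * Same check on an open file descriptor.
 * Returns 1 on mismatch, 0 if consistent, -1 if fstat() fails.
 */
static int blocks_exceed_size(int fd)
{
	struct stat st;
	long long blksz;

	if (fstat(fd, &st) != 0)
		return -1;
	/* Fall back to 4096 if the reported I/O block size is unusable. */
	blksz = st.st_blksize > 0 ? (long long)st.st_blksize : 4096;
	return size_blocks_mismatch((long long)st.st_size,
				    (long long)st.st_blocks, blksz);
}
```

With the values from the example (Size: 0, Blocks: 4, IO Block: 4096),
size_blocks_mismatch(0, 4, 4096) reports a mismatch.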
> XFS (at least) can call generic_file_direct_write() with i_mutex not held. 
How can that be?

From the comment in generic_file_direct_write() (mm/filemap.c:2046), right
after the place where I want to add vmtruncate():
	/*
	 * Sync the fs metadata but not the minor inode changes and
	 * of course not the data as we did direct DMA for the IO.
	 * i_mutex is held, which protects generic_osync_inode() from
	 * livelocking.
	 */

> And vmtruncate() expects i_mutex to be held.
generic_file_direct_IO() must be called under i_mutex too.
From mm/filemap.c:2388:
  /*
   * Called under i_mutex for writes to S_ISREG files.   Returns -EIO if something
   * went wrong during pagecache shootdown.
   */
  static ssize_t
  generic_file_direct_IO(int rw, struct kiocb *iocb, const struct iovec *iov,

Does this mean that the XFS call to generic_file_direct_write() also ends up
calling generic_file_direct_IO() without i_mutex held?
>
> I guess a suitable solution would be to push this problem back up to the
> callers: let them decide whether to run vmtruncate() and if so, to ensure
> that i_mutex is held.
>
> The existence of generic_file_aio_write_nolock() makes that rather messy
> though.
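
To make the caller-side idea concrete, here is a rough userspace model, not
kernel code. Everything in it (toy_inode, toy_direct_io, toy_truncate,
toy_write_locked) is invented for illustration; a boolean stands in for
i_mutex, and the assert mirrors the expectation that vmtruncate() runs with
i_mutex held:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_BLOCK 4096LL

/* Toy stand-in for an inode; all fields are hypothetical. */
struct toy_inode {
	long long i_size;	/* logical file size in bytes */
	long long alloc_end;	/* end of the highest instantiated block */
	bool mutex_held;	/* models ownership of i_mutex */
};

static long long toy_round_up(long long n)
{
	return (n + TOY_BLOCK - 1) / TOY_BLOCK * TOY_BLOCK;
}

/* Models direct IO that instantiates blocks first and may then fail. */
static long long toy_direct_io(struct toy_inode *inode, long long pos,
			       long long count, bool fail)
{
	long long end = pos + count;

	if (toy_round_up(end) > inode->alloc_end)
		inode->alloc_end = toy_round_up(end);
	if (fail)
		return -ENOSPC;	/* blocks stay allocated beyond i_size */
	if (end > inode->i_size)
		inode->i_size = end;
	return count;
}

/* Models vmtruncate(): only legal while the lock is held. */
static void toy_truncate(struct toy_inode *inode, long long size)
{
	assert(inode->mutex_held);
	if (inode->alloc_end > toy_round_up(size))
		inode->alloc_end = toy_round_up(size);
	inode->i_size = size;
}

/* The caller owns the lock, so the caller decides whether to trim. */
static long long toy_write_locked(struct toy_inode *inode, long long pos,
				  long long count, bool fail)
{
	long long isize, written;

	inode->mutex_held = true;
	isize = inode->i_size;
	written = toy_direct_io(inode, pos, count, fail);
	if (written < 0 && pos + count > isize)
		toy_truncate(inode, isize);
	inode->mutex_held = false;
	return written;
}
```

Calling toy_direct_io() directly with fail set leaves alloc_end past i_size,
which is the inconsistency fsck reports; going through toy_write_locked()
trims it back. A caller that does not take the lock would have to skip the
trim or take the lock itself, which is where the _nolock variants get messy.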

