linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Blaisorblade <blaisorblade@yahoo.it>
To: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Andrea Arcangeli <andrea@suse.de>,
	lkml <linux-kernel@vger.kernel.org>,
	Hugh Dickins <hugh@veritas.com>,
	akpm@osdl.org, dvhltc@us.ibm.com, linux-mm <linux-mm@kvack.org>,
	Jeff Dike <jdike@addtoit.com>
Subject: New bug in patch and existing Linux code - race with install_page() (was: Re: [PATCH] 2.6.14 patch for supporting madvise(MADV_REMOVE))
Date: Wed, 2 Nov 2005 20:54:14 +0100	[thread overview]
Message-ID: <200511022054.15119.blaisorblade@yahoo.it> (raw)
In-Reply-To: <1130947957.24503.70.camel@localhost.localdomain>

On Wednesday 02 November 2005 17:12, Badari Pulavarty wrote:
> Hi Andrew & Andrea,
>
> Here is the updated patch with name change again :(
> Hopefully this would be final. (MADV_REMOVE).
>
> BTW, I am not sure if we need to hold i_sem and i_allocsem
> all the way ? I wanted to be safe - but this may be overkill ?
While looking into this, I probably found another problem, a race with 
install_page(), which doesn't use the seqlock-style check we use for 
everything else (aka do_no_page) but simply assumes a page is valid if its 
index is below the current file size.

This is clearly "truncate" specific, and is already racy. Suppose I truncate a 
file and reduce its size, and then re-extend it, the page which I previously 
fetched from the cache is invalid. The current install_page code generates 
corruption.

In fact the page is fetched from the caller of install_page and passed to it.

This affects anybody using MAP_POPULATE or using remap_file_pages.

> +       /* XXX - Do we need both i_sem and i_allocsem all the way ? */
> +       down(&inode->i_sem);
> +       down_write(&inode->i_alloc_sem);
> +       unmap_mapping_range(mapping, offset, (end - offset), 1);
In my opinion, as already said, unmap_mapping_range can be called without 
these two locks, as it operates only on mappings for the file.

However currently it's called with these locks held in vmtruncate, but I think 
the locks are held in that case only because we need to truncate the file, 
and are hold in excess also across this call.

Instead, we need to protect against concurrent faults on the mapping (not 
against concurrent mmaps)...and that is done through (struct address_space*) 
mapping->truncate_count.

=====
Finally, there is MAP_POPULATE and other pre-faulting, i.e. install_page (no, 
this is not peculiar to VM_NONLINEAR, even if some code is shared), but 
install_page checks explicitly for truncation; the problem is that the check 
is rather bogus, compared to the rest of checks:

        /*
         * This page may have been truncated. Tell the
         * caller about it.
         */
        err = -EINVAL;
        inode = vma->vm_file->f_mapping->host;
        size = (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
        if (!page->mapping || page->index >= size)
                goto err_unlock;

I remember there being a BUG_ON and Linus fixing it up.

It should be converted to the normal checks used for the rest 
(->truncate_count based - see do_no_page()).

To do so, the caller (*_populate) needs to call again *_getpage, if 
install_page detects a race, but it must also save and pass the 
truncate_count.

So, we probably want need to cleanup and join {filemap,shmem}_populate 
together, because the only real difference between them is the function 
called to lookup the page from disk ({shmem,filemap}_getpage).

So, we should replace struct vm_operations_struct "populate" method with a 
"getpage" method, by using the shmem_getpage prototype, which is better 
engineered, see my comment:

        page = filemap_getpage(file, pgoff, nonblock);

        /* XXX: This is wrong, a filesystem I/O error may have happened. Fix 
that as
         * done in shmem_populate calling shmem_getpage */
        if (!page && !nonblock)
                return -ENOMEM;

> +       truncate_inode_pages_range(mapping, offset, end);
> +       inode->i_op->truncate_range(inode, offset, end);
> +       up_write(&inode->i_alloc_sem);
> +       up(&inode->i_sem);

-- 
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade

	

	
		
___________________________________ 
Yahoo! Mail: gratis 1GB per i messaggi e allegati da 10MB 
http://mail.yahoo.it

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2005-11-02 19:54 UTC|newest]

Thread overview: 86+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-10-26 22:49 [RFC] madvise(MADV_TRUNCATE) Badari Pulavarty
2005-10-27  8:38 ` Andi Kleen
2005-10-27 13:17   ` Andrea Arcangeli
2005-10-27 15:00     ` Badari Pulavarty
2005-10-27 15:11       ` Andrea Arcangeli
2005-10-27 18:20         ` Andrew Morton
2005-10-27 18:35           ` Badari Pulavarty
2005-10-27 18:50             ` Andrew Morton
2005-10-27 19:40               ` Gerrit Huizenga
2005-10-27 19:56                 ` Andi Kleen
2005-10-27 23:21                   ` Darren Hart
2005-10-27 20:05               ` Theodore Ts'o
2005-10-27 20:16                 ` Andrea Arcangeli
2005-10-28  1:42                 ` Badari Pulavarty
2005-10-28 16:33                   ` Theodore Ts'o
2005-10-27 20:22               ` Jeff Dike
2005-10-27 20:04           ` Andrea Arcangeli
2005-10-27 20:50             ` Andrew Morton
2005-10-27 21:37               ` Andrea Arcangeli
2005-10-27 22:23                 ` Andrew Morton
2005-10-27 23:05                   ` Badari Pulavarty
2005-10-27 23:16                     ` Andrew Morton
2005-10-27 23:33                       ` Peter Chubb
2005-10-28  0:22                   ` Andrea Arcangeli
2005-10-28  0:32                     ` Andrew Morton
2005-10-28  1:10                       ` Andrea Arcangeli
2005-10-28  1:27                       ` Badari Pulavarty
2005-10-28  2:00                         ` Andrew Morton
2005-10-27 22:32               ` Badari Pulavarty
2005-10-27 23:28             ` Peter Chubb
2005-10-27 23:49               ` Andrew Morton
2005-10-27 23:56                 ` Nathan Scott
2005-10-28  0:15                   ` Andrea Arcangeli
2005-10-27 23:59                 ` Peter Chubb
2005-10-28  3:46 ` Jeff Dike
2005-10-28 11:03   ` Blaisorblade
2005-10-28 13:29     ` Andrea Arcangeli
2005-10-28 16:56       ` Blaisorblade
2005-10-28 16:16     ` Badari Pulavarty
2005-10-28 18:40       ` Blaisorblade
2005-10-28 18:56         ` Badari Pulavarty
2005-10-29  0:35         ` Badari Pulavarty
2005-10-28 16:19   ` Badari Pulavarty
2005-10-28 17:10     ` Blaisorblade
2005-10-28 18:28       ` Jeff Dike
2005-10-28 18:44         ` Blaisorblade
2005-10-28 18:42     ` Jeff Dike
2005-10-28 18:54       ` Badari Pulavarty
2005-10-29  0:03       ` Badari Pulavarty
2005-10-29  2:51         ` Jeff Dike
2005-10-31 16:34           ` Badari Pulavarty
2005-10-31 19:15           ` Badari Pulavarty
2005-10-31 19:49           ` [RFC][PATCH] madvise(MADV_TRUNCATE) Badari Pulavarty
2005-11-01  0:05             ` Jeff Dike
2005-11-02  1:15               ` [PATCH] 2.6.14 patch for supporting madvise(MADV_FREE) Badari Pulavarty
2005-11-02  1:43                 ` Andrea Arcangeli
2005-11-02 15:49                   ` Badari Pulavarty
2005-11-02 16:12                   ` [PATCH] 2.6.14 patch for supporting madvise(MADV_REMOVE) Badari Pulavarty
2005-11-02 19:54                     ` Blaisorblade [this message]
2005-11-02 20:12                       ` New bug in patch and existing Linux code - race with install_page() (was: Re: [PATCH] 2.6.14 patch for supporting madvise(MADV_REMOVE)) Hugh Dickins
2005-11-02 20:45                         ` Hugh Dickins
2005-11-02 21:36                       ` Badari Pulavarty
2005-11-02 21:55                         ` Hugh Dickins
2005-11-02 22:02                           ` Badari Pulavarty
2005-11-12  0:25                     ` [PATCH] 2.6.14 patch for supporting madvise(MADV_REMOVE) Andrew Morton
2005-11-12  0:34                       ` Badari Pulavarty
2005-11-12  1:43                         ` Andrew Morton
2005-11-12  4:41                           ` Badari Pulavarty
2006-01-16 13:06                             ` differences between MADV_FREE and MADV_DONTNEED Andrea Arcangeli
2006-01-16 16:02                               ` Suleiman Souhlal
2006-01-16 16:28                                 ` Andrea Arcangeli
2006-01-16 17:03                                   ` Suleiman Souhlal
2006-01-16 17:24                                     ` Andrea Arcangeli
2006-01-16 21:43                                       ` Eric W. Biederman
2006-01-17  0:24                                         ` Suleiman Souhlal
2006-01-17  1:04                                           ` Nicholas Miell
2006-01-17 12:43                                             ` Christoph Hellwig
2006-01-17 18:23                                               ` Eric W. Biederman
2006-01-17 22:55                                                 ` Nicholas Miell
2007-03-01 18:11                                                 ` Samuel Thibault
2006-01-17 19:06                                               ` Badari Pulavarty
2006-01-17  1:06                               ` Blaisorblade
2006-01-17  1:33                                 ` Andrea Arcangeli
2005-11-12  0:34                     ` [PATCH] 2.6.14 patch for supporting madvise(MADV_REMOVE) Andrew Morton
2005-10-28 17:55   ` [RFC] madvise(MADV_TRUNCATE) Blaisorblade
2005-10-28 21:23     ` Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200511022054.15119.blaisorblade@yahoo.it \
    --to=blaisorblade@yahoo.it \
    --cc=akpm@osdl.org \
    --cc=andrea@suse.de \
    --cc=dvhltc@us.ibm.com \
    --cc=hugh@veritas.com \
    --cc=jdike@addtoit.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=pbadari@us.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox