From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ying Han <yinghan@google.com>, Jan Kara <jack@suse.cz>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
guichaz@gmail.com, Alex Khesin <alexk@google.com>,
Mike Waychison <mikew@google.com>,
Rohit Seth <rohitseth@google.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: ftruncate-mmap: pages are lost after writing to mmaped file.
Date: Fri, 20 Mar 2009 03:34:54 +1100 [thread overview]
Message-ID: <200903200334.55710.nickpiggin@yahoo.com.au> (raw)
In-Reply-To: <alpine.LFD.2.00.0903190902000.17240@localhost.localdomain>
On Friday 20 March 2009 03:20:57 Linus Torvalds wrote:
> On Fri, 20 Mar 2009, Nick Piggin wrote:
> > But I think we do have a race in __set_page_dirty_buffers():
> >
> > The page may not have buffers between the mapping->private_lock
> > critical section and the __set_page_dirty call there. So between
> > them, another thread might do a create_empty_buffers which can
> > see !PageDirty and thus it will create clean buffers.
>
> Hmm.
>
> Creating clean buffers is locked by the page lock, nothing else. And not
> all page dirtiers hold the page lock (in fact, most try to avoid it - the
> rule is that you either have to hold the page lock _or_ hold a reference
> to the 'mapping', and the latter is what the mmap code does, I think).
>
> So yeah, the page lock isn't sufficient.
No. FWIW, I thought there might be a race due to page fault code
not holding page lock over set_page_dirty (well there *are* some kinds
of races, but they're another story). So I tried out my patches that
move that lock over set_page_dirty for __do_fault and do_wp_page
(so the lock is held over pte_mkwrite and set_page_dirty), but that
still didn't solve the problem either.
> > Holding mapping->private_lock over the __set_page_dirty should
> > fix it, although I guess you'd want to release it before calling
> > __mark_inode_dirty so as not to put inode_lock under there. I
> > have a patch for this if it sounds reasonable.
>
> That would seem to make sense. Maybe moving the "TestSetPageDirty()" from
> inside __set_page_dirty() to the caller? Something like the appended?
>
> This is TOTALLY untested. Of course.
Yeah, probably no need to hold private_lock while tagging the radix
tree (which is what my version did). So maybe this one is a little
better. I did test mine, it worked, but it didn't solve the problem.
Still, it does appear to solve a real race, which we should close.
>
> Linus
>
> ---
> fs/buffer.c | 23 +++++++++++------------
> 1 files changed, 11 insertions(+), 12 deletions(-)
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 9f69741..891e1c7 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -760,15 +760,9 @@ EXPORT_SYMBOL(mark_buffer_dirty_inode);
> * If warn is true, then emit a warning if the page is not uptodate and
> has * not been truncated.
> */
> -static int __set_page_dirty(struct page *page,
> +static void __set_page_dirty(struct page *page,
> struct address_space *mapping, int warn)
> {
> - if (unlikely(!mapping))
> - return !TestSetPageDirty(page);
> -
> - if (TestSetPageDirty(page))
> - return 0;
> -
> spin_lock_irq(&mapping->tree_lock);
> if (page->mapping) { /* Race with truncate? */
> WARN_ON_ONCE(warn && !PageUptodate(page));
> @@ -785,8 +779,6 @@ static int __set_page_dirty(struct page *page,
> }
> spin_unlock_irq(&mapping->tree_lock);
> __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
> -
> - return 1;
> }
>
> /*
> @@ -816,6 +808,7 @@ static int __set_page_dirty(struct page *page,
> */
> int __set_page_dirty_buffers(struct page *page)
> {
> + int newly_dirty;
> struct address_space *mapping = page_mapping(page);
>
> if (unlikely(!mapping))
> @@ -831,9 +824,12 @@ int __set_page_dirty_buffers(struct page *page)
> bh = bh->b_this_page;
> } while (bh != head);
> }
> + newly_dirty = !TestSetPageDirty(page);
> spin_unlock(&mapping->private_lock);
>
> - return __set_page_dirty(page, mapping, 1);
> + if (newly_dirty)
> + __set_page_dirty(page, mapping, 1);
> + return newly_dirty;
> }
> EXPORT_SYMBOL(__set_page_dirty_buffers);
>
> @@ -1262,8 +1258,11 @@ void mark_buffer_dirty(struct buffer_head *bh)
> return;
> }
>
> - if (!test_set_buffer_dirty(bh))
> - __set_page_dirty(bh->b_page, page_mapping(bh->b_page), 0);
> + if (!test_set_buffer_dirty(bh)) {
> + struct page *page = bh->b_page;
> + if (!TestSetPageDirty(page))
> + __set_page_dirty(page, page_mapping(page), 0);
> + }
> }
>
> /*
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2009-03-19 16:35 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-03-18 19:44 Ying Han
2009-03-18 22:11 ` Andrew Morton
2009-03-18 22:40 ` Linus Torvalds
2009-03-18 23:18 ` Ying Han
2009-03-18 23:36 ` Linus Torvalds
2009-03-18 23:54 ` Ying Han
2009-03-19 15:48 ` Nick Piggin
2009-03-19 16:16 ` Peter Zijlstra
2009-03-19 16:36 ` Nick Piggin
2009-03-19 16:20 ` Linus Torvalds
2009-03-19 16:34 ` Nick Piggin [this message]
2009-03-19 16:51 ` Linus Torvalds
2009-03-19 17:03 ` Jan Kara
2009-03-19 17:06 ` Jan Kara
2009-03-19 20:05 ` Linus Torvalds
2009-03-19 20:21 ` Linus Torvalds
2009-03-19 21:17 ` Ying Han
2009-03-19 22:16 ` Jan Kara
2009-03-19 16:46 ` Jan Kara
2009-03-24 7:44 ` Nick Piggin
2009-03-24 10:27 ` Nick Piggin
2009-03-24 10:32 ` Andrew Morton
2009-03-24 15:35 ` Nick Piggin
2009-03-26 18:29 ` Jan Kara
2009-03-26 0:03 ` Ying Han
2009-03-24 12:39 ` Jan Kara
2009-03-24 12:55 ` Jan Kara
2009-03-24 13:26 ` Jan Kara
2009-03-24 14:01 ` Chris Mason
2009-03-24 14:07 ` Jan Kara
2009-03-26 8:18 ` Aneesh Kumar K.V
2009-03-24 14:30 ` Nick Piggin
2009-03-24 14:47 ` Jan Kara
2009-03-24 14:56 ` Peter Zijlstra
2009-03-24 15:29 ` Jan Kara
2009-03-24 20:14 ` OGAWA Hirofumi
2009-03-26 8:47 ` Aneesh Kumar K.V
2009-03-26 11:37 ` Jan Kara
2009-03-26 23:02 ` Linus Torvalds
2009-03-24 15:03 ` Nick Piggin
2009-03-24 15:48 ` Jan Kara
2009-03-24 17:35 ` Jan Kara
2009-04-01 22:36 ` Ying Han
2009-04-02 10:11 ` Jan Kara
2009-04-02 11:24 ` Nick Piggin
2009-04-02 11:34 ` Jan Kara
2009-04-02 15:51 ` Nick Piggin
2009-04-02 17:44 ` Ying Han
2009-04-02 22:52 ` Ying Han
2009-04-02 23:39 ` Jan Kara
2009-04-03 0:25 ` Ying Han
2009-04-03 1:29 ` Ying Han
2009-04-03 9:41 ` Jan Kara
2009-04-03 21:34 ` Ying Han
2009-04-03 0:13 ` Ying Han
2009-03-27 20:35 ` Ying Han
2009-03-20 0:34 ` Ying Han
2009-03-20 0:49 ` Linus Torvalds
2009-03-20 7:00 ` Ying Han
2009-03-25 23:15 ` Ying Han
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200903200334.55710.nickpiggin@yahoo.com.au \
--to=nickpiggin@yahoo.com.au \
--cc=a.p.zijlstra@chello.nl \
--cc=akpm@linux-foundation.org \
--cc=alexk@google.com \
--cc=guichaz@gmail.com \
--cc=jack@suse.cz \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mikew@google.com \
--cc=rohitseth@google.com \
--cc=torvalds@linux-foundation.org \
--cc=yinghan@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox