From: Marco Stornelli <marco.stornelli@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>,
linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com,
linux-ext4@vger.kernel.org, Hugh Dickins <hughd@google.com>,
linux-mm@kvack.org
Subject: Re: Hole punching and mmap races
Date: Tue, 5 Jun 2012 08:22:29 +0200
Message-ID: <CANGUGtBnhRjWGK2v-+ExhZExNbYkF9nTBzQNd7-0f6G5sn51Sg@mail.gmail.com>
In-Reply-To: <20120605055150.GF4347@dastard>
2012/6/5 Dave Chinner <david@fromorbit.com>:
> On Thu, May 24, 2012 at 02:35:38PM +0200, Jan Kara wrote:
>> On Sat 19-05-12 11:40:24, Dave Chinner wrote:
>> > So let's step back a moment and have a look at how we've got here.
>> > The problem is that we've optimised ourselves into a corner with the
>> > way we handle page cache truncation - we don't need mmap
>> > serialisation because the combination of i_size and page locks
>> > means we can detect truncated pages safely at page fault time. With
>> > hole punching, we don't have that i_size safety blanket, and so we
>> > need some other serialisation mechanism to safely detect whether a
>> > page is valid or not at any given point in time.
>> >
>> > Because it needs to serialise against IO operations, we need a
>> > sleeping lock of some kind, and it can't be the existing IO lock.
>> > And now we are looking at needing a new lock for hole punching, I'm
>> > really wondering if the i_size/page lock truncation optimisation
>> > should even continue to exist. i.e. replace it with a single
>> > mechanism that works for both hole punching, truncation and other
>> > functions that require exclusive access or exclusion against
>> > modifications to the mapping tree.
>> >
>> > But this is only one of the problems in this area. The way I see it
>> > is that we have many kludges in the area of page invalidation w.r.t.
>> > different types of IO, the page cache and mmap, especially when we
>> > take into account direct IO. What we are seeing here is we need
>> > some level of _mapping tree exclusion_ between:
>> >
>> > 1. mmap vs hole punch (broken)
>> > 2. mmap vs truncate (i_size/page lock)
>> > 3. mmap vs direct IO (non-existent)
>> > 4. mmap vs buffered IO (page lock)
>> > 5. writeback vs truncate (i_size/page lock)
>> > 6. writeback vs hole punch (page lock, possibly broken)
>> > 7. direct IO vs buffered IO (racy - flush cache before/after DIO)
>> Yes, this is a nice summary of the most interesting cases. For completeness,
>> here are the remaining cases:
>> 8. mmap vs writeback (page lock)
>> 9. writeback vs direct IO (as direct IO vs buffered IO)
>> 10. writeback vs buffered IO (page lock)
>> 11. direct IO vs truncate (dio_wait)
>> 12. direct IO vs hole punch (dio_wait)
>> 13. buffered IO vs truncate (i_mutex for writes, i_size/page lock for reads)
>> 14. buffered IO vs hole punch (fs dependent, broken for ext4)
>> 15. truncate vs hole punch (fs dependent)
>> 16. mmap vs mmap (page lock)
>> 17. writeback vs writeback (page lock)
>> 18. direct IO vs direct IO (i_mutex or fs dependent)
>> 19. buffered IO vs buffered IO (i_mutex for writes, page lock for reads)
>> 20. truncate vs truncate (i_mutex)
>> 21. punch hole vs punch hole (fs dependent)
>
I think we also have the XIP cases here.
Marco
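
A quick sanity check on the enumeration quoted above, as a sketch only: the 21 numbered cases appear to be every unordered pairing of six operation types (mmap, writeback, direct IO, buffered IO, truncate, hole punch), including each operation against itself. The snippet below counts those pairings. Treating XIP access as a separate seventh operation type is an assumption made purely for illustration; the thread does not say how the XIP cases would be counted. The names `OPS` and `exclusion_cases` are hypothetical.

```python
from itertools import combinations_with_replacement

# Operation types named in the quoted list (cases 1-21).
OPS = ["mmap", "writeback", "direct IO", "buffered IO", "truncate", "hole punch"]


def exclusion_cases(ops):
    """Return every unordered pair of operations, including an operation paired with itself."""
    return list(combinations_with_replacement(ops, 2))


cases = exclusion_cases(OPS)
print(len(cases))  # number of pairings for the six listed operation types

# Assumption (not from the thread): XIP access counted as a seventh operation type.
cases_with_xip = exclusion_cases(OPS + ["xip"])
print(len(cases_with_xip))
```

With six operation types this gives 21 pairings, matching the numbered list; under the seventh-type assumption it would give 28.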