linux-mm.kvack.org archive mirror
* Race between vmtruncate and mapped areas?
@ 2003-05-13 20:44 Dave McCracken
  2003-05-13 20:58 ` Mika Penttilä
  2003-05-13 21:00 ` William Lee Irwin III
  0 siblings, 2 replies; 51+ messages in thread
From: Dave McCracken @ 2003-05-13 20:44 UTC (permalink / raw)
  To: Linux Memory Management, Linux Kernel

As part of chasing the BUG() we've been seeing in objrmap I took a good
look at vmtruncate().  I believe I've identified a race condition that not
only triggers that BUG(), but also could cause some strange behavior
without the objrmap patch.

Basically vmtruncate() does the following steps:  first, it unmaps the
truncated pages from all page tables using zap_page_range().  Then it
removes those pages from the page cache using truncate_inode_pages().
These steps are done without any lock that I can find, so it's possible for
another task to get in between the unmap and the remove, and remap one or
more pages back into its page tables.

The result of this is a page that has been disconnected from the file but
is mapped in a task's address space as if it were still part of that file.
Any further modifications to this page will be lost.
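
The interleaving described above can be replayed in a small userspace
model.  This is only a sketch: the toy_* names and struct toy_page are
made up for illustration and are not kernel interfaces.  toy_unmap()
and toy_remove_from_cache() merely stand in for zap_page_range() and
truncate_inode_pages(), and the "other task" is simulated by calling
the fault routine between the two steps.

```c
#include <stdbool.h>

/* Toy model of one file page; not a kernel structure. */
struct toy_page {
    bool in_page_cache;   /* still attached to the file's page cache */
    int  map_count;       /* number of page tables mapping this page */
};

/* Step 1 of the truncate: stands in for zap_page_range(). */
void toy_unmap(struct toy_page *pg)
{
    pg->map_count = 0;
}

/* Step 2 of the truncate: stands in for truncate_inode_pages(). */
void toy_remove_from_cache(struct toy_page *pg)
{
    pg->in_page_cache = false;
}

/* A page fault in another task: maps the page if the cache still has it. */
bool toy_fault(struct toy_page *pg)
{
    if (!pg->in_page_cache)
        return false;     /* page is gone, nothing to map */
    pg->map_count++;
    return true;
}

/* True when the page is mapped but no longer belongs to the file. */
bool toy_is_orphaned(const struct toy_page *pg)
{
    return pg->map_count > 0 && !pg->in_page_cache;
}

/* Problematic order: unmap, fault from another task, then remove. */
bool toy_racy_truncate(void)
{
    struct toy_page pg = { .in_page_cache = true, .map_count = 1 };

    toy_unmap(&pg);
    (void)toy_fault(&pg);         /* the other task gets in between */
    toy_remove_from_cache(&pg);
    return toy_is_orphaned(&pg);
}

/* Same steps with the fault arriving only after both steps finished. */
bool toy_serialized_truncate(void)
{
    struct toy_page pg = { .in_page_cache = true, .map_count = 1 };

    toy_unmap(&pg);
    toy_remove_from_cache(&pg);
    (void)toy_fault(&pg);         /* finds no page in the cache */
    return toy_is_orphaned(&pg);
}
```

In this model toy_racy_truncate() leaves the page orphaned, while
toy_serialized_truncate() does not, which is the difference a lock
spanning both steps would enforce.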

I can easily detect this condition by adding a bugcheck for page_mapped()
in truncate_complete_page(), then running Andrew's bash-shared-mapping test
case.
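
For illustration, the kind of check meant here can be modeled in
userspace as below.  In the kernel it would presumably be a one-line
BUG_ON(page_mapped(page)) inside truncate_complete_page(); the exact
placement is an assumption, and the chk_* names and struct chk_page
are hypothetical stand-ins.  The model counts and reports hits rather
than halting, so that it can be exercised safely.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy model of a page; not a kernel structure. */
struct chk_page {
    bool in_page_cache;   /* still attached to the file */
    int  map_count;       /* how many page tables map it */
};

/* Number of times the check has fired so far. */
int chk_bug_hits;

/* Stand-in for the kernel's page_mapped() predicate. */
bool chk_page_mapped(const struct chk_page *pg)
{
    return pg->map_count > 0;
}

/*
 * Stand-in for truncate_complete_page(): by the time a page reaches
 * this point it should have been unmapped everywhere, so a mapped
 * page means someone faulted it back in after the unmap step.
 * Returns true if the check fired.
 */
bool chk_truncate_complete_page(struct chk_page *pg)
{
    bool fired = chk_page_mapped(pg);

    if (fired) {
        chk_bug_hits++;
        fprintf(stderr, "bugcheck: truncating a page that is still mapped\n");
    }
    pg->in_page_cache = false;    /* the removal itself proceeds */
    return fired;
}
```

A page with map_count of zero passes silently; a page that was
remapped after the unmap step trips the check.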

Please feel free to poke holes in my analysis.  I'm not at all sure I
haven't missed some subtlety here.

Dave McCracken

======================================================================
Dave McCracken          IBM Linux Base Kernel Team      1-512-838-3059
dmccr@us.ibm.com                                        T/L   678-3059

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

* Re: Race between vmtruncate and mapped areas?
@ 2003-05-17 18:19 Paul McKenney
  2003-05-17 18:42 ` Andrea Arcangeli
  0 siblings, 1 reply; 51+ messages in thread
From: Paul McKenney @ 2003-05-17 18:19 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Andrew Morton, dmccr, linux-kernel, linux-kernel-owner, linux-mm,
	mika.penttila




> On Thu, May 15, 2003 at 02:20:00AM -0700, Andrew Morton wrote:
> > Andrea Arcangeli <andrea@suse.de> wrote:
> > >
> > > and it's still racy
> >
> > damn, and it just booted ;)
> >
> > I'm just a little bit concerned over the ever-expanding inode.  Do you
> > think the dual sequence numbers can be replaced by a single generation
> > counter?
>
> yes, I wrote it as a single counter first, but was unreadable and it had
> more branches, so I added the other sequence number to make it cleaner.
> I don't mind another 4 bytes, that cacheline should be hot anyways.
>
> > I do think that we should push the revalidate operation over into the
> > vm_ops.  That'll require an extra arg to ->nopage, but it has a spare one
> > anyway (!).
>
> not sure why you need a callback, the lowlevel if needed can serialize
> using the same locking in the address space that vmtruncate uses. I
> would wait a real case need before adding a callback.

FYI, we verified that the revalidate callback could also do the same
job that the proposed nopagedone callback does -- permitting filesystems
that provide their own vm_operations_struct to avoid the race between
page faults and invalidating a page from a mapped file.

                                    Thanx, Paul

* Re: Race between vmtruncate and mapped areas?
@ 2003-05-19 18:11 Paul McKenney
  0 siblings, 0 replies; 51+ messages in thread
From: Paul McKenney @ 2003-05-19 18:11 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Andrew Morton, dmccr, linux-kernel, linux-kernel-owner, linux-mm,
	mika.penttila




> On Sat, May 17, 2003 at 11:19:39AM -0700, Paul McKenney wrote:
> > > On Thu, May 15, 2003 at 02:20:00AM -0700, Andrew Morton wrote:
> > > not sure why you need a callback, the lowlevel if needed can serialize
> > > using the same locking in the address space that vmtruncate uses. I
> > > would wait a real case need before adding a callback.
> >
> > FYI, we verified that the revalidate callback could also do the same
> > job that the proposed nopagedone callback does -- permitting filesystems
> > that provide their own vm_operations_struct to avoid the race between
> > page faults and invalidating a page from a mapped file.
>
> don't you need two callbacks to avoid the race? (really I mean, to call
> two times a callback, the callback can be also the same)

I do not believe so -- though we could well be talking about
different race conditions.  The one that I am worried about
is where, in a distributed filesystem, a page fault against an
mmap races against an invalidation request.  The thought is
that the DFS takes one of its locks in the nopage callback,
and then releases it in the revalidate callback.  The
invalidation request would use the same DFS lock, and would
therefore not be able to run between nopage and revalidate.
It would call something like invalidate_mmap_range(), which
in turn calls zap_page_range(), which acquires the
mm->page_table_lock.  Since do_no_page() does not release
mm->page_table_lock until after it fills in the PTE, I believe
things are covered.
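
A minimal single-threaded model of that ordering is sketched below.
Everything in it is hypothetical: struct dfs_state and the dfs_*
functions are illustrative names, not kernel or filesystem interfaces.
A real invalidation request would block on the DFS lock; the model
uses a try-lock that simply reports failure so that the ordering can
be checked without threads.

```c
#include <stdbool.h>

/* Toy state for one mapped page of a distributed filesystem. */
struct dfs_state {
    bool lock_held;     /* stands in for the DFS's own lock */
    bool page_valid;    /* page contents still valid on this node */
    bool pte_present;   /* page is mapped in the faulting task */
};

/* Try to take the DFS lock; returns false if it is already held. */
bool dfs_trylock(struct dfs_state *fs)
{
    if (fs->lock_held)
        return false;
    fs->lock_held = true;
    return true;
}

void dfs_unlock(struct dfs_state *fs)
{
    fs->lock_held = false;
}

/* nopage callback: take the DFS lock before handing back the page. */
bool dfs_nopage(struct dfs_state *fs)
{
    return dfs_trylock(fs);
}

/* Stands in for do_no_page() filling in the PTE. */
void dfs_install_pte(struct dfs_state *fs)
{
    fs->pte_present = true;
}

/* revalidate callback: the PTE is in place, so drop the DFS lock. */
void dfs_revalidate(struct dfs_state *fs)
{
    dfs_unlock(fs);
}

/*
 * Invalidation request: needs the same DFS lock, then unmaps the page
 * (the step that would go through invalidate_mmap_range() and
 * zap_page_range()).  Returns false if it could not run yet.
 */
bool dfs_invalidate(struct dfs_state *fs)
{
    if (!dfs_trylock(fs))
        return false;   /* a fault is between nopage and revalidate */
    fs->page_valid = false;
    fs->pte_present = false;
    dfs_unlock(fs);
    return true;
}
```

In this model an invalidation attempted between dfs_nopage() and
dfs_revalidate() cannot proceed, and one issued afterwards finds the
PTE already installed and removes it.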

So, is there another race that I am missing here?  ;-)

                                    Thanx, Paul


Thread overview: 51+ messages
2003-05-13 20:44 Race between vmtruncate and mapped areas? Dave McCracken
2003-05-13 20:58 ` Mika Penttilä
2003-05-13 21:04   ` William Lee Irwin III
2003-05-13 22:26   ` Dave McCracken
2003-05-13 22:49     ` William Lee Irwin III
2003-05-13 23:00       ` Dave McCracken
2003-05-13 23:11         ` William Lee Irwin III
2003-05-13 23:16           ` Dave McCracken
2003-05-13 23:20             ` William Lee Irwin III
2003-05-13 23:28               ` Dave McCracken
2003-05-13 23:29                 ` William Lee Irwin III
2003-05-13 23:16         ` William Lee Irwin III
2003-05-14  1:10         ` Andrew Morton
2003-05-14 15:02           ` Dave McCracken
2003-05-14  1:10     ` Andrew Morton
2003-05-14 15:02       ` Dave McCracken
2003-05-14 15:06         ` William Lee Irwin III
2003-05-14 15:25           ` Dave McCracken
2003-05-14 16:42           ` Gerrit Huizenga
2003-05-14 17:34         ` Andrew Morton
2003-05-14 17:42           ` Dave McCracken
2003-05-14 17:57             ` Andrew Morton
2003-05-14 18:05               ` Dave McCracken
2003-05-14 18:17                 ` Andrew Morton
2003-05-14 18:24                   ` Dave McCracken
2003-05-14 18:53                     ` Andrew Morton
2003-05-15  8:50                       ` Andrea Arcangeli
2003-05-14 19:02               ` Rik van Riel
2003-05-14 19:04                 ` Rik van Riel
2003-05-14 19:07                   ` Dave McCracken
2003-05-14 19:11                     ` Rik van Riel
2003-05-15  0:49             ` Andrea Arcangeli
2003-05-15  2:36               ` Rik van Riel
2003-05-15  9:46                 ` Andrea Arcangeli
2003-05-15  9:55                   ` Andrew Morton
2003-05-15  8:32               ` Andrew Morton
2003-05-15  8:42                 ` Andrew Morton
2003-05-15  8:55                 ` Andrea Arcangeli
2003-05-15  9:20                   ` Andrew Morton
2003-05-15  9:40                     ` Andrea Arcangeli
2003-05-15  9:58                       ` Andrew Morton
2003-05-15 16:38                       ` Daniel McNeil
2003-05-15 19:19                         ` Andrea Arcangeli
2003-05-15 22:04                           ` Daniel McNeil
2003-05-15 23:17                             ` Andrea Arcangeli
2003-05-17  0:27                               ` Daniel McNeil
2003-05-17 17:29                                 ` Andrea Arcangeli
2003-05-13 21:00 ` William Lee Irwin III
2003-05-17 18:19 Paul McKenney
2003-05-17 18:42 ` Andrea Arcangeli
2003-05-19 18:11 Paul McKenney
