linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [patch] mm: fix anon_vma races
@ 2008-10-16  4:10 Nick Piggin
  2008-10-17 22:14 ` Hugh Dickins
  2008-10-17 23:13 ` Peter Zijlstra
  0 siblings, 2 replies; 52+ messages in thread
From: Nick Piggin @ 2008-10-16  4:10 UTC (permalink / raw)
  To: Hugh Dickins, Linux Memory Management List

Hi,

Still would like independent confirmation of these problems and the fix, but
it still looks buggy to me...

---

There are some races in the anon_vma code.

The race comes about because adding our vma to the tail of anon_vma->head comes
after the assignment of vma->anon_vma to a new anon_vma pointer. In the case
where this is a new anon_vma, this is done without holding any locks.  So a
parallel anon_vma_prepare might see the vma already has an anon_vma, so it
won't serialise on the page_table_lock. It may proceed and the anon_vma to a
page in the page fault path. Another thread may then pick up the page from the
LRU list, find its mapcount incremented, and attempt to iterate over the
anon_vma's list concurrently with the first thread (because the first one is
not holding the anon_vma lock). This is a fairly subtle race, and only likely
to be hit in kernels where the spinlock is preemptible and the first thread is
preempted at the right time... but OTOH it is _possible_ to hit here; on bigger
SMP systems cacheline transfer latencies could be very large, or we could take
an NMI inside the lock or something. Fix this by initialising the list before
adding the anon_vma to vma.

After that, there is a similar data-race with memory ordering where the store
to make the anon_vma visible passes previous stores to initialize the anon_vma.
This race also includes stores to initialize the anon_vma spinlock by the
slab constructor. Add and comment appropriate barriers to solve this.

Signed-off-by: Nick Piggin <npiggin@suse.de>
---
Index: linux-2.6/mm/rmap.c
===================================================================
--- linux-2.6.orig/mm/rmap.c
+++ linux-2.6/mm/rmap.c
@@ -81,8 +81,15 @@ int anon_vma_prepare(struct vm_area_stru
 		/* page_table_lock to protect against threads */
 		spin_lock(&mm->page_table_lock);
 		if (likely(!vma->anon_vma)) {
-			vma->anon_vma = anon_vma;
 			list_add_tail(&vma->anon_vma_node, &anon_vma->head);
+			/*
+			 * This smp_wmb() is required to order all previous
+			 * stores to initialize the anon_vma (by the slab
+			 * ctor) and add this vma, with the store to make it
+			 * visible to other CPUs via vma->anon_vma.
+			 */
+			smp_wmb();
+			vma->anon_vma = anon_vma;
 			allocated = NULL;
 		}
 		spin_unlock(&mm->page_table_lock);
@@ -91,6 +98,15 @@ int anon_vma_prepare(struct vm_area_stru
 			spin_unlock(&locked->lock);
 		if (unlikely(allocated))
 			anon_vma_free(allocated);
+	} else {
+		/*
+		 * This smp_read_barrier_depends is required to order the data
+		 * dependent loads of fields in anon_vma, with the load of the
+		 * anon_vma pointer vma->anon_vma. This complements the above
+		 * smp_wmb, and prevents a CPU from loading uninitialized
+		 * contents of anon_vma.
+		 */
+		smp_read_barrier_depends();
 	}
 	return 0;
 }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2008-10-22  9:29 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-10-16  4:10 [patch] mm: fix anon_vma races Nick Piggin
2008-10-17 22:14 ` Hugh Dickins
2008-10-17 23:05   ` Linus Torvalds
2008-10-18  0:13     ` Hugh Dickins
2008-10-18  0:25       ` Linus Torvalds
2008-10-18  1:53       ` Nick Piggin
2008-10-18  2:50         ` Paul Mackerras
2008-10-18  2:57           ` Linus Torvalds
2008-10-18  5:49           ` Nick Piggin
2008-10-18 10:49             ` Paul Mackerras
2008-10-18 17:00             ` Linus Torvalds
2008-10-18 18:44               ` Matthew Wilcox
2008-10-19  2:54                 ` Nick Piggin
2008-10-19  2:53               ` Nick Piggin
2008-10-17 23:13 ` Peter Zijlstra
2008-10-17 23:53   ` Linus Torvalds
2008-10-18  0:42     ` Linus Torvalds
2008-10-18  1:08       ` Linus Torvalds
2008-10-18  1:32         ` Nick Piggin
2008-10-18  2:11           ` Linus Torvalds
2008-10-18  2:25             ` Nick Piggin
2008-10-18  2:35               ` Nick Piggin
2008-10-18  2:53               ` Linus Torvalds
2008-10-18  5:20                 ` Nick Piggin
2008-10-18 10:38                   ` Peter Zijlstra
2008-10-19  9:52                     ` Hugh Dickins
2008-10-19 10:51                       ` Peter Zijlstra
2008-10-19 12:39                         ` Hugh Dickins
2008-10-19 18:25                         ` Linus Torvalds
2008-10-19 18:45                           ` Peter Zijlstra
2008-10-19 19:00                           ` Hugh Dickins
2008-10-20  4:03                           ` Hugh Dickins
2008-10-20 15:17                             ` Linus Torvalds
2008-10-20 18:21                               ` Hugh Dickins
2008-10-21  2:56                               ` Nick Piggin
2008-10-21  3:25                                 ` Linus Torvalds
2008-10-21  4:33                                   ` Nick Piggin
2008-10-21 12:58                                     ` Hugh Dickins
2008-10-21 15:59                                     ` Christoph Lameter
2008-10-22  9:29                                       ` Nick Piggin
2008-10-21  4:34                                   ` Nick Piggin
2008-10-21 13:55                                     ` Hugh Dickins
2008-10-21  2:44                           ` Nick Piggin
2008-10-18 19:14               ` Hugh Dickins
2008-10-19  3:03                 ` Nick Piggin
2008-10-19  7:07                   ` Hugh Dickins
2008-10-20  3:26                     ` Hugh Dickins
2008-10-21  2:45                       ` Nick Piggin
2008-10-19  1:13       ` Hugh Dickins
2008-10-19  2:41         ` Nick Piggin
2008-10-19  9:45           ` Hugh Dickins
2008-10-21  3:59             ` Nick Piggin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox