linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Nick Piggin <npiggin@suse.de>
To: Hugh Dickins <hugh@veritas.com>,
	Linux Memory Management List <linux-mm@kvack.org>
Subject: [patch] mm: fix anon_vma races
Date: Thu, 16 Oct 2008 06:10:33 +0200	[thread overview]
Message-ID: <20081016041033.GB10371@wotan.suse.de> (raw)

Hi,

Still would like independent confirmation of these problems and the fix, but
it still looks buggy to me...

---

There are some races in the anon_vma code.

The race comes about because adding our vma to the tail of anon_vma->head comes
after the assignment of vma->anon_vma to a new anon_vma pointer. In the case
where this is a new anon_vma, this is done without holding any locks.  So a
parallel anon_vma_prepare might see the vma already has an anon_vma, so it
won't serialise on the page_table_lock. It may proceed and the anon_vma to a
page in the page fault path. Another thread may then pick up the page from the
LRU list, find its mapcount incremented, and attempt to iterate over the
anon_vma's list concurrently with the first thread (because the first one is
not holding the anon_vma lock). This is a fairly subtle race, and only likely
to be hit in kernels where the spinlock is preemptible and the first thread is
preempted at the right time... but OTOH it is _possible_ to hit here; on bigger
SMP systems cacheline transfer latencies could be very large, or we could take
an NMI inside the lock or something. Fix this by initialising the list before
adding the anon_vma to vma.

After that, there is a similar data-race with memory ordering where the store
to make the anon_vma visible passes previous stores to initialize the anon_vma.
This race also includes stores to initialize the anon_vma spinlock by the
slab constructor. Add and comment appropriate barriers to solve this.

Signed-off-by: Nick Piggin <npiggin@suse.de>
---
Index: linux-2.6/mm/rmap.c
===================================================================
--- linux-2.6.orig/mm/rmap.c
+++ linux-2.6/mm/rmap.c
@@ -81,8 +81,15 @@ int anon_vma_prepare(struct vm_area_stru
 		/* page_table_lock to protect against threads */
 		spin_lock(&mm->page_table_lock);
 		if (likely(!vma->anon_vma)) {
-			vma->anon_vma = anon_vma;
 			list_add_tail(&vma->anon_vma_node, &anon_vma->head);
+			/*
+			 * This smp_wmb() is required to order all previous
+			 * stores to initialize the anon_vma (by the slab
+			 * ctor) and add this vma, with the store to make it
+			 * visible to other CPUs via vma->anon_vma.
+			 */
+			smp_wmb();
+			vma->anon_vma = anon_vma;
 			allocated = NULL;
 		}
 		spin_unlock(&mm->page_table_lock);
@@ -91,6 +98,15 @@ int anon_vma_prepare(struct vm_area_stru
 			spin_unlock(&locked->lock);
 		if (unlikely(allocated))
 			anon_vma_free(allocated);
+	} else {
+		/*
+		 * This smp_read_barrier_depends is required to order the data
+		 * dependent loads of fields in anon_vma, with the load of the
+		 * anon_vma pointer vma->anon_vma. This complements the above
+		 * smp_wmb, and prevents a CPU from loading uninitialized
+		 * contents of anon_vma.
+		 */
+		smp_read_barrier_depends();
 	}
 	return 0;
 }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

             reply	other threads:[~2008-10-16  4:10 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-10-16  4:10 Nick Piggin [this message]
2008-10-17 22:14 ` Hugh Dickins
2008-10-17 23:05   ` Linus Torvalds
2008-10-18  0:13     ` Hugh Dickins
2008-10-18  0:25       ` Linus Torvalds
2008-10-18  1:53       ` Nick Piggin
2008-10-18  2:50         ` Paul Mackerras
2008-10-18  2:57           ` Linus Torvalds
2008-10-18  5:49           ` Nick Piggin
2008-10-18 10:49             ` Paul Mackerras
2008-10-18 17:00             ` Linus Torvalds
2008-10-18 18:44               ` Matthew Wilcox
2008-10-19  2:54                 ` Nick Piggin
2008-10-19  2:53               ` Nick Piggin
2008-10-17 23:13 ` Peter Zijlstra
2008-10-17 23:53   ` Linus Torvalds
2008-10-18  0:42     ` Linus Torvalds
2008-10-18  1:08       ` Linus Torvalds
2008-10-18  1:32         ` Nick Piggin
2008-10-18  2:11           ` Linus Torvalds
2008-10-18  2:25             ` Nick Piggin
2008-10-18  2:35               ` Nick Piggin
2008-10-18  2:53               ` Linus Torvalds
2008-10-18  5:20                 ` Nick Piggin
2008-10-18 10:38                   ` Peter Zijlstra
2008-10-19  9:52                     ` Hugh Dickins
2008-10-19 10:51                       ` Peter Zijlstra
2008-10-19 12:39                         ` Hugh Dickins
2008-10-19 18:25                         ` Linus Torvalds
2008-10-19 18:45                           ` Peter Zijlstra
2008-10-19 19:00                           ` Hugh Dickins
2008-10-20  4:03                           ` Hugh Dickins
2008-10-20 15:17                             ` Linus Torvalds
2008-10-20 18:21                               ` Hugh Dickins
2008-10-21  2:56                               ` Nick Piggin
2008-10-21  3:25                                 ` Linus Torvalds
2008-10-21  4:33                                   ` Nick Piggin
2008-10-21 12:58                                     ` Hugh Dickins
2008-10-21 15:59                                     ` Christoph Lameter
2008-10-22  9:29                                       ` Nick Piggin
2008-10-21  4:34                                   ` Nick Piggin
2008-10-21 13:55                                     ` Hugh Dickins
2008-10-21  2:44                           ` Nick Piggin
2008-10-18 19:14               ` Hugh Dickins
2008-10-19  3:03                 ` Nick Piggin
2008-10-19  7:07                   ` Hugh Dickins
2008-10-20  3:26                     ` Hugh Dickins
2008-10-21  2:45                       ` Nick Piggin
2008-10-19  1:13       ` Hugh Dickins
2008-10-19  2:41         ` Nick Piggin
2008-10-19  9:45           ` Hugh Dickins
2008-10-21  3:59             ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20081016041033.GB10371@wotan.suse.de \
    --to=npiggin@suse.de \
    --cc=hugh@veritas.com \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox