linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Nick Piggin <npiggin@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Hugh Dickins <hugh@veritas.com>,
	Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: [patch] mm: fix anon_vma races
Date: Sat, 18 Oct 2008 12:38:19 +0200	[thread overview]
Message-ID: <1224326299.28131.132.camel@twins> (raw)
In-Reply-To: <20081018052046.GA26472@wotan.suse.de>

On Sat, 2008-10-18 at 07:20 +0200, Nick Piggin wrote:

> Index: linux-2.6/mm/rmap.c
> ===================================================================
> --- linux-2.6.orig/mm/rmap.c
> +++ linux-2.6/mm/rmap.c
> @@ -63,32 +63,42 @@ int anon_vma_prepare(struct vm_area_stru
>  	might_sleep();
>  	if (unlikely(!anon_vma)) {
>  		struct mm_struct *mm = vma->vm_mm;
> -		struct anon_vma *allocated, *locked;
> +		struct anon_vma *allocated;
>  
>  		anon_vma = find_mergeable_anon_vma(vma);
>  		if (anon_vma) {
>  			allocated = NULL;
> -			locked = anon_vma;
> -			spin_lock(&locked->lock);
>  		} else {
>  			anon_vma = anon_vma_alloc();
>  			if (unlikely(!anon_vma))
>  				return -ENOMEM;
>  			allocated = anon_vma;
> -			locked = NULL;
>  		}
>  
> +		/*
> +		 * The lock is required even for new anon_vmas, because as
> +		 * soon as we store vma->anon_vma = anon_vma, then the
> +		 * anon_vma becomes visible via the vma. This means another
> +		 * CPU can find the anon_vma, then store it into the struct
> +		 * page with page_add_anon_rmap. At this point, anon_vma can
> +		 * be loaded from the page with page_lock_anon_vma.
> +		 *
> +		 * So long as the anon_vma->lock is taken before looking at
> +		 * any fields in the anon_vma, the lock should take care of
> +		 * races and memory ordering issues WRT anon_vma fields.
> +		 */
> +		spin_lock(&anon_vma->lock);
> +
>  		/* page_table_lock to protect against threads */
>  		spin_lock(&mm->page_table_lock);
>  		if (likely(!vma->anon_vma)) {
> -			vma->anon_vma = anon_vma;
>  			list_add_tail(&vma->anon_vma_node, &anon_vma->head);
> +			vma->anon_vma = anon_vma;
>  			allocated = NULL;
>  		}
>  		spin_unlock(&mm->page_table_lock);
> +		spin_lock(&anon_vma->lock);

did you perchance mean, spin_unlock() ?

>  
> -		if (locked)
> -			spin_unlock(&locked->lock);
>  		if (unlikely(allocated))
>  			anon_vma_free(allocated);
>  	}
> @@ -171,6 +181,21 @@ static struct anon_vma *page_lock_anon_v
>  
>  	anon_vma = (struct anon_vma *) (anon_mapping - PAGE_MAPPING_ANON);
>  	spin_lock(&anon_vma->lock);
> +
> +	/*
> +	 * If the page is no longer mapped, we have no way to keep the
> +	 * anon_vma stable. It may be freed and even re-allocated for some
> +	 * other set of anonymous mappings at any point. If the page is
> +	 * mapped while we have the lock on the anon_vma, then we know
> +	 * anon_vma_unlink can't run and garbage collect the anon_vma
> +	 * (because unmapping the page happens before unlinking the anon_vma).
> +	 */
> +	if (unlikely(!page_mapped(page))) {
> +		spin_unlock(&anon_vma->lock);
> +		goto out;
> +	}
> +	BUG_ON(page->mapping != anon_mapping);
> +
>  	return anon_vma;
>  out:
>  	rcu_read_unlock();


fault_creation:

 anon_vma_prepare()
 page_add_new_anon_rmap();

expand_creation:

 anon_vma_prepare()
 anon_vma_lock();

rmap_lookup:

 page_referenced()/try_to_unmap()
   page_lock_anon_vma()

vma_lookup:

 vma_adjust()/vma_*
   vma->anon_vma

teardown:

 unmap_vmas()
   zap_range()
      page_remove_rmap()
      free_page()
 free_pgtables()
   anon_vma_unlink()
   free_range()
  
IOW we remove rmap, free the page (set mapping=NULL) and then unlink and
free the anon_vma.

But at that time vma->anon_vma is still set.


head starts to hurt,.. 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2008-10-18 10:38 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-10-16  4:10 Nick Piggin
2008-10-17 22:14 ` Hugh Dickins
2008-10-17 23:05   ` Linus Torvalds
2008-10-18  0:13     ` Hugh Dickins
2008-10-18  0:25       ` Linus Torvalds
2008-10-18  1:53       ` Nick Piggin
2008-10-18  2:50         ` Paul Mackerras
2008-10-18  2:57           ` Linus Torvalds
2008-10-18  5:49           ` Nick Piggin
2008-10-18 10:49             ` Paul Mackerras
2008-10-18 17:00             ` Linus Torvalds
2008-10-18 18:44               ` Matthew Wilcox
2008-10-19  2:54                 ` Nick Piggin
2008-10-19  2:53               ` Nick Piggin
2008-10-17 23:13 ` Peter Zijlstra
2008-10-17 23:53   ` Linus Torvalds
2008-10-18  0:42     ` Linus Torvalds
2008-10-18  1:08       ` Linus Torvalds
2008-10-18  1:32         ` Nick Piggin
2008-10-18  2:11           ` Linus Torvalds
2008-10-18  2:25             ` Nick Piggin
2008-10-18  2:35               ` Nick Piggin
2008-10-18  2:53               ` Linus Torvalds
2008-10-18  5:20                 ` Nick Piggin
2008-10-18 10:38                   ` Peter Zijlstra [this message]
2008-10-19  9:52                     ` Hugh Dickins
2008-10-19 10:51                       ` Peter Zijlstra
2008-10-19 12:39                         ` Hugh Dickins
2008-10-19 18:25                         ` Linus Torvalds
2008-10-19 18:45                           ` Peter Zijlstra
2008-10-19 19:00                           ` Hugh Dickins
2008-10-20  4:03                           ` Hugh Dickins
2008-10-20 15:17                             ` Linus Torvalds
2008-10-20 18:21                               ` Hugh Dickins
2008-10-21  2:56                               ` Nick Piggin
2008-10-21  3:25                                 ` Linus Torvalds
2008-10-21  4:33                                   ` Nick Piggin
2008-10-21 12:58                                     ` Hugh Dickins
2008-10-21 15:59                                     ` Christoph Lameter
2008-10-22  9:29                                       ` Nick Piggin
2008-10-21  4:34                                   ` Nick Piggin
2008-10-21 13:55                                     ` Hugh Dickins
2008-10-21  2:44                           ` Nick Piggin
2008-10-18 19:14               ` Hugh Dickins
2008-10-19  3:03                 ` Nick Piggin
2008-10-19  7:07                   ` Hugh Dickins
2008-10-20  3:26                     ` Hugh Dickins
2008-10-21  2:45                       ` Nick Piggin
2008-10-19  1:13       ` Hugh Dickins
2008-10-19  2:41         ` Nick Piggin
2008-10-19  9:45           ` Hugh Dickins
2008-10-21  3:59             ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1224326299.28131.132.camel@twins \
    --to=a.p.zijlstra@chello.nl \
    --cc=hugh@veritas.com \
    --cc=linux-mm@kvack.org \
    --cc=npiggin@suse.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox