linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Hugh Dickins <hugh@veritas.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: [patch] mm: fix anon_vma races
Date: Sat, 18 Oct 2008 20:14:12 +0100 (BST)	[thread overview]
Message-ID: <Pine.LNX.4.64.0810181952580.27309@blonde.site> (raw)
In-Reply-To: <20081018022541.GA19018@wotan.suse.de>

Sorry, I've only just got back to this, and just had one quick read
through the thread.  A couple of points on what I believe so far...

On Sat, 18 Oct 2008, Nick Piggin wrote:
> On Fri, Oct 17, 2008 at 07:11:38PM -0700, Linus Torvalds wrote:
> > On Sat, 18 Oct 2008, Nick Piggin wrote:
> > > 
> > > We can't do that, unfortuantely, because anon_vmas are allocated with
> > > SLAB_DESTROY_BY_RCU.
> > 
> > Aughh. I see what you're saying. We don't _free_ them by RCU, we just 
> > destroy the page allocation. So an anon_vma can get _re-allocated_ for 
> > another page (without being destroyed), concurrently with somebody 
> > optimistically being busy with that same anon_vma that they got through 
> > that optimistic 'page_lock_anon_vma()' thing.
> > 
> > So if we were to just set the lock, we might actually be messing with 
> > something that is still actively used by the previous page that was 
> > unmapped concurrently and still being accessed by try_to_unmap_anon. So 
> > even though we allocated a "new" anon_vma, it might still be busy.
> > 
> > Yes? No?

Yes, I believe Linus is right; but need to mull it over some more.
That not-taking-the-lock-on-newly-allocated optimization comes
from Andrea, and dates from before I removed the page_map_lock().
It looks like an optimization that was valid in Andrea's original,
but something I should have removed when going to SLAB_DESTROY_BY_RCU.

> 
> That's what I'm thinking, yes. But I admit the last time I looked at
> this really closely was probably reading through Hugh's patches and
> changelogs (which at the time must have convinced me ;)). So I could
> be wrong.
> 
> 
> > That thing really is too subtle for words. But if that's actually what you 
> > are alluding to, then doesn't that mean that we _really_ should be doing 
> > that "spin_lock(&anon_vma->lock)" even for the first allocation, and that 
> > the current code is broken? Because otherwise that other concurrent user 
> > that found the stale vma through page_lock_anon_vma() will now try to 
> > follow the linked list and _think_ it's stable (thanks to the lock), but 
> > we're actually inserting entries into it without holding any locks at all.
> 
> Yes, that's what I meant by "has other problems". Another thing is also
> that even if we have the lock here, I can't see why page_lock_anon_vma
> is safe against finding an anon_vma which has been deallocated then
> allocated for something else (and had vmas inserted into it etc.).

And Nick is right that page_lock_anon_vma() is not safe against finding
an anon_vma which has now been allocated for something else: but that
is no surprise, it's very much in the nature of SLAB_DESTROY_BY_RCU
(I left most of the comment in mm/slab.c, just said "tricky" here).

It should be no problem: having locked the right-or-perhaps-wrong
anon_vma, we then go on to search its list for a page which may or
may not be there, even when it's the right anon_vma; there's no need
for special code to deal with the very unlikely case that we've now
got an irrelevant list, it's just that the page we're looking for
won't be found in it.

But not-taking-the-lock-on-newly-allocated does then look wrong.

Hugh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2008-10-18 19:14 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-10-16  4:10 Nick Piggin
2008-10-17 22:14 ` Hugh Dickins
2008-10-17 23:05   ` Linus Torvalds
2008-10-18  0:13     ` Hugh Dickins
2008-10-18  0:25       ` Linus Torvalds
2008-10-18  1:53       ` Nick Piggin
2008-10-18  2:50         ` Paul Mackerras
2008-10-18  2:57           ` Linus Torvalds
2008-10-18  5:49           ` Nick Piggin
2008-10-18 10:49             ` Paul Mackerras
2008-10-18 17:00             ` Linus Torvalds
2008-10-18 18:44               ` Matthew Wilcox
2008-10-19  2:54                 ` Nick Piggin
2008-10-19  2:53               ` Nick Piggin
2008-10-17 23:13 ` Peter Zijlstra
2008-10-17 23:53   ` Linus Torvalds
2008-10-18  0:42     ` Linus Torvalds
2008-10-18  1:08       ` Linus Torvalds
2008-10-18  1:32         ` Nick Piggin
2008-10-18  2:11           ` Linus Torvalds
2008-10-18  2:25             ` Nick Piggin
2008-10-18  2:35               ` Nick Piggin
2008-10-18  2:53               ` Linus Torvalds
2008-10-18  5:20                 ` Nick Piggin
2008-10-18 10:38                   ` Peter Zijlstra
2008-10-19  9:52                     ` Hugh Dickins
2008-10-19 10:51                       ` Peter Zijlstra
2008-10-19 12:39                         ` Hugh Dickins
2008-10-19 18:25                         ` Linus Torvalds
2008-10-19 18:45                           ` Peter Zijlstra
2008-10-19 19:00                           ` Hugh Dickins
2008-10-20  4:03                           ` Hugh Dickins
2008-10-20 15:17                             ` Linus Torvalds
2008-10-20 18:21                               ` Hugh Dickins
2008-10-21  2:56                               ` Nick Piggin
2008-10-21  3:25                                 ` Linus Torvalds
2008-10-21  4:33                                   ` Nick Piggin
2008-10-21 12:58                                     ` Hugh Dickins
2008-10-21 15:59                                     ` Christoph Lameter
2008-10-22  9:29                                       ` Nick Piggin
2008-10-21  4:34                                   ` Nick Piggin
2008-10-21 13:55                                     ` Hugh Dickins
2008-10-21  2:44                           ` Nick Piggin
2008-10-18 19:14               ` Hugh Dickins [this message]
2008-10-19  3:03                 ` Nick Piggin
2008-10-19  7:07                   ` Hugh Dickins
2008-10-20  3:26                     ` Hugh Dickins
2008-10-21  2:45                       ` Nick Piggin
2008-10-19  1:13       ` Hugh Dickins
2008-10-19  2:41         ` Nick Piggin
2008-10-19  9:45           ` Hugh Dickins
2008-10-21  3:59             ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0810181952580.27309@blonde.site \
    --to=hugh@veritas.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=linux-mm@kvack.org \
    --cc=npiggin@suse.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox