linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Paul Turner <pjt@google.com>,
	Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
	Christoph Lameter <cl@linux.com>, Rik van Riel <riel@redhat.com>,
	Mel Gorman <mgorman@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Hugh Dickins <hughd@google.com>
Subject: Re: [RFC PATCH] mm/migration: Don't lock anon vmas in rmap_walk_anon()
Date: Sat, 1 Dec 2012 19:30:50 +0100	[thread overview]
Message-ID: <20121201183050.GA32102@gmail.com> (raw)
In-Reply-To: <CA+55aFymM2MjEhRUmd-T_3ZJRTa9t5NzBuWgcbcdZmbUJv6dcQ@mail.gmail.com>


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Sat, Dec 1, 2012 at 1:49 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > I *think* you are right that for this type of migration that 
> > we are using here we indeed don't need to take an exclusive 
> > vma lock - in fact I think we don't need to take it at all:
> 
> I'm pretty sure we do need at least a read-side reference.

Ok.

> Even if no other MM can contain that particular pte, the vma 
> lock protects the chain that is created by fork and exit and 
> vma splitting etc.
> 
> So it's enough that another thread does a fork() at the same 
> time. Or a partial unmap of the vma (that splits it in two), 
> for the rmap chain to be modified.
> 
> Besides, there's absolutely nothing that protects that vma to 
> be part of the same vma chain in entirely unrelated processes. 
> The vma chain can get quite long over multiple forks (it's 
> even a performance problem under some extreme loads).
> 
> And we do walk the rmap chain - so we need the lock.
> 
> But we walk it read-only afaik, which is why I think the 
> semaphore could be an rwsem.
> 
> Now, there *are* likely cases where we could avoid anon_vma 
> locking entirely, but they are very specialized. They'd be 
> along the lines of
> 
>  - we hold the page table lock
>  - we see that vma->anon_vma == vma->anon_vma->root
>  - we see that vma->anon_vma->refcount == 1
> 
> or similar, because then we can guarantee that the anon-vma 
> chain has a length of one without even locking, and holding 
> the page table lock means that any concurrent fork or 
> mmap/munmap from another thread will block on this particular 
> pte.

Hm. These conditions may be true for some pretty common cases, 
but it's difficult to discover that information from the 
migration code due to the way we discover all anon vmas and walk 
the anon vma list: we first lock the anon vma then do we get to 
iterate over the individual vmas and do the pte changes.

So it's the wrong way around.

I think your rwsem suggestion is a lot more generic and more 
robust as well.

> So I suspect that in the above kind of special case (which 
> might be a somewhat common case for normal page faults, for 
> example) we could make a "we have exclusive pte access to this 
> page" argument. But quite frankly, I completely made the above 
> rules up in my head, they may be bogus too.
> 
> For the general migration case, it's definitely not possible 
> to just drop the anon_vma lock.

Ok, I see - I'll redo this part then and try out how an rwsem 
fares. I suspect it would give a small speedup to a fair number 
of workloads, so it's worthwile to spend some time on it.

Thanks for the suggestions!

	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2012-12-01 18:30 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-30 19:58 [PATCH 00/10] Latest numa/core release, v18 Ingo Molnar
2012-11-30 19:58 ` [PATCH 01/10] sched: Add "task flipping" support Ingo Molnar
2012-11-30 19:58 ` [PATCH 02/10] sched: Move the NUMA placement logic to a worklet Ingo Molnar
2012-11-30 19:58 ` [PATCH 03/10] numa, mempolicy: Improve CONFIG_NUMA_BALANCING=y OOM behavior Ingo Molnar
2012-11-30 19:58 ` [PATCH 04/10] mm, numa: Turn 4K pte NUMA faults into effective hugepage ones Ingo Molnar
2012-11-30 19:58 ` [PATCH 05/10] sched: Introduce directed NUMA convergence Ingo Molnar
2012-11-30 19:58 ` [PATCH 06/10] sched: Remove statistical NUMA scheduling Ingo Molnar
2012-11-30 19:58 ` [PATCH 07/10] sched: Track quality and strength of convergence Ingo Molnar
2012-11-30 19:58 ` [PATCH 08/10] sched: Converge NUMA migrations Ingo Molnar
2012-11-30 19:58 ` [PATCH 09/10] sched: Add convergence strength based adaptive NUMA page fault rate Ingo Molnar
2012-11-30 19:58 ` [PATCH 10/10] sched: Refine the 'shared tasks' memory interleaving logic Ingo Molnar
2012-11-30 20:37 ` [PATCH 00/10] Latest numa/core release, v18 Linus Torvalds
2012-12-01  9:49   ` [RFC PATCH] mm/migration: Don't lock anon vmas in rmap_walk_anon() Ingo Molnar
2012-12-01 12:26     ` [RFC PATCH] mm/migration: Remove anon vma locking from try_to_unmap() use Ingo Molnar
2012-12-01 18:38       ` Linus Torvalds
2012-12-01 18:41         ` Ingo Molnar
2012-12-01 18:50           ` Linus Torvalds
2012-12-01 20:10             ` [PATCH 1/2] mm/rmap: Convert the struct anon_vma::mutex to an rwsem Ingo Molnar
2012-12-01 20:19               ` Rik van Riel
2012-12-02 15:10                 ` Ingo Molnar
2012-12-03 13:59               ` Mel Gorman
2012-12-01 20:15             ` [PATCH 2/2] mm/migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable Ingo Molnar
2012-12-01 20:33               ` Rik van Riel
2012-12-02 15:12                 ` [PATCH 2/2, v2] " Ingo Molnar
2012-12-02 17:53                   ` Rik van Riel
2012-12-04 14:42                   ` Michel Lespinasse
2012-12-05  2:59                   ` Michel Lespinasse
2012-12-03 14:17               ` [PATCH 2/2] " Mel Gorman
2012-12-04 14:37                 ` Michel Lespinasse
2012-12-04 18:17                   ` Mel Gorman
2012-12-01 18:55         ` [RFC PATCH] mm/migration: Remove anon vma locking from try_to_unmap() use Rik van Riel
2012-12-01 16:19     ` [RFC PATCH] mm/migration: Don't lock anon vmas in rmap_walk_anon() Rik van Riel
2012-12-01 17:55     ` Linus Torvalds
2012-12-01 18:30       ` Ingo Molnar [this message]
2012-12-03 13:41   ` [PATCH 00/10] Latest numa/core release, v18 Mel Gorman
2012-12-04 17:30     ` Thomas Gleixner
2012-12-03 10:43 ` Mel Gorman
2012-12-03 11:32 ` Mel Gorman
2012-12-04 22:49 ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121201183050.GA32102@gmail.com \
    --to=mingo@kernel.org \
    --cc=Lee.Schermerhorn@hp.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=pjt@google.com \
    --cc=riel@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox