linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>, hugh <hugh@veritas.com>,
	Paul E McKenney <paulmck@linux.vnet.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@elte.hu>, linux-mm <linux-mm@kvack.org>
Subject: Re: [RFC] another crazy idea to get rid of mmap_sem in faults
Date: Sun, 30 Nov 2008 20:42:04 +0100	[thread overview]
Message-ID: <1228074124.24749.26.camel@lappy.programming.kicks-ass.net> (raw)
In-Reply-To: <alpine.LFD.2.00.0811301123320.24125@nehalem.linux-foundation.org>

On Sun, 2008-11-30 at 11:27 -0800, Linus Torvalds wrote:
> 
> On Fri, 28 Nov 2008, Peter Zijlstra wrote:
> > 
> > While pondering the page_fault retry stuff, I came up with the following
> > idea.
> 
> I don't know if your idea is any good, but this part of your patch is 
> utter crap:
> 
> 	-       if (!down_read_trylock(&mm->mmap_sem)) {
> 	-               if ((error_code & PF_USER) == 0 &&
> 	-                   !search_exception_tables(regs->ip))
> 	-                       goto bad_area_nosemaphore;
> 	-               down_read(&mm->mmap_sem);
> 	-       }
> 	-
> 	+       down_read(&mm->mmap_sem);
> 
> because the reason we do a down_read_trylock() is not because of any lock 
> order issue or anything like that, but really really fundamental: we want 
> to be able to print an oops, instead of deadlocking, if we take a page 
> fault in kernel code while holding/waiting-for the mmap_sem for writing.
> 
> .... and I don't even see the reason why you did tht change anyway, since 
> it seems to be totally independent of all the other locking changes.

Yes, the patch sucks, and you replying to it in such detail makes me
think I should have never attached it..

I know why we do trylock, and the only reason I did that change is to
place the down/up right around the find_vma(), if you'd read the changed
comment you'd see I'd intended that to become rcu_read_{,un}lock()
however we currently cannot since RB trees are not compatible with
lockless lookups (due to the tree rotations).

Please consider the idea of lockless vma lookup and synchronizing
against the PTE lock.

If that primary idea seems feasible, I'll continue working on it and try
to tackle further obstacles.

One non trivial issue is splitting/merging vmas against this lockless
lookup - I do have a solution, but I;m not particularly fond of it. 

Other issues include replacing the RB tree with something that is suited
for lockless lookups (B+ trees for example) and extending SRCU to suit
the new requirements.

Anyway, please don't take the patch too serious (and again, yes, its
utter shite), but consider the idea of getting rid of the mmap_sem usage
in the regular fault path as outlined (currently I'm still not sure on
how to get rid of it in the stack extend case).


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2008-11-30 19:42 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-28 15:42 Peter Zijlstra
2008-11-28 17:04 ` Paul E. McKenney
2008-11-30 19:27 ` Linus Torvalds
2008-11-30 19:42   ` Peter Zijlstra [this message]
2008-12-01 22:55     ` Hugh Dickins
2008-12-01 14:00 ` Christoph Lameter
2008-12-01 14:48   ` Peter Zijlstra
2008-12-01 15:06     ` Christoph Lameter
2008-12-01 15:28       ` Peter Zijlstra
2008-12-01 16:22       ` Paul E. McKenney
2008-12-01 15:27     ` Lee Schermerhorn
2008-12-01 15:33       ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1228074124.24749.26.camel@lappy.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=hugh@veritas.com \
    --cc=linux-mm@kvack.org \
    --cc=mingo@elte.hu \
    --cc=nickpiggin@yahoo.com.au \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox