linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Andrew Morton <akpm@osdl.org>
Cc: Mike Waychison <mikew@google.com>,
	linux-mm@kvack.org,
	Linux Kernel list <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@osdl.org>
Subject: Re: [RFC] page fault retry with NOPAGE_RETRY
Date: Wed, 20 Sep 2006 11:57:09 +1000	[thread overview]
Message-ID: <1158717429.6002.231.camel@localhost.localdomain> (raw)
In-Reply-To: <20060919172105.bad4a89e.akpm@osdl.org>

> Livelocks.  I've described the deliberate one, but I fear there are
> accidental ones I haven't thought of.

But if you check for signal pending in handle_pte_fault() and possibly
do a cond_resched(), there should be no livelock situation...

> > My thinking was something around the lines of no_page() always does the
> > retry logic. Then, we do something like:
> > 
> > handle_pte_fault() gets modified. If do_no_page() returns
> > VM_FAULT_RETRY, it checks pte_present() again. If the PTE is present, it
> > returns VM_FAULT_MINOR. If PTE is absent, it checks for signals, and
> > returns VM_FAULT_MINOR if a signal is pending. If PTE is absent and no
> > signals are pending, it returns VM_FAULT_RETRY.
> 
> You forget that the point of this optimisation is to undo mmap_sem while
> waiting on the disk IO.  Once we've done that we cannot go looking at ptes
> or vmas: another thread could have munapped the whole lot or anything. 
> (And we always need to be afraid of use_mm()..)

Wait wait .. .we don't need to have the mmap sem to -look- at a PTE. Or
the hardware would be in pain when doing table walk. My point is, if the
PTE is -present- (which we can check without taking the mmap sem), we
just return to userland and re-do the instruction.

> Once mmap_sem has been dropped we need to go all the way back to the
> process's virtual address and redo the vma lookup. The easy and clean way
> of doing that is to rerun the fault, reuse all the existing code.

Sure, but what I'm saying is that we can still check if the PTE is
present or a signal pending and based on that, decide to return either
VM_FAULT_MINOR or VM_FAULT_RETRY. The former would cause do_page_fault()
to return all the way to userland while the later would just loop in
do_page_fault() as per Mike patch.

> > In addition, we still need to modify all archs do_page_fault() to handle
> > VM_FAULT_RETRY...
> 
> Yup, it would need some temporary ifdeffery while architetures convert.
> 
> But bear in mind my earlier comments regarding possible optimisations to
> this code.

Yup and I think my idea does, and I still don't see why we need this
additional burden of MAY_RETRY to carry around... (that is, I don't see
the livelock if we are careful enough to test for signals when we get a
VM_FAULT_RETRY result, and possibly cond_resched() or test
need_resched() and go back to userland, either way is fine by me).

Ben.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2006-09-20  1:57 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-09-14 22:55 Benjamin Herrenschmidt
2006-09-15  0:19 ` Linus Torvalds
2006-09-15  7:11 ` Andrew Morton
2006-09-15  7:35   ` Andrew Morton
2006-09-15 13:30     ` Hugh Dickins
2006-09-16  1:03       ` Benjamin Herrenschmidt
2006-09-19 23:35   ` Mike Waychison
2006-09-19 23:50     ` Benjamin Herrenschmidt
2006-09-19 23:59       ` Andrew Morton
2006-09-20  0:06         ` Benjamin Herrenschmidt
2006-09-20  0:05       ` Benjamin Herrenschmidt
2006-09-20  0:21         ` Andrew Morton
2006-09-20  1:57           ` Benjamin Herrenschmidt [this message]
2006-09-20  3:05             ` Andrew Morton
2006-09-20  5:04               ` Benjamin Herrenschmidt
2006-09-20  5:26                 ` Andrew Morton
2006-09-20  6:54                   ` Benjamin Herrenschmidt
2006-09-20 17:53                     ` Andrew Morton
2006-09-21 22:05                       ` Benjamin Herrenschmidt
2006-09-21 22:41                         ` Andrew Morton
2006-09-21 23:09                           ` Benjamin Herrenschmidt
2006-09-23 14:21                       ` Hugh Dickins
2006-09-23 19:46                         ` Andrew Morton
2006-09-23 22:35                           ` Benjamin Herrenschmidt
2006-09-20  5:06               ` Benjamin Herrenschmidt
2006-09-20  1:14       ` Mike Waychison
2006-09-20  2:02         ` Benjamin Herrenschmidt
2006-09-15 21:35 ` Arnd Bergmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1158717429.6002.231.camel@localhost.localdomain \
    --to=benh@kernel.crashing.org \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mikew@google.com \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox