From: Chris Snook <csnook@redhat.com>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>,
Hugh Dickens <hugh@veritas.com>,
Linux Memory Management List <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Avi Kivity <avi@qumranet.com>,
Andrew Morton <akpm@linux-foundation.org>,
Rik van Riel <riel@redhat.com>
Subject: Re: Populating multiple ptes at fault time
Date: Wed, 17 Sep 2008 16:02:36 -0400
Message-ID: <48D1625C.7000309@redhat.com>
In-Reply-To: <48D142B2.3040607@goop.org>
Jeremy Fitzhardinge wrote:
> Avi and I were discussing whether we should populate multiple ptes at
> pagefault time, rather than one at a time as we do now.
>
> When Linux is operating as a virtual guest, pte population will
> generally involve some kind of trap to the hypervisor, either to
> validate the pte contents (in Xen's case) or to update the shadow
> pagetable (kvm). This is relatively expensive, and it would be good to
> amortise the cost by populating multiple ptes at once.
Is it still expensive when you're using nested page tables?
> Xen and kvm already batch pte updates where multiple ptes are explicitly
> updated at once (mprotect and unmap, mostly), but in practise that's
> relatively rare. Most pages are demand faulted into a process one at a
> time.
>
> It seems to me there are two cases: major faults, and minor faults:
>
> Major faults: the page in question is physically missing, and so the
> fault invokes IO. If we blindly pull in a lot of extra pages that are
> never used, then we'll end up wasting a lot of memory. However, page at
> a time IO is pretty bad performance-wise too, so I guess we do clustered
> fault-time IO? If we can distinguish between random and linear fault
> patterns, then we can use that as a basis for deciding how much
> speculative mapping to do. Certainly, we should create mappings for any
> nearby page which does become physically present.
We already have rather well-tested code in the VM to detect fault patterns,
complete with userspace hints to set readahead policy. It seems to me that if
we're going to read nearby pages into pagecache, we might as well actually map
them at the same time. Duplicating the readahead code is probably a bad idea.
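For anyone who hasn't used them, the userspace hints I mean are the madvise(2)
advice values. A minimal sketch, assuming Linux with glibc; the helper
hint_access_pattern() and the access_pattern enum are names made up for this
example, not anything in the kernel tree:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes madvise() and MAP_ANONYMOUS under strict -std= */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical enum for this sketch: how the caller expects to touch a mapping. */
enum access_pattern { ACCESS_NORMAL, ACCESS_SEQUENTIAL, ACCESS_RANDOM };

/*
 * Hypothetical helper: translate the expected access pattern into the
 * matching madvise(2) advice and apply it to [addr, addr + len).
 * addr must be page-aligned. Returns 0 on success, -1 with errno set
 * on failure, exactly as madvise() does.
 */
static int hint_access_pattern(void *addr, size_t len, enum access_pattern pattern)
{
    int advice = MADV_NORMAL;

    assert(addr != NULL && len > 0);

    switch (pattern) {
    case ACCESS_SEQUENTIAL:
        advice = MADV_SEQUENTIAL;   /* expect linear access: read ahead aggressively */
        break;
    case ACCESS_RANDOM:
        advice = MADV_RANDOM;       /* expect random access: readahead is less useful */
        break;
    case ACCESS_NORMAL:
        break;                      /* default readahead behaviour */
    }
    return madvise(addr, len, advice);
}
```

These are only hints, and the kernel is free to ignore them. My point is that
whatever decides how many ptes to populate could consult the same policy
instead of growing its own heuristics.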
> Minor faults are easier; if the page already exists in memory, we should
> just create mappings to it. If neighbouring pages are also already
> present, then we can cheaply create mappings for them too.
If we're mapping pagecache, then sure, this is really cheap, but speculatively
allocating anonymous pages will hurt, badly, on many workloads.
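To make the distinction concrete, here is a toy userspace model, not kernel
code. It assumes a flat array of per-page flags; struct toy_page and
toy_fault_around() are invented for illustration. On a minor fault it maps
only neighbours that are already resident, and it never allocates anything:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page state for this toy model. */
struct toy_page {
    bool in_pagecache;   /* page contents are already in memory */
    bool mapped;         /* a pte pointing at the page exists */
};

/*
 * Handle a fault at fault_idx by mapping every already-resident, not yet
 * mapped page within `window` pages on either side. Non-resident pages
 * are skipped: nothing is allocated or read in speculatively.
 * Returns the number of ptes populated, or -1 if the faulting page itself
 * is not resident (that would be a major fault, outside this sketch).
 */
static int toy_fault_around(struct toy_page *pages, size_t npages,
                            size_t fault_idx, size_t window)
{
    assert(pages != NULL && fault_idx < npages);

    if (!pages[fault_idx].in_pagecache)
        return -1;

    /* Clamp the window to the array bounds without overflowing. */
    size_t start = fault_idx > window ? fault_idx - window : 0;
    size_t end = (npages - fault_idx - 1 > window) ? fault_idx + window + 1
                                                   : npages;
    int populated = 0;

    for (size_t i = start; i < end; i++) {
        if (pages[i].in_pagecache && !pages[i].mapped) {
            pages[i].mapped = true;   /* stands in for setting one pte */
            populated++;
        }
    }
    return populated;
}
```

In the virtual case, the hope would be that all of the ptes set in that loop
go to the hypervisor as one batch rather than one trap each.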
> This seems like an obvious idea, so I'm wondering if someone has
> prototyped it already to see what effects there are. In the native
> case, pte updates are much cheaper, so perhaps it doesn't help much
> there, though it would potentially reduce the number of faults needed.
> But I think there's scope for measurable benefits in the virtual case.
Sounds like something we might want to enable conditionally on the use of pv_ops
features.
-- Chris