Re: Page host virtual assist patches.

linux-mm.kvack.org archive mirror

From: Martin Schwidefsky <schwidefsky@de.ibm.com>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Andrew Morton <akpm@osdl.org>,
	linux-mm@kvack.org, frankeh@watson.ibm.com, rhim@cc.gatech.edu
Subject: Re: Page host virtual assist patches.
Date: Tue, 25 Apr 2006 12:36:26 +0200
Message-ID: <1145961386.5282.37.camel@localhost>
In-Reply-To: <444DDD1B.4010202@yahoo.com.au>

On Tue, 2006-04-25 at 18:26 +1000, Nick Piggin wrote:
> > Because calling into the guest is too slow. You need to schedule a cpu,
> > the code that does the allocation needs to run, which might need other
> > pages, etc. The beauty of the scheme is that the host can immediately
> > remove a page that is marked as volatile or unused. No i/o, no scheduling,
> > nothing. Consider what that does to the latency of the host's memory
> > allocation. Even if the percentage of discardable pages is small, let's
> > say 25% of the guests' memory, the host will quickly find reusable
> > memory. If the vmscan of the host attempts to evict 100 pages, on
> > average it will start i/o for 75 of them, the other 25 are immediately
> > free for reuse.
> > 
> 
> I don't think there is any beauty in this scheme, to be honest.

Beauty lies in the eye of the beholder. From my point of view there is
benefit to the method.

> I don't see why calling into the host is bad - won't it be able to
> make better reclaim decisions? If starting IO is the wrong thing to
> do under a hypervisor, why is it the right thing to do on bare metal?

First some assumptions about the environment. We are talking about a
paging hypervisor that runs several hundred guest Linux images. The
memory is overcommitted: the sum of the guest memory sizes is larger
than the host memory by a factor of 2-3. Usually a large percentage of
the guests' memory is paged out by the hypervisor.

Both the host and the guest follow an LRU strategy. That means that the
host will pick the oldest page from the idlest guest. Almost the same
would happen if you call into the idlest guest to let the guest free its
oldest page. But the catch is that the guest will touch a lot of pages
doing its vmscan operation. If that causes a single additional host i/o
because a guest page needs to be retrieved from the host swap device,
you are already in negative territory.
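
To put rough numbers on that catch, here is a minimal sketch. It assumes
(this is a simplification, not something measured) that every page the
guest touches during its scan is paged out by the host independently with
the same probability. The function name and parameters are made up for
illustration.

```python
def guest_scan_overhead(pages_touched: int, paged_out_fraction: float):
    """Estimate extra host i/o caused by a guest-side vmscan.

    Assumption: each touched guest page is paged out by the host
    independently with probability `paged_out_fraction`.
    Returns (expected host reads, probability of at least one host read).
    """
    # Every touched page that is paged out costs one host read.
    expected_reads = pages_touched * paged_out_fraction
    # The scan is "in negative territory" as soon as one read happens.
    p_at_least_one = 1.0 - (1.0 - paged_out_fraction) ** pages_touched
    return expected_reads, p_at_least_one


# Hypothetical example: the scan touches 10 pages, half of guest memory
# is paged out by the host.
reads, p_any = guest_scan_overhead(10, 0.5)
print(reads, p_any)
```

Under these assumptions even a short scan is almost certain to trigger at
least one host read once a large share of the guest is paged out.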

> As for latency of host's memory allocation, it should attempt to
> keep some buffer of memory free.

It does attempt to keep some memory free. But let's say 1000 guest images
generate a lot of memory pressure. You will run out of memory, and
anything that speeds up the host reclaim will improve the situation. The
method also reduces the number of i/o's that the host needs to do.
Consider an old, volatile page that is picked for eviction. Without hva
the host will write it to the paging device. If the guest touches the
page again the host has to read it back into memory. Two host i/o's.
If the host discards the page instead, the guest will get a discard fault
when it tries to reaccess the page. The guest will read the page from its
backing device. One guest i/o. Seems like a good deal to me.
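
The accounting for a single evicted page can be written down as a small
sketch. This is only a counting model of the description above; the
function name and flags are hypothetical and do not appear in the patches.

```python
def io_cost(volatile: bool, hva: bool, retouched: bool) -> dict:
    """Count i/o operations for evicting one page and possibly reusing it."""
    if hva and volatile:
        # The host discards the page without any i/o. If the guest touches
        # it again it gets a discard fault and rereads the page from its
        # own backing device.
        return {"host_io": 0, "guest_io": 1 if retouched else 0}
    # Otherwise the host writes the page to its paging device, and reads
    # it back if the guest touches it again.
    return {"host_io": 2 if retouched else 1, "guest_io": 0}


# An old, volatile page that the guest touches again after eviction:
print(io_cost(volatile=True, hva=False, retouched=True))
print(io_cost(volatile=True, hva=True, retouched=True))
```

The model ignores i/o size and latency; it only counts operations.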

-- 
blue skies,
  Martin.

Martin Schwidefsky
Linux for zSeries Development & Services
IBM Deutschland Entwicklung GmbH

"Reality continues to ruin my life." - Calvin.

