linux-mm.kvack.org archive mirror
From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Rik van Riel <riel@redhat.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>,
	akpm@osdl.org, frankeh@watson.ibm.com,
	virtualization@lists.osdl.org, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, linux-mm@kvack.org,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	hugh@veritas.com
Subject: Re: [patch 0/6] Guest page hinting version 7.
Date: Thu, 02 Apr 2009 17:50:49 -0700
Message-ID: <49D55D69.5030605@goop.org>
In-Reply-To: <49D51A82.8090908@redhat.com>

Rik van Riel wrote:
> I guess we could try to figure out a simple and robust policy
> for ballooning.  If we can come up with a policy which nobody
> can shoot holes in by just discussing it, it may be worth
> implementing and benchmarking.
>
> Maybe something based on the host passing memory pressure
> on to the guests, and the guests having their own memory
> pressure push back to the host.
>
> I'll start by telling you the best auto-ballooning policy idea
> I have come up with so far, and the (major?) hole in it.
>   

I think the first step is to describe reasonably precisely the outcome
you're trying to get to.  Once you have that, you can start talking about
policies and mechanisms to achieve it.  I suspect we all have basically
the same thing in mind, but there's no harm in being explicit.

I'm assuming that:

   1. Each domain has a minimum guaranteed amount of resident memory.  
      If you want to shrink a domain to smaller than that minimum, you
      may as well take all its memory away (ie suspend to disk,
      completely swap out, migrate elsewhere, etc).  The amount is at
      least the bare-minimum WSS for the domain, but it may be higher to
      achieve other guarantees.
   2. Each domain has a maximum allowable resident memory, which could
      be unbounded.  The sum of all maximums could well exceed the
      total amount of host memory, and that represents the overcommit case.
   3. Each domain has a weight, or memory priority.  The simple case is
      that they all have the same weight, but a useful implementation
      would probably allow more.
   4. Domains can be cooperative, unhelpful (ignore all requests and
      make none) or malicious (ignore requests, always try to claim more
      memory).  An incompetent cooperative domain could be effectively
      unhelpful or malicious.
          * hard max limits will prevent non-cooperative domains from
            causing too much damage
          * they could be limited in other ways, by lowering IO or CPU
            priorities
          * a domain's "goodness" could be measured by looking to see
            how much memory it is actually using relative to its min size
            and its weight
          * other remedies are essentially non-technical, such as
            billing a domain more the less good it is
          * (it's hard to force a Xen domain to give up memory you've
            already given it)
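
To make the assumptions above concrete, here is a minimal sketch of the
per-domain state they imply.  It is only an illustration: the names
(Domain, Behaviour, clamp, usage_score) and the exact scoring formula are
hypothetical, chosen here as one possible reading of "memory in use
relative to min size and weight".  A lower score means a better-behaved
domain.

```python
import math
from dataclasses import dataclass
from enum import Enum


class Behaviour(Enum):
    COOPERATIVE = "cooperative"  # honours requests and returns memory
    UNHELPFUL = "unhelpful"      # ignores all requests and makes none
    MALICIOUS = "malicious"      # ignores requests, always claims more


@dataclass
class Domain:
    name: str
    min_mb: int                  # guaranteed resident minimum (point 1)
    max_mb: float = math.inf     # hard cap, possibly unbounded (point 2)
    weight: float = 1.0          # memory priority (point 3)
    resident_mb: int = 0         # memory the domain currently holds
    behaviour: Behaviour = Behaviour.COOPERATIVE  # point 4

    def clamp(self, target_mb):
        """Restrict a proposed size to the [min_mb, max_mb] range."""
        return max(self.min_mb, min(target_mb, self.max_mb))

    def usage_score(self):
        """Hypothetical "goodness" measure: resident memory relative to
        the minimum size, scaled down by the weight.  Lower is better."""
        return self.resident_mb / (self.min_mb * self.weight)


d = Domain("guest-a", min_mb=256, max_mb=1024, weight=2.0, resident_mb=768)
print(d.clamp(64), d.clamp(4096), d.usage_score())  # 256 1024 1.5
```

The hard max is enforced by clamp() regardless of behaviour, which is the
only protection that does not depend on the domain cooperating.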

Given that, what outcome do we want?  What are we optimising for?

    * Overall throughput?
    * Fairness?
    * Minimise wastage?
    * Rapid response to changes in conditions?  (Cope with domains
      swinging between 64MB and 3GB on a regular basis?)
    * Punish wrong-doers / Reward cooperative domains?
    * ...?

Trying to make one thing work for all cases isn't going to be simple or 
robust.  If we pick one or two (minimise wastage+overall throughput?) 
then it might be more tractable.
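
As an illustration of what picking just one goal might look like, here is
a rough sketch that optimises only for weighted fairness within the
min/max bounds assumed above.  Everything in it is hypothetical (the
function name weighted_targets, the tuple layout, the water-filling
approach); it is not a proposal for the actual policy, and it ignores
throughput, responsiveness and misbehaving domains entirely.

```python
import math


def weighted_targets(domains, host_mb):
    """Split host_mb between domains in proportion to weight.

    domains is a list of (name, min_mb, max_mb, weight) tuples.  Every
    domain first gets its minimum; the remainder is handed out by weight,
    and a domain stops receiving memory once it reaches its maximum.
    """
    if any(w <= 0 for _, _, _, w in domains):
        raise ValueError("weights must be positive")
    targets = {name: float(lo) for name, lo, _, _ in domains}
    spare = host_mb - sum(targets.values())
    if spare < 0:
        raise ValueError("minimum guarantees exceed host memory")

    active = [d for d in domains if d[2] > d[1]]
    while spare > 1e-9 and active:
        total_w = sum(w for _, _, _, w in active)
        # Largest share per unit of weight that neither overspends the
        # spare memory nor pushes any active domain past its maximum.
        step = min(spare / total_w,
                   min((hi - targets[name]) / w for name, _, hi, w in active))
        for name, _, _, w in active:
            targets[name] += step * w
        spare -= step * total_w
        # Domains that hit their cap drop out of further rounds.
        active = [d for d in active if d[2] - targets[d[0]] > 1e-9]
    return targets


doms = [("a", 100, 200, 1.0), ("b", 100, math.inf, 1.0)]
print(weighted_targets(doms, 1000))  # {'a': 200.0, 'b': 800.0}
```

Even this single-goal version needs an error path for the case where the
minimums alone do not fit, which is the point at which a domain has to be
suspended or migrated rather than shrunk.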

> Basically, the host needs the memory pressure notification,
> where the VM will notify the guests when memory is running
> low (and something could soon be swapped).  At that point,
> each guest which receives the signal will try to free some
> memory and return it to the host.
>
> Each guest can have the reverse in its own pageout code.
> Once memory pressure grows to a certain point (eg. when
> the guest is about to swap something out), it could reclaim
> a few pages from the host.
>
> If all the guests behave themselves, this could work.
>   

Yes.  It seems to me the basic metric is that each domain needs to keep 
track of how much easily allocatable memory it has on hand (ie, pages it 
can drop without causing a significant increase in IO).  If it gets too 
large, then it can afford to give pages back to the host.  If it gets 
too small, it must ask for more memory (preferably early enough to 
prevent a real memory crunch).
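
A minimal sketch of that metric as a guest-side decision, assuming two
watermarks on the count of easily droppable pages.  The function name
balloon_decision and the low_mark/high_mark parameters are made up for
illustration; how a real guest would count droppable pages, and how the
marks would be chosen, is left open.

```python
def balloon_decision(droppable_pages, low_mark, high_mark):
    """Return how many pages to move between guest and host.

    Negative: pages the guest can afford to give back to the host.
    Positive: pages the guest should ask the host for.
    Zero:     the guest is inside its comfort band, do nothing.
    """
    if low_mark > high_mark:
        raise ValueError("low_mark must not exceed high_mark")
    if droppable_pages > high_mark:
        # More slack than needed: return the surplus above the high mark.
        return -(droppable_pages - high_mark)
    if droppable_pages < low_mark:
        # Running short: ask for enough to get back to the low mark,
        # ideally before a real memory crunch sets in.
        return low_mark - droppable_pages
    return 0


print(balloon_decision(500, 100, 400))  # -100
print(balloon_decision(50, 100, 400))   # 50
print(balloon_decision(200, 100, 400))  # 0
```

The gap between the two marks gives some hysteresis, so a guest hovering
around one threshold does not bounce pages back and forth with the host.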

> However, even with just reasonably behaving guests,
> differences between the VMs in each guest could lead
> to unbalanced reclaiming, penalizing better behaving
> guests.
>   

Well, it depends on what you mean by penalized.  If they can function 
properly with the amount of memory they have, then they're fine.  If 
they're struggling because they don't have enough memory for their WSS, 
then they got their "do I have enough memory on hand" calculation wrong.

> If one guest is behaving badly, it could really impact
> the other guests.
>
> Can you think of improvements to this idea?
>
> Can you think of another balloon policy that does
> not have nasty corner cases?
>   

In fully cooperative environments you can rely on ballooning to move 
things around dramatically.  But with only partially cooperative guests, 
the best you can hope for is that it allows you some provisioning 
flexibility so you can deal with fluctuating demands in guests, but not 
order-of-magnitude size changes.  You just have to leave enough headroom 
to make the corner cases not too pointy.

    J

