From: Anthony Liguori <anthony@codemonkey.ws>
To: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Rik van Riel <riel@redhat.com>,
	linux-kernel@vger.kernel.org, npiggin@suse.de, akpm@osdl.org,
	jeremy@goop.org, xen-devel@lists.xensource.com,
	tmem-devel@oss.oracle.com, alan@lxorguk.ukuu.org.uk,
	linux-mm@kvack.org, kurt.hackel@oracle.com,
	Rusty Russell <rusty@rustcorp.com.au>,
	dave.mccracken@oracle.com, Marcelo Tosatti <mtosatti@redhat.com>,
	sunil.mushran@oracle.com, Avi Kivity <avi@redhat.com>,
	Schwidefsky <schwidefsky@de.ibm.com>,
	chris.mason@oracle.com, Balbir Singh <balbir@linux.vnet.ibm.com>
Subject: Re: [RFC PATCH 0/4] (Take 2): transcendent memory ("tmem") for Linux
Date: Sun, 12 Jul 2009 08:28:34 -0500
Message-ID: <4A59E502.1020008@codemonkey.ws>
In-Reply-To: <d693761e-2f2b-4d8c-ae4f-7f22479f6c0f@default>

Dan Magenheimer wrote:
> Oops, sorry, I guess that was a bit inflammatory.  What I meant to
> say is that inferring resource utilization efficiency is a very
> hard problem and VMware (and I'm sure IBM too) has done a fine job
> with it; CMM2 explicitly provides some very useful information from
> within the OS to the hypervisor so that it doesn't have to infer
> that information; but tmem is trying to go a step further by making
> the cooperation between the OS and hypervisor more explicit
> and directly beneficial to the OS.
>   

KVM definitely falls into the camp of trying to minimize modification to 
the guest.

>> If there was one change to tmem that would make it more palatable,
>> for me it would be changing the way pools are "allocated".  Instead
>> of getting an opaque handle from the hypervisor, I would force the
>> guest to allocate its own memory and to tell the hypervisor that
>> it's a tmem pool.
>>     
>
> An interesting idea but one of the nice advantages of tmem being
> completely external to the OS is that the tmem pool may be much
> larger than the total memory available to the OS.  As an extreme
> example, assume you have one 1GB guest on a physical machine that
> has 64GB physical RAM.  The guest now has 1GB of directly-addressable
> memory and 63GB of indirectly-addressable memory through tmem.
> That 63GB requires no page structs or other data structures in the
> guest.  And in the current (external) implementation, the size
> of each pool is constantly changing, sometimes dramatically so
> the guest would have to be prepared to handle this.  I also wonder
> if this would make shared-tmem-pools more difficult.
>
> I can see how it might be useful for KVM though.  Once the
> core API and all the hooks are in place, a KVM implementation of
> tmem could attempt something like this.
>   
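
To put a rough number on the page-struct point in the quoted example, here
is a back-of-the-envelope sketch.  The 4 KiB page size and the 64-byte
struct page are assumptions for illustration only (neither figure comes
from this thread), and page_struct_overhead() is a made-up helper name.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

PAGE_SIZE = 4096         # assumption: 4 KiB pages
STRUCT_PAGE_BYTES = 64   # assumption: rough size of one struct page


def page_struct_overhead(mem_bytes, page_size=PAGE_SIZE,
                         struct_page_bytes=STRUCT_PAGE_BYTES):
    """Bytes of per-page metadata a guest would need to track mem_bytes."""
    return (mem_bytes // page_size) * struct_page_bytes


# The quoted example: 63GB reachable only through tmem.
overhead = page_struct_overhead(63 * GIB)
print(overhead // MIB, "MiB of page structs")
```

Under those assumptions, tracking the 63GB inside the guest would cost about
as much metadata as the 1GB guest has memory in total.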

It's the core API that is really the issue.  The semantics of tmem 
(an external memory pool with a copy interface) are what is problematic.

The basic concept, notifying the VMM about memory that can be recreated 
by the guest to avoid the VMM having to swap before reclaim, is great 
and I'd love to see Linux support it in some way.
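
To make "external memory pool with copy interface" concrete, here is a toy
model.  It is only a sketch: EphemeralPool, put, get, shrink and read_page
are illustrative names, not the interface from the patch series, and the
oldest-first eviction is an assumption.  The point it shows is that the
guest only ever copies pages in and out, and that the owner of the pool can
discard recreatable pages at any time without swapping them.

```python
class EphemeralPool:
    """Toy hypervisor-owned pool that the guest reaches only via copies."""

    def __init__(self, capacity_pages):
        self.capacity_pages = capacity_pages
        self._pages = {}  # key -> bytes; dicts keep insertion order

    def put(self, key, data):
        """Copy a clean, recreatable page into the pool; may be refused."""
        if self.capacity_pages == 0:
            return False  # no room: the guest simply drops the page
        self._pages.pop(key, None)
        while len(self._pages) >= self.capacity_pages:
            del self._pages[next(iter(self._pages))]  # evict the oldest
        self._pages[key] = bytes(data)  # a copy, not a mapping
        return True

    def get(self, key):
        """Copy the page back, or return None if it was discarded."""
        return self._pages.get(key)

    def shrink(self, new_capacity):
        """Reclaim memory by discarding pages; no swap I/O is needed."""
        self.capacity_pages = new_capacity
        while len(self._pages) > new_capacity:
            del self._pages[next(iter(self._pages))]


def read_page(pool, key, read_from_disk):
    """Guest side: try the pool first, otherwise recreate from storage."""
    data = pool.get(key)
    if data is None:
        data = read_from_disk(key)
    return data
```

A miss is never an error in this model, because anything placed in the pool
can be recreated by the guest from its backing store.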

>> The big advantage of keeping the tmem pool part of the normal set of 
>> guest memory is that you don't introduce new challenges with 
>> respect to memory accounting.  Whether or not tmem is directly 
>> accessible from the guest, it is another memory resource.  I'm
>> certain that you'll want to do accounting of how much tmem is being
>> consumed by each guest
>>     
>
> Yes, the Xen implementation of tmem does accounting on a per-pool
> and a per-guest basis and exposes the data via a privileged
> "tmem control" hypercall.
>   

I was talking about accounting within the guest.  It's not just a matter 
of accounting within the mm; it's also about accounting in userspace.  A 
lot of software out there depends on getting detailed statistics from 
Linux about how much memory is in use in order to determine things like 
memory pressure.  If you introduce a new class of memory, you need a new 
class of statistics to expose to userspace and all those tools need 
updating.
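
As a sketch of the kind of tool I mean, here is a small parser for
/proc/meminfo-style text.  MemTotal, MemFree, Buffers and Cached are real
fields; the "Tmem:" line in the sample is purely hypothetical and only
stands in for a new class of memory.  The sample numbers are made up.

```python
SAMPLE_MEMINFO = """\
MemTotal:        1048576 kB
MemFree:          131072 kB
Buffers:           65536 kB
Cached:           262144 kB
Tmem:            4194304 kB
"""

# Fields this (imaginary) monitoring tool was written to understand.
KNOWN_FIELDS = {"MemTotal", "MemFree", "Buffers", "Cached"}


def parse_meminfo(text):
    """Parse 'Name:   value kB' lines into a dict of integers (kB)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def used_kb(fields):
    """A common 'memory in use' estimate built from the known fields."""
    return (fields["MemTotal"] - fields["MemFree"]
            - fields["Buffers"] - fields["Cached"])


def unknown_fields(fields):
    """Fields the tool silently ignores until someone updates it."""
    return set(fields) - KNOWN_FIELDS


info = parse_meminfo(SAMPLE_MEMINFO)
print(used_kb(info), sorted(unknown_fields(info)))
```

The estimate comes out the same with or without the extra line, so the tool
keeps reporting the same pressure no matter how much of the new class of
memory the guest is actually using.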

Regards,

Anthony Liguori
