From: Dan Magenheimer <dan.magenheimer@oracle.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: Rik van Riel <riel@redhat.com>,
linux-kernel@vger.kernel.org, npiggin@suse.de, akpm@osdl.org,
jeremy@goop.org, xen-devel@lists.xensource.com,
tmem-devel@oss.oracle.com, alan@lxorguk.ukuu.org.uk,
linux-mm@kvack.org, kurt.hackel@oracle.com,
Rusty Russell <rusty@rustcorp.com.au>,
dave.mccracken@oracle.com, Marcelo Tosatti <mtosatti@redhat.com>,
sunil.mushran@oracle.com, Avi Kivity <avi@redhat.com>,
Schwidefsky <schwidefsky@de.ibm.com>,
chris.mason@oracle.com, Balbir Singh <balbir@linux.vnet.ibm.com>
Subject: RE: [RFC PATCH 0/4] (Take 2): transcendent memory ("tmem") for Linux
Date: Sun, 12 Jul 2009 09:20:22 -0700 (PDT)
Message-ID: <a09e4489-a755-46e7-a569-a0751e0fc39f@default>
In-Reply-To: <4A59E502.1020008@codemonkey.ws>
> > that information; but tmem is trying to go a step further by making
> > the cooperation between the OS and hypervisor more explicit
> > and directly beneficial to the OS.
>
> KVM definitely falls into the camp of trying to minimize
> modification to the guest.
No argument there. Well, maybe one :-) Yes, but KVM
also heavily encourages unmodified guests. Tmem is
philosophically in favor of finding a balance between
things that work well with no changes to any OS (and
thus work just fine regardless of whether the OS is
running in a virtual environment or not), and things
that could work better if the OS is knowledgeable that
it is running in a virtual environment.
For those that believe virtualization is a flash-in-
the-pan, no modifications to the OS is the right answer.
For those that believe it will be pervasive in the
future, finding the right balance is a critical step
in operating system evolution.
(Sorry for the Sunday morning evangelizing :-)
> >> If there was one change to tmem that would make it more palatable,
> >> for me it would be changing the way pools are "allocated". Instead
> >> of getting an opaque handle from the hypervisor, I would force the
> >> guest to allocate its own memory and to tell the hypervisor that
> >> it's a tmem pool.
> >
> > I can see how it might be useful for KVM though. Once the
> > core API and all the hooks are in place, a KVM implementation of
> > tmem could attempt something like this.
>
> It's the core API that is really the issue. The semantics of tmem
> (external memory pool with copy interface) is really what is
> problematic.
> The basic concept, notifying the VMM about memory that can be
> recreated by the guest to avoid the VMM having to swap before
> reclaim, is great and I'd love to see Linux support it in some way.
Is it the tmem API or the precache/preswap API layered on
top of it that is problematic? Both currently assume copying
but perhaps the precache/preswap API could, with minor
modifications, meet KVM's needs better?
> > Yes, the Xen implementation of tmem does accounting on a per-pool
> > and a per-guest basis and exposes the data via a privileged
> > "tmem control" hypercall.
>
> I was talking about accounting within the guest. It's not just a
> matter of accounting within the mm, it's also about accounting in
> userspace. A lot of software out there depends on getting detailed
> statistics from Linux about how much memory is in use in order to
> determine things like memory pressure. If you introduce a new class
> of memory, you need a new class of statistics to expose to userspace
> and all those tools need updating.
OK, I see.
Well, first, tmem's very name means memory that is "beyond the
range of normal perception". This is certainly not the first class
of memory in use in data centers that can't be accounted at
process granularity. I'm thinking disk array caches as the
primary example. Also lots of tools that work great in a
non-virtualized OS are worthless or misleading in a virtual
environment.
Second, CPUs are getting much more complicated with massive
pipelines, many layers of caches each with different characteristics,
etc., and it's getting increasingly impossible to accurately and
reproducibly measure performance at a very fine granularity.
One could only expect that other resources, such as memory,
would move in that direction.