From: Anthony Liguori <anthony@codemonkey.ws>
To: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: npiggin@suse.de, akpm@osdl.org, jeremy@goop.org,
xen-devel@lists.xensource.com, tmem-devel@oss.oracle.com,
kurt.hackel@oracle.com, Rusty Russell <rusty@rustcorp.com.au>,
linux-kernel@vger.kernel.org, dave.mccracken@oracle.com,
linux-mm@kvack.org, chris.mason@oracle.com,
sunil.mushran@oracle.com, Avi Kivity <avi@redhat.com>,
Schwidefsky <schwidefsky@de.ibm.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
alan@lxorguk.ukuu.org.uk,
Balbir Singh <balbir@linux.vnet.ibm.com>
Subject: Re: [Xen-devel] Re: [RFC PATCH 0/4] (Take 2): transcendent memory ("tmem") for Linux
Date: Wed, 08 Jul 2009 18:57:38 -0500
Message-ID: <4A553272.5050909@codemonkey.ws>
In-Reply-To: <ac5dec0d-e593-4a82-8c9d-8aa374e8c6ed@default>
Dan Magenheimer wrote:
> Hi Anthony --
>
> Thanks for the comments.
>
>
>> I have trouble mapping this to a VMM capable of overcommit
>> without just coming back to CMM2.
>>
>> In CMM2 parlance, ephemeral tmem pools is just normal kernel memory
>> marked in the volatile state, no?
>>
>
> They are similar in concept, but a volatile-marked kernel page
> is still a kernel page, can be changed by a kernel (or user)
> store instruction, and counts as part of the memory used
> by the VM. An ephemeral tmem page cannot be directly written
> by a kernel (or user) store,

Why does tmem require a special store?

A VMM can trap write operations, and pages can be stored on disk
transparently by the VMM if necessary. I guess that's the bit I'm missing.

>> It seems to me that an architecture built around hinting
>> would be more
>> robust than having to use separate memory pools for this type
>> of memory
>> (especially since you are requiring a copy to/from the pool).
>>
>
> Depends on what you mean by robust, I suppose. Once you
> understand the basics of tmem, it is very simple and this
> is borne out in the low invasiveness of the Linux patch.
> Simplicity is another form of robustness.
>
The main disadvantage I see is that you need to explicitly convert
portions of the kernel to use a data copying API. That seems like an
invasive change to me. Hinting, on the other hand, can be done in a
less invasive way.

I'm not really arguing against tmem, just the need to have explicit
get/put mechanisms for the transcendent memory areas.
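
To make the distinction concrete, here is a minimal user-space sketch of
the two styles. It is not taken from the tmem patches or from CMM2; every
name in it (toy_pool, toy_put, toy_get, toy_hint_volatile) is hypothetical
and exists only for illustration. The copy-based interface moves a full
page across the boundary on every put and get, while the hint-based one
only changes a state flag on a page that stays where it is.

```c
#include <assert.h>
#include <string.h>

#define TOY_PAGE_SIZE  4096
#define TOY_POOL_SLOTS 4

/* Hypothetical copy-based pool: page contents live outside the caller's memory. */
struct toy_pool {
    int used[TOY_POOL_SLOTS];
    unsigned long key[TOY_POOL_SLOTS];
    unsigned char data[TOY_POOL_SLOTS][TOY_PAGE_SIZE];
};

/* put: copy one page into the pool. Returns 0 on success, -1 if the pool is full. */
static int toy_put(struct toy_pool *p, unsigned long key, const unsigned char *page)
{
    int slot = -1;

    for (int i = 0; i < TOY_POOL_SLOTS; i++) {
        if (p->used[i] && p->key[i] == key) {
            slot = i;               /* same key: overwrite the old copy */
            break;
        }
        if (!p->used[i] && slot < 0)
            slot = i;               /* remember the first free slot */
    }
    if (slot < 0)
        return -1;

    memcpy(p->data[slot], page, TOY_PAGE_SIZE);   /* the explicit copy */
    p->key[slot] = key;
    p->used[slot] = 1;
    return 0;
}

/* get: copy one page back out. Returns 0 on a hit, -1 on a miss. */
static int toy_get(const struct toy_pool *p, unsigned long key, unsigned char *page)
{
    for (int i = 0; i < TOY_POOL_SLOTS; i++) {
        if (p->used[i] && p->key[i] == key) {
            memcpy(page, p->data[i], TOY_PAGE_SIZE);  /* a second copy */
            return 0;
        }
    }
    return -1;
}

/* Hypothetical hint-based page: the data never moves, only its state changes. */
enum toy_state { TOY_STABLE, TOY_VOLATILE };

struct toy_page {
    enum toy_state state;
    unsigned char data[TOY_PAGE_SIZE];
};

/* hint: tell the (imaginary) VMM that this page may be discarded. No copy. */
static void toy_hint_volatile(struct toy_page *pg)
{
    pg->state = TOY_VOLATILE;
}
```

In this toy model a caller of toy_put/toy_get pays two TOY_PAGE_SIZE
copies per round trip, and every call site has to be written against the
pool interface. A caller of toy_hint_volatile touches one field and keeps
using the page in place.
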
> The copy may be expensive on an older machine, but on newer
> machines copying a page is relatively inexpensive.
I don't think that's a true statement at all :-) If you had a workload
where data never came into the CPU cache (zero-copy) and now you
introduce a copy, even with a new system, you're going to see a
significant performance hit.
> On a reasonable
> multi-VM-kernbench-like benchmark I'll be presenting at Linux
> Symposium next week, the overhead is on the order of 0.01%
> for a fairly significant savings in IOs.
>
But how would something like specweb do where you should be doing
zero-copy IO from the disk to the network? This is the area where I
would be concerned. For something like kernbench, you're already
bringing the disk data into the CPU cache anyway so I can appreciate
that the copy could get lost in the noise.
Regards,
Anthony Liguori