From: Nitin Gupta <ngupta@vflare.org>
To: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Nick Piggin <npiggin@suse.de>,
Andrew Morton <akpm@linux-foundation.org>,
jeremy@goop.org, xen-devel@lists.xensource.com,
tmem-devel@oss.oracle.com, Rusty Russell <rusty@rustcorp.com.au>,
Rik van Riel <riel@redhat.com>,
dave.mccracken@oracle.com, sunil.mushran@oracle.com,
Avi Kivity <avi@redhat.com>, Schwidefsky <schwidefsky@de.ibm.com>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
Alan Cox <alan@lxorguk.ukuu.org.uk>,
chris.mason@oracle.com, Pavel Machek <pavel@ucw.cz>,
linux-mm <linux-mm@kvack.org>,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: Tmem [PATCH 0/5] (Take 3): Transcendent memory
Date: Thu, 24 Dec 2009 08:57:19 +0530 [thread overview]
Message-ID: <4B32DF97.5060400@vflare.org> (raw)
In-Reply-To: <ff435130-98a2-417c-8109-9dd029022a91@default>
Hi Dan,
On 12/23/2009 10:45 PM, Dan Magenheimer wrote:
> I'm definitely OK with exploring alternatives. I just think that
> existing kernel mechanisms are very firmly rooted in the notion
> that either the kernel owns the memory/cache or an asynchronous
> device owns it. Tmem falls somewhere in between and is very
> carefully designed to maximize memory flexibility *outside* of
> the kernel -- across all guests in a virtualized environment --
> with minimal impact to the kernel, while still providing the
> kernel with the ability to use -- but not own, directly address,
> or control -- additional memory when conditions allow. And
> these conditions are not only completely invisible to the kernel,
> but change frequently and asynchronously from the kernel,
> unlike most external devices for which the kernel can "reserve"
> space and use it asynchronously later.
>
> Maybe ramzswap and FS-cache could be augmented to have similar
> advantages in a virtualized environment, but I suspect they'd
> end up with something very similar to tmem. Since the objective
> of both is to optimize memory that IS owned (used, directly
> addressable, and controlled) by the kernel, they are entirely
> complementary with tmem.
>
What we want is surely tmem but attempt is to better integrate with
existing infrastructure. Please give me few days as I try to develop
a prototype.
>> Swapping to hypervisor is mainly useful to overcome
>> 'static partitioning' problem you mentioned in article:
>> http://oss.oracle.com/projects/tmem/
>> ...such 'para-swap' can shrink/expand outside of VM constraints.
>
> Frontswap is very different than "hypervisor swapping" as what's
> done by VMware as a side-effect of transparent page-sharing. With
> frontswap, the kernel still decides which pages are swapped out.
> If frontswap says there is space, the swap goes "fast" to tmem;
> if not, the kernel writes it to its own swapdisk. So there's
> no "double paging" or random page selection/swapping. On
> the downside, kernels must have real swap configured and,
> to avoid DoS issues, frontswap is limited by the same constraint
> as ballooning (ie. can NOT expand outside of VM constraints).
>
I think I did not explain my point regarding "para-swap" correctly.
What I meant was a virtual swap device which appears as swap disk
to kernel. Kernel swaps to this disk as usual (hence the kernel decides
what pages to swap out). This device tries to send these pages to hvisor
but if that fails, it will fall back to swapping inside guest only. There
is no double swapping. I think this correctly explains purpose of Frontswap.
>> ...such 'para-swap' can shrink/expand outside of VM constraints.
<snip>
> frontswap is limited by the same constraint
> as ballooning (ie. can NOT expand outside of VM constraints).
>
What I meant was: A VM can have 512M of memory while this "para swap" disk
can have any size, say 1G. Now kernel can swapout 1G worth of data to this
swap which can (potentially) send all these pages to hvisor. In future, if
we want even more RAM for this VM, we can add another such swap device to guest.
Thus, in a way, we are able to overcome rigid static partitioning of VMs w.r.t.
memory resource.
>
> P.S. If you want to look at implementing FS-cache or ramzswap
> on top of tmem, I'd be happy to help, but I'll bet your concern:
>
>> we might later encounter some hidder/dangerous problems :)
>
> will prove to be correct.
>
Please allow me few days to experiment with 'virtswap' which will
be virtualization aware ramzswap driver. This will help us understand
problems we might face with such an approach. I am new to this virtio thing,
so it might take some time.
If we find virtswap to be feasible with virtio, we can go for fs-cache
backend we talked about.
Thanks,
Nitin
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2009-12-24 3:29 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-12-21 13:46 Nitin Gupta
2009-12-21 23:46 ` Dan Magenheimer
2009-12-23 6:28 ` Nitin Gupta
2009-12-23 17:15 ` Dan Magenheimer
2009-12-24 3:27 ` Nitin Gupta [this message]
2009-12-24 20:51 ` Dan Magenheimer
2009-12-25 19:18 ` Pavel Machek
2009-12-28 15:57 ` Dan Magenheimer
2009-12-28 20:51 ` Pavel Machek
2009-12-28 21:41 ` Dan Magenheimer
2009-12-29 2:07 ` Nitin Gupta
-- strict thread matches above, loose matches on Subject: below --
2009-12-18 0:36 Dan Magenheimer
2009-12-18 8:06 ` Pavel Machek
2009-12-21 10:54 ` Nitin Gupta
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4B32DF97.5060400@vflare.org \
--to=ngupta@vflare.org \
--cc=akpm@linux-foundation.org \
--cc=alan@lxorguk.ukuu.org.uk \
--cc=avi@redhat.com \
--cc=balbir@linux.vnet.ibm.com \
--cc=chris.mason@oracle.com \
--cc=dan.magenheimer@oracle.com \
--cc=dave.mccracken@oracle.com \
--cc=jeremy@goop.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mtosatti@redhat.com \
--cc=npiggin@suse.de \
--cc=pavel@ucw.cz \
--cc=riel@redhat.com \
--cc=rusty@rustcorp.com.au \
--cc=schwidefsky@de.ibm.com \
--cc=sunil.mushran@oracle.com \
--cc=tmem-devel@oss.oracle.com \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox