linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Nitin Gupta <ngupta@vflare.org>
To: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Pavel Machek <pavel@ucw.cz>, Nick Piggin <npiggin@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	jeremy@goop.org, xen-devel@lists.xensource.com,
	tmem-devel@oss.oracle.com, Rusty Russell <rusty@rustcorp.com.au>,
	Rik van Riel <riel@redhat.com>,
	dave.mccracken@oracle.com, sunil.mushran@oracle.com,
	Avi Kivity <avi@redhat.com>, Schwidefsky <schwidefsky@de.ibm.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	chris.mason@oracle.com, linux-mm <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: Tmem [PATCH 0/5] (Take 3): Transcendent memory
Date: Tue, 29 Dec 2009 07:37:38 +0530	[thread overview]
Message-ID: <4B39646A.3080007@vflare.org> (raw)
In-Reply-To: <f4ab13eb-daaa-40be-82ad-691505b1f169@default>

On 12/28/2009 09:27 PM, Dan Magenheimer wrote:
> 
>> From: Pavel Machek [mailto:pavel@ucw.cz]

>>> I'm definitely OK with exploring alternatives.  I just think that
>>> existing kernel mechanisms are very firmly rooted in the notion
>>> that either the kernel owns the memory/cache or an asynchronous
>>> device owns it.  Tmem falls somewhere in between and is very
>>
>> Well... compcache seems to be very similar to preswap: in preswap case
>> you don't know if hypervisor will have space, in ramzswap you don't
>> know if data are compressible.
> 
> Hi Pavel --
> 
> Yes there are definitely similarities too.  In fact, I started
> prototyping preswap (now called frontswap) with Nitin's
> compcache code.  IIRC I ran into some problems with compcache's
> difficulties in dealing with failed "puts" due to dynamic
> changes in size of hypervisor-available-memory.
> 
> Nitin may have addressed this in later versions of ramzswap.
> 

Any kind of swap device that works entirely within guest
(or in native case), will always have problems with any write(put)
failure -- we want to reclaim a page but due to write failure, we can't. Problem!
So, ramzswap also cannot afford to have lot of write failures.

However, the story is different when ramzswap is "virtualization aware".
In this case, we can surely afford to have any numnber of "put" failures
to hypervisor. When this put fails, we will either compress the page and
keep it in guest memory itself or forward it to ramzswap backing swap
device (if present).

Another side point is that we can achieve all this with ramzswap approach
of virtual block devices without any kernel changes as everything is a module.


> One feature of frontswap which is different than ramzswap is
> that frontswap acts as a "fronting store" for all configured
> swap devices, including SAN/NAS swap devices.  It doesn't
> need to be separately configured as a "highest priority" swap
> device.  In many installations and depending on how ramzswap
> is configured, this difference probably doesn't make much
> difference though.
> 

Having a frontswap layer over *every* swap might not be desirable. I think such
things should be completely out of way when not desired. This was one the primary
reasons to have virtual block device approach for ramzswap. You can create any number
of such devices (/dev/ramzswap{0,1,2...}) with each having separate backing device (optional),
memory pools, buffers etc. which adds additional flexibility and helps with scalability.

On a downside however, as you pointed out, managing all this can be a problem for sysadmins.
To ease this, some userspace magic can help which will dynamically manage these virtual disks,
though I have not yet thought much in this direction.

Thanks,
Nitin

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2009-12-29  2:09 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-12-21 13:46 Nitin Gupta
2009-12-21 23:46 ` Dan Magenheimer
2009-12-23  6:28   ` Nitin Gupta
2009-12-23 17:15     ` Dan Magenheimer
2009-12-24  3:27       ` Nitin Gupta
2009-12-24 20:51         ` Dan Magenheimer
2009-12-25 19:18       ` Pavel Machek
2009-12-28 15:57         ` Dan Magenheimer
2009-12-28 20:51           ` Pavel Machek
2009-12-28 21:41             ` Dan Magenheimer
2009-12-29  2:07           ` Nitin Gupta [this message]
  -- strict thread matches above, loose matches on Subject: below --
2009-12-18  0:36 Dan Magenheimer
2009-12-18  8:06 ` Pavel Machek
2009-12-21 10:54 ` Nitin Gupta

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4B39646A.3080007@vflare.org \
    --to=ngupta@vflare.org \
    --cc=akpm@linux-foundation.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=avi@redhat.com \
    --cc=balbir@linux.vnet.ibm.com \
    --cc=chris.mason@oracle.com \
    --cc=dan.magenheimer@oracle.com \
    --cc=dave.mccracken@oracle.com \
    --cc=jeremy@goop.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mtosatti@redhat.com \
    --cc=npiggin@suse.de \
    --cc=pavel@ucw.cz \
    --cc=riel@redhat.com \
    --cc=rusty@rustcorp.com.au \
    --cc=schwidefsky@de.ibm.com \
    --cc=sunil.mushran@oracle.com \
    --cc=tmem-devel@oss.oracle.com \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox