linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Avi Kivity <avi@redhat.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	Pavel Machek <pavel@ucw.cz>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	hugh.dickins@tiscali.co.uk, ngupta@vflare.org,
	JBeulich@novell.com, chris.mason@oracle.com,
	kurt.hackel@oracle.com, dave.mccracken@oracle.com,
	npiggin@suse.de, akpm@linux-foundation.org, riel@redhat.com
Subject: Re: Frontswap [PATCH 0/4] (was Transcendent Memory): overview
Date: Fri, 30 Apr 2010 10:52:14 -0700	[thread overview]
Message-ID: <4BDB18CE.2090608@goop.org> (raw)
In-Reply-To: <4BDB026E.1030605@redhat.com>

On 04/30/2010 09:16 AM, Avi Kivity wrote:
> Given that whenever frontswap fails you need to swap anyway, it is
> better for the host to never fail a frontswap request and instead back
> it with disk storage if needed.  This way you avoid a pointless vmexit
> when you're out of memory.  Since it's disk backed it needs to be
> asynchronous and batched.

I'd argue the opposite.  There's no point in having the host do swapping
on behalf of guests if guests can do it themselves; it's just a
duplication of functionality.  You end up having two IO paths for each
guest, and the resulting problems in trying to account for the IO,
rate-limit it, etc.  If you can simply say "all guest disk IO happens
via this single interface", its much easier to manage.

If frontswap has value, it's because its providing a new facility to
guests that doesn't already exist and can't be easily emulated with
existing interfaces.

It seems to me the great strengths of the synchronous interface are:

    * it matches the needs of an existing implementation (tmem in Xen)
    * it is simple to understand within the context of the kernel code
      it's used in

Simplicity is important, because it allows the mm code to be understood
and maintained without having to have a deep understanding of
virtualization.  One of the problems with CMM2 was that it puts a lot of
intricate constraints on the mm code which can be easily broken, which
would only become apparent in subtle edge cases in a CMM2-using
environment.  An addition async frontswap-like interface - while not as
complex as CMM2 - still makes things harder for mm maintainers.

The downside is that it may not match some implementation in which the
get/put operations could take a long time (ie, physical IO to a slow
mechanical device).  But a general Linux principle is not to overdesign
interfaces for hypothetical users, only for real needs.

Do you think that you would be able to use frontswap in kvm if it were
an async interface, but not otherwise?  Or are you arguing a hypothetical?

> At this point we're back with the ordinary swap API.  Simply have your
> host expose a device which is write cached by host memory, you'll have
> all the benefits of frontswap with none of the disadvantages, and with
> no changes to guest code.

Yes, that's comfortably within the "guests page themselves" model. 
Setting up a block device for the domain which is backed by pagecache
(something we usually try hard to avoid) is pretty straightforward.  But
it doesn't work well for Xen unless the blkback domain is sized so that
it has all of Xen's free memory in its pagecache.

That said, it does concern me that the host/hypervisor is left holding
the bag on frontswapped pages.  A evil/uncooperative/lazy can just pump
a whole lot of pages into the frontswap pool and leave them there.   I
guess this is mitigated by the fact that the API is designed such that
they can't update or read the data without also allowing the hypervisor
to drop the page (updates can fail destructively, and reads are also
destructive), so the guest can't use it as a clumsy extension of their
normal dedicated memory.

    J

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2010-04-30 17:52 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-04-22 13:42 Dan Magenheimer
2010-04-22 15:28 ` Avi Kivity
2010-04-22 15:48   ` Dan Magenheimer
2010-04-22 16:13     ` Avi Kivity
2010-04-22 20:15       ` Dan Magenheimer
2010-04-23  9:48         ` Avi Kivity
2010-04-23 13:47           ` Dan Magenheimer
2010-04-23 13:57             ` Avi Kivity
2010-04-23 14:43               ` Dan Magenheimer
2010-04-23 14:52                 ` Avi Kivity
2010-04-23 15:00                   ` Avi Kivity
2010-04-23 16:26                     ` Dan Magenheimer
2010-04-24 18:25                       ` Avi Kivity
     [not found]                         ` <1c02a94a-a6aa-4cbb-a2e6-9d4647760e91@default4BD43033.7090706@redhat.com>
2010-04-25  0:41                         ` Dan Magenheimer
2010-04-25 12:06                           ` Avi Kivity
2010-04-25 13:12                             ` Dan Magenheimer
2010-04-25 13:18                               ` Avi Kivity
2010-04-28  5:55                               ` Pavel Machek
2010-04-29 14:42                                 ` Dan Magenheimer
2010-04-29 18:59                                   ` Avi Kivity
2010-04-29 19:01                                     ` Avi Kivity
2010-04-29 18:53                                 ` Avi Kivity
2010-04-30  1:45                                 ` Dave Hansen
2010-04-30  7:13                                   ` Avi Kivity
2010-04-30 15:59                                     ` Dan Magenheimer
2010-04-30 16:08                                       ` Dave Hansen
2010-05-10 16:05                                         ` Martin Schwidefsky
2010-04-30 16:16                                       ` Avi Kivity
     [not found]                                         ` <4BDB18CE.2090608@goop.org4BDB2069.4000507@redhat.com>
     [not found]                                           ` <3a62a058-7976-48d7-acd2-8c6a8312f10f@default20100502071059.GF1790@ucw.cz>
2010-04-30 16:43                                         ` Dan Magenheimer
2010-04-30 17:10                                           ` Dave Hansen
2010-04-30 18:08                                           ` Avi Kivity
2010-04-30 17:52                                         ` Jeremy Fitzhardinge [this message]
2010-04-30 18:24                                           ` Avi Kivity
2010-04-30 18:59                                             ` Jeremy Fitzhardinge
2010-05-01  8:28                                               ` Avi Kivity
2010-05-01 17:10                                             ` Dan Magenheimer
2010-05-02  7:11                                               ` Pavel Machek
2010-05-02 15:05                                                 ` Dan Magenheimer
2010-05-02 20:06                                                   ` Pavel Machek
2010-05-02 21:05                                                     ` Dan Magenheimer
2010-05-02  7:57                                               ` Nitin Gupta
2010-05-02 16:06                                                 ` Dan Magenheimer
2010-05-02 16:48                                                   ` Avi Kivity
2010-05-02 17:22                                                     ` Dan Magenheimer
2010-05-03  9:39                                                       ` Avi Kivity
2010-05-03 14:59                                                         ` Dan Magenheimer
2010-05-02 15:35                                               ` Avi Kivity
2010-05-02 17:06                                                 ` Dan Magenheimer
2010-05-03  8:46                                                   ` Avi Kivity
2010-05-03 16:01                                                     ` Dan Magenheimer
2010-05-03 19:32                                                       ` Pavel Machek
2010-04-30 16:04                                     ` Dave Hansen
2010-04-23 15:56                   ` Dan Magenheimer
2010-04-24 18:22                     ` Avi Kivity
2010-04-25  0:30                       ` Dan Magenheimer
2010-04-25 12:11                         ` Avi Kivity
     [not found]                           ` <c5062f3a-3232-4b21-b032-2ee1f2485ff0@default4BD44E74.2020506@redhat.com>
2010-04-25 13:37                           ` Dan Magenheimer
2010-04-25 14:15                             ` Avi Kivity
2010-04-25 15:29                               ` Dan Magenheimer
2010-04-26  6:01                                 ` Avi Kivity
2010-04-26 12:45                                   ` Dan Magenheimer
2010-04-26 13:48                                     ` Avi Kivity
2010-04-27 12:56                                 ` Pavel Machek
2010-04-27 14:32                                   ` Dan Magenheimer
2010-04-29 13:02                                     ` Pavel Machek
2010-04-27 11:52                             ` Valdis.Kletnieks
2010-04-27  0:49                           ` Jeremy Fitzhardinge
2010-04-27 12:55                         ` Pavel Machek
2010-04-27 14:43                           ` Nitin Gupta
2010-04-29 13:04                             ` Pavel Machek
2010-04-24  1:49                   ` Nitin Gupta
2010-04-24 18:27                     ` Avi Kivity
2010-04-25  3:11                       ` Nitin Gupta
2010-04-25 12:16                         ` Avi Kivity
2010-04-25 16:05                           ` Nitin Gupta
2010-04-26  6:06                             ` Avi Kivity
2010-04-26 12:50                               ` Dan Magenheimer
2010-04-26 13:43                                 ` Avi Kivity
2010-04-27  8:29                                   ` Dan Magenheimer
2010-04-27  9:21                                     ` Avi Kivity
2010-04-26 13:47                               ` Nitin Gupta
2010-04-23 16:35             ` Jiahua

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4BDB18CE.2090608@goop.org \
    --to=jeremy@goop.org \
    --cc=JBeulich@novell.com \
    --cc=akpm@linux-foundation.org \
    --cc=avi@redhat.com \
    --cc=chris.mason@oracle.com \
    --cc=dan.magenheimer@oracle.com \
    --cc=dave.mccracken@oracle.com \
    --cc=dave@linux.vnet.ibm.com \
    --cc=hugh.dickins@tiscali.co.uk \
    --cc=kurt.hackel@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ngupta@vflare.org \
    --cc=npiggin@suse.de \
    --cc=pavel@ucw.cz \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox