linux-mm.kvack.org archive mirror
From: Dan Magenheimer <dan.magenheimer@oracle.com>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Stoffel <john@stoffel.org>,
	Johannes Weiner <jweiner@redhat.com>,
	Pekka Enberg <penberg@kernel.org>,
	Cyclonus J <cyclonusj@gmail.com>,
	Sasha Levin <levinsasha928@gmail.com>,
	Christoph Hellwig <hch@infradead.org>,
	David Rientjes <rientjes@google.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Konrad Wilk <konrad.wilk@oracle.com>,
	Jeremy Fitzhardinge <jeremy@goop.org>,
	Seth Jennings <sjenning@linux.vnet.ibm.com>,
	ngupta@vflare.org, Chris Mason <chris.mason@oracle.com>,
	JBeulich@novell.com, Dave Hansen <dave@linux.vnet.ibm.com>,
	Jonathan Corbet <corbet@lwn.net>
Subject: RE: [GIT PULL] mm: frontswap (for 3.2 window)
Date: Mon, 31 Oct 2011 08:39:32 -0700 (PDT)
Message-ID: <424e9e3a-670d-4835-914f-83e99a11991a@default>
In-Reply-To: <1320048767.8283.13.camel@dabdike>

> From: James Bottomley [mailto:James.Bottomley@HansenPartnership.com]
> Subject: RE: [GIT PULL] mm: frontswap (for 3.2 window)

Hi James --

Thanks for the reply.  You raise some good points but
I hope you will read what I believe are reasonable though
long-winded answers.
 
> On Fri, 2011-10-28 at 13:19 -0700, Dan Magenheimer wrote:
> > For those who "hack on the VM", I can't imagine why the handful
> > of lines in the swap subsystem, which is probably the most stable
> > and barely touched subsystem in Linux or any OS on the planet,
> > is going to be a burden or much of a cost.
> 
> Saying things like this doesn't encourage anyone to trust you.  The
> whole of the MM is a complex, highly interacting system.  The recent
> issues we've had with kswapd and the shrinker code gives a nice
> demonstration of this ... and that was caused by well tested code
> updates.

I do understand that.  My point was that the hooks are
placed _statically_ in largely stable code so it's not
going to constantly get in the way of VM developers
adding new features and fixing bugs, particularly
any developers that don't care about whether frontswap
works or not.  I do think that is a very relevant
point about maintenance... do you disagree?

Runtime interactions can only occur if the code is
config'ed and, if config'ed, only if a tmem backend (e.g.
Xen or zcache) enables it also at runtime.  When
both are enabled, runtime interactions do occur
and absolutely must be fully tested.  My point was
that any _users_ who don't care about whether frontswap
works or not don't need to have any concerns about
VM system runtime interactions.  I think this is also
a very relevant point about maintenance... do you
disagree?

> You can't hand wave away the need for benchmarks and
> performance tests.

I'm not.  Conclusive benchmarks are available for one user
(Xen) but not (yet) for other users.  I've already acknowledged
the feedback desiring benchmarking for zcache, but zcache
is already merged (albeit in staging), and Xen tmem
is already merged in both Linux and the Xen hypervisor,
and cleancache (the alter ego of frontswap) is already
merged.

So the question is not whether benchmarks are waived,
but whether one accepts (1) conclusive benchmarks for Xen;
PLUS (2) insufficiently benchmarked zcache; PLUS (3) at
least two other interesting-but-not-yet-benchmarkable users;
as sufficient for adding this small set of hooks into
swap code.

I understand that some kernel developers (mostly from one
company) continue to completely discount Xen, and
thus won't even look at the Xen results.  IMHO
that is mudslinging.

> You have also answered all questions about inactive cost by saying "the
> code has zero cost when it's compiled out"  This also is a non starter.
> For the few use cases it has, this code has to be compiled in.  I
> suspect even Oracle isn't going to ship separate frontswap and
> non-frontswap kernels in its distro.  So you have to quantify what the
> performance impact is when this code is compiled in but not used.
> Please do so.

First, no, Oracle is not going to ship separate frontswap and
non-frontswap kernels.  It IS going to ship a frontswap-enabled
kernel and this can be seen in Oracle's publicly-available
kernel git tree (the next release, now in Beta).  Frontswap is
compiled in, but still must be enabled at runtime (e.g. for
a Xen guest, either manually by the guest's administrator
or automagically by the Oracle VM product's management layer).

I did fully quantify the performance impact elsewhere in
this thread.  The performance impact with CONFIG_FRONTSWAP=n
(which is ZERO) is relevant for distros which choose to
ignore it entirely.  The performance impact for CONFIG_FRONTSWAP=y
but not-enabled-at-runtime is one compare-pointer-against-NULL
per page actually swapped in or out (essentially ZERO);
this is relevant for distros which choose to configure it
enabled in case they wish to enable it at runtime in
the future.

So the remaining question is the performance impact when
compile-time AND runtime enabled; this is in the published
Xen presentation I've referenced -- the impact is much much
less than the performance gain.  IMHO benchmark results can
be easily manipulated so I prefer to discuss the theoretical
underpinnings which, in short, are that just about anything
a tmem backend does (hypercall, compression, deduplication,
even moving data across a fast network) is a helluva lot
faster than swapping a page to disk.

Are there corner cases and probably even real workloads
where the cost exceeds the benefits?  Probably... though
less likely for frontswap than for cleancache because ONLY
pages that would actually be swapped out/in use frontswap.

But I have never suggested that every kernel should always
unconditionally compile-time-enable and run-time-enable
frontswap... simply that it should be in-tree so those
who wish to enable it are able to enable it.

Thanks,
Dan

