linux-mm.kvack.org archive mirror
* [GIT PULL] mm: frontswap (SUMMARY)
@ 2011-11-04 14:01 Dan Magenheimer
  2011-11-04 15:12 ` Dan Magenheimer
  2011-11-04 16:45 ` Andrea Arcangeli
  0 siblings, 2 replies; 8+ messages in thread
From: Dan Magenheimer @ 2011-11-04 14:01 UTC (permalink / raw)
  To: Andrew Morton, Linus Torvalds
  Cc: Neo Jia, levinsasha928, JeremyFitzhardinge, linux-mm,
	Dave Hansen, Seth Jennings, Jonathan Corbet, Chris Mason,
	Konrad Wilk, ngupta, LKML, Theodore Tso, James Bottomley,
	Andrea Arcangeli, Pekka Enberg, Christoph Hellwig,
	David Rientjes, KAMEZAWA Hiroyuki

Hi Andrew and Linux --

I thought I'd try to summarize for you the current status
resulting from the 100+ emails stemming from the original
git-pull request.

EXECUTIVE SUMMARY (djm bias noted)

Frontswap is part 2 of 2 of transcendent memory; cleancache
(merged at 3.0) is part 1.  Frontswap consists primarily of
a handful of hooks in the swap subsystem, which end
in a frontswap_ops function vector.  If no "backend"
registers the vector, all hooks become no-ops.  Current
in-tree users are Xen, and "zcache" (in staging), but
two other users, RAMster and KVM, are under development
in public git trees.
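
To make the mechanism concrete, here is a minimal sketch in plain C of
a function vector whose hooks fall through when no backend has
registered.  It is not the kernel code: every name below
(demo_swap_ops, demo_register_ops, demo_hook_put_page, the toy
backend) is hypothetical and chosen only for illustration, and the
real frontswap_ops has its own members and signatures.  The return
convention (-1 meaning "not taken, use the normal swap path") is
likewise an assumption of the sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a backend function vector. */
struct demo_swap_ops {
	int (*put_page)(unsigned type, unsigned long offset, const void *page);
};

/* NULL until some backend registers its vector. */
const struct demo_swap_ops *demo_ops = NULL;

void demo_register_ops(const struct demo_swap_ops *ops)
{
	demo_ops = ops;
}

/*
 * Hook called from the (imaginary) swap-out path.  With no backend
 * registered it does nothing and reports "not taken" (-1), so the
 * caller carries on to the ordinary swap device.
 */
int demo_hook_put_page(unsigned type, unsigned long offset, const void *page)
{
	if (demo_ops == NULL || demo_ops->put_page == NULL)
		return -1;
	return demo_ops->put_page(type, offset, page);
}

/* Toy backend: accepts every page and only counts the calls. */
unsigned long demo_backend_puts = 0;

int demo_backend_put_page(unsigned type, unsigned long offset, const void *page)
{
	(void)type;
	(void)offset;
	(void)page;
	demo_backend_puts++;
	return 0;
}

const struct demo_swap_ops demo_backend = {
	.put_page = demo_backend_put_page,
};
```

Before demo_register_ops(&demo_backend) is called, the hook costs one
pointer test and changes nothing; afterwards every call is forwarded
to the backend.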

Xen is by far the most mature user for frontswap.  If you
count Xen as a valid user, you should IMHO seriously consider
the commit-set, as-is, as ready to merge (even for 3.2),
especially since shipping distros already include it.
If one disregards Xen, there's a lot more work to be done
to prove frontswap should be merged.

RESPONDED TO SUPPORT FRONTSWAP

Jan Beulich (Novell): frontswap in OpenSuse for two years
Brian King (IBM): wants frontswap/zcache for Linux on Power
Sasha Levin (*): actively developing KVM+tmem, wants frontswap
Neo Jia (*): actively developing KVM+tmem, wants frontswap
Nitin Gupta (UMass): zcache co-designer, better than zram
Seth Jennings (IBM): actively improving zcache
Ed Tomlinson (* user): wants frontswap instead of zram
Kurt Hackel (Oracle): shipping Oracle VM product supports frontswap
Avi Miller (Oracle): Beta of next Oracle kernel supports frontswap

Note: Oracle, as a company, has committed to support frontswap.

* affiliation unspecified (but not Oracle ;-)

LAST KNOWN POSITION OF AD HOC ARCHITECTURE REVIEW GROUP

Andrea: zcache still needs a lot of work, has ideas for
  future related swap improvements, "now that you cleared the
  fact there is no API/ABI in [zcache] to worry about, frankly,
  I'm a lot more happy now", "don't want to stifle innovation
  by saying no to something that makes sense and is free to
  evolve", "this overall sounds very positive (or at least
  better than neutral) to me"... I also think Andrea's
  last remaining issue (need batching for KVM) now has a viable
  solution that works with no frontswap commit-set changes,
  but Andrea has not confirmed

Rik: list of concerns, but I think all were discussed and
  resolved later in the thread (except possibly wanting to
  see more non-Xen benchmarks), no final response from Rik

James: wants more benchmarks especially for zcache, thinks
  ABI should be proven to be useful to KVM before
  frontswap gets merged

Hannes: Nacked, but I think raised issues were later
  discussed and resolved in the thread, with no further
  response from Hannes

(If anyone quoted here feels misquoted/missummarized,
please feel free to respond.)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* RE: [GIT PULL] mm: frontswap (SUMMARY)
  2011-11-04 14:01 [GIT PULL] mm: frontswap (SUMMARY) Dan Magenheimer
@ 2011-11-04 15:12 ` Dan Magenheimer
  2011-11-04 16:45 ` Andrea Arcangeli
  1 sibling, 0 replies; 8+ messages in thread
From: Dan Magenheimer @ 2011-11-04 15:12 UTC (permalink / raw)
  To: Andrew Morton, Linus Torvalds
  Cc: Neo Jia, levinsasha928, JeremyFitzhardinge, linux-mm,
	Dave Hansen, Seth Jennings, Jonathan Corbet, Chris Mason,
	Konrad Wilk, ngupta, LKML, Theodore Tso, James Bottomley,
	Andrea Arcangeli, Pekka Enberg, Christoph Hellwig,
	David Rientjes, KAMEZAWA Hiroyuki

> From: Dan Magenheimer
> Subject: [GIT PULL] mm: frontswap (SUMMARY)
> 
> Hi Andrew and Linux --

ARGH! Stupid fingers.  Should be...

Hi Andrew and Linus --

<etc>

:-/
Dan


* Re: [GIT PULL] mm: frontswap (SUMMARY)
  2011-11-04 14:01 [GIT PULL] mm: frontswap (SUMMARY) Dan Magenheimer
  2011-11-04 15:12 ` Dan Magenheimer
@ 2011-11-04 16:45 ` Andrea Arcangeli
  2011-11-06 19:31   ` Dan Magenheimer
  1 sibling, 1 reply; 8+ messages in thread
From: Andrea Arcangeli @ 2011-11-04 16:45 UTC (permalink / raw)
  To: Dan Magenheimer
  Cc: Andrew Morton, Linus Torvalds, Neo Jia, levinsasha928,
	JeremyFitzhardinge, linux-mm, Dave Hansen, Seth Jennings,
	Jonathan Corbet, Chris Mason, Konrad Wilk, ngupta, LKML,
	Theodore Tso, James Bottomley, Pekka Enberg, Christoph Hellwig,
	David Rientjes, KAMEZAWA Hiroyuki

On Fri, Nov 04, 2011 at 07:01:23AM -0700, Dan Magenheimer wrote:
>   evolve", "this overall sounds very positive (or at least

You quoted me wrong! If you check back my email I said:

=====
Thanks. So this overall sounds _fairly_ positive (or at least better
than neutral) to me.
=====

I guess your clipboard buffer stored in tmem memory in between the cut
and paste and you still got a bug in there that corrupts memory :).

>   last remaining issue (need batching for KVM) now has a viable
>   solution that works with no frontswap commit-set changes,
>   but Andrea has not confirmed

I think it really should be vectored, just like get_user_pages is
vectored and we're not forced to call it one page at time. This is
even more important here because you have a "size" parameter which
means you can push "bytes" into tmem memory, so there's no way you can
possibly want to push bytes with an external call for each one of
those bytes.
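
A minimal sketch of the difference, assuming nothing about the real
tmem or frontswap interfaces: all names below (demo_page_ref,
demo_backend_put_one, demo_backend_put_vec and the two wrappers) are
hypothetical, and the call counter merely stands in for whatever the
expensive external crossing would be.

```c
#include <stddef.h>

/* Hypothetical descriptor for one page to hand to the backend. */
struct demo_page_ref {
	unsigned long offset;
	const void *data;
};

/* Counts how many times we cross into the (imaginary) backend. */
unsigned long demo_backend_calls = 0;

/* One crossing per page. */
int demo_backend_put_one(const struct demo_page_ref *p)
{
	demo_backend_calls++;
	return p->data != NULL ? 0 : -1;
}

/* One crossing for the whole batch; returns how many were accepted. */
size_t demo_backend_put_vec(const struct demo_page_ref *v, size_t n)
{
	size_t i, accepted = 0;

	demo_backend_calls++;
	for (i = 0; i < n; i++)
		if (v[i].data != NULL)
			accepted++;
	return accepted;
}

/* Non-vectored caller: n pages cost n crossings. */
size_t demo_put_pages_single(const struct demo_page_ref *v, size_t n)
{
	size_t i, accepted = 0;

	for (i = 0; i < n; i++)
		if (demo_backend_put_one(&v[i]) == 0)
			accepted++;
	return accepted;
}

/* Vectored caller: n pages cost one crossing. */
size_t demo_put_pages_vectored(const struct demo_page_ref *v, size_t n)
{
	return demo_backend_put_vec(v, n);
}
```

Both wrappers accept the same pages; only the number of crossings
differs, which is the cost I'm worried about.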

You said the tmem.c is all free to be modified so it may be improved
later.

My biggest concern of all is this moves memory outside the VM, and in
control of tmem, but the major trouble will be how the VM controls the
size of tmem. It'll be hugely hard to be able to tell what's the ideal
size of tmem at any given time. You admitted yourself that's the messy
part. And your current code isn't handling this properly today, so it
looks simpler than what will really happen if we can handle a mlockall
program allocating 90% of ram at max CPU speed without going OOM
because of zcache enabled.

I also don't think the frontswap+KVM effort is worth it, I doubt we
want to deal with the added complexity of it and the obvious
unreliability we'd run into to shrink the tmem pools. It may be ok for Xen
to be unreliable, but KVM must be rock solid: we have a design that is as solid
as Linux bare metal, no change at all in terms of VM algorithms in the
hypervisor, and that's our core value. There's no way we add
unreliability with a mlock program allocating ram in the host and going OOM
because some VM is running, even if we solve the vmexit every 4k which
would destroy performance.

So my main interest is only for having compressed swap for linux in
general. It may speedup swap I/O too if done right. I'm still not sure
what's the right design to handle compressed swap, but whatever we
do should eventually be able to write to disk the compressed data,
which zcache can't today, so I focused on making sure it's freely
hackable and not constrained by Xen ABI, so I liked your confirmation
it's all hackable. It's an intriguing design if we can make the
plugins stackable and we can change the backing store of the zcache
compressed ram with ramster or one writing to disk. The dark side of
it, is the magic algorithm that will be needed to reliably shrink the
tmem pools, which right now seems disconnected from the VM and can't be
reliable. It looks like a design that simplifies things, but once it is
reliable things will get more complex and it will have to be driven by
the core VM so that it can react fast to memory pressure events, even
the decision to write to disk or send the zcache compressed pages to
other nodes with ramster should come from the main VM. I still have no
idea if this is the simpler design to allow it or not though, but
again I can't exclude it is and for some things it's certainly
intriguing.


* RE: [GIT PULL] mm: frontswap (SUMMARY)
  2011-11-04 16:45 ` Andrea Arcangeli
@ 2011-11-06 19:31   ` Dan Magenheimer
  2011-11-07  7:49     ` Pekka Enberg
  0 siblings, 1 reply; 8+ messages in thread
From: Dan Magenheimer @ 2011-11-06 19:31 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton, Andrea Arcangeli
  Cc: Neo Jia, levinsasha928, JeremyFitzhardinge, linux-mm,
	Dave Hansen, Seth Jennings, Jonathan Corbet, Chris Mason,
	Konrad Wilk, ngupta, LKML, KAMEZAWA Hiroyuki

A farewell haiku:

Crash test dummy folds.
KVM mafia wins.
Innovation cries.

Dan


* Re: [GIT PULL] mm: frontswap (SUMMARY)
  2011-11-06 19:31   ` Dan Magenheimer
@ 2011-11-07  7:49     ` Pekka Enberg
  2011-11-07 14:35       ` Dan Magenheimer
  0 siblings, 1 reply; 8+ messages in thread
From: Pekka Enberg @ 2011-11-07  7:49 UTC (permalink / raw)
  To: Dan Magenheimer
  Cc: Linus Torvalds, Andrew Morton, Andrea Arcangeli, Neo Jia,
	levinsasha928, JeremyFitzhardinge, linux-mm, Dave Hansen,
	Seth Jennings, Jonathan Corbet, Chris Mason, Konrad Wilk, ngupta,
	LKML, KAMEZAWA Hiroyuki

On Sun, Nov 6, 2011 at 9:31 PM, Dan Magenheimer
<dan.magenheimer@oracle.com> wrote:
> A farewell haiku:
>
> Crash test dummy folds.
> KVM mafia wins.
> Innovation cries.

Does this mean you've stopped working on frontswap or that frontswap
is dead? What does this mean for the cleancache hooks? Are they still
useful?

                        Pekka


* RE: [GIT PULL] mm: frontswap (SUMMARY)
  2011-11-07  7:49     ` Pekka Enberg
@ 2011-11-07 14:35       ` Dan Magenheimer
  2011-11-07 14:48         ` Pekka Enberg
  2011-11-08 23:41         ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 8+ messages in thread
From: Dan Magenheimer @ 2011-11-07 14:35 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Linus Torvalds, Andrew Morton, Andrea Arcangeli, Neo Jia,
	levinsasha928, JeremyFitzhardinge, linux-mm, Dave Hansen,
	Seth Jennings, Jonathan Corbet, Chris Mason, Konrad Wilk, ngupta,
	LKML, KAMEZAWA Hiroyuki

> From: Pekka Enberg [mailto:penberg@kernel.org]
> Subject: Re: [GIT PULL] mm: frontswap (SUMMARY)
> 
> On Sun, Nov 6, 2011 at 9:31 PM, Dan Magenheimer
> <dan.magenheimer@oracle.com> wrote:
> > A farewell haiku:
> >
> > Crash test dummy folds.
> > KVM mafia wins.
> > Innovation cries.
> 
> Does this mean you've stopped working on frontswap or that frontswap
> is dead? What does this mean for the cleancache hooks? Are they still
> useful?

Wow.  F***ing incredible.

Pekka, you'd best leave the politics to Andrea.  He's
_much_ better at it.

No, I haven't stopped, though I may be pausing to lick
my wounds.  No it's not dead yet.  Yes, the cleancache
hooks are still useful, so narrow-minded anti-Xen
vultures can go circle elsewhere.

Since my attempt at gracefully ending the discussion
with poetry has been ruined, I might as well spell it out:

"Crash test dummy folds":  (1) Andrew, I'm warning you
(from the first crash test dummy) that the new process
may be too heavy handed and corruptible.  (2) I've taken
enough beatings for now, thank you.

"KVM mafia wins" :  If one reads between the many
(far too many) lines of this discussion, and as further
evidenced by Pekka's reply, the anti-Xen crowd has
been losing too many battles recently, is damn
well not going to lose this one, and would like
by any means possible to reverse previous losses.
People, can't we just get along?

"Innovation cries":  I'm expressing sadness that
a very innovative and elegant approach to a very
hard problem, that began Xen-specific but seems
to have lots of interesting uses, is being blocked
for political reasons.  I'm not denying that there
is plenty of work still to be done, just arguing
that this can best be explored as a community
project... and that's not going to happen by
conveniently ignoring the most mature user (Xen)
because one has a personal or corporate vendetta
against it.

Frontswap should be in-tree.  For anyone familiar
with the American political system, frontswap
has been blocked by a filibuster.

I won't be responding to further posts on this
topic for awhile, for health reasons.


* Re: [GIT PULL] mm: frontswap (SUMMARY)
  2011-11-07 14:35       ` Dan Magenheimer
@ 2011-11-07 14:48         ` Pekka Enberg
  2011-11-08 23:41         ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 8+ messages in thread
From: Pekka Enberg @ 2011-11-07 14:48 UTC (permalink / raw)
  To: Dan Magenheimer
  Cc: Linus Torvalds, Andrew Morton, Andrea Arcangeli, Neo Jia,
	levinsasha928, JeremyFitzhardinge, linux-mm, Dave Hansen,
	Seth Jennings, Jonathan Corbet, Chris Mason, Konrad Wilk, ngupta,
	LKML, KAMEZAWA Hiroyuki

On Mon, Nov 7, 2011 at 4:35 PM, Dan Magenheimer
<dan.magenheimer@oracle.com> wrote:
> No, I haven't stopped, though I may be pausing to lick
> my wounds.  No it's not dead yet.  Yes, the cleancache
> hooks are still useful, so narrow-minded anti-Xen
> vultures can go circle elsewhere.

Whatever.

It was an honest question but somehow you managed to turn it into
something else. It's funny that you don't see that it's _you_ that's the
biggest obstacle in getting frontswap merged. You managed to completely
alienate me, for example, with your style of arguing.

If you had bothered to check your facts before accusing me of being part
of the "KVM mafia", you'd know that if such mafia existed, it would
probably hate me as much as they hate you. Furthermore, Sasha Levin who
worked on KVM tmem happens to work on the same project as I do so I was
naturally interested in frontswap until you showed up.

                        Pekka


* Re: [GIT PULL] mm: frontswap (SUMMARY)
  2011-11-07 14:35       ` Dan Magenheimer
  2011-11-07 14:48         ` Pekka Enberg
@ 2011-11-08 23:41         ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 8+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-11-08 23:41 UTC (permalink / raw)
  To: Dan Magenheimer
  Cc: Pekka Enberg, Linus Torvalds, Andrew Morton, Andrea Arcangeli,
	Neo Jia, levinsasha928, JeremyFitzhardinge, linux-mm,
	Dave Hansen, Seth Jennings, Jonathan Corbet, Chris Mason, ngupta,
	LKML, KAMEZAWA Hiroyuki

..snip..
> I won't be responding to further posts on this
> topic for awhile, for health reasons.

I like stories with a nice happy end so:
 - I am going to step up as the maintainer of the cleancache/frontswap
   zcache and shepherd those.
 - Go through the list of review comments and work them out.
 - Not going to push this patchset for 3.2 (obviously since rc1 is out).
 - Dan will help me out, but won't be active on lkml for a while (see above).

Thanks!
