RE: frontswap/zcache: xvmalloc discussion

From: Dan Magenheimer <dan.magenheimer@oracle.com>
To: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: linux-mm <linux-mm@kvack.org>, Nitin Gupta <ngupta@vflare.org>,
	Robert Jennings <rcj@linux.vnet.ibm.com>,
	Brian King <brking@linux.vnet.ibm.com>,
	Greg Kroah-Hartman <gregkh@suse.de>, Dave Hansen <dave@sr71.net>
Subject: RE: frontswap/zcache: xvmalloc discussion
Date: Wed, 29 Jun 2011 19:31:23 -0700 (PDT)
Message-ID: <f6415652-5925-4aad-b8be-900ce3afd902@default>
In-Reply-To: <89b9d94d-27d1-4f51-ab7e-b2210b6b0eb5@default>

> > > One neat feature of frontswap (and the underlying Transcendent
> > > Memory definition) is that ANY PUT may be rejected**.  So zcache
> > > could keep track of the distribution of "zsize" and if the number
> > > of pages with zsize>PAGE_SIZE/2 greatly exceeds the number of pages
> > > with "complementary zsize", the frontswap code in zcache can reject
> > > the larger pages until balance/sanity is restored.
> > >
> > > Might that help?
> >
> > We could do that, but I imagine that would let a lot of pages through
> > on most workloads.  Ideally, I'd like to find a solution that would
> > capture and (efficiently) store pages that compressed to up to 80% of
> > their original size.
> 
> After thinking about this a bit, I have to disagree.  For workloads
> where the vast majority of pages have zsize>PAGE_SIZE/2, this would
> let a lot of pages through.  So if you are correct that LZO
> is poor at compression and a large majority of pages are in
> this category, some page-crossing scheme is necessary.  However,
> that isn't what I've seen... the zsize of many swap pages is
> quite small.
> 
> So before commencing on a major compression rewrite, it might
> be a good idea to measure distribution of zsize for swap pages
> on a large variety of workloads.  This could probably be done
> by adding a code snippet in the swap path of a normal (non-zcache)
> kernel.  And if the distribution is bad, replacing LZO with a
> higher-compression-but-slower algorithm might be the best answer,
> since zcache is replacing VERY slow swap-device reads/writes with
> reasonably fast compression/decompression.  I certainly think
> that an algorithm approaching an average 50% compression ratio
> should be the goal.
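
For reference, the rejection idea quoted at the top of this message (refuse large pages while they greatly outnumber pages with a "complementary" zsize) could be modeled in user space roughly as below. This is only a sketch under assumptions: the class name, the max_ratio tunable, and the reading of "greatly exceeds" as a fixed ratio are all hypothetical, and it ignores pages leaving the pool. It is not zcache code.

```python
PAGE_SIZE = 4096


class BalancePolicy:
    """Toy model: reject large pages once they greatly outnumber small ones."""

    def __init__(self, max_ratio: float = 2.0):
        self.large = 0              # accepted pages with zsize > PAGE_SIZE/2
        self.small = 0              # accepted pages with zsize <= PAGE_SIZE/2
        self.max_ratio = max_ratio  # hypothetical tunable for "greatly exceeds"

    def should_accept(self, zsize: int) -> bool:
        if zsize <= PAGE_SIZE // 2:
            # Small pages are always welcome; they can pair with large ones.
            self.small += 1
            return True
        # Large page: accept only if the large/small ratio stays in bounds.
        if self.large + 1 > self.max_ratio * max(self.small, 1):
            return False            # reject the put; the page goes to swap
        self.large += 1
        return True


policy = BalancePolicy(max_ratio=2.0)
decisions = [policy.should_accept(3000) for _ in range(3)]
```

With max_ratio=2.0 and no small pages stored yet, the third consecutive large page is the first one refused; storing more small pages lets large ones through again.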

FWIW, I've measured the distribution of zsize (pages compressed
with frontswap) on my favorite workload (kernel "make -j2" on
mem=512M to force lots of swapping) and the mean is small, close
to 1K (PAGE_SIZE/4).  I've added sysfs show entries for both
the current and cumulative distributions (0-63 bytes, 64-127
bytes, ..., 4032-4095 bytes) for the next update.
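
The bucketing described above (64-byte-wide bins across a 4096-byte page) can be approximated in user space with a sketch like the following. Assumptions: zlib stands in for the kernel's LZO, so absolute sizes will differ from what zcache reports, and the function names are made up for illustration.

```python
import os
import zlib

PAGE_SIZE = 4096
BUCKET_WIDTH = 64                          # 0-63, 64-127, ..., 4032-4095
NUM_BUCKETS = PAGE_SIZE // BUCKET_WIDTH    # 64 buckets


def zsize(page: bytes) -> int:
    """Compressed size of one page (zlib here, NOT the kernel's LZO)."""
    return len(zlib.compress(page))


def zsize_histogram(pages):
    """Return (per-bucket counts, mean zsize) for an iterable of pages."""
    counts = [0] * NUM_BUCKETS
    total = 0
    n = 0
    for page in pages:
        z = zsize(page)
        # Anything that "compresses" to PAGE_SIZE or more goes in the last bucket.
        counts[min(z // BUCKET_WIDTH, NUM_BUCKETS - 1)] += 1
        total += z
        n += 1
    mean = total / n if n else 0.0
    return counts, mean


# One all-zero page (highly compressible) and one random page (incompressible).
counts, mean = zsize_histogram([bytes(PAGE_SIZE), os.urandom(PAGE_SIZE)])
```

Feeding it the pages of a real swap-heavy workload instead of the two synthetic pages would give the kind of distribution being discussed here.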

I tried your program on the text of Moby Dick and the mean
was still under 1500 bytes (roughly (3*PAGE_SIZE)/8) with a good
broad distribution for zsize.  I also tried your program on
gzip'ed Moby Dick; zcache correctly rejects most of those
pages as incompressible and does fine on other swapped pages.

So I can't reproduce what you are seeing.  Somehow you
must create and swap a set of pages with a zsize distribution
almost entirely between PAGE_SIZE/2 and (PAGE_SIZE*7)/8.
How did you do that?

FYI, I also added a sysfs settable for zv_max_page_size...
if zsize exceeds it, the page is rejected.  It defaults to
(PAGE_SIZE*7)/8, which was the non-settable hardwired
value before.
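
As a user-space illustration of that check (not the actual zcache code), a put is refused whenever the compressed size exceeds the limit. Assumptions: zlib stands in for LZO, and accept_put and DEFAULT_ZV_MAX_PAGE_SIZE are hypothetical names; only the (PAGE_SIZE*7)/8 default comes from the description above.

```python
import os
import zlib

PAGE_SIZE = 4096
DEFAULT_ZV_MAX_PAGE_SIZE = (PAGE_SIZE * 7) // 8    # 3584 bytes


def accept_put(page: bytes, zv_max_page_size: int = DEFAULT_ZV_MAX_PAGE_SIZE):
    """Return the compressed page if accepted, or None if the put is rejected."""
    compressed = zlib.compress(page)    # stand-in for the kernel's LZO
    if len(compressed) > zv_max_page_size:
        return None                     # rejected: page goes to the swap device
    return compressed


zero_result = accept_put(bytes(PAGE_SIZE))          # compresses well: accepted
random_result = accept_put(os.urandom(PAGE_SIZE))   # incompressible: rejected
```

Lowering zv_max_page_size (for example to PAGE_SIZE // 2) makes the filter stricter, which is the knob the sysfs settable exposes.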

Dan
