linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Dan Magenheimer <dan.magenheimer@oracle.com>
To: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Nitin Gupta <ngupta@vflare.org>, Minchan Kim <minchan@kernel.org>,
	Konrad Wilk <konrad.wilk@oracle.com>,
	Robert Jennings <rcj@linux.vnet.ibm.com>,
	Jenifer Hopper <jhopper@us.ibm.com>, Mel Gorman <mgorman@suse.de>,
	Johannes Weiner <jweiner@redhat.com>,
	Rik van Riel <riel@redhat.com>,
	Larry Woodman <lwoodman@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	devel@driverdev.osuosl.org, Dave Hansen <dave@linux.vnet.ibm.com>
Subject: RE: [PATCH 7/8] zswap: add to mm/
Date: Thu, 3 Jan 2013 14:33:09 -0800 (PST)	[thread overview]
Message-ID: <640d712e-0217-456a-a2d1-d03dd7914a55@default> (raw)
In-Reply-To: <50E4C1FA.4070701@linux.vnet.ibm.com>

> From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com]
> Subject: Re: [PATCH 7/8] zswap: add to mm/
> 
> >> What I'm looking to do is give zswap a little insight into zsmalloc
> >> internals,
> 
> I'm putting at lot of thought into how to do it cleanly, while
> maintaining the generic nature of zsmalloc.  I did a PoC already, but
> I wasn't comfortable with how much had to be exposed to achieve it.
> So I'm still working on it.

Zbud exposes LRU-ness as well... why not also do that
for zsmalloc?  I guess I am asking, as long as you are doing
a "new API" for zsmalloc that zram doesn't benefit from, why not
eliminate the abstraction layer and consider your "zsmalloc2" to
be a captive allocator, like zbud?  What's the point of limiting
"how much had to be exposed" unless you are certain that other
kernel-internal-users will benefit?
 
> > I fear that problem (B) is the fundamental concern with
> > using a high-density storage allocator such as zsmalloc, which
> > is why I abandoned zsmalloc in favor of a more-predictable-but-
> > less-dense allocator (zbud).  However, if you have a solution
> > for (B) as well, I would gladly abandon zbud in zcache (for _both_
> > cleancache and frontswap pages) and our respective in-kernel
> > compression efforts would be more easy to merge into one solution
> > in the future.
> 
> The only difference afaict between zbud and zsmalloc wrt vacating a
> page frame is that zbud will only ever need to write back two pages to
> swap where zsmalloc might have to write back more to free up the
> zspage that contains the page frame to be vacated.  This is doable
> with zsmalloc.  The question that remains for me is how cleanly can it
> be done (for either approach).

One other major difference is that zpages in zsmalloc can cross pageframe
boundaries.  I don't understand zsmalloc well enough to estimate
how frequently this is true, but it is does complicate the set of
race conditions.

Also, unless I misunderstand, ensuring a pageframe can be freed
will require you to somehow "lock" all the zpages and remove
them atomically.  At least, that's what was required for zbud
to avoid races.  If this is true, increasing the number of zpages
to be freed would have a significant impact.

You may have a way around this... I'm just cautioning.

> > Hmmm... IIRC, to avoid races in zcache, it was necessary to
> > update both the data (zpage) and meta-data ("tree" in zswap,
> > and tmem-data-structure in zcache) atomically.  I will need
> > to study your code more to understand how zswap avoids this
> > requirement.  Or if it is obvious to you, I would be grateful
> > if you would point it out to me.
> 
> Without the flushing mechanism, a simple lock guarding the tree was
> enough in zswap.  The per-entry serialization of the
> store/load/invalidate paths are all handled at a higher level.  For
> example, we never get a load and invalidate concurrently on the same
> swap entry.
> 
> However, once the flushing code was introduced and could free an entry
> from the zswap_fs_store() path, it became necessary to add a per-entry
> refcount to make sure that the entry isn't freed while another code
> path was operating on it.

Hmmm... doesn't the refcount at least need to be an atomic_t?

Also, how can you "free" any entry of an rbtree while another
thread is walking the rbtree?  (Deleting an entry from an rbtree
causes rebalancing... afaik there is no equivalent RCU
implementation for rbtrees... not that RCU would necessarily
work well for this anyway.)

BTW, in case it appears otherwise, I'm trying to be helpful, not
critical.  In the end, I think we are in agreement that in-kernel
compression is very important and that the frontswap (and/or
cleancache) interface(s) are the right way to identify compressible
data, and we are mostly arguing allocation and implementation details.

Dan

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2013-01-03 22:33 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <<1355262966-15281-1-git-send-email-sjenning@linux.vnet.ibm.com>
     [not found] ` <<1355262966-15281-8-git-send-email-sjenning@linux.vnet.ibm.com>
2012-12-31 23:06   ` Dan Magenheimer
2013-01-01 17:52     ` Seth Jennings
2013-01-02 15:55       ` Dave Hansen
2013-01-02 17:26         ` Dan Magenheimer
2013-01-02 18:17           ` Dave Hansen
2013-01-02 19:04             ` Dan Magenheimer
2013-01-03  7:33               ` Dave Chinner
2013-01-03 22:37                 ` Dan Magenheimer
2013-01-04  2:30                   ` Dave Chinner
2013-01-04 15:55                     ` Seth Jennings
2013-01-04 18:45                     ` Dan Magenheimer
2013-01-22 23:58                 ` High slab usage testing with zcache/zswap (Was: [PATCH 7/8] zswap: add to mm/) Dan Magenheimer
2013-01-02 22:44         ` [PATCH 7/8] zswap: add to mm/ Seth Jennings
2013-01-02 17:08       ` Dan Magenheimer
2013-01-02 23:25         ` Seth Jennings
2013-01-03 22:33           ` Dan Magenheimer [this message]
2013-01-04 15:42             ` Seth Jennings
2013-01-04 22:45               ` Dan Magenheimer
2013-01-07 14:47                 ` Seth Jennings
2012-12-11 21:55 [PATCH 0/8] zswap: compressed swap caching Seth Jennings
2012-12-11 21:55 ` [PATCH 1/8] staging: zsmalloc: add gfp flags to zs_create_pool Seth Jennings
2012-12-11 21:56 ` [PATCH 2/8] staging: zsmalloc: remove unsed pool name Seth Jennings
2012-12-11 21:56 ` [PATCH 3/8] staging: zsmalloc: add page alloc/free callbacks Seth Jennings
2012-12-11 21:56 ` [PATCH 4/8] staging: zsmalloc: make CLASS_DELTA relative to PAGE_SIZE Seth Jennings
2012-12-11 21:56 ` [PATCH 5/8] debugfs: add get/set for atomic types Seth Jennings
2012-12-11 21:56 ` [PATCH 6/8] zsmalloc: promote to lib/ Seth Jennings
2012-12-11 21:56 ` [PATCH 7/8] zswap: add to mm/ Seth Jennings
2013-01-03 16:07   ` Seth Jennings
2012-12-11 21:56 ` [PATCH 8/8] zswap: add documentation Seth Jennings
2012-12-11 22:01 ` [PATCH 0/8] zswap: compressed swap caching Greg Kroah-Hartman
2012-12-12 16:29   ` Seth Jennings
2012-12-12 17:27     ` Dan Magenheimer
2012-12-12 18:32       ` Seth Jennings
2012-12-12 18:36 ` Seth Jennings
2012-12-12 22:49 ` Luigi Semenzato
2012-12-12 23:46   ` Dan Magenheimer
2012-12-14 15:59   ` Seth Jennings
2013-01-03 16:01 ` Seth Jennings

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=640d712e-0217-456a-a2d1-d03dd7914a55@default \
    --to=dan.magenheimer@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=dave@linux.vnet.ibm.com \
    --cc=devel@driverdev.osuosl.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=jhopper@us.ibm.com \
    --cc=jweiner@redhat.com \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lwoodman@redhat.com \
    --cc=mgorman@suse.de \
    --cc=minchan@kernel.org \
    --cc=ngupta@vflare.org \
    --cc=rcj@linux.vnet.ibm.com \
    --cc=riel@redhat.com \
    --cc=sjenning@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox