From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: nishimura@mxp.nes.nec.co.jp,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm <linux-mm@kvack.org>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Li Zefan <lizf@cn.fujitsu.com>, Hugh Dickins <hugh@veritas.com>
Subject: Re: [BUGFIX][PATCH] memcg: charge swapcache to proper memcg
Date: Tue, 10 Mar 2009 15:38:59 +0900
Message-ID: <20090310153859.ffe0671b.nishimura@mxp.nes.nec.co.jp>
In-Reply-To: <20090310140416.ecf5ba18.kamezawa.hiroyu@jp.fujitsu.com>

On Tue, 10 Mar 2009 14:04:16 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> On Tue, 10 Mar 2009 13:47:57 +0900
> Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> 
> > On Tue, 10 Mar 2009 13:33:16 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > > On Tue, 10 Mar 2009 10:07:07 +0900
> > > Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> > > 
> > > > From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > > > 
> > > > memcg_test.txt says at 4.1:
> > > > 
> > > > 	This swap-in is one of the most complicated work. In do_swap_page(),
> > > > 	following events occur when pte is unchanged.
> > > > 
> > > > 	(1) the page (SwapCache) is looked up.
> > > > 	(2) lock_page()
> > > > 	(3) try_charge_swapin()
> > > > 	(4) reuse_swap_page() (may call delete_from_swap_cache())
> > > > 	(5) commit_charge_swapin()
> > > > 	(6) swap_free().
> > > > 
> > > > 	Considering following situation for example.
> > > > 
> > > > 	(A) The page has not been charged before (2) and reuse_swap_page()
> > > > 	    doesn't call delete_from_swap_cache().
> > > > 	(B) The page has not been charged before (2) and reuse_swap_page()
> > > > 	    calls delete_from_swap_cache().
> > > > 	(C) The page has been charged before (2) and reuse_swap_page() doesn't
> > > > 	    call delete_from_swap_cache().
> > > > 	(D) The page has been charged before (2) and reuse_swap_page() calls
> > > > 	    delete_from_swap_cache().
> > > > 
> > > > 	    memory.usage/memsw.usage changes to this page/swp_entry will be
> > > >          Case          (A)      (B)       (C)     (D)
> > > >          Event
> > > >        Before (2)     0/ 1     0/ 1      1/ 1    1/ 1
> > > >           ===========================================
> > > >           (3)        +1/+1    +1/+1     +1/+1   +1/+1
> > > >           (4)          -       0/ 0       -     -1/ 0
> > > >           (5)         0/-1     0/ 0     -1/-1    0/ 0
> > > >           (6)          -       0/-1       -      0/-1
> > > >           ===========================================
> > > >        Result         1/ 1     1/ 1      1/ 1    1/ 1
> > > > 
> > > >        In any case, charges to this page should be 1/ 1.
> > > > 
> > > > In case (D), try_get_mem_cgroup_from_swapcache() returns NULL
> > > > (because lookup_swap_cgroup() returns NULL), so "+1/+1" at (3) means
> > > > a charge to the memcg ("foo") to which "current" belongs.
> > > > OTOH, "-1/0" at (4) and "0/-1" at (6) mean uncharges from the memcg ("baa")
> > > > to which the page has been charged.
> > > > 
> > > > So, if "foo" and "baa" are different (for example, because of a task move),
> > > > this charge will be moved from "baa" to "foo".
> > > > 
> > > > I think this is unexpected behavior.
> > > > 
> > > > This patch fixes this by modifying try_get_mem_cgroup_from_swapcache()
> > > > to return the memcg to which the swapcache has been charged if the PCG_USED
> > > > bit is set.
> > > > IIUC, checking the PCG_USED bit of swapcache is safe under page lock.
> > > > 
> > > > 
> > > > Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > > > ---
> > > >  mm/memcontrol.c |   15 +++++++++++++--
> > > >  1 files changed, 13 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > > index 73c51c8..f2efbc0 100644
> > > > --- a/mm/memcontrol.c
> > > > +++ b/mm/memcontrol.c
> > > > @@ -909,13 +909,24 @@ nomem:
> > > >  static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
> > > >  {
> > > >  	struct mem_cgroup *mem;
> > > > +	struct page_cgroup *pc;
> > > >  	swp_entry_t ent;
> > > >  
> > > > +	VM_BUG_ON(!PageLocked(page));
> > > > +
> > > >  	if (!PageSwapCache(page))
> > > >  		return NULL;
> > > >  
> > > > -	ent.val = page_private(page);
> > > > -	mem = lookup_swap_cgroup(ent);
> > > > +	pc = lookup_page_cgroup(page);
> > > > +	/*
> > > > +	 * Used bit of swapcache is solid under page lock.
> > > > +	 */
> > > > +	if (PageCgroupUsed(pc))
> > > > +		mem = pc->mem_cgroup;
> > > 
> > > I've already acked but how about returning NULL here ?
> > > 
> > Returning NULL here means try_charge_swapin() charges the "current" memcg
> > ("foo" in the patch description above).
> > So, it doesn't change the current behavior at all.
> > 
> 
> ok, maybe try_charge_swapin() should check Used bit...and set ptr=NULL 
> before reaching here.
> 
> Can't we move this 
> + pc = lookup_page_cgroup(page);
> + if (PageCgroupUsed(pc))
> 
> check to try_charge_swapin() ? (I think this is safe because the page is locked.)
> 
> This way, we can avoid extra work in commit_charge().
> 
Hmm, I thought the same thing at first, but skipping the charge at (3) would mean:

     Case       (D)
     Event
   Before (2)   1/ 1
      ===============
      (3)       0/ 0
      (4)      -1/ 0
      (5)       0/ 0
      (6)       0/-1
      ===============
   Result       0/ 0

Rather than reworking things so that this case is handled properly,
I chose the current direction (changing only memcg code).
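
To make the difference concrete, here is a small userspace toy model of
case (D) only. It is not kernel code: struct usage, enum policy and
simulate_case_d() are names made up for this illustration, and the per-step
deltas are simply copied from the tables in this thread. It assumes the page
was charged to "baa" before (2) and that the faulting task now belongs to "foo".

```c
#include <assert.h>

/* Toy stand-in for memory.usage / memsw.usage of one memcg (hypothetical). */
struct usage {
	int mem;
	int memsw;
};

/* What step (3), try_charge_swapin(), does in this model. */
enum policy {
	CHARGE_CURRENT,	/* charge the faulting task's memcg ("foo") */
	CHARGE_OWNER,	/* charge the memcg that already owns the page ("baa") */
	SKIP_IF_USED	/* charge nothing because the Used bit is set */
};

struct result {
	struct usage foo;
	struct usage baa;
};

/* Replays events (3)-(6) of case (D) using the deltas from the tables. */
static struct result simulate_case_d(enum policy p)
{
	/* Before (2): the page/swp_entry is accounted as 1/1 on "baa". */
	struct result r = { .foo = {0, 0}, .baa = {1, 1} };

	/* (3) try_charge_swapin(): +1/+1 on the chosen memcg, or 0/0. */
	if (p == CHARGE_CURRENT) {
		r.foo.mem++;
		r.foo.memsw++;
	} else if (p == CHARGE_OWNER) {
		r.baa.mem++;
		r.baa.memsw++;
	}

	/* (4) reuse_swap_page() -> delete_from_swap_cache(): -1/0 on "baa". */
	r.baa.mem--;

	/* (5) commit_charge_swapin(): 0/0 in case (D). */

	/* (6) swap_free(): 0/-1 on "baa". */
	r.baa.memsw--;

	/* Sanity check of the model: counters never go negative. */
	assert(r.baa.mem >= 0 && r.baa.memsw >= 0);
	return r;
}
```

In this model, CHARGE_CURRENT ends with 1/1 on "foo" and 0/0 on "baa" (the
charge has moved), SKIP_IF_USED ends with 0/0 on both, and CHARGE_OWNER keeps
1/1 on "baa".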


Thanks,
Daisuke Nishimura.
