linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.cz>, Hugh Dickins <hughd@google.com>,
	Greg Thelen <gthelen@google.com>, Ying Han <yinghan@google.com>
Subject: Re: [PATCH 4/6] memcg: use new logic for page stat accounting.
Date: Tue, 28 Feb 2012 13:22:50 +0100	[thread overview]
Message-ID: <20120228122250.GB1702@cmpxchg.org> (raw)
In-Reply-To: <20120217182743.2d5f629e.kamezawa.hiroyu@jp.fujitsu.com>

On Fri, Feb 17, 2012 at 06:27:43PM +0900, KAMEZAWA Hiroyuki wrote:
> >From 5bf592d432e4552db74e94a575afcd83aa652a9d Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Thu, 2 Feb 2012 11:49:59 +0900
> Subject: [PATCH 4/6] memcg: use new logic for page stat accounting.
> 
> Now, page-stat-per-memcg is recorded into per page_cgroup flag by
> duplicating page's status into the flag. The reason is that memcg
> has a feature to move a page from a group to another group and we
> have race between "move" and "page stat accounting",
> 
> Under current logic, assume CPU-A and CPU-B. CPU-A does "move"
> and CPU-B does "page stat accounting".
> 
> When CPU-A goes 1st,
> 
>             CPU-A                           CPU-B
>                                     update "struct page" info.
>     move_lock_mem_cgroup(memcg)
>     see pc->flags
>     copy page stat to new group
>     overwrite pc->mem_cgroup.
>     move_unlock_mem_cgroup(memcg)
>                                     move_lock_mem_cgroup(mem)
>                                     set pc->flags
>                                     update page stat accounting
>                                     move_unlock_mem_cgroup(mem)
> 
> stat accounting is guarded by move_lock_mem_cgroup() and "move"
> logic (CPU-A) doesn't see changes in "struct page" information.
> 
> But it's costly to have the same information both in 'struct page' and
> 'struct page_cgroup'. And, there is a potential problem.
> 
> For example, assume we have PG_dirty accounting in memcg.
> PG_..is a flag for struct page.
> PCG_ is a flag for struct page_cgroup.
> (This is just an example. The same problem can be found in any
>  kind of page stat accounting.)
> 
> 	  CPU-A                               CPU-B
>       TestSet PG_dirty
>       (delay)                        TestClear PG_dirty
>                                      if (TestClear(PCG_dirty))
>                                           memcg->nr_dirty--
>       if (TestSet(PCG_dirty))
>           memcg->nr_dirty++
> 
> Here, memcg->nr_dirty = +1, this is wrong.
> This race was reported by  Greg Thelen <gthelen@google.com>.
> Now, only FILE_MAPPED is supported but fortunately, it's serialized by
> page table lock and this is not real bug, _now_,
> 
> If this potential problem is caused by having duplicated information in
> struct page and struct page_cgroup, we may be able to fix this by using
> original 'struct page' information.
> But we'll have a problem in "move account"
> 
> Assume we use only PG_dirty.
> 
>          CPU-A                   CPU-B
>     TestSet PG_dirty
>     (delay)                    move_lock_mem_cgroup()
>                                if (PageDirty(page))
>                                       new_memcg->nr_dirty++
>                                pc->mem_cgroup = new_memcg;
>                                move_unlock_mem_cgroup()
>     move_lock_mem_cgroup()
>     memcg = pc->mem_cgroup
>     new_memcg->nr_dirty++
> 
> accounting information may be double-counted. This was original
> reason to have PCG_xxx flags but it seems PCG_xxx has another problem.
> 
> I think we need a bigger lock as
> 
>      move_lock_mem_cgroup(page)
>      TestSetPageDirty(page)
>      update page stats (without any checks)
>      move_unlock_mem_cgroup(page)
> 
> This fixes both of problems and we don't have to duplicate page flag
> into page_cgroup. Please note: move_lock_mem_cgroup() is held
> only when there are possibility of "account move" under the system.
> So, in most path, status update will go without atomic locks.
> 
> This patch introduce mem_cgroup_begin_update_page_stat() and
> mem_cgroup_end_update_page_stat() both should be called at modifying
> 'struct page' information if memcg takes care of it. as
> 
>      mem_cgroup_begin_update_page_stat()
>      modify page information
>      mem_cgroup_update_page_stat()
>      => never check any 'struct page' info, just update counters.
>      mem_cgroup_end_update_page_stat().
> 
> This patch is slow because we need to call begin_update_page_stat()/
> end_update_page_stat() regardless of accounted will be changed or not.
> A following patch adds an easy optimization and reduces the cost.
> 
> Changes since v4
>  - removed unused argument *lock
>  - added more comments.
> Changes since v3
>  - fixed typos
> 
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2012-02-28 12:22 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-17  9:24 [PATCH 0/6] page cgroup diet v5 KAMEZAWA Hiroyuki
2012-02-17  9:25 ` [PATCH 1/6] memcg: remove EXPORT_SYMBOL(mem_cgroup_update_page_stat) KAMEZAWA Hiroyuki
2012-02-20  8:54   ` Johannes Weiner
2012-02-17  9:26 ` [PATCH 2/6] memcg: simplify move_account() check KAMEZAWA Hiroyuki
2012-02-20  8:55   ` Johannes Weiner
2012-02-17  9:26 ` [PATCH 3/6] memcg: remove PCG_MOVE_LOCK flag from page_cgroup KAMEZAWA Hiroyuki
2012-02-20  8:56   ` Johannes Weiner
2012-02-17  9:27 ` [PATCH 4/6] memcg: use new logic for page stat accounting KAMEZAWA Hiroyuki
2012-02-28 12:22   ` Johannes Weiner [this message]
2012-02-17  9:28 ` [PATCH 5/6] memcg: remove PCG_FILE_MAPPED KAMEZAWA Hiroyuki
2012-02-18 13:39   ` Johannes Weiner
2012-02-18 14:43     ` Hillf Danton
2012-02-19 23:52       ` KAMEZAWA Hiroyuki
2012-02-17  9:28 ` [PATCH 6/6] memcg: fix performance of mem_cgroup_begin_update_page_stat() KAMEZAWA Hiroyuki
2012-02-28 12:25   ` Johannes Weiner
2012-02-17 10:04 ` [PATCH 0/6] page cgroup diet v5 KAMEZAWA Hiroyuki
2012-02-17 21:45 ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120228122250.GB1702@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=gthelen@google.com \
    --cc=hughd@google.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    --cc=yinghan@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox