From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: linux-mm@kvack.org,
"nishimura@mxp.nes.nec.co.jp" <nishimura@mxp.nes.nec.co.jp>,
vgoyal@redhat.com, m-ikeda@ds.jp.nec.com, gthelen@google.com,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH -mm 3/5] memcg scalable file stat accounting method
Date: Tue, 3 Aug 2010 09:03:27 +0530 [thread overview]
Message-ID: <20100803033327.GD3863@balbir.in.ibm.com> (raw)
In-Reply-To: <20100802191559.6af0cded.kamezawa.hiroyu@jp.fujitsu.com>
* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2010-08-02 19:15:59]:
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> When accounting file events per memory cgroup, we need to find the memory
> cgroup via page_cgroup->mem_cgroup. Currently, we use lock_page_cgroup()
> for that.
>
> But, considering the contexts in which the page_cgroups of file pages are
> accessed, we can use a lighter-weight mutual exclusion in most cases.
> When handling file caches, the only race we have to take care of is "moving"
> an account, IOW, overwriting page_cgroup->mem_cgroup. Because file status
> updates are done while the page cache is in a stable state, we don't have
> to worry about races with charge/uncharge.
>
> Unlike charge/uncharge, "move" does not happen frequently. It happens only
> at rmdir() and at task moving (with special settings).
> This patch adds a race checker for file-cache-status accounting vs. account
> moving. The new per-cpu, per-memcg counter MEM_CGROUP_ON_MOVE is added.
> The routine for an account move:
> 1. Increments the counter before starting the move.
> 2. Calls synchronize_rcu().
> 3. Decrements the counter after the move completes.
> With this, the file-status-accounting routine can check whether it needs to
> call lock_page_cgroup(). In most cases, it doesn't.
>
>
> Changelog: 20100730
> - some cleanup.
> Changelog: 20100729
> - replaced __this_cpu_xxx() with this_cpu_xxx()
> (because we are not under a spinlock)
> - added VM_BUG_ON().
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
> mm/memcontrol.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++---------
> 1 file changed, 66 insertions(+), 12 deletions(-)
>
> Index: mmotm-0727/mm/memcontrol.c
> ===================================================================
> --- mmotm-0727.orig/mm/memcontrol.c
> +++ mmotm-0727/mm/memcontrol.c
> @@ -88,6 +88,7 @@ enum mem_cgroup_stat_index {
> MEM_CGROUP_STAT_PGPGOUT_COUNT, /* # of pages paged out */
> MEM_CGROUP_STAT_SWAPOUT, /* # of pages, swapped out */
> MEM_CGROUP_EVENTS, /* incremented at every pagein/pageout */
> + MEM_CGROUP_ON_MOVE, /* A check for locking move account/status */
>
> MEM_CGROUP_STAT_NSTATS,
> };
> @@ -1074,7 +1075,49 @@ static unsigned int get_swappiness(struc
> return swappiness;
> }
>
> -/* A routine for testing mem is not under move_account */
> +static void mem_cgroup_start_move(struct mem_cgroup *mem)
> +{
> + int cpu;
> + /* for fast checking in mem_cgroup_update_file_stat() etc..*/
> + spin_lock(&mc.lock);
> + for_each_possible_cpu(cpu)
> + per_cpu(mem->stat->count[MEM_CGROUP_ON_MOVE], cpu) += 1;
Is for_each_possible_cpu() really required? Won't online CPUs suffice? There
can be a race if a hotplug event happens between the start and end of a move;
shouldn't we handle that? My concern is that with something like 1024 possible
CPUs today, we might need to optimize this further.
Maybe we can do this first and optimize later.
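For comparison, a hedged sketch of the online-CPU variant being suggested here
(hypothetical, not the posted patch). Note that get_online_cpus() sleeps, so
it has to be taken outside mc.lock, and a CPU_ONLINE notifier would still be
needed to initialize the counter on a CPU that comes up while a move is in
flight:

static void mem_cgroup_start_move(struct mem_cgroup *mem)
{
	int cpu;

	get_online_cpus();	/* block hotplug while we walk the CPUs */
	spin_lock(&mc.lock);
	for_each_online_cpu(cpu)
		per_cpu(mem->stat->count[MEM_CGROUP_ON_MOVE], cpu) += 1;
	spin_unlock(&mc.lock);
	put_online_cpus();
	/* make the raised counters visible to all readers */
	synchronize_rcu();
}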
--
Three Cheers,
Balbir