From: Michal Hocko <mhocko@suse.cz>
To: Sha Zhengju <handai.szj@gmail.com>
Cc: cgroups@vger.kernel.org, linux-mm@kvack.org,
kamezawa.hiroyu@jp.fujitsu.com, akpm@linux-foundation.org,
hughd@google.com, gthelen@google.com,
Sha Zhengju <handai.szj@taobao.com>
Subject: Re: [PATCH V2 3/3] memcg: simplify lock of memcg page stat account
Date: Mon, 13 May 2013 15:12:51 +0200
Message-ID: <20130513131251.GB5246@dhcp22.suse.cz>
In-Reply-To: <1368421545-4974-1-git-send-email-handai.szj@taobao.com>

On Mon 13-05-13 13:05:44, Sha Zhengju wrote:
> From: Sha Zhengju <handai.szj@taobao.com>
>
> After removing duplicated information like PCG_* flags in
> 'struct page_cgroup'(commit 2ff76f1193), there's a problem between
> "move" and "page stat accounting"(only FILE_MAPPED is supported now
> but other stats will be added in future, and here I'd like to take
> dirty page as an example):
>
> Assume CPU-A does "page stat accounting" and CPU-B does "move"
>
>         CPU-A                            CPU-B
> TestSet PG_dirty
> (delay)                          move_lock_mem_cgroup()
>                                  if (PageDirty(page)) {
>                                          old_memcg->nr_dirty--
>                                          new_memcg->nr_dirty++
>                                  }
>                                  pc->mem_cgroup = new_memcg;
>                                  move_unlock_mem_cgroup()
>
> move_lock_mem_cgroup()
> memcg = pc->mem_cgroup
> memcg->nr_dirty++
> move_unlock_mem_cgroup()
>
> while accounting information of new_memcg may be double-counted. So we
> use a bigger lock to solve this problem: (commit: 89c06bd52f)
>
> move_lock_mem_cgroup() <-- mem_cgroup_begin_update_page_stat()
> TestSetPageDirty(page)
> update page stats (without any checks)
> move_unlock_mem_cgroup() <-- mem_cgroup_end_update_page_stat()
>
>
> But this method also has its pros and cons: at present we use two layers
> of lock avoidance (memcg_moving and memcg->moving_account) and then a
> spinlock on the memcg (see mem_cgroup_begin_update_page_stat()), but the
> lock granularity is rather coarse: not only the critical section but
> also some other code logic falls inside the locked region, which may be
> deadlock prone. While trying to add memcg dirty page accounting, it runs
> into further difficulty with the page cache radix-tree lock and, even
> worse, mem_cgroup_begin_update_page_stat() requires nesting
> (https://lkml.org/lkml/2013/1/2/48). However, while the current patch
> was being prepared, the lock nesting problem became no longer possible,
> as s390/mm has reworked it (commit: abf09bed), but it would still be better
> if we can make the lock simpler and recursive safe.

This patch doesn't make the charge move locking recursive safe. It
just tries to overcome the problem in the path where it doesn't exist
anymore. mem_cgroup_begin_update_page_stat would still deadlock if it
was re-entered.

It makes PageCgroupUsed usage even more tricky because it uses it out of
lock_page_cgroup context. It seems that this would work in this
particular path because atomic_inc_and_test(_mapcount) will protect from
double accounting but the whole dance around old_memcg seems pointless
to me.

I am sorry but I do not think this is the right approach. IMO we should
focus on mem_cgroup_begin_update_page_stat and make it really recursive
safe - ideally without any additional overhead (which sounds like a real
challenge).
[...]
--
Michal Hocko
SUSE Labs