linux-mm.kvack.org archive mirror
From: Greg Thelen <gthelen@google.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrea Righi <arighi@develer.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	containers@lists.osdl.org,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Subject: Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup
Date: Tue, 12 Oct 2010 00:32:33 -0700
Message-ID: <xr931v7vdfxq.fsf@ninji.mtv.corp.google.com>
In-Reply-To: <20101012095546.f23bb950.kamezawa.hiroyu@jp.fujitsu.com> (KAMEZAWA Hiroyuki's message of "Tue, 12 Oct 2010 09:55:46 +0900")

KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> writes:

> On Mon, 11 Oct 2010 17:24:21 -0700
> Greg Thelen <gthelen@google.com> wrote:
>
>> >> Is your motivation to increase performance with the same functionality?
>> >> If so, then would a 'static inline' be performance equivalent to a
>> >> preprocessor macro yet be safer to use?
>> >> 
>> > Ah, if lockdep finds this as bug, I think other parts will hit this,
>> > too.  like this.
>> >> static struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm)
>> >> {
>> >>         struct mem_cgroup *mem = NULL;
>> >> 
>> >>         if (!mm)
>> >>                 return NULL;
>> >>         /*
>> >>          * Because we have no locks, mm->owner's may be being moved to other
>> >>          * cgroup. We use css_tryget() here even if this looks
>> >>          * pessimistic (rather than adding locks here).
>> >>          */
>> >>         rcu_read_lock();
>> >>         do {
>> >>                 mem = mem_cgroup_from_task(rcu_dereference(mm->owner));
>> >>                 if (unlikely(!mem))
>> >>                         break;
>> >>         } while (!css_tryget(&mem->css));
>> >>         rcu_read_unlock();
>> >>         return mem;
>> >> }
>> 
>> mem_cgroup_from_task() calls task_subsys_state() calls
>> task_subsys_state_check().  task_subsys_state_check() will be happy if
>> rcu_read_lock is held.
>> 
> yes.
>
>> I don't think that this will fail lockdep, because rcu_read_lock_held()
>> is true when calling mem_cgroup_from_task() within
>> try_get_mem_cgroup_from_mm().
>> 
> agreed.
>
>> > mem_cgroup_from_task() is designed to be used as this.
>> > If defined as macro, I think it will not be caught.
>> 
>> I do not understand how making mem_cgroup_from_task() a macro will
>> change its behavior wrt. lockdep assertion checking.  I assume that
>> as a macro mem_cgroup_from_task() would still call task_subsys_state(),
>> which requires either:
>> a) rcu read lock held
>> b) task->alloc_lock held
>> c) cgroup lock held
>> 
>
> Hmm. Maybe I was wrong.
>
>> 
>> >> Maybe it makes more sense to find a way to perform this check in
>> >> mem_cgroup_has_dirty_limit() without needing to grab the rcu lock.  I
>> >> think this lock grab is unneeded.  I am still collecting performance
>> >> data, but suspect that this may be making the code slower than it needs
>> >> to be.
>> >> 
>> >
>> > Hmm. css_set[] itself is freed by RCU. What idea do you have for removing
>> > rcu_read_lock()? Adding some flags?
>> 
>> It seems like a shame to need a lock to determine if current is in the
>> root cgroup.  Especially given that as soon as
>> mem_cgroup_has_dirty_limit() returns, the task could be moved
>> in-to/out-of the root cgroup thereby invalidating the answer.  So the
>> answer is just a sample that may be wrong. 
>
> Yes. But it's not a bug but a specification.
>
>> But I think you are correct.
>> We will need the rcu read lock in mem_cgroup_has_dirty_limit().
>> 
>
> yes.
>
>
>> > Ah...I noticed that you should do
>> >
>> >  mem = mem_cgroup_from_task(current->mm->owner);
>> >
>> > to check has_dirty_limit...
>> 
>> What are the cases where current->mm->owner->cgroups !=
>> current->cgroups?
>> 
> In that case, assume group A and B.
>
>    thread(1) -> belongs to cgroup A  (thread(1) is mm->owner)
>    thread(2) -> belongs to cgroup B
> and
>    a page    -> charged to cgroup A
>
> Then, thread(2) makes the page dirty which is under cgroup A.
>
> In this case, if the page's dirty_pages accounting is added to cgroup B,
> cgroup B's statistics may show "dirty_pages > all_lru_pages". This is
> a bug.

I agree that in this case the dirty_pages accounting should be added to
cgroup A because that is where the page was charged.  This will happen
because pc->mem_cgroup was set to A when the page was charged.  The
mark-page-dirty code will check pc->mem_cgroup to determine which cgroup
to add the dirty page to.
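
To make the accounting rule above concrete, here is a minimal userspace
model of it.  This is only a sketch under stated assumptions, not the
patch itself: the structs are simplified stand-ins for the kernel's, and
the names model_charge_page(), model_set_page_dirty(), struct task and
nr_file_dirty are hypothetical, invented for illustration.  The one
thing it tries to show is that the cgroup of the dirtying task plays no
part in choosing which counter is incremented.

```c
/* Userspace model only: simplified stand-ins, not the kernel definitions. */
#include <assert.h>
#include <stddef.h>

struct mem_cgroup {
        const char *name;
        long nr_file_dirty;     /* stand-in for the FILE_DIRTY stat counter */
};

struct page_cgroup {
        struct mem_cgroup *mem_cgroup;  /* recorded when the page is charged */
};

struct task {
        struct mem_cgroup *memcg;       /* cgroup the task currently belongs to */
};

/* Charge an uncharged page: remember the owning cgroup in pc->mem_cgroup. */
static void model_charge_page(struct page_cgroup *pc, struct mem_cgroup *memcg)
{
        assert(pc->mem_cgroup == NULL);
        pc->mem_cgroup = memcg;
}

/*
 * Mark a page dirty.  The dirtying task is passed in only to make the
 * point that it is ignored: the counter that moves belongs to the cgroup
 * the page was charged to.
 */
static void model_set_page_dirty(struct page_cgroup *pc,
                                 const struct task *dirtier)
{
        (void)dirtier;
        if (pc->mem_cgroup)
                pc->mem_cgroup->nr_file_dirty++;
}
```

With a page charged to A and then dirtied by a task sitting in B, only
A's counter moves in this model, so B can never report more dirty pages
than it has pages on its LRU.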

I think that the current vs current->mm->owner decision is in areas of
the code that are used to query the dirty limits.  These routines do not
use this data to determine which cgroup to charge for dirty pages.  The
usage of either mem_cgroup_from_task(current->mm->owner) or
mem_cgroup_from_task(current) in mem_cgroup_has_dirty_limit() does not
determine which cgroup is added for dirty_pages.
mem_cgroup_has_dirty_limit() is only used to determine if the process
has a dirty limit.  As discussed, this is a momentary answer that may be
wrong by the time decisions are made because the task may be migrated
in-to/out-of root cgroup while mem_cgroup_has_dirty_limit() runs.  If
the process has a dirty limit, then the process's memcg is used to
compute dirty limits.  Using your example, I assume that thread(1) and
thread(2) will get dirty limits from cgroup(A) and cgroup(B)
respectively.
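
The two lookup choices can be compared with another small userspace
model.  Again this is a sketch under assumptions rather than the
proposed kernel code: struct task, struct mm, the is_root flag and the
model_*() helpers are hypothetical names, and locking (the rcu read lock
discussed earlier) is left out entirely.  It only illustrates which
memcg each choice would consult when querying limits.

```c
/* Userspace model only: simplified stand-ins, no RCU or locking. */
#include <stdbool.h>
#include <stddef.h>

struct mem_cgroup {
        const char *name;
        bool is_root;           /* the root cgroup has no dirty limit */
};

struct task;

struct mm {
        struct task *owner;     /* stand-in for mm->owner */
};

struct task {
        struct mem_cgroup *memcg;
        struct mm *mm;          /* NULL models a task without an mm */
};

/* Choice 1: consult the calling task's own cgroup. */
static struct mem_cgroup *model_memcg_from_current(const struct task *cur)
{
        return cur->memcg;
}

/* Choice 2: consult the cgroup of mm->owner; needs the extra NULL check. */
static struct mem_cgroup *model_memcg_from_owner(const struct task *cur)
{
        if (!cur->mm || !cur->mm->owner)
                return NULL;
        return cur->mm->owner->memcg;
}

/* A momentary answer: true if the chosen memcg exists and is not root. */
static bool model_has_dirty_limit(const struct mem_cgroup *memcg)
{
        return memcg && !memcg->is_root;
}
```

For thread(2) in the example, the first choice consults cgroup B and the
second consults cgroup A, because thread(1) is the mm owner.  Neither
choice affects where a dirty page is accounted.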

Are you thinking that when accounting for a dirty page (by incrementing
pc->mem_cgroup->stat->count[MEM_CGROUP_STAT_FILE_DIRTY]) that we should
check the pc->mem_cgroup dirty limit?

>> I was hoping to avoid having to add even more logic into
>> mem_cgroup_has_dirty_limit() to handle the case where current->mm is
>> NULL.
>> 
>
> Please check current->mm. We can't limit works of kernel-thread by this, let's
> consider it later if necessary.
>
>> Presumably the newly proposed vm_dirty_param(),
>> mem_cgroup_has_dirty_limit(), and mem_cgroup_page_stat() routines all
>> need to use the same logic.  I assume they should all be consistently
>> using current->mm->owner or current.
>> 
>
> please.
>
> Thanks,
> -Kame

