Re: [PATCH 5/8] memcg: migrate charge of anon

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: linux-mm <linux-mm@kvack.org>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Paul Menage <menage@google.com>, Li Zefan <lizf@cn.fujitsu.com>
Subject: Re: [PATCH 5/8] memcg: migrate charge of anon
Date: Thu, 17 Sep 2009 15:25:40 +0900	[thread overview]
Message-ID: <20090917152540.10b10028.kamezawa.hiroyu@jp.fujitsu.com> (raw)
In-Reply-To: <20090917145657.4ccc8d27.nishimura@mxp.nes.nec.co.jp>

On Thu, 17 Sep 2009 14:56:57 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> > > +	/*
> > > +	 * We should put back all pages which remain on "list".
> > > +	 * This means try_charge above has failed.
> > > +	 * Pages which have been moved to mc->list would be put back at
> > > +	 * clear_migrate_charge.
> > > +	 */
> > > +	if (!list_empty(&list))
> > > +		list_for_each_entry_safe(page, tmp, &list, lru) {
> > > +			list_del(&page->lru);
> > > +			putback_lru_page(page);
> > > +			put_page(page);
> > 
> > I wonder this put_page() is not necessary.
> > 
> get_page_unless_zero(page) is called above.
> putback_lru_page() will only decrease the refcount got by isolate_lru_page().
> 
ok. I misunderstood.


> > > +		}
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static int migrate_charge_prepare(void)
> > > +{
> > > +	int ret = 0;
> > > +	struct mm_struct *mm;
> > > +	struct vm_area_struct *vma;
> > > +
> > > +	mm = get_task_mm(mc->tsk);
> > > +	if (!mm)
> > > +		return 0;
> > > +
> > > +	down_read(&mm->mmap_sem);
> > > +	for (vma = mm->mmap; vma; vma = vma->vm_next) {
> > > +		struct mm_walk migrate_charge_walk = {
> > > +			.pmd_entry = migrate_charge_prepare_pte_range,
> > > +			.mm = mm,
> > > +			.private = vma,
> > > +		};
> > > +		if (signal_pending(current)) {
> > > +			ret = -EINTR;
> > > +			break;
> > > +		}
> > > +		if (is_vm_hugetlb_page(vma))
> > > +			continue;
> > > +		/* We migrate charge of private pages for now */
> > > +		if (vma->vm_flags & (VM_SHARED | VM_MAYSHARE))
> > > +			continue;
> > > +		if (mc->to->migrate_charge) {
> > > +			ret = walk_page_range(vma->vm_start, vma->vm_end,
> > > +							&migrate_charge_walk);
> > > +			if (ret)
> > > +				break;
> > > +		}
> > > +	}
> > > +	up_read(&mm->mmap_sem);
> > > +
> > > +	mmput(mm);
> > 
> > Hmm, Does this means a thread which  is moved can continue its work and
> > newly allocated pages will remain in old group ?
> > 
> Yes.
> Pages allocated between can_attach() and cgroup_task_migrate() will be
> charged to old group.
> But, IIUC, we should release mmap_sem because attach() of cpuset tries to down_write
> mmap_sem(update_tasks_nodemask -> cpuset_change_nodemask -> mpol_rebind_mm). 
> 
agreed.


> > 
> > > +	return ret;
> > > +}
> > > +
> > > +static void mem_cgroup_clear_migrate_charge(void)
> > > +{
> > > +	struct page *page, *tmp;
> > > +
> > > +	VM_BUG_ON(!mc);
> > > +
> > > +	if (!list_empty(&mc->list))
> > 
> > I think list_for_each_entry_safe() handles empty case.
> > 
> 
> #define list_for_each_entry_safe(pos, n, head, member)                  \
>         for (pos = list_entry((head)->next, typeof(*pos), member),      \
>                 n = list_entry(pos->member.next, typeof(*pos), member); \
>              &pos->member != (head);                                    \
>              pos = n, n = list_entry(n->member.next, typeof(*n), member))
> 
> hmm, "pos = list_entry((head)->next, typeof(*pos), member)" points to proper pointer
> in list_empty(i.e. head->next == head) case ?
I think so.

> "mc->list" is struct list_head, not struct page.
> 
yes.
	pos = list_entry((mc->list)->next, struct page, lru);
	here, mc->list->next == &mc->list if empty.

Then, pos is "struct page" but points to &mc->list - some bytes.
Then, pos->member == &mc->list == head.

But this point is a nitpick. plz do as you want.
if (!list_empty(mc->list)) show us mc->list can be empty.


> > > +	mem_cgroup_clear_migrate_charge();
> > >  }
> > >  
> > 
> > Okay, them, if other subsystem fails "can_attach()", migrated charges are not
> > moved back to original group. Right ?
> > 
> We haven't migrated charges at can_attach() stage. It only does try_charge to
> the new group.
> If can_attach() of other subsystem fails, we only cancel the result of try_charge.
> 
> > 
> > 
> > My biggest concern in this implementation is this "isolate" pages too much.
> > Could you modify the whole routine as...
> > 
> > struct migrate_charge {
> > 	struct task_struct *tsk;
> > 	struct mem_cgroup *from;
> > 	struct mem_cgroup *to;
> > 	struct list_head list;
> > 	long charged;
> > 	long committed;
> > };
> > 	- mem_cgroup_can_migrate_charge() ....
> > 			count pages and do "charge", no page isolation.
> > 			and remember the number of charges as mc->charged++;
> > 	- mem_cgroup_migrate_charge()
> > 		- scan vmas/page table again. And isolate pages in fine grain.
> > 		  (256 pages per scan..or some small number)
> > 		- migrate pages if success mc->committed++
> > 
> > 	  after move, uncharge (mc->charged - mc->commited)
> > 
> hmm, I selected current implementation just to prevent parsing page table twice.
> But if you prefer the parse-again direction, I'll try it.
> It will fix most of the problems that current implementation has.
> 
What I afraid of is a corner case, migrating "very big" process, and
INACITVE/ACTIVE lru of a zone goes down to be nearly empty, OOM.

If you look into other callers of isolate_lru_page(), the number of
pages isolated per iteration is limited to some value.
(Almost all callers use array for limiting/batching.)

Thanks,
-Kame







> 
> Thanks,
> Daisuke Nishimura.
> 
> > Maybe you can find better one. But isolating all pages of process at once is a
> > big hammber for vm, OOM.
> > 
> > Thanks,
> > -Kame
> > 
> > 
> > 
> > >  static void mem_cgroup_move_task(struct cgroup_subsys *ss,
> > > 
> > 
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2009-09-17  6:27 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-09-17  2:23 [RFC][EXPERIMENTAL][PATCH 0/8] memcg: migrate charge at task move Daisuke Nishimura
2009-09-17  2:24 ` [PATCH 1/8] memcg: introduce mem_cgroup_cancel_charge() Daisuke Nishimura
2009-09-17  4:12   ` KAMEZAWA Hiroyuki
2009-09-17  2:24 ` [PATCH 2/8] memcg: cleanup mem_cgroup_move_parent() Daisuke Nishimura
2009-09-17  4:15   ` KAMEZAWA Hiroyuki
2009-09-17  2:25 ` [PATCH 3/8] cgroup: introduce cancel_attach() Daisuke Nishimura
2009-09-17  2:26 ` [PATCH 4/8] memcg: add interface to migrate charge Daisuke Nishimura
2009-09-17  4:20   ` KAMEZAWA Hiroyuki
2009-09-17  4:40     ` Daisuke Nishimura
2009-09-17  2:26 ` [PATCH 5/8] memcg: migrate charge of anon Daisuke Nishimura
2009-09-17  4:57   ` KAMEZAWA Hiroyuki
2009-09-17  5:56     ` Daisuke Nishimura
2009-09-17  6:25       ` KAMEZAWA Hiroyuki [this message]
2009-09-17 23:52     ` KOSAKI Motohiro
2009-09-17  2:27 ` [PATCH 6/8] memcg: migrate charge of shmem Daisuke Nishimura
2009-09-17  5:02   ` KAMEZAWA Hiroyuki
2009-09-17  2:28 ` [PATCH 7/8] memcg: migrate charge of swap Daisuke Nishimura
2009-09-17  5:25   ` KAMEZAWA Hiroyuki
2009-09-17  6:17     ` Daisuke Nishimura
2009-09-17  6:28       ` KAMEZAWA Hiroyuki
2009-09-17  2:29 ` [PATCH 8/8] memcg: avoid oom during charge migration Daisuke Nishimura
2009-09-17  7:01 ` [RFC][EXPERIMENTAL][PATCH 0/8] memcg: migrate charge at task move Daisuke Nishimura
2009-09-24  5:42   ` [RFC][PATCH 0/8] memcg: migrate charge at task move (24/Sep) Daisuke Nishimura
2009-09-24  5:43     ` [RFC][PATCH 1/8] cgroup: introduce cancel_attach() Daisuke Nishimura
2009-09-24  6:33       ` KAMEZAWA Hiroyuki
2009-09-24 23:39         ` Daisuke Nishimura
2009-09-24  5:44     ` [RFC][PATCH 2/8] memcg: introduce mem_cgroup_cancel_charge() Daisuke Nishimura
2009-09-24  5:46     ` [RFC][PATCH 3/8] memcg: cleanup mem_cgroup_move_parent() Daisuke Nishimura
2009-09-24  6:37       ` KAMEZAWA Hiroyuki
2009-09-24  6:54         ` Daisuke Nishimura
2009-09-24  5:47     ` [RFC][PATCH 4/8] memcg: add interface to migrate charge Daisuke Nishimura
2009-09-24  6:54       ` KAMEZAWA Hiroyuki
2009-09-24 23:39         ` Daisuke Nishimura
2009-09-24  5:48     ` [RFC][PATCH 5/8] memcg: migrate charge of mapped page Daisuke Nishimura
2009-09-24  7:22       ` KAMEZAWA Hiroyuki
2009-09-24  8:00         ` Daisuke Nishimura
2009-09-25  0:28           ` Daisuke Nishimura
2009-09-25  0:49             ` KAMEZAWA Hiroyuki
2009-09-24  5:49     ` [RFC][PATCH 6/8] memcg: avoid oom during charge migration Daisuke Nishimura
2009-09-24  7:34       ` KAMEZAWA Hiroyuki
2009-09-25  1:44         ` Daisuke Nishimura
2009-09-25  1:55           ` KAMEZAWA Hiroyuki
2009-09-25  4:51             ` Daisuke Nishimura
2009-09-25  5:36               ` Daisuke Nishimura
2009-09-25  5:52                 ` KAMEZAWA Hiroyuki
2009-09-24  5:49     ` [RFC][PATCH 7/8] memcg: migrate charge of swap Daisuke Nishimura
2009-09-24  5:50     ` [RFC][PATCH 8/8] memcg: migrate charge of shmem swap Daisuke Nishimura
2009-09-24  7:41       ` KAMEZAWA Hiroyuki
2009-09-25  0:28         ` Daisuke Nishimura

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090917152540.10b10028.kamezawa.hiroyu@jp.fujitsu.com \
    --to=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=balbir@linux.vnet.ibm.com \
    --cc=linux-mm@kvack.org \
    --cc=lizf@cn.fujitsu.com \
    --cc=menage@google.com \
    --cc=nishimura@mxp.nes.nec.co.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox