linux-mm.kvack.org archive mirror
* Re: [RFC][PATCH 0/3] memcg: remove refcnt
       [not found] <20080408190734.70ab55b0.kamezawa.hiroyu@jp.fujitsu.com>
@ 2008-04-11  4:57 ` Balbir Singh
       [not found] ` <20080408191311.73b167bb.kamezawa.hiroyu@jp.fujitsu.com>
  1 sibling, 0 replies; 7+ messages in thread
From: Balbir Singh @ 2008-04-11  4:57 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: linux-mm, xemul, yamamoto, lizf

KAMEZAWA Hiroyuki wrote:
> This patch is based on 2.6.25-rc8-mm1 + mem_cgroup_per_zone() fix.
> (already in -mm) 
> 
> This patch set removes refcnt from the memory resource controller's
> page_cgroup. Instead of ref_cnt, it uses page_mapped().
> This lets us avoid unnecessary locks and calls to some extent.
> 
> Brief Patch Desc.
>  [1/3] change migration handling .... charge new-page before migration.
>  [2/3] remove refcnt             .... remove refcnt from page_cgroup.
>  [3/3] handle swapcache          .... handle swapcache again.
> 
> [1/3] works for better page migration handling.
> [2/3] works for better speed. (depends on [1/3])
> [3/3] works for swap-cache.   (depends on [2/3])
> 
> 
> 
> Unix bench execl result(ia64):
> No controller   :           43.0     2654.7      617.4
> with controller :           43.0     2461.3      572.4
> after this patch:           43.0     2553.6      593.9
> 
> If page_cgroup->ref_cnt is necessary (for some purpose), please tell me.
> 
> Plan:
> I'd like to push this set before the complicated radix-tree page_cgroup set.
> But this should be reviewed before going ahead.
> 

I think this makes a lot of sense. We can push the optimizations independent of
the radix tree, so that it is easy to debug and develop.
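
For reference, the Unix bench execl figures quoted above can be turned into
relative overheads. The sketch below is only illustrative arithmetic: it
assumes the middle column is a score where higher is better, and the helper
names overhead_pct() and recovered_pct() are made up for this example.

```c
#include <assert.h>

/* Relative slowdown of `score` versus `baseline`, in percent.
 * Assumption: a higher score means better performance. */
double overhead_pct(double baseline, double score)
{
    assert(baseline > 0.0);
    return (baseline - score) / baseline * 100.0;
}

/* Share of the controller's overhead that the patch set wins back,
 * in percent. `before` and `after` are scores with the controller
 * enabled, without and with the patch set respectively. */
double recovered_pct(double baseline, double before, double after)
{
    assert(baseline > before);
    return (after - before) / (baseline - before) * 100.0;
}
```

With the quoted scores, overhead_pct(2654.7, 2461.3) is about 7.3 and
overhead_pct(2654.7, 2553.6) is about 3.8, so under that assumption the
patch set wins back a little under half of the controller's cost on this test.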

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

* Re: [RFC][PATCH 3/3] account swapcache
       [not found] ` <20080408191311.73b167bb.kamezawa.hiroyu@jp.fujitsu.com>
@ 2008-04-11 12:20   ` Daisuke Nishimura
  2008-04-14  0:47     ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 7+ messages in thread
From: Daisuke Nishimura @ 2008-04-11 12:20 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-mm, balbir, xemul, yamamoto, lizf, Hugh Dickins, IKEDA, Munehiro

Hi, KAMEZAWA-san.

KAMEZAWA Hiroyuki wrote:
> Swap cache is currently not accounted (because it caused some trouble).
> 
> This patch retries accounting swap cache, based on the remove-refcnt patch.
> 
> What this patch does:
>  * When a page is swap-cache,  mem_cgroup_uncharge_page() will *not*
>    uncharge page even if page->mapcount == 0.
>  * When a page is removed from swap-cache, mem_cgroup_uncharge_page()
>    is called again.
>  * A swapcache page is newly charged only when it's mapped.
> 
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> 

I agree with the idea that swap caches should be charged as memory.
(I think they may be charged as swap at the same time.)

IMO, not charging swap caches as memory occasionally causes a problem:
swap caches are not freed even when the process that owns
those pages tries to free them (e.g. at task exit).

For example:

  Some pages are being reclaimed via memcg memory reclaim.

  Assume that shrink_page_list() has already moved those pages
  to swap cache, unmapped them from ptes, removed from mz->lru,
  and is working on other pages on page_list.
  Those swap cache pages are unlocked and
  their page_count is 2 (swap cache, isolate_page).

  At the same time, on another CPU, if the process that owns those
  pages is trying to free them, free_swap_and_cache() cannot
  free those pages unless vm_swap_full() is true, because find_get_page()
  increases page_count.

I think this rare case also exists in global memory reclaim,
but global memory reclaim does not assume that those pages have
been freed, so if it needs to free more memory, those pages
will be freed later because they remain on the global inactive list.

The problem here is that those swap cache pages are uncharged
from memcg, so memcg can never reclaim those pages that belonged
to the group.


Thanks,
Daisuke Nishimura.


* Re: [RFC][PATCH 3/3] account swapcache
  2008-04-11 12:20   ` [RFC][PATCH 3/3] account swapcache Daisuke Nishimura
@ 2008-04-14  0:47     ` KAMEZAWA Hiroyuki
  2008-04-14  8:03       ` Daisuke Nishimura
  0 siblings, 1 reply; 7+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-04-14  0:47 UTC (permalink / raw)
  To: Daisuke Nishimura
  Cc: linux-mm, balbir, xemul, yamamoto, lizf, Hugh Dickins, IKEDA, Munehiro

On Fri, 11 Apr 2008 21:20:55 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
> IMO, not charging swap caches as memory occasionally causes a problem:
> swap caches are not freed even when the process that owns
> those pages tries to free them (e.g. at task exit).
> 
> For example:
> 
>   Some pages are being reclaimed via memcg memory reclaim.
> 
>   Assume that shrink_page_list() has already moved those pages
>   to swap cache, unmapped them from ptes, removed from mz->lru,
>   and is working on other pages on page_list.
>   Those swap cache pages are unlocked and
>   their page_count is 2 (swap cache, isolate_page).
> 
>   At the same time, on another CPU, if the process that owns those
>   pages is trying to free them, free_swap_and_cache() cannot
>   free those pages unless vm_swap_full() is true, because find_get_page()
>   increases page_count.
> 
> I think this rare case also exists in global memory reclaim,
> but global memory reclaim does not assume that those pages have
> been freed, so if it needs to free more memory, those pages
> will be freed later because they remain on the global inactive list.
> 
yes.

> The problem here is that those swap cache pages are uncharged
> from memcg, so memcg can never reclaim those pages that belonged
> to the group.
> 
why "never" uncharged ? 

Assume "page" is SwapCache and unmapped and clean. 
==
 shrink_page_list()
	-> PageSwapCache() == true
	-> PageWriteback() == false
	-> PageDirty()     == false
	-> PagePrivate()   == true or false
	-> remove_mapping()
		-> page_count() == 2
		-> PageDirty()  == false
		-> PageSwapCache() == true
			-> __delete_from_swap_cache()
			-> true
	-> page will be freed
==

Page shrinking can free SwapCache pages regardless of the vm_swap_full() result.
Of course, my patch handles __delete_from_swap_cache().
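
The trace above can be restated as a small userspace model. This is a toy
sketch, not the kernel's remove_mapping(): struct toy_page and the toy_*
helpers are hypothetical, locking is ignored, and only the swap cache path
is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few bits of struct page used here. */
struct toy_page {
    int  count;      /* models page_count() */
    bool dirty;      /* models PageDirty() */
    bool swapcache;  /* models PageSwapCache() */
};

/* Models __delete_from_swap_cache(): the page leaves the swap cache. */
void toy_delete_from_swap_cache(struct toy_page *p)
{
    assert(p->swapcache);
    p->swapcache = false;
}

/* Models the checks in the trace: the page can be dropped only if the
 * swap cache and the reclaimer hold the only two references and the
 * page is clean. vm_swap_full() is never consulted. */
bool toy_remove_mapping(struct toy_page *p)
{
    if (p->count != 2)
        return false;
    if (p->dirty)
        return false;
    if (p->swapcache) {
        toy_delete_from_swap_cache(p);
        p->count--;  /* drop the swap cache's reference */
        return true;
    }
    return false;    /* file-backed pages are not modelled */
}
```

In this model a clean, unmapped SwapCache page with count 2 is always
removed, while an extra reference or a set dirty bit makes the call fail and
leaves the page in the swap cache.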
 
Thanks,
-Kame




* Re: [RFC][PATCH 3/3] account swapcache
  2008-04-14  0:47     ` KAMEZAWA Hiroyuki
@ 2008-04-14  8:03       ` Daisuke Nishimura
  2008-04-14  8:23         ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 7+ messages in thread
From: Daisuke Nishimura @ 2008-04-14  8:03 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-mm, balbir, xemul, yamamoto, lizf, Hugh Dickins, IKEDA, Munehiro

KAMEZAWA Hiroyuki wrote:
> On Fri, 11 Apr 2008 21:20:55 +0900
> Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:
>> IMO, not charging swap caches as memory occasionally causes a problem:
>> swap caches are not freed even when the process that owns
>> those pages tries to free them (e.g. at task exit).
>>
>> For example:
>>
>>   Some pages are being reclaimed via memcg memory reclaim.
>>
>>   Assume that shrink_page_list() has already moved those pages
>>   to swap cache, unmapped them from ptes, removed from mz->lru,
>>   and is working on other pages on page_list.
>>   Those swap cache pages are unlocked and
>>   their page_count is 2 (swap cache, isolate_page).
>>
>>   At the same time, on another CPU, if the process that owns those
>>   pages is trying to free them, free_swap_and_cache() cannot
>>   free those pages unless vm_swap_full() is true, because find_get_page()
>>   increases page_count.
>>
>> I think this rare case also exists in global memory reclaim,
>> but global memory reclaim does not assume that those pages have
>> been freed, so if it needs to free more memory, those pages
>> will be freed later because they remain on the global inactive list.
>>
> yes.
> 
>> The problem here is that those swap cache pages are uncharged
>> from memcg, so memcg can never reclaim those pages that belonged
>> to the group.
>>
> why "never" uncharged ? 
> 
> Assume "page" is SwapCache and unmapped and clean. 
> ==
>  shrink_page_list()
> 	-> PageSwapCache() == true
> 	-> PageWriteback() == false
> 	-> PageDirty()     == false
> 	-> PagePrivate()   == true or false
> 	-> remove_mapping()
> 		-> page_count() == 2
> 		-> PageDirty()  == false
> 		-> PageSwapCache() == true
> 			-> __delete_from_swap_cache()
> 			-> true
> 	-> page will be freed
> ==
> 
You are right.

I was thinking of the case below.
Assume some anonymous pages (mapped, referenced, !SwapCache)
are being reclaimed.

shrink_page_list()
	-> add_to_swap() <- makes the page dirty.
	->  try_to_unmap() <- uncharged from memcg and removed from mz->lru.
	-> PageDirty() == true
		sc->order <= PAGE_ALLOC_COSTLY_ORDER && referenced
			goto keep_locked
	-> unlocks the page and will work on other pages on page_list.

And if, on another CPU, the process that owns those pages is exiting
at the time of my example above, those pages remain only on the
global lru, and are never charged (mapped) again because the process exits.

I said "never" because once they are removed from mz->lru,
mem_cgroup_isolate_pages() doesn't select those pages
unless they are charged (mapped) again.
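
The scenario above can be modelled in userspace. This is a toy sketch, not
kernel code: struct sketch_page and the sketch_* helpers are hypothetical,
and the two booleans merely stand in for membership of the global lru and
of mz->lru when swap cache is not charged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page and its LRU membership. */
struct sketch_page {
    bool mapped;
    bool on_global_lru;
    bool on_memcg_lru;   /* models membership of mz->lru */
};

/* Unmap without swap cache accounting: the uncharge that follows the
 * last unmap also takes the page off mz->lru. */
void sketch_unmap_uncharged(struct sketch_page *p)
{
    p->mapped = false;
    p->on_memcg_lru = false;
}

/* Number of pages a per-memcg scan could select. */
size_t sketch_memcg_candidates(const struct sketch_page *pages, size_t n)
{
    size_t found = 0;
    for (size_t i = 0; i < n; i++)
        if (pages[i].on_memcg_lru)
            found++;
    return found;
}

/* Number of pages a global scan could select. */
size_t sketch_global_candidates(const struct sketch_page *pages, size_t n)
{
    size_t found = 0;
    for (size_t i = 0; i < n; i++)
        if (pages[i].on_global_lru)
            found++;
    return found;
}
```

After sketch_unmap_uncharged() a page is still visible to the global scan
but no longer to the per-memcg scan, which is the sense of "never" used here.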


> Page shrinking can free SwapCache pages regardless of the vm_swap_full() result.
> Of course, my patch handles __delete_from_swap_cache().
>  
Yes.
I think your patch can handle the case I'm describing.


Thanks,
Daisuke Nishimura.


* Re: [RFC][PATCH 3/3] account swapcache
  2008-04-14  8:03       ` Daisuke Nishimura
@ 2008-04-14  8:23         ` KAMEZAWA Hiroyuki
  2008-04-14  8:36           ` Daisuke Nishimura
  0 siblings, 1 reply; 7+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-04-14  8:23 UTC (permalink / raw)
  To: Daisuke Nishimura
  Cc: Daisuke Nishimura, linux-mm, balbir, xemul, yamamoto, lizf,
	Hugh Dickins, IKEDA, Munehiro

On Mon, 14 Apr 2008 17:03:53 +0900
Daisuke Nishimura <d-nishimura@mtf.biglobe.ne.jp> wrote:

> I was thinking of the case below.
> Assume some anonymous pages (mapped, referenced, !SwapCache)
> are being reclaimed.
> 

Numbering for below.

(1)> shrink_page_list()
(2)> 	-> add_to_swap() <- makes the page dirty.
(3)> 	-> try_to_unmap() <- uncharged from memcg and removed from mz->lru.
(4)> 	-> PageDirty() == true
(5)> 		sc->order <= PAGE_ALLOC_COSTLY_ORDER && referenced
(6)> 			goto keep_locked
(7)> 	-> unlocks the page and will work on other pages on page_list.
> 
> And if, on another CPU, the process that owns those pages is exiting
> at the time of my example above, those pages remain only on the
> global lru, and are never charged (mapped) again because the process exits.
> 
> I said "never" because once they are removed from mz->lru,
> mem_cgroup_isolate_pages() doesn't select those pages
> unless they are charged (mapped) again.
> 
I'm sorry if I've missed your point.

Because of (2), it's marked as SwapCache.
At (3), the page is not removed from mz->lru because it's SwapCache (see my patch).
The page is still on mz->lru after (7).

After the process exits, this page will be reclaimed when page reclaim for
page_cgroup finds it.
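
The uncharge rule described for the swapcache patch can be sketched as a
small userspace model. It is an illustration under stated assumptions, not
the patch itself: struct model_page and the model_* helpers are hypothetical,
and "charged" is taken to imply that the page is still on mz->lru.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page tracked by the memory controller. */
struct model_page {
    int  mapcount;   /* models page->mapcount */
    bool swapcache;  /* models PageSwapCache() */
    bool charged;    /* assumed to imply "still on mz->lru" */
};

/* Uncharge only when the page is neither mapped nor in swap cache.
 * Returns true if the charge was actually dropped. */
bool model_uncharge_page(struct model_page *p)
{
    if (p->mapcount > 0 || p->swapcache)
        return false;
    p->charged = false;
    return true;
}

/* Dropping a mapping attempts an uncharge, like try_to_unmap() at (3). */
void model_unmap(struct model_page *p)
{
    assert(p->mapcount > 0);
    p->mapcount--;
    model_uncharge_page(p);
}

/* Leaving swap cache attempts the uncharge again. */
void model_delete_from_swap_cache(struct model_page *p)
{
    assert(p->swapcache);
    p->swapcache = false;
    model_uncharge_page(p);
}
```

In this model a page that has been added to swap cache and then unmapped
stays charged, and the charge is dropped only once the page also leaves the
swap cache.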

Thanks,
-Kame



* Re: [RFC][PATCH 3/3] account swapcache
  2008-04-14  8:23         ` KAMEZAWA Hiroyuki
@ 2008-04-14  8:36           ` Daisuke Nishimura
  2008-04-14  8:45             ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 7+ messages in thread
From: Daisuke Nishimura @ 2008-04-14  8:36 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-mm, balbir, xemul, yamamoto, lizf, Hugh Dickins, IKEDA, Munehiro

KAMEZAWA Hiroyuki wrote:
> On Mon, 14 Apr 2008 17:03:53 +0900
> Daisuke Nishimura <d-nishimura@mtf.biglobe.ne.jp> wrote:
> 
>> I was thinking of the case below.
>> Assume some anonymous pages (mapped, referenced, !SwapCache)
>> are being reclaimed.
>>
> 
> Numbering for below.
> 
> (1)> shrink_page_list()
> (2)> 	-> add_to_swap() <- makes the page dirty.
> (3)> 	-> try_to_unmap() <- uncharged from memcg and removed from mz->lru.
> (4)> 	-> PageDirty() == true
> (5)> 		sc->order <= PAGE_ALLOC_COSTLY_ORDER && referenced
> (6)> 			goto keep_locked
> (7)> 	-> unlocks the page and will work on other pages on page_list.
>> And if, on another CPU, the process that owns those pages is exiting
>> at the time of my example above, those pages remain only on the
>> global lru, and are never charged (mapped) again because the process exits.
>>
>> I said "never" because once they are removed from mz->lru,
>> mem_cgroup_isolate_pages() doesn't select those pages
>> unless they are charged (mapped) again.
>>
> I'm sorry if I've missed your point.
> 
> Because of (2), it's marked as SwapCache.
> At (3), the page is not removed from mz->lru because it's SwapCache (see my patch).
> The page is still on mz->lru after (7).
> 
> After the process exits, this page will be reclaimed when page reclaim for
> page_cgroup finds it.
> 
> Thanks,
> -Kame
> 
I was talking about the case where swap caches are not charged.
I showed one of the problems that arises if they are not charged.

Sorry for confusing you.

I agree that your patch handles this case :-)


Thanks,
Daisuke Nishimura.


* Re: [RFC][PATCH 3/3] account swapcache
  2008-04-14  8:36           ` Daisuke Nishimura
@ 2008-04-14  8:45             ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 7+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-04-14  8:45 UTC (permalink / raw)
  To: Daisuke Nishimura
  Cc: Daisuke Nishimura, linux-mm, balbir, xemul, yamamoto, lizf,
	Hugh Dickins, IKEDA, Munehiro

On Mon, 14 Apr 2008 17:36:05 +0900
Daisuke Nishimura <d-nishimura@mtf.biglobe.ne.jp> wrote:

> I was talking about the case where swap caches are not charged.
> I showed one of the problems that arises if they are not charged.
> 
> Sorry for confusing you.
> 
no problem.

> I agree that your patch handles this case :-)
> 
Thank you for the review :)

Regards,
-Kame

