From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: "linux-mm@kvack.org" <linux-mm@kvack.org>
Cc: "balbir@linux.vnet.ibm.com" <balbir@linux.vnet.ibm.com>,
"nishimura@mxp.nes.nec.co.jp" <nishimura@mxp.nes.nec.co.jp>,
"hugh.dickins@tiscali.co.uk" <hugh.dickins@tiscali.co.uk>,
"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: [RFC][PATCH] memcg: fix swap account (26/May)[0/5]
Date: Tue, 26 May 2009 12:12:59 +0900
Message-ID: <20090526121259.b91b3e9d.kamezawa.hiroyu@jp.fujitsu.com>
As Nishimura reported, there is a race in swap cache handling.
Typical cases are as follows (from Nishimura's mail):
== Type-1 ==
If some pages of processA have been swapped out, it calls free_swap_and_cache().
If, at the same time, processB is calling read_swap_cache_async() on
a swap entry *that is used by processA*, a race like the one below can happen.
              processA               |              processB
-------------------------------------+-------------------------------------
  (free_swap_and_cache())            |  (read_swap_cache_async())
                                     |    swap_duplicate()
                                     |    __set_page_locked()
                                     |    add_to_swap_cache()
    swap_entry_free() == 0           |
    find_get_page() -> found         |
    try_lock_page() -> fail & return |
                                     |    lru_cache_add_anon()
                                     |      doesn't link this page to memcg's
                                     |      LRU, because of !PageCgroupUsed.
This type of leak can be avoided by setting /proc/sys/vm/page-cluster to 0.
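For reference, page-cluster sets swap readahead as a power of two, so 0 limits each swap-in to a single page and processB no longer reads ahead into entries owned by processA. Below is a minimal sketch of setting it from a program. The helper names and the path parameter are only illustrative (not from any patch in this series); the real file is /proc/sys/vm/page-cluster, and writing it needs root.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write `value` to a sysctl-style file such as /proc/sys/vm/page-cluster.
 * Returns 0 on success, -1 on failure. The helper name and the path
 * parameter are hypothetical, chosen so it can be tried on a scratch file.
 */
int write_page_cluster(const char *path, int value)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", value) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Read the current value back. Returns -1 if it cannot be read. */
int read_page_cluster(const char *path)
{
	FILE *f;
	int value;

	assert(path != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}
```

Calling write_page_cluster("/proc/sys/vm/page-cluster", 0) as root would apply the workaround. It only hides Type-1 by turning swap readahead off; it does not fix the accounting itself.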
== Type-2 ==
Assume processA is exiting and a pte points to a page (!PageSwapCache),
and processB is trying to reclaim the page.
              processA               |              processB
-------------------------------------+-------------------------------------
  (page_remove_rmap())               |  (shrink_page_list())
    mem_cgroup_uncharge_page()       |
      ->uncharged because it's not   |
        PageSwapCache yet.           |
        So, both mem/memsw.usage     |
        are decremented.             |
                                     |    add_to_swap() -> added to swap cache.
If this page goes through reclaim without being freed for some reason, it
doesn't go back to memcg's LRU because of !PageCgroupUsed.
==
This patch set is an attempt to fix the above problems by fixing memcg's swap
accounting logic. But this requires a fair amount of change in the swap code.
Compared with my previous post (22/May)
(http://marc.info/?l=linux-mm&m=124297915418698&w=2),
I think this one is much easier to read...
[1/5] change interface of swap_duplicate()/swap_free()
    Adds new functions swapcache_prepare() and swapcache_free().
[2/5] add SWAP_HAS_CACHE flag to swap_map
    Adds a SWAP_HAS_CACHE flag to the swap_map array so that the state
    "there is only a swap cache and the swap entry has no references"
    can be detected without calling find_get_page().
[3/5] count the number of swap-cache-only swaps
    After repeated swap-in/out, there are tons of cache-only swap entries
    (via a mapped swapcache under vm_swap_full()==false).
    This patch counts such entries and shows the count in debug information
    (for example, sysrq-m).
[4/5] fix memcg's swap accounting
    Changes memcg's swap accounting logic to look at the # of references to swap.
[5/5] experimental garbage collection for cache-only swaps
    Reclaims swap entries that are not in use.
Patch [4/5] is for Type-1.
Patch [5/5] is for Type-2 and for keeping swap usage under sane control...
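To illustrate the idea behind [2/5], [3/5] and [5/5], here is a small userspace sketch. It assumes one unsigned short per swap slot, with the top bit as the "has swap cache" flag and the remaining bits as the reference count. The bit position, the mask and every helper name are assumptions made for illustration, not the values or functions used in the patches.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical layout of one swap_map slot: the top bit says that a swap
 * cache page exists, the low bits hold the reference count. The actual
 * patch may pick a different bit and a different mask.
 */
#define SWAP_HAS_CACHE	0x8000u
#define SWAP_COUNT_MASK	0x7fffu

static inline unsigned int swap_count(unsigned short ent)
{
	return ent & SWAP_COUNT_MASK;
}

static inline int swap_has_cache(unsigned short ent)
{
	return (ent & SWAP_HAS_CACHE) != 0;
}

/* Cache-only: a swap cache page exists but nothing references the slot. */
static inline int swap_cache_only(unsigned short ent)
{
	return swap_has_cache(ent) && swap_count(ent) == 0;
}

/* Drop one reference. Returns nonzero if the slot became cache-only. */
int swap_put_ref(unsigned short *ent)
{
	assert(swap_count(*ent) > 0);
	*ent -= 1;	/* only the count bits change, the flag is kept */
	return swap_cache_only(*ent);
}

/* Count cache-only slots, the kind of number [3/5] exposes for debugging. */
size_t count_cache_only(const unsigned short *map, size_t n)
{
	size_t i, nr = 0;

	for (i = 0; i < n; i++)
		if (swap_cache_only(map[i]))
			nr++;
	return nr;
}

/*
 * Garbage-collection pass in the spirit of [5/5]: release every cache-only
 * slot and return how many were released. A real kernel pass would also
 * have to lock and free the swap cache page; this model only clears the slot.
 */
size_t reclaim_cache_only(unsigned short *map, size_t n)
{
	size_t i, nr = 0;

	for (i = 0; i < n; i++) {
		if (swap_cache_only(map[i])) {
			map[i] = 0;
			nr++;
		}
	}
	return nr;
}
```

The point of keeping the flag in the map is that "cache-only" becomes a test on a single slot value, so the check needs no page lookup and no page lock.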
Thank you for all your help. Any comments are welcome.
Thanks,
-Kame
Thread overview: 24+ messages
2009-05-26 3:12 KAMEZAWA Hiroyuki [this message]
2009-05-26 3:14 ` [RFC][PATCH 1/5] change swap cache interfaces KAMEZAWA Hiroyuki
2009-05-26 3:15 ` [RFC][PATCH 2/5] add SWAP_HAS_CACHE flag to swap_map KAMEZAWA Hiroyuki
2009-05-27 4:02 ` Daisuke Nishimura
2009-05-27 4:36 ` KAMEZAWA Hiroyuki
2009-05-27 5:00 ` Daisuke Nishimura
2009-05-28 0:41 ` Daisuke Nishimura
2009-05-28 1:05 ` KAMEZAWA Hiroyuki
2009-05-28 1:40 ` Daisuke Nishimura
2009-05-28 1:44 ` KAMEZAWA Hiroyuki
2009-05-26 3:16 ` [RFC][PATCH 3/5] count cache-only swaps KAMEZAWA Hiroyuki
2009-05-26 17:37 ` Johannes Weiner
2009-05-26 23:49 ` KAMEZAWA Hiroyuki
2009-05-26 3:17 ` [RFC][PATCH 4/5] memcg: fix swap account KAMEZAWA Hiroyuki
2009-05-26 3:18 ` [RFC][PATCH 5/5] (experimental) chase and free cache only swap KAMEZAWA Hiroyuki
2009-05-26 18:14 ` Johannes Weiner
2009-05-27 0:08 ` KAMEZAWA Hiroyuki
2009-05-27 1:26 ` Johannes Weiner
2009-05-27 1:31 ` KAMEZAWA Hiroyuki
2009-05-27 2:06 ` Johannes Weiner
2009-05-27 5:14 ` KAMEZAWA Hiroyuki
2009-05-27 6:30 ` Daisuke Nishimura
2009-05-27 6:50 ` KAMEZAWA Hiroyuki
2009-05-27 6:43 ` [RFC][PATCH] memcg: fix swap account (26/May)[0/5] KAMEZAWA Hiroyuki