From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
"balbir@linux.vnet.ibm.com" <balbir@linux.vnet.ibm.com>,
"nishimura@mxp.nes.nec.co.jp" <nishimura@mxp.nes.nec.co.jp>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
mingo@elte.hu,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: [PATCH 3/3] fix stale swap cache at writeback.
Date: Tue, 12 May 2009 10:47:30 +0900
Message-ID: <20090512104730.78bf5ab0.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20090512104401.28edc0a8.kamezawa.hiroyu@jp.fujitsu.com>
From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
memcg: free unused swapcache on swapout path
Reclaiming anonymous memory in vmscan.c takes the following 2 steps.
1. add to swap and unmap.
2. pageout
But these 2 steps don't happen at once. There are many chances
to avoid pageout, and only _really_ unused pages are swapped out by the
visit-and-check-again logic of LRU rotation.
But this behavior causes trouble with memcg.
memcg cannot handle !PageCgroupUsed swapcache whose owner process
has already exited.
This patch handles such swap caches, which are created by a race like the one below:
Assume processA is exiting and its pte points to a page(!PageSwapCache),
and processB is trying to reclaim the page.
              processA                   |           processB
    -------------------------------------+-------------------------------------
      (page_remove_rmap())               |  (shrink_page_list())
         mem_cgroup_uncharge_page()      |
            ->uncharged because it's not |
              PageSwapCache yet.         |
              So, both mem/memsw.usage   |
              are decremented.           |
                                         |    add_to_swap() -> added to swap cache.
If this page goes through without being freed for some reason, it
doesn't go back to memcg's LRU because of !PageCgroupUsed.
Such swap cache cannot be freed by memcg's LRU scanning, and as a result
its swp_entry cannot be freed properly.
This patch adds a hook after add_to_swap() to check whether the page is still
mapped by a process, and frees it if it has already been unmapped.
If a page was already in swap cache when the owner process called
page_remove_rmap() -> mem_cgroup_uncharge_page(), the page is not uncharged.
It goes back to memcg's LRU even if it goes through shrink_page_list()
without being freed, so this patch ignores that case.
Changelog: from Nishimura's original one.
- moved functions to vmscan.c
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
---
Index: mmotm-2.6.30-May07/mm/vmscan.c
===================================================================
--- mmotm-2.6.30-May07.orig/mm/vmscan.c
+++ mmotm-2.6.30-May07/mm/vmscan.c
@@ -586,6 +586,32 @@ void putback_lru_page(struct page *page)
}
#endif /* CONFIG_UNEVICTABLE_LRU */
+#if defined(CONFIG_CGROUP_MEM_RES_CTLR) && defined(CONFIG_SWAP)
+
+static int memcg_free_unused_swapcache(struct page *page)
+{
+ VM_BUG_ON(!PageLocked(page));
+ VM_BUG_ON(!PageSwapCache(page));
+
+ if (mem_cgroup_disabled())
+ return 0;
+ /*
+ * Check whether the page is still accounted by memcg.
+ * page_mapped() is a sufficient check for avoiding the race.
+ */
+ if (!PageAnon(page) || page_mapped(page))
+ return 0;
+ return try_to_free_swap(page); /* checks page_swapcount */
+}
+
+#else
+
+static int memcg_free_unused_swapcache(struct page *page)
+{
+ return 0;
+}
+
+#endif
/*
* shrink_page_list() returns the number of reclaimed pages
@@ -663,6 +689,14 @@ static unsigned long shrink_page_list(st
goto keep_locked;
if (!add_to_swap(page))
goto activate_locked;
+ /*
+ * The owner process might have uncharged the page
+ * (by page_remove_rmap()) before it has been added
+ * to swap cache.
+ * Check it here to avoid making it stale.
+ */
+ if (memcg_free_unused_swapcache(page))
+ goto keep_locked;
may_enter_fs = 1;
}