From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"balbir@linux.vnet.ibm.com" <balbir@linux.vnet.ibm.com>
Subject: [PATCH v3] memcg: fix leak on wrong LRU with FUSE
Date: Wed, 9 Mar 2011 16:48:01 +0900 [thread overview]
Message-ID: <20110309164801.3a4c8d10.kamezawa.hiroyu@jp.fujitsu.com> (raw)
In-Reply-To: <20110309150750.d570798c.kamezawa.hiroyu@jp.fujitsu.com>
On Wed, 9 Mar 2011 15:07:50 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > } else {
> > /* shmem */
> > if (PageSwapCache(page)) {
> > ..
> > } else {
> > ..
> > }
> > }
> >
> > Otherwise, the page cache will be charged twice.
> >
>
> Ahh, thanks. I'll send v3.
>
Okay, this is a fixed one.
==
fs/fuse/dev.c::fuse_try_move_page() does
(1) remove a page by ->steal()
(2) re-add the page to page cache
(3) link the page to LRU if it was not on LRU at (1)
This implies the page is _on_ LRU when it's added to radix-tree.
So, the page is added to memory cgroup while it's on LRU.
because LRU is lazy and no one flushs it.
This is the same behavior as SwapCache and needs special care as
- remove page from LRU before overwrite pc->mem_cgroup.
- add page to LRU after overwrite pc->mem_cgroup.
And we need to taking care of pagevec.
If PageLRU(page) is set before we add PCG_USED bit, the page
will not be added to memcg's LRU (in short period).
So, regardlress of PageLRU(page) value before commit_charge(),
we need to check PageLRU(page) after commit_charge().
Changelog v2=>v3:
- fixed double accounting.
Changelog v1=>v2:
- clean up.
- cover !PageLRU() by pagevec case.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 54 +++++++++++++++++++++++++++++++++++-------------------
1 file changed, 35 insertions(+), 19 deletions(-)
Index: mmotm-0303/mm/memcontrol.c
===================================================================
--- mmotm-0303.orig/mm/memcontrol.c
+++ mmotm-0303/mm/memcontrol.c
@@ -926,13 +926,12 @@ void mem_cgroup_add_lru_list(struct page
}
/*
- * At handling SwapCache, pc->mem_cgroup may be changed while it's linked to
- * lru because the page may.be reused after it's fully uncharged (because of
- * SwapCache behavior).To handle that, unlink page_cgroup from LRU when charge
- * it again. This function is only used to charge SwapCache. It's done under
- * lock_page and expected that zone->lru_lock is never held.
+ * At handling SwapCache and other FUSE stuff, pc->mem_cgroup may be changed
+ * while it's linked to lru because the page may be reused after it's fully
+ * uncharged. To handle that, unlink page_cgroup from LRU when charge it again.
+ * It's done under lock_page and expected that zone->lru_lock isnever held.
*/
-static void mem_cgroup_lru_del_before_commit_swapcache(struct page *page)
+static void mem_cgroup_lru_del_before_commit(struct page *page)
{
unsigned long flags;
struct zone *zone = page_zone(page);
@@ -948,7 +947,7 @@ static void mem_cgroup_lru_del_before_co
spin_unlock_irqrestore(&zone->lru_lock, flags);
}
-static void mem_cgroup_lru_add_after_commit_swapcache(struct page *page)
+static void mem_cgroup_lru_add_after_commit(struct page *page)
{
unsigned long flags;
struct zone *zone = page_zone(page);
@@ -2431,9 +2430,28 @@ static void
__mem_cgroup_commit_charge_swapin(struct page *page, struct mem_cgroup *ptr,
enum charge_type ctype);
+static void
+__mem_cgroup_commit_charge_lrucare(struct page *page, struct mem_cgroup *mem,
+ enum charge_type ctype)
+{
+ struct page_cgroup *pc = lookup_page_cgroup(page);
+ /*
+ * In some case, SwapCache, FUSE(splice_buf->radixtree), the page
+ * is already on LRU. It means the page may on some other page_cgroup's
+ * LRU. Take care of it.
+ */
+ if (unlikely(PageLRU(page)))
+ mem_cgroup_lru_del_before_commit(page);
+ __mem_cgroup_commit_charge(mem, page, 1, pc, ctype);
+ if (unlikely(PageLRU(page)))
+ mem_cgroup_lru_add_after_commit(page);
+ return;
+}
+
int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
gfp_t gfp_mask)
{
+ struct mem_cgroup *mem = NULL;
int ret;
if (mem_cgroup_disabled())
@@ -2468,14 +2486,16 @@ int mem_cgroup_cache_charge(struct page
if (unlikely(!mm))
mm = &init_mm;
- if (page_is_file_cache(page))
- return mem_cgroup_charge_common(page, mm, gfp_mask,
- MEM_CGROUP_CHARGE_TYPE_CACHE);
-
+ if (page_is_file_cache(page)) {
+ ret = __mem_cgroup_try_charge(mm, gfp_mask, 1, &mem, true);
+ if (ret || !mem)
+ return ret;
+ __mem_cgroup_commit_charge_lrucare(page, mem,
+ MEM_CGROUP_CHARGE_TYPE_CACHE);
+ return ret;
+ }
/* shmem */
if (PageSwapCache(page)) {
- struct mem_cgroup *mem;
-
ret = mem_cgroup_try_charge_swapin(mm, page, gfp_mask, &mem);
if (!ret)
__mem_cgroup_commit_charge_swapin(page, mem,
@@ -2532,17 +2552,13 @@ static void
__mem_cgroup_commit_charge_swapin(struct page *page, struct mem_cgroup *ptr,
enum charge_type ctype)
{
- struct page_cgroup *pc;
-
if (mem_cgroup_disabled())
return;
if (!ptr)
return;
cgroup_exclude_rmdir(&ptr->css);
- pc = lookup_page_cgroup(page);
- mem_cgroup_lru_del_before_commit_swapcache(page);
- __mem_cgroup_commit_charge(ptr, page, 1, pc, ctype);
- mem_cgroup_lru_add_after_commit_swapcache(page);
+
+ __mem_cgroup_commit_charge_lrucare(page, ptr, ctype);
/*
* Now swap is on-memory. This means this page may be
* counted both as mem and swap....double count.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2011-03-09 7:54 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-03-08 4:56 [PATCH v2] " KAMEZAWA Hiroyuki
2011-03-08 9:18 ` Daisuke Nishimura
2011-03-09 6:07 ` KAMEZAWA Hiroyuki
2011-03-09 7:48 ` KAMEZAWA Hiroyuki [this message]
2011-03-09 10:00 ` [PATCH v3] " Johannes Weiner
2011-03-09 23:36 ` KAMEZAWA Hiroyuki
2011-03-10 5:47 ` [PATCH v4] " KAMEZAWA Hiroyuki
2011-03-10 6:04 ` Daisuke Nishimura
2011-03-10 22:20 ` Andrew Morton
2011-03-10 23:45 ` KAMEZAWA Hiroyuki
2011-03-11 0:01 ` Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110309164801.3a4c8d10.kamezawa.hiroyu@jp.fujitsu.com \
--to=kamezawa.hiroyu@jp.fujitsu.com \
--cc=akpm@linux-foundation.org \
--cc=balbir@linux.vnet.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nishimura@mxp.nes.nec.co.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox