From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"balbir@linux.vnet.ibm.com" <balbir@linux.vnet.ibm.com>,
"nishimura@mxp.nes.nec.co.jp" <nishimura@mxp.nes.nec.co.jp>,
hugh@veritas.com, taka@valinux.co.jp
Subject: [RFC][PATCH 2/5] memcg : handle swap cache
Date: Fri, 31 Oct 2008 11:54:11 +0900 [thread overview]
Message-ID: <20081031115411.25478878.kamezawa.hiroyu@jp.fujitsu.com> (raw)
In-Reply-To: <20081031115057.6da3dafd.kamezawa.hiroyu@jp.fujitsu.com>
SwapCache support for memory resource controller (memcg)
Before mem+swap controller, memcg itself should handle SwapCache in proper way.
In current memcg, SwapCache is just leaked and the user can create tons of
SwapCache. This is leak of account and should be handled.
SwapCache accounting is done as following.
charge (anon)
- charged when it's mapped.
(because of readahead, charge at add_to_swap_cache() is not sane)
uncharge (anon)
- uncharged when it's dropped from swapcache and fully unmapped.
means it's not uncharged at unmap.
Note: delete from swap cache at swap-in is done after rmap information
is established.
charge (shmem)
- charged at swap-in. this prevents charge at add_to_page_cache().
uncharge (shmem)
- uncharged when it's dropped from swapcache and not on shmem's
radix-tree.
at migartion, check against 'old page' is modified to handle shmem.
Compareing to the old version discussed (and caused troubles), we have
advanteges of
- PCG_USED bit.
- simple migraton handling.
So, situation is much easier than several months ago, maybe.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
Documentation/controllers/memory.txt | 5 ++
include/linux/swap.h | 16 ++++++++
mm/memcontrol.c | 66 +++++++++++++++++++++++++++++++----
mm/shmem.c | 18 ++++++++-
mm/swap_state.c | 1
5 files changed, 98 insertions(+), 8 deletions(-)
Index: mmotm-2.6.28-rc2+/mm/memcontrol.c
===================================================================
--- mmotm-2.6.28-rc2+.orig/mm/memcontrol.c
+++ mmotm-2.6.28-rc2+/mm/memcontrol.c
@@ -21,6 +21,7 @@
#include <linux/memcontrol.h>
#include <linux/cgroup.h>
#include <linux/mm.h>
+#include <linux/pagemap.h>
#include <linux/smp.h>
#include <linux/page-flags.h>
#include <linux/backing-dev.h>
@@ -140,6 +141,7 @@ enum charge_type {
MEM_CGROUP_CHARGE_TYPE_MAPPED,
MEM_CGROUP_CHARGE_TYPE_SHMEM, /* used by page migration of shmem */
MEM_CGROUP_CHARGE_TYPE_FORCE, /* used by force_empty */
+ MEM_CGROUP_CHARGE_TYPE_SWAPOUT, /* for accounting swapcache */
NR_CHARGE_TYPE,
};
@@ -781,6 +783,32 @@ int mem_cgroup_cache_charge(struct page
MEM_CGROUP_CHARGE_TYPE_SHMEM, NULL);
}
+#ifdef CONFIG_SWAP
+int mem_cgroup_cache_charge_swapin(struct page *page,
+ struct mm_struct *mm, gfp_t mask)
+{
+ int ret = 0;
+
+ if (mem_cgroup_subsys.disabled)
+ return 0;
+ if (unlikely(!mm))
+ mm = &init_mm;
+
+ ret = mem_cgroup_charge_common(page, mm, mask,
+ MEM_CGROUP_CHARGE_TYPE_SHMEM, NULL);
+ /*
+ * The page may be dropped from SwapCache because we don't have
+ * lock_page().This may cause charge-after-uncharge trouble.
+ * Fix it up here. (the caller have refcnt to this page and
+ * page itself is guaranteed not to be freed.)
+ */
+ if (ret && !PageSwapCache(page))
+ mem_cgroup_uncharge_swapcache(page);
+
+ return ret;
+}
+#endif
+
void mem_cgroup_commit_charge_swapin(struct page *page, struct mem_cgroup *ptr)
{
struct page_cgroup *pc;
@@ -818,6 +846,9 @@ __mem_cgroup_uncharge_common(struct page
if (mem_cgroup_subsys.disabled)
return;
+ if (PageSwapCache(page))
+ return;
+
/*
* Check if our page_cgroup is valid
*/
@@ -826,12 +857,26 @@ __mem_cgroup_uncharge_common(struct page
return;
lock_page_cgroup(pc);
- if ((ctype == MEM_CGROUP_CHARGE_TYPE_MAPPED && page_mapped(page))
- || !PageCgroupUsed(pc)) {
- /* This happens at race in zap_pte_range() and do_swap_page()*/
- unlock_page_cgroup(pc);
- return;
+
+ if (!PageCgroupUsed(pc))
+ goto unlock_out;
+
+ switch(ctype) {
+ case MEM_CGROUP_CHARGE_TYPE_MAPPED:
+ if (page_mapped(page))
+ goto unlock_out;
+ break;
+ case MEM_CGROUP_CHARGE_TYPE_SWAPOUT:
+ if (!PageAnon(page)) { /* Shared memory */
+ if (page->mapping && !page_is_file_cache(page))
+ goto unlock_out;
+ } else if (page_mapped(page)) /* Anon */
+ goto unlock_out;
+ break;
+ default:
+ break;
}
+
ClearPageCgroupUsed(pc);
mem = pc->mem_cgroup;
@@ -845,6 +890,10 @@ __mem_cgroup_uncharge_common(struct page
css_put(&mem->css);
return;
+
+unlock_out:
+ unlock_page_cgroup(pc);
+ return;
}
void mem_cgroup_uncharge_page(struct page *page)
@@ -864,6 +913,11 @@ void mem_cgroup_uncharge_cache_page(stru
__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_CACHE);
}
+void mem_cgroup_uncharge_swapcache(struct page *page)
+{
+ __mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_SWAPOUT);
+}
+
/*
* Before starting migration, account PAGE_SIZE to mem_cgroup that the old
* page belongs to.
@@ -921,7 +975,7 @@ void mem_cgroup_end_migration(struct mem
ctype = MEM_CGROUP_CHARGE_TYPE_SHMEM;
/* unused page is not on radix-tree now. */
- if (unused && ctype != MEM_CGROUP_CHARGE_TYPE_MAPPED)
+ if (unused)
__mem_cgroup_uncharge_common(unused, ctype);
pc = lookup_page_cgroup(target);
Index: mmotm-2.6.28-rc2+/mm/swap_state.c
===================================================================
--- mmotm-2.6.28-rc2+.orig/mm/swap_state.c
+++ mmotm-2.6.28-rc2+/mm/swap_state.c
@@ -119,6 +119,7 @@ void __delete_from_swap_cache(struct pag
total_swapcache_pages--;
__dec_zone_page_state(page, NR_FILE_PAGES);
INC_CACHE_INFO(del_total);
+ mem_cgroup_uncharge_swapcache(page);
}
/**
Index: mmotm-2.6.28-rc2+/include/linux/swap.h
===================================================================
--- mmotm-2.6.28-rc2+.orig/include/linux/swap.h
+++ mmotm-2.6.28-rc2+/include/linux/swap.h
@@ -332,6 +332,22 @@ static inline void disable_swap_token(vo
put_swap_token(swap_token_mm);
}
+#ifdef CONFIG_CGROUP_MEM_RES_CTLR
+extern int mem_cgroup_cache_charge_swapin(struct page *page,
+ struct mm_struct *mm, gfp_t mask);
+extern void mem_cgroup_uncharge_swapcache(struct page *page);
+#else
+static inline
+int mem_cgroup_cache_charge_swapin(struct page *page,
+ struct mm_struct *mm, gfp_t mask)
+{
+ return 0;
+}
+static inline void mem_cgroup_uncharge_swapcache(struct page *page)
+{
+}
+#endif
+
#else /* CONFIG_SWAP */
#define total_swap_pages 0
Index: mmotm-2.6.28-rc2+/mm/shmem.c
===================================================================
--- mmotm-2.6.28-rc2+.orig/mm/shmem.c
+++ mmotm-2.6.28-rc2+/mm/shmem.c
@@ -920,8 +920,12 @@ found:
error = 1;
if (!inode)
goto out;
- /* Charge page using GFP_HIGHUSER_MOVABLE while we can wait */
- error = mem_cgroup_cache_charge(page, current->mm, GFP_HIGHUSER_MOVABLE);
+ /*
+ * Charge page using GFP_HIGHUSER_MOVABLE while we can wait.
+ * charged back to the user(not to caller) when swap account is used.
+ */
+ error = mem_cgroup_cache_charge_swapin(page,
+ current->mm, GFP_HIGHUSER_MOVABLE);
if (error)
goto out;
error = radix_tree_preload(GFP_KERNEL);
@@ -1258,6 +1262,16 @@ repeat:
goto repeat;
}
wait_on_page_locked(swappage);
+ /*
+ * We want to avoid charge at add_to_page_cache().
+ * charge against this swap cache here.
+ */
+ if (mem_cgroup_cache_charge_swapin(swappage,
+ current->mm, gfp)) {
+ page_cache_release(swappage);
+ error = -ENOMEM;
+ goto failed;
+ }
page_cache_release(swappage);
goto repeat;
}
Index: mmotm-2.6.28-rc2+/Documentation/controllers/memory.txt
===================================================================
--- mmotm-2.6.28-rc2+.orig/Documentation/controllers/memory.txt
+++ mmotm-2.6.28-rc2+/Documentation/controllers/memory.txt
@@ -137,6 +137,11 @@ behind this approach is that a cgroup th
page will eventually get charged for it (once it is uncharged from
the cgroup that brought it in -- this will happen on memory pressure).
+Exception: When you do swapoff and make swapped-out pages of shmem(tmpfs) to
+be backed into memory in force, charges for pages are accounted against the
+caller of swapoff rather than the users of shmem.
+
+
2.4 Reclaim
Each cgroup maintains a per cgroup LRU that consists of an active
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2008-10-31 2:54 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-10-31 2:50 [RFC][PATCH 0/5] memcg : some patches related to swap KAMEZAWA Hiroyuki
2008-10-31 2:52 ` [RFC][PATCH 1/5] memcg : force_empty to do move account KAMEZAWA Hiroyuki
2008-10-31 9:25 ` Daisuke Nishimura
2008-11-04 6:15 ` Paul Menage
2008-11-04 6:17 ` KAMEZAWA Hiroyuki
2008-11-04 6:23 ` Paul Menage
2008-11-04 6:25 ` KAMEZAWA Hiroyuki
2008-10-31 2:54 ` KAMEZAWA Hiroyuki [this message]
2008-11-04 8:42 ` [RFC][PATCH 2/5] memcg : handle swap cache Daisuke Nishimura
2008-11-04 9:04 ` KAMEZAWA Hiroyuki
2008-11-04 10:28 ` Daisuke Nishimura
2008-11-05 0:07 ` KAMEZAWA Hiroyuki
2008-11-05 2:03 ` KAMEZAWA Hiroyuki
2008-10-31 2:55 ` [RFC][PATCH 3/5] memcg : mem+swap controller kconfig KAMEZAWA Hiroyuki
2008-11-04 8:54 ` Daisuke Nishimura
2008-11-04 9:13 ` KAMEZAWA Hiroyuki
2008-10-31 2:58 ` [RFC][PATCH 4/5] memcg : add swap_cgroup KAMEZAWA Hiroyuki
2008-10-31 2:59 ` [RFC][PATCH 5/5] memcg : mem+swap controller KAMEZAWA Hiroyuki
2008-11-02 6:43 ` [RFC][PATCH 6/5] memcg : synchronous LRU KAMEZAWA Hiroyuki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20081031115411.25478878.kamezawa.hiroyu@jp.fujitsu.com \
--to=kamezawa.hiroyu@jp.fujitsu.com \
--cc=balbir@linux.vnet.ibm.com \
--cc=hugh@veritas.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nishimura@mxp.nes.nec.co.jp \
--cc=taka@valinux.co.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox