* [PATCH mmotm] vmscan: fix may_swap handling for memcg
@ 2009-06-08 3:02 Daisuke Nishimura
2009-06-08 3:20 ` KOSAKI Motohiro
0 siblings, 1 reply; 14+ messages in thread
From: Daisuke Nishimura @ 2009-06-08 3:02 UTC
To: LKML, linux-mm
Cc: Andrew Morton, Johannes Weiner, Balbir Singh, KAMEZAWA Hiroyuki,
KOSAKI Motohiro, Daisuke Nishimura
From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>

Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
sc->may_swap") added the may_swap flag and handles it in get_scan_ratio().

But the result of get_scan_ratio() is ignored when priority == 0, which
means that when a memcg hits its mem+swap limit, anon pages can be swapped
out in vain. Especially when a memcg causes OOM by hitting the mem+swap limit,
we can see a great many pages being swapped out.

Instead of skipping the anon LRU scan entirely when priority == 0, this patch
adds a hook to shrink_page_list() that handles the may_swap flag to avoid
useless swap-out, and calls try_to_free_swap() when needed, because that can
reduce both mem.usage and memsw.usage if the page (SwapCache) is no longer used.

Such unused-but-managed-under-memcg SwapCache can be created on some paths,
for example on trylock_page() failure in free_swap_cache().

Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
---
 mm/vmscan.c | 19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2ddcfc8..d9a3f54 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -640,6 +640,25 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 					referenced && page_mapping_inuse(page))
 			goto activate_locked;
 
+		if (!sc->may_swap && PageSwapBacked(page)) {
+			/* SwapCache already uses a swap entry */
+			if (!PageSwapCache(page))
+				goto keep_locked;
+			/*
+			 * From the viewpoint of memcg, may_swap is false when
+			 * memsw.usage hits the limit.
+			 * But swapping out SwapCache to disk doesn't reduce
+			 * memsw.usage, so it is a waste of time.
+			 * Call try_to_free_swap() if the page isn't used,
+			 * because it can reduce both mem.usage and memsw.usage.
+			 */
+			if (!scanning_global_lru(sc)) {
+				if (!page_mapped(page))
+					try_to_free_swap(page);
+				goto keep_locked;
+			}
+		}
+
 		/*
 		 * Anonymous process memory has backing store?
 		 * Try to allocate it some swap space here.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
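
The decision made by the hook in the patch above can be modelled outside the kernel. What follows is only a user-space sketch, not kernel code: struct page_model, enum hook_action and may_swap_hook() are hypothetical names introduced for illustration, and the three booleans stand in for PageSwapBacked(), PageSwapCache() and page_mapped().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the page state the hook tests. */
struct page_model {
	bool swap_backed;	/* models PageSwapBacked(page) */
	bool in_swap_cache;	/* models PageSwapCache(page) */
	bool mapped;		/* models page_mapped(page) */
};

enum hook_action {
	HOOK_CONTINUE,		/* fall through to the normal reclaim path */
	HOOK_KEEP,		/* goto keep_locked without touching swap */
	HOOK_FREE_SWAP_KEEP,	/* try_to_free_swap(), then goto keep_locked */
};

static enum hook_action may_swap_hook(bool may_swap, bool global_lru,
				      const struct page_model *page)
{
	/* The hook only applies to swap-backed pages when swap is forbidden. */
	if (may_swap || !page->swap_backed)
		return HOOK_CONTINUE;
	/* No swap entry yet: allocating one would be wasted work. */
	if (!page->in_swap_cache)
		return HOOK_KEEP;
	/* Global reclaim still takes the normal path for SwapCache pages. */
	if (global_lru)
		return HOOK_CONTINUE;
	/* memcg reclaim: only an unmapped page may drop its swap entry. */
	return page->mapped ? HOOK_KEEP : HOOK_FREE_SWAP_KEEP;
}
```

Under these assumptions, an anon page with no swap entry is always kept when may_swap is false, and an unmapped SwapCache page under memcg reclaim is the only case that reaches try_to_free_swap().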

* Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg
2009-06-08 3:02 [PATCH mmotm] vmscan: fix may_swap handling for memcg Daisuke Nishimura
@ 2009-06-08 3:20 ` KOSAKI Motohiro
2009-06-08 6:39 ` Daisuke Nishimura
0 siblings, 1 reply; 14+ messages in thread
From: KOSAKI Motohiro @ 2009-06-08 3:20 UTC
To: Daisuke Nishimura
Cc: kosaki.motohiro, LKML, linux-mm, Andrew Morton, Johannes Weiner,
Balbir Singh, KAMEZAWA Hiroyuki

Hi

> From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
>
> Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
> sc->may_swap") added the may_swap flag and handles it in get_scan_ratio().
>
> But the result of get_scan_ratio() is ignored when priority == 0, which
> means that when a memcg hits its mem+swap limit, anon pages can be swapped
> out in vain. Especially when a memcg causes OOM by hitting the mem+swap limit,
> we can see a great many pages being swapped out.
>
> Instead of skipping the anon LRU scan entirely when priority == 0, this patch
> adds a hook to shrink_page_list() that handles the may_swap flag to avoid
> useless swap-out, and calls try_to_free_swap() when needed, because that can
> reduce both mem.usage and memsw.usage if the page (SwapCache) is no longer used.
>
> Such unused-but-managed-under-memcg SwapCache can be created on some paths,
> for example on trylock_page() failure in free_swap_cache().
>
> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>

I think the root cause is the following branch, right?
If so, why can't we handle this issue in shrink_zone()?

---------------------------------------------------------------
static void shrink_zone(int priority, struct zone *zone,
			struct scan_control *sc)
{
	get_scan_ratio(zone, sc, percent);

	for_each_evictable_lru(l) {
		int file = is_file_lru(l);
		unsigned long scan;

		scan = zone_nr_pages(zone, sc, l);
		if (priority) {			// !!here!!
			scan >>= priority;
			scan = (scan * percent[file]) / 100;
		}
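
The effect of the marked branch can be reproduced with a small user-space model. This is a sketch under stated assumptions, not the kernel function: scan_target() is a hypothetical name, lru_pages stands in for the value returned by zone_nr_pages(), and percent stands in for percent[file] as filled in by get_scan_ratio().

```c
/*
 * Models the per-LRU scan target from the shrink_zone() excerpt above.
 * scan_target() is a hypothetical helper, not a kernel function.
 */
static unsigned long scan_target(unsigned long lru_pages, int priority,
				 unsigned long percent)
{
	unsigned long scan = lru_pages;

	if (priority) {
		scan >>= priority;
		scan = (scan * percent) / 100;
	}
	/* priority == 0: the ratio is skipped and the whole list is scanned. */
	return scan;
}
```

With percent == 0 for the anon LRU (the value get_scan_ratio() sets when may_swap is false), scan_target(10000, 2, 0) yields 0, but scan_target(10000, 0, 0) yields the full 10000 pages, which is the behaviour being discussed.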

* Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg
2009-06-08 3:20 ` KOSAKI Motohiro
@ 2009-06-08 6:39 ` Daisuke Nishimura
2009-06-08 6:53 ` KOSAKI Motohiro
0 siblings, 1 reply; 14+ messages in thread
From: Daisuke Nishimura @ 2009-06-08 6:39 UTC
To: KOSAKI Motohiro
Cc: LKML, linux-mm, Andrew Morton, Johannes Weiner, Balbir Singh,
KAMEZAWA Hiroyuki, Daisuke Nishimura

On Mon, 8 Jun 2009 12:20:54 +0900 (JST), KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> Hi
>
Hi, thank you for your comment.

> > From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> >
> > Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
> > sc->may_swap") added the may_swap flag and handles it in get_scan_ratio().
> >
> > But the result of get_scan_ratio() is ignored when priority == 0, which
> > means that when a memcg hits its mem+swap limit, anon pages can be swapped
> > out in vain. Especially when a memcg causes OOM by hitting the mem+swap limit,
> > we can see a great many pages being swapped out.
> >
> > Instead of skipping the anon LRU scan entirely when priority == 0, this patch
> > adds a hook to shrink_page_list() that handles the may_swap flag to avoid
> > useless swap-out, and calls try_to_free_swap() when needed, because that can
> > reduce both mem.usage and memsw.usage if the page (SwapCache) is no longer used.
> >
> > Such unused-but-managed-under-memcg SwapCache can be created on some paths,
> > for example on trylock_page() failure in free_swap_cache().
> >
> > Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
>
> I think the root cause is the following branch, right?
Yes.

> If so, why can't we handle this issue in shrink_zone()?
>
Just because priority == 0 means OOM is about to happen, and I don't
want to see OOM if possible.
So I thought it would be better to reclaim as many pages (memsw.usage) as possible
in this case.

>
> ---------------------------------------------------------------
> static void shrink_zone(int priority, struct zone *zone,
> 			struct scan_control *sc)
> {
> 	get_scan_ratio(zone, sc, percent);
>
> 	for_each_evictable_lru(l) {
> 		int file = is_file_lru(l);
> 		unsigned long scan;
>
> 		scan = zone_nr_pages(zone, sc, l);
> 		if (priority) {			// !!here!!
> 			scan >>= priority;
> 			scan = (scan * percent[file]) / 100;
> 		}

* Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg
2009-06-08 6:39 ` Daisuke Nishimura
@ 2009-06-08 6:53 ` KOSAKI Motohiro
2009-06-08 7:54 ` Daisuke Nishimura
0 siblings, 1 reply; 14+ messages in thread
From: KOSAKI Motohiro @ 2009-06-08 6:53 UTC
To: Daisuke Nishimura
Cc: kosaki.motohiro, LKML, linux-mm, Andrew Morton, Johannes Weiner,
Balbir Singh, KAMEZAWA Hiroyuki

> On Mon, 8 Jun 2009 12:20:54 +0900 (JST), KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> > Hi
> >
> Hi, thank you for your comment.
>
> > > From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > >
> > > Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
> > > sc->may_swap") added the may_swap flag and handles it in get_scan_ratio().
> > >
> > > But the result of get_scan_ratio() is ignored when priority == 0, which
> > > means that when a memcg hits its mem+swap limit, anon pages can be swapped
> > > out in vain. Especially when a memcg causes OOM by hitting the mem+swap limit,
> > > we can see a great many pages being swapped out.
> > >
> > > Instead of skipping the anon LRU scan entirely when priority == 0, this patch
> > > adds a hook to shrink_page_list() that handles the may_swap flag to avoid
> > > useless swap-out, and calls try_to_free_swap() when needed, because that can
> > > reduce both mem.usage and memsw.usage if the page (SwapCache) is no longer used.
> > >
> > > Such unused-but-managed-under-memcg SwapCache can be created on some paths,
> > > for example on trylock_page() failure in free_swap_cache().
> > >
> > > Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> >
> > I think the root cause is the following branch, right?
> Yes.
>
> > If so, why can't we handle this issue in shrink_zone()?
> >
> Just because priority == 0 means OOM is about to happen, and I don't
> want to see OOM if possible.
> So I thought it would be better to reclaim as many pages (memsw.usage) as possible
> in this case.

Hmm.

In general, adding a new branch to shrink_page_list() is not a good idea.
It can cause a performance regression.

Plus, it is not a big problem at all. It happens only when priority == 0.
Priority == 0 definitely doesn't occur normally.
Also, reclaiming too many pages is not only a memcg issue. I don't think this
patch provides a generic solution.

Why does your test environment hit OOM so frequently?

* Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg
2009-06-08 6:53 ` KOSAKI Motohiro
@ 2009-06-08 7:54 ` Daisuke Nishimura
2009-06-09 7:13 ` [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg) Daisuke Nishimura
0 siblings, 1 reply; 14+ messages in thread
From: Daisuke Nishimura @ 2009-06-08 7:54 UTC
To: KOSAKI Motohiro
Cc: LKML, linux-mm, Andrew Morton, Johannes Weiner, Balbir Singh,
KAMEZAWA Hiroyuki, Daisuke Nishimura

On Mon, 8 Jun 2009 15:53:50 +0900 (JST), KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> > On Mon, 8 Jun 2009 12:20:54 +0900 (JST), KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> > > Hi
> > >
> > Hi, thank you for your comment.
> >
> > > > From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > > >
> > > > Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
> > > > sc->may_swap") added the may_swap flag and handles it in get_scan_ratio().
> > > >
> > > > But the result of get_scan_ratio() is ignored when priority == 0, which
> > > > means that when a memcg hits its mem+swap limit, anon pages can be swapped
> > > > out in vain. Especially when a memcg causes OOM by hitting the mem+swap limit,
> > > > we can see a great many pages being swapped out.
> > > >
> > > > Instead of skipping the anon LRU scan entirely when priority == 0, this patch
> > > > adds a hook to shrink_page_list() that handles the may_swap flag to avoid
> > > > useless swap-out, and calls try_to_free_swap() when needed, because that can
> > > > reduce both mem.usage and memsw.usage if the page (SwapCache) is no longer used.
> > > >
> > > > Such unused-but-managed-under-memcg SwapCache can be created on some paths,
> > > > for example on trylock_page() failure in free_swap_cache().
> > > >
> > > > Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > >
> > > I think the root cause is the following branch, right?
> > Yes.
> >
> > > If so, why can't we handle this issue in shrink_zone()?
> > >
> > Just because priority == 0 means OOM is about to happen, and I don't
> > want to see OOM if possible.
> > So I thought it would be better to reclaim as many pages (memsw.usage) as possible
> > in this case.
>
> Hmm.
>
> In general, adding a new branch to shrink_page_list() is not a good idea.
> It can cause a performance regression.
>
> Plus, it is not a big problem at all. It happens only when priority == 0.
> Priority == 0 definitely doesn't occur normally.
But it happens under high memory pressure...

> Also, reclaiming too many pages is not only a memcg issue. I don't think this
> patch provides a generic solution.
>
Ah, you're right. It's not only a memcg issue.

>
> Why does your test environment hit OOM so frequently?
>
Not so frequently :)

But I can see almost all pages being swapped out when a memcg causes OOM by
hitting memsw.limit (it's a waste of CPU time).
And even after Kamezawa-san's memcg-fix-behavior-under-memorylimit-equals-to-memswlimit.patch,
I can sometimes see swap usage when mem.limit == memsw.limit (it's a waste of CPU time too).

Thanks,
Daisuke Nishimura.

* [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)
2009-06-08 7:54 ` Daisuke Nishimura
@ 2009-06-09 7:13 ` Daisuke Nishimura
2009-06-09 7:20 ` KOSAKI Motohiro
2009-06-09 7:28 ` KAMEZAWA Hiroyuki
0 siblings, 2 replies; 14+ messages in thread
From: Daisuke Nishimura @ 2009-06-09 7:13 UTC
To: KOSAKI Motohiro
Cc: LKML, linux-mm, Andrew Morton, Johannes Weiner, Balbir Singh,
KAMEZAWA Hiroyuki, Daisuke Nishimura

> > Also, reclaiming too many pages is not only a memcg issue. I don't think this
> > patch provides a generic solution.
> >
> Ah, you're right. It's not only a memcg issue.
>
How about this one?

===
From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>

Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
sc->may_swap") added the may_swap flag and handles it in get_scan_ratio().

But the result of get_scan_ratio() is ignored when priority == 0,
so the anon LRU is scanned even if may_swap == 0 or nr_swap_pages == 0.
IMHO, this is not the expected behavior.

As for memcg especially, because of this behavior a great many pages are
swapped out in vain when OOM is invoked by the mem+swap limit.

This patch handles the may_swap flag more strictly.

Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
---
 mm/vmscan.c | 18 +++++++++---------
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2ddcfc8..bacb092 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1407,13 +1407,6 @@ static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
 	unsigned long ap, fp;
 	struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
 
-	/* If we have no swap space, do not bother scanning anon pages. */
-	if (!sc->may_swap || (nr_swap_pages <= 0)) {
-		percent[0] = 0;
-		percent[1] = 100;
-		return;
-	}
-
 	anon = zone_nr_pages(zone, sc, LRU_ACTIVE_ANON) +
 		zone_nr_pages(zone, sc, LRU_INACTIVE_ANON);
 	file = zone_nr_pages(zone, sc, LRU_ACTIVE_FILE) +
@@ -1511,15 +1504,22 @@ static void shrink_zone(int priority, struct zone *zone,
 	enum lru_list l;
 	unsigned long nr_reclaimed = sc->nr_reclaimed;
 	unsigned long swap_cluster_max = sc->swap_cluster_max;
+	int noswap = 0;
 
-	get_scan_ratio(zone, sc, percent);
+	/* If we have no swap space, do not bother scanning anon pages. */
+	if (!sc->may_swap || (nr_swap_pages <= 0)) {
+		noswap = 1;
+		percent[0] = 0;
+		percent[1] = 100;
+	} else
+		get_scan_ratio(zone, sc, percent);
 
 	for_each_evictable_lru(l) {
 		int file = is_file_lru(l);
 		unsigned long scan;
 
 		scan = zone_nr_pages(zone, sc, l);
-		if (priority) {
+		if (priority || noswap) {
 			scan >>= priority;
 			scan = (scan * percent[file]) / 100;
 		}
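
The patched control flow can be exercised in user space to confirm that the anon LRU is skipped at priority == 0 once swap is unavailable. The following is a sketch only: fill_noswap_percent() and scan_target_patched() are hypothetical names, and the real get_scan_ratio() call in the else branch is left to the caller rather than modelled.

```c
/*
 * Models the noswap check that the patch moves into shrink_zone().
 * Returns 1 and forces anon:0% / file:100% when swap must not be used;
 * returns 0 and leaves percent[] untouched otherwise (in the kernel,
 * get_scan_ratio() would fill it in that case).
 */
static int fill_noswap_percent(int may_swap, long nr_swap_pages,
			       unsigned long percent[2])
{
	if (!may_swap || nr_swap_pages <= 0) {
		percent[0] = 0;		/* anon */
		percent[1] = 100;	/* file */
		return 1;
	}
	return 0;
}

/* Models the patched scan target: the ratio also applies when noswap is set. */
static unsigned long scan_target_patched(unsigned long lru_pages, int priority,
					 unsigned long percent, int noswap)
{
	unsigned long scan = lru_pages;

	if (priority || noswap) {
		scan >>= priority;
		scan = (scan * percent) / 100;
	}
	return scan;
}
```

In this model, with may_swap == 0 and priority == 0, the anon target becomes 0 while the file target stays at the full list size; when swap is allowed, priority == 0 still scans every list in full, as before the patch.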

* Re: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)
2009-06-09 7:13 ` [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg) Daisuke Nishimura
@ 2009-06-09 7:20 ` KOSAKI Motohiro
2009-06-09 7:48 ` Minchan Kim
2009-06-09 7:28 ` KAMEZAWA Hiroyuki
1 sibling, 1 reply; 14+ messages in thread
From: KOSAKI Motohiro @ 2009-06-09 7:20 UTC
To: Daisuke Nishimura
Cc: kosaki.motohiro, LKML, linux-mm, Andrew Morton, Johannes Weiner,
Balbir Singh, KAMEZAWA Hiroyuki

> > > Also, reclaiming too many pages is not only a memcg issue. I don't think this
> > > patch provides a generic solution.
> > >
> > Ah, you're right. It's not only a memcg issue.
> >
> How about this one?
>
> ===
> From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
>
> Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
> sc->may_swap") added the may_swap flag and handles it in get_scan_ratio().
>
> But the result of get_scan_ratio() is ignored when priority == 0,
> so the anon LRU is scanned even if may_swap == 0 or nr_swap_pages == 0.
> IMHO, this is not the expected behavior.
>
> As for memcg especially, because of this behavior a great many pages are
> swapped out in vain when OOM is invoked by the mem+swap limit.
>
> This patch handles the may_swap flag more strictly.
>
> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>

Looks great.
Your patch doesn't only improve memcg, but also improves no-swap systems.

Thanks.
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

> ---
>  mm/vmscan.c | 18 +++++++++---------
>  1 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 2ddcfc8..bacb092 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1407,13 +1407,6 @@ static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
>  	unsigned long ap, fp;
>  	struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
>
> -	/* If we have no swap space, do not bother scanning anon pages. */
> -	if (!sc->may_swap || (nr_swap_pages <= 0)) {
> -		percent[0] = 0;
> -		percent[1] = 100;
> -		return;
> -	}
> -
>  	anon = zone_nr_pages(zone, sc, LRU_ACTIVE_ANON) +
>  		zone_nr_pages(zone, sc, LRU_INACTIVE_ANON);
>  	file = zone_nr_pages(zone, sc, LRU_ACTIVE_FILE) +
> @@ -1511,15 +1504,22 @@ static void shrink_zone(int priority, struct zone *zone,
>  	enum lru_list l;
>  	unsigned long nr_reclaimed = sc->nr_reclaimed;
>  	unsigned long swap_cluster_max = sc->swap_cluster_max;
> +	int noswap = 0;
>
> -	get_scan_ratio(zone, sc, percent);
> +	/* If we have no swap space, do not bother scanning anon pages. */
> +	if (!sc->may_swap || (nr_swap_pages <= 0)) {
> +		noswap = 1;
> +		percent[0] = 0;
> +		percent[1] = 100;
> +	} else
> +		get_scan_ratio(zone, sc, percent);
>
>  	for_each_evictable_lru(l) {
>  		int file = is_file_lru(l);
>  		unsigned long scan;
>
>  		scan = zone_nr_pages(zone, sc, l);
> -		if (priority) {
> +		if (priority || noswap) {
>  			scan >>= priority;
>  			scan = (scan * percent[file]) / 100;
>  		}

* Re: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)
2009-06-09 7:20 ` KOSAKI Motohiro
@ 2009-06-09 7:48 ` Minchan Kim
2009-06-09 7:58 ` KOSAKI Motohiro
0 siblings, 1 reply; 14+ messages in thread
From: Minchan Kim @ 2009-06-09 7:48 UTC
To: KOSAKI Motohiro
Cc: Daisuke Nishimura, LKML, linux-mm, Andrew Morton, Johannes Weiner,
Balbir Singh, KAMEZAWA Hiroyuki

Hi, KOSAKI.

As you know, this problem is caused by the if (priority) condition in shrink_zone().
Let me ask a question.

Why do we have to skip the scan value calculation when the priority is zero?
As far as I know, we didn't do that before split-lru.

Is there any specific issue when the priority is zero?

On Tue, Jun 9, 2009 at 4:20 PM, KOSAKI Motohiro<kosaki.motohiro@jp.fujitsu.com> wrote:
>> > > Also, reclaiming too many pages is not only a memcg issue. I don't think this
>> > > patch provides a generic solution.
>> > >
>> > Ah, you're right. It's not only a memcg issue.
>> >
>> How about this one?
>>
>> ===
>> From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
>>
>> Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
>> sc->may_swap") added the may_swap flag and handles it in get_scan_ratio().
>>
>> But the result of get_scan_ratio() is ignored when priority == 0,
>> so the anon LRU is scanned even if may_swap == 0 or nr_swap_pages == 0.
>> IMHO, this is not the expected behavior.
>>
>> As for memcg especially, because of this behavior a great many pages are
>> swapped out in vain when OOM is invoked by the mem+swap limit.
>>
>> This patch handles the may_swap flag more strictly.
>>
>> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
>
> Looks great.
> Your patch doesn't only improve memcg, but also improves no-swap systems.
>
> Thanks.
> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>
>> ---
>>  mm/vmscan.c | 18 +++++++++---------
>>  1 files changed, 9 insertions(+), 9 deletions(-)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 2ddcfc8..bacb092 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1407,13 +1407,6 @@ static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
>>  	unsigned long ap, fp;
>>  	struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
>>
>> -	/* If we have no swap space, do not bother scanning anon pages. */
>> -	if (!sc->may_swap || (nr_swap_pages <= 0)) {
>> -		percent[0] = 0;
>> -		percent[1] = 100;
>> -		return;
>> -	}
>> -
>>  	anon = zone_nr_pages(zone, sc, LRU_ACTIVE_ANON) +
>>  		zone_nr_pages(zone, sc, LRU_INACTIVE_ANON);
>>  	file = zone_nr_pages(zone, sc, LRU_ACTIVE_FILE) +
>> @@ -1511,15 +1504,22 @@ static void shrink_zone(int priority, struct zone *zone,
>>  	enum lru_list l;
>>  	unsigned long nr_reclaimed = sc->nr_reclaimed;
>>  	unsigned long swap_cluster_max = sc->swap_cluster_max;
>> +	int noswap = 0;
>>
>> -	get_scan_ratio(zone, sc, percent);
>> +	/* If we have no swap space, do not bother scanning anon pages. */
>> +	if (!sc->may_swap || (nr_swap_pages <= 0)) {
>> +		noswap = 1;
>> +		percent[0] = 0;
>> +		percent[1] = 100;
>> +	} else
>> +		get_scan_ratio(zone, sc, percent);
>>
>>  	for_each_evictable_lru(l) {
>>  		int file = is_file_lru(l);
>>  		unsigned long scan;
>>
>>  		scan = zone_nr_pages(zone, sc, l);
>> -		if (priority) {
>> +		if (priority || noswap) {
>>  			scan >>= priority;
>>  			scan = (scan * percent[file]) / 100;
>>  		}

--
Kind regards,
Minchan Kim

* Re: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)
2009-06-09 7:48 ` Minchan Kim
@ 2009-06-09 7:58 ` KOSAKI Motohiro
2009-06-09 8:19 ` Minchan Kim
0 siblings, 1 reply; 14+ messages in thread
From: KOSAKI Motohiro @ 2009-06-09 7:58 UTC
To: Minchan Kim
Cc: kosaki.motohiro, Daisuke Nishimura, LKML, linux-mm, Andrew Morton,
Johannes Weiner, Balbir Singh, KAMEZAWA Hiroyuki

> Hi, KOSAKI.
>
> As you know, this problem is caused by the if (priority) condition in shrink_zone().
> Let me ask a question.
>
> Why do we have to skip the scan value calculation when the priority is zero?
> As far as I know, we didn't do that before split-lru.
>
> Is there any specific issue when the priority is zero?

Yes.

Example:

get_scan_ratio() returns anon: 80%, file: 20%, and the system has
10000 anon pages and 10000 file pages.

shrink_zone() picks up 8000 anon pages and 2000 file pages.
That means 8000 file pages aren't scanned at all.

Oops, that can invoke the OOM killer even though the system has droppable file cache.
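
The numbers in this example can be checked with a few lines of C. This is only a sketch of the arithmetic as stated in the message, assuming the ratio were applied at priority == 0 (so no shift takes place); struct scan_split and split_by_ratio() are hypothetical names.

```c
/* Result of applying an anon/file ratio to two equally sized LRU sets. */
struct scan_split {
	unsigned long anon_scanned;
	unsigned long file_scanned;
	unsigned long file_skipped;	/* file pages never looked at */
};

/* anon_percent is the anon share; the file share is the remainder. */
static struct scan_split split_by_ratio(unsigned long anon_pages,
					unsigned long file_pages,
					unsigned long anon_percent)
{
	struct scan_split s;

	s.anon_scanned = (anon_pages * anon_percent) / 100;
	s.file_scanned = (file_pages * (100 - anon_percent)) / 100;
	s.file_skipped = file_pages - s.file_scanned;
	return s;
}
```

Calling split_by_ratio(10000, 10000, 80) gives 8000 anon pages scanned, 2000 file pages scanned and 8000 file pages skipped, matching the figures above.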

* Re: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)
2009-06-09 7:58 ` KOSAKI Motohiro
@ 2009-06-09 8:19 ` Minchan Kim
2009-06-09 8:24 ` KOSAKI Motohiro
0 siblings, 1 reply; 14+ messages in thread
From: Minchan Kim @ 2009-06-09 8:19 UTC
To: KOSAKI Motohiro
Cc: Daisuke Nishimura, LKML, linux-mm, Andrew Morton, Johannes Weiner,
Balbir Singh, KAMEZAWA Hiroyuki

On Tue, Jun 9, 2009 at 4:58 PM, KOSAKI Motohiro<kosaki.motohiro@jp.fujitsu.com> wrote:
>> Hi, KOSAKI.
>>
>> As you know, this problem is caused by the if (priority) condition in shrink_zone().
>> Let me ask a question.
>>
>> Why do we have to skip the scan value calculation when the priority is zero?
>> As far as I know, we didn't do that before split-lru.
>>
>> Is there any specific issue when the priority is zero?
>
> Yes.
>
> Example:
>
> get_scan_ratio() returns anon: 80%, file: 20%, and the system has
> 10000 anon pages and 10000 file pages.
>
> shrink_zone() picks up 8000 anon pages and 2000 file pages.
> That means 8000 file pages aren't scanned at all.
>
> Oops, that can invoke the OOM killer even though the system has droppable file cache.
>
Hmm. Can that problem happen in a real system?
A big file ratio means that the file LRU list has been scanned a lot but
rotated little.
It means the file LRU has few reclaimable pages.

Isn't that so? I am confused.
Could you elaborate, if you don't mind?

--
Kind regards,
Minchan Kim

* Re: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)
2009-06-09 8:19 ` Minchan Kim
@ 2009-06-09 8:24 ` KOSAKI Motohiro
2009-06-09 8:35 ` Minchan Kim
0 siblings, 1 reply; 14+ messages in thread
From: KOSAKI Motohiro @ 2009-06-09 8:24 UTC
To: Minchan Kim
Cc: kosaki.motohiro, Daisuke Nishimura, LKML, linux-mm, Andrew Morton,
Johannes Weiner, Balbir Singh, KAMEZAWA Hiroyuki

> On Tue, Jun 9, 2009 at 4:58 PM, KOSAKI
> Motohiro<kosaki.motohiro@jp.fujitsu.com> wrote:
> >> Hi, KOSAKI.
> >>
> >> As you know, this problem is caused by the if (priority) condition in shrink_zone().
> >> Let me ask a question.
> >>
> >> Why do we have to skip the scan value calculation when the priority is zero?
> >> As far as I know, we didn't do that before split-lru.
> >>
> >> Is there any specific issue when the priority is zero?
> >
> > Yes.
> >
> > Example:
> >
> > get_scan_ratio() returns anon: 80%, file: 20%, and the system has
> > 10000 anon pages and 10000 file pages.
> >
> > shrink_zone() picks up 8000 anon pages and 2000 file pages.
> > That means 8000 file pages aren't scanned at all.
> >
> > Oops, that can invoke the OOM killer even though the system has droppable file cache.
> >
> Hmm. Can that problem happen in a real system?
> A big file ratio means that the file LRU list has been scanned a lot but
> rotated little.
> It means the file LRU has few reclaimable pages.
>
> Isn't that so? I am confused.
> Could you elaborate, if you don't mind?

Hm, OK, my example was wrong.
My intention is: if there are droppable file-backed pages (even only 1 page),
the OOM killer shouldn't be invoked.

Many or few is unrelated.

* Re: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)
2009-06-09 8:24 ` KOSAKI Motohiro
@ 2009-06-09 8:35 ` Minchan Kim
2009-06-09 8:37 ` KOSAKI Motohiro
0 siblings, 1 reply; 14+ messages in thread
From: Minchan Kim @ 2009-06-09 8:35 UTC
To: KOSAKI Motohiro
Cc: Daisuke Nishimura, LKML, linux-mm, Andrew Morton, Johannes Weiner,
Balbir Singh, KAMEZAWA Hiroyuki, Rik van Riel

On Tue, Jun 9, 2009 at 5:24 PM, KOSAKI Motohiro<kosaki.motohiro@jp.fujitsu.com> wrote:
>> On Tue, Jun 9, 2009 at 4:58 PM, KOSAKI
>> Motohiro<kosaki.motohiro@jp.fujitsu.com> wrote:
>> >> Hi, KOSAKI.
>> >>
>> >> As you know, this problem is caused by the if (priority) condition in shrink_zone().
>> >> Let me ask a question.
>> >>
>> >> Why do we have to skip the scan value calculation when the priority is zero?
>> >> As far as I know, we didn't do that before split-lru.
>> >>
>> >> Is there any specific issue when the priority is zero?
>> >
>> > Yes.
>> >
>> > Example:
>> >
>> > get_scan_ratio() returns anon: 80%, file: 20%, and the system has
>> > 10000 anon pages and 10000 file pages.
>> >
>> > shrink_zone() picks up 8000 anon pages and 2000 file pages.
>> > That means 8000 file pages aren't scanned at all.
>> >
>> > Oops, that can invoke the OOM killer even though the system has droppable file cache.
>> >
>> Hmm. Can that problem happen in a real system?
>> A big file ratio means that the file LRU list has been scanned a lot but
>> rotated little.
>> It means the file LRU has few reclaimable pages.
>>
>> Isn't that so? I am confused.
>> Could you elaborate, if you don't mind?
>
> Hm, OK, my example was wrong.
> My intention is: if there are droppable file-backed pages (even only 1 page),
> the OOM killer shouldn't be invoked.
>
> Many or few is unrelated.
>

I am not sure that is effective.
Have you ever hit this problem in a real situation?

BTW, I have to dive into the code. :)
Thanks for spending your valuable time commenting.

--
Kind regards,
Minchan Kim
* Re: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)
2009-06-09 8:35 ` Minchan Kim
@ 2009-06-09 8:37 ` KOSAKI Motohiro
0 siblings, 0 replies; 14+ messages in thread
From: KOSAKI Motohiro @ 2009-06-09 8:37 UTC (permalink / raw)
To: Minchan Kim
Cc: kosaki.motohiro, Daisuke Nishimura, LKML, linux-mm, Andrew Morton,
Johannes Weiner, Balbir Singh, KAMEZAWA Hiroyuki, Rik van Riel

> On Tue, Jun 9, 2009 at 5:24 PM, KOSAKI
> Motohiro<kosaki.motohiro@jp.fujitsu.com> wrote:
> >> On Tue, Jun 9, 2009 at 4:58 PM, KOSAKI
> >> Motohiro<kosaki.motohiro@jp.fujitsu.com> wrote:
> >> >> Hi, KOSAKI.
> >> >>
> >> >> As you know, this problem caused by if condition(priority) in shrink_zone.
> >> >> Let me have a question.
> >> >>
> >> >> Why do we have to prevent scan value calculation when the priority is zero ?
> >> >> As I know, before split-lru, we didn't do it.
> >> >>
> >> >> Is there any specific issue in case of the priority is zero ?
> >> >
> >> > Yes.
> >> >
> >> > example:
> >> >
> >> > get_scan_ratio() return anon:80%, file=20%. and the system have
> >> > 10000 anon pages and 10000 file pages.
> >> >
> >> > shrink_zone() picked up 8000 anon pages and 2000 file pages.
> >> > it mean 8000 file pages aren't scanned at all.
> >> >
> >> > Oops, it can makes OOM-killer although system have droppable file cache.
> >> >
> >> Hmm..Can that problem be happen in real system ?
> >> The file ratio is big means that file lru list scanning is so big but
> >> rotate is small.
> >> It means file lru have few reclaimable page.
> >>
> >> Isn't it ? I am confusing.
> >> Could you elaborate, please if you don't mind ?
> >
> > hm, ok, my example was wrong.
> > I intention is, if there are droppable file-back pages (althout only 1 page),
> > OOM-killer shouldn't occuer.
> >
> > many or few is unrelated.
> >
>
> I am not sure that is effective.
> Have you ever met this problem in real situation ?

No. It's only a stress-workload issue.
But the VM subsystem should work under stress workloads, I think.

> BTW, I have to dive into code. :)
> Thanks for spending valuable time for commenting
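The arithmetic in the quoted example can be sketched in C. This is a minimal illustration, not code from mm/vmscan.c: the function names scan_target_ratio_always() and scan_target_skip_at_zero() are hypothetical, and the formula (list size shifted right by priority, then scaled by a percentage) is an assumption about how shrink_zone() derives its scan count from the result of get_scan_ratio().

```c
#include <assert.h>

/*
 * Hypothetical helpers, not kernel code. They model how many pages of
 * one LRU list would be scanned at a given reclaim priority.
 */

/* Variant A: always apply the get_scan_ratio()-style percentage. */
unsigned long scan_target_ratio_always(unsigned long lru_pages, int priority,
				       unsigned int percent)
{
	assert(priority >= 0);
	assert(percent <= 100);
	return ((lru_pages >> priority) * percent) / 100;
}

/* Variant B: ignore the percentage at priority 0 and scan the whole list. */
unsigned long scan_target_skip_at_zero(unsigned long lru_pages, int priority,
				       unsigned int percent)
{
	assert(priority >= 0);
	assert(percent <= 100);
	if (priority == 0)
		return lru_pages;
	return ((lru_pages >> priority) * percent) / 100;
}
```

With 10000 anon pages, 10000 file pages and an 80%/20% split, variant A at priority 0 yields 8000 anon and 2000 file pages, so 8000 file pages are never scanned. Variant B yields 10000 for both lists, which avoids that case but also means the anon list is scanned in full regardless of the ratio.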
* Re: [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg)
2009-06-09 7:13 ` [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg) Daisuke Nishimura
2009-06-09 7:20 ` KOSAKI Motohiro
@ 2009-06-09 7:28 ` KAMEZAWA Hiroyuki
1 sibling, 0 replies; 14+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-06-09 7:28 UTC (permalink / raw)
To: Daisuke Nishimura
Cc: KOSAKI Motohiro, LKML, linux-mm, Andrew Morton, Johannes Weiner,
Balbir Singh

On Tue, 9 Jun 2009 16:13:30 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:

> > > and, too many recliaming pages is not only memcg issue. I don't think this
> > > patch provide generic solution.
> > >
> > Ah, you're right. It's not only memcg issue.
> >
> How about this one ?
>
> ===
> From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
>
> Commit 2e2e425989080cc534fc0fca154cae515f971cf5 ("vmscan,memcg: reintroduce
> sc->may_swap) add may_swap flag and handle it at get_scan_ratio().
>
> But the result of get_scan_ratio() is ignored when priority == 0,
> so anon lru is scanned even if may_swap == 0 or nr_swap_pages == 0.
> IMHO, this is not an expected behavior.
>
> As for memcg especially, because of this behavior many and many pages are
> swapped-out just in vain when oom is invoked by mem+swap limit.
>
> This patch is for handling may_swap flag more strictly.
>
> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>

Thanks,
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
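The behavior the quoted changelog asks for can be sketched as follows. This is a minimal model under stated assumptions, not the patch itself: struct scan_params and anon_scan_target() are hypothetical names standing in for the kernel's scan_control and the per-list computation in shrink_zone(), and the shift-and-scale formula is assumed rather than copied from the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant reclaim state; not a kernel type. */
struct scan_params {
	bool may_swap;        /* false when swapping must not be used */
	long nr_swap_pages;   /* free swap slots; <= 0 means no swap space */
};

/*
 * Number of anon pages to scan. When swap cannot be used, return 0 at
 * every priority, including priority 0 where the ratio would otherwise
 * be ignored and the whole list scanned.
 */
unsigned long anon_scan_target(const struct scan_params *p,
			       unsigned long anon_pages, int priority,
			       unsigned int anon_percent)
{
	if (!p->may_swap || p->nr_swap_pages <= 0)
		return 0;
	if (priority == 0)
		return anon_pages;
	return ((anon_pages >> priority) * anon_percent) / 100;
}
```

The point of the sketch is the first check: it runs before the priority == 0 shortcut, so may_swap == 0 or nr_swap_pages == 0 can no longer be bypassed when reclaim reaches its last priority level.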
end of thread

Thread overview: 14+ messages
2009-06-08 3:02 [PATCH mmotm] vmscan: fix may_swap handling for memcg Daisuke Nishimura
2009-06-08 3:20 ` KOSAKI Motohiro
2009-06-08 6:39 ` Daisuke Nishimura
2009-06-08 6:53 ` KOSAKI Motohiro
2009-06-08 7:54 ` Daisuke Nishimura
2009-06-09 7:13 ` [PATCH mmotm] vmscan: handle may_swap more strictly (Re: [PATCH mmotm] vmscan: fix may_swap handling for memcg) Daisuke Nishimura
2009-06-09 7:20 ` KOSAKI Motohiro
2009-06-09 7:48 ` Minchan Kim
2009-06-09 7:58 ` KOSAKI Motohiro
2009-06-09 8:19 ` Minchan Kim
2009-06-09 8:24 ` KOSAKI Motohiro
2009-06-09 8:35 ` Minchan Kim
2009-06-09 8:37 ` KOSAKI Motohiro
2009-06-09 7:28 ` KAMEZAWA Hiroyuki