From: Greg Thelen <gthelen@google.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Mel Gorman <mel@csn.ul.ie>,
linux-mm@kvack.org
Subject: Re: [patch 3/5] vmscan: remove all_unreclaimable scan control
Date: Mon, 31 May 2010 11:32:51 -0700
Message-ID: <xr93sk57yl9o.fsf@ninji.mtv.corp.google.com>
In-Reply-To: <20100430224316.056084208@cmpxchg.org> (Johannes Weiner's message of "Sat, 1 May 2010 01:05:31 +0200")
Johannes Weiner <hannes@cmpxchg.org> writes:
> This scan control is abused to communicate a return value from
> shrink_zones(). Write this idiomatically and remove the knob.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
> mm/vmscan.c | 14 ++++++--------
> 1 file changed, 6 insertions(+), 8 deletions(-)
>
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -70,8 +70,6 @@ struct scan_control {
>
> int swappiness;
>
> - int all_unreclaimable;
> -
> int order;
>
> int lumpy_reclaim;
> @@ -1701,14 +1699,14 @@ static void shrink_zone(int priority, st
> * If a zone is deemed to be full of pinned pages then just give it a light
> * scan then give up on it.
> */
> -static void shrink_zones(int priority, struct zonelist *zonelist,
> +static int shrink_zones(int priority, struct zonelist *zonelist,
> struct scan_control *sc)
> {
> enum zone_type high_zoneidx = gfp_zone(sc->gfp_mask);
> struct zoneref *z;
> struct zone *zone;
> + int progress = 0;
>
> - sc->all_unreclaimable = 1;
> for_each_zone_zonelist_nodemask(zone, z, zonelist, high_zoneidx,
> sc->nodemask) {
> if (!populated_zone(zone))
> @@ -1724,19 +1722,19 @@ static void shrink_zones(int priority, s
>
> if (zone->all_unreclaimable && priority != DEF_PRIORITY)
> continue; /* Let kswapd poll it */
> - sc->all_unreclaimable = 0;
> } else {
> /*
> * Ignore cpuset limitation here. We just want to reduce
> * # of used pages by us regardless of memory shortage.
> */
> - sc->all_unreclaimable = 0;
> mem_cgroup_note_reclaim_priority(sc->mem_cgroup,
> priority);
> }
>
> shrink_zone(priority, zone, sc);
> + progress = 1;
> }
> + return progress;
> }
>
> /*
> @@ -1789,7 +1787,7 @@ static unsigned long do_try_to_free_page
> sc->nr_scanned = 0;
> if (!priority)
> disable_swap_token();
> - shrink_zones(priority, zonelist, sc);
> + ret = shrink_zones(priority, zonelist, sc);
> /*
> * Don't shrink slabs when reclaiming memory from
> * over limit cgroups
> @@ -1826,7 +1824,7 @@ static unsigned long do_try_to_free_page
> congestion_wait(BLK_RW_ASYNC, HZ/10);
> }
> /* top priority shrink_zones still had more to do? don't OOM, then */
> - if (!sc->all_unreclaimable && scanning_global_lru(sc))
> + if (ret && scanning_global_lru(sc))
> ret = sc->nr_reclaimed;
> out:
> /*
I agree with the direction of this patch, but I am seeing a hang when
testing with mmotm-2010-05-21-16-05. The following test hangs, unless I
remove this patch from mmotm:
    mount -t cgroup none /cgroups -o memory
    mkdir /cgroups/cg1
    echo $$ > /cgroups/cg1/tasks
    dd bs=1024 count=1024 if=/dev/null of=/data/foo
    echo $$ > /cgroups/tasks
    echo 1 > /cgroups/cg1/memory.force_empty
I think the hang is caused by the following portion of
mem_cgroup_force_empty():
	while (nr_retries && mem->res.usage > 0) {
		int progress;

		if (signal_pending(current)) {
			ret = -EINTR;
			goto out;
		}
		progress = try_to_free_mem_cgroup_pages(mem, GFP_KERNEL,
						false, get_swappiness(mem));
		if (!progress) {
			nr_retries--;
			/* maybe some writeback is necessary */
			congestion_wait(BLK_RW_ASYNC, HZ/10);
		}
	}
With this patch applied, when do_try_to_free_pages() calls shrink_zones()
at priority 0, shrink_zones() can return 1 (indicating progress) even
though no pages were reclaimed. Because this is a cgroup operation,
scanning_global_lru() is false, so the following portion of
do_try_to_free_pages() never replaces ret with sc->nr_reclaimed, which
would be 0 here:
> if (ret && scanning_global_lru(sc))
> ret = sc->nr_reclaimed;
This leaves ret=1, which tells the caller that do_try_to_free_pages()
reclaimed one page even though it reclaimed none. mem_cgroup_force_empty()
therefore believes that try_to_free_mem_cgroup_pages() is making progress
(one page at a time), never decrements nr_retries, and loops forever.
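To make the failure mode concrete, here is a minimal user-space sketch of
the retry loop above. It is not kernel code: model_force_empty(), struct
loop_result, and the max_iter cap are hypothetical names introduced only for
illustration, and the signal check and congestion_wait() are left out. The
"reported" argument stands in for the return value of
try_to_free_mem_cgroup_pages(), and "reclaimed" for the pages actually freed
per call.

```c
/* Result of running the modeled retry loop. */
struct loop_result {
	int iterations;   /* loop bodies executed before stopping */
	int retries_left; /* remaining nr_retries budget */
};

/*
 * Hypothetical model of the mem_cgroup_force_empty() loop.
 * max_iter is an artificial cap so the model terminates; the
 * kernel loop quoted above has no such cap.
 */
static struct loop_result model_force_empty(unsigned long usage,
					    unsigned long reported,
					    unsigned long reclaimed,
					    int nr_retries, int max_iter)
{
	struct loop_result r = { 0, nr_retries };

	while (r.retries_left && usage > 0 && r.iterations < max_iter) {
		unsigned long progress = reported;

		/* usage only drops by what was really reclaimed */
		usage -= (reclaimed < usage) ? reclaimed : usage;

		/* retries are consumed only when no progress is reported */
		if (!progress)
			r.retries_left--;
		r.iterations++;
	}
	return r;
}
```

With reported = 1 and reclaimed = 0 the model stops only at the artificial
max_iter cap and the retry budget is untouched. With reported = 0 it exits
after nr_retries passes, which is the behavior the loop relies on.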
If I apply the following fix, then your patch does not hang and the
system appears to operate correctly.
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 915dceb..772913c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1850,7 +1850,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
congestion_wait(BLK_RW_ASYNC, HZ/10);
}
/* top priority shrink_zones still had more to do? don't OOM, then */
- if (ret && scanning_global_lru(sc))
+ if (ret)
ret = sc->nr_reclaimed;
out:
/*
I have not done thorough testing, so this may introduce other problems.
Is there a reason not to return nr_reclaimed when operating on a cgroup?
This may affect mem_cgroup_hierarchical_reclaim().
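For comparison, the effect of the one-line change on the final return value
can be sketched in user space as follows. This is only a model under stated
assumptions: model_final_ret() and its drop_global_check flag are
hypothetical, and the early "goto out" paths of do_try_to_free_pages() are
ignored.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the tail of do_try_to_free_pages().
 * progress:          value returned by shrink_zones() (0 or 1)
 * global_lru:        stands in for scanning_global_lru(sc)
 * drop_global_check: false = "if (ret && scanning_global_lru(sc))",
 *                    true  = "if (ret)" as in the fix above
 */
static unsigned long model_final_ret(bool drop_global_check, bool global_lru,
				     int progress, unsigned long nr_reclaimed)
{
	unsigned long ret = progress;

	if (ret && (drop_global_check || global_lru))
		ret = nr_reclaimed;
	return ret;
}
```

For cgroup reclaim (global_lru false) with progress = 1 and nr_reclaimed = 0,
the original condition yields 1 while the changed condition yields 0. Global
reclaim returns nr_reclaimed under either condition.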
--
Greg