From: Lisa Du <cldu@marvell.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@suse.cz>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Minchan Kim <minchan@kernel.org>,
KOSAKI Motohiro <kosaki.motohiro@gmail.com>,
Mel Gorman <mel@csn.ul.ie>, Christoph Lameter <cl@linux.com>,
Bob Liu <lliubbo@gmail.com>, Neil Zhang <zhangwm@marvell.com>,
Russell King - ARM Linux <linux@arm.linux.org.uk>,
Aaditya Kumar <aaditya.kumar.30@gmail.com>,
"yinghan@google.com" <yinghan@google.com>,
"npiggin@gmail.com" <npiggin@gmail.com>,
"riel@redhat.com" <riel@redhat.com>,
"kamezawa.hiroyu@jp.fujitsu.com" <kamezawa.hiroyu@jp.fujitsu.com>
Subject: RE: [resend] [PATCH V3] mm: vmscan: fix do_try_to_free_pages() livelock
Date: Tue, 27 Aug 2013 18:58:48 -0700
Message-ID: <89813612683626448B837EE5A0B6A7CB3B633DFBF8@SC-VEXCH4.marvell.com>
In-Reply-To: <20130827124307.63259439a80042bd81f27684@linux-foundation.org>
>-----Original Message-----
>From: Andrew Morton [mailto:akpm@linux-foundation.org]
>Sent: August 28, 2013 3:43
>To: Lisa Du
>Cc: Johannes Weiner; Michal Hocko; linux-mm@kvack.org; Minchan Kim; KOSAKI Motohiro; Mel Gorman; Christoph Lameter; Bob Liu;
>Neil Zhang; Russell King - ARM Linux; Aaditya Kumar; yinghan@google.com; npiggin@gmail.com; riel@redhat.com;
>kamezawa.hiroyu@jp.fujitsu.com
>Subject: Re: [resend] [PATCH V3] mm: vmscan: fix do_try_to_free_pages() livelock
>
>On Sun, 11 Aug 2013 18:46:08 -0700 Lisa Du <cldu@marvell.com> wrote:
>
>> This patch is based on KOSAKI's work, with a little more description
>> added; please refer to https://lkml.org/lkml/2012/6/14/74.
>>
>> Currently, I found that the system can enter a state where a zone has
>> lots of free pages but only order-0 and order-1 pages, which means
>> the zone is heavily fragmented. A high-order allocation can then cause
>> a long stall (e.g., 60 seconds) in the direct reclaim path, especially
>> in an environment with no swap and no compaction. This problem happened
>> on v3.4, but the issue seems to still exist in the current tree. The
>> reason is that do_try_to_free_pages() enters a livelock:
>>
>> kswapd will go to sleep if the zones have been fully scanned and are
>> still not balanced, because kswapd thinks there's little point in trying
>> all over again and wants to avoid an infinite loop. Instead it changes
>> the order from high-order to order-0 because kswapd thinks order-0 is
>> the most important. See commit 73ce02e9 for details. If watermarks are
>> ok, kswapd will go back to sleep and may leave
>> zone->all_unreclaimable = 0. It assumes high-order users can still
>> perform direct reclaim if they wish.
>>
>> Direct reclaim continues to reclaim for a high order that is not a
>> COSTLY_ORDER, without invoking the oom-killer, until kswapd turns on
>> zone->all_unreclaimable. This is to avoid a too-early oom-kill. So
>> direct reclaim depends on kswapd to break this loop.
>>
>> In the worst case, direct reclaim may continue to reclaim pages forever
>> while kswapd sleeps forever, until something like a watchdog detects
>> the stall and finally kills the process. As described in:
>> http://thread.gmane.org/gmane.linux.kernel.mm/103737
>>
>> We can't turn on zone->all_unreclaimable from the direct reclaim path
>> because the direct reclaim path doesn't take any lock, so doing this
>> would be racy. Thus this patch removes the zone->all_unreclaimable
>> field completely and recalculates the zone's reclaimable state every
>> time.
>>
>> Note: we can't take the approach where direct reclaim looks at
>> zone->pages_scanned directly and kswapd continues to use
>> zone->all_unreclaimable, because that is racy. Commit 929bea7c71
>> (vmscan: all_unreclaimable() use zone->all_unreclaimable as a name)
>> describes the details.
>
>I did this to fix the build:
>
>--- a/mm/migrate.c~mm-vmscan-fix-do_try_to_free_pages-livelock-fix-2
>+++ a/mm/migrate.c
>@@ -1471,7 +1471,7 @@ static bool migrate_balanced_pgdat(struc
> 		if (!populated_zone(zone))
> 			continue;
>
>-		if (zone->all_unreclaimable)
>+		if (!zone_reclaimable(zone))
> 			continue;
>
> 		/* Avoid waking kswapd by allocating pages_to_migrate pages. */
>
>Please review and runtime test it?
This looks reasonable. I'm sorry, but I only have a v3.4 environment, and
v3.4 doesn't have this function, so I can't runtime test it.
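
For reference, the idea behind the replacement check can be sketched as a
small userspace C model. This is only an illustration under stated
assumptions, not the patch itself: struct zone_model and its two fields are
hypothetical stand-ins for the kernel's per-zone counters,
zone_should_be_skipped() is a made-up helper name, and the factor of 6 is
taken from the heuristic kswapd has used in mm/vmscan.c.

```c
#include <stdbool.h>

/*
 * Simplified userspace model. struct zone_model is hypothetical; the real
 * struct zone carries many more fields and reads these counters via vmstat.
 */
struct zone_model {
	unsigned long pages_scanned;     /* pages scanned since the last reclaim */
	unsigned long reclaimable_pages; /* LRU pages reclaim could still free */
};

/*
 * Derive the reclaimable state on every call instead of caching it in a
 * flag that only kswapd updates. A zone counts as unreclaimable once it has
 * been scanned six times over without success (assumed heuristic).
 */
static bool zone_reclaimable(const struct zone_model *zone)
{
	return zone->pages_scanned < zone->reclaimable_pages * 6;
}

/*
 * Models the condition changed in migrate_balanced_pgdat(): a zone that
 * reclaim has given up on is skipped. Hypothetical helper name.
 */
static bool zone_should_be_skipped(const struct zone_model *zone)
{
	return !zone_reclaimable(zone);
}
```

Because the state is recomputed from the counters, direct reclaim no longer
has to wait for kswapd to wake up and set a flag before it can see that a
zone is unreclaimable.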
Thread overview: 17+ messages
2013-08-05 2:26 [resend] [PATCH] " Lisa Du
2013-08-05 2:56 ` Minchan Kim
2013-08-05 4:53 ` Johannes Weiner
2013-08-05 5:02 ` Minchan Kim
2013-08-05 7:41 ` Michal Hocko
2013-08-06 9:23 ` [resend] [PATCH V2] " Lisa Du
2013-08-06 10:35 ` Michal Hocko
2013-08-07 1:42 ` Lisa Du
2013-08-08 18:14 ` Johannes Weiner
2013-08-12 1:46 ` [resend] [PATCH V3] " Lisa Du
2013-08-20 22:16 ` Andrew Morton
2013-08-22 5:24 ` Lisa Du
2013-08-22 6:24 ` Minchan Kim
2013-08-22 7:14 ` Lisa Du
2013-08-27 19:43 ` Andrew Morton
2013-08-28 1:58 ` Lisa Du [this message]
2013-08-19 8:19 ` Lisa Du