Re: Unending loop in __alloc_pages_slowpath following OOM-kill; rfc: patch.

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
To: minchan.kim@gmail.com
Cc: abarry@cray.com, akpm@linux-foundation.org, linux-mm@kvack.org,
	mgorman@suse.de, riel@redhat.com, hannes@cmpxchg.org,
	linux-kernel@vger.kernel.org
Subject: Re: Unending loop in __alloc_pages_slowpath following OOM-kill; rfc: patch.
Date: Tue, 24 May 2011 13:54:54 +0900	[thread overview]
Message-ID: <4DDB3A1E.6090206@jp.fujitsu.com> (raw)
In-Reply-To: <20110520164924.GB2386@barrios-desktop>

>>From 8bd3f16736548375238161d1bd85f7d7c381031f Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minchan.kim@gmail.com>
> Date: Sat, 21 May 2011 01:37:41 +0900
> Subject: [PATCH] Prevent unending loop in __alloc_pages_slowpath
> 
> From: Andrew Barry <abarry@cray.com>
> 
> I believe I found a problem in __alloc_pages_slowpath, which allows a process to
> get stuck endlessly looping, even when lots of memory is available.
> 
> Running an I/O and memory intensive stress-test I see a 0-order page allocation
> with __GFP_IO and __GFP_WAIT, running on a system with very little free memory.
> Right about the same time that the stress-test gets killed by the OOM-killer,
> the utility trying to allocate memory gets stuck in __alloc_pages_slowpath even
> though most of the systems memory was freed by the oom-kill of the stress-test.
> 
> The utility ends up looping from the rebalance label down through the
> wait_iff_congested continiously. Because order=0, __alloc_pages_direct_compact
> skips the call to get_page_from_freelist. Because all of the reclaimable memory
> on the system has already been reclaimed, __alloc_pages_direct_reclaim skips the
> call to get_page_from_freelist. Since there is no __GFP_FS flag, the block with
> __alloc_pages_may_oom is skipped. The loop hits the wait_iff_congested, then
> jumps back to rebalance without ever trying to get_page_from_freelist. This loop
> repeats infinitely.
> 
> The test case is pretty pathological. Running a mix of I/O stress-tests that do
> a lot of fork() and consume all of the system memory, I can pretty reliably hit
> this on 600 nodes, in about 12 hours. 32GB/node.
> 
> Signed-off-by: Andrew Barry <abarry@cray.com>
> Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
> Cc: Mel Gorman <mgorman@suse.de>
> ---
>  mm/page_alloc.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3f8bce2..e78b324 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2064,6 +2064,7 @@ restart:
>  		first_zones_zonelist(zonelist, high_zoneidx, NULL,
>  					&preferred_zone);
>  
> +rebalance:
>  	/* This is the last chance, in general, before the goto nopage. */
>  	page = get_page_from_freelist(gfp_mask, nodemask, order, zonelist,
>  			high_zoneidx, alloc_flags & ~ALLOC_NO_WATERMARKS,
> @@ -2071,7 +2072,6 @@ restart:
>  	if (page)
>  		goto got_pg;
>  
> -rebalance:
>  	/* Allocate without watermarks if the context allows */
>  	if (alloc_flags & ALLOC_NO_WATERMARKS) {
>  		page = __alloc_pages_high_priority(gfp_mask, order,

I'm sorry I missed this thread long time.

In this case, I think we should call drain_all_pages(). then following
patch is better.
However I also think your patch is valuable. because while the task is
sleeping in wait_iff_congested(), an another task may free some pages.
thus, rebalance path should try to get free pages. iow, you makes sense.

So, I'd like to propose to merge both your and my patch.

Thanks.

next prev parent reply	other threads:[~2011-05-24  4:55 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-13 21:31 Andrew Barry
2011-05-17 10:34 ` Minchan Kim
2011-05-17 11:34   ` Mel Gorman
2011-05-17 15:49   ` Andrew Barry
2011-05-18 22:29     ` Minchan Kim
2011-05-20 16:49       ` Minchan Kim
2011-05-20 17:16         ` Rik van Riel
2011-05-20 17:23         ` Mel Gorman
2011-05-24  4:54         ` KOSAKI Motohiro [this message]
2011-05-24  5:45           ` KOSAKI Motohiro
2011-05-24  8:30           ` Mel Gorman
2011-05-24  8:36             ` KOSAKI Motohiro
2011-05-24  8:49               ` Mel Gorman
2011-05-24  9:05                 ` KOSAKI Motohiro
2011-05-24  9:16                   ` Mel Gorman
2011-05-24  9:40                     ` KOSAKI Motohiro
2011-05-24 10:57                       ` Mel Gorman
2011-05-24 23:53                         ` KOSAKI Motohiro
2011-05-24  8:34           ` Minchan Kim
2011-05-24  8:41             ` KOSAKI Motohiro
2011-05-24  8:57               ` Minchan Kim
2011-05-24  9:36                 ` KOSAKI Motohiro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4DDB3A1E.6090206@jp.fujitsu.com \
    --to=kosaki.motohiro@jp.fujitsu.com \
    --cc=abarry@cray.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=minchan.kim@gmail.com \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox