From: Michal Hocko <mhocko@suse.cz>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
Huang Ying <ying.huang@intel.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Dave Chinner <david@fromorbit.com>, Theodore Ts'o <tytso@mit.edu>
Subject: Re: [patch 08/12] mm: page_alloc: wait for OOM killer progress before retrying
Date: Thu, 26 Mar 2015 16:58:46 +0100
Message-ID: <20150326155846.GQ15257@dhcp22.suse.cz>
In-Reply-To: <1427264236-17249-9-git-send-email-hannes@cmpxchg.org>
On Wed 25-03-15 02:17:12, Johannes Weiner wrote:
> There is not much point in rushing back to the freelists and burning
> CPU cycles in direct reclaim when somebody else is in the process of
> OOM killing, or right after issuing a kill ourselves, because it could
> take some time for the OOM victim to release memory.
Yes, this makes sense and it is better than what we have now. The
question is how long we should wait. I can see you have gone with HZ;
what is that value based on? Has your testing shown that the OOM victim
manages to die within a second most of the time?
I do not want to get into a discussion about which value is best, but I
would expect a larger one. Most OOM victims are not blocked, so they
would wake us up soon. The timeout is a safety net for those which are
blocked, and I do not think we have to expedite those rare cases; we
should rather optimize for "regular" OOM situations. How about 10-30s?
> This is a very cold error path, so there is not much hurry. Use the
> OOM victim waitqueue to wait for victims to actually exit, which is a
> solid signal that the memory pinned by those tasks has been released.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
> mm/oom_kill.c | 11 +++++++----
> mm/page_alloc.c | 42 +++++++++++++++++++++++++-----------------
> 2 files changed, 32 insertions(+), 21 deletions(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 5cfda39b3268..e066ac7353a4 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -711,12 +711,15 @@ bool out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
> killed = 1;
> }
> out:
> + if (test_thread_flag(TIF_MEMDIE))
> + return true;
> /*
> - * Give the killed threads a good chance of exiting before trying to
> - * allocate memory again.
> + * Wait for any outstanding OOM victims to die. In rare cases
> + * victims can get stuck behind the allocating tasks, so the
> + * wait needs to be bounded. It's crude alright, but cheaper
> + * than keeping a global dependency tree between all tasks.
> */
> - if (killed)
> - schedule_timeout_killable(1);
> + wait_event_timeout(oom_victims_wait, !atomic_read(&oom_victims), HZ);
>
> return true;
> }
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c1224ba45548..9ce9c4c083a0 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2330,30 +2330,29 @@ void warn_alloc_failed(gfp_t gfp_mask, int order, const char *fmt, ...)
> }
>
> static inline struct page *
> -__alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
> +__alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order, int alloc_flags,
> const struct alloc_context *ac, unsigned long *did_some_progress)
> {
> - struct page *page;
> + struct page *page = NULL;
>
> *did_some_progress = 0;
>
> /*
> - * Acquire the oom lock. If that fails, somebody else is
> - * making progress for us.
> + * This allocating task can become the OOM victim itself at
> + * any point before acquiring the lock. In that case, exit
> + * quickly and don't block on the lock held by another task
> + * waiting for us to exit.
> */
> - if (!mutex_trylock(&oom_lock)) {
> - *did_some_progress = 1;
> - schedule_timeout_uninterruptible(1);
> - return NULL;
> + if (test_thread_flag(TIF_MEMDIE) || mutex_lock_killable(&oom_lock)) {
> + alloc_flags |= ALLOC_NO_WATERMARKS;
> + goto alloc;
> }
>
> /*
> - * Go through the zonelist yet one more time, keep very high watermark
> - * here, this is only to catch a parallel oom killing, we must fail if
> - * we're still under heavy pressure.
> + * While we have been waiting for the lock, the previous OOM
> + * kill might have released enough memory for the both of us.
> */
> - page = get_page_from_freelist(gfp_mask | __GFP_HARDWALL, order,
> - ALLOC_WMARK_HIGH|ALLOC_CPUSET, ac);
> + page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
> if (page)
> goto out;
>
> @@ -2383,12 +2382,20 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
> if (gfp_mask & __GFP_THISNODE)
> goto out;
> }
> - /* Exhausted what can be done so it's blamo time */
> - if (out_of_memory(ac->zonelist, gfp_mask, order, ac->nodemask, false)
> - || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL))
> +
> + if (out_of_memory(ac->zonelist, gfp_mask, order, ac->nodemask, false)) {
> *did_some_progress = 1;
> + } else {
> + /* Oops, these shouldn't happen with the OOM killer disabled */
> + if (WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL))
> + *did_some_progress = 1;
> + }
> out:
> mutex_unlock(&oom_lock);
> +alloc:
> + if (!page)
> + page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
> +
> return page;
> }
>
> @@ -2775,7 +2782,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> }
>
> /* Reclaim has failed us, start killing things */
> - page = __alloc_pages_may_oom(gfp_mask, order, ac, &did_some_progress);
> + page = __alloc_pages_may_oom(gfp_mask, order, alloc_flags, ac,
> + &did_some_progress);
> if (page)
> goto got_pg;
>
> --
> 2.3.3
>
--
Michal Hocko
SUSE Labs