From: David Rientjes <rientjes@google.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Oleg Nesterov <oleg@redhat.com>, anfei <anfei.zhou@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
nishimura@mxp.nes.nec.co.jp,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] oom killer: break from infinite loop
Date: Sun, 4 Apr 2010 16:26:38 -0700 (PDT) [thread overview]
Message-ID: <alpine.DEB.2.00.1004041616280.7198@chino.kir.corp.google.com> (raw)
In-Reply-To: <20100402101711.GC12886@csn.ul.ie>
On Fri, 2 Apr 2010, Mel Gorman wrote:
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1610,13 +1610,21 @@ try_next_zone:
> > }
> >
> > static inline int
> > -should_alloc_retry(gfp_t gfp_mask, unsigned int order,
> > +should_alloc_retry(struct task_struct *p, gfp_t gfp_mask, unsigned int order,
> > unsigned long pages_reclaimed)
> > {
> > /* Do not loop if specifically requested */
> > if (gfp_mask & __GFP_NORETRY)
> > return 0;
> >
> > + /* Loop if specifically requested */
> > + if (gfp_mask & __GFP_NOFAIL)
> > + return 1;
> > +
>
> Meh, you could have preserved the comment but no biggie.
>
I'll remember to preserve it when it's proposed.
> > + /* Task is killed, fail the allocation if possible */
> > + if (fatal_signal_pending(p))
> > + return 0;
> > +
>
> Seems reasonable. This will be checked on every major loop in the
> allocator slow patch.
>
> > /*
> > * In this implementation, order <= PAGE_ALLOC_COSTLY_ORDER
> > * means __GFP_NOFAIL, but that may not be true in other
> > @@ -1635,13 +1643,6 @@ should_alloc_retry(gfp_t gfp_mask, unsigned int order,
> > if (gfp_mask & __GFP_REPEAT && pages_reclaimed < (1 << order))
> > return 1;
> >
> > - /*
> > - * Don't let big-order allocations loop unless the caller
> > - * explicitly requests that.
> > - */
> > - if (gfp_mask & __GFP_NOFAIL)
> > - return 1;
> > -
> > return 0;
> > }
> >
> > @@ -1798,6 +1799,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
> > if (likely(!(gfp_mask & __GFP_NOMEMALLOC))) {
> > if (!in_interrupt() &&
> > ((p->flags & PF_MEMALLOC) ||
> > + (fatal_signal_pending(p) && (gfp_mask & __GFP_NOFAIL)) ||
>
> This is a lot less clear. GFP_NOFAIL is rare so this is basically saying
> that all threads with a fatal signal pending can ignore watermarks. This
> is dangerous because if 1000 threads get killed, there is a possibility
> of deadlocking the system.
>
I don't quite understand the comment, this is only for __GFP_NOFAIL
allocations, which you say are rare, so a large number of threads won't be
doing this simultaneously.
> Why not obey the watermarks and just not retry the loop later and fail
> the allocation?
>
The above check for (fatal_signal_pending(p) && (gfp_mask & __GFP_NOFAIL))
essentially oom kills p without invoking the oom killer before direct
reclaim is invoked. We know it has a pending SIGKILL and wants to exit,
so we allow it to allocate beyond the min watermark to avoid costly
reclaim or needlessly killing another task.
> > unlikely(test_thread_flag(TIF_MEMDIE))))
> > alloc_flags |= ALLOC_NO_WATERMARKS;
> > }
> > @@ -1812,6 +1814,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> > int migratetype)
> > {
> > const gfp_t wait = gfp_mask & __GFP_WAIT;
> > + const gfp_t nofail = gfp_mask & __GFP_NOFAIL;
> > struct page *page = NULL;
> > int alloc_flags;
> > unsigned long pages_reclaimed = 0;
> > @@ -1876,7 +1879,7 @@ rebalance:
> > goto nopage;
> >
> > /* Avoid allocations with no watermarks from looping endlessly */
> > - if (test_thread_flag(TIF_MEMDIE) && !(gfp_mask & __GFP_NOFAIL))
> > + if (test_thread_flag(TIF_MEMDIE) && !nofail)
> > goto nopage;
> >
> > /* Try direct reclaim and then allocating */
> > @@ -1888,6 +1891,10 @@ rebalance:
> > if (page)
> > goto got_pg;
> >
> > + /* Task is killed, fail the allocation if possible */
> > + if (fatal_signal_pending(p) && !nofail)
> > + goto nopage;
> > +
>
> Again, I would expect this to be caught by should_alloc_retry().
>
It is, but only after the oom killer is called. We don't want to
needlessly kill another task here when p has already been killed but may
not be PF_EXITING yet.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2010-04-04 23:26 UTC|newest]
Thread overview: 115+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-03-24 16:25 Anfei Zhou
2010-03-25 2:51 ` KOSAKI Motohiro
2010-03-26 22:08 ` Andrew Morton
2010-03-26 22:33 ` Oleg Nesterov
2010-03-28 14:55 ` anfei
2010-03-28 16:28 ` Oleg Nesterov
2010-03-28 21:21 ` David Rientjes
2010-03-29 11:21 ` Oleg Nesterov
2010-03-29 20:49 ` [patch] oom: give current access to memory reserves if it has been killed David Rientjes
2010-03-30 15:46 ` Oleg Nesterov
2010-03-30 20:26 ` David Rientjes
2010-03-31 17:58 ` Oleg Nesterov
2010-03-31 20:47 ` Oleg Nesterov
2010-04-01 8:35 ` David Rientjes
2010-04-01 8:57 ` [patch -mm] oom: hold tasklist_lock when dumping tasks David Rientjes
2010-04-01 14:27 ` Oleg Nesterov
2010-04-01 19:16 ` David Rientjes
2010-04-01 13:59 ` [patch] oom: give current access to memory reserves if it has been killed Oleg Nesterov
2010-04-01 19:12 ` David Rientjes
2010-04-02 11:14 ` Oleg Nesterov
2010-04-02 18:30 ` [PATCH -mm 0/4] oom: linux has threads Oleg Nesterov
2010-04-02 18:31 ` [PATCH -mm 1/4] oom: select_bad_process: check PF_KTHREAD instead of !mm to skip kthreads Oleg Nesterov
2010-04-02 19:05 ` David Rientjes
2010-04-02 18:32 ` [PATCH -mm 2/4] oom: select_bad_process: PF_EXITING check should take ->mm into account Oleg Nesterov
2010-04-06 11:42 ` anfei
2010-04-06 12:18 ` Oleg Nesterov
2010-04-06 13:05 ` anfei
2010-04-06 13:38 ` Oleg Nesterov
2010-04-02 18:32 ` [PATCH -mm 3/4] oom: introduce find_lock_task_mm() to fix !mm false positives Oleg Nesterov
2010-04-02 18:33 ` [PATCH -mm 4/4] oom: oom_forkbomb_penalty: move thread_group_cputime() out of task_lock() Oleg Nesterov
2010-04-02 19:04 ` David Rientjes
2010-04-05 14:23 ` [PATCH -mm] oom: select_bad_process: never choose tasks with badness == 0 Oleg Nesterov
2010-04-02 19:02 ` [patch] oom: give current access to memory reserves if it has been killed David Rientjes
2010-04-02 19:14 ` Oleg Nesterov
2010-04-02 19:46 ` David Rientjes
2010-04-02 19:54 ` [patch -mm] oom: exclude tasks with badness score of 0 from being selected David Rientjes
2010-04-02 21:04 ` Oleg Nesterov
2010-04-02 21:22 ` [patch -mm v2] " David Rientjes
2010-04-02 20:55 ` [patch] oom: give current access to memory reserves if it has been killed Oleg Nesterov
2010-03-31 21:07 ` David Rientjes
2010-03-31 22:50 ` Oleg Nesterov
2010-03-31 23:30 ` Oleg Nesterov
2010-03-31 23:48 ` David Rientjes
2010-04-01 14:39 ` Oleg Nesterov
2010-04-01 18:58 ` David Rientjes
2010-04-01 8:25 ` David Rientjes
2010-04-01 15:26 ` Oleg Nesterov
2010-04-08 21:08 ` David Rientjes
2010-04-09 12:38 ` Oleg Nesterov
2010-03-30 16:39 ` [PATCH] oom: fix the unsafe proc_oom_score()->badness() call Oleg Nesterov
2010-03-30 17:43 ` [PATCH -mm] proc: don't take ->siglock for /proc/pid/oom_adj Oleg Nesterov
2010-03-30 20:30 ` David Rientjes
2010-03-31 9:17 ` Oleg Nesterov
2010-03-31 18:59 ` Oleg Nesterov
2010-03-31 21:14 ` David Rientjes
2010-03-31 23:00 ` Oleg Nesterov
2010-04-01 8:32 ` David Rientjes
2010-04-01 15:37 ` Oleg Nesterov
2010-04-01 19:04 ` David Rientjes
2010-03-30 20:32 ` [PATCH] oom: fix the unsafe proc_oom_score()->badness() call David Rientjes
2010-03-31 9:16 ` Oleg Nesterov
2010-03-31 20:17 ` Oleg Nesterov
2010-04-01 7:41 ` David Rientjes
2010-04-01 13:13 ` [PATCH 0/1] oom: fix the unsafe usage of badness() in proc_oom_score() Oleg Nesterov
2010-04-01 13:13 ` [PATCH 1/1] " Oleg Nesterov
2010-04-01 19:03 ` David Rientjes
2010-03-29 14:06 ` [PATCH] oom killer: break from infinite loop anfei
2010-03-29 20:01 ` David Rientjes
2010-03-30 14:29 ` anfei
2010-03-30 20:29 ` David Rientjes
2010-03-31 0:57 ` KAMEZAWA Hiroyuki
2010-03-31 6:07 ` David Rientjes
2010-03-31 6:13 ` KAMEZAWA Hiroyuki
2010-03-31 6:30 ` Balbir Singh
2010-03-31 6:31 ` KAMEZAWA Hiroyuki
2010-03-31 7:04 ` David Rientjes
2010-03-31 6:32 ` David Rientjes
2010-03-31 7:08 ` [patch -mm] memcg: make oom killer a no-op when no killable task can be found David Rientjes
2010-03-31 7:08 ` KAMEZAWA Hiroyuki
2010-03-31 8:04 ` Balbir Singh
2010-03-31 10:38 ` David Rientjes
2010-04-04 23:28 ` David Rientjes
2010-04-05 21:30 ` Andrew Morton
2010-04-05 22:40 ` David Rientjes
2010-04-05 22:49 ` Andrew Morton
2010-04-05 23:01 ` David Rientjes
2010-04-06 12:08 ` KOSAKI Motohiro
2010-04-06 21:47 ` David Rientjes
2010-04-07 0:20 ` KAMEZAWA Hiroyuki
2010-04-07 13:29 ` KOSAKI Motohiro
2010-04-08 18:05 ` David Rientjes
2010-04-21 19:17 ` Andrew Morton
2010-04-21 22:04 ` David Rientjes
2010-04-22 0:23 ` KAMEZAWA Hiroyuki
2010-04-22 8:34 ` David Rientjes
2010-04-27 22:58 ` [patch -mm] oom: reintroduce and deprecate oom_kill_allocating_task David Rientjes
2010-04-28 0:57 ` KAMEZAWA Hiroyuki
2010-04-22 7:23 ` [patch -mm] memcg: make oom killer a no-op when no killable task can be found Nick Piggin
2010-04-22 7:25 ` KAMEZAWA Hiroyuki
2010-04-22 10:09 ` Nick Piggin
2010-04-22 10:27 ` KAMEZAWA Hiroyuki
2010-04-22 21:11 ` David Rientjes
2010-04-22 10:28 ` David Rientjes
2010-04-22 15:39 ` Nick Piggin
2010-04-22 21:09 ` David Rientjes
2010-05-04 23:55 ` David Rientjes
2010-04-08 17:36 ` David Rientjes
2010-04-02 10:17 ` [PATCH] oom killer: break from infinite loop Mel Gorman
2010-04-04 23:26 ` David Rientjes [this message]
2010-04-05 10:47 ` Mel Gorman
2010-04-06 22:40 ` David Rientjes
2010-03-29 11:31 ` anfei
2010-03-29 11:46 ` Oleg Nesterov
2010-03-29 12:09 ` anfei
2010-03-28 2:46 ` David Rientjes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.DEB.2.00.1004041616280.7198@chino.kir.corp.google.com \
--to=rientjes@google.com \
--cc=akpm@linux-foundation.org \
--cc=anfei.zhou@gmail.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=kosaki.motohiro@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mel@csn.ul.ie \
--cc=nishimura@mxp.nes.nec.co.jp \
--cc=oleg@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox