linux-mm.kvack.org archive mirror
From: Andrea Arcangeli <andrea@cpushare.com>
To: David Rientjes <rientjes@google.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 04 of 11] avoid selecting already killed tasks
Date: Thu, 3 Jan 2008 14:41:37 +0100
Message-ID: <20080103134137.GT30939@v2.random>
In-Reply-To: <alpine.DEB.0.9999.0801030134130.25018@chino.kir.corp.google.com>

On Thu, Jan 03, 2008 at 01:40:09AM -0800, David Rientjes wrote:
> On Thu, 3 Jan 2008, Andrea Arcangeli wrote:
> 
> > avoid selecting already killed tasks
> > 
> > If the killed task doesn't go away because it's waiting on some other
> > task who needs to allocate memory, to release the i_sem or some other
> > lock, we must fallback to killing some other task in order to kill the
> > original selected and already oomkilled task, but the logic that kills
> > the childs first, would deadlock, if the already oom-killed task was
> > actually the first child of the newly oom-killed task.
> > 
> 
> The problem is that this can cause the parent or one of its children to be 
> unnecessarily killed.

Well, the mere fact that I'm skipping over the TIF_MEMDIE tasks to
prevent deadlocks allows for spurious oom killing again. Like you
said, we can later add a per-task timeout so we wait only X seconds for
a certain TIF_MEMDIE task to quit before selecting another one.

But unfortunately we have to ignore those TIF_MEMDIE tasks, or we
deadlock, no matter whether we're in select_bad_process or in
oom_kill_process. Initially I didn't notice that oom_kill_process had
that problem, so I was still deadlocking even though select_bad_process
was selecting the parent that didn't have TIF_MEMDIE set (but the first
child already had it).
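
To illustrate the oom_kill_process side, here is a toy user-space model
in C. It is not the kernel code: struct toy_task, its memdie field
(standing in for TIF_MEMDIE) and toy_pick_kill_target() are names made
up for this sketch, and it assumes only the kill-the-children-first
policy described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a task; NOT the kernel's task_struct. */
struct toy_task {
	const char *name;
	bool memdie;                 /* stands in for TIF_MEMDIE */
	struct toy_task **children;
	size_t nr_children;
};

/*
 * Given the task chosen by the selection step, decide what to kill.
 * Children are tried first, but a child that was already oom-killed is
 * skipped: signalling it again frees nothing and makes no progress.
 * Returns NULL when the victim and all its children are already marked.
 */
static struct toy_task *toy_pick_kill_target(struct toy_task *victim)
{
	assert(victim != NULL);

	for (size_t i = 0; i < victim->nr_children; i++) {
		struct toy_task *child = victim->children[i];

		if (child->memdie)
			continue;    /* already killed, try the next one */
		return child;
	}
	return victim->memdie ? NULL : victim;
}
```

With a parent whose first child already has memdie set, this returns
the second child instead of picking the first one over and over.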

> Regardless of any OOM killer synchronization that we do, it is still 
> possible for the OOM killer to return after killing a task and then 
> another OOM situation be triggered on a subsequent allocation attempt 
> before the killed task has exited.  It's still marked as TIF_MEMDIE, so 
> your change will exempt it from being a target again and one of its 
> siblings or, worse, its parent will be killed.

This is the risk of spurious oom killing, yes. You have to choose
between a deadlock and risking a spurious oom kill. Even when you
add your 60-second timeout in the task_struct between each new setting
of the TIF_MEMDIE bitflag, you're still going to risk spurious oom
killing...
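
As a rough sketch of the per-task timeout idea mentioned above, again
as a toy user-space model and not kernel code: struct timed_task, its
killed_at field, enum toy_verdict and toy_select() are all hypothetical
names, time is in plain seconds, and picking the largest rss is just a
placeholder for whatever badness heuristic is really used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a task; NOT the kernel's task_struct. */
struct timed_task {
	bool memdie;              /* stands in for TIF_MEMDIE */
	unsigned long killed_at;  /* seconds; meaningful only if memdie */
	unsigned long rss;        /* placeholder badness score */
};

enum toy_verdict { TOY_WAIT, TOY_KILL, TOY_NONE };

/*
 * If an already-killed task is still within its timeout, wait for it.
 * Once the timeout has expired, skip that task and pick another one,
 * which is exactly where a spurious kill can still happen.
 */
static enum toy_verdict toy_select(struct timed_task *tasks, size_t n,
				   unsigned long now, unsigned long timeout,
				   struct timed_task **out)
{
	assert(tasks != NULL && out != NULL);
	*out = NULL;

	for (size_t i = 0; i < n; i++) {
		struct timed_task *t = &tasks[i];

		if (t->memdie) {
			if (now - t->killed_at < timeout) {
				*out = NULL;
				return TOY_WAIT;  /* give it time to exit */
			}
			continue;                 /* stuck too long: skip it */
		}
		if (*out == NULL || t->rss > (*out)->rss)
			*out = t;
	}
	return *out ? TOY_KILL : TOY_NONE;
}
```

The timeout only bounds how long we wait; after it expires, the next
task is killed whether or not the first one was about to exit.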

The schedule_timeout in the oom killer and in the VM that I have in my
patchset, combined with your very limited zone-oom-lock functionality
(limited because it's gone by the time out_of_memory returns, and it
currently can't take into account when the TIF_MEMDIE task actually
exited), in practice didn't generate spurious kills in my testing. It
may not be enough, but it's a start...

> You can't guarantee that this couldn't have been prevented given 
> sufficient time for the exiting task to die, so this change introduces the 
> possibility that tasks will unnecessarily be killed to alleviate the OOM 
> condition.

Not just to 'alleviate' the oom condition, but to prevent a system crash.


