linux-mm.kvack.org archive mirror
From: Nick Piggin <nickpiggin@yahoo.com.au>
To: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <clameter@sgi.com>,
	Andrea Arcangeli <andrea@cpushare.com>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 11 of 11] not-wait-memdie
Date: Tue, 8 Jan 2008 18:42:33 +1100
Message-ID: <200801081842.33482.nickpiggin@yahoo.com.au>
In-Reply-To: <alpine.DEB.0.9999.0801071929300.29897@chino.kir.corp.google.com>

On Tuesday 08 January 2008 14:37, David Rientjes wrote:
> On Tue, 8 Jan 2008, Nick Piggin wrote:
> > The problem is the global reserve. Once you have a kernel that doesn't
> > need this handwavy global reserve for forward progress, a lot of little
> > problems go away.
>
> I'm specifically talking about TIF_MEMDIE here which gives access to that
> global reserve.

And I'm specifically talking about PF_MEMALLOC, which does the same.

> In OOM situations there is no easy way to guarantee that 
> a task will have enough memory to exit, but that is exactly what is needed
> to alleviate the condition.  Additionally, it is not guaranteed that a
> task that has been OOM killed and given access to the global reserve will
> exit after it has exhausted that reserve in its entirety.  That's when the
> system deadlocks.

I know all that ;) Your second point is the reason to have more than 1
MEMDIE process...


> So giving access to the global reserve to multiple tasks that share memory
> in at least one of their zones for simultaneous OOM killings is not a
> complete solution.  There should be a timeout on tasks when they are OOM
> killed; if they cannot exit for the duration of that period, they lose
> access to the reserves and only then is another task selected.

Hmm, OK I didn't realise you'd proposed that as an alternative. Maybe.
I don't know if the complexity would be worthwhile, given that there is
no sort of reentrancy limit on the global reserve pool anyway.
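
To make the timeout idea concrete, here is a rough userspace sketch of
what is being proposed, assuming a per-task timestamp along the lines of
oom_kill_jiffies. The struct, the OOM_KILL_TIMEOUT value, and both helper
names are hypothetical and only model the behaviour; this is not kernel
code and not a patch from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of this is kernel code. */
#define OOM_KILL_TIMEOUT 500UL	/* assumed timeout, in ticks */

struct task {
	bool memdie;			/* models TIF_MEMDIE */
	unsigned long oom_kill_jiffies;	/* tick at which the task was OOM killed */
};

/* Mark a task as OOM killed and record when that happened. */
static void oom_kill(struct task *t, unsigned long now)
{
	assert(t != NULL);
	t->memdie = true;
	t->oom_kill_jiffies = now;
}

/*
 * Returns true when the caller may select another victim: either no
 * kill is pending, or the pending victim failed to exit in time, in
 * which case it also loses its access to the reserves.
 */
static bool oom_timeout_expired(struct task *t, unsigned long now)
{
	assert(t != NULL);
	if (!t->memdie)
		return true;
	/* Unsigned subtraction stays correct across counter wraparound. */
	if (now - t->oom_kill_jiffies < OOM_KILL_TIMEOUT)
		return false;
	t->memdie = false;
	return true;
}
```

The model keeps one timestamp per task, so zone-disjoint OOM kills would
each carry their own deadline; it says nothing about how much of the
reserve a victim may consume before that deadline.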


> > > That's only possible with my proposal of adding
> > >
> > > 	unsigned long oom_kill_jiffies;
> > >
> > > to struct task_struct.  We can't get away with a system-wide jiffies
> > > variable, nor can we get away with per-cgroup, per-cpuset, or
> > > per-mempolicy variable.  The only way to clear such a variable is in
> > > the exit path (by checking test_tsk_thread_flag(tsk, TIF_MEMDIE) in
> > > do_exit()) and fails miserably if there are simultaneous but
> > > zone-disjoint OOMs occurring.
> >
> > Why not just have a global frequency limit on OOM events. Then the panic
> > has this delay factored in...
>
> Because OOM killing is going to become more and more frequent with the
> introduction of the memory controller which uses it as a mechanism to
> enforce its policy.  And a global frequency limit does not work well for
> parallel cpuset, mempolicy, or memory controller OOM events.  That is why
> it is currently serialized by the triggering task's zonelist and not
> globally.

I don't think that's a very good reason for the complexity. If your
system is OOM-throughput-limited, then something's very wrong with
your workload management. (And I don't buy the DoS security argument
either, because the memory controller doesn't provide security, last
time I looked.)
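
For comparison, the global frequency limit mentioned above could look
roughly like the sketch below. It is a userspace model under stated
assumptions: struct oom_limiter, OOM_MIN_INTERVAL, and oom_kill_allowed()
are hypothetical names chosen for illustration, not existing kernel
interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of this is kernel code. */
#define OOM_MIN_INTERVAL 100UL	/* assumed minimum ticks between kills */

struct oom_limiter {
	bool armed;		/* false until the first kill is recorded */
	unsigned long last_kill;	/* tick of the most recent allowed kill */
};

/*
 * Returns true and records the time if an OOM kill may go ahead now,
 * false if the previous kill was too recent.
 */
static bool oom_kill_allowed(struct oom_limiter *l, unsigned long now)
{
	assert(l != NULL);
	/* Unsigned subtraction stays correct across counter wraparound. */
	if (l->armed && now - l->last_kill < OOM_MIN_INTERVAL)
		return false;
	l->armed = true;
	l->last_kill = now;
	return true;
}
```

A single limiter like this serializes every OOM event in the system,
which is exactly the property objected to in the quoted text for parallel
cpuset, mempolicy, or memory controller OOMs.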


