From: Andrea Arcangeli <andrea@novell.com>
To: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Nick Piggin <piggin@cyberone.com.au>,
Rik van Riel <riel@redhat.com>,
Martin MOKREJŠ <mmokrejs@ribosome.natur.cuni.cz>,
tglx@linutronix.de
Subject: Re: [PATCH] fix spurious OOM kills
Date: Thu, 11 Nov 2004 17:50:51 +0100
Message-ID: <20041111165050.GA5822@x30.random>
In-Reply-To: <20041111123850.GA16349@logos.cnet>
On Thu, Nov 11, 2004 at 10:38:50AM -0200, Marcelo Tosatti wrote:
>
> Hi!
>
> On Thu, Nov 11, 2004 at 04:42:38PM +0100, Andrea Arcangeli wrote:
> > On Thu, Nov 11, 2004 at 09:29:22AM -0200, Marcelo Tosatti wrote:
> > > Hi,
> > >
> > > This is an improved version of OOM-kill-from-kswapd patch.
> > >
> > > I believe triggering the OOM killer from task reclaim context
> > > is broken because the chances that it happens increases as the amount
> > > of tasks inside reclaim increases - and that approach ignores efforts
> > > being done by kswapd, who is the main entity responsible for
> > > freeing pages.
> > >
> > > There have been a few problems pointed out by others (Andrea, Nick) on the
> > > last patch - this one solves them.
> >
> > I disagree about the design of killing anything from kswapd. kswapd is
> > an async helper like pdflush and it has no knowledge on the caller (it
> > cannot know if the caller is ok with the memory currently available in
> > the freelists, before triggering the oom).
>
> If zone_dma / zone_normal are below pages_min no caller is "OK with
> memory currently available" except GFP_ATOMIC/realtime callers.
If the DMA zone is filled and nobody allocates with GFP_DMA, nothing
should be killed and everything should run fine. How can you get this
right from kswapd?
> > I'm just about to move the
> > oom killing away from vmscan.c to page_alloc.c which is basically the
> > opposite of moving the oom invocation from the task context to kswapd.
> > page_alloc.c in the task context is the only one who can know if
> > something has to be killed, vmscan.c cannot know. vmscan.c can only know
> > if something is still freeable, but if something isn't freeable it
> > doesn't mean that we've to kill anything
>
> Well Andrea, its not about "if something isnt freeable", its about
> "the VM is unable to make progress reclaiming pages".
"VM is unable to reclaim pages" == "nothing is freeable"
> > (for example if a task exited
> > or some dma or normal-zone or highmem memory was released by another
> > task while we were paging waiting for I/O).
>
> My last patch checks for pages_min before OOM killing, have you read it?
checking pages_min isn't correct anyway: the lowmem_reserve must be taken
into account, or you may not kill tasks when you really should kill
tasks.
Plus you're checking all zones, but kswapd cannot know that it
doesn't matter if the DMA zone is under pages_min, as long as nobody
allocates with GFP_DMA.
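To make the two points above concrete, here is a minimal userspace sketch,
not kernel code. The names struct zone_model, zone_ok() and alloc_is_stuck()
are hypothetical and exist only for illustration; only pages_min,
lowmem_reserve and the zone names come from this discussion. It assumes an
allocation can fall back from its own zone class to lower zones, and that
each zone holds back lowmem_reserve pages from higher allocation classes.

```c
#include <assert.h>
#include <stdbool.h>

enum zone_idx { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, NR_ZONES };

/* Hypothetical, simplified model of a memory zone. */
struct zone_model {
    unsigned long free_pages;
    unsigned long pages_min;
    /* Pages held back from allocations whose highest usable zone is i. */
    unsigned long lowmem_reserve[NR_ZONES];
};

/* Can an allocation of class class_idx take a page from zone z? */
static bool zone_ok(const struct zone_model *z, enum zone_idx class_idx)
{
    return z->free_pages > z->pages_min + z->lowmem_reserve[class_idx];
}

/*
 * True only if every zone usable by this allocation class is below its
 * threshold. Zones above class_idx are never looked at, so a depleted
 * zone that the caller cannot use (or does not need) is ignored.
 */
static bool alloc_is_stuck(const struct zone_model *zones,
                           enum zone_idx class_idx)
{
    assert(class_idx < NR_ZONES);
    for (int i = (int)class_idx; i >= 0; i--) {
        if (zone_ok(&zones[i], class_idx))
            return false;
    }
    return true;
}
```

With the DMA zone under pages_min and the normal zone healthy, a
ZONE_NORMAL-class allocation is not stuck in this model, while a
ZONE_DMA-class one is. Only the allocating context knows which class it
is asking for, which is why the check is keyed on class_idx.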
> > Every allocation is different and page_alloc.c is the only one who
> > knows what has to be done for every single allocation.
>
> OK, what do you propose? Its the third time I ask you this and got no
> concrete answer yet.
I want to move it to page_alloc.c (and up to the caller), not into
kswapd. I've mentioned this a few times.
> Sure, allocators should receive -ENOMEM whenever possible, but this
> is not the issue here.
it is the issue, because only the task context can choose whether to
return -ENOMEM or to invoke the oom killer and try again.
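As a rough illustration of that choice, here is a small userspace sketch,
not a patch. Every name in it (struct alloc_state, enum oom_action,
decide_after_reclaim()) is hypothetical, and the caller_can_fail flag is an
assumption standing in for whatever the allocating context knows about its
caller. It only models the decision taken after a reclaim pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum oom_action { RETRY_ALLOC, FAIL_ENOMEM, KILL_AND_RETRY };

/* Hypothetical view of one allocation attempt after a reclaim pass. */
struct alloc_state {
    bool freelist_ok;      /* usable zones are above threshold right now */
    bool reclaim_progress; /* the reclaim pass freed something */
    bool caller_can_fail;  /* caller copes with an -ENOMEM return */
};

static enum oom_action decide_after_reclaim(const struct alloc_state *s)
{
    assert(s != NULL);

    /*
     * Memory may have shown up while we were waiting for I/O (a task
     * exited, another task released pages), so look at the freelists
     * before drawing any conclusion from a failed reclaim pass.
     */
    if (s->freelist_ok || s->reclaim_progress)
        return RETRY_ALLOC;

    /* Nothing usable and nothing freeable: prefer a clean failure. */
    if (s->caller_can_fail)
        return FAIL_ENOMEM;

    /* Only now is killing something justified, followed by a retry. */
    return KILL_AND_RETRY;
}
```

The inputs are all per-allocation facts, which is the reason this decision
cannot be taken from an async helper that never sees the caller.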
> Triggering OOM killer on __alloc_pages() failure ?
yes, ideally I'd put the oom killer _outside_ alloc_pages, but just
moving it into alloc_pages should make things better than they are right
now in vmscan.c.
> Show us the code, please :)
I'm supposedly listening to a meeting right now, then I've a bad kernel
crash to debug with random mem corruption that I just managed to
reproduce deterministically inside uml by emulating numa inside uml and
I'll be busy until next week at the very least. So I doubt I'll be able
to write any oom-related code until next week, sorry.