From: David Rientjes <rientjes@google.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Rik van Riel <riel@redhat.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Nick Piggin <npiggin@suse.de>,
Andrea Arcangeli <aarcange@redhat.com>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
Lubos Lunak <l.lunak@suse.cz>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch 3/7 -mm] oom: select task from tasklist for mempolicy ooms
Date: Tue, 16 Feb 2010 13:52:53 -0800 (PST)
Message-ID: <alpine.DEB.2.00.1002161343070.23037@chino.kir.corp.google.com>
In-Reply-To: <20100216135240.72EC.A69D9226@jp.fujitsu.com>
On Tue, 16 Feb 2010, KOSAKI Motohiro wrote:
> > We need to get a refcount on the mempolicy to ensure it doesn't get freed
> > from under us, tsk is not necessarily current.
>
> Hm.
> If your explanation is correct, I think your patch has the following race.
>
>
> CPU0                                CPU1
> ----------------------------------------------
> mempolicy_nodemask_intersects()
> mempolicy = tsk->mempolicy;
>                                     do_exit()
>                                     mpol_put(tsk_mempolicy)
> mpol_get(mempolicy);
>
True, good point. It looks like we'll need to include mempolicy
detachment in exit_mm() while under task_lock() and then synchronize with
that. It's a legitimate place to do it since no memory allocation will be
done after the task's mm is detached, anyway.
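To illustrate the synchronization described above, here is a minimal
userspace sketch, not kernel code and not part of the patch. It assumes a
pthread mutex standing in for task_lock(), and every struct and function
name in it (struct task, policy_put(), task_detach_policy(),
task_get_policy()) is hypothetical. The point is only that the pointer load
and the refcount increment happen under the same lock that the exit path
takes to detach the policy, so the reader can never bump a freed object.

```c
/* Userspace model only: all struct and function names are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct mempolicy {
	atomic_int refcnt;
};

struct task {
	pthread_mutex_t lock;		/* stands in for task_lock(tsk) */
	struct mempolicy *mempolicy;	/* NULL once the task has detached it */
};

static struct mempolicy *policy_alloc(void)
{
	struct mempolicy *pol = malloc(sizeof(*pol));

	assert(pol != NULL);
	atomic_init(&pol->refcnt, 1);	/* reference owned by the task */
	return pol;
}

/* Drop one reference; returns 1 if this call freed the policy. */
static int policy_put(struct mempolicy *pol)
{
	if (atomic_fetch_sub(&pol->refcnt, 1) == 1) {
		free(pol);
		return 1;
	}
	return 0;
}

/* Exit path: unhook the policy under the lock, drop the ref outside it. */
static int task_detach_policy(struct task *tsk)
{
	struct mempolicy *pol;

	pthread_mutex_lock(&tsk->lock);
	pol = tsk->mempolicy;
	tsk->mempolicy = NULL;
	pthread_mutex_unlock(&tsk->lock);

	return pol ? policy_put(pol) : 0;
}

/* OOM path: the load and the refcount bump happen under the same lock. */
static struct mempolicy *task_get_policy(struct task *tsk)
{
	struct mempolicy *pol;

	pthread_mutex_lock(&tsk->lock);
	pol = tsk->mempolicy;
	if (pol)
		atomic_fetch_add(&pol->refcnt, 1);
	pthread_mutex_unlock(&tsk->lock);

	return pol;
}
```

A reader that gets NULL back from task_get_policy() simply treats the task
as having no mempolicy; a reader that gets a pointer owns a reference and
must release it with policy_put() when done.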
> > For MPOL_F_LOCAL, we need to check whether the task's cpu is on a node
> > that is allowed by the zonelist passed to the page allocator. In the
> > second revision of this patchset, this was changed to
> >
> > node_isset(cpu_to_node(task_cpu(tsk)), *mask)
> >
> > to check. It would be possible for no memory to have been allocated on
> > that node and it just happens that the tsk is running on it momentarily,
> > but it's the best indication we have given the mempolicy of whether
> > killing a task may lead to future memory freeing.
>
> This calculation is still broken. In general, the running cpu and the
> allocation node are not bound to each other.
Not sure what you mean: MPOL_F_LOCAL means that allocations will happen on
the node of the cpu on which the task is running. The cpu-to-node mapping
doesn't change; only the cpu on which the task is running may change. That
may be restricted by sched_setaffinity() or cpusets, however, so the task
may never allocate on any other node (i.e. it may run on another cpu, but
always one local to a specific node). Running on a node in the mask is
enough of an indication that the task should be a candidate for kill: we're
trying to eliminate from consideration tasks that may never allocate on
current's nodemask. In other words, it would be unfair for two tasks that
are isolated to their own cpus on different physical nodes, using
MPOL_F_LOCAL for NUMA optimizations, to have the other needlessly killed
when current can't allocate there anyway.
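The eligibility test discussed above can be modeled in userspace as
follows. This is only a sketch under stated assumptions: the four-cpu,
two-node topology table is invented for the example, and the names
cpu_node, nodemask_model_t, struct task_model and
local_task_is_candidate() are hypothetical. It mirrors the shape of the
node_isset(cpu_to_node(task_cpu(tsk)), *mask) check quoted earlier, not its
kernel implementation.

```c
/* Userspace model only: topology table and all names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Fixed cpu-to-node mapping: cpus 0-1 on node 0, cpus 2-3 on node 1. */
static const int cpu_node[NR_CPUS] = { 0, 0, 1, 1 };

typedef unsigned long nodemask_model_t;	/* bit n set => node n allowed */

struct task_model {
	int cpu;	/* cpu the task is currently running on */
};

static bool node_in_mask(int node, nodemask_model_t mask)
{
	return (mask >> node) & 1UL;
}

/*
 * A task using local allocation is a kill candidate only if the node of
 * its current cpu is in the nodemask of the allocation that hit the OOM.
 */
static bool local_task_is_candidate(const struct task_model *tsk,
				    nodemask_model_t oom_mask)
{
	assert(tsk->cpu >= 0 && tsk->cpu < NR_CPUS);
	return node_in_mask(cpu_node[tsk->cpu], oom_mask);
}
```

With an OOM nodemask containing only node 0, a task on cpu 1 is a
candidate and a task pinned to cpu 2 is not, which is the two-isolated-tasks
case described above.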