From: David Rientjes <rientjes@google.com>
To: Paul Jackson <pj@sgi.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
kosaki.motohiro@jp.fujitsu.com, andi@firstfloor.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
akpm@linux-foundation.org, clameter@sgi.com, mel@csn.ul.ie
Subject: Re: [2.6.24-rc8-mm1][regression?] numactl --interleave=all doesn't works on memoryless node.
Date: Tue, 5 Feb 2008 11:56:57 -0800 (PST)
Message-ID: <alpine.DEB.0.9999.0802051146300.5854@chino.kir.corp.google.com>
In-Reply-To: <20080205041755.3411b5cc.pj@sgi.com>
On Tue, 5 Feb 2008, Paul Jackson wrote:
> But that discussion touched on some other long standing deficiencies
> in the way that I had originally glued cpusets and memory policies
> together. The current mechanism doesn't handle changing cpusets very
> well, especially if the number of nodes in the cpuset increases.
>
That's because of the nodemask remaps that are done for the various
mempolicy cases when rebinding the policy. I agree we cannot change that
implementation now even though it is undocumented.
The more alarming result of these remaps is in the MPOL_BIND case, as
we've talked about before. The language in set_mempolicy(2):

	The MPOL_BIND policy is a strict policy that restricts memory
	allocation to the nodes specified in nodemask. There won't be
	allocations on other nodes.

makes it pretty clear that allocations will not be done on nodes outside
the set_mempolicy() nodemask if the task is not swapped out.
But the current implementation allows such allocations if the task is
either moved to a different cpuset or its cpuset's mems change. For
example, consider a task that is allowed nodes 1-3 by its cpuset and asks
for an MPOL_BIND mempolicy of node 2. If that cpuset's mems change to 4-6,
the remap preserves the node's relative position within the set, so the
mempolicy is now effectively a bind on node 5.
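
To make the node 2 -> node 5 example concrete, here is a minimal
userspace sketch of a position-preserving remap. It is only modeled
loosely on the kernel's behavior: a nodemask is reduced to a single
unsigned long, and mask_t, bits_below(), nth_set_bit() and
remap_sketch() are hypothetical names for illustration, not kernel
interfaces.

```c
/*
 * Userspace sketch only: a nodemask is modeled as one unsigned long,
 * where bit n set means "node n". All names here are hypothetical.
 */
typedef unsigned long mask_t;

#define MAX_NODES ((int)(sizeof(mask_t) * 8))

/* Count the set bits of mask that lie strictly below position pos. */
static int bits_below(mask_t mask, int pos)
{
	int n, count = 0;

	for (n = 0; n < pos; n++)
		if (mask & (1UL << n))
			count++;
	return count;
}

/* Return the position of the ord-th (0-based) set bit, or -1 if none. */
static int nth_set_bit(mask_t mask, int ord)
{
	int n;

	for (n = 0; n < MAX_NODES; n++)
		if ((mask & (1UL << n)) && ord-- == 0)
			return n;
	return -1;
}

/*
 * Remap each node in policy from its position within old_mems to the
 * same position within new_mems. Nodes that are not in old_mems, or
 * any node when new_mems is empty, are left where they are (assumed
 * behavior for this sketch).
 */
static mask_t remap_sketch(mask_t policy, mask_t old_mems, mask_t new_mems)
{
	mask_t result = 0;
	int new_weight = bits_below(new_mems, MAX_NODES);
	int node;

	for (node = 0; node < MAX_NODES; node++) {
		if (!(policy & (1UL << node)))
			continue;
		if (!(old_mems & (1UL << node)) || new_weight == 0) {
			result |= 1UL << node;
		} else {
			int ord = bits_below(old_mems, node);

			result |= 1UL << nth_set_bit(new_mems,
						     ord % new_weight);
		}
	}
	return result;
}
```

With old mems 1-3 (0xe) and new mems 4-6 (0x70), calling
remap_sketch(1UL << 2, 0xe, 0x70) returns 1UL << 5: node 2 is the second
node of the old set, so it lands on the second node of the new set.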
> The next two steps I need to take are:
> 1) propose this patch, with careful explanation (it's easy to lose
> one's bearings in the mappings and remappings of node numberings)
> to a wider audience, such as linux-mm or linux-kernel, and
Thanks.
> 2) carefully test this, especially on each code path I touched in
> mm/mempolicy.c, where the changes were delicate, to ensure I
> didn't break any existing code.
>
> There were also some other, smaller patches proposed, by myself and
> others. I was preferring to address a wider set of the long standing
> issues in this area, but the others above mostly preferred the smaller
> patches. This needs to be discussed in a wider forum, and a consensus
> reached.
>
I think if these MPOL_* flags that you're proposing are made generic
enough to apply to all mempolicies (current and future), it would be the
optimal change. It would prevent us from having to add new flags for
corner cases in the future and would allow us to keep the flag set as
small as possible. My suggestion of MPOL_F_STATIC_NODEMASK goes a long
way toward solving these issues both for MPOL_INTERLEAVE (in conjunction
with storing the set_mempolicy() intent) and the MPOL_BIND discrepancy I
mentioned above.
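
One possible reading of the static-nodemask idea, sketched in userspace
C. The exact semantics of the flag are not settled in this thread, so
the intersection below is an assumption, and struct policy_sketch and
rebind_static_sketch() are hypothetical names, not kernel interfaces.
The point is only that the nodemask passed to set_mempolicy() is stored
and never renumbered when the cpuset's mems change.

```c
/* Userspace sketch only: bit n set in a mask means "node n". */
typedef unsigned long mask_t;

struct policy_sketch {
	mask_t user_nodemask;	/* intent: what set_mempolicy() was given */
	mask_t effective;	/* nodes allocations are limited to now */
};

/*
 * Assumed behavior for a "static" policy: on a cpuset change, keep
 * only the requested nodes that the new mems still allow. No node is
 * ever remapped to a different node number.
 */
static void rebind_static_sketch(struct policy_sketch *pol, mask_t new_mems)
{
	pol->effective = pol->user_nodemask & new_mems;
}
```

Under this assumption, a bind on node 2 stays a bind on node 2 while the
cpuset allows 1-3, becomes empty when the mems change to 4-6 (rather
than turning into node 5), and is restored if the mems change back. What
should happen while the effective mask is empty is left open here.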
David