linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Paul Mundt <lethal@linux-sh.org>
To: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	linux-mm <linux-mm@kvack.org>,
	Christoph Lameter <clameter@sgi.com>,
	Nishanth Aravamudan <nacc@us.ibm.com>,
	kxr@sgi.com, ak@suse.de, akpm@linux-foundation.org,
	Eric Whitney <eric.whitney@hp.com>
Subject: Re: [PATCH/RFC] Allow selected nodes to be excluded from MPOL_INTERLEAVE masks
Date: Fri, 3 Aug 2007 16:53:22 +0900	[thread overview]
Message-ID: <20070803075322.GA18267@linux-sh.org> (raw)
In-Reply-To: <1185975558.5059.18.camel@localhost>

On Wed, Aug 01, 2007 at 09:39:18AM -0400, Lee Schermerhorn wrote:
> On Wed, 2007-08-01 at 19:16 +0900, Paul Mundt wrote:
> > If we can differentiate between MPOL_INTERLEAVE from the kernel's point
> > of view, and explicit MPOL_INTERLEAVE specifiers via mbind() from
> > userspace, that works fine for my case. However, the mpol_new() changes
> > in this patch deny small nodes the ability to ever be included in an
> > MPOL_INTERLEAVE policy, when it's only the kernel policy that I have a
> > problem with.
> 
> Ah, but it would only "deny small nodes" if you nominate them in the
> boot option.  I haven't changed your heuristic in numa_policy_init.  So,
> it will still eliminate small nodes from the boot time interleave
> nodemask, independent of whether or not you specify them in the
> no_interleave_nodes list.
> 
> Or am I missing your point?

That's correct, as long as the size heuristic remains in
numa_policy_init() there's no problem with this. The point was more that
if we were able to use N_INTERLEAVE nodes for the system init policy, it
would be possible to do away with the size heuristic entirely.

Effectively we want the same things, but whereas you want the interleave
nodes to be something applied to all policies, I'm mostly concerned with
keeping the kernel away from the nodes we don't want to interleave.
Userland is basically a free-for-all in terms of the allowable nodemask,
so I don't have a need to restrict MPOL_INTERLEAVE policies once the
system is up.

The size heuristic itself is a bit of a kludge anyhow. I'd like to have a
single point where I can tell the kernel "these nodes are special, don't
use them unless you've been asked". And that's certainly something I
don't have an issue flagging in the pgdat when constructing the nodes in
the first place (at which point we already know which ones are special,
without having to bother with command line options). Whether this is
something that's best as a special node state or not is something that
will need some toying with. On the other hand, simply being able to take the
system init node list and keep that "pinned" is another option, so we
don't end up allocating there even if node 0 is under pressure.

Page migration also poses an interesting problem, in that we don't have a
problem in migrating pages between and off of these nodes, but we do not
want to migrate pages that started out in system memory to them, as the
node will run out of pages too quickly (and also gives those pages up to
whatever is migrated first, rather than something that actually _wants_
those pages out of performance considerations). I don't see an easy way
to do this without having a page flag that indicates whether migration to
special nodes is permitted or not, and setting that when the page is
allocated.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

      reply	other threads:[~2007-08-03  7:53 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-07-27 20:07 Lee Schermerhorn
2007-07-28  6:19 ` KAMEZAWA Hiroyuki
2007-07-30 16:13   ` Lee Schermerhorn
2007-07-30 18:29     ` Christoph Lameter
2007-07-30 20:32       ` Lee Schermerhorn
2007-07-30 21:57         ` Christoph Lameter
2007-08-01 10:16     ` Paul Mundt
2007-08-01 10:33       ` Andi Kleen
2007-08-01 11:01         ` Paul Mundt
2007-08-01 11:07           ` Andi Kleen
2007-08-01 11:21             ` Paul Mundt
2007-08-01 13:54               ` Lee Schermerhorn
2007-08-02 17:38                 ` Mark Gross
2007-08-02 18:46                   ` Lee Schermerhorn
2007-08-06 16:42                     ` Mark Gross
2007-08-01 13:39       ` Lee Schermerhorn
2007-08-03  7:53         ` Paul Mundt [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070803075322.GA18267@linux-sh.org \
    --to=lethal@linux-sh.org \
    --cc=Lee.Schermerhorn@hp.com \
    --cc=ak@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=clameter@sgi.com \
    --cc=eric.whitney@hp.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=kxr@sgi.com \
    --cc=linux-mm@kvack.org \
    --cc=nacc@us.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox