linux-mm.kvack.org archive mirror
From: Nick Piggin <npiggin@suse.de>
To: Martin Bligh <mbligh@mbligh.org>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
	Andi Kleen <ak@suse.de>, Ingo Molnar <mingo@elte.hu>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	Eric Whitney <eric.whitney@hp.com>
Subject: Re: [rfc] balance-on-fork NUMA placement
Date: Thu, 2 Aug 2007 03:36:31 +0200
Message-ID: <20070802013631.GA15595@wotan.suse.de>
In-Reply-To: <46B10E9B.2030907@mbligh.org>

On Wed, Aug 01, 2007 at 03:52:11PM -0700, Martin Bligh wrote:
> 
> >And so forth.  Initial forks will balance.  If the children refuse to
> >die, forks will continue to balance.  If the parent starts seeing short
> >lived children, fork()s will eventually start to stay local.  
> 
> Fork without exec is much more rare than with. Optimising for
> the uncommon case is the Wrong Thing to Do (tm). What we decided

It's only the wrong thing to do if it hurts the common case too
much. Considering we _already_ balance on exec, then adding another
balance on fork is not going to introduce some order of magnitude
problem -- at worst it would be 2x but it really isn't too slow
anyway (at least nobody complained when we added it).

One place where we found it helps is clone for threads.

If we didn't do such a bad job at keeping tasks together with their
local memory, then we might indeed reduce some of the balance-on-crap
and increase the aggressiveness of periodic balancing.

Considering we _already_ balance on fork/clone, I don't know what
your argument against this patch is. Doing the balance earlier
and allocating more stuff on the local node is surely not a bad
idea.
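
To make the ordering point concrete, here is a toy model, not kernel
code. It assumes a simple least-loaded node picker and assumes that
"balancing late" leaves the child's task struct and stack on the
parent's node. The names pick_node and count_remote_allocations are
made up for illustration.

```python
def pick_node(load):
    """Return the least-loaded node, lowest index winning ties (assumed policy)."""
    return min(range(len(load)), key=lambda n: (load[n], n))


def count_remote_allocations(n_children, n_nodes, early, parent_node=0):
    """Count children whose per-task structures sit on a different node
    from the one they are placed on to run.

    early=True  : choose the node first, then allocate there.
    early=False : allocate on the parent's node, then choose the node.
    """
    load = [0] * n_nodes
    load[parent_node] = 1  # the parent occupies its own node
    remote = 0
    for _ in range(n_children):
        target = pick_node(load)
        struct_node = target if early else parent_node
        load[target] += 1
        if struct_node != target:
            remote += 1
    return remote


if __name__ == "__main__":
    for early in (False, True):
        print("early" if early else "late ",
              count_remote_allocations(8, 4, early=early))
```

In this model the balancing decision is identical in both cases; only
the point at which the allocation happens differs, which is the whole
argument for moving the balance earlier.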


> the last time(s) this came up was to allow userspace to pass
> a hint in if they wanted to fork and not exec.
> 
> >I believe that this solved the pathological behavior we were seeing with
> >shell scripts taking way longer on the larger, supposedly more powerful,
> >platforms.
> >
> >Of course, that OS could migrate the equivalent of task structs and
> >kernel stack [the old Unix user struct that was traditionally swappable,
> >so fairly easy to migrate].  On Linux, all bets are off, once the
> >scheduler starts migrating tasks away from the node that contains their
> >task struct, ...  [Remember Erich Focht's "NUMA Affine Scheduler" patch
> >with its "home node"?]
> 
> Task migration doesn't work well at all without userspace hints.
> SGI tried for ages (with IRIX) and failed. There are long discussions
> of all of these things back in the days when we merged the original
> NUMA scheduler in late 2.5 ...

Task migration? Automatic memory migration you mean? I think it deserves
another look regardless of what SGI could or could not do, and Lee and I
are slowly getting things in place. We'll see what happens...
