From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Matt Mackall <mpm@selenic.com>
Cc: Paul Mundt <lethal@linux-sh.org>,
Christoph Lameter <clameter@sgi.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, ak@suse.de, hugh@veritas.com,
lee.schermerhorn@hp.com
Subject: Re: [PATCH] numa: mempolicy: dynamic interleave map for system init.
Date: Wed, 13 Jun 2007 12:10:21 +1000
Message-ID: <466F520D.9080206@yahoo.com.au>
In-Reply-To: <20070612153234.GI11115@waste.org>
Matt Mackall wrote:
> On Tue, Jun 12, 2007 at 06:43:59PM +0900, Paul Mundt wrote:
>
>>On Fri, Jun 08, 2007 at 09:50:11AM -0500, Matt Mackall wrote:
>>
>>>SLOB's big scalability problem at this point is number of CPUs.
>>>Throwing some fine-grained locking at it or the like may be able to
>>>help with that too.
>>>
>>>Why would you even want to bother making it scale that large? For
>>>starters, it's less affected by things like dcache fragmentation. The
>>>majority of pages pinned by long-lived dcache entries will still be
>>>available to other allocations.
>>>
>>>Haven't given any thought to NUMA yet though..
>>>
>>
>>This is what I've hacked together and tested with my small nodes. It's
>>not terribly intelligent, and it pushes off most of the logic to the page
>>allocator. Obviously it's not terribly scalable, and I haven't tested it
>>with page migration, either. Still, it works for me with my simple tmpfs
>>+ mpol policy tests.
>>
>>Tested on a UP + SPARSEMEM (static, not extreme) + NUMA (2 nodes) + SLOB
>>configuration.
>>
>>Flame away!
>
>
> For starters, it's not against the current SLOB, which no longer has
> the bigblock list.
>
>
>>-void *__kmalloc(size_t size, gfp_t gfp)
>>+static void *__kmalloc_alloc(size_t size, gfp_t gfp, int node)
>
>
> That's a ridiculous name. So, uh.. more underbars!
>
> Though really, I think you can just name it __kmalloc_node?
>
>
>>+ if (node == -1)
>>+ pages = alloc_pages(flags, get_order(c->size));
>>+ else
>>+ pages = alloc_pages_node(node, flags,
>>+ get_order(c->size));
>
>
> This fragment appears a few times. Looks like it ought to get its own
> function. And that function can reduce to a trivial inline in the
> !NUMA case.
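
[A minimal sketch of the kind of wrapper being suggested here -- the name
slob_alloc_pages and its placement are assumptions for illustration, not
taken from the patch:

#ifdef CONFIG_NUMA
/* Wrap the repeated node-aware page allocation in one place. */
static struct page *slob_alloc_pages(gfp_t flags, int order, int node)
{
	if (node == -1)
		return alloc_pages(flags, order);
	return alloc_pages_node(node, flags, order);
}
#else
/* In the !NUMA case this collapses to a plain alloc_pages() call. */
static inline struct page *slob_alloc_pages(gfp_t flags, int order, int node)
{
	return alloc_pages(flags, order);
}
#endif
]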
BTW, what I would like to see tried initially -- which may give reasonable
scalability and NUMA awareness -- is per-CPU or per-node free page lists.
However, these lists would not be exclusively per-CPU, because that would
result in worse memory consumption (with SLOB we should always put memory
consumption above all else).
So each list would have its own lock and could be accessed by any CPU, but
each CPU would try its own (node-local) list first (or, in the case of
kmalloc_node, the requested node's list).
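
[A rough sketch of the data structure this describes -- all names and
fields are purely illustrative:

struct slob_list {
	spinlock_t	 lock;		/* protects this list only */
	struct list_head free_pages;	/* SLOB pages with free units */
	unsigned long	 free_units;	/* total free units on this list */
};

/* One list per node; any CPU may lock and use any of them. */
static struct slob_list slob_lists[MAX_NUMNODES];
]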
Then we'd probably want to introduce a *little* bit of slack, so that we
allocate a new page onto our local list even when a small amount of memory
is free on another list. I think this might be enough to get a reasonable
number of list-local allocations without blowing out memory usage much. The
slack ratio could be configurable, so at one extreme we could always
allocate from our local list for the best NUMA placement, I guess.
I haven't given it a great deal of thought, so this strategy might go
horribly wrong in some cases... but I have a feeling something reasonably
simple like that might go a long way to improving locking scalability and
NUMAness.
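
[To make the slack idea concrete, the allocation-time decision could look
roughly like the following. This is only a sketch: slob_slack,
slob_pick_list and the slob_lists array above are made up for
illustration, and "units" here means SLOB allocation units, not bytes.

static int slob_slack = 4;	/* tunable: how much remote free space
				 * justifies not taking a new local page */

/* Pick which node's list to allocate 'units' from, preferring 'local'. */
static int slob_pick_list(unsigned long units, int local)
{
	int node;

	if (slob_lists[local].free_units >= units)
		return local;

	for_each_online_node(node) {
		if (node == local)
			continue;
		/* Only spill to a remote list if it has plenty of slack. */
		if (slob_lists[node].free_units >= units * slob_slack)
			return node;
	}

	/* Otherwise grab a fresh page for the local list. */
	return local;
}
]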
--
SUSE Labs, Novell Inc.