From: Andrew Morton <akpm@osdl.org>
To: Paul Jackson <pj@sgi.com>
Cc: clameter@sgi.com, linux-mm@kvack.org, rientjes@google.com
Subject: Re: [PATCH] GFP_THISNODE for the slab allocator
Date: Fri, 15 Sep 2006 01:06:22 -0700 [thread overview]
Message-ID: <20060915010622.0e3539d2.akpm@osdl.org> (raw)
In-Reply-To: <20060915004402.88d462ff.pj@sgi.com>
On Fri, 15 Sep 2006 00:44:02 -0700
Paul Jackson <pj@sgi.com> wrote:
> Andrew wrote:
> > Well some bright spark went and had the idea of using cpusets and fake numa
> > nodes as a means of memory paritioning, didn't he?
>
> If that bright spark is lurking here, perhaps he could educate
> me a little. I mostly ignored the fake numa node stuff when it
> went by, because I figured it was just an amusing novelty.
numa=fake=N is an x86_64-only hack which Andi stuck in there in the early
days just for developer NUMA testing. Once opterons became commodity it
bitrotted because there was no need for it.
Then it occurred to me (although apparently another
brightspark@somewhere.jp has the same idea earlier on) that if you can
slice a UMA machine into 128 or 256 little pieces and manage them using
cpusets, you have *all* the infrastructure you need to do crude but
effective machine partitioning.
David has fixed numa=fake (it was badly busted) and has been experimenting
with a 3GB machine sliced into 64 "nodes". So he can build containers
whose memory allocation is variable in 40-odd-megabyte hunks.
Testing looks promising: a group of processes in container A remains
constrained to its allocation - if it gets too fat it starts getting
reclaimed or swapped.
I _think_ it goes all the way up to getting oom-killed (David?). The
oom-killer appears to be doing the right thing - we don't want it to be
killing processes which aren't inside the offending container.
> Perhaps its time I learned why it is valuable. Can someone
> explain it to me, and describe a bit the situations in which
> it is useful. Seems like NUMA mechanisms are being (ab)used
> for micro-partitioning memory.
yup. afaict the only problem which has been encountered with this is that
search complexity in the page allocator.
> As Andrew speculates, this could lead to reconsidering and
> fancifying up some of the mechanisms, to cover a wider range
> of situations efficiently.
Yes. Speeding up get_page_from_freelist() is less than totally trivial.
I've been putting off thinking about it until we're pretty sure that there
aren't any other showstoppers.
I'm (very) impressed at how well the infrastructre which you and Christoph
have put together has held up under this.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2006-09-15 8:06 UTC|newest]
Thread overview: 82+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-09-13 23:50 Christoph Lameter
2006-09-15 5:00 ` Andrew Morton
2006-09-15 6:49 ` Paul Jackson
2006-09-15 7:23 ` Andrew Morton
2006-09-15 7:44 ` Paul Jackson
2006-09-15 8:06 ` Andrew Morton [this message]
2006-09-15 15:53 ` David Rientjes
2006-09-15 23:03 ` David Rientjes
2006-09-16 0:04 ` Paul Jackson
2006-09-16 1:36 ` Andrew Morton
2006-09-16 2:23 ` Christoph Lameter
2006-09-16 4:34 ` Andrew Morton
2006-09-16 3:28 ` [PATCH] Add node to zone for the NUMA case Christoph Lameter
2006-09-16 3:40 ` Paul Jackson
2006-09-16 3:45 ` [PATCH] GFP_THISNODE for the slab allocator Paul Jackson
2006-09-16 2:47 ` Christoph Lameter
2006-09-17 3:45 ` David Rientjes
2006-09-17 11:17 ` Paul Jackson
2006-09-17 12:41 ` Christoph Lameter
2006-09-17 13:03 ` Paul Jackson
2006-09-17 20:36 ` David Rientjes
2006-09-17 21:20 ` Paul Jackson
2006-09-17 22:27 ` Paul Jackson
2006-09-17 23:49 ` David Rientjes
2006-09-18 2:20 ` Paul Jackson
2006-09-18 16:34 ` Paul Jackson
2006-09-18 17:49 ` David Rientjes
2006-09-18 20:46 ` Paul Jackson
2006-09-19 20:52 ` David Rientjes
2006-09-19 21:26 ` Christoph Lameter
2006-09-19 21:50 ` David Rientjes
2006-09-21 22:11 ` David Rientjes
2006-09-22 10:10 ` Nick Piggin
2006-09-22 16:26 ` Paul Jackson
2006-09-22 16:36 ` Christoph Lameter
2006-09-15 8:28 ` Andrew Morton
2006-09-16 3:38 ` Paul Jackson
2006-09-16 4:42 ` Andi Kleen
2006-09-16 11:38 ` Paul Jackson
2006-09-16 4:48 ` Andrew Morton
2006-09-16 11:30 ` Paul Jackson
2006-09-16 15:18 ` Andrew Morton
2006-09-17 9:28 ` Paul Jackson
2006-09-17 9:51 ` Nick Piggin
2006-09-17 11:15 ` Paul Jackson
2006-09-17 12:44 ` Nick Piggin
2006-09-17 13:19 ` Paul Jackson
2006-09-17 13:52 ` Nick Piggin
2006-09-17 21:19 ` Paul Jackson
2006-09-18 12:44 ` [PATCH] mm: exempt pcp alloc from watermarks Peter Zijlstra
2006-09-18 20:20 ` Christoph Lameter
2006-09-18 20:43 ` Peter Zijlstra
2006-09-19 14:35 ` Nick Piggin
2006-09-19 14:44 ` Christoph Lameter
2006-09-19 15:02 ` Nick Piggin
2006-09-19 14:51 ` Peter Zijlstra
2006-09-19 15:10 ` Nick Piggin
2006-09-19 15:05 ` Peter Zijlstra
2006-09-19 15:39 ` Christoph Lameter
2006-09-17 16:29 ` [PATCH] GFP_THISNODE for the slab allocator Andrew Morton
2006-09-18 2:11 ` Paul Jackson
2006-09-18 5:09 ` Andrew Morton
2006-09-18 7:49 ` Paul Jackson
2006-09-16 11:48 ` Paul Jackson
2006-09-16 15:38 ` Andrew Morton
2006-09-16 21:51 ` Paul Jackson
2006-09-16 23:10 ` Andrew Morton
2006-09-17 4:37 ` Christoph Lameter
2006-09-17 4:55 ` Andrew Morton
2006-09-17 12:09 ` Paul Jackson
2006-09-17 12:36 ` Christoph Lameter
2006-09-17 13:06 ` Paul Jackson
2006-09-19 19:17 ` David Rientjes
2006-09-19 19:19 ` David Rientjes
2006-09-19 19:31 ` Christoph Lameter
2006-09-19 21:12 ` David Rientjes
2006-09-19 21:28 ` Christoph Lameter
2006-09-19 21:53 ` Paul Jackson
2006-09-15 17:08 ` Christoph Lameter
2006-09-15 17:37 ` [PATCH] Add NUMA_BUILD definition in kernel.h to avoid #ifdef CONFIG_NUMA Christoph Lameter
2006-09-15 17:38 ` [PATCH] Disable GFP_THISNODE in the non-NUMA case Christoph Lameter
2006-09-15 17:42 ` [PATCH] GFP_THISNODE for the slab allocator V2 Christoph Lameter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060915010622.0e3539d2.akpm@osdl.org \
--to=akpm@osdl.org \
--cc=clameter@sgi.com \
--cc=linux-mm@kvack.org \
--cc=pj@sgi.com \
--cc=rientjes@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox