From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Christoph Lameter <clameter@sgi.com>
Cc: Andrew Morton <akpm@osdl.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
linux-mm@kvack.org
Subject: Re: Page allocator: Single Zone optimizations
Date: Sun, 29 Oct 2006 22:32:31 +1100
Message-ID: <4544914F.3000502@yahoo.com.au>
In-Reply-To: <Pine.LNX.4.64.0610281805280.14100@schroedinger.engr.sgi.com>
Christoph Lameter wrote:
> On Sat, 28 Oct 2006, Andrew Morton wrote:
>
>
>>>We (and I personally with the prezeroing patches) have been down
>>>this road several times and did not like what we saw.
>>
>>Details?
>
>
> The most important issues that come to my mind right now (this has
> been discussed frequently in various contexts so I may be missing
> some things) are:
>
> 1. Duplicate the caches (pageset structures). This reduces cache hit
> rates. Duplicates lots of information in the page allocator.
You would have to do the same thing to get an O(1) per-CPU allocation
for a specific zone/reclaim type/etc, regardless of whether or not you
use zones.
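
As a rough illustration of the point above, here is a minimal userspace
sketch, not kernel code. All names (pcp_cache_sketch, pcp_alloc, pcp_free,
the array sizes) are made up for the example. The assumption is that an
O(1) per-CPU fast path needs one small cache per (CPU, kind) pair, whether
"kind" means a zone or a list inside a zone.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SKETCH  4   /* hypothetical CPU count */
#define NR_KINDS_SKETCH 3   /* hypothetical: zones, reclaim types, ... */
#define CACHE_DEPTH     8   /* hypothetical cache size in pages */

/* A tiny LIFO cache of page frame numbers. */
struct pcp_cache_sketch {
    int count;
    unsigned long pfn[CACHE_DEPTH];
};

/* One cache per (cpu, kind): the duplication exists in either scheme. */
static struct pcp_cache_sketch pcp[NR_CPUS_SKETCH][NR_KINDS_SKETCH];

/* Push a page onto the cache for (cpu, kind); -1 if the cache is full. */
static int pcp_free(int cpu, int kind, unsigned long pfn)
{
    struct pcp_cache_sketch *c = &pcp[cpu][kind];

    if (c->count == CACHE_DEPTH)
        return -1;
    c->pfn[c->count++] = pfn;
    return 0;
}

/* O(1) pop from the cache for (cpu, kind); -1 if the cache is empty. */
static int pcp_alloc(int cpu, int kind, unsigned long *pfn)
{
    struct pcp_cache_sketch *c = &pcp[cpu][kind];

    if (c->count == 0)
        return -1;
    *pfn = c->pfn[--c->count];
    return 0;
}
```

The index "kind" is the only thing that changes meaning between the two
designs; the number of caches does not shrink by renaming it.
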
> 2. Necessity of additional load balancing across multiple zones.
(a) We have to do this anyway for e.g. dma32 and NUMA, and (b) it is much
better than the highmem problem was, because all the memory is kernel
addressable.

If you use another scheme (e.g. lists within zones within nodes, rather
than just more zones within nodes), then you still fundamentally have
to balance somehow.
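
A small userspace sketch of why the balancing does not go away, under
simplifying assumptions: the structures and the wrap-around fallback below
are invented for illustration and are not the kernel's zonelist logic. Both
layouts end up calling the same "pick a pool that is above its low mark"
routine.

```c
#include <assert.h>

#define MAX_POOLS 4   /* hypothetical upper bound on zones or lists */

/* A pool of free pages with a low watermark. */
struct pool_sketch {
    unsigned long nr_free;
    unsigned long low_mark;
};

/* Layout 1: more zones within a node. */
struct node_zones_sketch {
    struct pool_sketch zones[MAX_POOLS];
    int nr_zones;
};

/* Layout 2: lists within a zone within a node. */
struct zone_lists_sketch {
    struct pool_sketch lists[MAX_POOLS];
    int nr_lists;
};
struct node_lists_sketch {
    struct zone_lists_sketch zone;
};

/*
 * Return the first pool, starting at 'preferred' and wrapping around,
 * whose free count is above its low mark, or -1 if there is none.
 */
static int pick_pool(const struct pool_sketch *pools, int nr, int preferred)
{
    for (int i = 0; i < nr; i++) {
        int idx = (preferred + i) % nr;

        if (pools[idx].nr_free > pools[idx].low_mark)
            return idx;
    }
    return -1;
}

/* The same decision, reached through either layout. */
static int pick_zone(const struct node_zones_sketch *n, int preferred)
{
    return pick_pool(n->zones, n->nr_zones, preferred);
}

static int pick_list(const struct node_lists_sketch *n, int preferred)
{
    return pick_pool(n->zone.lists, n->zone.nr_lists, preferred);
}
```

The extra level of indirection in layout 2 changes where the pools live,
not whether something has to choose between them.
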
> 3. The NUMA layer can only support memory policies for a single zone.
That's broken. The VM had zones long before it had nodes or memory
policies.
> 4. You may have to duplicate the slab allocator caches for that
> purpose.
If you want specific allocations from a given zone, yes. So you may
have to do the same if you want a specific slab allocation from a
list within a zone.
> 5. More bits used in the page flags.
Aren't there patches to move the bits out of the page flags? A
lists-within-zones approach would have to use either page flags or some
external info (e.g. page pfn) to determine which list the page goes
back to anyway, wouldn't it?
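
To make the two options concrete, here is a hedged userspace sketch. The
bit layout, block size and all names (page_sketch, LIST_BITS, block_list,
and so on) are assumptions for illustration only and do not match any real
kernel layout. Option 1 spends bits in a per-page flags word; option 2
keeps the flags word untouched and looks the list up from the pfn in an
external table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Option 1: the top LIST_BITS bits of a flags word name the free list. */
#define LIST_BITS   2u
#define LIST_SHIFT  (sizeof(unsigned long) * 8 - LIST_BITS)
#define LIST_MASK   ((1ul << LIST_BITS) - 1)

struct page_sketch {
    unsigned long flags;
};

static inline void set_list_in_flags(struct page_sketch *p, unsigned list)
{
    assert(list <= LIST_MASK);
    p->flags &= ~(LIST_MASK << LIST_SHIFT);         /* clear the old list id */
    p->flags |= (unsigned long)list << LIST_SHIFT;  /* store the new one */
}

static inline unsigned list_from_flags(const struct page_sketch *p)
{
    return (unsigned)((p->flags >> LIST_SHIFT) & LIST_MASK);
}

/* Option 2: an external table with one entry per block of pages. */
#define BLOCK_ORDER 10u   /* hypothetical: 1024 pages per block */
#define NR_BLOCKS   64u   /* hypothetical amount of memory covered */

static uint8_t block_list[NR_BLOCKS];

static inline void set_list_for_pfn(unsigned long pfn, unsigned list)
{
    unsigned long block = pfn >> BLOCK_ORDER;

    assert(block < NR_BLOCKS);
    block_list[block] = (uint8_t)list;
}

static inline unsigned list_from_pfn(unsigned long pfn)
{
    unsigned long block = pfn >> BLOCK_ORDER;

    assert(block < NR_BLOCKS);
    return block_list[block];
}
```

Either way the free path has to consult some per-page or per-block state,
which is the same cost a zone lookup pays today.
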
> 6. ZONES have to be sized at bootup which creates more dangers of running
> out of memory, possibly requiring more complex load balancing.
Mel's list based defrag approach requires complex load balancing too.
>>Again. On the whole, that was a pretty useless email. Please give us
>>something we can use.
>
>
> Well review the discussions that we had regarding Mel Gorman's defrag
> approaches. We discussed this in detail at the VM summit and decided to
> not create additional zones but instead separate the free lists. You and
> Linus seemed to be in agreement with this. I am a bit surprised ....
> Is this a Google effect?
>
> Moreover the discussion here is only remotely connected to the issue at
> hand. We all agree that ZONE_DMA is bad and we want to have an alternate
> scheme. Why not continue making it possible to not compile ZONE_DMA
> dependent code into the kernel?
>
> Single zone patches would increase VM performance. That would in turn
> make it more difficult to get approaches in that require multiple zones
> since the performance drop would be more significant.
node->zone->many lists vs node->many zones? I guess the zones approach is
faster?

Not that I am any more convinced that defragmentation is a good idea than
I was a year ago, but I think it is naive to think we can instantly be rid
of all the problems associated with zones by degenerating that layer of the
VM and introducing a new one that does basically the same things.

It is true that zones may not be a perfect fit for what some people want to
do, but until they have shown a) what they want to do is a good idea, and
b) zones can't easily be adapted, then using the infrastructure we already
have throughout the entire mm seems like a good idea.

IMO, Andrew's idea to have 1..N zones in a node seems sane and it would be
a good generalisation of even the present code.
--
SUSE Labs, Novell Inc.