From: mel@skynet.ie (Mel Gorman)
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Christoph Lameter <clameter@sgi.com>,
Andrew Morton <akpm@osdl.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
linux-mm@kvack.org
Subject: Re: Page allocator: Single Zone optimizations
Date: Wed, 1 Nov 2006 18:26:05 +0000
Message-ID: <20061101182605.GC27386@skynet.ie>
In-Reply-To: <4544914F.3000502@yahoo.com.au>
On (29/10/06 22:32), Nick Piggin didst pronounce:
> Christoph Lameter wrote:
> >On Sat, 28 Oct 2006, Andrew Morton wrote:
> >
> >
> >>>We (and I personally with the prezeroing patches) have been down
> >>>this road several times and did not like what we saw.
> >>
> >>Details?
> >
> >
> >The most important issues that come to my mind right now (this has
> >been discussed frequently in various contexts so I may be missing
> >some things) are:
> >
> >1. Duplicate the caches (pageset structures). This reduces cache hit
> > rates. Duplicates lots of information in the page allocator.
>
> You would have to do the same thing to get an O(1) per-CPU allocation
> for a specific zone/reclaim type/etc regardless whether or not you use
> zones.
>
> >2. Necessity of additional load balancing across multiple zones.
>
> a. we have to do this anyway for eg. dma32 and NUMA, and b. it is much
> better than the highmem problem was because all the memory is kernel
> addressable.
>
> If you use another scheme (eg. lists within zones within nodes, rather
> than just more zones within nodes), then you still fundamentally have
> to balance somehow.
>
> >3. The NUMA layer can only support memory policies for a single zone.
>
> That's broken. The VM had zones long before it had nodes or memory
> policies.
>
> >4. You may have to duplicate the slab allocator caches for that
> > purpose.
>
> If you want specific allocations from a given zone, yes. So you may
> have to do the same if you want a specific slab allocation from a
> list within a zone.
>
> >5. More bits used in the page flags.
>
> Aren't there patches to move the bits out of the page flags? A list
> within zones approach would have to use either page flags or some
> external info (eg. page pfn) to determine what list for the page to
> go back to anyway, wouldn't you?
>
> >6. ZONES have to be sized at bootup which creates more dangers of running
> > out of memory, possibly requiring more complex load balancing.
>
> Mel's list based defrag approach requires complex load balancing too.
>
I never really got this objection. With list-based anti-frag, the
zone-balancing logic remains the same. There are patches from Andy
Whitcroft that reclaim pages in contiguous blocks, but still with the same
zone-ordering. It doesn't affect load balancing between zones as such.

With zone-based anti-fragmentation, the load balancing was a bit more
entertaining all right.

In the context of memory hot-unplug though, list-based anti-fragmentation
only really helps you if you can unplug regions of size MAX_ORDER_NR_PAGES. If
you go over that, you need zones.
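
To put a rough number on that limit, here is a minimal sketch. It assumes
the common defaults of MAX_ORDER = 11 and 4 KiB pages, neither of which is
stated in this thread. SKETCH_PAGE_SIZE, max_order_block_bytes() and
lists_are_enough() are hypothetical names used only for illustration.

```c
/* Assumed defaults, not taken from the thread: MAX_ORDER of 11, 4 KiB pages. */
#define MAX_ORDER           11
#define MAX_ORDER_NR_PAGES  (1UL << (MAX_ORDER - 1))
#define SKETCH_PAGE_SIZE    4096UL   /* hypothetical stand-in for the page size */

/* Size in bytes of the largest contiguous block the free lists track. */
static unsigned long max_order_block_bytes(void)
{
	return MAX_ORDER_NR_PAGES * SKETCH_PAGE_SIZE;
}

/*
 * Hypothetical helper: list-based anti-fragmentation only helps when the
 * region to unplug is no larger than one MAX_ORDER block.
 */
static int lists_are_enough(unsigned long unplug_pages)
{
	return unplug_pages <= MAX_ORDER_NR_PAGES;
}
```

Under those assumptions a MAX_ORDER block is 1024 pages, or 4 MiB, so any
unplug unit larger than that falls into the "you need zones" case.
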
> >>Again. On the whole, that was a pretty useless email. Please give us
> >>something we can use.
> >
> >
> >Well review the discussions that we had regarding Mel Gorman's defrag
> >approaches. We discussed this in detail at the VM summit and decided to
> >not create additional zones but instead separate the free lists. You and
> >Linus seemed to be in agreement with this. I am a bit surprised ....
> >Is this a Google effect?
> >
> >Moreover the discussion here is only remotely connected to the issue at
> >hand. We all agree that ZONE_DMA is bad and we want to have an alternate
> >scheme. Why not continue making it possible to not compile ZONE_DMA
> >dependent code into the kernel?
> >
> >Single zone patches would increase VM performance. That would in turn
> >make it more difficult to get approaches in that require multiple zones
> >since the performance drop would be more significant.
>
> node->zone->many lists vs node->many zones? I guess the zones approach is
> faster?
>
Not really. If I have a zone with two sets of free lists or two zones with
one set of free lists each, there are the same number of lists. However, for
anti-fragmentation with additional lists, you frequently use the preferred list
because the lists size themselves based on allocator usage patterns. With zones,
you *must* get the zone sizes right or the performance hit for zone
fallbacks starts becoming noticeable.
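
A toy model of that difference, as a sketch only: it is not the kernel's
allocator, and zone_fallbacks(), list_fallbacks() and NR_TYPES are
hypothetical names. It assumes two allocation types, zones with fixed
sizes, and lists that claim whole blocks from a shared pool as demand
arrives.

```c
enum { NR_TYPES = 2 };

/*
 * Fixed-size zones: any demand beyond a zone's size has to fall back to
 * another zone. Returns the number of pages that fell back.
 */
static unsigned long zone_fallbacks(const unsigned long zone_size[NR_TYPES],
				    const unsigned long demand[NR_TYPES])
{
	unsigned long fallbacks = 0;

	for (int t = 0; t < NR_TYPES; t++)
		if (demand[t] > zone_size[t])
			fallbacks += demand[t] - zone_size[t];
	return fallbacks;
}

/*
 * Lists within one zone: each list claims whole blocks from a shared pool
 * as it needs them, so its size follows usage. Returns the number of pages
 * that could not be served from the preferred list.
 */
static unsigned long list_fallbacks(unsigned long total_blocks,
				    unsigned long pages_per_block,
				    const unsigned long demand[NR_TYPES])
{
	unsigned long free_blocks = total_blocks;
	unsigned long fallbacks = 0;

	for (int t = 0; t < NR_TYPES; t++) {
		/* Blocks needed, rounded up to whole blocks. */
		unsigned long needed =
			(demand[t] + pages_per_block - 1) / pages_per_block;

		if (needed <= free_blocks) {
			free_blocks -= needed;
		} else {
			fallbacks += demand[t] - free_blocks * pages_per_block;
			free_blocks = 0;
		}
	}
	return fallbacks;
}
```

With 1024 pages split into two 512-page zones and a demand of 700 and 300
pages, the zone model falls back for 188 pages. The list model with sixteen
64-page blocks serves the same demand with no fallback. Sizing the zones as
704 and 320 pages would also avoid the fallbacks, which is the "get the zone
sizes right" condition.
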
> Not that I am any more convinced that defragmentation is a good idea than
> I was a year ago, but I think it is naive to think we can instantly be rid
> of all the problems associated with zones by degenerating that layer of the
> VM and introducing a new one that does basically the same things.
>
> It is true that zones may not be a perfect fit for what some people want to
> do, but until they have shown a) what they want to do is a good idea, and
> b) zones can't easily be adapted, then using the infrastructure we already
> have throughout the entire mm seems like a good idea.
>
> IMO, Andrew's idea to have 1..N zones in a node seems sane and it would be
> a good generalisation of even the present code.
>
> --
> SUSE Labs, Novell Inc.
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab