From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Dave Hansen <haveblue@us.ibm.com>, Ingo Molnar <mingo@elte.hu>,
"Martin J. Bligh" <mbligh@mbligh.org>,
Andrew Morton <akpm@osdl.org>,
kravetz@us.ibm.com, linux-mm <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
lhms <lhms-devel@lists.sourceforge.net>
Subject: Re: [Lhms-devel] [PATCH 0/7] Fragmentation Avoidance V19 - Summary
Date: Thu, 03 Nov 2005 14:14:08 +1100 [thread overview]
Message-ID: <43698080.3040800@yahoo.com.au> (raw)
In-Reply-To: <Pine.LNX.4.58.0511021231440.5235@skynet>
Mel Gorman wrote:
>
> Ok. To me, the rest of the thread are beating around the same points and
> no one is giving ground. The points are made so lets summarise. Apologies
> if anything is missing.
>
Thanks for attempting a summary of a difficult topic. I have a couple
of suggestions.
> Who cares
> =========
> Physical hotplug remove: Vendors of the hardware that support this -
> Fujitsu, HP (I think), IBM etc
>
> Virtualization hotplug remove: Sellers of virtualization software, some
> hardware like any IBM machine that lists LPAR in it's list of
> features. Probably software solutions like Xen are also affected
> if they want to be able to grow and shrink the virtual machines on
> demand
>
Ingo said that Xen is fine with per page granular freeing - this covers
embedded, desktop and small server users of VMs into the future I'd say.
> High order allocations: Ultimately, hugepage users. Today, that is a
> feature only big server users like Oracle care about. In the
> future I reckon applications will be able to use them for things
> like backing the heap by huge pages. Other users like GigE,
> loopback devices with large MTUs, some filesystem like CIFS are
> all interested although they are also been told use use smaller
> pages.
>
I think that saying its now OK to use higher order allocations is wrong
because as I said even with your patches they are going to run into
problems.
Actually I think one reason your patches may perform so well is because
there aren't actually a lot of higher order allocations in the kernel.
I think that probably leaves us realistically with demand hugepages,
hot unplug memory, and IBM lpars?
> Pros/Cons of Solutions
> ======================
>
> Anti-defrag Pros
> o Aim9 shows no significant regressions (.37% on page_test). On some
> tests, it shows performance gains (> 5% on fork_test)
> o Stress tests show that it manages to keep fragmentation down to a far
> lower level even without teaching kswapd how to linear reclaim
This sounds like a kind of funny test to me if nobody is actually
using higher order allocations.
When a higher order allocation is attempted, either you will satisfy
it from the kernel region, in which case the vanilla kernel would
have done the same. Or you satisfy it from an easy-reclaim contiguous
region, in which case it is no longer an easy-reclaim contiguous
region.
> o Stress tests with a linear reclaim experimental patch shows that it
> can successfully find large contiguous chunks of memory
> o It is known to help hotplug on PPC64
> o No tunables. The approach tries to manage itself as much as possible
But it has more dreaded heuristics :P
> o It exists, heavily tested, and synced against the latest -mm1
> o Can be compiled away be redefining the RCLM_* macros and the
> __GFP_*RCLM flags
>
> Anti-defrag Cons
> o More complexity within the page allocator
> o Adds a new layer onto the allocator that effectively creates subzones
> o Adding a new concept that maintainers have to work with
> o Depending on the workload, it fragments anyway
>
> New Zone Pros
> o Zones are a well known and understood concept
> o For people that do not care about hotplug, they can easily get rid of it
> o Provides reliable areas of contiguous groups that can be freed for
> HugeTLB pages going to userspace
> o Uses existing zone infrastructure for balancing
>
> New Zone Cons
> o Zones historically have introduced balancing problems
> o Been tried for hotplug and dropped because of being awkward to work with
> o It only helps hotplug and potentially HugeTLB pages for userspace
> o Tunable required. If you get it wrong, the system suffers a lot
Pro: it keeps IBM mainframe and pseries sysadmins in a job ;) Let
them get it right.
> o Needs to be planned for and developed
>
Yasunori Goto had patches around from last year. Not sure what sort
of shape they're in now but I'd think most of the hard work is done.
> Scenarios
> =========
>
> Lets outline some situations then or workloads that can occur
>
> 1. Heavy job running that consumes 75% of physical memory. Like a kernel
> build
>
> Anti-defrag: It will not fragment as it will never have to fallback.High
> order allocations will be possible in the remaining 25%.
> Zone-based: After been tuned to a kernel build load, it will not
> fragment. Get the tuning wrong, performance suffers or workload
> fails. High order allocations will be possible in the remaining 25%.
>
You don't need to continually tune things for each and every possible
workload under the sun. It is like how we currently drive 16GB highmem
systems quite nicely under most workloads with 1GB of normal memory.
Make that an 8:1 ratio if you're worried.
[snip]
>
> I've tried to be as objective as possible with the summary.
>
>>From the points above though, I think that anti-defrag gets us a lot of
> the way, with the complexity isolated in one place. It's downside is that
> it can still break down and future work is needed to stop it degrading
> (kswapd cleaning UserRclm areas and page migration when we get really
> stuck). Zone-based is more reliable but only addresses a limited
> situation, principally hotplug and it does not even go 100% of the way for
> hotplug.
To me it seems like it solves the hotplug, lpar hotplug, and hugepages
problems which seem to be the main ones.
> It also depends on a tunable which is not cool and it is static.
I think it is very cool because it means the tiny minority of Linux
users who want this can do so without impacting the rest of the code
or users. This is how Linux has been traditionally run and I still
have a tiny bit of faith left :)
> If we make the zones growable+shrinkable, we run into all the same
> problems that anti-defrag has today.
>
But we don't have the extra zones layer that anti defrag has today.
And anti defrag needs limits if it is to be reliable anyway.
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2005-11-03 3:14 UTC|newest]
Thread overview: 185+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-10-30 18:33 [PATCH 0/7] Fragmentation Avoidance V19 Mel Gorman
2005-10-30 18:34 ` [PATCH 1/7] Fragmentation Avoidance V19: 001_antidefrag_flags Mel Gorman
2005-10-30 18:34 ` [PATCH 2/7] Fragmentation Avoidance V19: 002_usemap Mel Gorman
2005-10-30 18:34 ` [PATCH 3/7] Fragmentation Avoidance V19: 003_fragcore Mel Gorman
2005-10-30 18:34 ` [PATCH 4/7] Fragmentation Avoidance V19: 004_fallback Mel Gorman
2005-10-30 18:34 ` [PATCH 5/7] Fragmentation Avoidance V19: 005_largealloc_tryharder Mel Gorman
2005-10-30 18:34 ` [PATCH 6/7] Fragmentation Avoidance V19: 006_percpu Mel Gorman
2005-10-30 18:34 ` [PATCH 7/7] Fragmentation Avoidance V19: 007_stats Mel Gorman
2005-10-31 5:57 ` [Lhms-devel] [PATCH 0/7] Fragmentation Avoidance V19 Mike Kravetz
2005-10-31 6:37 ` Nick Piggin
2005-10-31 7:54 ` Andrew Morton
2005-10-31 7:11 ` Nick Piggin
2005-10-31 16:19 ` Mel Gorman
2005-10-31 23:54 ` Nick Piggin
2005-11-01 1:28 ` Mel Gorman
2005-11-01 1:42 ` Nick Piggin
2005-10-31 14:34 ` Martin J. Bligh
2005-10-31 19:24 ` Andrew Morton
2005-10-31 19:40 ` Martin J. Bligh
2005-10-31 23:59 ` Nick Piggin
2005-11-01 1:36 ` Mel Gorman
2005-10-31 23:29 ` Nick Piggin
2005-11-01 0:59 ` Mel Gorman
2005-11-01 1:31 ` Nick Piggin
2005-11-01 2:07 ` Mel Gorman
2005-11-01 2:35 ` Nick Piggin
2005-11-01 11:57 ` Mel Gorman
2005-11-01 13:56 ` Ingo Molnar
2005-11-01 14:10 ` Dave Hansen
2005-11-01 14:29 ` Ingo Molnar
2005-11-01 14:49 ` Dave Hansen
2005-11-01 15:01 ` Ingo Molnar
2005-11-01 15:22 ` Dave Hansen
2005-11-02 8:49 ` Ingo Molnar
2005-11-02 9:02 ` Nick Piggin
2005-11-02 9:17 ` Ingo Molnar
2005-11-02 9:32 ` Dave Hansen
2005-11-02 9:48 ` Nick Piggin
2005-11-02 10:54 ` Dave Hansen
2005-11-02 15:02 ` Martin J. Bligh
2005-11-03 3:21 ` Nick Piggin
2005-11-03 15:36 ` Martin J. Bligh
2005-11-03 15:40 ` Arjan van de Ven
2005-11-03 15:51 ` Linus Torvalds
2005-11-03 15:57 ` Martin J. Bligh
2005-11-03 16:20 ` Arjan van de Ven
2005-11-03 16:27 ` Mel Gorman
2005-11-03 16:46 ` Linus Torvalds
2005-11-03 16:52 ` Martin J. Bligh
2005-11-03 17:19 ` Linus Torvalds
2005-11-03 17:48 ` Dave Hansen
2005-11-03 17:51 ` Martin J. Bligh
2005-11-03 17:59 ` Arjan van de Ven
2005-11-03 18:08 ` Linus Torvalds
2005-11-03 18:17 ` Martin J. Bligh
2005-11-03 18:44 ` Linus Torvalds
2005-11-03 18:51 ` Martin J. Bligh
2005-11-03 19:35 ` Linus Torvalds
2005-11-03 22:40 ` Martin J. Bligh
2005-11-03 22:56 ` Linus Torvalds
2005-11-03 23:01 ` Martin J. Bligh
2005-11-04 0:58 ` Nick Piggin
2005-11-04 1:06 ` Linus Torvalds
2005-11-04 1:20 ` Paul Mackerras
2005-11-04 1:22 ` Nick Piggin
2005-11-04 1:48 ` Mel Gorman
2005-11-04 1:59 ` Nick Piggin
2005-11-04 2:35 ` Mel Gorman
2005-11-04 1:26 ` Mel Gorman
2005-11-03 21:11 ` Mel Gorman
2005-11-03 18:03 ` Linus Torvalds
2005-11-03 20:00 ` Paul Jackson
2005-11-03 20:46 ` Mel Gorman
2005-11-03 18:48 ` Martin J. Bligh
2005-11-03 19:08 ` Linus Torvalds
2005-11-03 22:37 ` Martin J. Bligh
2005-11-03 23:16 ` Linus Torvalds
2005-11-03 23:39 ` Martin J. Bligh
2005-11-04 0:42 ` Nick Piggin
2005-11-04 4:39 ` Andrew Morton
2005-11-04 16:22 ` Mel Gorman
2005-11-03 15:53 ` Martin J. Bligh
2005-11-02 14:57 ` Martin J. Bligh
2005-11-01 16:48 ` Kamezawa Hiroyuki
2005-11-01 16:59 ` Kamezawa Hiroyuki
2005-11-01 17:19 ` Mel Gorman
2005-11-02 0:32 ` KAMEZAWA Hiroyuki
2005-11-02 11:22 ` Mel Gorman
2005-11-01 18:06 ` linux-os (Dick Johnson)
2005-11-02 7:19 ` Ingo Molnar
2005-11-02 7:46 ` Gerrit Huizenga
2005-11-02 8:50 ` Nick Piggin
2005-11-02 9:12 ` Gerrit Huizenga
2005-11-02 9:37 ` Nick Piggin
2005-11-02 10:17 ` Gerrit Huizenga
2005-11-02 23:47 ` Rob Landley
2005-11-03 4:43 ` Nick Piggin
2005-11-03 6:07 ` Rob Landley
2005-11-03 7:34 ` Nick Piggin
2005-11-03 17:54 ` Rob Landley
2005-11-03 20:13 ` Jeff Dike
2005-11-03 16:35 ` Jeff Dike
2005-11-03 16:23 ` Badari Pulavarty
2005-11-03 18:27 ` Jeff Dike
2005-11-03 18:49 ` Rob Landley
2005-11-04 4:52 ` Andrew Morton
2005-11-04 5:35 ` Paul Jackson
2005-11-04 5:48 ` Andrew Morton
2005-11-04 6:42 ` Paul Jackson
2005-11-04 7:10 ` Andrew Morton
2005-11-04 7:45 ` Paul Jackson
2005-11-04 8:02 ` Andrew Morton
2005-11-04 9:52 ` Paul Jackson
2005-11-04 15:27 ` Martin J. Bligh
2005-11-04 15:19 ` Martin J. Bligh
2005-11-04 17:38 ` Andrew Morton
2005-11-04 6:16 ` Bron Nelson
2005-11-04 7:26 ` [patch] swapin rlimit Ingo Molnar
2005-11-04 7:36 ` Andrew Morton
2005-11-04 8:07 ` Ingo Molnar
2005-11-04 10:06 ` Paul Jackson
2005-11-04 15:24 ` Martin J. Bligh
2005-11-04 8:18 ` Arjan van de Ven
2005-11-04 10:04 ` Paul Jackson
2005-11-04 15:14 ` Rob Landley
2005-11-04 10:14 ` Bernd Petrovitsch
2005-11-04 10:21 ` Ingo Molnar
2005-11-04 11:17 ` Bernd Petrovitsch
2005-11-02 10:41 ` [Lhms-devel] [PATCH 0/7] Fragmentation Avoidance V19 Ingo Molnar
2005-11-02 11:04 ` Gerrit Huizenga
2005-11-02 12:00 ` Ingo Molnar
2005-11-02 12:42 ` Dave Hansen
2005-11-02 15:02 ` Gerrit Huizenga
2005-11-03 0:10 ` Rob Landley
2005-11-02 7:57 ` Nick Piggin
2005-11-02 0:51 ` Nick Piggin
2005-11-02 7:42 ` Dave Hansen
2005-11-02 8:24 ` Nick Piggin
2005-11-02 8:33 ` Yasunori Goto
2005-11-02 8:43 ` Nick Piggin
2005-11-02 14:51 ` Martin J. Bligh
2005-11-02 23:28 ` Rob Landley
2005-11-03 5:26 ` Jeff Dike
2005-11-03 5:41 ` Rob Landley
2005-11-04 3:26 ` [uml-devel] " Blaisorblade
2005-11-04 15:50 ` Rob Landley
2005-11-04 17:18 ` Blaisorblade
2005-11-04 17:44 ` Rob Landley
2005-11-02 12:38 ` [Lhms-devel] [PATCH 0/7] Fragmentation Avoidance V19 - Summary Mel Gorman
2005-11-03 3:14 ` Nick Piggin [this message]
2005-11-03 12:19 ` Mel Gorman
2005-11-10 18:47 ` Steve Lord
2005-11-03 15:34 ` Martin J. Bligh
2005-11-01 14:41 ` [Lhms-devel] [PATCH 0/7] Fragmentation Avoidance V19 Mel Gorman
2005-11-01 14:46 ` Ingo Molnar
2005-11-01 15:23 ` Mel Gorman
2005-11-01 18:33 ` Rob Landley
2005-11-01 19:02 ` Ingo Molnar
2005-11-01 14:50 ` Dave Hansen
2005-11-01 15:24 ` Mel Gorman
2005-11-02 5:11 ` Andrew Morton
2005-11-01 18:23 ` Rob Landley
2005-11-01 20:31 ` Joel Schopp
2005-11-01 20:59 ` Joel Schopp
2005-11-02 1:06 ` Nick Piggin
2005-11-02 1:41 ` Martin J. Bligh
2005-11-02 2:03 ` Nick Piggin
2005-11-02 2:24 ` Martin J. Bligh
2005-11-02 2:49 ` Nick Piggin
2005-11-02 4:39 ` Martin J. Bligh
2005-11-02 5:09 ` Nick Piggin
2005-11-02 5:14 ` Martin J. Bligh
2005-11-02 6:23 ` KAMEZAWA Hiroyuki
2005-11-02 10:15 ` Nick Piggin
2005-11-02 7:19 ` Yasunori Goto
2005-11-02 11:48 ` Mel Gorman
2005-11-02 11:41 ` Mel Gorman
2005-11-02 11:37 ` Mel Gorman
2005-11-02 15:11 ` Mel Gorman
2005-11-01 15:25 ` Martin J. Bligh
2005-11-01 15:33 ` Dave Hansen
2005-11-01 16:57 ` Mel Gorman
2005-11-01 17:00 ` Mel Gorman
2005-11-01 18:58 ` Rob Landley
2005-11-01 14:40 ` Avi Kivity
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=43698080.3040800@yahoo.com.au \
--to=nickpiggin@yahoo.com.au \
--cc=akpm@osdl.org \
--cc=haveblue@us.ibm.com \
--cc=kravetz@us.ibm.com \
--cc=lhms-devel@lists.sourceforge.net \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mbligh@mbligh.org \
--cc=mel@csn.ul.ie \
--cc=mingo@elte.hu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox