From: Christoph Lameter <clameter@sgi.com>
To: Mel Gorman <mel@skynet.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Andi Kleen <ak@suse.de>,
Lee.Schermerhorn@hp.com, kamezawa.hiroyu@jp.fujitsu.com,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Apply memory policies to top two highest zones when highest zone is ZONE_MOVABLE
Date: Mon, 6 Aug 2007 15:31:49 -0700 (PDT)
Message-ID: <Pine.LNX.4.64.0708061519420.4263@schroedinger.engr.sgi.com>
In-Reply-To: <20070806214812.GB6142@skynet.ie>

On Mon, 6 Aug 2007, Mel Gorman wrote:
> > So where do we stand on this? We made a mess of NUMA policies, and merging
> > "grouping pages by mobility" would fix that mess, only we're not sure that
> > we want to merge those and it's too late for 2.6.23 anyway?
> >
>
> Grouping pages by mobility would still apply policies only to
> ZONE_MOVABLE when it is configured. What grouping pages by mobility
> would relieve is much of the motivation to configure ZONE_MOVABLE at all
> for hugepages. The zone has such attributes as being useful to

Ultimately ZONE_MOVABLE can be removed. AFAIK ZONE_MOVABLE is a temporary
stepping stone to address concerns about defrag reliability. Somehow
the stepping stone got into .23 without the real thing.

An additional issue with the current ZONE_MOVABLE in .23 is that the
tentative association of ZONE_MOVABLE with HIGHMEM also makes it
impossible for SLUB to use large pages.

> There are patches in the works that change zonelists from having multiple
> zonelists to having only one zonelist per node that is filtered based
> on the allocation flags. The place this filtering happens is the same as what
> the "hack" is currently doing. The cost of filtering should be offset by the
> reduced size of the node structure and tests with kernbench, hackbench and
> tbench seem to confirm that. This will bring the hack into line with
> what we wanted with policies in the first place because things like MPOL_BIND
> will try nodes in node-local order instead of node-numeric order as it does
> currently.

I'd like to see that patch.
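
For illustration only, the one-zonelist-per-node idea quoted above might
be sketched roughly as follows. This is not the patch under discussion
(it is not shown in this thread), and it is not kernel code: struct
zone_ref, struct zonelist, next_allowed_zone() and the node bitmask are
hypothetical names made up for this sketch. The assumption is that each
node keeps a single list ordered nearest-node-first, and that the walker
skips entries that the allocation flags or the policy do not allow.

```c
#include <stddef.h>

/* Zone indices, lowest to highest. Names follow the kernel; values are illustrative. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, ZONE_MOVABLE, MAX_NR_ZONES };

/* Hypothetical zonelist entry: one (node, zone) pair. */
struct zone_ref {
    int node;
    enum zone_type zone;
};

/*
 * Hypothetical single zonelist for one node. Entries are assumed to be
 * ordered by node distance (nearest first), highest zone first per node.
 */
struct zonelist {
    size_t nr;
    struct zone_ref refs[16];
};

/*
 * Return the next entry at or after *pos that is usable, or NULL if the
 * list is exhausted. An entry is usable when its zone is not above
 * highest_zone (derived from the allocation flags) and its node bit is
 * set in allowed_nodes (derived from the policy). *pos is advanced past
 * every entry that was examined.
 */
const struct zone_ref *next_allowed_zone(const struct zonelist *zl, size_t *pos,
                                         enum zone_type highest_zone,
                                         unsigned long allowed_nodes)
{
    while (*pos < zl->nr) {
        const struct zone_ref *ref = &zl->refs[(*pos)++];

        if (ref->zone > highest_zone)
            continue;   /* allocation flags rule this zone out */
        if (!(allowed_nodes & (1UL << ref->node)))
            continue;   /* policy rules this node out */
        return ref;
    }
    return NULL;
}
```

Because the list is walked in distance order and only filtered, a bind
mask of nodes 0 and 3 seen from node 2 would yield node 3 before node 0,
which is the node-local ordering the quoted text asks for.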
> From there, we can eliminate policy_zone altogether by applying policies
> to all zones but forcing a situation where MPOL_BIND will always contain
> one node that GFP_KERNEL allocations can be satisfied from. For example,
> if I have a NUMAQ that only has ZONE_NORMAL on node 0 and a user tries to
> bind to nodes 2+3, they will really bind to nodes 0,2,3 so that GFP_KERNEL
> allocations on that process will not return NULL. Alternatively, we could
> have mbind return a failure if it doesn't include a node that can satisfy
> GFP_KERNEL allocations. Either of these options seems more sensible than
> sometimes applying policies and other times not applying them.

We would still need to check on which nodes which zones are available.
Zones that are not available on all nodes would need to be exempt from
policies. Maybe one could define an upper boundary of zones that are
policed? On NUMAQ zones up to ZONE_NORMAL would be under policy. On x86_64
this may only include ZONE_DMA. A similar thing would occur on ia64 with
the 4G DMA zone. Maybe policy_zone could become configurable?
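
As a rough sketch of the two checks discussed above (which zones exist
on every node, and widening a bind mask so that GFP_KERNEL allocations
can still succeed), something like the following could serve as a
starting point. It is a toy model, not kernel code: the per-node
populated[] bitmask, ZONE_BIT, KERNEL_ZONES, zones_on_all_nodes() and
widen_bind_mask() are all hypothetical, and treating ZONE_DMA and
ZONE_NORMAL as the zones usable by GFP_KERNEL is an assumption of the
model.

```c
#include <stddef.h>

/* Zone indices, lowest to highest. Names follow the kernel; values are illustrative. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, ZONE_MOVABLE, MAX_NR_ZONES };

#define ZONE_BIT(z) (1u << (z))

/* Assumption: zones a GFP_KERNEL-style allocation may use in this toy model. */
#define KERNEL_ZONES (ZONE_BIT(ZONE_DMA) | ZONE_BIT(ZONE_NORMAL))

/*
 * populated[n] is a hypothetical bitmask of the zones that have memory on
 * node n. Return the zones present on every node; under the proposal
 * above, any zone missing from this mask would be exempt from policies.
 */
unsigned int zones_on_all_nodes(const unsigned int *populated, size_t nr_nodes)
{
    unsigned int common = ~0u;

    for (size_t n = 0; n < nr_nodes; n++)
        common &= populated[n];
    return nr_nodes ? common : 0;
}

/*
 * If no node in 'requested' can satisfy kernel allocations, add the
 * lowest-numbered node that can. Returns the (possibly widened) mask, or
 * 0 if no node at all has a kernel-usable zone. nr_nodes must not exceed
 * the number of bits in unsigned long.
 */
unsigned long widen_bind_mask(const unsigned int *populated, size_t nr_nodes,
                              unsigned long requested)
{
    long fallback = -1;

    for (size_t n = 0; n < nr_nodes; n++) {
        if (!(populated[n] & KERNEL_ZONES))
            continue;
        if (requested & (1UL << n))
            return requested;       /* already satisfiable as requested */
        if (fallback < 0)
            fallback = (long)n;     /* remember first usable node */
    }
    return fallback < 0 ? 0 : requested | (1UL << fallback);
}
```

With a NUMAQ-like layout where only node 0 has ZONE_NORMAL, a request
for nodes 2 and 3 comes back as nodes 0, 2 and 3. The alternative
mentioned in the quoted text would instead have the caller fail the
request whenever the returned mask differs from the requested one.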
> I'm for merging the hack for 2.6.23 and having one-zonelist-per-node
> ready for 2.6.24. If there is much fear that the hack will persist for too

Why not for .23? It does not seem to be too much code?

> long, I'm ok with applying policies only to ZONE_MOVABLE when kernelcore=
> is specified on the command line as one-zonelist-per-node can fix the same
> problem. Ultimately if we agree on patches to eliminate policy_zone altogether,
> the problem becomes moot as it no longer exists.

We cannot have a kernel release with broken mempolicy. We either need the
patch here or the one-zonelist patch for .23.