From: Christoph Lameter <clameter@sgi.com>
To: Mel Gorman <mel@skynet.ie>
Cc: linux-mm@kvack.org, Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
ak@suse.de, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
akpm@linux-foundation.org, pj@sgi.com
Subject: Re: NUMA policy issues with ZONE_MOVABLE
Date: Fri, 27 Jul 2007 10:35:39 -0700 (PDT) [thread overview]
Message-ID: <Pine.LNX.4.64.0707271026040.15990@schroedinger.engr.sgi.com> (raw)
In-Reply-To: <20070727154519.GA21614@skynet.ie>
On Fri, 27 Jul 2007, Mel Gorman wrote:
> This was fairly straight-forward but I wouldn't call it a bug fix for 2.6.23
> for the policies + ZONE_MOVABLE issue; I still prefer the last patch for
> the fix.
>
> This patch uses one zonelist per node and filters based on a gfp_mask where
> necessary. It consumes less memory and reduces cache pressure at the cost
> of CPU. It also adds a zone_id field to struct zone as zone_idx is used more
> than it was previously.
>
> Performance differences on kernbench for Total CPU time ranged from
> -0.06% to +1.19%.
Performance is equal otherwise?
> Obvious things that are outstanding;
>
> o Compile-test parisc
> o Split patch in two to keep the zone_idx changes separately
> o Verify zlccache is not broken
> o Have a version of __alloc_pages take a nodemask and ditch
> bind_zonelist()
Yeah. I think the NUMA folks would love this but the rest of the
developers may object.
> I can work on bringing this up to scratch during the cycle.
>
> Patch as follows. Comments?
Glad to see some movement in this area.
> index bc68dd9..f2a597e 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -116,6 +116,13 @@ static inline enum zone_type gfp_zone(gfp_t flags)
> return ZONE_NORMAL;
> }
>
> +static inline int should_filter_zone(struct zone *zone, int highest_zoneidx)
> +{
> + if (zone_idx(zone) > highest_zoneidx)
> + return 1;
> + return 0;
> +}
> +
I think this should_filter() check creates more overhead than it saves. That
is particularly true for configurations with a small number of zones, like SMP
systems. For large NUMA systems the cache savings will likely make it
beneficial.
Simply filter all zones unconditionally.
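To make the tradeoff under discussion concrete, here is a minimal standalone sketch of the filtering pattern the patch adds. The struct definitions are simplified stand-ins for illustration, not the kernel's real types, and count_usable_zones() is a made-up helper name:

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures under discussion. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, ZONE_MOVABLE, MAX_NR_ZONES };

struct zone {
	enum zone_type idx;	/* what zone_idx(zone) would return */
};

/* The filter check from the patch: skip zones above the gfp ceiling. */
static int should_filter_zone(const struct zone *zone, int highest_zoneidx)
{
	return (int)zone->idx > highest_zoneidx;
}

/* Walk a NULL-terminated zone array the way the patched loops do,
 * counting only the zones that survive the filter. */
static int count_usable_zones(struct zone **zonelist, int highest_zoneidx)
{
	int usable = 0;
	struct zone **z;

	for (z = zonelist; *z; z++) {
		if (should_filter_zone(*z, highest_zoneidx))
			continue;
		usable++;
	}
	return usable;
}
```

The cost being debated is the extra compare-and-branch per zone on every traversal, against the cache benefit of one shared zonelist per node.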
> @@ -258,7 +258,7 @@ static inline void mpol_fix_fork_child_flag(struct task_struct *p)
> static inline struct zonelist *huge_zonelist(struct vm_area_struct *vma,
> unsigned long addr, gfp_t gfp_flags)
> {
> - return NODE_DATA(0)->node_zonelists + gfp_zone(gfp_flags);
> + return &NODE_DATA(0)->node_zonelist;
> }
These modifications look good in terms of code size reduction.
> @@ -438,7 +439,7 @@ extern struct page *mem_map;
> struct bootmem_data;
> typedef struct pglist_data {
> struct zone node_zones[MAX_NR_ZONES];
> - struct zonelist node_zonelists[MAX_NR_ZONES];
> + struct zonelist node_zonelist;
Looks like a significant memory saving on 1024-node NUMA. Each zonelist holds
#define MAX_ZONES_PER_ZONELIST (MAX_NUMNODES * MAX_NR_ZONES)
zone pointers.
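Back-of-the-envelope arithmetic for that saving (assuming a 1024-node build with four zone types and a NULL-terminated pointer array; the exact struct layout may differ):

```c
#include <stddef.h>

/* Illustrative config values for a large NUMA build. */
#define MAX_NUMNODES 1024
#define MAX_NR_ZONES 4
#define MAX_ZONES_PER_ZONELIST (MAX_NUMNODES * MAX_NR_ZONES)

/* One zonelist: an array of zone pointers plus a NULL terminator. */
static size_t zonelist_bytes(void)
{
	return (MAX_ZONES_PER_ZONELIST + 1) * sizeof(void *);
}

/* Per-node saving when MAX_NR_ZONES zonelists collapse into one. */
static size_t per_node_saving(void)
{
	return (MAX_NR_ZONES - 1) * zonelist_bytes();
}
```

With 8-byte pointers that is roughly 96 KB saved per node, so on the order of 96 MB across 1024 nodes.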
> @@ -185,11 +186,15 @@ static inline int constrained_alloc(struct zonelist *zonelist, gfp_t gfp_mask)
> if (NODE_DATA(node)->node_present_pages)
> node_set(node, nodes);
>
> - for (z = zonelist->zones; *z; z++)
> + for (z = zonelist->zones; *z; z++) {
> +
> + if (should_filter_zone(*z, highest_zoneidx))
> + continue;
Huh? Why do you need it here? Note that this code is also going away with
the memoryless node patch. We can use the nodes-with-memory nodemask here.
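The nodemask idea can be sketched standalone like this. The toy nodemask type and build_memory_nodes() helper are illustrative stand-ins, not the kernel's nodemask API:

```c
#include <stdint.h>

/* Toy nodemask: one bit per node, capped at 64 nodes for the sketch. */
typedef uint64_t nodemask_t;

static void node_set(int node, nodemask_t *mask)
{
	*mask |= (uint64_t)1 << node;
}

/* Build the "nodes with memory" mask once, instead of walking a
 * zonelist and filtering zone by zone on every constrained_alloc(). */
static nodemask_t build_memory_nodes(const long *present_pages, int nr_nodes)
{
	nodemask_t nodes = 0;
	int node;

	for (node = 0; node < nr_nodes; node++)
		if (present_pages[node])
			node_set(node, &nodes);
	return nodes;
}
```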
> diff --git a/mm/slub.c b/mm/slub.c
> index 9b2d617..a020a12 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1276,6 +1276,7 @@ static struct page *get_any_partial(struct kmem_cache *s, gfp_t flags)
> struct zonelist *zonelist;
> struct zone **z;
> struct page *page;
> + enum zone_type highest_zoneidx = gfp_zone(flags);
>
> /*
> * The defrag ratio allows a configuration of the tradeoffs between
> @@ -1298,11 +1299,13 @@ static struct page *get_any_partial(struct kmem_cache *s, gfp_t flags)
> if (!s->defrag_ratio || get_cycles() % 1024 > s->defrag_ratio)
> return NULL;
>
> - zonelist = &NODE_DATA(slab_node(current->mempolicy))
> - ->node_zonelists[gfp_zone(flags)];
> + zonelist = &NODE_DATA(slab_node(current->mempolicy))->node_zonelist;
> for (z = zonelist->zones; *z; z++) {
> struct kmem_cache_node *n;
>
> + if (should_filter_zone(*z, highest_zoneidx))
> + continue;
> +
> n = get_node(s, zone_to_nid(*z));
>
> if (n && cpuset_zone_allowed_hardwall(*z, flags) &&
Isn't there some way to fold these traversals into a common page allocator
function?
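One shape such a common helper could take is a shared traversal macro so every caller filters identically. The macro name and the simplified types here are hypothetical, not an existing kernel interface:

```c
#include <stddef.h>

/* Simplified stand-ins, not the kernel's real types. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, ZONE_MOVABLE, MAX_NR_ZONES };

struct zone { enum zone_type idx; };

/* Shared admission test so every caller filters the same way. */
static int zone_allowed(const struct zone *z, int highest_zoneidx)
{
	return (int)z->idx <= highest_zoneidx;
}

/* Hypothetical common traversal macro: the loop body only ever
 * sees zones at or below the gfp ceiling. */
#define for_each_allowed_zone(zp, zonelist, highest)			\
	for ((zp) = (zonelist); *(zp); (zp)++)				\
		if (!zone_allowed(*(zp), (highest)))			\
			continue;					\
		else
```

Call sites like the slub and oom loops above would then shrink to a single iterator line instead of open-coding the skip check.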