From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 26 Jul 2007 13:15:39 +0900
From: KAMEZAWA Hiroyuki
Subject: Re: NUMA policy issues with ZONE_MOVABLE
Message-Id: <20070726131539.8a05760f.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To:
References: <20070725111646.GA9098@skynet.ie>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Christoph Lameter
Cc: Mel Gorman, linux-mm@kvack.org, Lee Schermerhorn, ak@suse.de, akpm@linux-foundation.org, pj@sgi.com
List-ID:

On Wed, 25 Jul 2007 12:31:21 -0700 (PDT)
Christoph Lameter wrote:

> So for a __GFP_MOVABLE alloc we would scan all zones and for
> policy_zone just the policy zone.
>
> Lee should probably also review this in detail since he has recent
> experience fiddling around with memory policies. Paul has also
> experience in this area.
>
> Maybe this can actually help to deal with some of the corner cases of
> memory policies (just hope the performance impact is not significant).
>

Hmm, how about the following patch? (Not tested, just an idea.)
I'm sorry if I misunderstand the concept of policy_zone.

==
Index: linux-2.6.23-rc1/include/linux/mempolicy.h
===================================================================
--- linux-2.6.23-rc1.orig/include/linux/mempolicy.h
+++ linux-2.6.23-rc1/include/linux/mempolicy.h
@@ -162,14 +162,11 @@ extern struct zonelist *huge_zonelist(st
 		unsigned long addr, gfp_t gfp_flags);
 extern unsigned slab_node(struct mempolicy *policy);
 
+/*
+ * The smallest zone_idx which all nodes can offer against GFP_xxx
+ */
 extern enum zone_type policy_zone;
 
-static inline void check_highest_zone(enum zone_type k)
-{
-	if (k > policy_zone)
-		policy_zone = k;
-}
-
 int do_migrate_pages(struct mm_struct *mm,
 	const nodemask_t *from_nodes, const nodemask_t *to_nodes, int flags);
 
Index: linux-2.6.23-rc1/mm/page_alloc.c
===================================================================
--- linux-2.6.23-rc1.orig/mm/page_alloc.c
+++ linux-2.6.23-rc1/mm/page_alloc.c
@@ -1648,7 +1648,6 @@ static int build_zonelists_node(pg_data_
 		zone = pgdat->node_zones + zone_type;
 		if (populated_zone(zone)) {
 			zonelist->zones[nr_zones++] = zone;
-			check_highest_zone(zone_type);
 		}
 
 	} while (zone_type);
@@ -1857,7 +1856,6 @@ static void build_zonelists_in_zone_orde
 				z = &NODE_DATA(node)->node_zones[zone_type];
 				if (populated_zone(z)) {
 					zonelist->zones[pos++] = z;
-					check_highest_zone(zone_type);
 				}
 			}
 		}
@@ -1934,6 +1932,7 @@ static void build_zonelists(pg_data_t *p
 	int local_node, prev_node;
 	struct zonelist *zonelist;
 	int order = current_zonelist_order;
+	int highest_zone;
 
 	/* initialize zonelists */
 	for (i = 0; i < MAX_NR_ZONES; i++) {
@@ -1981,6 +1980,32 @@ static void build_zonelists(pg_data_t *p
 		/* calculate node order -- i.e., DMA last! */
 		build_zonelists_in_zone_order(pgdat, j);
 	}
+	/*
+	 * Find the lowest zone where mempolicy (MBID) can work well.
+	 */
+	highest_zone = 0;
+	policy_zone = -1;
+	for (i = 0; i < MAX_NR_ZONES; i++) {
+		struct zone *first_zone;
+		int success = 1;
+		for_each_node_state(node, N_MEMORY) {
+			first_zone = NODE_DATA(node)->node_zonelists[i][0];
+			if (zone_idx(first_zone) > highest_zone)
+				highest_zone = zone_idx(first_zone);
+			if (first_zone->zone_pgdat != NODE_DATA(node)) {
+				/* This node cannot offer right pages for this
+				   GFP */
+				success = 0;
+				break;
+			}
+		}
+		if (success) {
+			policy_zone = i;
+			break;
+		}
+	}
+	if (policy_zone == -1)
+		policy_zone = highest_zone;
 }
 
 /* Construct the zonelist performance cache - see further mmzone.h */
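
To make the selection loop in the last hunk easier to follow, here is a rough userspace sketch of the same idea. It is not kernel code and it is only a model. The names struct node_model, struct first_zone, node_has_memory(), first_zone_for() and pick_policy_zone() are hypothetical. The model assumes that a node's zonelist for zone type i starts with the highest populated zone at or below i on the local node and otherwise falls back to the other nodes in simple round-robin order (the kernel orders fallback nodes differently). Nodes with no memory are skipped, which is how I read for_each_node_state(node, N_MEMORY) in the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NR_ZONES 4	/* illustrative count, not tied to any kernel config */

/* Hypothetical model of one node: populated[z] != 0 if zone z has pages. */
struct node_model {
	int populated[MAX_NR_ZONES];
};

/* First entry of a modelled zonelist: owning node and zone index. */
struct first_zone {
	int node;
	int zone_idx;
};

static int node_has_memory(const struct node_model *n)
{
	for (int z = 0; z < MAX_NR_ZONES; z++)
		if (n->populated[z])
			return 1;
	return 0;
}

/*
 * Assumption: the zonelist for zone type 'type' on 'node' begins with the
 * highest populated zone <= type, trying the local node first and then
 * the remaining nodes in round-robin order.
 */
static struct first_zone first_zone_for(const struct node_model *nodes,
					int nr_nodes, int node, int type)
{
	struct first_zone fz = { -1, -1 };

	for (int k = 0; k < nr_nodes; k++) {
		int n = (node + k) % nr_nodes;

		for (int z = type; z >= 0; z--) {
			if (nodes[n].populated[z]) {
				fz.node = n;
				fz.zone_idx = z;
				return fz;
			}
		}
	}
	return fz;	/* nothing usable anywhere */
}

/*
 * Mirror of the loop in the patch: return the lowest zone type for which
 * every node with memory can satisfy the request from its own zones.
 */
static int pick_policy_zone(const struct node_model *nodes, int nr_nodes)
{
	int highest_zone = 0;

	assert(nodes != NULL && nr_nodes > 0);

	for (int i = 0; i < MAX_NR_ZONES; i++) {
		int success = 1;

		for (int node = 0; node < nr_nodes; node++) {
			struct first_zone fz;

			if (!node_has_memory(&nodes[node]))
				continue;
			fz = first_zone_for(nodes, nr_nodes, node, i);
			if (fz.zone_idx > highest_zone)
				highest_zone = fz.zone_idx;
			if (fz.node != node) {
				/* this node has to borrow pages for type i */
				success = 0;
				break;
			}
		}
		if (success)
			return i;
	}
	/* kept for parity with the patch */
	return highest_zone;
}
```

In this model, with node 0 populating zones 0 and 1 and node 1 populating only zone 1, pick_policy_zone() returns 1, because node 1 would have to borrow node 0's pages for zone type 0. If both nodes populate zone 0, it returns 0.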