From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d01relay02.pok.ibm.com (d01relay02.pok.ibm.com [9.56.227.234]) by e4.ny.us.ibm.com (8.13.8/8.13.8) with ESMTP id l57MG367003234 for ; Thu, 7 Jun 2007 18:16:03 -0400 Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217]) by d01relay02.pok.ibm.com (8.13.8/8.13.8/NCO v8.3) with ESMTP id l57MG33I527580 for ; Thu, 7 Jun 2007 18:16:03 -0400 Received: from d01av03.pok.ibm.com (loopback [127.0.0.1]) by d01av03.pok.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id l57MG2GT016790 for ; Thu, 7 Jun 2007 18:16:03 -0400 Date: Thu, 7 Jun 2007 15:16:02 -0700 From: Nishanth Aravamudan Subject: Re: [PATCH v2] gfp.h: GFP_THISNODE can go to other nodes if some are unpopulated Message-ID: <20070607221602.GD15776@us.ibm.com> References: <20070607150425.GA15776@us.ibm.com> <20070607220149.GC15776@us.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: owner-linux-mm@kvack.org Return-Path: To: Christoph Lameter Cc: Lee.Schermerhorn@hp.com, anton@samba.org, apw@shadowen.org, mel@csn.ul.ie, akpm@linux-foundation.org, linux-mm@kvack.org List-ID: On 07.06.2007 [15:05:56 -0700], Christoph Lameter wrote: > On Thu, 7 Jun 2007, Nishanth Aravamudan wrote: > > > /* > > - * NOTE: if the requested node is unpopulated (no memory), a THISNODE > > - * request can go to other nodes due to the fallback list > > + * NOTE: GFP_THISNODE allocates from the first available pgdat (== node > > + * structure) from the zonelist of the requested node. The first pgdat > > + * may be the pgdat of another node if the requested node has no memory > > + * on its own. > > */ > > #define GFP_THISNODE (__GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY) > > #else > > Acked-by: Christoph Lameter Thanks, I'm an idiot, though, and didn't reset to stock -linus before refreshing my patch: gfp.h: GFP_THISNODE can go to other nodes if some are unpopulated While testing my sysfs per-node hugepage allocator (http://marc.info/?l=linux-mm&m=117935849517122&w=2), I found that an alloc_pages_node(nid, GFP_THISNODE) request would sometimes return a struct page such that page_to_nid(page) != nid. This was because, on that particular machine, nodes 0 and 1 are populated and nodes 2 and 3 are not. When a page is requested get_page_from_freelist() relies on zonelist->zones[0]->zone_pgdat indicating when THISNODE stops. But, because, say, node 2 has no memory, the first zone_pgdat in the fallback list points to a different node. Add a comment indicating that THISNODE may not return pages on THISNODE if the node is unpopulated. Am working on testing Lee/Anton's patch to add a node_populated_mask and use that in the hugepage allocator path. But I think this may be a problem anywhere THISNODE is used and memory is expected to come from the requested node and nowhere else. Signed-off-by: Nishanth Aravamudan Acked-by: Christoph Lameter diff --git a/include/linux/gfp.h b/include/linux/gfp.h index 0d2ef0b..996cf08 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -67,6 +67,12 @@ struct vm_area_struct; __GFP_HIGHMEM) #ifdef CONFIG_NUMA +/* + * NOTE: GFP_THISNODE allocates from the first available pgdat (== node + * structure) from the zonelist of the requested node. The first pgdat + * may be the pgdat of another node if the requested node has no memory + * on its own. + */ #define GFP_THISNODE (__GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY) #else #define GFP_THISNODE ((__force gfp_t)0) -- Nishanth Aravamudan IBM Linux Technology Center -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org