Date: Tue, 22 Sep 2009 13:13:26 -0700 (PDT)
From: David Rientjes
Subject: Re: [PATCH 1/11] hugetlb: rework hstate_next_node_* functions
In-Reply-To: <1253650095.4973.12.camel@useless.americas.hpqcorp.net>
References: <20090915204327.4828.4349.sendpatchset@localhost.localdomain>
 <20090915204333.4828.47722.sendpatchset@localhost.localdomain>
 <1253650095.4973.12.camel@useless.americas.hpqcorp.net>
To: Lee Schermerhorn
Cc: linux-mm@kvack.org, linux-numa@vger.kernel.org, akpm@linux-foundation.org,
 Mel Gorman, Randy Dunlap, Nishanth Aravamudan, Adam Litke, Andy Whitcroft,
 eric.whitney@hp.com

On Tue, 22 Sep 2009, Lee Schermerhorn wrote:

> > >  static int hstate_next_node_to_alloc(struct hstate *h)
> > >  {
> > > -	int next_nid;
> > > -	next_nid = next_node(h->next_nid_to_alloc, node_online_map);
> > > -	if (next_nid == MAX_NUMNODES)
> > > -		next_nid = first_node(node_online_map);
> > > +	int nid, next_nid;
> > > +
> > > +	nid = h->next_nid_to_alloc;
> > > +	next_nid = next_node_allowed(nid);
> > >  	h->next_nid_to_alloc = next_nid;
> > > -	return next_nid;
> > > +	return nid;
> > >  }
> > >
> > >  static int alloc_fresh_huge_page(struct hstate *h)
> >
> > I thought you had refactored this to drop next_nid entirely since gcc
> > optimizes it away?
>
> Looks like I handled that in the subsequent patch.  Probably you had
> commented about removing next_nid on that patch.
>

Ah, I see it in 2/11, thanks.
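
As an aside, the behavioral change in the hunk above is easy to miss in
the diff: the helper now returns the node the pointer currently names and
only then advances it, instead of advancing first and returning the new
value.  Here is a stand-alone sketch of that pattern, in plain C rather
than the kernel code (the demo_* names and the eight-node "allowed" map
below are invented purely for illustration):

#include <stdio.h>

#define DEMO_MAX_NODES	8

/* Stand-ins for h->next_nid_to_alloc and the allowed-node mask. */
static int demo_next_nid;
static const int demo_allowed[DEMO_MAX_NODES] = { 1, 0, 1, 1, 0, 0, 1, 0 };

/* Like next_node_allowed(): wrap to the next allowed node after 'nid'. */
static int demo_next_node_allowed(int nid)
{
	do {
		nid = (nid + 1) % DEMO_MAX_NODES;
	} while (!demo_allowed[nid]);

	return nid;
}

/*
 * The reworked pattern: return the node the pointer currently names,
 * then advance the pointer for the next caller.
 */
static int demo_next_node_to_alloc(void)
{
	int nid = demo_next_nid;

	demo_next_nid = demo_next_node_allowed(nid);
	return nid;
}

int main(void)
{
	int i;

	/* Prints nodes 0, 2, 3, 6, 0, 2: round-robin over the allowed set. */
	for (i = 0; i < 6; i++)
		printf("allocate on node %d\n", demo_next_node_to_alloc());

	return 0;
}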

> > > @@ -693,7 +711,7 @@ static int free_pool_huge_page(struct hs
> > >  	int next_nid;
> > >  	int ret = 0;
> > >
> > > -	start_nid = h->next_nid_to_free;
> > > +	start_nid = hstate_next_node_to_free(h);
> > >  	next_nid = start_nid;
> > >
> > >  	do {
> > > @@ -715,9 +733,10 @@ static int free_pool_huge_page(struct hs
> > >  			}
> > >  			update_and_free_page(h, page);
> > >  			ret = 1;
> > > +			break;
> > >  		}
> > >  		next_nid = hstate_next_node_to_free(h);
> > > -	} while (!ret && next_nid != start_nid);
> > > +	} while (next_nid != start_nid);
> > >
> > >  	return ret;
> > >  }
> > > @@ -1028,10 +1047,9 @@ int __weak alloc_bootmem_huge_page(struc
> > >  	void *addr;
> > >
> > >  	addr = __alloc_bootmem_node_nopanic(
> > > -			NODE_DATA(h->next_nid_to_alloc),
> > > +			NODE_DATA(hstate_next_node_to_alloc(h)),
> > >  			huge_page_size(h), huge_page_size(h), 0);
> > >
> > > -	hstate_next_node_to_alloc(h);
> > >  	if (addr) {
> > >  		/*
> > >  		 * Use the beginning of the huge page to store the
> >
> > Shouldn't that panic if hstate_next_node_to_alloc() returns a memoryless
> > node since it uses node_online_map?
>
> Well, the code has always been like this.  And, these allocs shouldn't
> panic given a memoryless node.  The run time ones don't anyway.  If
> '_THISNODE' is specified, it'll just fail with a NULL addr, else it'll
> walk the generic zonelist to find the first node that can provide the
> requested page size.  Of course, we don't want that fallback when
> populating the pools with persistent huge pages, so we always use the
> THISNODE flag.
>

Whether NODE_DATA() exists for a memoryless node is arch-dependent, I
think, so the panic I was referring to was a NULL pointer in bootmem.  I
think you're safe with the conversion to N_HIGH_MEMORY in patch 9/11 upon
further inspection, though.

> Having said that, I've only recently started to [try to] create the
> gigabyte pages on my x86_64 [Shanghai] test system, but haven't been
> able to allocate any GB pages.  2.6.31 seems to hang early in boot with
> the command line options: "hugepagesz=1GB, hugepages=16".  I've got
> 256GB of memory on this system, so 16GB shouldn't be a problem to find
> at boot time.  Just started looking at this.
>

I can try to reproduce that on one of my systems too, I've never tried it
before.  Thanks.
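
One more note on the free_pool_huge_page() hunk above: once the break is
added, ret is still zero whenever the loop condition is evaluated, so the
!ret test becomes redundant and can be dropped.  Reduced to a stand-alone
sketch in plain C (the demo_* names are invented here, and demo_try_free()
just stands in for "this node had a page we could release"):

#include <stdio.h>

#define DEMO_NR_NODES	4

/* Stand-in for hstate_next_node_to_free(): return current, then advance. */
static int demo_next_node_to_free(void)
{
	static int next_nid;
	int nid = next_nid;

	next_nid = (next_nid + 1) % DEMO_NR_NODES;
	return nid;
}

/* Stand-in for "node nid had a free page and we released it". */
static int demo_try_free(int nid)
{
	return nid == 2;	/* pretend only node 2 has anything to free */
}

/*
 * Same shape as the reworked free_pool_huge_page(): start from the
 * round-robin pointer, visit each node at most once, and stop as soon
 * as one node gives up a page.
 */
static int demo_free_pool_page(void)
{
	int start_nid, next_nid;
	int ret = 0;

	start_nid = demo_next_node_to_free();
	next_nid = start_nid;

	do {
		if (demo_try_free(next_nid)) {
			ret = 1;
			break;
		}
		next_nid = demo_next_node_to_free();
	} while (next_nid != start_nid);

	return ret;
}

int main(void)
{
	/* Tries nodes 0, 1, 2 and stops; prints "freed a page: 1". */
	printf("freed a page: %d\n", demo_free_pool_page());
	return 0;
}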