From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from zps77.corp.google.com (zps77.corp.google.com [172.25.146.77])
	by smtp-out.google.com with ESMTP id l6DLOl5f001188
	for ; Fri, 13 Jul 2007 22:24:47 +0100
Received: from an-out-0708.google.com (anab33.prod.google.com [10.100.53.33])
	by zps77.corp.google.com with ESMTP id l6DLOWOB015843
	for ; Fri, 13 Jul 2007 14:24:32 -0700
Received: by an-out-0708.google.com with SMTP id b33so115358ana
	for ; Fri, 13 Jul 2007 14:24:32 -0700 (PDT)
Message-ID:
Date: Fri, 13 Jul 2007 14:24:32 -0700
From: "Ken Chen"
Subject: Re: [PATCH 5/5] [hugetlb] Try to grow pool for MAP_SHARED mappings
In-Reply-To: <1184360742.16671.55.camel@localhost.localdomain>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <20070713151621.17750.58171.stgit@kernel>
	<20070713151717.17750.44865.stgit@kernel>
	<20070713130508.6f5b9bbb.pj@sgi.com>
	<1184360742.16671.55.camel@localhost.localdomain>
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Adam Litke
Cc: Paul Jackson, linux-mm@kvack.org, mel@skynet.ie, apw@shadowen.org,
	wli@holomorphy.com, clameter@sgi.com
List-ID:

On 7/13/07, Adam Litke wrote:
> To be honest, I just don't think a global hugetlb pool and cpusets are
> compatible, period.

Agreed. It's a mess.

> > But the cpuset behaviour of this hugetlb stuff looks suspicious to me:
> > 1) The code in alloc_fresh_huge_page() seems to round robin over
> > the entire system, spreading the hugetlb pages uniformly on all nodes.
> > If one a task in one small cpuset starts aggressively allocating hugetlb
> > pages, do you think this will work, Adam -- looks to me like we will end
> > up calling alloc_fresh_huge_page() many times, most of which will fail to
> > alloc_pages_node() anything because the 'static nid' clock hand will be
> > pointing at a node outside of the current tasks cpuset (not in that tasks
> > mems_allowed). Inefficient, but I guess ok.
>
> Very good point. I guess we call alloc_fresh_huge_page in two scenarios
> now... 1) By echoing a number into /proc/sys/vm/nr_hugepages, and 2) by
> trying to dynamically increase the pool size for a particular process.
> Case 1 is not in the context of any process (per se) and so
> node_online_map makes sense. For case 2 we could teach the
> __alloc_fresh_huge_page() to take a nodemask. That could get nasty
> though since we'd have to move away from a static variable to get proper
> interleaving.

alloc_fresh_huge_page
  alloc_pages_node
    get_page_from_freelist {
      ...
      if ((alloc_flags & ALLOC_CPUSET) &&
          !cpuset_zone_allowed_softwall(zone, gfp_mask))
              goto try_next_zone;
      ...

It looks to me that the cpuset rule is buried deep down in the buddy
allocator. So the cpuset mems_allowed rule is enforced both at pool
reservation time (in get_page_from_freelist) and at hugetlb page fault
time (in dequeue_huge_page()).
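
For illustration only, here is a minimal userspace sketch of the nodemask
idea from the quoted text: the clock hand is owned by the caller instead of
being a static variable, and nodes outside an allowed mask are skipped. The
name next_allowed_node, the unsigned long bitmask, and MAX_NODES are all
assumptions made for this example. This is not kernel code and not part of
the patch.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical userspace model of a node mask; one bit per node id. */
#define MAX_NODES ((int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Return the next node at or after *hand whose bit is set in 'allowed',
 * wrapping around at nr_nodes, or -1 if no node is allowed.
 * On success *hand is moved past the returned node, so repeated calls
 * interleave over the allowed nodes only.
 */
static int next_allowed_node(int *hand, unsigned long allowed, int nr_nodes)
{
	int i;

	assert(nr_nodes > 0 && nr_nodes <= MAX_NODES);
	assert(*hand >= 0 && *hand < nr_nodes);

	for (i = 0; i < nr_nodes; i++) {
		int nid = (*hand + i) % nr_nodes;

		if (allowed & (1UL << nid)) {
			*hand = (nid + 1) % nr_nodes;
			return nid;
		}
	}
	return -1;	/* empty mask: nothing to allocate from */
}
```

With one hand per mask (for example, one per cpuset), a task confined to
nodes 2 and 5 would alternate between those two nodes instead of mostly
hitting nodes it is not allowed to use.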