From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 16 Nov 2007 00:10:14 +0000
Subject: Re: [PATCH][UPDATED] hugetlb: retry pool allocation attempts
Message-ID: <20071116001014.GA7372@skynet.ie>
References: <20071115201053.GA21245@us.ibm.com>
 <20071115201826.GB21245@us.ibm.com>
 <1195162475.7078.224.camel@localhost>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
In-Reply-To: <1195162475.7078.224.camel@localhost>
From: mel@skynet.ie (Mel Gorman)
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Dave Hansen
Cc: Nishanth Aravamudan, wli@holomorphy.com, kenchen@google.com,
 david@gibson.dropbear.id.au, linux-mm@kvack.org, Andy Whitcroft
List-ID:

On (15/11/07 13:34), Dave Hansen didst pronounce:
> On Thu, 2007-11-15 at 12:18 -0800, Nishanth Aravamudan wrote:
> > b) __alloc_pages() does not currently retry allocations for order
> > > PAGE_ALLOC_COSTLY_ORDER.
>
> ... when __GFP_REPEAT has not been specified, right?
>

Currently, if hugetlbfs specified __GFP_REPEAT, it would end up trying
to allocate indefinitely - that does not sound like sane behaviour.
Indefinite retries for small allocations make some sense, but for the
larger allocs it should give up after a while, as __GFP_REPEAT is
documented to do.

> > Modify __alloc_pages() to retry GFP_REPEAT COSTLY_ORDER allocations up
> > to COSTLY_ORDER_RETRY_ATTEMPTS times, which I've set to 5, and use
> > GFP_REPEAT in the hugetlb pool allocation. 5 seems to give reasonable
> > results for x86, x86_64 and ppc64, but I'm not sure how to come up with
> > the "best" number here (suggestions are welcome!). With this patch
> > applied, the same box that gave the above results now gives:
>
> Coding in an explicit number of retries like this seems a bit hackish
> to me. Retrying the allocations N times internally (through looping)
> should give roughly the same number of huge pages that retrying them N
> times externally (from the /proc file).

The third case is where the pool is being dynamically resized and this
allocation attempt is happening via the mmap() or fault paths. In those
cases it should be making a serious attempt to satisfy the allocation
without peppering retry logic in multiple places when __GFP_REPEAT is
meant to do what is desired.

> Does doing another ~50
> allocations get you to the same number of huge pages?
>
> What happens if you *only* specify GFP_REPEAT from hugetlbfs?
>

Potentially, it will stay forever in a reclaim loop.

> I think you're asking a bit much of the page allocator (and reclaim)
> here.

The ideal is that direct reclaim is only entered once. In practice, it
may not work out that way: even if lumpy reclaim frees the necessary
contiguous pages, there is no guarantee the process doing the reclaim
will get them, because a process does not take ownership of the pages
it frees. Fixing that would be pretty invasive and while I expect those
patches to exist eventually, they are pretty far away.

Ideally Nish could just say "__GFP_REPEAT" in the flags, but it looks
like he had to alter slightly how __GFP_REPEAT behaves so that it is
not an alias for __GFP_NOFAIL.

> There is a discrete amount of memory pressure applied for each
> allocator request. Increasing the number of requests will virtually
> always increase the memory pressure and make more pages available.
>

For a __GFP_REPEAT allocation, it says "try and pressure more because I
really could do with this page" as opposed to failing.
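To make the distinction concrete, the retry decision at the end of the
allocator slow path boils down to roughly the following. This is a
simplified userspace sketch rather than the kernel code itself - the
flag values and the should_retry() helper are made up for illustration;
only PAGE_ALLOC_COSTLY_ORDER (3) and the 2MB huge page order on x86_64
(9) reflect the real constants:

#include <stdbool.h>
#include <stdio.h>

/* Illustrative placeholder values - the real __GFP_* bits differ */
#define __GFP_REPEAT  0x1u
#define __GFP_NOFAIL  0x2u
#define __GFP_NORETRY 0x4u

/* Orders above this are considered "costly" and not retried by default */
#define PAGE_ALLOC_COSTLY_ORDER 3

/* Model of the decision to loop back into reclaim after a failed attempt */
static bool should_retry(unsigned int gfp_mask, unsigned int order)
{
	bool do_retry = false;

	if (!(gfp_mask & __GFP_NORETRY)) {
		/* Small orders always retry; costly orders only with __GFP_REPEAT */
		if (order <= PAGE_ALLOC_COSTLY_ORDER || (gfp_mask & __GFP_REPEAT))
			do_retry = true;
		/* __GFP_NOFAIL never gives up */
		if (gfp_mask & __GFP_NOFAIL)
			do_retry = true;
	}
	return do_retry;
}

int main(void)
{
	/* A 2MB huge page on x86_64 is order 9, well above the costly cutoff */
	printf("order 9, no flags:     %d\n", should_retry(0, 9));
	printf("order 9, __GFP_REPEAT: %d\n", should_retry(__GFP_REPEAT, 9));
	printf("order 0, no flags:     %d\n", should_retry(0, 0));
	return 0;
}

As the sketch shows, once __GFP_REPEAT is set on a costly order there is
nothing bounding the retries, which is why Nish ends up either capping
the number of attempts or changing what __GFP_REPEAT means for large
orders.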
> What is the actual behavior that you want to get here?  Do you want
> that 34th request to always absolutely plateau the number of huge
> pages?
>

I believe the desired behaviour is that, for larger allocations,
specifying __GFP_REPEAT applies a bit more pressure than might have
been applied otherwise.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org