linux-mm.kvack.org archive mirror
From: mel@skynet.ie (Mel Gorman)
To: Dave Hansen <haveblue@us.ibm.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>,
	wli@holomorphy.com, kenchen@google.com,
	david@gibson.dropbear.id.au, linux-mm@kvack.org,
	Andy Whitcroft <apw@shadowen.org>
Subject: Re: [PATCH][UPDATED] hugetlb: retry pool allocation attempts
Date: Fri, 16 Nov 2007 00:10:14 +0000
Message-ID: <20071116001014.GA7372@skynet.ie>
In-Reply-To: <1195162475.7078.224.camel@localhost>

On (15/11/07 13:34), Dave Hansen didst pronounce:
> On Thu, 2007-11-15 at 12:18 -0800, Nishanth Aravamudan wrote:
> > b) __alloc_pages() does not currently retry allocations for order >
> > PAGE_ALLOC_COSTLY_ORDER.
> 
> ... when __GFP_REPEAT has not been specified, right?
> 

Currently, if hugetlbfs specified __GFP_REPEAT, it would end up trying to
allocate indefinitely - that does not sound like sane behaviour. Indefinite
retries for small allocations make some sense, but for the larger allocs
it should give up after a while, as __GFP_REPEAT is documented to do.
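
To make the intended policy concrete, here is a minimal userspace sketch of
the retry decision being discussed. It is not the patch and not kernel code:
the SKETCH_* flag values and sketch_should_retry() are hypothetical names
made up for illustration, and only PAGE_ALLOC_COSTLY_ORDER (3) and the retry
limit of 5 quoted from the patch description come from real sources.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration; the real __GFP_* values differ. */
#define SKETCH_GFP_NORETRY 0x1u
#define SKETCH_GFP_REPEAT  0x2u
#define SKETCH_GFP_NOFAIL  0x4u

#define PAGE_ALLOC_COSTLY_ORDER     3  /* same value as in the kernel */
#define COSTLY_ORDER_RETRY_ATTEMPTS 5  /* limit quoted from the patch description */

/*
 * Decide whether a failed allocation attempt should be retried.
 * 'attempts' is the number of retries already made for this request.
 */
static bool sketch_should_retry(unsigned int order, unsigned int gfp,
                                unsigned int attempts)
{
    if (gfp & SKETCH_GFP_NORETRY)
        return false;               /* caller asked for a single attempt */
    if (gfp & SKETCH_GFP_NOFAIL)
        return true;                /* retry forever */
    if (order <= PAGE_ALLOC_COSTLY_ORDER)
        return true;                /* small allocations keep retrying */
    if (gfp & SKETCH_GFP_REPEAT)    /* costly order: bounded retries */
        return attempts < COSTLY_ORDER_RETRY_ATTEMPTS;
    return false;                   /* costly order, no REPEAT: give up */
}
```

The point of the bounded branch is that a costly-order request with the
repeat flag tries harder than a plain request but still differs from the
no-fail case, which never gives up.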

> > Modify __alloc_pages() to retry GFP_REPEAT COSTLY_ORDER allocations up
> > to COSTLY_ORDER_RETRY_ATTEMPTS times, which I've set to 5, and use
> > GFP_REPEAT in the hugetlb pool allocation. 5 seems to give reasonable
> > results for x86, x86_64 and ppc64, but I'm not sure how to come up with
> > the "best" number here (suggestions are welcome!). With this patch
> > applied, the same box that gave the above results now gives: 
> 
> Coding in an explicit number of retries like this seems a bit hackish to
> me.  Retrying the allocations N times internally (through looping)
> should give roughly the same number of huge pages that retrying them N
> times externally (from the /proc file). 

The third case is where the pool is being dynamically resized and this
allocation attempt is happening via the mmap() or fault paths. In those cases
it should be making a serious attempt to satisfy the allocation without
peppering retry logic in multiple places when __GFP_REPEAT is meant to do
what is desired.

> Does doing another ~50
> allocations get you to the same number of huge pages?
> 
> What happens if you *only* specify GFP_REPEAT from hugetlbfs?
> 

Potentially, it will stay forever in a reclaim loop.

> I think you're asking a bit much of the page allocator (and reclaim)
> here. 

The ideal is that direct reclaim is only entered once. In practice, that
may not work: even if lumpy reclaim frees the necessary contiguous
pages, there is no guarantee that another process will not take them
first, because the reclaiming process does not take ownership of those
pages. Fixing that would be pretty invasive and while I expect those
patches to exist eventually, they are pretty far away.

Ideally Nish could just say "__GFP_REPEAT" in the flags but it looks
like he had to alter slightly how __GFP_REPEAT behaves so that it is not
an alias for __GFP_NOFAIL.

> There is a discrete amount of memory pressure applied for each
> allocator request.  Increasing the number of requests will virtually
> always increase the memory pressure and make more pages available.
> 

For a __GFP_REPEAT allocation, it says "try and pressure more because I
really could do with this page" as opposed to failing.

> What is the actual behavior that you want to get here?  Do you want that
> 34th request to always absolutely plateau the number of huge pages?
> 

I believe the desired behaviour is for larger allocations specifying
__GFP_REPEAT to apply a bit more pressure than might have been used
otherwise.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

