From: Hugh Dickins <hugh@veritas.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Andi Kleen <andi@firstfloor.org>,
	David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH mmotm] mm: alloc_large_system_hash check order
Date: Fri, 1 May 2009 12:30:03 +0100 (BST)
Message-ID: <Pine.LNX.4.64.0905011202530.8513@blonde.anvils>
In-Reply-To: <20090430132544.GB21997@csn.ul.ie>

On Thu, 30 Apr 2009, Mel Gorman wrote:
> On Wed, Apr 29, 2009 at 10:09:48PM +0100, Hugh Dickins wrote:
> > On an x86_64 with 4GB ram, tcp_init()'s call to alloc_large_system_hash(),
> > to allocate tcp_hashinfo.ehash, is now triggering an mmotm WARN_ON_ONCE on
> > order >= MAX_ORDER - it's hoping for order 11.  alloc_large_system_hash()
> > had better make its own check on the order.
> > 
> > Signed-off-by: Hugh Dickins <hugh@veritas.com>
> 
> Looks good
> 
> Reviewed-by: Mel Gorman <mel@csn.ul.ie>

Thanks.

> 
> As I was looking there, it seemed that alloc_large_system_hash() should be
> using alloc_pages_exact() instead of having its own "give back the spare
> pages at the end of the buffer" logic. If alloc_pages_exact() was used, then
> the check for an order >= MAX_ORDER can be pushed down to alloc_pages_exact()
> where it may catch other unwary callers.
> 
> How about adding the following patch on top of yours?

Well observed, yes indeed.  In fact, it even looks as if, shock horror,
alloc_pages_exact() was _plagiarized_ from alloc_large_system_hash().
Blessed be the GPL, I'm sure we can skip the lengthy lawsuits!

> 
> ==== CUT HERE ====
> Use alloc_pages_exact() in alloc_large_system_hash() to avoid duplicated logic
> 
> alloc_large_system_hash() has logic for freeing unused pages at the end
> of a power-of-two-pages-aligned buffer that is a duplicate of what is in
> alloc_pages_exact(). This patch converts alloc_large_system_hash() to use
> alloc_pages_exact().
> 
> Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> --- 
>  mm/page_alloc.c |   27 +++++----------------------
>  1 file changed, 5 insertions(+), 22 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 1b3da0f..c94b140 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1942,6 +1942,9 @@ void *alloc_pages_exact(size_t size, gfp_t gfp_mask)
>  	unsigned int order = get_order(size);
>  	unsigned long addr;
>  
> +	if (order >= MAX_ORDER)
> +		return NULL;
> +

I suppose there could be an argument about whether we do or do not
want to skip the WARN_ON when it's in alloc_pages_exact().

I have no opinion on that; but DaveM's reply on large_system_hash
does make it clear that we're not interested in the warning there.
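
For what it's worth, here is why a quiet NULL seems to be enough for
large_system_hash: the caller's loop simply halves the table and tries
again.  A userspace sketch only, not kernel code.  The 4KB page size,
MAX_ORDER of 11 and the 16-byte bucket are assumptions for illustration,
and sketch_get_order() and final_log2qty() are hypothetical names; the
"allocation" is modelled as succeeding whenever the order is legal.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration: 4KB pages, MAX_ORDER of 11. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_MAX_ORDER  11

/* Smallest order such that (PAGE_SIZE << order) >= size. */
static unsigned int sketch_get_order(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);
	size = (size - 1) >> SKETCH_PAGE_SHIFT;
	while (size) {
		size >>= 1;
		order++;
	}
	return order;
}

/*
 * Simplified model of the retry loop in alloc_large_system_hash():
 * halve the table until the request fits below MAX_ORDER.
 */
static unsigned int final_log2qty(size_t bucketsize, unsigned int log2qty)
{
	size_t size;
	int ok;

	do {
		size = bucketsize << log2qty;
		/* Stand-in for the allocator: succeeds iff the order is legal. */
		ok = sketch_get_order(size) < SKETCH_MAX_ORDER;
	} while (!ok && size > SKETCH_PAGE_SIZE && --log2qty);

	return log2qty;
}
```

Under those assumptions, final_log2qty(16, 19) starts at 8MB (order 11),
is refused, and settles on 4MB (order 10) with log2qty 18: no warning
needed, just a smaller table.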

>  	addr = __get_free_pages(gfp_mask, order);
>  	if (addr) {
>  		unsigned long alloc_end = addr + (PAGE_SIZE << order);
> @@ -4755,28 +4758,8 @@ void *__init alloc_large_system_hash(const char *tablename,
>  			table = alloc_bootmem_nopanic(size);
>  		else if (hashdist)
>  			table = __vmalloc(size, GFP_ATOMIC, PAGE_KERNEL);
> -		else {
> -			unsigned long order = get_order(size);
> -
> -			if (order < MAX_ORDER)
> -				table = (void *)__get_free_pages(GFP_ATOMIC,
> -								order);
> -			/*
> -			 * If bucketsize is not a power-of-two, we may free
> -			 * some pages at the end of hash table.
> -			 */

That's actually a helpful comment, it's easy to think we're dealing
in powers of two here when we may not be.  Maybe retain it with your
alloc_pages_exact call?
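
To show what that comment is warning about, a userspace sketch of the
arithmetic (not kernel code).  The 4KB page size and the example bucket
sizes are assumptions, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for illustration. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Round size up to a whole number of pages, as PAGE_ALIGN() does. */
static size_t sketch_page_align(size_t size)
{
	return (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Pages in the smallest power-of-two block that can hold size bytes. */
static size_t pages_in_block(size_t size)
{
	size_t pages = 1;

	assert(size > 0);
	while (pages * SKETCH_PAGE_SIZE < size)
		pages <<= 1;
	return pages;
}

/* Pages past the used part of the block, which the tail loop frees. */
static size_t spare_pages(size_t size)
{
	return pages_in_block(size) - sketch_page_align(size) / SKETCH_PAGE_SIZE;
}
```

With a 24-byte bucket and 65536 buckets the table needs 384 pages, the
block is 512 pages, and spare_pages() reports 128 to give back.  With a
16-byte bucket the table is exactly 256 pages and nothing is spare.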

> -			if (table) {
> -				unsigned long alloc_end = (unsigned long)table +
> -						(PAGE_SIZE << order);
> -				unsigned long used = (unsigned long)table +
> -						PAGE_ALIGN(size);
> -				split_page(virt_to_page(table), order);
> -				while (used < alloc_end) {
> -					free_page(used);
> -					used += PAGE_SIZE;
> -				}
> -			}
> -		}
> +		else
> +			table = alloc_pages_exact(PAGE_ALIGN(size), GFP_ATOMIC);

Do you actually need that PAGE_ALIGN on the size?
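
My reasoning, as a userspace sketch rather than kernel code: the order
depends only on how many pages the size spans, so rounding up to a page
boundary first should not change it, and if I read alloc_pages_exact()
right it applies PAGE_ALIGN(size) itself when working out what to keep.
The 4KB page size is an assumption and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for illustration: 4KB. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Round size up to a whole number of pages, as PAGE_ALIGN() does. */
static size_t sketch_page_align(size_t size)
{
	return (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Smallest order such that (PAGE_SIZE << order) >= size. */
static unsigned int sketch_get_order(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);
	size = (size - 1) >> SKETCH_PAGE_SHIFT;
	while (size) {
		size >>= 1;
		order++;
	}
	return order;
}

/* Returns 1 if aligning first never changes the order for 1..limit. */
static int align_never_changes_order(size_t limit)
{
	size_t size;

	for (size = 1; size <= limit; size++)
		if (sketch_get_order(size) !=
		    sketch_get_order(sketch_page_align(size)))
			return 0;
	return 1;
}
```

The check holds for every size up to 1MB in this model, which is why the
caller's PAGE_ALIGN looks redundant to me.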

>  	} while (!table && size > PAGE_SIZE && --log2qty);
>  
>  	if (!table)

Andrew noticed another oddity: that if it goes the hashdist __vmalloc()
way, it won't be limited by MAX_ORDER.  Makes one wonder whether it
ought to fall back to __vmalloc() if the alloc_pages_exact() fails.
I think that's a change we could make _if_ the large_system_hash
users ever ask for it, but _not_ one we should make surreptitiously.

Hugh
