From: Hugh Dickins <hugh@veritas.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Andi Kleen <andi@firstfloor.org>,
David Miller <davem@davemloft.net>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH mmotm] mm: alloc_large_system_hash check order
Date: Fri, 1 May 2009 12:30:03 +0100 (BST)
Message-ID: <Pine.LNX.4.64.0905011202530.8513@blonde.anvils>
In-Reply-To: <20090430132544.GB21997@csn.ul.ie>
On Thu, 30 Apr 2009, Mel Gorman wrote:
> On Wed, Apr 29, 2009 at 10:09:48PM +0100, Hugh Dickins wrote:
> > On an x86_64 with 4GB ram, tcp_init()'s call to alloc_large_system_hash(),
> > to allocate tcp_hashinfo.ehash, is now triggering an mmotm WARN_ON_ONCE on
> > order >= MAX_ORDER - it's hoping for order 11. alloc_large_system_hash()
> > had better make its own check on the order.
> >
> > Signed-off-by: Hugh Dickins <hugh@veritas.com>
>
> Looks good
>
> Reviewed-by: Mel Gorman <mel@csn.ul.ie>
Thanks.
>
> As I was looking there, it seemed that alloc_large_system_hash() should be
> using alloc_pages_exact() instead of having its own "give back the spare
> pages at the end of the buffer" logic. If alloc_pages_exact() was used, then
> the check for an order >= MAX_ORDER can be pushed down to alloc_pages_exact()
> where it may catch other unwary callers.
>
> How about adding the following patch on top of yours?
Well observed, yes indeed. In fact, it even looks as if, shock horror,
alloc_pages_exact() was _plagiarized_ from alloc_large_system_hash().
Blessed be the GPL, I'm sure we can skip the lengthy lawsuits!
>
> ==== CUT HERE ====
> Use alloc_pages_exact() in alloc_large_system_hash() to avoid duplicated logic
>
> alloc_large_system_hash() has logic for freeing unused pages at the end
> of a power-of-two-pages-aligned buffer that is a duplicate of what is in
> alloc_pages_exact(). This patch converts alloc_large_system_hash() to use
> alloc_pages_exact().
>
> Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> ---
> mm/page_alloc.c | 27 +++++----------------------
> 1 file changed, 5 insertions(+), 22 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 1b3da0f..c94b140 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1942,6 +1942,9 @@ void *alloc_pages_exact(size_t size, gfp_t gfp_mask)
>  	unsigned int order = get_order(size);
>  	unsigned long addr;
>
> +	if (order >= MAX_ORDER)
> +		return NULL;
> +
I suppose there could be an argument about whether we do or do not
want to skip the WARN_ON when it's in alloc_pages_exact().
I have no opinion on that; but DaveM's reply on large_system_hash
does make it clear that we're not interested in the warning there.
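
For anyone following the arithmetic, here is a minimal userspace sketch of
the check being added. It is not kernel code: SKETCH_PAGE_SIZE,
SKETCH_MAX_ORDER and both helper names are made up for illustration, and the
values 4096 and 11 are assumptions matching a typical x86_64 configuration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration; the real ones come from the kernel config. */
#define SKETCH_PAGE_SIZE ((size_t)4096)
#define SKETCH_MAX_ORDER 11u

/* Smallest order such that (1 << order) pages cover size bytes. */
static unsigned int sketch_get_order(size_t size)
{
	size_t pages = (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	unsigned int order = 0;

	while (((size_t)1 << order) < pages)
		order++;
	return order;
}

/* Mirrors the proposed early return: refuse orders at or above the limit. */
static bool sketch_order_is_allocatable(size_t size)
{
	return sketch_get_order(size) < SKETCH_MAX_ORDER;
}
```

With these assumed values a 4 MiB request is order 10 and passes, while an
8 MiB request is order 11, the order the ehash allocation was hoping for,
and is refused.
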
>  	addr = __get_free_pages(gfp_mask, order);
>  	if (addr) {
>  		unsigned long alloc_end = addr + (PAGE_SIZE << order);
> @@ -4755,28 +4758,8 @@ void *__init alloc_large_system_hash(const char *tablename,
>  			table = alloc_bootmem_nopanic(size);
>  		else if (hashdist)
>  			table = __vmalloc(size, GFP_ATOMIC, PAGE_KERNEL);
> -		else {
> -			unsigned long order = get_order(size);
> -
> -			if (order < MAX_ORDER)
> -				table = (void *)__get_free_pages(GFP_ATOMIC,
> -								order);
> -			/*
> -			 * If bucketsize is not a power-of-two, we may free
> -			 * some pages at the end of hash table.
> -			 */
That's actually a helpful comment, it's easy to think we're dealing
in powers of two here when we may not be. Maybe retain it with your
alloc_pages_exact call?
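
To illustrate why that comment earns its keep, here is a small userspace
sketch of how many tail pages get handed back when the table size is not a
power-of-two number of pages. The names and the 4096-byte page size are
assumptions for illustration only, not anything from the kernel tree.

```c
#include <stddef.h>

/* Assumed page size for illustration. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Smallest order such that (1 << order) pages cover size bytes. */
static unsigned int sketch_get_order(size_t size)
{
	size_t pages = (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	unsigned int order = 0;

	while (((size_t)1 << order) < pages)
		order++;
	return order;
}

/*
 * Pages between the page-aligned end of the table and the end of the
 * 2^order block, i.e. what the split-and-free loop would give back.
 */
static size_t sketch_tail_pages(size_t size)
{
	size_t used = (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	size_t allocated = (size_t)1 << sketch_get_order(size);

	return allocated - used;
}
```

A 3-page table gets an order-2 block (4 pages) and gives 1 page back; a
4-page table gives nothing back.
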
> -			if (table) {
> -				unsigned long alloc_end = (unsigned long)table +
> -						(PAGE_SIZE << order);
> -				unsigned long used = (unsigned long)table +
> -						PAGE_ALIGN(size);
> -				split_page(virt_to_page(table), order);
> -				while (used < alloc_end) {
> -					free_page(used);
> -					used += PAGE_SIZE;
> -				}
> -			}
> -		}
> +		else
> +			table = alloc_pages_exact(PAGE_ALIGN(size), GFP_ATOMIC);
Do you actually need that PAGE_ALIGN on the size?
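
The reasoning behind the question, as a userspace sketch: rounding the size
up to a page boundary never changes the page count, so it cannot change the
order either. This assumes alloc_pages_exact() derives the order from
get_order(size) (as the hunk context above shows) and rounds the used length
up to a page itself (an assumption here, based on the loop being removed from
alloc_large_system_hash()). All names and the 4096-byte page size are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed page size for illustration. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Round size up to the next multiple of the page size. */
static size_t sketch_page_align(size_t size)
{
	return (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Smallest order such that (1 << order) pages cover size bytes. */
static unsigned int sketch_get_order(size_t size)
{
	size_t pages = (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	unsigned int order = 0;

	while (((size_t)1 << order) < pages)
		order++;
	return order;
}

/* True when aligning the size first makes no difference to the order. */
static bool sketch_align_is_redundant(size_t size)
{
	return sketch_get_order(size) ==
	       sketch_get_order(sketch_page_align(size));
}
```

If the callee does its own rounding, the caller's PAGE_ALIGN is harmless but
does no extra work.
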
>  	} while (!table && size > PAGE_SIZE && --log2qty);
>
>  	if (!table)
Andrew noticed another oddity: that if it goes the hashdist __vmalloc()
way, it won't be limited by MAX_ORDER. Makes one wonder whether it
ought to fall back to __vmalloc() if the alloc_pages_exact() fails.
I think that's a change we could make _if_ the large_system_hash
users ever ask for it, but _not_ one we should make surreptitiously.
Hugh