From: Mike Travis <travis@sgi.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>, Andi Kleen <andi@firstfloor.org>,
	linux-mm@kvack.org, Marcelo Tosatti <mtosatti@redhat.com>,
	Adam Litke <agl@us.ibm.com>, Avi Kivity <avi@redhat.com>,
	Izik Eidus <ieidus@redhat.com>,
	Hugh Dickins <hugh.dickins@tiscali.co.uk>,
	Nick Piggin <npiggin@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, Karl Feind <kaf@sgi.com>,
	Jack Steiner <steiner@sgi.com>
Subject: Re: RFC: Transparent Hugepage support
Date: Thu, 29 Oct 2009 09:50:07 -0700
Message-ID: <4AE9C7BF.3060509@sgi.com>
In-Reply-To: <20091029103658.GJ9640@random.random>

Hi Andrea,

I will find some time soon to test out your patch on a
(relatively) huge machine and let you know the results.

The memory size on this machine:

	480,700,399,616 bytes of system memory tested OK

This translates to ~229k available 2MB pages.

Thanks,
Mike

Andrea Arcangeli wrote:
> Hello Ingo, Andi, everyone,
> 
> On Thu, Oct 29, 2009 at 10:43:44AM +0100, Ingo Molnar wrote:
>> * Andi Kleen <andi@firstfloor.org> wrote:
>>
>>>> 1GB pages can't be handled by this code, and clearly it's not 
>>>> practical to hope 1G pages to materialize in the buddy (even if we
>>> That seems shortsighted. You do this because 2MB pages give you x% 
>>> performance advantage, but then it's likely that 1GB pages will give 
>>> another y% improvement and why should people stop at the smaller 
>>> improvement?
>>>
>>> Ignoring the gigantic pages now would just mean that this would need 
>>> to be revised later again or that users still need to use hacks like 
>>> libhugetlbfs.
>> I've read the patch and have read through this discussion and you are 
>> missing the big point that it's best to do such things gradually - one 
>> step at a time.
>>
>> Just like we went from 2 level pagetables to 3 level pagetables, then to 
>> 4 level pagetables - and we might go to 5 level pagetables in the 
>> future. We didn't go from 2 level pagetables to 5 level pagetables in 
>> one go, despite predictions clearly pointing out the exponentially 
>> increasing need for RAM.
> 
> I totally agree with your assessment.
> 
>> So your obsession with 1GB pages is misguided. If indeed transparent 
>> largepages give us real benefits we can extend it to do transparent 
>> gbpages as well - should we ever want to. There's nothing 'shortsighted' 
>> about being gradual - the change is already ambitious enough as-is, and 
>> brings very clear benefits to a difficult, decade-old problem no other 
>> person was able to address.
>>
>> In fact introducing transparent 2MBpages makes 1GB pages support 
>> _easier_ to merge: as at that point we'll already have a (finally..) 
>> successful hugetlb facility happily used by an increasing range of 
>> applications.
> 
> Agreed.
> 
>> Hugetlbfs's big problem was always that it wasn't transparent and hence 
>> wasn't gradual for applications. It was an opt-in and constituted an 
>> interface/ABI change - that is always a big barrier to app adoption.
>>
>> So i give Andrea's patch a very big thumbs up - i hope it gets reviewed 
>> in fine detail and added to -mm ASAP. Our lack of decent, automatic 
>> hugepage support is sticking out like a sore thumb and is hurting us in 
>> high-performance setups. If largepage support within Linux has a chance, 
>> this might be the way to do it.
> 
> Thanks a lot for your review!
> 
>> A small comment regarding the patch itself: i think it could be 
>> simplified further by eliminating CONFIG_TRANSPARENT_HUGEPAGE and by 
>> making it a natural feature of hugepage support. If the code is correct 
>> i cannot see any scenario under which i would want a hugepage enabled 
>> kernel i'm booting to not have transparent hugepage support as well.
> 
> The two reasons why I added a config option are:
> 
> 1) because it was easy enough: gcc is smart enough to eliminate the
> external calls, so I didn't need to add ifdefs with the exception of
> returning 0 from pmd_trans_huge and pmd_trans_frozen. I only had to
> make the exports of huge_memory.c visible unconditionally so it doesn't
> warn; after that I don't need to build and link huge_memory.o.
> 
> 2) to avoid breaking the build of archs that do not implement
> pmd_trans_huge and may never be able to take advantage of it
> 
> But we could move CONFIG_TRANSPARENT_HUGEPAGE to an arch define forced
> to Y on x86-64 and N on power.
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
