From: Andi Kleen <ak@suse.de>
To: Brent Casavant <bcasavan@sgi.com>
Cc: "Martin J. Bligh" <mbligh@aracnet.com>, Andi Kleen <ak@suse.de>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-ia64@vger.kernel.org
Subject: Re: [PATCH 0/3] NUMA boot hash allocation interleaving
Date: Wed, 15 Dec 2004 05:08:54 +0100
Message-ID: <20041215040854.GC27225@wotan.suse.de>
In-Reply-To: <Pine.SGI.4.61.0412141720420.22462@kzerza.americas.sgi.com>

On Tue, Dec 14, 2004 at 05:24:02PM -0600, Brent Casavant wrote:
> On Tue, 14 Dec 2004, Martin J. Bligh wrote:
> 
> > --On Tuesday, December 14, 2004 20:13:48 +0100 Andi Kleen <ak@suse.de> wrote:
> > 
> > > I originally was a bit worried about the TLB usage, but it doesn't
> > > seem to be a too big issue (hopefully the benchmarks weren't too
> > > micro though)
> > 
> > Well, as long as we stripe on large page boundaries, it should be fine,
> > I'd think. On PPC64, it'll screw the SLB, but ... tough ;-) We can either
> > turn it off, or only do it on things larger than the segment size, and
> > just round-robin the rest, or allocate from node with most free.
> 
> Is there a reasonably easy-to-use existing infrastructure to do this?

No. It would actually be a lot of work, requiring new code for
each architecture, and may even be impossible on some.
The current hugetlb code is not really suitable for this
because it requires a preallocated pool and only works
for user space.

I actually considered implementing it for x86-64 some time ago
for the modules, but I never bothered. On AMD systems
I actually prefer to use small pages here. The reason is that
Opteron has separate TLBs for large and small pages, and the
small-page TLB is much bigger. When someone else uses huge TLB
pages too (user space or the kernel direct mapping), it's actually
a good idea to use small pages.

It may also be difficult in some cases to allocate
such large pages even at boot, and impossible to do so
later when a module loads.

Also, at least on IA64 the large page size is usually 1-2GB,
and that seems a little too large to me for
interleaving purposes. It may also defeat the purpose
you implemented this for: not using too much memory from a single
node.
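
To illustrate the stripe-size concern, here is a minimal user-space
sketch, not code from the patch. The helper name interleave_bytes(),
the MAX_NODES limit, and the sizes mentioned afterwards are all
assumptions chosen for illustration. It hands out an allocation
round-robin across nodes in stripe-sized chunks and records how many
bytes land on each node.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 64	/* hypothetical upper bound, for illustration only */

/*
 * Hypothetical helper: distribute `size` bytes round-robin over
 * `nr_nodes` nodes in chunks of `stripe` bytes, starting at node 0.
 * On return, per_node[i] holds the number of bytes placed on node i.
 */
static void interleave_bytes(unsigned long long size,
			     unsigned long long stripe,
			     int nr_nodes,
			     unsigned long long per_node[])
{
	int node = 0;
	int i;

	assert(stripe > 0);
	assert(nr_nodes > 0 && nr_nodes <= MAX_NODES);

	for (i = 0; i < nr_nodes; i++)
		per_node[i] = 0;

	while (size > 0) {
		/* The last chunk may be shorter than a full stripe. */
		unsigned long long chunk = size < stripe ? size : stripe;

		per_node[node] += chunk;
		size -= chunk;
		node = (node + 1) % nr_nodes;
	}
}
```

With an assumed 64MB table on 4 nodes, a 1GB stripe puts all 64MB on
node 0, while a 16KB stripe puts 16MB on each node. Any table smaller
than the stripe is not interleaved at all.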

Using other page sizes would probably be tricky because the
Linux VM can currently barely deal with two page sizes.
I suspect handling more would need some VM infrastructure effort,
at least in the port being changed.

> I didn't find anything in my examination of vmalloc itself, so I gave
> up on the idea.
> 
> And just to clarify, are you saying you want to see this before inclusion
> in mainline kernels, or that it would be nice to have but not necessary?

I wouldn't do anything in this area unless somebody shows benchmark or
profiling results where TLB pressure makes a clear difference. And even
then it may not be worth the effort.

-Andi