From: Dave Hansen <haveblue@us.ibm.com>
To: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: "Martin J. Bligh" <mbligh@mbligh.org>,
Christoph Lameter <clameter@engr.sgi.com>,
Andy Whitcroft <apw@shadowen.org>, Andrew Morton <akpm@osdl.org>,
linux-mm <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
shai@scalex86.org, steiner@sgi.com
Subject: Re: NUMA aware slab allocator V3
Date: Mon, 16 May 2005 14:54:30 -0700
Message-ID: <1116280470.1005.137.camel@localhost>
In-Reply-To: <200505161410.43382.jbarnes@virtuousgeek.org>
On Mon, 2005-05-16 at 14:10 -0700, Jesse Barnes wrote:
> On Monday, May 16, 2005 11:08 am, Martin J. Bligh wrote:
> > > I have never seen such a machine. A SMP machine with multiple
> > > "nodes"? So essentially one NUMA node has multiple discontig
> > > "nodes"?
> >
> > I believe you (SGI) make one ;-) Anywhere where you have large gaps
> > in the physical address range within a node, this is what you really
> > need. Except ia64 has this weird virtual mem_map thing that can go
> > away once we have sparsemem.
>
> Right, the SGI boxes have discontiguous memory within a node, but it's
> not represented by pgdats (like you said, one 'virtual memmap' spans
> the whole address space of a node). Sparse can help simplify this
> across platforms, but has the potential to be more expensive for
> systems with dynamically sized holes, due to the additional calculation
> and potential cache miss associated with indexing into the correct
> memmap (Dave can probably correct me here, it's been awhile). With a
> virtual memmap, you only occasionally take a TLB miss on the struct
> page access after indexing into the array.
The sparsemem calculation costs are quite low. One of the main costs is
bringing the actual 'struct page' into the cache so you can use the
hints in page->flags. In reality, after almost every pfn_to_page(), you
go ahead and touch the 'struct page' anyway. So, this cost is
effectively zero. In fact, it's kinda like doing a prefetch, so it may
even speed some things up.
After you have the section index from page->flags (which costs just a
shift and a mask), you index into a static array, and do a single
subtraction. Here's the i386 disassembly of this function with
SPARSEMEM=y:
	unsigned long page_to_pfn_stub(struct page *page)
	{
		return page_to_pfn(page);
	}

	1c30:  8b 54 24 04            mov    0x4(%esp),%edx
	1c34:  8b 02                  mov    (%edx),%eax
	1c36:  c1 e8 1a               shr    $0x1a,%eax
	1c39:  8b 04 85 00 00 00 00   mov    0x0(,%eax,4),%eax
	1c40:  24 fc                  and    $0xfc,%al
	1c42:  29 c2                  sub    %eax,%edx
	1c44:  c1 fa 05               sar    $0x5,%edx
	1c47:  89 d0                  mov    %edx,%eax
	1c49:  c3                     ret
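
To make the steps in that disassembly easier to follow, here is a minimal
userspace sketch of the same arithmetic. It is a model, not the kernel's
code: the names (struct mpage, model_mem_section, model_add_section,
model_page_to_pfn), the 16-page section size, and the use of bit 0 as a
"present" flag are all assumptions made for illustration. Only the 26-bit
shift, the 2-bit mask, and the 32-byte struct size are taken from the
instructions shown above.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model only: names and sizes are illustrative, not the kernel's. */
#define MODEL_SECTION_SHIFT  26                 /* matches "shr $0x1a" */
#define MODEL_NR_SECTIONS    (1u << (32 - MODEL_SECTION_SHIFT))
#define MODEL_PAGES_PER_SEC  16u                /* tiny on purpose */
#define MODEL_FLAG_BITS      ((uintptr_t)3)     /* cleared by "and $0xfc,%al" */

struct mpage {
	uint32_t flags;    /* section index kept in the top 6 bits */
	uint32_t pad[7];   /* pad to 32 bytes, matching "sar $0x5" */
};

/* One word per section: (map address - start_pfn * sizeof page) | flag bits */
static uintptr_t model_mem_section[MODEL_NR_SECTIONS];

/* Register a section's page array and stamp the section number into flags. */
static void model_add_section(uint32_t nr, struct mpage *map)
{
	uintptr_t start_pfn = (uintptr_t)nr * MODEL_PAGES_PER_SEC;
	uint32_t i;

	assert(nr < MODEL_NR_SECTIONS);
	assert(((uintptr_t)map & MODEL_FLAG_BITS) == 0);

	for (i = 0; i < MODEL_PAGES_PER_SEC; i++)
		map[i].flags = nr << MODEL_SECTION_SHIFT;

	/* Unsigned wraparound is well defined, so the biased base is safe to store. */
	model_mem_section[nr] =
		((uintptr_t)map - start_pfn * sizeof(struct mpage)) | 1;
}

/* Two loads (flags, table entry), then a mask, a subtract and a divide. */
static unsigned long model_page_to_pfn(const struct mpage *page)
{
	uint32_t nr = page->flags >> MODEL_SECTION_SHIFT;
	uintptr_t base = model_mem_section[nr] & ~MODEL_FLAG_BITS;

	return (unsigned long)(((uintptr_t)page - base) / sizeof(struct mpage));
}
```

With section 5 backed by its own 16-entry array sec5[], calling
model_add_section(5, sec5) and then model_page_to_pfn(&sec5[3]) gives
5 * 16 + 3 = 83, even though sec5[] is not adjacent to any other section's
array in memory.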
Other than loading the argument from the stack, I think there are only
two loads in there: the page->flags load, and the mem_section[]
dereference. So, in the end, the only advantage of the vmem_map[]
approach is saving that _one_ load. The worst-case scenario for this
load in the sparsemem case is a full cache miss. The worst case in the
vmem_map[] case is a TLB miss, which is probably hundreds of times
slower than even a full cache miss.
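
For comparison, a virtual memmap can be modelled as one flat array indexed
directly by pfn, so the conversion is plain pointer subtraction with no
per-section table lookup. This is only a sketch: vmem_map_model[],
struct vpage, and the 128-entry size are hypothetical, and a userspace
array says nothing about TLB behaviour, which is where the cost discussed
above would actually show up.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model only: one flat array stands in for the virtual memmap. */
#define VMODEL_NR_PAGES 128u

struct vpage {
	uint32_t flags;
	uint32_t pad[7];   /* 32 bytes, same size as in the i386 listing */
};

static struct vpage vmem_map_model[VMODEL_NR_PAGES];

/* No flags load and no table lookup: just subtract the array base. */
static unsigned long vmodel_page_to_pfn(const struct vpage *page)
{
	return (unsigned long)(page - vmem_map_model);
}

/* The reverse direction is a plain array index. */
static struct vpage *vmodel_pfn_to_page(unsigned long pfn)
{
	assert(pfn < VMODEL_NR_PAGES);
	return &vmem_map_model[pfn];
}
```

Here the array base is a link-time constant, so the model shows the best
case for the flat layout; holes in the physical address range would still
occupy (virtual) slots in the array.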
BTW, the object footprint of sparsemem is lower than discontigmem, too:

		SPARSEMEM	DISCONTIGMEM
pfn_to_page:	25 bytes	41 bytes
page_to_pfn:	25 bytes	33 bytes

So, that helps out things like icache footprint.
-- Dave