From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Wed, 13 Sep 2006 14:40:25 -0700 (PDT) From: Christoph Lameter Subject: Re: [PATCH] Get rid of zone_table In-Reply-To: <1158180795.9141.158.camel@localhost.localdomain> Message-ID: References: <1158180795.9141.158.camel@localhost.localdomain> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-linux-mm@kvack.org Return-Path: To: Dave Hansen Cc: Andy Whitcroft , linux-mm@kvack.org List-ID: On Wed, 13 Sep 2006, Dave Hansen wrote: > > The section_to_node table (if we still need it) is still the size of the > > number of sections but the individual elements are integers (which already > > saves 50% on 64 bit platforms) and we do not need to duplicate the entries > > per zone type. So even if we have to keep the table then we shrink it to > > 1/4th (32bit) or 1/8th )(64bit). > > It doesn't feel like this is the best fit to go with sparsemem, but the > impact is pretty tiny, and it does seem somewhat sensible to put it > there. > > A few concerns: is there a cache or readability impact from keeping this > structure separate from the mem_section, when it is logically and > functionally pretty paired with it? It doesn't work with > SPARSEMEM_EXTREME (it would just blow up horribly), and this part at > least deserves a comment. Is there any impact from making this a > non-inlined call, unlike the old zonetable lookup? I am not that familiar with sparsemem thats why I asked you about it at first. I doubt there is much of an impact from making this non inlined. IMHO it is clearer and easier to maintain if the code to do the section lookup is put with the code that generates the sections. Its also an exceptional thing that is not needed in general. The main performance issue is probably the number of cachelines touched and the situation gets better here even for the worst case that we have to keep a separate lookup array. The array is denser. For page_zone(page) one would have to do two lookups in the worst case. One to get the node id and then another one in NODE_DATA() to get to the zone. However, the NODE_DATA()is frequently referenced so its likely to be in cache. The existing 3 lookups for page_to_nid() are reduced to a single lookup in the section_to_node_table(). Before we had to determine the zone and then fetch the corresponding pgdat address and then fetch the node number from the pgdat structure (yuck). You could put the node number with the section (put it in a separate cacheline before the start of the memsection array?) but then it would be in a cacheline of its own. This way you have the node number of a set of neighboring sections in one cacheline. With a 128 byte cacheline you have the nodes for the 32 neighboring section of memory. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org