From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 3 Oct 2006 12:37:52 -0700 (PDT)
From: David Rientjes
Subject: Re: [RFC] another way to speed up fake numa node page_alloc
In-Reply-To: <20061003111517.a5cc30ea.pj@sgi.com>
Message-ID:
References: <20060925091452.14277.9236.sendpatchset@v0> <20061001231811.26f91c47.pj@sgi.com> <20061001234858.fe91109e.pj@sgi.com> <20061002014121.28b759da.pj@sgi.com> <20061003111517.a5cc30ea.pj@sgi.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Paul Jackson
Cc: linux-mm@kvack.org, akpm@osdl.org, nickpiggin@yahoo.com.au, ak@suse.de, mbligh@google.com, rohitseth@google.com, menage@google.com, clameter@sgi.com
List-ID:

On Tue, 3 Oct 2006, Paul Jackson wrote:

> pj, responding to David:
> > > With NODES_SHIFT equal to 10 as you recommend, you can't get away with an
> > > unsigned short there.
> >
> > Apparently it's time for me to be a stupid git again. That's ok; I'm
> > getting quite accustomed to it.
> >
> > Could you spell out exactly why I can't get away with an unsigned short
> > node_id if NODES_SHIFT is 10?
> >
> Is this still in your queue to respond to, David?
>
> I'm still curious as to why I can't get away with an unsigned short there.
>

Because it's unnecessary.  On my 4G machine with numa=fake=256, each of
these node_id arrays is going to be 1.5K.  You could get the exact same
behavior using a u8 or unsigned char.

There's no reason to support anything greater than a shift of 8, since
NUMA emulation is _only_ available on x86_64 and doesn't even work right
as it stands in the current mainline: you couldn't boot my machine with
anything more than numa=fake=8.  If you are going to abstract this
functionality to other architectures, or even make it generic, I would
suggest following Magnus Damm's example and creating a NODES_SHIFT_HW
instead that would limit the number of numa=fake nodes.  There is simply
no reason for this to be greater than 8 (even a 128G machine with
numa=fake=256 would only have 512M per node).

Secondly, the entire node_id lookup is redundant on x86_64 in the first
place (see arch/x86_64/mm/numa.c and include/asm-x86_64/mmzone.h for
memnodemap).  The only thing your node_id array in each zonelist makes
faster is moving this calculation from two steps to one; since both
steps in the mainline implementation today are inline functions, I think
the improvement is minimal.

			David

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
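
A minimal standalone sketch of the sizing argument in the reply above:
node ids range over 0 .. (1 << NODES_SHIFT) - 1, so capping NODES_SHIFT
at 8 lets a single byte hold any id, halving the cost of a per-zonelist
node_id array relative to an unsigned short.  NODE_ID_ENTRIES below is a
made-up length (768 two-byte entries happens to match the 1.5K figure
quoted in the message, but the exact layout of the RFC's array is not
shown here).

#include <stdio.h>
#include <stdint.h>

#define NODES_SHIFT     8
#define MAX_NUMNODES    (1 << NODES_SHIFT)  /* 256: largest id, 255, fits in a u8 */
#define NODE_ID_ENTRIES 768                 /* illustrative length only */

int main(void)
{
	uint8_t  node_id_u8[NODE_ID_ENTRIES];   /* cost with a u8/unsigned char  */
	uint16_t node_id_u16[NODE_ID_ENTRIES];  /* cost with an unsigned short   */

	printf("max node id: %d\n", MAX_NUMNODES - 1);
	printf("u8 array: %zu bytes, u16 array: %zu bytes\n",
	       sizeof(node_id_u8), sizeof(node_id_u16));
	return 0;
}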
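
And a rough standalone sketch of the two-step lookup the message points
at (the memnodemap table on x86_64): shift the physical address by
memnode_shift, then index a table of u8 node ids.  This is a paraphrase
for illustration, not the kernel code verbatim; the table size, shift
value, and mapping below are invented for the demo.

#include <stdio.h>
#include <stdint.h>

#define NODEMAPSIZE 0x4000              /* invented table size for the demo     */

static int memnode_shift = 27;          /* e.g. 128M chunks; value is made up   */
static uint8_t memnodemap[NODEMAPSIZE]; /* one u8 node id per 2^shift-byte chunk */

/* Step 1: shift the physical address; step 2: index the u8 table. */
static int phys_to_nid(unsigned long addr)
{
	return memnodemap[addr >> memnode_shift];
}

int main(void)
{
	/* Pretend the fourth chunk belongs to node 5, then look an address up. */
	memnodemap[3] = 5;

	unsigned long addr = 3UL << memnode_shift;
	printf("phys 0x%lx -> node %d\n", addr, phys_to_nid(addr));
	return 0;
}

Since both steps collapse to a shift and a byte load once inlined, the
point above about the improvement being minimal follows directly.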