Christoph Lameter wrote:

>The -1 is optimized away for the non NUMA case. In the NUMA case its an
>additional parameter that is passed to kmem_cache_alloc. So its one
>additional register load that allows us to not have an additional function
>for the case non node specific allocations.
>

Correct, I was thinking about the NUMA case.

You've decided to add one register load to every call of kmalloc. On i386, kmalloc_node() is a 24-byte function. I'd bet that adding the node parameter to every call of kmalloc increases .text by more than 240 bytes, i.e. by more than ten times the size of the function it saves.

And that does not yet account for the extra conditional branch: every kmalloc(32,GFP_KERNEL) call now executes 4 conditional branch instructions instead of 3, an increase of 33%.

I'd add an explicit kmalloc_node function instead. Attached is a prototype patch. You'd have to reintroduce the flags field to kmem_cache_alloc_node() and update kmalloc_node.

The patch was edited by hand, so I hope it applies to a recent tree ;-)

What do you think?

--
    Manfred
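
For illustration only, the two call shapes under discussion can be modelled in userspace C. This is a rough sketch, not the attached patch and not kernel code: `NODE_ANY`, `mock_alloc_node`, `last_node` and the `sketch_*` names are hypothetical, the GFP flags argument is left out, and `malloc` stands in for the slab allocator. It shows only where the node argument has to be supplied in each design, not the actual code-size or branch cost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the "-1 means no node preference" argument. */
#define NODE_ANY (-1)

/* Records the node the mock backend was last asked for (for inspection). */
static int last_node;

/* Mock node-aware backend; malloc stands in for a per-node slab lookup. */
static void *mock_alloc_node(size_t size, int node)
{
    assert(node >= NODE_ANY);
    last_node = node;
    return malloc(size);
}

/*
 * Shape 1: a single entry point. Every call site, including the common
 * "any node" case, has to load and pass the node argument.
 */
static void *sketch_kmalloc_unified(size_t size, int node)
{
    return mock_alloc_node(size, node);
}

/*
 * Shape 2: the common path takes no node argument, so ordinary call
 * sites stay unchanged ...
 */
static void *sketch_kmalloc(size_t size)
{
    return mock_alloc_node(size, NODE_ANY);
}

/* ... and only node-specific callers use the explicit variant. */
static void *sketch_kmalloc_node(size_t size, int node)
{
    return mock_alloc_node(size, node);
}
```

In the second shape the cost of the extra argument is paid once, inside the one small extra function, rather than at each ordinary call site.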