From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pa0-f41.google.com (mail-pa0-f41.google.com [209.85.220.41])
        by kanga.kvack.org (Postfix) with ESMTP id E34D86B0107
        for ; Mon, 24 Feb 2014 00:08:44 -0500 (EST)
Received: by mail-pa0-f41.google.com with SMTP id fa1so6036932pad.14
        for ; Sun, 23 Feb 2014 21:08:44 -0800 (PST)
Received: from LGEAMRELO02.lge.com (lgeamrelo02.lge.com. [156.147.1.126])
        by mx.google.com with ESMTP id mo4si15363434pbc.21.2014.02.23.21.08.42
        for ; Sun, 23 Feb 2014 21:08:43 -0800 (PST)
Date: Mon, 24 Feb 2014 14:08:51 +0900
From: Joonsoo Kim
Subject: Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node
Message-ID: <20140224050851.GB14814@lge.com>
References: <20140207054819.GC28952@lge.com>
        <20140210191321.GD1558@linux.vnet.ibm.com>
        <20140211074159.GB27870@lge.com>
        <20140217065257.GD3468@lge.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-ID:
To: Christoph Lameter
Cc: Nishanth Aravamudan, David Rientjes, Han Pingtian, Pekka Enberg,
        Linux Memory Management List, Paul Mackerras, Anton Blanchard,
        Matt Mackall, linuxppc-dev@lists.ozlabs.org, Wanpeng Li

On Tue, Feb 18, 2014 at 10:38:01AM -0600, Christoph Lameter wrote:
> On Mon, 17 Feb 2014, Joonsoo Kim wrote:
>
> > On Wed, Feb 12, 2014 at 04:16:11PM -0600, Christoph Lameter wrote:
> > > Here is another patch with some fixes. The additional logic is only
> > > compiled in if CONFIG_HAVE_MEMORYLESS_NODES is set.
> > >
> > > Subject: slub: Memoryless node support
> > >
> > > Support memoryless nodes by tracking which allocations are failing.
> >
> > I still don't understand why this tracking is needed.
>
> Its an optimization to avoid calling the page allocator to figure out if
> there is memory available on a particular node.
>
> > All we need for allcation targeted to memoryless node is to fallback proper
> > node, that it, numa_mem_id() node of targeted node. My previous patch
> > implements it and use proper fallback node on every allocation code path.
> > Why this tracking is needed? Please elaborate more on this.
>
> Its too slow to do that on every alloc. One needs to be able to satisfy
> most allocations without switching percpu slabs for optimal performance.

I don't think that we need to switch percpu slabs on every alloc.
Allocations targeted to a specific node are rare, and most of them
target either numa_node_id() or numa_mem_id(). My patch handles these
cases, so most allocations are still served from percpu slabs and
performance does not suffer.

>
> > > Allocations targeted to the nodes without memory fall back to the
> > > current available per cpu objects and if that is not available will
> > > create a new slab using the page allocator to fallback from the
> > > memoryless node to some other node.
>
> And what about the next alloc? Assuem there are N allocs from a memoryless
> node this means we push back the partial slab on each alloc and then fall
> back?
>
> > >  {
> > >  	void *object;
> > > -	int searchnode = (node == NUMA_NO_NODE) ? numa_node_id() : node;
> > > +	int searchnode = (node == NUMA_NO_NODE) ? numa_mem_id() : node;
> > >
> > >  	object = get_partial_node(s, get_node(s, searchnode), c, flags);
> > >  	if (object || node != NUMA_NO_NODE)
> >
> > This isn't enough.
> > Consider that allcation targeted to memoryless node.
>
> It will not common get there because of the tracking. Instead a per cpu
> object will be used.
> > get_partial_node() always fails even if there are some partial slab on
> > memoryless node's neareast node.
>
> Correct and that leads to a page allocator action whereupon the node will
> be marked as empty.

Why do we need to go to the page allocator if there is a partial slab?
Checking whether a node is memoryless is really easy, so we don't need
to skip this check. Skipping it is a suboptimal solution.

> > We should fallback to some proper node in this case, since there is no slab
> > on memoryless node.
>
> NUMA is about optimization of memory allocations. It is often *not* about
> correctness but heuristics are used in many cases. F.e. see the zone
> reclaim logic, zone reclaim mode, fallback scenarios in the page allocator
> etc etc.

Okay. But 'do our best' is preferable to me.

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .