From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Mon, 6 Aug 2007 09:42:37 -0700 From: Mark Gross Subject: Re: [PATCH/RFC] Allow selected nodes to be excluded from MPOL_INTERLEAVE masks Message-ID: <20070806164237.GA21133@linux.intel.com> Reply-To: mgross@linux.intel.com References: <1185566878.5069.123.camel@localhost> <200708011233.02103.ak@suse.de> <20070801110120.GA9449@linux-sh.org> <200708011307.44189.ak@suse.de> <20070801112116.GA9617@linux-sh.org> <1185976446.5059.27.camel@localhost> <20070802173825.GA7815@linux.intel.com> <1186080363.5040.63.camel@localhost> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1186080363.5040.63.camel@localhost> Sender: owner-linux-mm@kvack.org Return-Path: To: Lee Schermerhorn Cc: Paul Mundt , Andi Kleen , KAMEZAWA Hiroyuki , linux-mm , Christoph Lameter , Nishanth Aravamudan , kxr@sgi.com, akpm@linux-foundation.org, Eric Whitney List-ID: On Thu, Aug 02, 2007 at 02:46:03PM -0400, Lee Schermerhorn wrote: > On Thu, 2007-08-02 at 10:38 -0700, Mark Gross wrote: > > On Wed, Aug 01, 2007 at 09:54:06AM -0400, Lee Schermerhorn wrote: > > > On Wed, 2007-08-01 at 20:21 +0900, Paul Mundt wrote: > > > > On Wed, Aug 01, 2007 at 01:07:43PM +0200, Andi Kleen wrote: > > > > > > > > > > > As long as interleaving is possible after boot, then yes. It's only the > > > > > > boot-time interleave that we would like to avoid, > > > > > > > > > > But when anybody does interleaving later it could just as easily > > > > > fill up your small nodes, couldn't it? > > > > > > > > > Yes, but these are in embedded environments where we have control over > > > > what the applications are doing. Most of these sorts of things are for > > > > applications where we know what sort of latency requires we have to deal > > > > with, and so the workload is very much tied to the worst-case range of > > > > nodes, or just to a particular node. We might only have certain buffers > > > > that need to be backed by faster memory as well, so while most of the > > > > application pages will come from node 0 (system memory), certain other > > > > allocations will come from other nodes. We've been experimenting with > > > > doing that through tmpfs with mpol tuning. > > > > > > > > In the general case however it's fairly safe to include the tiny nodes as > > > > part of a larger set with a prefer policy so we don't immediately OOM. > > > > > > > > > Boot time allocations are small compared to what user space > > > > > later can allocate. > > > > > > > > > Yes, we only want certain applications to explicitly poke at those nodes, > > > > but they do have a use case for interleave, so it is not functionality I > > > > would want to lose completely. > > > > > > This is why I wanted to use an "obscure boot option". I don't see this > > > as strictly an architectural/platform issue. Rather, it's a combination > > > of the arch/platform and how it's being used for specific applications. > > > So, I don't see how one could accomplish this with a heuristic. > > > > > > As Paul mentioned, in embedded systems, one has a bit more control over > > > what applications are doing. In that case, I could envision a config > > > option to specify the initial/default value for the no_interleave_nodes > > > at kernel build time and dispense with the boot option. [Any interest > > > > Having the interleave as a build time option won't work for some power > > managed memory applications. I posted an RFC a few months back and will > > be coming back to it in a few weeks, so take this comment with a grain > > of salt. But I want to be able to switch on some ACPI table entries to > > trigger the non-interleave boot time allocation behavior for some FBDIM > > based platforms. My needs are in surprising alignment with Paul's on > > this stuff. > > > > > > --mgross > > > Mark: you mean "boot time option", right? I meant to express a preffence of avoiding a compile time only enablement of the non-interleave nodes. (I like the boot time option better.) > > When you get back to it, can you verify that this patch won't affect > what you want to do in policy init [boot time interleave mask]--? ...as I will. > long as no one specifies any no_interleave_nodes, of course. And even > then, all that happens is that maybe more nodes get excluded from the > boot time policy mask than you would have excluded based on ACPI info. yes. > > Until you have the ACPI table info and parsing in place [or maybe you > already have this], this patch could allow you to test with the desired > nodes excluded... We have a custom bios / table for this, but having a boot option would enable easier testing. thanks, --mgross -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org