Message-ID: <434292D3.2040105@shadowen.org>
Date: Tue, 04 Oct 2005 15:33:55 +0100
From: Andy Whitcroft
Subject: Re: sparsemem & sparsemem extreme question
References: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com>
In-Reply-To: <20051004065030.GA21741@osiris.boeblingen.de.ibm.com>
To: Heiko Carstens
Cc: linux-mm@kvack.org

Heiko Carstens wrote:
> I did an implementation of CONFIG_SPARSEMEM for s390, which indeed was quite
> easy. Just to find out that it was not sufficient :)
> SPARSEMEM_EXTREME looks better but unfortunately adds another layer of
> indirection.
> I'm just wondering why there is all this indirection stuff here and why not
> have one contiguous array of struct pages (residing in the vmalloc area) that
> deals with whatever size of memory an architecture wants to support.
> Unused areas just wouldn't have any backing with real pages and on access
> generate a page fault (nobody is supposed to access these pages anyway).
> This would have the advantage that all the primitives like e.g. pfn_to_page
> would be as simple as before, no need to waste large parts of the page flags
> and in addition it would easily allow for memory hotplug on page size
> granularity.
> The only drawbacks are (as far as I can see) a _huge_ virtual mem_map array,
> but that shouldn't matter too much. A real problem could be that the mem_map
> array and therefore the vmalloc area need to be generated quite early.
>
> Most probably this has already been thought about before, but I couldn't find
> anything in the archives.

During the implementation of SPARSEMEM_EXTREME other layouts, such as
the huge 'partially populated' mem_map, were considered.  For a number
of our target architectures kernel virtual address space is at a
premium, so this would not be suitable for them.

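To put a rough number on the KVA cost of a single flat mem_map, here is
a small back-of-the-envelope helper.  The inputs used below (36
physical address bits, 4 KiB pages, a 32-byte struct page) are
assumptions picked for illustration, not measurements from any
particular tree, and flat_mem_map_bytes() is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Virtual span a flat, partially populated mem_map would need if it had
 * to cover every possible page frame in the physical address range.
 * All three parameters are illustrative assumptions supplied by the
 * caller; nothing here is taken from kernel headers.
 */
static uint64_t flat_mem_map_bytes(unsigned phys_bits, unsigned page_shift,
                                   uint64_t struct_page_size)
{
    assert(phys_bits >= page_shift && phys_bits - page_shift < 64);

    /* One struct page per possible page frame, present or not. */
    uint64_t nr_pages = (uint64_t)1 << (phys_bits - page_shift);

    return nr_pages * struct_page_size;
}
```

With those assumed inputs, flat_mem_map_bytes(36, 12, 32) comes to
512 MiB of virtual space, which is about half of the roughly 1 GiB a
32-bit kernel has with the usual 3G/1G split, before any of it is
backed by real memory.
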
We did consider whether to have different mechanisms for KVA-rich
architectures, but (if I remember correctly) benchmarking the
implementation seemed to indicate that the cost of the additional
indirection was insignificant, if detectable at all.  The architecture
of sparsemem is supposed to allow architecture-specific implementations
should that be necessary, but I've not yet seen a compelling argument
for one.

On the subject of page flags, I would point out that SPARSEMEM either
reuses already-used bits on 32-bit architectures, or makes use of
unused bits in the 64-bit case.  It doesn't reduce the number of flag
bits available.

-apw
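
P.S. For anyone wondering what the extra layer of indirection amounts
to, the following is a toy user-space model of a two-level section
lookup.  It is only a sketch: every name and size in it (the toy_
prefix, 16 pages per section, 8 sections per root, 4 roots) is invented
for illustration and it is not the kernel code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy sizes, chosen only to keep the example small (assumptions). */
#define TOY_SECTION_SHIFT      4                        /* 16 pages per section */
#define TOY_PAGES_PER_SECTION  (1UL << TOY_SECTION_SHIFT)
#define TOY_SECTIONS_PER_ROOT  8UL
#define TOY_NR_ROOTS           4UL

struct toy_page { unsigned long flags; };

struct toy_section {
    struct toy_page *map;    /* NULL when the section has no memory */
};

/* First level: root pointers, allocated only when a section needs them. */
static struct toy_section *toy_roots[TOY_NR_ROOTS];

/* Mark section 'nr' present, allocating its root and page array on demand. */
static int toy_section_online(unsigned long nr)
{
    unsigned long root = nr / TOY_SECTIONS_PER_ROOT;
    struct toy_section *ms;

    if (root >= TOY_NR_ROOTS)
        return -1;
    if (!toy_roots[root]) {
        toy_roots[root] = calloc(TOY_SECTIONS_PER_ROOT, sizeof(*toy_roots[root]));
        if (!toy_roots[root])
            return -1;
    }
    ms = &toy_roots[root][nr % TOY_SECTIONS_PER_ROOT];
    if (!ms->map)
        ms->map = calloc(TOY_PAGES_PER_SECTION, sizeof(*ms->map));
    return ms->map ? 0 : -1;
}

/* pfn -> page: root lookup, then section lookup, then index into its map. */
static struct toy_page *toy_pfn_to_page(unsigned long pfn)
{
    unsigned long nr = pfn >> TOY_SECTION_SHIFT;
    unsigned long root = nr / TOY_SECTIONS_PER_ROOT;
    struct toy_section *ms;

    if (root >= TOY_NR_ROOTS || !toy_roots[root])
        return NULL;
    ms = &toy_roots[root][nr % TOY_SECTIONS_PER_ROOT];
    if (!ms->map)
        return NULL;
    return &ms->map[pfn & (TOY_PAGES_PER_SECTION - 1)];
}
```

The point of the model is that the lookup costs one extra dependent
load compared with a single-level table, while holes consume neither
page arrays nor, for wholly empty roots, second-level tables.
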