From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4470417F.2000605@yahoo.com.au>
Date: Sun, 21 May 2006 20:31:27 +1000
From: Nick Piggin
MIME-Version: 1.0
Subject: Re: [patch 2/2] mm: handle unaligned zones
References: <4470232B.7040802@yahoo.com.au> <44702358.1090801@yahoo.com.au> <20060521021905.0f73e01a.akpm@osdl.org>
In-Reply-To: <20060521021905.0f73e01a.akpm@osdl.org>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Andrew Morton
Cc: apw@shadowen.org, mel@csn.ul.ie, stable@kernel.org, linux-mm@kvack.org
List-ID:

Andrew Morton wrote:
> Nick Piggin wrote:
>
>>Allow unaligned zones, and make this an opt-in CONFIG_ option because
>>some architectures appear to be relying on unaligned zones being handled
>>correctly.
>>
>>- Also, the bad_range checks are removed, they are checked at meminit time
>>  since the last patch.
>>
>>...
>>
>>Index: linux-2.6/mm/page_alloc.c
>>===================================================================
>>--- linux-2.6.orig/mm/page_alloc.c	2006-05-21 17:53:36.000000000 +1000
>>+++ linux-2.6/mm/page_alloc.c	2006-05-21 18:20:13.000000000 +1000
>>
>>...
>>
>>+{
>>+#ifdef CONFIG_HOLES_IN_ZONE
>
>
> (Why is this a config option?  If we can optionally handle it, why not
> always just handle it?

Holes in zone? or unaligned zones? Holes in zone, I guess because
it is seen as somewhat of a special case, and can be removed if
ia64 moves to sparsemem.

>
>
>>+/*
>>+ * If the the zone's mem_map is not 1<>+ * must *not* be set by the architecture, because the buddy allocator will run
>>+ * into "buddies" which are outside mem_map.
>>+ *
>>+ * It is not enough for the node's mem_map to be aligned, because unaligned
>>+ * zone boundaries can cause a buddies to be in different zones.
>>+ */
>>+static inline int buddy_outside_zone_span(struct page *page, struct page *buddy)
>>+{
>>+	int ret = 0;
>>+
>>+#ifndef CONFIG_ALIGNED_ZONE
>>+	unsigned int seq;
>>+	unsigned long pfn;
>>+	struct zone *zone;
>>+
>>+	pfn = page_to_pfn(page);
>>+	zone = page_zone(page);
>>+
>>+	do {
>
>
> You'll want a `ret = 0' here.

Thanks.

>
>
>>+		seq = zone_span_seqbegin(zone);
>>+		if (pfn >= zone->zone_start_pfn + zone->spanned_pages)
>>+			ret = 1;
>>+		else if (pfn < zone->zone_start_pfn)
>>+			ret = 1;
>>+	} while (zone_span_seqretry(zone, seq));
>>+	if (ret)
>>+		goto out;
>>+
>>+	/*
>>+	 * page_zone_idx accesses page->flags, so this test must go after
>>+	 * the above, which ensures that buddy is within the zone.
>>+	 */
>>+	if (page_zone_idx(page) != page_zone_idx(buddy))
>>+		ret = 1;
>>+
>>+out:
>>+#endif
>>+
>>+	return ret;
>>+}
>>+
>>+/*
>>+ * In some memory configurations, buddy pages may be found which are
>>+ * outside the zone pages. Check for those here.
>>+ */
>>+static int buddy_outside_zone(struct page *page, struct page *buddy)
>>+{
>>+	if (page_in_zone_hole(buddy))
>>+		return 1;
>>+
>>+	if (buddy_outside_zone_span(page, buddy))
>>+		return 1;
>>+
>>+	return 0;
>>+}
>>+
>>+/*
>>+ * This function checks whether a buddy is free and is the buddy of page.
>>+ * We can coalesce a page and its buddy if
>>+ * (a) the buddy is not "outside" the zone &&
>>  * (b) the buddy is in the buddy system &&
>>  * (c) a page and its buddy have the same order.
>>  *
>>@@ -292,15 +320,13 @@ __find_combined_index(unsigned long page
>>  *
>>  * For recording page's order, we use page_private(page).
>>  */
>>-static inline int page_is_buddy(struct page *page, int order)
>>+static inline int page_is_buddy(struct page *page, struct page *buddy, int order)
>> {
>>-#ifdef CONFIG_HOLES_IN_ZONE
>>-	if (!pfn_valid(page_to_pfn(page)))
>>+	if (buddy_outside_zone(page, buddy))
>> 		return 0;
>
>
> This is a heck of a lot of code to be throwing into the page-freeing
> hotpath.
> Surely there's a way of moving all this work to
> initialisation/hotadd time?

Can't think of any good way to do it. We could add yet another page
flag, which would relegate unaligned portions of zones to only order-0
pages (and never try to merge them up the buddy allocator). Of course
that's another page flag.

It is possible we can avoid the zone seqlock checks simply by always
testing whether the pfn is valid (this way the test would be more
unified with the holes in zone case). The tests would still be pretty
heavyweight though.

-- 
SUSE Labs, Novell Inc.
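
The span check discussed in this thread can be illustrated outside the kernel. What follows is a minimal user-space sketch, not the patch itself. It assumes a block's buddy is found by flipping bit `order` of its pfn, treats pfns as plain integers, and leaves out the zone span seqlock entirely. `struct zone_span`, `buddy_pfn()`, `pfn_outside_span()` and `buddy_outside_span()` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two zone fields the span check reads. */
struct zone_span {
	unsigned long start_pfn;	/* first pfn covered by the zone */
	unsigned long spanned_pages;	/* pages spanned, holes included */
};

/*
 * Assumption: the buddy of the 2^order block starting at pfn is found
 * by flipping bit `order` of the pfn.
 */
static unsigned long buddy_pfn(unsigned long pfn, unsigned int order)
{
	/* Shifting by the full width of the type would be undefined. */
	assert(order < sizeof(unsigned long) * CHAR_BIT);
	return pfn ^ (1UL << order);
}

/* True when pfn lies outside [start_pfn, start_pfn + spanned_pages). */
static bool pfn_outside_span(const struct zone_span *z, unsigned long pfn)
{
	return pfn < z->start_pfn || pfn >= z->start_pfn + z->spanned_pages;
}

/* True when the buddy of the block at pfn falls outside the zone span. */
static bool buddy_outside_span(const struct zone_span *z,
			       unsigned long pfn, unsigned int order)
{
	return pfn_outside_span(z, buddy_pfn(pfn, order));
}
```

For a zone spanning pfns 4 to 15, the order-2 block at pfn 4 has its buddy at pfn 0, which lies before the zone start, while the order-2 block at pfn 8 has its buddy at pfn 12, inside the zone. Only an unaligned zone start (or end) can produce the first case, which is why a check of this kind would be unnecessary when zones are aligned.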