linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mel Gorman <mel@csn.ul.ie>
To: akpm@linux-foundation.org
Cc: Mel Gorman <mel@csn.ul.ie>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 3/13] Fix corruption of memmap on ia64-sparsemem when mem_section is not a power of 2
Subject: Fix corruption of memmap on ia64-sparsemem when mem_section is not a power of 2
Date: Mon, 10 Sep 2007 12:21:11 +0100 (IST)	[thread overview]
Message-ID: <20070910112111.3097.85750.sendpatchset@skynet.skynet.ie> (raw)
In-Reply-To: <20070910112011.3097.8438.sendpatchset@skynet.skynet.ie>

There are problems in the use of SPARSEMEM and pageblock flags that causes
problems on ia64.

The first part of the problem is that units are incorrect in
SECTION_BLOCKFLAGS_BITS computation.  This results in a map_section's
section_mem_map being treated as part of a bitmap which isn't good.  This
was evident with an invalid virtual address when mem_init attempted to free
bootmem pages while relinquishing control from the bootmem allocator.

The second part of the problem occurs because the pageblock flags bitmap is
be located with the mem_section.  The SECTIONS_PER_ROOT computation using
sizeof (mem_section) may not be a power of 2 depending on the size of the
bitmap.  This renders masks and other such things not power of 2 base.
This issue was seen with SPARSEMEM_EXTREME on ia64.  This patch moves the
bitmap outside of mem_section and uses a pointer instead in the
mem_section.  The bitmaps are allocated when the section is being
initialised.

Note that sparse_early_usemap_alloc() does not use alloc_remap() like
sparse_early_mem_map_alloc().  The allocation required for the bitmap on
x86, the only architecture that uses alloc_remap is typically smaller than
a cache line.  alloc_remap() pads out allocations to the cache size which
would be a needless waste.

Credit to Bob Picco for identifying the original problem and effecting a
fix for the SECTION_BLOCKFLAGS_BITS calculation.  Credit to Andy Whitcroft
for devising the best way of allocating the bitmaps only when required for
the section.

From: Bob Picco <bob.picco@hp.com>
[wli@holomorphy.com: warning fix]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: William Irwin <bill.irwin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mmzone.h |    4 ++-
 mm/sparse.c            |   54 +++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 54 insertions(+), 4 deletions(-)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.23-rc5-002-add-a-bitmap-that-is-used-to-track-flags-affecting-a-block-of-pages/include/linux/mmzone.h linux-2.6.23-rc5-003-fix-corruption-of-memmap-on-ia64-sparsemem-when-mem_section-is-not-a-power-of-2/include/linux/mmzone.h
--- linux-2.6.23-rc5-002-add-a-bitmap-that-is-used-to-track-flags-affecting-a-block-of-pages/include/linux/mmzone.h	2007-09-02 16:19:05.000000000 +0100
+++ linux-2.6.23-rc5-003-fix-corruption-of-memmap-on-ia64-sparsemem-when-mem_section-is-not-a-power-of-2/include/linux/mmzone.h	2007-09-02 16:19:16.000000000 +0100
@@ -739,7 +739,9 @@ struct mem_section {
 	 * before using it wrong.
 	 */
 	unsigned long section_mem_map;
-	DECLARE_BITMAP(pageblock_flags, SECTION_BLOCKFLAGS_BITS);
+
+	/* See declaration of similar field in struct zone */
+	unsigned long *pageblock_flags;
 };
 
 #ifdef CONFIG_SPARSEMEM_EXTREME
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.23-rc5-002-add-a-bitmap-that-is-used-to-track-flags-affecting-a-block-of-pages/mm/sparse.c linux-2.6.23-rc5-003-fix-corruption-of-memmap-on-ia64-sparsemem-when-mem_section-is-not-a-power-of-2/mm/sparse.c
--- linux-2.6.23-rc5-002-add-a-bitmap-that-is-used-to-track-flags-affecting-a-block-of-pages/mm/sparse.c	2007-09-02 16:18:56.000000000 +0100
+++ linux-2.6.23-rc5-003-fix-corruption-of-memmap-on-ia64-sparsemem-when-mem_section-is-not-a-power-of-2/mm/sparse.c	2007-09-02 16:19:16.000000000 +0100
@@ -204,14 +204,16 @@ struct page *sparse_decode_mem_map(unsig
 }
 
 static int __meminit sparse_init_one_section(struct mem_section *ms,
-		unsigned long pnum, struct page *mem_map)
+		unsigned long pnum, struct page *mem_map,
+		unsigned long *pageblock_bitmap)
 {
 	if (!present_section(ms))
 		return -EINVAL;
 
 	ms->section_mem_map &= ~SECTION_MAP_MASK;
 	ms->section_mem_map |= sparse_encode_mem_map(mem_map, pnum) |
							SECTION_HAS_MEM_MAP;
+ 	ms->pageblock_flags = pageblock_bitmap;
 
 	return 1;
 }
@@ -221,6 +223,38 @@ void *alloc_bootmem_high_node(pg_data_t 
 	return NULL;
 }
 
+static unsigned long usemap_size(void)
+{
+	unsigned long size_bytes;
+	size_bytes = roundup(SECTION_BLOCKFLAGS_BITS, 8) / 8;
+	size_bytes = roundup(size_bytes, sizeof(unsigned long));
+	return size_bytes;
+}
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+static unsigned long *__kmalloc_section_usemap(void)
+{
+	return kmalloc(usemap_size(), GFP_KERNEL);
+}
+#endif /* CONFIG_MEMORY_HOTPLUG */
+
+static unsigned long *sparse_early_usemap_alloc(unsigned long pnum)
+{
+	unsigned long *usemap;
+	struct mem_section *ms = __nr_to_section(pnum);
+	int nid = sparse_early_nid(ms);
+
+	usemap = alloc_bootmem_node(NODE_DATA(nid), usemap_size());
+	if (usemap)
+		return usemap;
+
+	/* Stupid: suppress gcc warning for SPARSEMEM && !NUMA */
+	nid = 0;
+
+	printk(KERN_WARNING "%s: allocation failed\n", __FUNCTION__);
+	return NULL;
+}
+
 struct page __init *sparse_early_mem_map_alloc(unsigned long pnum)
 {
 	struct page *map;
@@ -254,6 +288,7 @@ void __init sparse_init(void)
 {
 	unsigned long pnum;
 	struct page *map;
+	unsigned long *usemap;
 
 	for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
 		if (!valid_section_nr(pnum))
@@ -262,7 +297,13 @@ void __init sparse_init(void)
 		map = sparse_early_mem_map_alloc(pnum);
 		if (!map)
 			continue;
-		sparse_init_one_section(__nr_to_section(pnum), pnum, map);
+
+		usemap = sparse_early_usemap_alloc(pnum);
+		if (!usemap)
+			continue;
+
+		sparse_init_one_section(__nr_to_section(pnum), pnum, map,
+								usemap);
 	}
 }
 
@@ -318,6 +359,7 @@ int sparse_add_one_section(struct zone *
 	struct pglist_data *pgdat = zone->zone_pgdat;
 	struct mem_section *ms;
 	struct page *memmap;
+	unsigned long *usemap;
 	unsigned long flags;
 	int ret;
 
@@ -327,6 +369,7 @@ int sparse_add_one_section(struct zone *
 	 */
 	sparse_index_init(section_nr, pgdat->node_id);
 	memmap = __kmalloc_section_memmap(nr_pages);
+	usemap = __kmalloc_section_usemap();
 
 	pgdat_resize_lock(pgdat, &flags);
 
@@ -335,9 +378,14 @@ int sparse_add_one_section(struct zone *
 		ret = -EEXIST;
 		goto out;
 	}
+
+	if (!usemap) {
+		ret = -ENOMEM;
+		goto out;
+	}
 	ms->section_mem_map |= SECTION_MARKED_PRESENT;
 
-	ret = sparse_init_one_section(ms, section_nr, memmap);
+	ret = sparse_init_one_section(ms, section_nr, memmap, usemap);
 
 out:
 	pgdat_resize_unlock(pgdat, &flags);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2007-09-10 11:21 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-09-10 11:20 [PATCH 0/13] Reduce external fragmentation by grouping pages by mobility v30 Mel Gorman
2007-09-10 11:20 ` [PATCH 1/13] ia64: parse kernel parameter hugepagesz= in early boot, ia64: parse kernel parameter hugepagesz= in early boot Mel Gorman
2007-09-10 11:20 ` [PATCH 2/13] Add a bitmap that is used to track flags affecting a block of pages, Add a bitmap that is used to track flags affecting a block of pages Mel Gorman
2007-09-10 11:21 ` Mel Gorman [this message]
2007-09-10 11:21 ` [PATCH 4/13] Split the free lists for movable and unmovable allocations, Split the free lists for movable and unmovable allocations Mel Gorman
2007-09-10 11:21 ` [PATCH 5/13] Choose pages from the per cpu list-based on migration type, Choose pages from the per cpu list-based on migration type Mel Gorman
2009-07-13 19:16   ` [PATCH 5/13] " Andrew Morton
2009-07-14  9:14     ` Mel Gorman
2007-09-10 11:22 ` [PATCH 6/13] Group short-lived and reclaimable kernel allocations, Group short-lived and reclaimable kernel allocations Mel Gorman
2007-09-10 19:44   ` Paul Jackson
2007-09-10 21:15     ` Mel Gorman
2007-09-10 11:22 ` [PATCH 7/13] Drain per-cpu lists when high-order allocations fail, Drain per-cpu lists when high-order allocations fail Mel Gorman
2007-09-10 15:05   ` [PATCH 7/13] " Nick Piggin
2007-09-11  9:34     ` Mel Gorman
2007-09-10 11:22 ` [PATCH 8/13] Move free pages between lists on steal, Move free pages between lists on steal Mel Gorman
2007-09-10 11:23 ` [PATCH 9/13] Do not group pages by mobility type on low memory systems, Do not group pages by mobility type on low memory systems Mel Gorman
2007-09-10 11:23 ` [PATCH 10/13] Bias the location of pages freed for min_free_kbytes in the same pageblock_nr_pages areas, Bias the location of pages freed for min_free_kbytes in the same pageblock_nr_pages areas Mel Gorman
2007-09-10 11:23 ` [PATCH 11/13] Bias the placement of kernel pages at lower pfns, Bias the placement of kernel pages at lower pfns Mel Gorman
2007-09-10 11:24 ` [PATCH 12/13] Be more agressive about stealing when MIGRATE_RECLAIMABLE allocations fallback, Be more agressive about stealing when MIGRATE_RECLAIMABLE allocations fallback Mel Gorman
2007-09-10 11:24 ` [PATCH 13/13] Print out statistics in relation to fragmentation avoidance to /proc/pagetypeinfo, Print out statistics in relation to fragmentation avoidance to /proc/pagetypeinfo Mel Gorman
2007-09-14  1:01 ` [PATCH 0/13] Reduce external fragmentation by grouping pages by mobility v30 Andrew Morton
2007-09-14 14:33   ` Mel Gorman
2007-09-16 10:34     ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070910112111.3097.85750.sendpatchset@skynet.skynet.ie \
    --to=mel@csn.ul.ie \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox