From: Andrew Morton <akpm@osdl.org>
To: Mel Gorman <mel@csn.ul.ie>
Cc: davej@codemonkey.org.uk, tony.luck@intel.com, ak@suse.de,
	bob.picco@hp.com, linux-kernel@vger.kernel.org,
	linuxppc-dev@ozlabs.org, linux-mm@kvack.org
Subject: Re: [PATCH 4/6] Have x86_64 use add_active_range() and free_area_init_nodes
Date: Sun, 21 May 2006 12:08:43 -0700
Message-ID: <20060521120843.43babdc7.akpm@osdl.org>
In-Reply-To: <Pine.LNX.4.64.0605211528390.16327@skynet.skynet.ie>

Mel Gorman <mel@csn.ul.ie> wrote:
>

> > Anyway, I just don't get how this code can work.  We have an e820 map with
> > up to 128 entries (this machine has ten) and we're trying to scrunch that
> > all into the four-entry early_node_map[].
> >
> 
> Missing E820MAX was a mistake. On x86_64, CONFIG_MAX_ACTIVE_REGIONS should 
> have been used. I didn't expect x86_64 to have so many memory holes.

x86 uses 128 e820 slots too.

>
> > On my little x86 PC:
> >
> > BIOS-provided physical RAM map:
> > BIOS-e820: 0000000000000000 - 000000000009bc00 (usable)
> > BIOS-e820: 000000000009bc00 - 000000000009c000 (reserved)
> > BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> > BIOS-e820: 0000000000100000 - 000000000ffc0000 (usable)
> > BIOS-e820: 000000000ffc0000 - 000000000fff8000 (ACPI data)
> > BIOS-e820: 000000000fff8000 - 0000000010000000 (ACPI NVS)
> > BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
> > BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
> > BIOS-e820: 00000000ffb80000 - 00000000ffc00000 (reserved)
> > BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
> > 0MB HIGHMEM available.
> > 255MB LOWMEM available.
> > found SMP MP-table at 000ff780
> > Range (nid 0) 0 -> 65472, max 4
> > On node 0 totalpages: 65472
> >  DMA zone: 4096 pages, LIFO batch:0
> >  Normal zone: 61376 pages, LIFO batch:15
> >
> > So here, the architecture code only called add_active_range() the once, for
> > the entire memory map.
>
> Because in this case, the architecture reported that there was just one 
> range of available pages with no holes.

So..  we're registering a single blob of pfns which includes the "reserved"
memory as well as the "ACPI data" and the "ACPI NVS" (with an apparent
off-by-one here).

How come the machine still works?  I guess the architecture went and marked
those pfns reserved.

> > If so, perhaps the bug is that the x86_64 code isn't doing that.  And that
> > x86 isn't doing it for some people either.
> >
>
> I'm hoping in this case that having MAX_ACTIVE_REGIONS match E820MAX will
> fix the issue on your machine.

I expect it will.

One does wonder whether it's worth all this fuss though.  It's only a
24-byte structure and it's all thrown away in free_initmem().  One _could_
just go and do

	#define MAX_ACTIVE_REGIONS 10000

and be happy.
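
For a sense of scale, here is a rough sketch of what such a table would
cost.  The struct layout (two pfns plus a node id) is an assumption made for
illustration, and region_table_bytes() is a hypothetical helper, not
something in the patch:

```c
#include <stddef.h>

/*
 * Assumed layout for illustration: two pfns plus a node id.
 * On an LP64 build this pads out to 24 bytes.
 */
struct node_active_region {
	unsigned long start_pfn;
	unsigned long end_pfn;
	int nid;
};

/* Bytes of init-time memory a table of nr_regions entries would occupy. */
static size_t region_table_bytes(size_t nr_regions)
{
	return nr_regions * sizeof(struct node_active_region);
}
```

With 10000 entries that works out to 240000 bytes on a 64-bit build, and
less on 32-bit where the structure is smaller.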

> I'm still confused why Christian's failed
> to boot with the patch backed out though.

He didn't get any "Too many memory regions" messages, so it's something
different.

Maybe he hit my off-by-one on his "ACPI data"?

hm, I didn't mention this in the earlier email.   On my x86 I have

  BIOS-provided physical RAM map:
  BIOS-e820: 0000000000000000 - 000000000009bc00 (usable)
  BIOS-e820: 000000000009bc00 - 000000000009c000 (reserved)
  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
  BIOS-e820: 0000000000100000 - 000000000ffc0000 (usable)
  BIOS-e820: 000000000ffc0000 - 000000000fff8000 (ACPI data)
  BIOS-e820: 000000000fff8000 - 0000000010000000 (ACPI NVS)
  BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
  BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
  BIOS-e820: 00000000ffb80000 - 00000000ffc00000 (reserved)
  BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)

I added some debug and saw that add_active_range() was getting a
start_pfn=0 and an end_pfn which corresponds with 0x0fffc000.  So my "ACPI
NVS" is getting chopped off.

If Christian is seeing a similar thing then his "ACPI data" will be getting
only part-registered.

I'd suggest that the next rev be liberal in its printking.  This is the
debug patch I used:

 mm/page_alloc.c |   25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff -puN mm/page_alloc.c~a mm/page_alloc.c
--- devel/mm/page_alloc.c~a	2006-05-20 13:19:58.000000000 -0700
+++ devel-akpm/mm/page_alloc.c	2006-05-20 13:20:42.000000000 -0700
@@ -2463,22 +2463,36 @@ void __init add_active_range(unsigned in
 						unsigned long end_pfn)
 {
 	unsigned int i;
-	printk(KERN_DEBUG "Range (%d) %lu -> %lu\n", nid, start_pfn, end_pfn);
+
+	printk("Range (nid %d) %lu -> %lu, max %d\n",
+			nid, start_pfn, end_pfn, MAX_ACTIVE_REGIONS - 1);
 
 	/* Merge with existing active regions if possible */
 	for (i = 0; early_node_map[i].end_pfn; i++) {
-		if (early_node_map[i].nid != nid)
+		printk("i=%d early_node_map[i].nid=%d "
+				"early_node_map[i].start_pfn=%lu "
+				"early_node_map[i].end_pfn=%lu",
+			i, early_node_map[i].nid,
+			early_node_map[i].start_pfn,
+			early_node_map[i].end_pfn);
+
+		if (early_node_map[i].nid != nid) {
+			printk(" continue 1\n");
 			continue;
+		}
 
 		/* Skip if an existing region covers this new one */
 		if (start_pfn >= early_node_map[i].start_pfn &&
-				end_pfn <= early_node_map[i].end_pfn)
+				end_pfn <= early_node_map[i].end_pfn) {
+			printk(" return 1\n");
 			return;
+		}
 
 		/* Merge forward if suitable */
 		if (start_pfn <= early_node_map[i].end_pfn &&
 				end_pfn > early_node_map[i].end_pfn) {
 			early_node_map[i].end_pfn = end_pfn;
+			printk(" return 2\n");
 			return;
 		}
 
@@ -2486,13 +2500,16 @@ void __init add_active_range(unsigned in
 		if (start_pfn < early_node_map[i].end_pfn &&
 				end_pfn >= early_node_map[i].start_pfn) {
 			early_node_map[i].start_pfn = start_pfn;
+			printk(" return 3\n");
 			return;
 		}
+		printk("\n");
 	}
 
 	/* Leave last entry NULL, we use range.end_pfn to terminate the walk */
 	if (i >= MAX_ACTIVE_REGIONS - 1) {
-		printk(KERN_ERR "Too many memory regions, truncating\n");
+		printk(KERN_ERR "More than %d memory regions, truncating\n",
+				MAX_ACTIVE_REGIONS - 1);
 		return;
 	}
 
_
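
To make the truncation case easier to reason about outside the kernel, the
merge logic in the hunks above can be approximated in userspace.  This is
only a sketch: sketch_add_active_range() and enum add_result are
hypothetical names, MAX_ACTIVE_REGIONS is set to 5 to match the "max 4"
line in the boot log, and the final store into the table is assumed since
that part of the function is outside the diff context.

```c
#include <assert.h>

#define MAX_ACTIVE_REGIONS 5	/* four usable slots plus a terminator */

struct region {
	int nid;
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/* Zero-initialised; an entry with end_pfn == 0 terminates the walk. */
static struct region early_node_map[MAX_ACTIVE_REGIONS];

enum add_result { ADDED, COVERED, MERGED_FORWARD, MERGED_BACKWARD, TRUNCATED };

static enum add_result sketch_add_active_range(int nid,
					       unsigned long start_pfn,
					       unsigned long end_pfn)
{
	unsigned int i;

	assert(start_pfn < end_pfn);

	for (i = 0; early_node_map[i].end_pfn; i++) {
		struct region *r = &early_node_map[i];

		if (r->nid != nid)
			continue;

		/* An existing region already covers the new one */
		if (start_pfn >= r->start_pfn && end_pfn <= r->end_pfn)
			return COVERED;

		/* New range overlaps or touches the end: extend forward */
		if (start_pfn <= r->end_pfn && end_pfn > r->end_pfn) {
			r->end_pfn = end_pfn;
			return MERGED_FORWARD;
		}

		/* New range overlaps the start: extend backward */
		if (start_pfn < r->end_pfn && end_pfn >= r->start_pfn) {
			r->start_pfn = start_pfn;
			return MERGED_BACKWARD;
		}
	}

	/* Keep the last entry empty so the walk above always terminates */
	if (i >= MAX_ACTIVE_REGIONS - 1)
		return TRUNCATED;

	/* Assumed: the new range is stored in the first free slot */
	early_node_map[i].nid = nid;
	early_node_map[i].start_pfn = start_pfn;
	early_node_map[i].end_pfn = end_pfn;
	return ADDED;
}
```

Feeding this five disjoint ranges on the same node fills the four usable
slots and drops the fifth, which is the situation a hole-heavy e820 map
would run into with a small MAX_ACTIVE_REGIONS.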

