linux-mm.kvack.org archive mirror
* [RFC] fix for hot-add enabled SRAT/BIOS and numa KVA areas
@ 2004-11-17  2:37 keith
  2004-11-17 17:11 ` [Lhms-devel] " Dave Hansen
  2004-11-17 22:33 ` Yasunori Goto
  0 siblings, 2 replies; 10+ messages in thread
From: keith @ 2004-11-17  2:37 UTC (permalink / raw)
  To: external hotplug mem list, linux-mm, Chris McDermott

[-- Attachment #1: Type: text/plain, Size: 1460 bytes --]

  My NUMA hardware (IBM x445, Summit based) supports hot-add memory.
When this feature is enabled in the BIOS and the kernel is booted with
CONFIG_NUMA, I get a memory range exposed by the SRAT/ACPI parsing.  This
range expresses the amount of memory that "could" be added to the system.
  
  This chunk extends from the end of physical memory to the end of the
i386 address space.  In the following example my physical memory ends at
0x2C0000.

(From the boot messages)
Memory range 0x80000 to 0xC0000 (type 0x0) in proximity domain 0x01 enabled
Memory range 0x100000 to 0x2C0000 (type 0x0) in proximity domain 0x01 enabled
Memory range 0x2C0000 to 0x1000000 (type 0x0) in proximity domain 0x01 enabled and removable
  
  I believe these memory ranges to be valid according to what I know
about the SRAT and the ACPI 2.0c specs.  (I am not an ACPI expert; please
correct me if I am wrong!)

  The NUMA KVA code uses the node_start and node_end values (obtained
from the above memory ranges) to make its lowmem reservations.  The
problem is that the lowmem area reserved is quite large.  It reserves
an lmem_map large enough to cover the entire 0x1000000 address space.
I don't feel this is a great use of lowmem on my system :)
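
To put rough numbers on the reservation, here is a minimal sketch of the
size calculation.  It follows the formula in calculate_numa_remap_pages()
but leaves out the sizeof(pg_data_t) term.  The function name, the 32-byte
figure for sizeof(struct page), and the reading of the boot-message values
as page frame numbers with node 1 starting at 0x80000 are all assumptions
made for illustration, not values taken from a running kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes of mem_map needed to describe the page
 * frames in [start_pfn, end_pfn].  page_struct_size stands in for
 * sizeof(struct page); the caller supplies it because the real value
 * depends on the kernel configuration.
 */
uint64_t mem_map_bytes(uint64_t start_pfn, uint64_t end_pfn,
                       uint64_t page_struct_size)
{
    assert(end_pfn >= start_pfn);
    return (end_pfn - start_pfn + 1) * page_struct_size;
}
```

Under those assumptions, mem_map_bytes(0x80000, 0x1000000, 32) comes to
about 496 MiB, while mem_map_bytes(0x80000, 0x2C0000, 32) comes to about
72 MiB.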

  Thankfully, as we know, the e820 map shows what memory is really in
the system.  My simple fix is to find max_pfn from the e820 earlier and
set the NUMA KVA areas accordingly.
 
Don't trust the SRAT for this info, only the e820.
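
The clamping rule can be sketched as a standalone function.  This is only
an illustration of the idea under stated assumptions: the function name
and the pointer-based interface are hypothetical, and max_pfn is taken to
be the e820-derived limit passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: trim a node's SRAT-reported range to the memory
 * the e820 map says is present.  Returns false when the node starts
 * above max_pfn (nothing is present yet, so the caller should skip
 * it); otherwise clamps *end_pfn to max_pfn and returns true.
 */
bool clamp_node_range(uint64_t start_pfn, uint64_t *end_pfn,
                      uint64_t max_pfn)
{
    assert(end_pfn != NULL);
    if (start_pfn > max_pfn)
        return false;
    if (*end_pfn > max_pfn)
        *end_pfn = max_pfn;
    return true;
}
```

Here a node reported as 0x80000 to 0x1000000 with max_pfn at 0x2C0000 is
trimmed to end at 0x2C0000, and a node that begins above 0x2C0000 is
skipped with its range left untouched.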

Thanks,
  Keith Mannthey 



[-- Attachment #2: fix2.patch --]
[-- Type: text/x-patch, Size: 1212 bytes --]

diff -urN linux-2.6.9/arch/i386/mm/discontig.c linux-2.6.9-fix2/arch/i386/mm/discontig.c
--- linux-2.6.9/arch/i386/mm/discontig.c	2004-11-16 17:41:18.207154544 -0800
+++ linux-2.6.9-fix2/arch/i386/mm/discontig.c	2004-11-16 17:37:05.811524512 -0800
@@ -199,6 +199,15 @@
 	unsigned long size, reserve_pages = 0;
 
 	for (nid = 1; nid < numnodes; nid++) {
+		/*
+		* The acpi/srat node info can show hot-add memory zones
+		* where memory could be added but is not currently present.
+		*/
+		if (node_start_pfn[nid] > max_pfn)
+			continue;
+		if (node_end_pfn[nid] > max_pfn)
+			node_end_pfn[nid] = max_pfn;
+
 		/* calculate the size of the mem_map needed in bytes */
 		size = (node_end_pfn[nid] - node_start_pfn[nid] + 1) 
 			* sizeof(struct page) + sizeof(pg_data_t);
@@ -261,12 +270,12 @@
 		printk("\n");
 	}
 
+	find_max_pfn();
 	reserve_pages = calculate_numa_remap_pages();
 
 	/* partially used pages are not usable - thus round upwards */
 	system_start_pfn = min_low_pfn = PFN_UP(init_pg_tables_end);
 
-	find_max_pfn();
 	system_max_low_pfn = max_low_pfn = find_max_low_pfn() - reserve_pages;
 	printk("reserve_pages = %ld find_max_low_pfn() ~ %ld\n",
 			reserve_pages, max_low_pfn + reserve_pages);


