* [RFC/Patch]Making Removable zone[0/4]
@ 2004-10-26 2:24 Yasunori Goto
2004-10-26 2:34 ` [RFC/Patch]Making Removable zone[1/4] Yasunori Goto
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Yasunori Goto @ 2004-10-26 2:24 UTC (permalink / raw)
To: lhms-devel, linux-mm
Hello.
This patch set adds a new kind of zone (Hotremovable) for
memory hotplug, to create an area that can be removed relatively easily.
I wrote this patch set two months ago,
but I thought it had many problems,
so for a long time I hesitated to post it.
However, I feel I have waited too long,
so I am posting it to ask which approach is better.
If you have any suggestions, please tell me.
These patches make the Hotremovable attribute orthogonal to
DMA/Normal/Highmem, so there will be six zones
(DMA / Normal / Highmem / Removable DMA / Removable Normal /
Removable Highmem).
However, this orthogonal attribute causes problems like
the following:
1) The zone ID bits in page->flags must be extended from 2 to 3
to encode 6 zones. However, there is not enough space for them.
2) The zonelist array for 6 zones might become too big
(especially when there are many nodes).
3) The index into the zonelist array is decided by the __GFP_xxx
bits, so the existing code expects the index to be a power of 2.
But __GFP_REMOVABLE can be combined with __GFP_HIGHMEM or
__GFP_DMA, which gives an index that is not a power of 2.
4) Some kernel code assumes that the zone indices are ordered
DMA -> Normal -> Highmem,
but the removable attribute breaks this order.
5) The zonelist order must also change.
Which zonelist order is better?
a) Removable Highmem -> Removable Normal -> Removable DMA
-> Highmem -> Normal -> DMA
b) Removable Highmem -> Highmem -> Removable Normal -> Normal
-> Removable DMA -> DMA
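
The two candidate orders in 5) can be generated mechanically, which may make the difference easier to see. The following is a minimal user-space sketch, not kernel code: the enum mirrors the zone numbering proposed in patch 1/4, while removable_of(), order_a() and order_b() are hypothetical helper names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Zone indices as numbered in patch 1/4 of this series. */
enum {
	ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM,
	ZONE_DMA_RMV, ZONE_NORMAL_RMV, ZONE_HIGHMEM_RMV,
	MAX_NR_ZONES
};

/* Hypothetical helper: the removable twin sits 3 slots above its base zone. */
static int removable_of(int base)
{
	assert(base >= ZONE_DMA && base <= ZONE_HIGHMEM);
	return base + 3;
}

/*
 * Order a): every removable zone first, then the unremovable ones.
 * 'top' is the highest base zone the request may use.
 * Returns the number of entries written to out[].
 */
static size_t order_a(int top, int out[MAX_NR_ZONES])
{
	size_t n = 0;

	for (int z = top; z >= ZONE_DMA; z--)
		out[n++] = removable_of(z);
	for (int z = top; z >= ZONE_DMA; z--)
		out[n++] = z;
	return n;
}

/* Order b): each removable zone is followed directly by its base zone. */
static size_t order_b(int top, int out[MAX_NR_ZONES])
{
	size_t n = 0;

	for (int z = top; z >= ZONE_DMA; z--) {
		out[n++] = removable_of(z);
		out[n++] = z;
	}
	return n;
}
```

For a Highmem-capable request, order_a(ZONE_HIGHMEM, out) yields the sequence in a) and order_b(ZONE_HIGHMEM, out) the sequence in b). The table in patch 3/4 corresponds to b). With a), such a request would fall back to Removable DMA before touching plain Highmem; with b), the low zones of both kinds are tried last.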
If there were just 4 zone types, DMA/Normal/Highmem/Removable
(not orthogonal), some of these problems would become easier.
I also suppose 4) and 5) imply that more code, such as mempolicy,
must be changed.
But the 6-zone code has an advantage for hotplug of kernel memory.
If a kernel component can become hot-removable,
it will probably want to use "Hotremovable DMA" or
"Hotremovable Normal".
So I am also unsure which type of removable zone is better.
This patch set is old, but it can be applied against 2.6.9-mm1.
Please comment.
Bye.
--
Yasunori Goto <ygoto at us.fujitsu.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
* [RFC/Patch]Making Removable zone[1/4]
2004-10-26 2:24 [RFC/Patch]Making Removable zone[0/4] Yasunori Goto
@ 2004-10-26 2:34 ` Yasunori Goto
2004-10-26 2:35 ` [RFC/Patch]Making Removable zone[2/4] Yasunori Goto
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Yasunori Goto @ 2004-10-26 2:34 UTC (permalink / raw)
To: lhms-devel, linux-mm
This patch adds the new zones (Hotremovable-DMA / Hotremovable-Normal /
Hotremovable-Highmem).
hotremovable-goto/include/linux/mmzone.h | 40 +++++++------------------------
hotremovable-goto/mm/page_alloc.c | 3 +-
2 files changed, 12 insertions(+), 31 deletions(-)
diff -puN include/linux/mmzone.h~new_zone include/linux/mmzone.h
--- hotremovable/include/linux/mmzone.h~new_zone Fri Aug 27 21:06:50 2004
+++ hotremovable-goto/include/linux/mmzone.h Fri Aug 27 21:06:50 2004
@@ -73,37 +73,17 @@ struct per_cpu_pageset {
#define ZONE_NORMAL 1
#define ZONE_HIGHMEM 2
-#define MAX_NR_ZONES 3 /* Sync this with ZONES_SHIFT */
-#define ZONES_SHIFT 2 /* ceil(log2(MAX_NR_ZONES)) */
+#define ZONE_REMOVABLE 3
+#define ZONE_DMA_RMV (ZONE_DMA + ZONE_REMOVABLE) /* Hot-Removable DMA zone */
+#define ZONE_NORMAL_RMV (ZONE_NORMAL + ZONE_REMOVABLE) /* Hot-Removable Normal zone */
+#define ZONE_HIGHMEM_RMV (ZONE_HIGHMEM + ZONE_REMOVABLE) /* Hot-Removable Highmem zone */
+#define MAX_NR_ZONES 6 /* Sync this with ZONES_SHIFT */
+#define ZONES_SHIFT 3 /* ceil(log2(MAX_NR_ZONES)) */
-/*
- * When a memory allocation must conform to specific limitations (such
- * as being suitable for DMA) the caller will pass in hints to the
- * allocator in the gfp_mask, in the zone modifier bits. These bits
- * are used to select a priority ordered list of memory zones which
- * match the requested limits. GFP_ZONEMASK defines which bits within
- * the gfp_mask should be considered as zone modifiers. Each valid
- * combination of the zone modifier bits has a corresponding list
- * of zones (in node_zonelists). Thus for two zone modifiers there
- * will be a maximum of 4 (2 ** 2) zonelists, for 3 modifiers there will
- * be 8 (2 ** 3) zonelists. GFP_ZONETYPES defines the number of possible
- * combinations of zone modifiers in "zone modifier space".
- */
-#define GFP_ZONEMASK 0x03
-/*
- * As an optimisation any zone modifier bits which are only valid when
- * no other zone modifier bits are set (loners) should be placed in
- * the highest order bits of this field. This allows us to reduce the
- * extent of the zonelists thus saving space. For example in the case
- * of three zone modifier bits, we could require up to eight zonelists.
- * If the left most zone modifier is a "loner" then the highest valid
- * zonelist would be four allowing us to allocate only five zonelists.
- * Use the first form when the left most bit is not a "loner", otherwise
- * use the second.
- */
-/* #define GFP_ZONETYPES (GFP_ZONEMASK + 1) */ /* Non-loner */
-#define GFP_ZONETYPES ((GFP_ZONEMASK + 1) / 2 + 1) /* Loner */
+#define GFP_ZONEMASK 0x07
+
+#define GFP_ZONETYPES (MAX_NR_ZONES + 1)
/*
* On machines where it is needed (eg PCs) we divide physical memory
@@ -414,7 +394,7 @@ extern struct pglist_data contig_page_da
* with 32 bit page->flags field, we reserve 8 bits for node/zone info.
* there are 3 zones (2 bits) and this leaves 8-2=6 bits for nodes.
*/
-#define MAX_NODES_SHIFT 6
+#define MAX_NODES_SHIFT 5
#elif BITS_PER_LONG == 64
/*
* with 64 bit flags field, there's plenty of room.
diff -puN mm/page_alloc.c~new_zone mm/page_alloc.c
--- hotremovable/mm/page_alloc.c~new_zone Fri Aug 27 21:06:50 2004
+++ hotremovable-goto/mm/page_alloc.c Fri Aug 27 21:06:50 2004
@@ -57,7 +57,8 @@ EXPORT_SYMBOL(nr_swap_pages);
struct zone *zone_table[1 << (ZONES_SHIFT + NODES_SHIFT)];
EXPORT_SYMBOL(zone_table);
-static char *zone_names[MAX_NR_ZONES] = { "DMA", "Normal", "HighMem" };
+static char *zone_names[MAX_NR_ZONES] = { "DMA", "Normal", "HighMem",
+ "DMA-Removable", "Normal-Removable", "HighMem-Removable" };
int min_free_kbytes = 1024;
unsigned long __initdata nr_kernel_pages;
_
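
The numbering in the patch above places each removable zone exactly ZONE_REMOVABLE (3) slots above its base zone, and the third zone bit is paid for with one node bit on 32-bit. The following is a minimal user-space sketch of that arithmetic, not kernel code. The macro values are taken from the patch (with ZONE_DMA assumed to be 0, as in mainline); zone_index(), zone_base(), zone_is_removable(), bits_for() and node_bits() are hypothetical names used only for illustration.

```c
#include <assert.h>

#define ZONE_DMA        0
#define ZONE_NORMAL     1
#define ZONE_HIGHMEM    2
#define ZONE_REMOVABLE  3	/* offset from a base zone to its removable twin */
#define MAX_NR_ZONES    6

/* Compose a zone index from a base zone and the removable attribute. */
static int zone_index(int base, int removable)
{
	assert(base >= ZONE_DMA && base <= ZONE_HIGHMEM);
	return base + (removable ? ZONE_REMOVABLE : 0);
}

/* Recover the base zone (DMA/Normal/Highmem) from a zone index. */
static int zone_base(int idx)
{
	assert(idx >= 0 && idx < MAX_NR_ZONES);
	return idx % ZONE_REMOVABLE;
}

/* Non-zero if the index names one of the three removable zones. */
static int zone_is_removable(int idx)
{
	assert(idx >= 0 && idx < MAX_NR_ZONES);
	return idx >= ZONE_REMOVABLE;
}

/* ceil(log2(n)) for n >= 1: bits needed to encode the values 0..n-1. */
static int bits_for(unsigned int n)
{
	int bits = 0;

	assert(n >= 1);
	while ((1u << bits) < n)
		bits++;
	return bits;
}

/* Node bits left when page->flags reserves 'reserved' bits for node + zone. */
static int node_bits(int reserved, unsigned int nr_zones)
{
	return reserved - bits_for(nr_zones);
}
```

With the 8 bits that the 32-bit layout reserves, node_bits(8, 3) gives the old MAX_NODES_SHIFT of 6 and node_bits(8, 6) gives the new value of 5, which halves the number of nodes that can be encoded.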
--
Yasunori Goto <ygoto at us.fujitsu.com>
* [RFC/Patch]Making Removable zone[2/4]
2004-10-26 2:24 [RFC/Patch]Making Removable zone[0/4] Yasunori Goto
2004-10-26 2:34 ` [RFC/Patch]Making Removable zone[1/4] Yasunori Goto
@ 2004-10-26 2:35 ` Yasunori Goto
2004-10-26 2:36 ` [RFC/Patch]Making Removable zone[3/4] Yasunori Goto
2004-10-26 2:38 ` [RFC/Patch]Making Removable zone[4/4] Yasunori Goto
3 siblings, 0 replies; 5+ messages in thread
From: Yasunori Goto @ 2004-10-26 2:35 UTC (permalink / raw)
To: lhms-devel, linux-mm
This patch lets user processes and the page cache allocate from the removable area.
hotremovable-goto/include/linux/gfp.h | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)
diff -puN include/linux/gfp.h~gfp_removable include/linux/gfp.h
--- hotremovable/include/linux/gfp.h~gfp_removable Fri Aug 27 21:06:57 2004
+++ hotremovable-goto/include/linux/gfp.h Fri Aug 27 21:06:57 2004
@@ -11,9 +11,10 @@ struct vm_area_struct;
/*
* GFP bitmasks..
*/
-/* Zone modifiers in GFP_ZONEMASK (see linux/mmzone.h - low two bits) */
+/* Zone modifiers in GFP_ZONEMASK (see linux/mmzone.h - low three bits) */
#define __GFP_DMA 0x01
#define __GFP_HIGHMEM 0x02
+#define __GFP_REMOVABLE 0x04
/*
* Action modifiers - doesn't change the zoning
@@ -51,7 +52,7 @@ struct vm_area_struct;
#define GFP_NOFS (__GFP_WAIT | __GFP_IO)
#define GFP_KERNEL (__GFP_WAIT | __GFP_IO | __GFP_FS)
#define GFP_USER (__GFP_WAIT | __GFP_IO | __GFP_FS)
-#define GFP_HIGHUSER (__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HIGHMEM)
+#define GFP_HIGHUSER (__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HIGHMEM | __GFP_REMOVABLE)
/* Flag - indicates that the buffer will be suitable for DMA. Ignored on some
platforms, used as appropriate on others */
_
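
With the third zone modifier from the patch above, the zonelist index is still just the low bits of the gfp mask, but not every 3-bit value names a usable list. The following is a minimal user-space sketch under that reading. The flag and mask values are taken from patches 1/4 and 2/4; zonelist_index() and zonelist_index_valid() are hypothetical helpers, and the validity rule (DMA and HIGHMEM are never combined) is an assumption based on the "reserve" slot in patch 3/4.

```c
#include <assert.h>

#define __GFP_DMA        0x01
#define __GFP_HIGHMEM    0x02
#define __GFP_REMOVABLE  0x04
#define GFP_ZONEMASK     0x07
#define MAX_NR_ZONES     6
#define GFP_ZONETYPES    (MAX_NR_ZONES + 1)

/* Index into node_zonelists[]: the zone-modifier bits of the gfp mask. */
static unsigned int zonelist_index(unsigned int gfp_mask)
{
	return gfp_mask & GFP_ZONEMASK;
}

/*
 * Assumed rule: __GFP_DMA and __GFP_HIGHMEM are mutually exclusive,
 * and the index has to fit a table of GFP_ZONETYPES entries.
 */
static int zonelist_index_valid(unsigned int idx)
{
	if ((idx & __GFP_DMA) && (idx & __GFP_HIGHMEM))
		return 0;
	return idx < GFP_ZONETYPES;
}

/* Count how many of the 8 possible mask values name a usable zonelist. */
static int count_valid_indices(void)
{
	int n = 0;

	for (unsigned int idx = 0; idx <= GFP_ZONEMASK; idx++)
		n += zonelist_index_valid(idx);
	return n;
}
```

A mask such as the new GFP_HIGHUSER selects index 6 (__GFP_HIGHMEM | __GFP_REMOVABLE). Six of the eight mask values are usable: 3 is the reserved slot, and 7 would fall outside a table of GFP_ZONETYPES entries, so callers must never set all three bits.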
--
Yasunori Goto <ygoto at us.fujitsu.com>
* [RFC/Patch]Making Removable zone[3/4]
2004-10-26 2:24 [RFC/Patch]Making Removable zone[0/4] Yasunori Goto
2004-10-26 2:34 ` [RFC/Patch]Making Removable zone[1/4] Yasunori Goto
2004-10-26 2:35 ` [RFC/Patch]Making Removable zone[2/4] Yasunori Goto
@ 2004-10-26 2:36 ` Yasunori Goto
2004-10-26 2:38 ` [RFC/Patch]Making Removable zone[4/4] Yasunori Goto
3 siblings, 0 replies; 5+ messages in thread
From: Yasunori Goto @ 2004-10-26 2:36 UTC (permalink / raw)
To: lhms-devel, linux-mm
This patch defines the new ordering of zone indices and builds the zonelists from it.
hotremovable-goto/include/linux/mmzone.h | 10 +++-
hotremovable-goto/mm/page_alloc.c | 67 +++++++++++++------------------
2 files changed, 37 insertions(+), 40 deletions(-)
diff -puN include/linux/mmzone.h~zones_order include/linux/mmzone.h
--- hotremovable/include/linux/mmzone.h~zones_order Fri Aug 27 21:07:01 2004
+++ hotremovable-goto/include/linux/mmzone.h Fri Aug 27 21:07:01 2004
@@ -234,6 +234,11 @@ struct zonelist {
struct zone *zones[MAX_NUMNODES * MAX_NR_ZONES + 1]; // NULL delimited
};
+/* zonelist is decided by zone_order */
+struct zonelist_order{
+ signed char zone_order[MAX_NR_ZONES + 1]; /* -1 delimited; signed, since plain char may be unsigned */
+};
+extern const struct zonelist_order zorder[];
/*
* The pg_data_t structure is used in machines with CONFIG_DISCONTIGMEM
@@ -408,8 +413,9 @@ extern struct pglist_data contig_page_da
#error NODES_SHIFT > MAX_NODES_SHIFT
#endif
-/* There are currently 3 zones: DMA, Normal & Highmem, thus we need 2 bits */
-#define MAX_ZONES_SHIFT 2
+/* There are currently 6 zones: {DMA, Normal, Highmem} x
+ {hot-removable, unremovable}, thus we need 3 bits */
+#define MAX_ZONES_SHIFT 3
#if ZONES_SHIFT > MAX_ZONES_SHIFT
#error ZONES_SHIFT > MAX_ZONES_SHIFT
diff -puN mm/page_alloc.c~zones_order mm/page_alloc.c
--- hotremovable/mm/page_alloc.c~zones_order Fri Aug 27 21:07:01 2004
+++ hotremovable-goto/mm/page_alloc.c Fri Aug 27 21:07:01 2004
@@ -47,6 +47,18 @@ long nr_swap_pages;
int numnodes = 1;
int sysctl_lower_zone_protection = 0;
+const struct zonelist_order zorder[GFP_ZONETYPES] = {
+ {{ ZONE_NORMAL, ZONE_DMA, -1, -1, -1, -1, -1}}, /* __GFP_NORMAL */
+ {{ ZONE_DMA, -1, -1, -1, -1, -1, -1}}, /* __GFP_DMA */
+ {{ ZONE_HIGHMEM, ZONE_NORMAL, ZONE_DMA, -1, -1, -1, -1}}, /* __GFP_HIGHMEM */
+ {{ -1, -1, -1, -1, -1, -1, -1}}, /* reserve */
+ {{ ZONE_NORMAL_RMV, ZONE_NORMAL, ZONE_DMA_RMV,
+ ZONE_DMA, -1, -1, -1}}, /* __GFP_NORMAL | __GFP_REMOVABLE */
+ {{ ZONE_DMA_RMV, ZONE_DMA, -1, -1, -1, -1, -1}}, /* __GFP_DMA | __GFP_REMOVABLE */
+ {{ZONE_HIGHMEM_RMV, ZONE_HIGHMEM, ZONE_NORMAL_RMV,
+ ZONE_NORMAL, ZONE_DMA_RMV, ZONE_DMA, -1}} /* __GFP_HIGHMEM | __GFP_REMOVABLE */
+};
+
EXPORT_SYMBOL(totalram_pages);
EXPORT_SYMBOL(nr_swap_pages);
@@ -1452,27 +1464,17 @@ void show_free_areas(void)
*/
static int __init build_zonelists_node(pg_data_t *pgdat, struct zonelist *zonelist, int j, int k)
{
- switch (k) {
- struct zone *zone;
- default:
- BUG();
- case ZONE_HIGHMEM:
- zone = pgdat->node_zones + ZONE_HIGHMEM;
- if (zone->present_pages) {
-#ifndef CONFIG_HIGHMEM
- BUG();
-#endif
- zonelist->zones[j++] = zone;
- }
- case ZONE_NORMAL:
- zone = pgdat->node_zones + ZONE_NORMAL;
- if (zone->present_pages)
- zonelist->zones[j++] = zone;
- case ZONE_DMA:
- zone = pgdat->node_zones + ZONE_DMA;
- if (zone->present_pages)
- zonelist->zones[j++] = zone;
- }
+ struct zone *zone;
+ int i = 0, index;
+
+ index = zorder[k].zone_order[i];
+
+ while (index != -1) {
+ zone = pgdat->node_zones + index;
+ zonelist->zones[j++] = zone;
+ i++;
+ index = zorder[k].zone_order[i];
+ }
return j;
}
@@ -1537,7 +1539,7 @@ static int __init find_next_best_node(in
static void __init build_zonelists(pg_data_t *pgdat)
{
- int i, j, k, node, local_node;
+ int i, j, node, local_node;
int prev_node, load;
struct zonelist *zonelist;
DECLARE_BITMAP(used_mask, MAX_NUMNODES);
@@ -1569,13 +1571,7 @@ static void __init build_zonelists(pg_da
zonelist = pgdat->node_zonelists + i;
for (j = 0; zonelist->zones[j] != NULL; j++);
- k = ZONE_NORMAL;
- if (i & __GFP_HIGHMEM)
- k = ZONE_HIGHMEM;
- if (i & __GFP_DMA)
- k = ZONE_DMA;
-
- j = build_zonelists_node(NODE_DATA(node), zonelist, j, k);
+ j = build_zonelists_node(NODE_DATA(node), zonelist, j, i);
zonelist->zones[j] = NULL;
}
}
@@ -1585,7 +1581,7 @@ static void __init build_zonelists(pg_da
static void __init build_zonelists(pg_data_t *pgdat)
{
- int i, j, k, node, local_node;
+ int i, j, node, local_node;
local_node = pgdat->node_id;
for (i = 0; i < GFP_ZONETYPES; i++) {
@@ -1595,13 +1591,8 @@ static void __init build_zonelists(pg_da
memset(zonelist, 0, sizeof(*zonelist));
j = 0;
- k = ZONE_NORMAL;
- if (i & __GFP_HIGHMEM)
- k = ZONE_HIGHMEM;
- if (i & __GFP_DMA)
- k = ZONE_DMA;
- j = build_zonelists_node(pgdat, zonelist, j, k);
+ j = build_zonelists_node(pgdat, zonelist, j, i);
/*
* Now we build the zonelist so that it contains the zones
* of all the other nodes.
@@ -1611,9 +1602,9 @@ static void __init build_zonelists(pg_da
* node N+1 (modulo N)
*/
for (node = local_node + 1; node < numnodes; node++)
- j = build_zonelists_node(NODE_DATA(node), zonelist, j, k);
+ j = build_zonelists_node(NODE_DATA(node), zonelist, j, i);
for (node = 0; node < local_node; node++)
- j = build_zonelists_node(NODE_DATA(node), zonelist, j, k);
+ j = build_zonelists_node(NODE_DATA(node), zonelist, j, i);
zonelist->zones[j] = NULL;
}
_
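
The table-driven walk in the patch above can be tried outside the kernel. The following is a minimal user-space sketch, not the patch itself: struct zone is a one-field stand-in, append_node_zones() is a hypothetical name, and two details differ from the patch on purpose. The sentinel array is declared signed char, because plain char is unsigned on some ABIs and -1 would then never compare equal. The sketch also keeps the present_pages check that the removed switch statement had and the new loop does not.

```c
#include <assert.h>
#include <stddef.h>

enum {
	ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM,
	ZONE_DMA_RMV, ZONE_NORMAL_RMV, ZONE_HIGHMEM_RMV,
	MAX_NR_ZONES
};

/* Stand-in for the kernel's struct zone; only the field used here. */
struct zone {
	unsigned long present_pages;
};

/* One fallback order, terminated by -1 (at most MAX_NR_ZONES entries). */
struct zonelist_order {
	signed char zone_order[MAX_NR_ZONES + 1];
};

/*
 * Append the zones of one node to list[], in the given fallback order,
 * starting at position j. Empty zones are skipped. Returns the new length.
 */
static size_t append_node_zones(struct zone *node_zones,
				const struct zonelist_order *order,
				struct zone **list, size_t j)
{
	assert(node_zones && order && list);

	for (size_t i = 0; i <= MAX_NR_ZONES && order->zone_order[i] != -1; i++) {
		struct zone *zone = &node_zones[order->zone_order[i]];

		if (zone->present_pages)	/* as the pre-patch switch did */
			list[j++] = zone;
	}
	return j;
}
```

Calling append_node_zones() once per node, local node first, and then storing NULL at the returned position gives a NULL-delimited list in the same shape as zonelist->zones[].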
--
Yasunori Goto <ygoto at us.fujitsu.com>
* [RFC/Patch]Making Removable zone[4/4]
2004-10-26 2:24 [RFC/Patch]Making Removable zone[0/4] Yasunori Goto
` (2 preceding siblings ...)
2004-10-26 2:36 ` [RFC/Patch]Making Removable zone[3/4] Yasunori Goto
@ 2004-10-26 2:38 ` Yasunori Goto
3 siblings, 0 replies; 5+ messages in thread
From: Yasunori Goto @ 2004-10-26 2:38 UTC (permalink / raw)
To: lhms-devel, linux-mm
This patch exists only for testing: it creates removable zones on i386.
--
hotremovable-goto/arch/i386/mm/init.c | 22 +++++++++++++++++++---
1 files changed, 19 insertions(+), 3 deletions(-)
diff -puN arch/i386/mm/init.c~removable arch/i386/mm/init.c
--- hotremovable/arch/i386/mm/init.c~removable Fri Aug 27 21:07:12 2004
+++ hotremovable-goto/arch/i386/mm/init.c Fri Aug 27 21:07:12 2004
@@ -512,22 +512,38 @@ void zap_low_mappings (void)
}
#ifndef CONFIG_DISCONTIGMEM
+
+unsigned int __init check_max_unremovable(void)
+{
+ /* XXX: real hardware information is probably needed here.
+ For now this boundary is only for testing. */
+ return (highend_pfn + max_low_pfn) / 2;
+}
+
void __init zone_sizes_init(void)
{
- unsigned long zones_size[MAX_NR_ZONES] = {0, 0, 0};
- unsigned int max_dma, high, low;
+ unsigned long zones_size[MAX_NR_ZONES] = {0, 0, 0, 0, 0, 0};
+ unsigned int max_dma, high, low, max_unremovable;
max_dma = virt_to_phys((char *)MAX_DMA_ADDRESS) >> PAGE_SHIFT;
low = max_low_pfn;
high = highend_pfn;
+ max_unremovable = check_max_unremovable();
+
if (low < max_dma)
zones_size[ZONE_DMA] = low;
else {
zones_size[ZONE_DMA] = max_dma;
zones_size[ZONE_NORMAL] = low - max_dma;
#ifdef CONFIG_HIGHMEM
- zones_size[ZONE_HIGHMEM] = high - low;
+ if (low >= max_unremovable)
+ zones_size[ZONE_HIGHMEM_RMV] = high - low;
+ else if (high > max_unremovable) {
+ zones_size[ZONE_HIGHMEM_RMV] = high - max_unremovable;
+ zones_size[ZONE_HIGHMEM] = max_unremovable - low;
+ } else
+ zones_size[ZONE_HIGHMEM] = high - low;
#endif
}
free_area_init(zones_size);
_
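
The Highmem split in the patch above reduces to three cases that can be checked with plain page-frame numbers. The following is a minimal user-space sketch of that arithmetic. struct highmem_split, split_highmem() and test_limit() are hypothetical names; low, high and limit stand for max_low_pfn, highend_pfn and max_unremovable.

```c
#include <assert.h>

/* Sizes, in pages, of the two Highmem zones after the split. */
struct highmem_split {
	unsigned long fixed;		/* ZONE_HIGHMEM */
	unsigned long removable;	/* ZONE_HIGHMEM_RMV */
};

/* The test-only boundary from the patch: midpoint of low and high. */
static unsigned long test_limit(unsigned long low, unsigned long high)
{
	return (high + low) / 2;
}

/*
 * Split the page range [low, high) at 'limit': pages below the limit stay
 * unremovable, pages at or above it become removable.
 */
static struct highmem_split split_highmem(unsigned long low,
					  unsigned long high,
					  unsigned long limit)
{
	struct highmem_split s = { 0, 0 };

	assert(low <= high);

	if (low >= limit) {
		/* The whole Highmem range lies above the boundary. */
		s.removable = high - low;
	} else if (high > limit) {
		/* The boundary falls inside the Highmem range. */
		s.removable = high - limit;
		s.fixed = limit - low;
	} else {
		/* The whole Highmem range lies below the boundary. */
		s.fixed = high - low;
	}
	return s;
}
```

Because the test boundary is the midpoint of low and high, it always falls inside the Highmem range whenever Highmem is non-empty, so only the middle case is exercised and Highmem is cut roughly in half. The other two cases would matter once the boundary comes from hardware information.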
--
Yasunori Goto <ygoto at us.fujitsu.com>