* [PATCH 1/2] memory-hotplug: fix wrong edge when hot add a new node @ 2015-08-11 7:26 Xishi Qiu 2015-08-11 7:33 ` [PATCH 2/2] memory-hotplug: remove reset_node_managed_pages() and reset_node_managed_pages() in hotadd_new_pgdat() Xishi Qiu 0 siblings, 1 reply; 7+ messages in thread From: Xishi Qiu @ 2015-08-11 7:26 UTC (permalink / raw) To: Andrew Morton, Yasuaki Ishimatsu, Kamezawa Hiroyuki, izumi.taku, Tang Chen, Gu Zheng Cc: Xishi Qiu, Linux MM, LKML When we add a new node, the edge of memory may be wrong. e.g. system has 4 nodes, and node3 is movable, node3 mem:[24G-32G], 1. hotremove the node3, 2. then hotadd node3 with a part of memory, mem:[26G-30G], 3. call hotadd_new_pgdat() free_area_init_node() get_pfn_range_for_nid() 4. it will return wrong start_pfn and end_pfn, because we have not update the memblock. This patch also fix another bug, please see http://marc.info/?l=linux-kernel&m=142961156129456&w=2 Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> --- mm/memory_hotplug.c | 3 +++ mm/page_alloc.c | 8 ++++++++ 2 files changed, 11 insertions(+) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 26fbba7..11f26cc 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1269,6 +1269,7 @@ int __ref add_memory(int nid, u64 start, u64 size) /* create new memmap entry */ firmware_map_add_hotplug(start, start + size, "System RAM"); + memblock_add_node(start, size, nid); goto out; @@ -2005,6 +2006,8 @@ void __ref remove_memory(int nid, u64 start, u64 size) /* remove memmap entry */ firmware_map_remove(start, start + size, "System RAM"); + memblock_free(start, size); + memblock_remove(start, size); arch_remove_memory(start, size); diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 1b91d37..7ca12b9 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5050,6 +5050,10 @@ static unsigned long __meminit zone_spanned_pages_in_node(int nid, { unsigned long zone_start_pfn, zone_end_pfn; + /* When hotadd a new node, the node should be empty */ + if (!node_start_pfn && !node_end_pfn) + return 0; + /* Get the start and end of the zone */ zone_start_pfn = arch_zone_lowest_possible_pfn[zone_type]; zone_end_pfn = arch_zone_highest_possible_pfn[zone_type]; @@ -5113,6 +5117,10 @@ static unsigned long __meminit zone_absent_pages_in_node(int nid, unsigned long zone_high = arch_zone_highest_possible_pfn[zone_type]; unsigned long zone_start_pfn, zone_end_pfn; + /* When hotadd a new node, the node should be empty */ + if (!node_start_pfn && !node_end_pfn) + return 0; + zone_start_pfn = clamp(node_start_pfn, zone_low, zone_high); zone_end_pfn = clamp(node_end_pfn, zone_low, zone_high); -- 2.0.0 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 2/2] memory-hotplug: remove reset_node_managed_pages() and reset_node_managed_pages() in hotadd_new_pgdat() 2015-08-11 7:26 [PATCH 1/2] memory-hotplug: fix wrong edge when hot add a new node Xishi Qiu @ 2015-08-11 7:33 ` Xishi Qiu 2015-08-23 10:28 ` Tang Chen 0 siblings, 1 reply; 7+ messages in thread From: Xishi Qiu @ 2015-08-11 7:33 UTC (permalink / raw) To: Andrew Morton, Yasuaki Ishimatsu, Kamezawa Hiroyuki, izumi.taku, Tang Chen, Gu Zheng Cc: Xishi Qiu, Linux MM, LKML After hotadd_new_pgdat()->free_area_init_node(), the pgdat and zone's spanned/present are both 0, so remove reset_node_managed_pages() and reset_node_managed_pages(). Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> --- mm/memory_hotplug.c | 25 ------------------------- 1 file changed, 25 deletions(-) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 11f26cc..997dfad 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1065,16 +1065,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ } #endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */ -static void reset_node_present_pages(pg_data_t *pgdat) -{ - struct zone *z; - - for (z = pgdat->node_zones; z < pgdat->node_zones + MAX_NR_ZONES; z++) - z->present_pages = 0; - - pgdat->node_present_pages = 0; -} - /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */ static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start) { @@ -1109,21 +1099,6 @@ static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start) build_all_zonelists(pgdat, NULL); mutex_unlock(&zonelists_mutex); - /* - * zone->managed_pages is set to an approximate value in - * free_area_init_core(), which will cause - * /sys/device/system/node/nodeX/meminfo has wrong data. - * So reset it to 0 before any memory is onlined. - */ - reset_node_managed_pages(pgdat); - - /* - * When memory is hot-added, all the memory is in offline state. So - * clear all zones' present_pages because they will be updated in - * online_pages() and offline_pages(). - */ - reset_node_present_pages(pgdat); - return pgdat; } -- 2.0.0 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] memory-hotplug: remove reset_node_managed_pages() and reset_node_managed_pages() in hotadd_new_pgdat() 2015-08-11 7:33 ` [PATCH 2/2] memory-hotplug: remove reset_node_managed_pages() and reset_node_managed_pages() in hotadd_new_pgdat() Xishi Qiu @ 2015-08-23 10:28 ` Tang Chen 2015-08-24 9:17 ` Xishi Qiu 0 siblings, 1 reply; 7+ messages in thread From: Tang Chen @ 2015-08-23 10:28 UTC (permalink / raw) To: Xishi Qiu, Andrew Morton, Yasuaki Ishimatsu, Kamezawa Hiroyuki, izumi.taku Cc: tangchen, Linux MM, LKML, Yasuaki Ishimatsu Hi Shi, Sorry for the late reply. I hope it won't be too late. NON-ACK by me, I think. I noticed that your first has been merged. But it won't fix the problem these code intended to fix. After your patch 1, zone's spanned/present won't be set to 0 because: free_area_init_node() |--> get_pfn_range_for_nid(&start_pfn, &end_pfn) |--> calculate_node_totalpages(pgdat, start_pfn, end_pfn, ...) | --> zone_spanned_pages_in_node() | --> if (!node_start_pfn && !node_end_pfn) return 0; -------- false, won't return 0 | --> zone_absent_pages_in_node() | --> if (!node_start_pfn && !node_end_pfn) return 0; -------- false, won't return 0 This is caused by a little bug in your patch 1. You should put memblock_add_node(start, size, nid) before hotadd_new_pgdat() because: hotadd_new_pgdat() | --> free_area_init_node() | --> get_pfn_range_for_nid() | --> find memory ranges in memblock. | --> memblock_add_node(start, size, nid) ------------------- if you add it here, it doesn't work. The result will be like below if we hotadd node 5. [ 2007.577000] Initmem setup node 5 [mem 0x0000000000000000-0xffffffffffffffff] [ 2007.584000] On node 5 totalpages: 0 [ 2007.585000] Built 5 zonelists in Node order, mobility grouping on. Total pages: 32588823 [ 2007.594000] Policy zone: Normal [ 2007.598000] init_memory_mapping: [mem 0x60000000000-0x607ffffffff] And also, if we merge this patch, /sys/devices/system/node/nodeX/meminfo will break. Since this patch is not merged, I think let's just drop it. And about the little bug in your patch 1, since I'm in a hurry, I have already send a patch to fix it. Thanks. :) On 08/11/2015 03:33 PM, Xishi Qiu wrote: > After hotadd_new_pgdat()->free_area_init_node(), the pgdat and zone's spanned/present > are both 0, so remove reset_node_managed_pages() and reset_node_managed_pages(). > > Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> > --- > mm/memory_hotplug.c | 25 ------------------------- > 1 file changed, 25 deletions(-) > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index 11f26cc..997dfad 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -1065,16 +1065,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ > } > #endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */ > > -static void reset_node_present_pages(pg_data_t *pgdat) > -{ > - struct zone *z; > - > - for (z = pgdat->node_zones; z < pgdat->node_zones + MAX_NR_ZONES; z++) > - z->present_pages = 0; > - > - pgdat->node_present_pages = 0; > -} > - > /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */ > static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start) > { > @@ -1109,21 +1099,6 @@ static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start) > build_all_zonelists(pgdat, NULL); > mutex_unlock(&zonelists_mutex); > > - /* > - * zone->managed_pages is set to an approximate value in > - * free_area_init_core(), which will cause > - * /sys/device/system/node/nodeX/meminfo has wrong data. > - * So reset it to 0 before any memory is onlined. > - */ > - reset_node_managed_pages(pgdat); > - > - /* > - * When memory is hot-added, all the memory is in offline state. So > - * clear all zones' present_pages because they will be updated in > - * online_pages() and offline_pages(). > - */ > - reset_node_present_pages(pgdat); > - > return pgdat; > } > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] memory-hotplug: remove reset_node_managed_pages() and reset_node_managed_pages() in hotadd_new_pgdat() 2015-08-23 10:28 ` Tang Chen @ 2015-08-24 9:17 ` Xishi Qiu 2015-08-24 9:39 ` Tang Chen 0 siblings, 1 reply; 7+ messages in thread From: Xishi Qiu @ 2015-08-24 9:17 UTC (permalink / raw) To: Tang Chen Cc: Andrew Morton, Yasuaki Ishimatsu, Kamezawa Hiroyuki, izumi.taku, Linux MM, LKML, Yasuaki Ishimatsu On 2015/8/23 18:28, Tang Chen wrote: > Hi Shi, > > Sorry for the late reply. I hope it won't be too late. > > NON-ACK by me, I think. > > I noticed that your first has been merged. But it won't fix the problem > these code intended to fix. > > After your patch 1, zone's spanned/present won't be set to 0 because: > > free_area_init_node() > |--> get_pfn_range_for_nid(&start_pfn, &end_pfn) > |--> calculate_node_totalpages(pgdat, start_pfn, end_pfn, ...) > | --> zone_spanned_pages_in_node() > | --> if (!node_start_pfn && !node_end_pfn) return 0; -------- false, won't return 0 > | --> zone_absent_pages_in_node() > | --> if (!node_start_pfn && !node_end_pfn) return 0; -------- false, won't return 0 > Hi Tang, Thank you for your reply. When we add a new node, it is not included in the memblock, I have no idea why zone_spanned_pages_in_node() and zone_absent_pages_in_node() won't return 0. Do you add some debug code and print it? > This is caused by a little bug in your patch 1. > > You should put memblock_add_node(start, size, nid) before hotadd_new_pgdat() > because: > My patch is just add an empty node first, later __add_zone() will update the size. But it is all right put here, and it can fix the print bug. > hotadd_new_pgdat() > | --> free_area_init_node() > | --> get_pfn_range_for_nid() > | --> find memory ranges in memblock. > > | --> memblock_add_node(start, size, nid) ------------------- if you add it here, it doesn't work. > > The result will be like below if we hotadd node 5. > [ 2007.577000] Initmem setup node 5 [mem 0x0000000000000000-0xffffffffffffffff] pr_info("Initmem setup node %d [mem %#018Lx-%#018Lx]\n", nid, (u64)start_pfn << PAGE_SHIFT, ((u64)end_pfn << PAGE_SHIFT) - 1); start_pfn and end_pfn are both 0, and 0-1 -> 0xffffffffffffffff, right? > [ 2007.584000] On node 5 totalpages: 0 > [ 2007.585000] Built 5 zonelists in Node order, mobility grouping on. Total pages: 32588823 > [ 2007.594000] Policy zone: Normal > [ 2007.598000] init_memory_mapping: [mem 0x60000000000-0x607ffffffff] > > > And also, if we merge this patch, /sys/devices/system/node/nodeX/meminfo will break. > trigger call trace? > > Since this patch is not merged, I think let's just drop it. > > And about the little bug in your patch 1, since I'm in a hurry, I have already send a patch to fix it. > > > Thanks. :) > > > On 08/11/2015 03:33 PM, Xishi Qiu wrote: >> After hotadd_new_pgdat()->free_area_init_node(), the pgdat and zone's spanned/present >> are both 0, so remove reset_node_managed_pages() and reset_node_managed_pages(). >> >> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> >> --- >> mm/memory_hotplug.c | 25 ------------------------- >> 1 file changed, 25 deletions(-) >> >> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c >> index 11f26cc..997dfad 100644 >> --- a/mm/memory_hotplug.c >> +++ b/mm/memory_hotplug.c >> @@ -1065,16 +1065,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ >> } >> #endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */ >> -static void reset_node_present_pages(pg_data_t *pgdat) >> -{ >> - struct zone *z; >> - >> - for (z = pgdat->node_zones; z < pgdat->node_zones + MAX_NR_ZONES; z++) >> - z->present_pages = 0; >> - >> - pgdat->node_present_pages = 0; >> -} >> - >> /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */ >> static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start) >> { >> @@ -1109,21 +1099,6 @@ static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start) >> build_all_zonelists(pgdat, NULL); >> mutex_unlock(&zonelists_mutex); >> - /* >> - * zone->managed_pages is set to an approximate value in >> - * free_area_init_core(), which will cause >> - * /sys/device/system/node/nodeX/meminfo has wrong data. >> - * So reset it to 0 before any memory is onlined. >> - */ >> - reset_node_managed_pages(pgdat); >> - >> - /* >> - * When memory is hot-added, all the memory is in offline state. So >> - * clear all zones' present_pages because they will be updated in >> - * online_pages() and offline_pages(). >> - */ >> - reset_node_present_pages(pgdat); >> - >> return pgdat; >> } >> > > > . > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] memory-hotplug: remove reset_node_managed_pages() and reset_node_managed_pages() in hotadd_new_pgdat() 2015-08-24 9:17 ` Xishi Qiu @ 2015-08-24 9:39 ` Tang Chen 2015-08-24 11:24 ` Xishi Qiu 0 siblings, 1 reply; 7+ messages in thread From: Tang Chen @ 2015-08-24 9:39 UTC (permalink / raw) To: Xishi Qiu Cc: Andrew Morton, Yasuaki Ishimatsu, Kamezawa Hiroyuki, izumi.taku, Linux MM, LKML, Yasuaki Ishimatsu On 08/24/2015 05:17 PM, Xishi Qiu wrote: > On 2015/8/23 18:28, Tang Chen wrote: > >> Hi Shi, >> >> Sorry for the late reply. I hope it won't be too late. >> >> NON-ACK by me, I think. >> >> I noticed that your first has been merged. But it won't fix the problem >> these code intended to fix. >> >> After your patch 1, zone's spanned/present won't be set to 0 because: >> >> free_area_init_node() >> |--> get_pfn_range_for_nid(&start_pfn, &end_pfn) >> |--> calculate_node_totalpages(pgdat, start_pfn, end_pfn, ...) >> | --> zone_spanned_pages_in_node() >> | --> if (!node_start_pfn && !node_end_pfn) return 0; -------- false, won't return 0 >> | --> zone_absent_pages_in_node() >> | --> if (!node_start_pfn && !node_end_pfn) return 0; -------- false, won't return 0 >> > Hi Tang, > > Thank you for your reply. When we add a new node, it is not included in the > memblock, I have no idea why zone_spanned_pages_in_node() and zone_absent_pages_in_node() > won't return 0. Do you add some debug code and print it? Sorry, I didn't say it clearly since I was in a hurry. If we move memblock_add_node() forward before hotadd_new_pgdat(), then node_start_pfn and node_end_pfn won't be 0. And zone_spanned_pages_in_node() and zone_absent_pages_in_node() won't return 0. In your patch, they will return 0. And that leads to the print problem. > >> This is caused by a little bug in your patch 1. >> >> You should put memblock_add_node(start, size, nid) before hotadd_new_pgdat() >> because: >> > My patch is just add an empty node first, later __add_zone() will update the size. > But it is all right put here, and it can fix the print bug. Yes. I just found this bug in the log. > >> hotadd_new_pgdat() >> | --> free_area_init_node() >> | --> get_pfn_range_for_nid() >> | --> find memory ranges in memblock. >> >> | --> memblock_add_node(start, size, nid) ------------------- if you add it here, it doesn't work. >> >> The result will be like below if we hotadd node 5. >> [ 2007.577000] Initmem setup node 5 [mem 0x0000000000000000-0xffffffffffffffff] > pr_info("Initmem setup node %d [mem %#018Lx-%#018Lx]\n", nid, > (u64)start_pfn << PAGE_SHIFT, ((u64)end_pfn << PAGE_SHIFT) - 1); > start_pfn and end_pfn are both 0, and 0-1 -> 0xffffffffffffffff, right? Yes. > >> [ 2007.584000] On node 5 totalpages: 0 >> [ 2007.585000] Built 5 zonelists in Node order, mobility grouping on. Total pages: 32588823 >> [ 2007.594000] Policy zone: Normal >> [ 2007.598000] init_memory_mapping: [mem 0x60000000000-0x607ffffffff] >> >> >> And also, if we merge this patch, /sys/devices/system/node/nodeX/meminfo will break. >> > trigger call trace? No. There is no error output. But if you see /sys/devices/system/node/nodeX/meminfo, memory size will double because totalpages is calculated once here, and one more time when onlining memory. I think reset it to 0 in add_memory() is just a temporary solution. Will improve it in the future. Thanks. > >> Since this patch is not merged, I think let's just drop it. >> >> And about the little bug in your patch 1, since I'm in a hurry, I have already send a patch to fix it. >> >> >> Thanks. :) >> >> >> On 08/11/2015 03:33 PM, Xishi Qiu wrote: >>> After hotadd_new_pgdat()->free_area_init_node(), the pgdat and zone's spanned/present >>> are both 0, so remove reset_node_managed_pages() and reset_node_managed_pages(). >>> >>> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> >>> --- >>> mm/memory_hotplug.c | 25 ------------------------- >>> 1 file changed, 25 deletions(-) >>> >>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c >>> index 11f26cc..997dfad 100644 >>> --- a/mm/memory_hotplug.c >>> +++ b/mm/memory_hotplug.c >>> @@ -1065,16 +1065,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ >>> } >>> #endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */ >>> -static void reset_node_present_pages(pg_data_t *pgdat) >>> -{ >>> - struct zone *z; >>> - >>> - for (z = pgdat->node_zones; z < pgdat->node_zones + MAX_NR_ZONES; z++) >>> - z->present_pages = 0; >>> - >>> - pgdat->node_present_pages = 0; >>> -} >>> - >>> /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */ >>> static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start) >>> { >>> @@ -1109,21 +1099,6 @@ static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start) >>> build_all_zonelists(pgdat, NULL); >>> mutex_unlock(&zonelists_mutex); >>> - /* >>> - * zone->managed_pages is set to an approximate value in >>> - * free_area_init_core(), which will cause >>> - * /sys/device/system/node/nodeX/meminfo has wrong data. >>> - * So reset it to 0 before any memory is onlined. >>> - */ >>> - reset_node_managed_pages(pgdat); >>> - >>> - /* >>> - * When memory is hot-added, all the memory is in offline state. So >>> - * clear all zones' present_pages because they will be updated in >>> - * online_pages() and offline_pages(). >>> - */ >>> - reset_node_present_pages(pgdat); >>> - >>> return pgdat; >>> } >>> >> >> . >> > > > . > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] memory-hotplug: remove reset_node_managed_pages() and reset_node_managed_pages() in hotadd_new_pgdat() 2015-08-24 9:39 ` Tang Chen @ 2015-08-24 11:24 ` Xishi Qiu 2015-08-26 7:19 ` Tang Chen 0 siblings, 1 reply; 7+ messages in thread From: Xishi Qiu @ 2015-08-24 11:24 UTC (permalink / raw) To: Tang Chen Cc: Andrew Morton, Yasuaki Ishimatsu, Kamezawa Hiroyuki, izumi.taku, Linux MM, LKML, Yasuaki Ishimatsu On 2015/8/24 17:39, Tang Chen wrote: > > On 08/24/2015 05:17 PM, Xishi Qiu wrote: >> On 2015/8/23 18:28, Tang Chen wrote: >> >>> Hi Shi, >>> >>> Sorry for the late reply. I hope it won't be too late. >>> >>> NON-ACK by me, I think. >>> >>> I noticed that your first has been merged. But it won't fix the problem >>> these code intended to fix. >>> >>> After your patch 1, zone's spanned/present won't be set to 0 because: >>> >>> free_area_init_node() >>> |--> get_pfn_range_for_nid(&start_pfn, &end_pfn) >>> |--> calculate_node_totalpages(pgdat, start_pfn, end_pfn, ...) >>> | --> zone_spanned_pages_in_node() >>> | --> if (!node_start_pfn && !node_end_pfn) return 0; -------- false, won't return 0 >>> | --> zone_absent_pages_in_node() >>> | --> if (!node_start_pfn && !node_end_pfn) return 0; -------- false, won't return 0 >>> >> Hi Tang, >> >> Thank you for your reply. When we add a new node, it is not included in the >> memblock, I have no idea why zone_spanned_pages_in_node() and zone_absent_pages_in_node() >> won't return 0. Do you add some debug code and print it? > > Sorry, I didn't say it clearly since I was in a hurry. > > If we move memblock_add_node() forward before hotadd_new_pgdat(), > then node_start_pfn and node_end_pfn won't be 0. And > zone_spanned_pages_in_node() and zone_absent_pages_in_node() won't > return 0. > > In your patch, they will return 0. And that leads to the print problem. > >> >>> This is caused by a little bug in your patch 1. >>> >>> You should put memblock_add_node(start, size, nid) before hotadd_new_pgdat() >>> because: >>> >> My patch is just add an empty node first, later __add_zone() will update the size. >> But it is all right put here, and it can fix the print bug. > > Yes. I just found this bug in the log. > >> >>> hotadd_new_pgdat() >>> | --> free_area_init_node() >>> | --> get_pfn_range_for_nid() >>> | --> find memory ranges in memblock. >>> >>> | --> memblock_add_node(start, size, nid) ------------------- if you add it here, it doesn't work. >>> >>> The result will be like below if we hotadd node 5. >>> [ 2007.577000] Initmem setup node 5 [mem 0x0000000000000000-0xffffffffffffffff] >> pr_info("Initmem setup node %d [mem %#018Lx-%#018Lx]\n", nid, >> (u64)start_pfn << PAGE_SHIFT, ((u64)end_pfn << PAGE_SHIFT) - 1); >> start_pfn and end_pfn are both 0, and 0-1 -> 0xffffffffffffffff, right? > > Yes. > >> >>> [ 2007.584000] On node 5 totalpages: 0 >>> [ 2007.585000] Built 5 zonelists in Node order, mobility grouping on. Total pages: 32588823 >>> [ 2007.594000] Policy zone: Normal >>> [ 2007.598000] init_memory_mapping: [mem 0x60000000000-0x607ffffffff] >>> >>> >>> And also, if we merge this patch, /sys/devices/system/node/nodeX/meminfo will break. >>> >> trigger call trace? > > No. There is no error output. But if you see /sys/devices/system/node/nodeX/meminfo, > memory size will double because totalpages is calculated once here, and one more time > when onlining memory. > Hi Tang, Do you mean si_meminfo_node() -> val->totalram = managed_pages; will be added double? But my patch will keep it 0 in hotadd_new_pgdat(), so it will not be double, right? Thanks, Xishi Qiu > I think reset it to 0 in add_memory() is just a temporary solution. Will improve it in the > future. > > Thanks. > >> >>> Since this patch is not merged, I think let's just drop it. >>> >>> And about the little bug in your patch 1, since I'm in a hurry, I have already send a patch to fix it. >>> >>> >>> Thanks. :) >>> >>> >>> On 08/11/2015 03:33 PM, Xishi Qiu wrote: >>>> After hotadd_new_pgdat()->free_area_init_node(), the pgdat and zone's spanned/present >>>> are both 0, so remove reset_node_managed_pages() and reset_node_managed_pages(). >>>> >>>> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> >>>> --- >>>> mm/memory_hotplug.c | 25 ------------------------- >>>> 1 file changed, 25 deletions(-) >>>> >>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c >>>> index 11f26cc..997dfad 100644 >>>> --- a/mm/memory_hotplug.c >>>> +++ b/mm/memory_hotplug.c >>>> @@ -1065,16 +1065,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ >>>> } >>>> #endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */ >>>> -static void reset_node_present_pages(pg_data_t *pgdat) >>>> -{ >>>> - struct zone *z; >>>> - >>>> - for (z = pgdat->node_zones; z < pgdat->node_zones + MAX_NR_ZONES; z++) >>>> - z->present_pages = 0; >>>> - >>>> - pgdat->node_present_pages = 0; >>>> -} >>>> - >>>> /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */ >>>> static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start) >>>> { >>>> @@ -1109,21 +1099,6 @@ static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start) >>>> build_all_zonelists(pgdat, NULL); >>>> mutex_unlock(&zonelists_mutex); >>>> - /* >>>> - * zone->managed_pages is set to an approximate value in >>>> - * free_area_init_core(), which will cause >>>> - * /sys/device/system/node/nodeX/meminfo has wrong data. >>>> - * So reset it to 0 before any memory is onlined. >>>> - */ >>>> - reset_node_managed_pages(pgdat); >>>> - >>>> - /* >>>> - * When memory is hot-added, all the memory is in offline state. So >>>> - * clear all zones' present_pages because they will be updated in >>>> - * online_pages() and offline_pages(). >>>> - */ >>>> - reset_node_present_pages(pgdat); >>>> - >>>> return pgdat; >>>> } >>>> >>> >>> . >>> >> >> >> . >> > > > . > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] memory-hotplug: remove reset_node_managed_pages() and reset_node_managed_pages() in hotadd_new_pgdat() 2015-08-24 11:24 ` Xishi Qiu @ 2015-08-26 7:19 ` Tang Chen 0 siblings, 0 replies; 7+ messages in thread From: Tang Chen @ 2015-08-26 7:19 UTC (permalink / raw) To: Xishi Qiu Cc: Andrew Morton, Yasuaki Ishimatsu, Kamezawa Hiroyuki, izumi.taku, Linux MM, LKML, Yasuaki Ishimatsu On 08/24/2015 07:24 PM, Xishi Qiu wrote: > ...... >>>> [ 2007.584000] On node 5 totalpages: 0 >>>> [ 2007.585000] Built 5 zonelists in Node order, mobility grouping on. Total pages: 32588823 >>>> [ 2007.594000] Policy zone: Normal >>>> [ 2007.598000] init_memory_mapping: [mem 0x60000000000-0x607ffffffff] >>>> >>>> >>>> And also, if we merge this patch, /sys/devices/system/node/nodeX/meminfo will break. >>>> >>> trigger call trace? >> No. There is no error output. But if you see /sys/devices/system/node/nodeX/meminfo, >> memory size will double because totalpages is calculated once here, and one more time >> when onlining memory. >> > Hi Tang, > > Do you mean si_meminfo_node() -> val->totalram = managed_pages; will be added double? > But my patch will keep it 0 in hotadd_new_pgdat(), so it will not be double, right? > Hi, I mean this: online_pages() |--> zone->zone_pgdat->node_present_pages += onlined_pages; It will be double. Since meminfo data is retrieved from these kernel structures, /proc/meminfo will be broken. Actually speaking, reset it when hot-adding memory is not a good idea. We should make the memory init code be suitable for both boot code and memory hot-plug code. Thanks. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2015-08-26 7:20 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-08-11 7:26 [PATCH 1/2] memory-hotplug: fix wrong edge when hot add a new node Xishi Qiu 2015-08-11 7:33 ` [PATCH 2/2] memory-hotplug: remove reset_node_managed_pages() and reset_node_managed_pages() in hotadd_new_pgdat() Xishi Qiu 2015-08-23 10:28 ` Tang Chen 2015-08-24 9:17 ` Xishi Qiu 2015-08-24 9:39 ` Tang Chen 2015-08-24 11:24 ` Xishi Qiu 2015-08-26 7:19 ` Tang Chen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox