* memory hotplug: hot-remove fails on lowest chunk in ZONE_MOVABLE @ 2008-07-22 16:55 Gerald Schaefer 2008-07-23 2:48 ` Yasunori Goto 0 siblings, 1 reply; 11+ messages in thread From: Gerald Schaefer @ 2008-07-22 16:55 UTC (permalink / raw) To: linux-kernel Cc: linux-mm, schwidefsky, heiko.carstens, KAMEZAWA Hiroyuki, Yasunori Goto, Dave Hansen, Andy Whitcroft I've been testing memory hotplug on s390, on a system that starts w/o memory in ZONE_MOVABLE at first, and then some memory chunks will be added to ZONE_MOVABLE via memory hot-add. Now I observe the following problem: Memory hot-remove of the lowest memory chunk in ZONE_MOVABLE will fail because of some reserved pages at the beginning of each zone (MIGRATE_RESERVED). During memory hot-add, setup_per_zone_pages_min() will be called from online_pages() to redistribute/recalculate the reserved page blocks. This will mark some page blocks at the beginning of each zone as MIGRATE_RESERVE. Now, the memory chunk containing these blocks cannot be set offline again, because only MIGRATE_MOVABLE pages can be isolated (offline_pages -> start_isolate_page_range). So you cannot remove all the memory chunks that have been added via memory hotplug. I'm not sure if I am missing something here, or if this really is a bug. Any thoughts? Thanks, Gerald -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: memory hotplug: hot-remove fails on lowest chunk in ZONE_MOVABLE 2008-07-22 16:55 memory hotplug: hot-remove fails on lowest chunk in ZONE_MOVABLE Gerald Schaefer @ 2008-07-23 2:48 ` Yasunori Goto 2008-07-29 16:07 ` Gerald Schaefer 0 siblings, 1 reply; 11+ messages in thread From: Yasunori Goto @ 2008-07-23 2:48 UTC (permalink / raw) To: Gerald Schaefer Cc: linux-kernel, linux-mm, schwidefsky, heiko.carstens, KAMEZAWA Hiroyuki, Dave Hansen, Andy Whitcroft Hi. > I've been testing memory hotplug on s390, on a system that starts w/o > memory in ZONE_MOVABLE at first, and then some memory chunks will be > added to ZONE_MOVABLE via memory hot-add. Now I observe the following > problem: > > Memory hot-remove of the lowest memory chunk in ZONE_MOVABLE will fail > because of some reserved pages at the beginning of each zone > (MIGRATE_RESERVED). > > During memory hot-add, setup_per_zone_pages_min() will be called from > online_pages() to redistribute/recalculate the reserved page blocks. > This will mark some page blocks at the beginning of each zone as > MIGRATE_RESERVE. Now, the memory chunk containing these blocks cannot > be set offline again, because only MIGRATE_MOVABLE pages can be isolated > (offline_pages -> start_isolate_page_range). > > So you cannot remove all the memory chunks that have been added via > memory hotplug. I'm not sure if I am missing something here, or if this > really is a bug. Any thoughts? I believe you are right. Current hot-remove code is NOT perfect. You may remove some sections, but may not other sections, because there are some un-removable pages by some reasons (not only MIGRATE_RESERVED). I think MIGRATE_RESERVED pages should be move to MIGRATE_MOVABLE when those pages must be removed, and should recalculate MIGRATE_RESERVED pages. Bye. -- Yasunori Goto -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
* Re: memory hotplug: hot-remove fails on lowest chunk in ZONE_MOVABLE 2008-07-23 2:48 ` Yasunori Goto @ 2008-07-29 16:07 ` Gerald Schaefer 2008-07-30 3:16 ` Yasunori Goto 0 siblings, 1 reply; 11+ messages in thread From: Gerald Schaefer @ 2008-07-29 16:07 UTC (permalink / raw) To: Yasunori Goto Cc: linux-kernel, linux-mm, schwidefsky, heiko.carstens, KAMEZAWA Hiroyuki, Dave Hansen, Andy Whitcroft, Christoph Lameter, Nick Piggin, Peter Zijlstra On Wed, 2008-07-23 at 11:48 +0900, Yasunori Goto wrote: > > Memory hot-remove of the lowest memory chunk in ZONE_MOVABLE will fail > > because of some reserved pages at the beginning of each zone > > (MIGRATE_RESERVED). > > > I believe you are right. Current hot-remove code is NOT perfect. > You may remove some sections, but may not other sections, > because there are some un-removable pages by some reasons > (not only MIGRATE_RESERVED). > > I think MIGRATE_RESERVED pages should be move to MIGRATE_MOVABLE when > those pages must be removed, and should recalculate MIGRATE_RESERVED pages. Hi, Would it be an option to set pages_min to 0 for ZONE_MOVABLE in setup_per_zone_pages_min()? This would avoid the MIGRATE_RESERVED vs. MIGRATE_MOVABLE conflict on memory hot-remove. If I understand it correctly, the kernel wouldn't be able to use the reserved pages in ZONE_MOVABLE for __GFP_HIGH and PF_MEMALLOC allocations anyway, right? At the moment, ZONE_MOVABLE pages will also account for the lowmem_pages calculation in setup_per_zone_pages_min(). The recalculation will then redistribute and reduce the amount of reserved pages for the other zones. Won't this effectively reduce the amount of reserved min_free_kbytes memory that is available to the kernel, even getting worse the more memory is added to ZONE_MOVABLE? With the following patch, ZONE_MOVABLE will be skipped for the lowmem_pages calculation, just like it is already done for highmem. It will also set pages_min to 0 for ZONE_MOVABLE. 
But I have an uneasy feeling about this, because I may be missing side
effects from this. Any opinions?

Thanks,
Gerald

---
 include/linux/mmzone.h |    5 +++++
 mm/page_alloc.c        |    4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)

Index: linux-2.6/include/linux/mmzone.h
===================================================================
--- linux-2.6.orig/include/linux/mmzone.h
+++ linux-2.6/include/linux/mmzone.h
@@ -660,6 +660,11 @@ static inline int is_dma(struct zone *zo
 #endif
 }
 
+static inline int is_movable(struct zone *zone)
+{
+	return zone == zone->zone_pgdat->node_zones + ZONE_MOVABLE;
+}
+
 /* These two functions are used to setup the per zone pages min values */
 struct ctl_table;
 struct file;
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -4210,7 +4210,7 @@ void setup_per_zone_pages_min(void)
 
 	/* Calculate total number of !ZONE_HIGHMEM pages */
 	for_each_zone(zone) {
-		if (!is_highmem(zone))
+		if (!is_highmem(zone) && !is_movable(zone))
 			lowmem_pages += zone->present_pages;
 	}
 
@@ -4243,7 +4243,7 @@ void setup_per_zone_pages_min(void)
 			 * If it's a lowmem zone, reserve a number of pages
 			 * proportionate to the zone's size.
 			 */
-			zone->pages_min = tmp;
+			zone->pages_min = is_movable(zone) ? 0 : tmp;
 		}
 
 		zone->pages_low = zone->pages_min + (tmp >> 2);
* Re: memory hotplug: hot-remove fails on lowest chunk in ZONE_MOVABLE 2008-07-29 16:07 ` Gerald Schaefer @ 2008-07-30 3:16 ` Yasunori Goto 2008-07-30 12:16 ` Gerald Schaefer 0 siblings, 1 reply; 11+ messages in thread From: Yasunori Goto @ 2008-07-30 3:16 UTC (permalink / raw) To: Gerald Schaefer Cc: linux-kernel, linux-mm, schwidefsky, heiko.carstens, KAMEZAWA Hiroyuki, Dave Hansen, Andy Whitcroft, Christoph Lameter, Nick Piggin, Peter Zijlstra > On Wed, 2008-07-23 at 11:48 +0900, Yasunori Goto wrote: > > > Memory hot-remove of the lowest memory chunk in ZONE_MOVABLE will fail > > > because of some reserved pages at the beginning of each zone > > > (MIGRATE_RESERVED). > > > > > I believe you are right. Current hot-remove code is NOT perfect. > > You may remove some sections, but may not other sections, > > because there are some un-removable pages by some reasons > > (not only MIGRATE_RESERVED). > > > > I think MIGRATE_RESERVED pages should be move to MIGRATE_MOVABLE when > > those pages must be removed, and should recalculate MIGRATE_RESERVED pages. > > Hi, > > Would it be an option to set pages_min to 0 for ZONE_MOVABLE in > setup_per_zone_pages_min()? This would avoid the MIGRATE_RESERVED vs. > MIGRATE_MOVABLE conflict on memory hot-remove. If I understand it > correctly, the kernel wouldn't be able to use the reserved pages in > ZONE_MOVABLE for __GFP_HIGH and PF_MEMALLOC allocations anyway, right? > > At the moment, ZONE_MOVABLE pages will also account for the lowmem_pages > calculation in setup_per_zone_pages_min(). The recalculation will then > redistribute and reduce the amount of reserved pages for the other zones. > Won't this effectively reduce the amount of reserved min_free_kbytes memory > that is available to the kernel, even getting worse the more memory is > added to ZONE_MOVABLE? > > With the following patch, ZONE_MOVABLE will be skipped for the > lowmem_pages calculation, just like it is already done for highmem. 
> It will also set pages_min to 0 for ZONE_MOVABLE. But I have an uneasy
> feeling about this, because I may be missing side effects from this.
> Any opinions?

Well, I didn't mean changing pages_min value. There may be side effect as
you are saying.
I meant if some pages were MIGRATE_RESERVE attribute when hot-remove are
-executing-, their attribute should be changed.

For example, how is like following dummy code? Is it impossible?
(Not only here, some places will have to be modified..)

Thanks.

---
 mm/page_alloc.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: current/mm/page_alloc.c
===================================================================
--- current.orig/mm/page_alloc.c	2008-07-29 22:17:54.000000000 +0900
+++ current/mm/page_alloc.c	2008-07-30 12:04:03.000000000 +0900
@@ -4828,7 +4828,9 @@ int set_migratetype_isolate(struct page
 	/*
 	 * In future, more migrate types will be able to be isolation target.
 	 */
-	if (get_pageblock_migratetype(page) != MIGRATE_MOVABLE)
+	if ((get_pageblock_migratetype(page) != MIGRATE_MOVABLE) &&
+	    !((removing section is the last section on the zone) &&
+	      get_pageblock_migratetype(page) == MIGRATE_RESERVE))
 		goto out;
 	set_pageblock_migratetype(page, MIGRATE_ISOLATE);
 	move_freepages_block(zone, page, MIGRATE_ISOLATE);

-- 
Yasunori Goto
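One reading of the intent behind the dummy code above is that movable pageblocks stay isolatable as before, while reserve pageblocks become isolatable only in the special case the patch leaves unspecified. A minimal user-space sketch of that predicate follows. Everything named here is hypothetical: `enum migratetype_sketch`, `may_isolate_pageblock()` and the `section_at_zone_edge` parameter are illustrative stand-ins, not kernel identifiers, and the enum values do not match any kernel version.

```c
#include <stdbool.h>

/* Illustrative subset of pageblock migrate types (values are arbitrary). */
enum migratetype_sketch {
	SK_MIGRATE_UNMOVABLE,
	SK_MIGRATE_MOVABLE,
	SK_MIGRATE_RESERVE,
};

/*
 * Hypothetical helper: may this pageblock be isolated for hot-remove?
 * `section_at_zone_edge` stands in for the unspecified condition in the
 * dummy patch ("removing section is the last section on the zone").
 */
static bool may_isolate_pageblock(enum migratetype_sketch mt,
				  bool section_at_zone_edge)
{
	/* Movable pageblocks can always be isolated. */
	if (mt == SK_MIGRATE_MOVABLE)
		return true;

	/* Reserve pageblocks are accepted only in the special case. */
	return mt == SK_MIGRATE_RESERVE && section_at_zone_edge;
}
```

In kernel terms the `goto out` path would correspond to this helper returning false. As the message notes, other places would also have to be modified, so this only captures the decision itself.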
* Re: memory hotplug: hot-remove fails on lowest chunk in ZONE_MOVABLE 2008-07-30 3:16 ` Yasunori Goto @ 2008-07-30 12:16 ` Gerald Schaefer 2008-07-31 5:16 ` Yasunori Goto 2008-07-31 13:22 ` Mel Gorman 0 siblings, 2 replies; 11+ messages in thread From: Gerald Schaefer @ 2008-07-30 12:16 UTC (permalink / raw) To: Yasunori Goto Cc: linux-kernel, linux-mm, schwidefsky, heiko.carstens, KAMEZAWA Hiroyuki, Dave Hansen, Andy Whitcroft, Christoph Lameter, Nick Piggin, Peter Zijlstra, Mel Gorman, Andrew Morton On Wed, 2008-07-30 at 12:16 +0900, Yasunori Goto wrote: > Well, I didn't mean changing pages_min value. There may be side effect as > you are saying. > I meant if some pages were MIGRATE_RESERVE attribute when hot-remove are > -executing-, their attribute should be changed. > > For example, how is like following dummy code? Is it impossible? > (Not only here, some places will have to be modified..) Right, this should be possible. I was somewhat wandering from the subject, because I noticed that there may be a bigger problem with MIGRATE_RESERVE pages in ZONE_MOVABLE, and that we may not want to have them in the first place. The more memory we add to ZONE_MOVABLE, the less reserved pages will remain to the other zones. In setup_per_zone_pages_min(), min_free_kbytes will be redistributed to a zone where the kernel cannot make any use of it, effectively reducing the available min_free_kbytes. This just doesn't sound right. I believe that a similar situation is the reason why highmem pages are skipped in the calculation and I think that we need that for ZONE_MOVABLE too. Any thoughts on that problem? Setting pages_min to 0 for ZONE_MOVABLE, while not capping pages_low and pages_high, could be an option. I don't have a sufficient memory managment overview to tell if that has negative side effects, maybe someone with a deeper insight could comment on that. Thanks, Gerald -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. 
* Re: memory hotplug: hot-remove fails on lowest chunk in ZONE_MOVABLE 2008-07-30 12:16 ` Gerald Schaefer @ 2008-07-31 5:16 ` Yasunori Goto 2008-07-31 13:22 ` Mel Gorman 1 sibling, 0 replies; 11+ messages in thread From: Yasunori Goto @ 2008-07-31 5:16 UTC (permalink / raw) To: Gerald Schaefer, Mel Gorman Cc: linux-kernel, linux-mm, schwidefsky, heiko.carstens, KAMEZAWA Hiroyuki, Dave Hansen, Andy Whitcroft, Christoph Lameter, Nick Piggin, Peter Zijlstra, Andrew Morton > On Wed, 2008-07-30 at 12:16 +0900, Yasunori Goto wrote: > > Well, I didn't mean changing pages_min value. There may be side effect as > > you are saying. > > I meant if some pages were MIGRATE_RESERVE attribute when hot-remove are > > -executing-, their attribute should be changed. > > > > For example, how is like following dummy code? Is it impossible? > > (Not only here, some places will have to be modified..) > > Right, this should be possible. I was somewhat wandering from the subject, > because I noticed that there may be a bigger problem with MIGRATE_RESERVE > pages in ZONE_MOVABLE, and that we may not want to have them in the first > place. > > The more memory we add to ZONE_MOVABLE, the less reserved pages will > remain to the other zones. In setup_per_zone_pages_min(), min_free_kbytes > will be redistributed to a zone where the kernel cannot make any use of > it, effectively reducing the available min_free_kbytes. This just doesn't > sound right. I believe that a similar situation is the reason why highmem > pages are skipped in the calculation and I think that we need that for > ZONE_MOVABLE too. Any thoughts on that problem? > > Setting pages_min to 0 for ZONE_MOVABLE, while not capping pages_low > and pages_high, could be an option. I don't have a sufficient memory > managment overview to tell if that has negative side effects, maybe > someone with a deeper insight could comment on that. At least, pages_min should not be 0. It is used as watermark when memory shortage situation. 
If it is 0, the kernel will misjudge a memory shortage situation.
Certainly, the pages_min value may not be appropriate for ZONE_MOVABLE.
But that is not a memory-hotplug issue.

Your real question is why ZONE_MOVABLE has MIGRATE_RESERVE pages, right?
However, I think they are intended as an emergency pool for memory shortage
situations in ZONE_MOVABLE via fallback[]. If not, these MIGRATE_RESERVE
pages would not have been created in the first place. That is why I wrote
my previous mail.

Mel Gorman-san knows this area very well. He may be able to explain it in
more detail.

Bye.

-- 
Yasunori Goto
* Re: memory hotplug: hot-remove fails on lowest chunk in ZONE_MOVABLE 2008-07-30 12:16 ` Gerald Schaefer 2008-07-31 5:16 ` Yasunori Goto @ 2008-07-31 13:22 ` Mel Gorman 2008-07-31 17:45 ` memory hotplug: hot-add to ZONE_MOVABLE vs. min_free_kbytes Gerald Schaefer 1 sibling, 1 reply; 11+ messages in thread From: Mel Gorman @ 2008-07-31 13:22 UTC (permalink / raw) To: Gerald Schaefer Cc: Yasunori Goto, linux-kernel, linux-mm, schwidefsky, heiko.carstens, KAMEZAWA Hiroyuki, Dave Hansen, Andy Whitcroft, Christoph Lameter, Nick Piggin, Peter Zijlstra, Andrew Morton On (30/07/08 14:16), Gerald Schaefer didst pronounce: > On Wed, 2008-07-30 at 12:16 +0900, Yasunori Goto wrote: > > Well, I didn't mean changing pages_min value. There may be side effect as > > you are saying. > > I meant if some pages were MIGRATE_RESERVE attribute when hot-remove are > > -executing-, their attribute should be changed. > > > > For example, how is like following dummy code? Is it impossible? > > (Not only here, some places will have to be modified..) > > Right, this should be possible. I was somewhat wandering from the subject, > because I noticed that there may be a bigger problem with MIGRATE_RESERVE > pages in ZONE_MOVABLE, and that we may not want to have them in the first > place. > MIGRATE_RESERVE is of large importance to ZONE_DMA32 and ZONE_NORMAL, to a much lesser extent to ZONE_HIGHMEM and almost irrevelant to ZONE_MOVABLE. However, nothing about MIGRATE_RESERVE should prevent the hot-remove of the section. If the section is totally free, it is considered removable according to is_mem_section_removable(). If other parts of memory hot-remove are deliberately ignoring the RESERVE sections, they should stop that. I haven't read the whole thread, but in your original mail, you say that ZONE_MOVABLE is populated by memory hot-add. Are there really PageReserved() pages there? 
If so, is there any chance that other management structures are being
allocated within the section you are hot-adding? If so and they are not
getting freed, that might be why hot-remove is failing.

If they are not PageReserved() pages and this is an -mm kernel, I would
enable CONFIG_PAGE_OWNER and see who really reallocated those problem
pages that are not freeing.

> The more memory we add to ZONE_MOVABLE, the less reserved pages will
> remain to the other zones. In setup_per_zone_pages_min(), min_free_kbytes
> will be redistributed to a zone where the kernel cannot make any use of
> it, effectively reducing the available min_free_kbytes.

I'm not sure what you mean by "available min_free_kbytes". The overall value
for min_free_kbytes should be approximately the same whether the zone exists
or not. However, you're right in that the distribution of minimum free pages
changes with ZONE_MOVABLE because the zones are different sizes now. This
affects reclaim, not memory hot-remove.

> This just doesn't
> sound right. I believe that a similar situation is the reason why highmem
> pages are skipped in the calculation and I think that we need that for
> ZONE_MOVABLE too. Any thoughts on that problem?
>

is_highmem(ZONE_MOVABLE) should be returning true if the zone is really
part of himem.

> Setting pages_min to 0 for ZONE_MOVABLE, while not capping pages_low
> and pages_high, could be an option. I don't have a sufficient memory
> managment overview to tell if that has negative side effects, maybe
> someone with a deeper insight could comment on that.
>

pages_min of 0 means the other values would be 0 as well. This means that
kswapd may never be woken up to free pages within that zone and lead to
poor utilisation of the zone as allocators fallback to other zones to
avoid direct reclaim. I don't think that is your intention nor will it
help memory hot-remove.
-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
* memory hotplug: hot-add to ZONE_MOVABLE vs. min_free_kbytes 2008-07-31 13:22 ` Mel Gorman @ 2008-07-31 17:45 ` Gerald Schaefer 2008-08-01 11:16 ` Yasunori Goto 2008-08-01 16:26 ` Mel Gorman 0 siblings, 2 replies; 11+ messages in thread From: Gerald Schaefer @ 2008-07-31 17:45 UTC (permalink / raw) To: Mel Gorman Cc: Yasunori Goto, linux-kernel, linux-mm, schwidefsky, heiko.carstens, KAMEZAWA Hiroyuki, Dave Hansen, Andy Whitcroft, Christoph Lameter, Nick Piggin, Peter Zijlstra, Andrew Morton On Thu, 2008-07-31 at 14:22 +0100, Mel Gorman wrote: > > The more memory we add to ZONE_MOVABLE, the less reserved pages will > > remain to the other zones. In setup_per_zone_pages_min(), min_free_kbytes > > will be redistributed to a zone where the kernel cannot make any use of > > it, effectively reducing the available min_free_kbytes. > > I'm not sure what you mean by "available min_free_kbytes". The overall value > for min_free_kbytes should be approximately the same whether the zone exists > or not. However, you're right in that the distribution of minimum free pages > changes with ZONE_MOVABLE because the zones are different sizes now. This > affects reclaim, not memory hot-remove. Sorry for mixing things up in this thread, the min_free_kbytes issue is not related to memory hot-remove, but rather to hot-add and the things that happen in setup_per_zone_pages_min(), which is called from online_pages(). It may well be that my assumptions are wrong, but I'd like to explain my concerns again: If we have a system with 1 GB of memory, min_free_kbytes will be calculated to 4 MB for ZONE_NORMAL, for example. Now, if we add 3 GB of hotplug memory to ZONE_MOVABLE, the total min_free_kbytes will still remain 4 MB but it will be distributed differently: ZONE_NORMAL will now have only 1 MB of MIGRATE_RESERVE memory left, while ZONE_MOVABLE will have 3 MB, e.g. My assumption is now, that the reserved 3 MB in ZONE_MOVABLE won't be usable by the kernel anymore, e.g. 
for PF_MEMALLOC, because it is in ZONE_MOVABLE now. This is what I mean with "effectively reducing the available min_free_kbytes". The system would now behave in the same way as a system which only had 1 MB of min_free_kbytes, although /proc/sys/vm/min_free_kbytes would still say 4 MB. After all, this tunable can have a rather negative impact on a system, especially if it is too low, hence my concerns. > > This just doesn't > > sound right. I believe that a similar situation is the reason why highmem > > pages are skipped in the calculation and I think that we need that for > > ZONE_MOVABLE too. Any thoughts on that problem? > > > > is_highmem(ZONE_MOVABLE) should be returning true if the zone is really > part of himem. We don't have highmem on s390, I was just trying to give an example: I noticed that there is special treatment for highmem pages in setup_per_zone_pages_min(), and thought that we may also need to handle ZONE_MOVABLE in a special way. > > Setting pages_min to 0 for ZONE_MOVABLE, while not capping pages_low > > and pages_high, could be an option. I don't have a sufficient memory > > managment overview to tell if that has negative side effects, maybe > > someone with a deeper insight could comment on that. > > > > pages_min of 0 means the other values would be 0 as well. This means that > kswapd may never be woken up to free pages within that zone and lead to > poor utilisation of the zone as allocators fallback to other zones to > avoid direct reclaim. I don't think that is your intention nor will it > help memory hot-remove. Do you mean pages_low and pages_high? In setup_per_zone_pages_min(), those would not be set to 0, even if we set pages_min to 0. Again, a similar strategy is being used for highmem in that function, only that pages_min is set to a small value instead of 0 in that case. So it should not affect kswapd but only __GFP_HIGH and PF_MEMALLOC allocations, which won't be allocated from ZONE_MOVABLE anyway if I understood that right. 
Thanks,
Gerald
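The redistribution described in the message above can be sketched in user-space C. This is a simplified model, not the kernel function: `struct zone_sketch`, `distribute_pages_min()` and the `skip_in_lowmem` flag are hypothetical names, 4 KiB pages are assumed, and the `pages_high` formula (`tmp >> 1`) is an assumption modelled on the `pages_low` line visible in the patch context. The highmem special case is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down stand-in for struct zone. */
struct zone_sketch {
	uint64_t present_pages;
	int skip_in_lowmem;	/* 1: not counted as lowmem, pages_min forced to 0 */
	uint64_t pages_min;
	uint64_t pages_low;
	uint64_t pages_high;
};

/*
 * Split total_min_pages across zones in proportion to their size,
 * in the style of setup_per_zone_pages_min().
 */
static void distribute_pages_min(struct zone_sketch *zones, size_t nr_zones,
				 uint64_t total_min_pages)
{
	uint64_t lowmem_pages = 0;
	size_t i;

	/* Sum the pages of all zones that count as lowmem. */
	for (i = 0; i < nr_zones; i++)
		if (!zones[i].skip_in_lowmem)
			lowmem_pages += zones[i].present_pages;

	if (lowmem_pages == 0)
		return;	/* nothing to distribute against */

	for (i = 0; i < nr_zones; i++) {
		/* Proportional share of the global minimum for this zone. */
		uint64_t tmp = total_min_pages * zones[i].present_pages /
			       lowmem_pages;

		zones[i].pages_min = zones[i].skip_in_lowmem ? 0 : tmp;
		zones[i].pages_low = zones[i].pages_min + (tmp >> 2);
		zones[i].pages_high = zones[i].pages_min + (tmp >> 1);
	}
}
```

With zones of 262144 and 786432 pages (1 GB and 3 GB) and a total of 1024 pages (4 MB), the model gives 256 and 768 pages, matching the 1 MB / 3 MB split in the example. Setting `skip_in_lowmem` on the second zone, as the proposed patch would do for ZONE_MOVABLE, gives the first zone all 1024 pages while the second zone keeps nonzero `pages_low` and `pages_high`.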
* Re: memory hotplug: hot-add to ZONE_MOVABLE vs. min_free_kbytes 2008-07-31 17:45 ` memory hotplug: hot-add to ZONE_MOVABLE vs. min_free_kbytes Gerald Schaefer @ 2008-08-01 11:16 ` Yasunori Goto 2008-08-01 16:04 ` Gerald Schaefer 2008-08-01 16:26 ` Mel Gorman 1 sibling, 1 reply; 11+ messages in thread From: Yasunori Goto @ 2008-08-01 11:16 UTC (permalink / raw) To: Gerald Schaefer Cc: Mel Gorman, linux-kernel, linux-mm, schwidefsky, heiko.carstens, KAMEZAWA Hiroyuki, Dave Hansen, Andy Whitcroft, Christoph Lameter, Nick Piggin, Peter Zijlstra, Andrew Morton > Sorry for mixing things up in this thread, the min_free_kbytes issue is > not related to memory hot-remove, but rather to hot-add and the things that > happen in setup_per_zone_pages_min(), which is called from online_pages(). > It may well be that my assumptions are wrong, but I'd like to explain my > concerns again: > > If we have a system with 1 GB of memory, min_free_kbytes will be calculated > to 4 MB for ZONE_NORMAL, for example. Now, if we add 3 GB of hotplug memory > to ZONE_MOVABLE, the total min_free_kbytes will still remain 4 MB but it > will be distributed differently: ZONE_NORMAL will now have only 1 MB of > MIGRATE_RESERVE memory left, while ZONE_MOVABLE will have 3 MB, e.g. > Right. > My assumption is now, that the reserved 3 MB in ZONE_MOVABLE won't be > usable by the kernel anymore, e.g. for PF_MEMALLOC, because it is in > ZONE_MOVABLE now. I don't make sense here. I suppose there is no relationship between ZONE_MOVABLE, PF_MEMALLOC and MIGRATE_RESERVE pages. Could you tell me more? > This is what I mean with "effectively reducing the > available min_free_kbytes". The system would now behave in the same way > as a system which only had 1 MB of min_free_kbytes, although > /proc/sys/vm/min_free_kbytes would still say 4 MB. After all, this tunable > can have a rather negative impact on a system, especially if it is too > low, hence my concerns. 
>
> > > Setting pages_min to 0 for ZONE_MOVABLE, while not capping pages_low
> > > and pages_high, could be an option. I don't have a sufficient memory
> > > managment overview to tell if that has negative side effects, maybe
> > > someone with a deeper insight could comment on that.
> > >
> >
> > pages_min of 0 means the other values would be 0 as well. This means that
> > kswapd may never be woken up to free pages within that zone and lead to
> > poor utilisation of the zone as allocators fallback to other zones to
> > avoid direct reclaim. I don't think that is your intention nor will it
> > help memory hot-remove.
>
> Do you mean pages_low and pages_high? In setup_per_zone_pages_min(),
> those would not be set to 0, even if we set pages_min to 0. Again, a
> similar strategy is being used for highmem in that function, only that
> pages_min is set to a small value instead of 0 in that case. So it should
> not affect kswapd but only __GFP_HIGH and PF_MEMALLOC allocations, which
> won't be allocated from ZONE_MOVABLE anyway if I understood that right.

pages_min seems to be used in get_page_from_freelist().
Do you mean following is not executed?

		if (!(alloc_flags & ALLOC_NO_WATERMARKS)) {
			unsigned long mark;
			if (alloc_flags & ALLOC_WMARK_MIN)
				mark = zone->pages_min;    <------!!!
			else if (alloc_flags & ALLOC_WMARK_LOW)
				mark = zone->pages_low;
			else
				mark = zone->pages_high;
			if (!zone_watermark_ok(zone, order, mark,    <-----!!!
				    classzone_idx, alloc_flags)) {
				if (!zone_reclaim_mode ||
				    !zone_reclaim(zone, gfp_mask, order))
					goto this_zone_full;
			}
		}

But even if pages_min is not used as you said, I suppose it is accidental
by changing source code. It should work as watermark to keep its meaning.
If not, it would be cause of bug in the future by misunderstanding.

Bye.

-- 
Yasunori Goto
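The watermark selection quoted in the message above can be modelled in isolation to see what a pages_min of 0 would mean for the check. This is only a sketch under stated assumptions: `struct zone_marks`, `select_watermark()` and `watermark_ok_order0()` are hypothetical names, the `SK_ALLOC_*` flag values are illustrative, and the check is reduced to an order-0 comparison that ignores lowmem_reserve and any further adjustments the real zone_watermark_ok() makes.

```c
#include <stdbool.h>

/* Illustrative flag values; not taken from any kernel header. */
#define SK_ALLOC_NO_WATERMARKS	0x01
#define SK_ALLOC_WMARK_MIN	0x02
#define SK_ALLOC_WMARK_LOW	0x04

/* Hypothetical holder for the three per-zone watermarks. */
struct zone_marks {
	unsigned long pages_min;
	unsigned long pages_low;
	unsigned long pages_high;
};

/* Pick the watermark the same way the quoted if/else chain does. */
static unsigned long select_watermark(const struct zone_marks *z,
				      int alloc_flags)
{
	if (alloc_flags & SK_ALLOC_WMARK_MIN)
		return z->pages_min;
	if (alloc_flags & SK_ALLOC_WMARK_LOW)
		return z->pages_low;
	return z->pages_high;
}

/* Simplified order-0 check: free pages must exceed the chosen mark. */
static bool watermark_ok_order0(const struct zone_marks *z,
				unsigned long free_pages, int alloc_flags)
{
	if (alloc_flags & SK_ALLOC_NO_WATERMARKS)
		return true;	/* watermarks are bypassed entirely */
	return free_pages > select_watermark(z, alloc_flags);
}
```

In this model, a zone with pages_min of 0 passes the minimum-watermark check as soon as a single page is free, while the low and high marks still apply to callers that select them.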
* Re: memory hotplug: hot-add to ZONE_MOVABLE vs. min_free_kbytes 2008-08-01 11:16 ` Yasunori Goto @ 2008-08-01 16:04 ` Gerald Schaefer 0 siblings, 0 replies; 11+ messages in thread From: Gerald Schaefer @ 2008-08-01 16:04 UTC (permalink / raw) To: Yasunori Goto Cc: Mel Gorman, linux-kernel, linux-mm, schwidefsky, heiko.carstens, KAMEZAWA Hiroyuki, Dave Hansen, Andy Whitcroft, Christoph Lameter, Nick Piggin, Peter Zijlstra, Andrew Morton On Fri, 2008-08-01 at 20:16 +0900, Yasunori Goto wrote: > > My assumption is now, that the reserved 3 MB in ZONE_MOVABLE won't be > > usable by the kernel anymore, e.g. for PF_MEMALLOC, because it is in > > ZONE_MOVABLE now. > > I don't make sense here. I suppose there is no relationship between > ZONE_MOVABLE, PF_MEMALLOC and MIGRATE_RESERVE pages. > Could you tell me more? Ok, I thought that PF_MEMALLOC allocations work on the MIGRATE_RESERVE pageblocks, and that only kernel allocations can use PF_MEMALLOC. I also thought that kernel allocations cannot use ZONE_MOVABLE, e.g. for page cache memory, because such pages would not be migratable. So I assumed that MIGRATE_RESERVE pageblocks in ZONE_MOVABLE would not be available for PF_MEMALLOC allocations. With this assumption, which can be totally wrong, the redistribution of MIGRATE_RESERVE pageblocks in setup_per_zone_pages_min() looks like it will take away reserved pageblocks that should be available to the kernel in emergency situations. Maybe I should have explained this assumption earlier, because my whole min_free_kbytes issue depends on it. If I'm wrong, I apologize for confusing you all with this "issue", and I will go back to the original problem with removing the lowest memory chunk in ZONE_MOVABLE... Thanks, Gerald -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
* Re: memory hotplug: hot-add to ZONE_MOVABLE vs. min_free_kbytes
  2008-07-31 17:45 ` memory hotplug: hot-add to ZONE_MOVABLE vs. min_free_kbytes Gerald Schaefer
  2008-08-01 11:16 ` Yasunori Goto
@ 2008-08-01 16:26 ` Mel Gorman
  1 sibling, 0 replies; 11+ messages in thread
From: Mel Gorman @ 2008-08-01 16:26 UTC
To: Gerald Schaefer
Cc: Yasunori Goto, linux-kernel, linux-mm, schwidefsky, heiko.carstens,
    KAMEZAWA Hiroyuki, Dave Hansen, Andy Whitcroft, Christoph Lameter,
    Nick Piggin, Peter Zijlstra, Andrew Morton

On (31/07/08 19:45), Gerald Schaefer didst pronounce:
> On Thu, 2008-07-31 at 14:22 +0100, Mel Gorman wrote:
> > > The more memory we add to ZONE_MOVABLE, the less reserved pages will
> > > remain to the other zones. In setup_per_zone_pages_min(), min_free_kbytes
> > > will be redistributed to a zone where the kernel cannot make any use of
> > > it, effectively reducing the available min_free_kbytes.
> >
> > I'm not sure what you mean by "available min_free_kbytes". The overall value
> > for min_free_kbytes should be approximately the same whether the zone exists
> > or not. However, you're right in that the distribution of minimum free pages
> > changes with ZONE_MOVABLE because the zones are different sizes now. This
> > affects reclaim, not memory hot-remove.
>
> Sorry for mixing things up in this thread, the min_free_kbytes issue is
> not related to memory hot-remove, but rather to hot-add and the things that
> happen in setup_per_zone_pages_min(), which is called from online_pages().
> It may well be that my assumptions are wrong, but I'd like to explain my
> concerns again:
>
> If we have a system with 1 GB of memory, min_free_kbytes will be calculated
> to 4 MB for ZONE_NORMAL, for example. Now, if we add 3 GB of hotplug memory
> to ZONE_MOVABLE, the total min_free_kbytes will still remain 4 MB but it
> will be distributed differently: ZONE_NORMAL will now have only 1 MB of
> MIGRATE_RESERVE memory left, while ZONE_MOVABLE will have 3 MB, e.g.
>

Ok, I haven't double checked your figures but let's go with the assumption -
adding memory means min_free_kbytes will be distributed differently.

> My assumption is now, that the reserved 3 MB in ZONE_MOVABLE won't be
> usable by the kernel anymore, e.g. for PF_MEMALLOC, because it is in
> ZONE_MOVABLE now.

Nothing stops PF_MEMALLOC being used and the only thing that stops 3MB
being used in ZONE_MOVABLE is min_free_kbytes, not the fact there is a
MIGRATE_RESERVE there. PF_MEMALLOC and MIGRATE_RESERVE are not related.

I think you are confusing what MIGRATE_RESERVE is for. A number of
pageblocks at the start of a zone are marked MIGRATE_RESERVE depending on
the size of min_free_kbytes for that zone. The kernel will try avoiding
allocating from there so that high-order atomic allocations have a chance
of succeeding from there. It's not kept aside for emergency allocations.

> This is what I mean with "effectively reducing the
> available min_free_kbytes". The system would now behave in the same way
> as a system which only had 1 MB of min_free_kbytes, although
> /proc/sys/vm/min_free_kbytes would still say 4 MB. After all, this tunable
> can have a rather negative impact on a system, especially if it is too
> low, hence my concerns.
>

Increase min_free_kbytes on memory hot-add?

> > > This just doesn't
> > > sound right. I believe that a similar situation is the reason why highmem
> > > pages are skipped in the calculation and I think that we need that for
> > > ZONE_MOVABLE too. Any thoughts on that problem?
> > >
> >
> > is_highmem(ZONE_MOVABLE) should be returning true if the zone is really
> > part of highmem.
>
> We don't have highmem on s390, I was just trying to give an example: I
> noticed that there is special treatment for highmem pages in
> setup_per_zone_pages_min(), and thought that we may also need to handle
> ZONE_MOVABLE in a special way.
>

ZONE_MOVABLE should be treated the same as highmem would be in terms of
tuning.

> > > Setting pages_min to 0 for ZONE_MOVABLE, while not capping pages_low
> > > and pages_high, could be an option. I don't have a sufficient memory
> > > management overview to tell if that has negative side effects, maybe
> > > someone with a deeper insight could comment on that.
> > >
> >
> > pages_min of 0 means the other values would be 0 as well. This means that
> > kswapd may never be woken up to free pages within that zone and lead to
> > poor utilisation of the zone as allocators fallback to other zones to
> > avoid direct reclaim. I don't think that is your intention nor will it
> > help memory hot-remove.
>
> Do you mean pages_low and pages_high? In setup_per_zone_pages_min(),
> those would not be set to 0, even if we set pages_min to 0. Again, a
> similar strategy is being used for highmem in that function, only that
> pages_min is set to a small value instead of 0 in that case. So it should
> not affect kswapd but only __GFP_HIGH and PF_MEMALLOC allocations, which
> won't be allocated from ZONE_MOVABLE anyway if I understood that right.
>

Ok, I'm losing track here, maybe it's just too late on a Friday. Right
now, ZONE_MOVABLE should be set up similar to what HIGHMEM would have
been. It shouldn't get its pages_min value set to 0 and even if it did,
it would not help memory hot-remove.

Also, nothing stops __GFP_HIGH or PF_MEMALLOC using ZONE_MOVABLE as long
as the caller is using __GFP_MOVABLE. However, as it is unlikely that
combination of flags would occur I'd be open to examining how
min_free_kbytes gets distributed. It is an independent topic to why the
beginning of the zone is not removable though. I suspect MIGRATE_RESERVE
is a red herring.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
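The 1 MB / 3 MB figures quoted in this message can be reproduced with a small model of a proportional split. The following is a minimal sketch in C, assuming 4 KB pages, no highmem zones (as on s390), and a split of min_free_kbytes across zones in proportion to each zone's present pages, with pages_low and pages_high derived as the zone's share plus one quarter and one half of that share. The struct zone_model type, the distribute_min_free() function and the PAGE_SHIFT_ASSUMED constant are hypothetical names for illustration, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KB pages. */
#define PAGE_SHIFT_ASSUMED 12

/* Hypothetical, heavily simplified stand-in for the kernel's struct zone. */
struct zone_model {
    const char *name;
    unsigned long present_pages;
    unsigned long pages_min;
    unsigned long pages_low;
    unsigned long pages_high;
};

/*
 * Split min_free_kbytes across all zones in proportion to their size.
 * This models only the non-highmem case; any special handling of
 * highmem zones is deliberately left out.
 */
void distribute_min_free(struct zone_model *zones, size_t nr_zones,
                         unsigned long min_free_kbytes)
{
    /* Convert kilobytes to pages. */
    unsigned long total_min = min_free_kbytes >> (PAGE_SHIFT_ASSUMED - 10);
    unsigned long total_pages = 0;
    size_t i;

    for (i = 0; i < nr_zones; i++)
        total_pages += zones[i].present_pages;
    assert(total_pages > 0);

    for (i = 0; i < nr_zones; i++) {
        /* Widen before multiplying to avoid overflow on 32-bit longs. */
        unsigned long long share =
            (unsigned long long)total_min * zones[i].present_pages
            / total_pages;

        zones[i].pages_min  = (unsigned long)share;
        zones[i].pages_low  = (unsigned long)(share + (share >> 2));
        zones[i].pages_high = (unsigned long)(share + (share >> 1));
    }
}
```

For a 1 GB zone (262144 pages) and a 3 GB zone (786432 pages) with min_free_kbytes set to 4096, this model gives pages_min values of 256 and 768 pages, that is 1 MB and 3 MB. The total stays at 4 MB and only its distribution changes, which is the point made in the reply above.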
end of thread

Thread overview: 11+ messages
2008-07-22 16:55 memory hotplug: hot-remove fails on lowest chunk in ZONE_MOVABLE Gerald Schaefer
2008-07-23  2:48 ` Yasunori Goto
2008-07-29 16:07 ` Gerald Schaefer
2008-07-30  3:16 ` Yasunori Goto
2008-07-30 12:16 ` Gerald Schaefer
2008-07-31  5:16 ` Yasunori Goto
2008-07-31 13:22 ` Mel Gorman
2008-07-31 17:45 ` memory hotplug: hot-add to ZONE_MOVABLE vs. min_free_kbytes Gerald Schaefer
2008-08-01 11:16 ` Yasunori Goto
2008-08-01 16:04 ` Gerald Schaefer
2008-08-01 16:26 ` Mel Gorman