* [PATCH] memory hotplug: run lru_add_drain_all() on each cpu
@ 2008-12-03 21:25 Gerald Schaefer
2008-12-03 22:16 ` Dave Hansen
0 siblings, 1 reply; 11+ messages in thread
From: Gerald Schaefer @ 2008-12-03 21:25 UTC (permalink / raw)
To: akpm
Cc: linux-kernel, linux-mm, schwidefsky, heiko.carstens,
kamezawa.hiroyu, y-goto, npiggin
From: Gerald Schaefer <gerald.schaefer@de.ibm.com>
offline_pages() calls lru_add_drain_all() followed by drain_all_pages().
While drain_all_pages() works on each cpu, lru_add_drain_all() only runs
on the current cpu for architectures w/o CONFIG_NUMA. This let us run
into the BUG_ON(!PageBuddy(page)) in __offline_isolated_pages() during
memory hotplug stress test on s390. The page in question was still on the
pcp list, because of a race with lru_add_drain_all() and drain_all_pages()
on different cpus.
This is fixed with this patch by adding CONFIG_MEMORY_HOTREMOVE to the
lru_add_drain_all() #ifdef, to let it run on each cpu.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
---
mm/swap.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
Index: linux-2.6/mm/swap.c
===================================================================
--- linux-2.6.orig/mm/swap.c
+++ linux-2.6/mm/swap.c
@@ -299,7 +299,8 @@ void lru_add_drain(void)
put_cpu();
}
-#if defined(CONFIG_NUMA) || defined(CONFIG_UNEVICTABLE_LRU)
+#if defined(CONFIG_NUMA) || defined(CONFIG_UNEVICTABLE_LRU) || \
+ defined(CONFIG_MEMORY_HOTREMOVE)
static void lru_add_drain_per_cpu(struct work_struct *dummy)
{
lru_add_drain();
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH] memory hotplug: run lru_add_drain_all() on each cpu 2008-12-03 21:25 [PATCH] memory hotplug: run lru_add_drain_all() on each cpu Gerald Schaefer @ 2008-12-03 22:16 ` Dave Hansen 2008-12-04 0:31 ` KAMEZAWA Hiroyuki 2008-12-05 13:08 ` [PATCH] memory hotplug: run lru_add_drain_all() on each cpu Gerald Schaefer 0 siblings, 2 replies; 11+ messages in thread From: Dave Hansen @ 2008-12-03 22:16 UTC (permalink / raw) To: gerald.schaefer Cc: akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens, kamezawa.hiroyu, y-goto, npiggin On Wed, 2008-12-03 at 22:25 +0100, Gerald Schaefer wrote: > offline_pages() calls lru_add_drain_all() followed by drain_all_pages(). > While drain_all_pages() works on each cpu, lru_add_drain_all() only runs > on the current cpu for architectures w/o CONFIG_NUMA. I'm a bit confused why this is. Is this because the LRUs are per-zone and we expected !CONFIG_NUMA systems to only have LRUs sitting on the same (only) node as the current CPU? This doesn't make any sense, though. The pagevecs that drain_cpu_pagevecs() actually empties out are per-cpu. > This let us run > into the BUG_ON(!PageBuddy(page)) in __offline_isolated_pages() during > memory hotplug stress test on s390. The page in question was still on the > pcp list, because of a race with lru_add_drain_all() and drain_all_pages() > on different cpus. > > This is fixed with this patch by adding CONFIG_MEMORY_HOTREMOVE to the > lru_add_drain_all() #ifdef, to let it run on each cpu. This doesn't seem right to me. CONFIG_MEMORY_HOTREMOVE doesn't change the layout of the LRUs, unlike NUMA or UNEVICTABLE_LRU. So, I think this bug is more due to the hotremove code mis-expecting behavior out of lru_add_drain_all(). Why does this not affect the other lru_add_drain_all() users? -- Dave -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] memory hotplug: run lru_add_drain_all() on each cpu 2008-12-03 22:16 ` Dave Hansen @ 2008-12-04 0:31 ` KAMEZAWA Hiroyuki 2008-12-04 2:14 ` [PATCH] mm: remove UP version lru_add_drain_all() KOSAKI Motohiro 2008-12-05 13:08 ` [PATCH] memory hotplug: run lru_add_drain_all() on each cpu Gerald Schaefer 1 sibling, 1 reply; 11+ messages in thread From: KAMEZAWA Hiroyuki @ 2008-12-04 0:31 UTC (permalink / raw) To: Dave Hansen Cc: gerald.schaefer, akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens, y-goto, npiggin On Wed, 03 Dec 2008 14:16:07 -0800 Dave Hansen <dave@linux.vnet.ibm.com> wrote: > > This let us run > > into the BUG_ON(!PageBuddy(page)) in __offline_isolated_pages() during > > memory hotplug stress test on s390. The page in question was still on the > > pcp list, because of a race with lru_add_drain_all() and drain_all_pages() > > on different cpus. > > > > This is fixed with this patch by adding CONFIG_MEMORY_HOTREMOVE to the > > lru_add_drain_all() #ifdef, to let it run on each cpu. > > This doesn't seem right to me. CONFIG_MEMORY_HOTREMOVE doesn't change > the layout of the LRUs, unlike NUMA or UNEVICTABLE_LRU. So, I think > this bug is more due to the hotremove code mis-expecting behavior out of > lru_add_drain_all(). > How about #ifdef CONFIG_SMP #else.. #endif rather than -#if defined(CONFIG_NUMA) || defined(CONFIG_UNEVICTABLE_LRU) +#if defined(CONFIG_NUMA) || defined(CONFIG_UNEVICTABLE_LRU) || \ + defined(CONFIG_MEMORY_HOTREMOVE) ... thanks, -Kame -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH] mm: remove UP version lru_add_drain_all() 2008-12-04 0:31 ` KAMEZAWA Hiroyuki @ 2008-12-04 2:14 ` KOSAKI Motohiro 2008-12-04 2:23 ` KOSAKI Motohiro 2008-12-04 18:01 ` Gerald Schaefer 0 siblings, 2 replies; 11+ messages in thread From: KOSAKI Motohiro @ 2008-12-04 2:14 UTC (permalink / raw) To: KAMEZAWA Hiroyuki Cc: kosaki.motohiro, Dave Hansen, gerald.schaefer, akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens, y-goto, npiggin, Lee Schermerhorn, Christoph Lameter (CC to Christoph Lameter and Lee Schermerhorn) > On Wed, 03 Dec 2008 14:16:07 -0800 > Dave Hansen <dave@linux.vnet.ibm.com> wrote: > > > This let us run > > > into the BUG_ON(!PageBuddy(page)) in __offline_isolated_pages() during > > > memory hotplug stress test on s390. The page in question was still on the > > > pcp list, because of a race with lru_add_drain_all() and drain_all_pages() > > > on different cpus. > > > > > > This is fixed with this patch by adding CONFIG_MEMORY_HOTREMOVE to the > > > lru_add_drain_all() #ifdef, to let it run on each cpu. > > > > This doesn't seem right to me. CONFIG_MEMORY_HOTREMOVE doesn't change > > the layout of the LRUs, unlike NUMA or UNEVICTABLE_LRU. So, I think > > this bug is more due to the hotremove code mis-expecting behavior out of > > lru_add_drain_all(). > > > How about > > #ifdef CONFIG_SMP > > #else.. > > #endif > > rather than > > -#if defined(CONFIG_NUMA) || defined(CONFIG_UNEVICTABLE_LRU) > +#if defined(CONFIG_NUMA) || defined(CONFIG_UNEVICTABLE_LRU) || \ > + defined(CONFIG_MEMORY_HOTREMOVE) > ... The default value of CONFIG_UNEVICTABLE_LRU is ON. Then, almost machine use CONFIG_NUMA version lru_add_drain_all(). Therefore, this config option is not so valuable. I like simple removing. following patch can boot on UP machine. === Currently, lru_add_drain_all() has two version. (1) use schedule_on_each_cpu() (2) don't use schedule_on_each_cpu() Gerald Schaefer reported it doesn't works well on SMP (not NUMA) S390 machine. offline_pages() calls lru_add_drain_all() followed by drain_all_pages(). While drain_all_pages() works on each cpu, lru_add_drain_all() only runs on the current cpu for architectures w/o CONFIG_NUMA. This let us run into the BUG_ON(!PageBuddy(page)) in __offline_isolated_pages() during memory hotplug stress test on s390. The page in question was still on the pcp list, because of a race with lru_add_drain_all() and drain_all_pages() on different cpus. Actually, Almost machine has CONFIG_UNEVICTABLE_LRU=y. Then almost machine use (1) version lru_add_drain_all although the machine is UP. Then this ifdef is not valueable. simple removing is better. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> CC: Christoph Lameter <cl@linux-foundation.org> CC: Lee Schermerhorn <Lee.Schermerhorn@hp.com> --- mm/swap.c | 13 ------------- 1 file changed, 13 deletions(-) Index: b/mm/swap.c =================================================================== --- a/mm/swap.c 2008-11-24 19:33:26.000000000 +0900 +++ b/mm/swap.c 2008-12-04 09:49:05.000000000 +0900 @@ -299,7 +299,6 @@ void lru_add_drain(void) put_cpu(); } -#if defined(CONFIG_NUMA) || defined(CONFIG_UNEVICTABLE_LRU) static void lru_add_drain_per_cpu(struct work_struct *dummy) { lru_add_drain(); @@ -313,18 +312,6 @@ int lru_add_drain_all(void) return schedule_on_each_cpu(lru_add_drain_per_cpu); } -#else - -/* - * Returns 0 for success - */ -int lru_add_drain_all(void) -{ - lru_add_drain(); - return 0; -} -#endif - /* * Batched page_cache_release(). Decrement the reference count on all the * passed pages. If it fell to zero then remove the page from the LRU and -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] mm: remove UP version lru_add_drain_all() 2008-12-04 2:14 ` [PATCH] mm: remove UP version lru_add_drain_all() KOSAKI Motohiro @ 2008-12-04 2:23 ` KOSAKI Motohiro 2008-12-04 18:01 ` Gerald Schaefer 1 sibling, 0 replies; 11+ messages in thread From: KOSAKI Motohiro @ 2008-12-04 2:23 UTC (permalink / raw) To: gerald.schaefer Cc: kosaki.motohiro, KAMEZAWA Hiroyuki, Dave Hansen, akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens, y-goto, npiggin, Lee Schermerhorn, Christoph Lameter silly mistake. sorry. > > > Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> > CC: Christoph Lameter <cl@linux-foundation.org> > CC: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Reported-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] mm: remove UP version lru_add_drain_all() 2008-12-04 2:14 ` [PATCH] mm: remove UP version lru_add_drain_all() KOSAKI Motohiro 2008-12-04 2:23 ` KOSAKI Motohiro @ 2008-12-04 18:01 ` Gerald Schaefer 1 sibling, 0 replies; 11+ messages in thread From: Gerald Schaefer @ 2008-12-04 18:01 UTC (permalink / raw) To: KOSAKI Motohiro Cc: KAMEZAWA Hiroyuki, Dave Hansen, akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens, y-goto, npiggin, Lee Schermerhorn, Christoph Lameter On Thu, 2008-12-04 at 11:14 +0900, KOSAKI Motohiro wrote: > Then this ifdef is not valueable. > simple removing is better. > > > Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> > CC: Christoph Lameter <cl@linux-foundation.org> > CC: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Thanks, works for me. Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] memory hotplug: run lru_add_drain_all() on each cpu 2008-12-03 22:16 ` Dave Hansen 2008-12-04 0:31 ` KAMEZAWA Hiroyuki @ 2008-12-05 13:08 ` Gerald Schaefer 2008-12-05 20:43 ` Dave Hansen 1 sibling, 1 reply; 11+ messages in thread From: Gerald Schaefer @ 2008-12-05 13:08 UTC (permalink / raw) To: Dave Hansen Cc: akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens, kamezawa.hiroyu, y-goto, npiggin On Wed, 2008-12-03 at 14:16 -0800, Dave Hansen wrote: > I'm a bit confused why this is. Is this because the LRUs are per-zone > and we expected !CONFIG_NUMA systems to only have LRUs sitting on the > same (only) node as the current CPU? > > This doesn't make any sense, though. The pagevecs that > drain_cpu_pagevecs() actually empties out are per-cpu. Right, the pagevecs are per-cpu, independent from any CONFIG_NUMA settings, and this is exactly why I would expect that lru_add_drain_all() works on all cpus, as opposed to lru_add_drain() which works only on the current cpu. > This doesn't seem right to me. CONFIG_MEMORY_HOTREMOVE doesn't change > the layout of the LRUs, unlike NUMA or UNEVICTABLE_LRU. So, I think > this bug is more due to the hotremove code mis-expecting behavior out of > lru_add_drain_all(). > > Why does this not affect the other lru_add_drain_all() users? Good question, there are only a few other users and most of them were added just recently with the unevictable lru patches. The only exception is migrate_prep(), but this is only called from sys_move_pages(), which is not implemented w/o CONFIG_NUMA afaik. As explained above, the per-cpu pagevec layout should be independent from NUMA or UNEVICTABLE_LRU, so I guess the right thing to do here is completely remove the #ifdef as in the patch from Kosaki Motohiro (or at least replace it with a CONFIG_SMP as suggested by Kamezawa Hiroyuki). Thanks, Gerald -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] memory hotplug: run lru_add_drain_all() on each cpu 2008-12-05 13:08 ` [PATCH] memory hotplug: run lru_add_drain_all() on each cpu Gerald Schaefer @ 2008-12-05 20:43 ` Dave Hansen 2008-12-07 4:43 ` KOSAKI Motohiro 0 siblings, 1 reply; 11+ messages in thread From: Dave Hansen @ 2008-12-05 20:43 UTC (permalink / raw) To: gerald.schaefer Cc: akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens, kamezawa.hiroyu, y-goto, npiggin On Fri, 2008-12-05 at 14:08 +0100, Gerald Schaefer wrote: > > As explained above, the per-cpu pagevec layout should be independent > from NUMA or UNEVICTABLE_LRU, so I guess the right thing to do here > is completely remove the #ifdef as in the patch from Kosaki Motohiro > (or at least replace it with a CONFIG_SMP as suggested by Kamezawa > Hiroyuki). Thanks for looking into it deeper. That CONFIG_SMP thing really does look like the right solution. -- Dave -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] memory hotplug: run lru_add_drain_all() on each cpu 2008-12-05 20:43 ` Dave Hansen @ 2008-12-07 4:43 ` KOSAKI Motohiro 2008-12-08 13:56 ` Lee Schermerhorn 0 siblings, 1 reply; 11+ messages in thread From: KOSAKI Motohiro @ 2008-12-07 4:43 UTC (permalink / raw) To: Dave Hansen, Lee Schermerhorn Cc: kosaki.motohiro, gerald.schaefer, akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens, kamezawa.hiroyu, y-goto, npiggin CC to Lee Schermerhorn > On Fri, 2008-12-05 at 14:08 +0100, Gerald Schaefer wrote: > > > > As explained above, the per-cpu pagevec layout should be independent > > from NUMA or UNEVICTABLE_LRU, so I guess the right thing to do here > > is completely remove the #ifdef as in the patch from Kosaki Motohiro > > (or at least replace it with a CONFIG_SMP as suggested by Kamezawa > > Hiroyuki). > > Thanks for looking into it deeper. That CONFIG_SMP thing really does > look like the right solution. Lee, Could you read this thread and explain why you add ifdef CONFIG_UNEVICTABLE_LRU? I am not sure about that Dave's proposal is safe change. (but I guess he is right) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] memory hotplug: run lru_add_drain_all() on each cpu 2008-12-07 4:43 ` KOSAKI Motohiro @ 2008-12-08 13:56 ` Lee Schermerhorn 2008-12-08 14:30 ` KOSAKI Motohiro 0 siblings, 1 reply; 11+ messages in thread From: Lee Schermerhorn @ 2008-12-08 13:56 UTC (permalink / raw) To: KOSAKI Motohiro Cc: Dave Hansen, gerald.schaefer, akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens, kamezawa.hiroyu, y-goto, npiggin On Sun, 2008-12-07 at 13:43 +0900, KOSAKI Motohiro wrote: > CC to Lee Schermerhorn > > > > On Fri, 2008-12-05 at 14:08 +0100, Gerald Schaefer wrote: > > > > > > As explained above, the per-cpu pagevec layout should be independent > > > from NUMA or UNEVICTABLE_LRU, so I guess the right thing to do here > > > is completely remove the #ifdef as in the patch from Kosaki Motohiro > > > (or at least replace it with a CONFIG_SMP as suggested by Kamezawa > > > Hiroyuki). > > > > Thanks for looking into it deeper. That CONFIG_SMP thing really does > > look like the right solution. > > Lee, Could you read this thread and explain why you add ifdef CONFIG_UNEVICTABLE_LRU? > I am not sure about that Dave's proposal is safe change. (but I guess he is right) I added that back in Patch 17/25 "Mlocked Pages are non-reclaimable" [before nonreclaimable became unevictable". I did this because "lru_add_drain_all()" was only used by numa code prior to this, and was under #ifdef CONFIG_NUMA". I called lru_add_drain_all() from __mlock_vma_pages_range() [since removed] and I wanted the nonreclaimable/unevictable mlocked pages feature to be independent of numa. So, I had to ensure that we defined the function for nonreclaimable/unevictable lru as well as numa. Now it appears that hotplug and memcg also depend on lru_add_drain_all(), so making it depend on 'SMP looks reasonable to me. Lee -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] memory hotplug: run lru_add_drain_all() on each cpu 2008-12-08 13:56 ` Lee Schermerhorn @ 2008-12-08 14:30 ` KOSAKI Motohiro 0 siblings, 0 replies; 11+ messages in thread From: KOSAKI Motohiro @ 2008-12-08 14:30 UTC (permalink / raw) To: Lee Schermerhorn Cc: Dave Hansen, gerald.schaefer, akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens, kamezawa.hiroyu, y-goto, npiggin >> Lee, Could you read this thread and explain why you add ifdef CONFIG_UNEVICTABLE_LRU? >> I am not sure about that Dave's proposal is safe change. (but I guess he is right) > > I added that back in Patch 17/25 "Mlocked Pages are > non-reclaimable" [before nonreclaimable became unevictable". I did this > because "lru_add_drain_all()" was only used by numa code prior to this, > and was under #ifdef CONFIG_NUMA". I called lru_add_drain_all() from > __mlock_vma_pages_range() [since removed] and I wanted the > nonreclaimable/unevictable mlocked pages feature to be independent of > numa. So, I had to ensure that we defined the function for > nonreclaimable/unevictable lru as well as numa. > > Now it appears that hotplug and memcg also depend on > lru_add_drain_all(), so making it depend on 'SMP looks reasonable to me. Thanks a lot. I'll make that patch. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2008-12-08 14:30 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2008-12-03 21:25 [PATCH] memory hotplug: run lru_add_drain_all() on each cpu Gerald Schaefer 2008-12-03 22:16 ` Dave Hansen 2008-12-04 0:31 ` KAMEZAWA Hiroyuki 2008-12-04 2:14 ` [PATCH] mm: remove UP version lru_add_drain_all() KOSAKI Motohiro 2008-12-04 2:23 ` KOSAKI Motohiro 2008-12-04 18:01 ` Gerald Schaefer 2008-12-05 13:08 ` [PATCH] memory hotplug: run lru_add_drain_all() on each cpu Gerald Schaefer 2008-12-05 20:43 ` Dave Hansen 2008-12-07 4:43 ` KOSAKI Motohiro 2008-12-08 13:56 ` Lee Schermerhorn 2008-12-08 14:30 ` KOSAKI Motohiro
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox