* KVM: memory ballooning bug?
From: Minchan Kim @ 2015-12-23  5:22 UTC
To: Rafael Aquini
Cc: linux-mm, Andrew Morton, Konstantin Khlebnikov, linux-kernel

During my compaction-related work, I encountered some problems with
ballooning.

First, over repeated inflate/deflate cycles, guest memory (i.e.,
cat /proc/meminfo | grep MemTotal) decreased and couldn't recover.

When I reviewed the source code, I found that balloon_lock should cover
release_pages_balloon. Otherwise, struct virtio_balloon fields could be
overwritten by a racing fill_balloon (e.g., the vb->*pfns fields are
critical). The patch below fixed the problem.

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 7efc32945810..7d3e5d0e9aa4 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -209,8 +209,8 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
          */
         if (vb->num_pfns != 0)
                 tell_host(vb, vb->deflate_vq);
-        mutex_unlock(&vb->balloon_lock);
         release_pages_balloon(vb);
+        mutex_unlock(&vb->balloon_lock);
         return num_freed_pages;
 }

Second, in balloon_page_dequeue, pages_lock should cover the
list_for_each_entry_safe loop. Otherwise, the cursor page could be
isolated by compaction, and the list_del done by isolation could poison
page->lru, so the loop could access a wrong address, like this:

general protection fault: 0000 [#1] SMP
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 2 PID: 82 Comm: vballoon Not tainted 4.4.0-rc5-mm1+ #1906
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff8800a7ff0000 ti: ffff8800a7fec000 task.ti: ffff8800a7fec000
RIP: 0010:[<ffffffff8115e754>]  [<ffffffff8115e754>] balloon_page_dequeue+0x54/0x130
RSP: 0018:ffff8800a7fefdc0  EFLAGS: 00010246
RAX: ffff88013fff9a70 RBX: ffffea000056fe00 RCX: 0000000000002b7d
RDX: ffff88013fff9a70 RSI: ffffea000056fe00 RDI: ffff88013fff9a68
RBP: ffff8800a7fefde8 R08: ffffea000056fda0 R09: 0000000000000000
R10: ffff8800a7fefd90 R11: 0000000000000001 R12: dead0000000000e0
R13: ffffea000056fe20 R14: ffff880138809070 R15: ffff880138809060
FS:  0000000000000000(0000) GS:ffff88013fc40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f229c10e000 CR3: 00000000b8b53000 CR4: 00000000000006a0
Stack:
 0000000000000100 ffff880138809088 ffff880138809000 ffff880138809060
 0000000000000046 ffff8800a7fefe28 ffffffff812c86d3 ffff880138809020
 ffff880138809000 fffffffffff91900 0000000000000100 ffff880138809060
Call Trace:
 [<ffffffff812c86d3>] leak_balloon+0x93/0x1a0
 [<ffffffff812c8bc7>] balloon+0x217/0x2a0
 [<ffffffff8143739e>] ? __schedule+0x31e/0x8b0
 [<ffffffff81078160>] ? abort_exclusive_wait+0xb0/0xb0
 [<ffffffff812c89b0>] ? update_balloon_stats+0xf0/0xf0
 [<ffffffff8105b6e9>] kthread+0xc9/0xe0
 [<ffffffff8105b620>] ? kthread_park+0x60/0x60
 [<ffffffff8143b4af>] ret_from_fork+0x3f/0x70
 [<ffffffff8105b620>] ? kthread_park+0x60/0x60
Code: 8d 60 e0 0f 84 af 00 00 00 48 8b 43 20 a8 01 75 3b 48 89 d8 f0 0f ba 28 00 72 10 48 8b 03 f6 c4 08 75 2f 48 89 df e8 8c 83 f9 ff <49> 8b 44 24 20 4d 8d 6c 24 20 48 83 e8 20 4d 39 f5 74 7a 4c 89
RIP  [<ffffffff8115e754>] balloon_page_dequeue+0x54/0x130
 RSP <ffff8800a7fefdc0>
---[ end trace 43cf28060d708d5f ]---
Kernel panic - not syncing: Fatal exception
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled

We could fix it by protecting the entire loop with pages_lock, but the
problem is IRQ latency while walking the list. However, I doubt that
worst case happens often, because in the normal situation the loop exits
quickly via a successful trylock_page.

Any comments?

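A minimal user-space sketch of the first race, for illustration only: the
struct, array size, and thread bodies below are made up and are not the
virtio_balloon driver's code. It models the point above: the "release" step
keeps using the vb->*pfns bookkeeping after balloon_lock has been dropped, so
a concurrent fill can repopulate those fields underneath it.

#include <pthread.h>
#include <stdio.h>

#define MODEL_ARRAY_SIZE 256
#define MODEL_ROUNDS 200000

/* hypothetical stand-in for struct virtio_balloon's pfn bookkeeping */
struct balloon_model {
        pthread_mutex_t lock;           /* stands in for balloon_lock */
        unsigned int num_pfns;
        unsigned long pfns[MODEL_ARRAY_SIZE];
};

static struct balloon_model vb;

/* stands in for fill_balloon(): repopulates pfns[] under the lock */
static void *fill_thread(void *arg)
{
        (void)arg;
        for (int round = 1; round <= MODEL_ROUNDS; round++) {
                pthread_mutex_lock(&vb.lock);
                vb.num_pfns = MODEL_ARRAY_SIZE;
                for (unsigned int i = 0; i < MODEL_ARRAY_SIZE; i++)
                        vb.pfns[i] = (unsigned long)round;
                pthread_mutex_unlock(&vb.lock);
        }
        return NULL;
}

/* stands in for leak_balloon(): drops the lock too early, like the old code */
static void *leak_thread(void *arg)
{
        long torn = 0;
        (void)arg;

        for (int round = 0; round < MODEL_ROUNDS; round++) {
                pthread_mutex_lock(&vb.lock);
                /* ...the "tell the host" step would happen here... */
                pthread_mutex_unlock(&vb.lock);

                /*
                 * The modelled bug: the "release pages" step reads
                 * pfns[]/num_pfns after the lock is dropped, so it can see a
                 * half-rewritten array while fill_thread is updating it.
                 */
                unsigned long first = vb.pfns[0];
                unsigned long last = vb.pfns[vb.num_pfns ? vb.num_pfns - 1 : 0];
                if (first != last)
                        torn++;
        }
        printf("torn snapshots observed: %ld (0 if the release stays under the lock)\n", torn);
        return NULL;
}

int main(void)
{
        pthread_t fill, leak;

        pthread_mutex_init(&vb.lock, NULL);
        pthread_create(&fill, NULL, fill_thread, NULL);
        pthread_create(&leak, NULL, leak_thread, NULL);
        pthread_join(fill, NULL);
        pthread_join(leak, NULL);
        return 0;
}

Moving release_pages_balloon() back under balloon_lock, as in the patch above,
removes that window.
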
* Re: KVM: memory ballooning bug?
From: Rafael Aquini @ 2015-12-23 11:14 UTC
To: Minchan Kim
Cc: linux-mm, Andrew Morton, Konstantin Khlebnikov, linux-kernel

On Wed, Dec 23, 2015 at 02:22:28PM +0900, Minchan Kim wrote:
> During my compaction-related work, I encountered some problems with
> ballooning.
>
> First, over repeated inflate/deflate cycles, guest memory (i.e.,
> cat /proc/meminfo | grep MemTotal) decreased and couldn't recover.
>
> When I reviewed the source code, I found that balloon_lock should cover
> release_pages_balloon. Otherwise, struct virtio_balloon fields could be
> overwritten by a racing fill_balloon (e.g., the vb->*pfns fields are
> critical). The patch below fixed the problem.
>
> [...]
>
> Second, in balloon_page_dequeue, pages_lock should cover the
> list_for_each_entry_safe loop. Otherwise, the cursor page could be
> isolated by compaction, and the list_del done by isolation could poison
> page->lru, so the loop could access a wrong address.
>
> [...]
>
> We could fix it by protecting the entire loop with pages_lock, but the
> problem is IRQ latency while walking the list. However, I doubt that
> worst case happens often, because in the normal situation the loop exits
> quickly via a successful trylock_page.
>
> Any comments?

Nope, I think the simplest way to address both cases you stumbled
across is by replacing the locking to extend those critical sections as
you suggested.

Merry Xmas!
--
Rafael

* Re: KVM: memory ballooning bug?
From: Minchan Kim @ 2015-12-23 14:17 UTC
To: Rafael Aquini
Cc: linux-mm, Andrew Morton, Konstantin Khlebnikov, linux-kernel

On Wed, Dec 23, 2015 at 06:14:49AM -0500, Rafael Aquini wrote:
> On Wed, Dec 23, 2015 at 02:22:28PM +0900, Minchan Kim wrote:
> > [...]
> >
> > Any comments?
>
> Nope, I think the simplest way to address both cases you stumbled
> across is by replacing the locking to extend those critical sections as
> you suggested.

I couldn't understand why you said "Nope", and which lock do you mean?
There are two locks that need to be extended.
If you are on the same page with me, this is what I suggested:

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 7efc329..7d3e5d0 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -209,8 +209,8 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
          */
         if (vb->num_pfns != 0)
                 tell_host(vb, vb->deflate_vq);
-        mutex_unlock(&vb->balloon_lock);
         release_pages_balloon(vb);
+        mutex_unlock(&vb->balloon_lock);
         return num_freed_pages;
 }

diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
index d3116be..300117f 100644
--- a/mm/balloon_compaction.c
+++ b/mm/balloon_compaction.c
@@ -61,6 +61,7 @@ struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info)
         bool dequeued_page;
 
         dequeued_page = false;
+        spin_lock_irqsave(&b_dev_info->pages_lock, flags);
         list_for_each_entry_safe(page, tmp, &b_dev_info->pages, lru) {
                 /*
                  * Block others from accessing the 'page' while we get around
@@ -75,15 +76,14 @@ struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info)
                                 continue;
                         }
 #endif
-                        spin_lock_irqsave(&b_dev_info->pages_lock, flags);
                         balloon_page_delete(page);
                         __count_vm_event(BALLOON_DEFLATE);
-                        spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
                         unlock_page(page);
                         dequeued_page = true;
                         break;
                 }
         }
+        spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
 
         if (!dequeued_page) {
                 /*

Do you agree with this patch? If so, I will send it after X-mas.
Otherwise, please elaborate a bit more.

Happy Happy Xmas!

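To see concretely what goes wrong without pages_lock around the whole walk,
here is a small single-threaded sketch. It is not kernel code: the list node,
list_del(), and poison values are re-implemented below only to show the
mechanism by which a "safe" list walk ends up holding a poisoned pointer once
its cached cursor is deleted behind its back (note the 0xdead... value in R12
of the oops quoted earlier in the thread).

#include <stdio.h>

/* the poison values the kernel uses for deleted list entries on x86_64 */
#define LIST_POISON1 ((void *)0xdead000000000100UL)
#define LIST_POISON2 ((void *)0xdead000000000200UL)

/* hypothetical stand-in for struct page and its lru list links */
struct node {
        struct node *prev, *next;
        int pfn;
};

/* what the kernel's list_del() does: unlink, then poison the links */
static void list_del_poison(struct node *n)
{
        n->prev->next = n->next;
        n->next->prev = n->prev;
        n->next = LIST_POISON1;
        n->prev = LIST_POISON2;
}

int main(void)
{
        /* three balloon "pages" on a circular list with a head node */
        struct node head, a, b, c;
        head.next = &a; a.prev = &head; a.next = &b;    b.prev = &a;
        b.next = &c;    c.prev = &b;    c.next = &head; head.prev = &c;
        a.pfn = 1; b.pfn = 2; c.pfn = 3;

        /* list_for_each_entry_safe-style walk: cache the next node up front */
        struct node *cur = head.next;   /* page a */
        struct node *tmp = cur->next;   /* cached cursor: page b */

        /*
         * The racy window: with pages_lock not held across the walk,
         * compaction's isolation path can list_del() the cached cursor now.
         */
        list_del_poison(&b);

        /* the walker then advances to the node it cached earlier... */
        cur = tmp;
        tmp = cur->next;        /* ...and picks up LIST_POISON1 as its next cursor */

        printf("next cursor: %p (LIST_POISON1 is %p)\n", (void *)tmp, LIST_POISON1);
        printf("dereferencing it is the general protection fault shown in the oops\n");
        return 0;
}

Holding pages_lock across the whole loop, as the mm/balloon_compaction.c hunk
above does, keeps the isolation path from deleting entries between iterations.
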
* Re: KVM: memory ballooning bug?
From: Rafael Aquini @ 2015-12-23 15:14 UTC
To: Minchan Kim
Cc: linux-mm, Andrew Morton, Konstantin Khlebnikov, linux-kernel

On Wed, Dec 23, 2015 at 11:17:10PM +0900, Minchan Kim wrote:
> I couldn't understand why you said "Nope", and which lock do you mean?
> There are two locks that need to be extended.
> If you are on the same page with me, this is what I suggested:
>
> [...]
>
> Do you agree with this patch? If so, I will send it after X-mas.
> Otherwise, please elaborate a bit more.

That's precisely what I was thinking :)

Happy Xmas!
--
Rafael

* Re: KVM: memory ballooning bug?
From: Konstantin Khlebnikov @ 2015-12-27 17:23 UTC
To: Minchan Kim
Cc: Rafael Aquini, linux-mm, Andrew Morton, Konstantin Khlebnikov, Linux Kernel Mailing List

On Wed, Dec 23, 2015 at 8:22 AM, Minchan Kim <minchan@kernel.org> wrote:
> During my compaction-related work, I encountered some problems with
> ballooning.
>
> First, over repeated inflate/deflate cycles, guest memory (i.e.,
> cat /proc/meminfo | grep MemTotal) decreased and couldn't recover.
>
> When I reviewed the source code, I found that balloon_lock should cover
> release_pages_balloon. Otherwise, struct virtio_balloon fields could be
> overwritten by a racing fill_balloon (e.g., the vb->*pfns fields are
> critical).

I guess that in the original design fill and leak could be called only
from the single kernel thread which manages the balloon, and the lock
was added only for migration. So the locking scheme should be revisited
for sure. It has probably been broken by some of the recent changes.

> [...]
>
> Any comments?

* Re: KVM: memory ballooning bug?
From: Minchan Kim @ 2015-12-27 23:49 UTC
To: Konstantin Khlebnikov
Cc: Rafael Aquini, linux-mm, Andrew Morton, Konstantin Khlebnikov, Linux Kernel Mailing List

On Sun, Dec 27, 2015 at 08:23:03PM +0300, Konstantin Khlebnikov wrote:
> On Wed, Dec 23, 2015 at 8:22 AM, Minchan Kim <minchan@kernel.org> wrote:
> > [...]
> > When I reviewed the source code, I found that balloon_lock should cover
> > release_pages_balloon. Otherwise, struct virtio_balloon fields could be
> > overwritten by a racing fill_balloon (e.g., the vb->*pfns fields are
> > critical).
>
> I guess that in the original design fill and leak could be called only
> from the single kernel thread which manages the balloon, and the lock
> was added only for migration. So the locking scheme should be revisited
> for sure. It has probably been broken by some of the recent changes.

When I read the git log, it seems to have been broken since
balloon_compaction was introduced.

Anyway, ballooning itself is not my main interest; I just wanted my test
to run for a long time without any problem. ;-) If you guys want to
redesign the locking scheme fully, please do. Until then, I can carry on
with my test using the patches I just sent.

Thanks.

end of thread

Thread overview: 6+ messages
2015-12-23  5:22 KVM: memory ballooning bug? Minchan Kim
2015-12-23 11:14 ` Rafael Aquini
2015-12-23 14:17   ` Minchan Kim
2015-12-23 15:14     ` Rafael Aquini
2015-12-27 17:23 ` Konstantin Khlebnikov
2015-12-27 23:49   ` Minchan Kim