linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Minchan Kim <minchan@kernel.org>
To: Rafael Aquini <aquini@redhat.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	linux-kernel@vger.kernel.org
Subject: KVM: memory ballooning bug?
Date: Wed, 23 Dec 2015 14:22:28 +0900	[thread overview]
Message-ID: <20151223052228.GA31269@bbox> (raw)

During my compaction-related stuff, I encountered some problems with
ballooning.

Firstly, with repeated inflating and deflating cycle, guest memory(ie,
cat /proc/meminfo | grep MemTotal) decreased and couldn't recover.

When I review source code, balloon_lock should cover release_pages_balloon.
Otherwise, struct virtio_balloon fields could be overwritten by race
of fill_balloon(e,g, vb->*pfns could be critical).
Below patch fixed the problem.

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 7efc32945810..7d3e5d0e9aa4 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -209,8 +209,8 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
         */
        if (vb->num_pfns != 0)
                tell_host(vb, vb->deflate_vq);
-       mutex_unlock(&vb->balloon_lock);
        release_pages_balloon(vb);
+       mutex_unlock(&vb->balloon_lock);
        return num_freed_pages;
 }
 
Secondly, in balloon_page_dequeue, pages_lock should cover
list_for_each_entry_safe loop. Otherwise, the cursor page
could be isolated by compaction and then list_del by isolation
could poison the page->lru so the loop could access wrong address
like this.

general protection fault: 0000 [#1] SMP 
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 2 PID: 82 Comm: vballoon Not tainted 4.4.0-rc5-mm1+ #1906
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff8800a7ff0000 ti: ffff8800a7fec000 task.ti: ffff8800a7fec000
RIP: 0010:[<ffffffff8115e754>]  [<ffffffff8115e754>] balloon_page_dequeue+0x54/0x130
RSP: 0018:ffff8800a7fefdc0  EFLAGS: 00010246
RAX: ffff88013fff9a70 RBX: ffffea000056fe00 RCX: 0000000000002b7d
RDX: ffff88013fff9a70 RSI: ffffea000056fe00 RDI: ffff88013fff9a68
RBP: ffff8800a7fefde8 R08: ffffea000056fda0 R09: 0000000000000000
R10: ffff8800a7fefd90 R11: 0000000000000001 R12: dead0000000000e0
R13: ffffea000056fe20 R14: ffff880138809070 R15: ffff880138809060
FS:  0000000000000000(0000) GS:ffff88013fc40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f229c10e000 CR3: 00000000b8b53000 CR4: 00000000000006a0
Stack:
 0000000000000100 ffff880138809088 ffff880138809000 ffff880138809060
 0000000000000046 ffff8800a7fefe28 ffffffff812c86d3 ffff880138809020
 ffff880138809000 fffffffffff91900 0000000000000100 ffff880138809060
Call Trace:
 [<ffffffff812c86d3>] leak_balloon+0x93/0x1a0
 [<ffffffff812c8bc7>] balloon+0x217/0x2a0
 [<ffffffff8143739e>] ? __schedule+0x31e/0x8b0
 [<ffffffff81078160>] ? abort_exclusive_wait+0xb0/0xb0
 [<ffffffff812c89b0>] ? update_balloon_stats+0xf0/0xf0
 [<ffffffff8105b6e9>] kthread+0xc9/0xe0
 [<ffffffff8105b620>] ? kthread_park+0x60/0x60
 [<ffffffff8143b4af>] ret_from_fork+0x3f/0x70
 [<ffffffff8105b620>] ? kthread_park+0x60/0x60
Code: 8d 60 e0 0f 84 af 00 00 00 48 8b 43 20 a8 01 75 3b 48 89 d8 f0 0f ba 28 00 72 10 48 8b 03 f6 c4 08 75 2f 48 89 df e8 8c 83 f9 ff <49> 8b 44 24 20 4d 8d 6c 24 20 48 83 e8 20 4d 39 f5 74 7a 4c 89 
RIP  [<ffffffff8115e754>] balloon_page_dequeue+0x54/0x130
 RSP <ffff8800a7fefdc0>
---[ end trace 43cf28060d708d5f ]---
Kernel panic - not syncing: Fatal exception
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled

We could fix it by protecting the entire loop by pages_lock but
problem is irq latency during walking the list.
But I doubt how often such worst scenario happens because
in normal situation, the loop would exit easily via succeeding
trylock_page.

Any comments?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

             reply	other threads:[~2015-12-23  5:21 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-23  5:22 Minchan Kim [this message]
2015-12-23 11:14 ` Rafael Aquini
2015-12-23 14:17   ` Minchan Kim
2015-12-23 15:14     ` Rafael Aquini
2015-12-27 17:23 ` Konstantin Khlebnikov
2015-12-27 23:49   ` Minchan Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151223052228.GA31269@bbox \
    --to=minchan@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=aquini@redhat.com \
    --cc=khlebnikov@yandex-team.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox