From: "Li, Liang Z"
Subject: RE: [RFC qemu 0/4] A PV solution for live migration optimization
Date: Fri, 11 Mar 2016 02:38:02 +0000
References: <1457001868-15949-1-git-send-email-liang.z.li@intel.com> <20160308111343.GM15443@grmbl.mre> <20160310075728.GB4678@grmbl.mre> <20160310111844.GB2276@work-vm>
In-Reply-To: <20160310111844.GB2276@work-vm>
To: "Dr. David Alan Gilbert"
Cc: Amit Shah, "quintela@redhat.com", "qemu-devel@nongnu.org", "linux-kernel@vger.kernel.org", "mst@redhat.com", "akpm@linux-foundation.org", "pbonzini@redhat.com", "rth@twiddle.net", "ehabkost@redhat.com", "linux-mm@kvack.org", "virtualization@lists.linux-foundation.org", "kvm@vger.kernel.org", "mohan_parthasarathy@hpe.com", "jitendra.kolhe@hpe.com", "simhan@hpe.com"

>
> Hi,
>   I'm just catching back up on this thread; so without reference to any
> particular previous mail in the thread.
>
>   1) How many of the free pages do we tell the host about?
>      Your main change is telling the host about all the
>      free pages.

Yes, all the guest's free pages.
>      If we tell the host about all the free pages, then we might
>      end up needing to allocate more pages and update the host
>      with pages we now want to use; that would have to wait for the
>      host to acknowledge that use of these pages, since if we don't
>      wait for it then it might have skipped migrating a page we
>      just started using (I don't understand how your series solves that).
>      So the guest probably needs to keep some free pages - how many?

Actually, there is no need to care about whether the free pages will be
used by the host. We only care about free pages that get reused by the
guest, right?

Dirty page logging can be used to solve this: start dirty page logging
before getting the free page information from the guest. Even if some of
the free pages are modified by the guest while the free page information
is being collected, those modified pages will be tracked by the dirty
page logging mechanism. Then, in the following migration_bitmap_sync()
call, any page that is in the free page bitmap but was later modified
will be reset to dirty. We won't omit any dirtied pages.

So the guest doesn't need to keep any free pages.

>   2) Clearing out caches
>      Does it make sense to clean caches? They're apparently useful data
>      so if we clean them it's likely to slow the guest down; I guess
>      they're also likely to be fairly static data - so at least fairly
>      easy to migrate.
>      The answer here partially depends on what you want from your migration;
>      if you're after the fastest possible migration time it might make
>      sense to clean the caches and avoid migrating them; but that might
>      be at the cost of more disruption to the guest - there's a trade off
>      somewhere and it's not clear to me how you set that depending on your
>      guest/network/reqirements.
>

Yes, cleaning the caches is an option. Let the users decide whether to
use it.

>   3) Why is ballooning slow?
>      You've got a figure of 5s to balloon on an 8GB VM - but an
>      8GB VM isn't huge; so I worry about how long it would take
>      on a big VM. We need to understand why it's slow
>        * is it due to the guest shuffling pages around?
>        * is it due to the virtio-balloon protocol sending one page
>          at a time?
>          + Do balloon pages normally clump in physical memory
>            - i.e. would a 'large balloon' message help
>            - or do we need a bitmap because it tends not to clump?
>

I didn't do a comprehensive test, but I found that most of the time is
spent on allocating the pages and sending the PFNs to guest. I don't know
which of the two is the most time-consuming operation, allocating the
pages or sending the PFNs.

>        * is it due to the madvise on the host?
>          If we were using the normal balloon messages, then we
>          could, during migration, just route those to the migration
>          code rather than bothering with the madvise.
>          If they're clumping together we could just turn that into
>          one big madvise; if they're not then would we benefit from
>          a call that lets us madvise lots of areas?
>

My test showed that madvise() is not the main reason for the long time;
it only took 10% of the total balloon inflation time. A big madvise can
more or less improve the performance.

>   4) Speeding up the migration of those free pages
>      You're using the bitmap to avoid migrating those free pages; HPe's
>      patchset is reconstructing a bitmap from the balloon data; OK, so
>      this all makes sense to avoid migrating them - I'd also been thinking
>      of using pagemap to spot zero pages that would help find other zero'd
>      pages, but perhaps ballooned is enough?
>

Could you describe your idea in more detail?

>   5) Second-migrate
>      Given a VM where you've done all those tricks on, what happens when
>      you migrate it a second time? I guess you're aiming for the guest
>      to update it's bitmap; HPe's solution is to migrate it's balloon
>      bitmap along with the migration data.

Nothing is special about the second migration: QEMU will request the free
page information from the guest, and the guest will traverse its current
free page list to construct a new free page bitmap and send it to QEMU,
just like in the first migration.

Liang

>
> Dave
>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
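
To make the ordering in the replies to 1) and 5) concrete, here is a minimal sketch in Python. It is not the QEMU code. It is a toy model under assumptions: bitmaps are plain Python integers, the guest's free list is taken to be a list of (start_pfn, order) blocks, and the names DirtyLog, guest_free_page_bitmap and migration_round are hypothetical. It only illustrates that starting dirty logging before asking for the free page bitmap puts a reused free page back into the set of pages to send, and that a second migration simply rebuilds the bitmap from the current free list.

```python
class DirtyLog:
    """Toy stand-in for the dirty page log (hypothetical, not a QEMU API)."""

    def __init__(self):
        self.enabled = False
        self.bitmap = 0

    def start(self):
        self.enabled = True

    def guest_write(self, pfn):
        # Writes are only recorded once logging has been started.
        if self.enabled:
            self.bitmap |= 1 << pfn

    def fetch_and_clear(self):
        dirty, self.bitmap = self.bitmap, 0
        return dirty


def guest_free_page_bitmap(free_blocks):
    """Guest side: walk the free list and set one bit per free PFN.

    The (start_pfn, order) block layout is an assumption for this sketch.
    """
    bitmap = 0
    for start_pfn, order in free_blocks:
        for pfn in range(start_pfn, start_pfn + (1 << order)):
            bitmap |= 1 << pfn
    return bitmap


def migration_round(num_pages, free_blocks, log, writes_during_query):
    """Return the bitmap of pages that still have to be sent."""
    log.start()                                        # 1. logging first
    free_bitmap = guest_free_page_bitmap(free_blocks)  # 2. guest reports free pages
    for pfn in writes_during_query:                    # 3. guest reuses some pages
        log.guest_write(pfn)
    to_send = (1 << num_pages) - 1                     # start with every page marked
    to_send &= ~free_bitmap                            # skip pages reported as free
    to_send |= log.fetch_and_clear()                   # sync: re-mark dirtied pages
    return to_send


def pfns(bitmap):
    """List the PFNs whose bit is set."""
    return [pfn for pfn in range(bitmap.bit_length()) if bitmap >> pfn & 1]


# First migration: PFNs 2, 3 and 5 are free, but the guest reuses PFN 3.
first = pfns(migration_round(8, [(2, 1), (5, 0)], DirtyLog(), [3]))

# Second migration: a fresh bitmap is built from the current free list.
second = pfns(migration_round(8, [(6, 1)], DirtyLog(), []))
```

In this model PFN 3 is reported free but ends up in `first` anyway because the write was logged, while PFNs 2 and 5 are skipped. If logging were started only after the free page bitmap was fetched, the write to PFN 3 could be missed, which is why the order matters.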