From: "Li, Liang Z" <liang.z.li@intel.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>,
	"quintela@redhat.com" <quintela@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"mst@redhat.com" <mst@redhat.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"rth@twiddle.net" <rth@twiddle.net>,
	"ehabkost@redhat.com" <ehabkost@redhat.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"virtualization@lists.linux-foundation.org"
	<virtualization@lists.linux-foundation.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"mohan_parthasarathy@hpe.com" <mohan_parthasarathy@hpe.com>,
	"jitendra.kolhe@hpe.com" <jitendra.kolhe@hpe.com>,
	"simhan@hpe.com" <simhan@hpe.com>
Subject: RE: [RFC qemu 0/4] A PV solution for live migration optimization
Date: Tue, 15 Mar 2016 03:31:36 +0000
Message-ID: <F2CBF3009FA73547804AE4C663CAB28E0414D67B@shsmsx102.ccr.corp.intel.com>
In-Reply-To: <20160314170334.GK2234@work-vm>

> > > Hi,
> > >   I'm just catching back up on this thread; so without reference to
> > > any particular previous mail in the thread.
> > >
> > >   1) How many of the free pages do we tell the host about?
> > >      Your main change is telling the host about all the
> > >      free pages.
> >
> > Yes, all the guest's free pages.
> >
> > >      If we tell the host about all the free pages, then we might
> > >      end up needing to allocate more pages and update the host
> > >      with pages we now want to use; that would have to wait for the
> > >      host to acknowledge that use of these pages, since if we don't
> > >      wait for it then it might have skipped migrating a page we
> > >      just started using (I don't understand how your series solves that).
> > >      So the guest probably needs to keep some free pages - how many?
> >
> > Actually, there is no need to care about whether the free pages will be
> > used by the host. We only care about whether some of the free pages we
> > got are reused by the guest, right?
> >
> > Dirty page logging can be used to solve this: start dirty page logging
> > before getting the free page information from the guest. Even if some
> > of the free pages are modified by the guest while the free page
> > information is being collected, these modified pages will be tracked by
> > the dirty page logging mechanism. So in the following
> > migration_bitmap_sync() call, the pages that are in the free page
> > bitmap but were later modified will be reset to dirty. We won't omit
> > any dirtied pages.
> >
> > So the guest doesn't need to keep any free pages.
> 
> OK, yes, that works; so we do:
>   * enable dirty logging
>   * ask guest for free pages
>   * initialise the migration bitmap as everything-free
>   * then later we do the normal sync-dirty bitmap stuff and it all just works.
> 
> That's nice and simple.
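
The ordering above can be sketched as follows. This is only an illustration, not QEMU code: the function names and the list-of-booleans bitmap are hypothetical, and it assumes "everything-free" means "every page except the ones the guest reported free".

```python
def build_migration_bitmap(num_pages, free_pfns):
    """Mark every page as 'to send', then clear the pages reported free.

    Assumption: dirty logging was already enabled before free_pfns
    was collected from the guest.
    """
    bitmap = [True] * num_pages
    for pfn in free_pfns:
        bitmap[pfn] = False
    return bitmap


def sync_dirty_log(bitmap, dirty_pfns):
    """Re-mark pages written since logging started.

    A page that was reported free but reused by the guest afterwards
    shows up in the dirty log, so it is sent anyway.
    """
    for pfn in dirty_pfns:
        bitmap[pfn] = True
    return bitmap


if __name__ == "__main__":
    bm = build_migration_bitmap(8, free_pfns=[2, 3, 5])
    sync_dirty_log(bm, dirty_pfns=[3])  # page 3 was reused by the guest
    print(bm)
```

Because the dirty log covers the whole window in which the free page list was gathered, a stale "free" entry can only cost an extra page transfer later; it cannot cause a page to be skipped.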
> 
> > >   2) Clearing out caches
> > >      Does it make sense to clean caches?  They're apparently useful data
> > >      so if we clean them it's likely to slow the guest down; I guess
> > >      they're also likely to be fairly static data - so at least fairly
> > >      easy to migrate.
> > >      The answer here partially depends on what you want from your
> > >      migration; if you're after the fastest possible migration time it
> > >      might make sense to clean the caches and avoid migrating them; but
> > >      that might be at the cost of more disruption to the guest - there's
> > >      a trade off somewhere and it's not clear to me how you set that
> > >      depending on your guest/network/requirements.
> > >
> >
> > Yes, cleaning the caches is an option.  Let the users decide whether to use it or not.
> >
> > >   3) Why is ballooning slow?
> > >      You've got a figure of 5s to balloon on an 8GB VM - but an
> > >      8GB VM isn't huge; so I worry about how long it would take
> > >      on a big VM.   We need to understand why it's slow
> > >        * is it due to the guest shuffling pages around?
> > >        * is it due to the virtio-balloon protocol sending one page
> > >          at a time?
> > >          + Do balloon pages normally clump in physical memory
> > >             - i.e. would a 'large balloon' message help
> > >             - or do we need a bitmap because it tends not to clump?
> > >
> >
> > I didn't do a comprehensive test, but I found that most of the time is
> > spent on allocating the pages and sending the PFNs to the guest. I
> > don't know which is the most time-consuming operation: allocating the
> > pages or sending the PFNs.
> 
> It might be a good idea to analyse it a bit more to convince people where the
> problem is.
> 

Yes, I will try to measure the time spent on the different parts.

> > >        * is it due to the madvise on the host?
> > >          If we were using the normal balloon messages, then we
> > >          could, during migration, just route those to the migration
> > >          code rather than bothering with the madvise.
> > >          If they're clumping together we could just turn that into
> > >          one big madvise; if they're not then would we benefit from
> > >          a call that lets us madvise lots of areas?
> > >
> >
> > My test showed that madvise() is not the main reason for the long time;
> > it only takes 10% of the total balloon inflation time.
> > A big madvise can improve the performance somewhat.
> 
> OK; 10% of the total is still pretty big even for your 8GB VM.
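
To make the "one big madvise" idea concrete, here is a minimal sketch of merging ballooned PFNs into contiguous runs, so that each run could be covered by a single call instead of one call per page. The function name and the 4 KiB page size are assumptions for illustration; the actual madvise() call is left out.

```python
PAGE_SIZE = 4096  # assumed guest page size


def coalesce_pfns(pfns):
    """Merge page frame numbers into (offset, length) byte ranges.

    Adjacent PFNs collapse into one range, so a clumped balloon
    needs far fewer calls than a scattered one.
    """
    runs = []
    for pfn in sorted(set(pfns)):
        if runs and runs[-1][0] + runs[-1][1] == pfn:
            runs[-1][1] += 1          # extends the current run
        else:
            runs.append([pfn, 1])     # starts a new run
    return [(start * PAGE_SIZE, count * PAGE_SIZE) for start, count in runs]


if __name__ == "__main__":
    for offset, length in coalesce_pfns([7, 3, 4, 5, 10]):
        print(hex(offset), length)
```

How much this helps depends on the clumping question raised above: if the ballooned pages are scattered, the number of ranges stays close to the number of pages.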
> 
> > >   4) Speeding up the migration of those free pages
> > >     You're using the bitmap to avoid migrating those free pages; HPe's
> > >     patchset is reconstructing a bitmap from the balloon data;  OK, so
> > >     this all makes sense to avoid migrating them - I'd also been thinking
> > >     of using pagemap to spot zero pages that would help find other zero'd
> > >     pages, but perhaps ballooned is enough?
> > >
> > Could you describe your idea in more detail?
> 
> At the moment the migration code spends a fair amount of time checking if a
> page is zero; I was thinking perhaps the qemu could just open
> /proc/self/pagemap and check if the page was mapped; that would seem
> cheap if we're checking big ranges; and that would find all the balloon pages.
> 

Even if virtio-balloon is not enabled, it can be used to find the pages that were
never used by the guest.
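
A rough sketch of that pagemap check, for illustration only. It relies on the documented /proc/<pid>/pagemap layout (one 64-bit entry per virtual page, bit 63 = present, bit 62 = swapped); the helper names and the 4 KiB page size are assumptions, and this is not how QEMU does it today.

```python
import struct

PM_PRESENT = 1 << 63   # page is present in RAM
PM_SWAPPED = 1 << 62   # page is in swap
ENTRY_SIZE = 8         # each pagemap entry is a 64-bit word


def page_is_backed(entry):
    """True if the page has been populated (in RAM or in swap)."""
    return bool(entry & (PM_PRESENT | PM_SWAPPED))


def untouched_pages(raw_entries):
    """Return indexes of pages that are neither present nor swapped."""
    count = len(raw_entries) // ENTRY_SIZE
    entries = struct.unpack(f"={count}Q", raw_entries[:count * ENTRY_SIZE])
    return [i for i, entry in enumerate(entries) if not page_is_backed(entry)]


def read_pagemap(vaddr, num_pages, page_size=4096):
    """Read raw pagemap entries for num_pages starting at vaddr (Linux only)."""
    with open("/proc/self/pagemap", "rb") as f:
        f.seek((vaddr // page_size) * ENTRY_SIZE)
        return f.read(num_pages * ENTRY_SIZE)
```

One read covers a large range of pages, which is why it could be cheaper than scanning each page for zeros. Pages that are not backed at all are the ones that could be skipped.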

> > >   5) Second-migrate
> > >     Given a VM on which you've done all those tricks, what happens when
> > >     you migrate it a second time?   I guess you're aiming for the guest
> > >     to update its bitmap;  HPe's solution is to migrate its balloon
> > >     bitmap along with the migration data.
> >
> > Nothing is special in the second migration: QEMU will request the free
> > page information from the guest, and the guest will traverse its
> > current free page list to construct a new free page bitmap and send it
> > to QEMU, just like in the first migration.
> 
> Right.
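
As an illustration of the guest-side step, the sketch below builds a free page bitmap from a free list. It assumes the free list is a sequence of (start_pfn, order) blocks, as in a buddy allocator, where a block of order n spans 2**n pages; the function name and bit layout are hypothetical and not taken from the patch series.

```python
def free_page_bitmap(num_pages, free_blocks):
    """Build a bitmap with one bit per page; a set bit means 'free'.

    free_blocks: iterable of (start_pfn, order) pairs. The bitmap is
    rebuilt from scratch on every request, so a second migration needs
    no state carried over from the first one.
    """
    bitmap = bytearray((num_pages + 7) // 8)
    for start_pfn, order in free_blocks:
        for pfn in range(start_pfn, start_pfn + (1 << order)):
            if pfn < num_pages:
                bitmap[pfn // 8] |= 1 << (pfn % 8)
    return bitmap


if __name__ == "__main__":
    # Two free blocks: pages 0-1 (order 1) and pages 8-11 (order 2).
    print(free_page_bitmap(16, [(0, 1), (8, 2)]).hex())
```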
> 
> Dave
> 
> > Liang
> > >
> > > Dave
> > >
> > > --
> > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

