linux-mm.kvack.org archive mirror
From: Axel Rasmussen <axelrasmussen@google.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Christian Brauner <brauner@kernel.org>,
	David Hildenbrand <david@redhat.com>,
	 Hongchen Zhang <zhanghongchen@loongson.cn>,
	Huang Ying <ying.huang@intel.com>,
	 James Houghton <jthoughton@google.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	 Miaohe Lin <linmiaohe@huawei.com>,
	"Mike Rapoport (IBM)" <rppt@kernel.org>,
	Nadav Amit <namit@vmware.com>,
	 Naoya Horiguchi <naoya.horiguchi@nec.com>,
	Peter Xu <peterx@redhat.com>,  Shuah Khan <shuah@kernel.org>,
	ZhangPeng <zhangpeng362@huawei.com>,
	 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	 linux-mm@kvack.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH 1/3] mm: userfaultfd: add new UFFDIO_SIGBUS ioctl
Date: Thu, 11 May 2023 13:40:16 -0700
Message-ID: <CAJHvVcg+Sm-=F=Xhi-WVLRxDcDcYzD8AwLpHHoP8zLubOoX6TQ@mail.gmail.com>
In-Reply-To: <20230511202243.GA5466@monkey>

On Thu, May 11, 2023 at 1:29 PM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 05/11/23 11:24, Axel Rasmussen wrote:
> > The basic idea here is to "simulate" memory poisoning for VMs. A VM
> > running on some host might encounter a memory error, after which some
> > page(s) are poisoned (i.e., future accesses SIGBUS). They expect that
> > once poisoned, pages can never become "un-poisoned". So, when we live
> > migrate the VM, we need to preserve the poisoned status of these pages.
> >
> > When live migrating, we try to get the guest running on its new host as
> > quickly as possible. So, we start it running before all memory has been
> > copied, and before we're certain which pages should be poisoned or not.
> >
> > So the basic way to use this new feature is:
> >
> > - On the new host, the guest's memory is registered with userfaultfd, in
> >   either MISSING or MINOR mode (doesn't really matter for this purpose).
> > - On any first access, we get a userfaultfd event. At this point we can
> >   communicate with the old host to find out if the page was poisoned.
>
> Just curious, what is this communication channel with the old host?

James can probably describe it in more detail / more correctly than I
can. My (possibly wrong :) ) understanding is:

On the source machine we maintain a bitmap indicating which pages are
clean, dirty (meaning modified after the initial "precopy" of
memory to the target machine), or poisoned. Eventually the entire
bitmap is sent to the target machine, but this takes some time (maybe
seconds on large machines). After that point, though, we have all the
information we need and no longer need to communicate with the source
to find out the status of pages (although there may still be some
memory contents to finish copying over).

In the meantime, I think the target machine can also ask the source
machine about the status of individual pages (for quick on-demand
paging).
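
To make the above a bit more concrete, here is a minimal sketch in C of
how the target side could decide what to do with a userfaultfd event.
Everything named here is hypothetical and only for illustration (the
enums, struct page_bitmap, resolve_fault(), and the two-bits-per-page
layout); it is not the internal implementation, just the idea as I
understand it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-page states, packed two bits per page. */
enum page_state { PAGE_CLEAN = 0, PAGE_DIRTY = 1, PAGE_POISONED = 2 };

/* Hypothetical actions the target can take on a userfaultfd event. */
enum fault_action {
	ACT_USE_PRECOPIED,	/* clean: precopied contents are still valid */
	ACT_FETCH_CONTENTS,	/* dirty: contents must come from the source */
	ACT_MARK_POISONED,	/* poisoned: respond with UFFDIO_SIGBUS */
	ACT_QUERY_SOURCE,	/* bitmap not fully received: ask the source */
};

struct page_bitmap {
	uint8_t *bits;		/* two bits per page, four pages per byte */
	size_t npages;
	bool complete;		/* true once the whole bitmap has arrived */
};

/* Allocate a zeroed bitmap, so every page starts out PAGE_CLEAN. */
static int bitmap_init(struct page_bitmap *bm, size_t npages)
{
	bm->bits = calloc((npages + 3) / 4, 1);
	if (!bm->bits)
		return -1;
	bm->npages = npages;
	bm->complete = false;
	return 0;
}

static void bitmap_set(struct page_bitmap *bm, size_t page, enum page_state st)
{
	unsigned int shift = (unsigned int)(page % 4) * 2;

	assert(page < bm->npages);
	bm->bits[page / 4] &= (uint8_t)~(3u << shift);
	bm->bits[page / 4] |= (uint8_t)((unsigned int)st << shift);
}

static enum page_state bitmap_get(const struct page_bitmap *bm, size_t page)
{
	assert(page < bm->npages);
	return (enum page_state)((bm->bits[page / 4] >> ((page % 4) * 2)) & 3u);
}

/* Decide how to resolve a fault on the given page index. */
static enum fault_action resolve_fault(const struct page_bitmap *bm, size_t page)
{
	if (!bm->complete)
		return ACT_QUERY_SOURCE;

	switch (bitmap_get(bm, page)) {
	case PAGE_POISONED:
		return ACT_MARK_POISONED;
	case PAGE_DIRTY:
		return ACT_FETCH_CONTENTS;
	default:
		return ACT_USE_PRECOPIED;
	}
}
```

Until the full bitmap has arrived, every fault falls back to asking the
source about that one page; afterwards the lookup is purely local, and
only the poisoned case needs the new ioctl.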

As for the underlying mechanism, it's an internal protocol but the
publicly-available thing it's most similar to is probably gRPC [1]. At
a really basic level, we send binary serialized protocol buffers [2]
over the network in a request / response fashion.

[1] https://grpc.io/
[2] https://protobuf.dev/

> --
> Mike Kravetz
>
> > - If so, we can respond with a UFFDIO_SIGBUS - this places a swap marker
> >   so any future accesses will SIGBUS. Because the pte is now "present",
> >   future accesses won't generate more userfaultfd events, they'll just
> >   SIGBUS directly.
> >
> > UFFDIO_SIGBUS does not handle unmapping previously-present PTEs. This
> > isn't needed, because during live migration we want to intercept
> > all accesses with userfaultfd (not just writes, so WP mode isn't useful
> > for this). So whether minor or missing mode is being used (or both), the
> > PTE won't be present in any case, so handling that case isn't needed.
> >

