From: Xiao Guangrong <xiaoguangrong.eric@gmail.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>,
wenchao <wenchaolinux@gmail.com>, Mel Gorman <mgorman@suse.de>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
hughd@google.com, walken@google.com,
Alexander Viro <viro@zeniv.linux.org.uk>,
kirill.shutemov@linux.intel.com,
Anthony Liguori <anthony@codemonkey.ws>,
KVM <kvm@vger.kernel.org>
Subject: Re: [RFC PATCH V1 0/6] mm: add a new option MREMAP_DUP to mmrep syscall
Date: Tue, 31 Dec 2013 20:06:51 +0800
Message-ID: <943AC3BD-C4EB-4B6C-BE34-AB921938AAF0@linux.vnet.ibm.com>
In-Reply-To: <20131230202342.GA7973@amt.cnet>
On Dec 31, 2013, at 4:23 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Tue, Dec 17, 2013 at 01:59:04PM +0800, Xiao Guangrong wrote:
>>
>> CCed KVM guys.
>>
>> On 05/10/2013 01:11 PM, Stefan Hajnoczi wrote:
>>> On Fri, May 10, 2013 at 4:28 AM, wenchao <wenchaolinux@gmail.com> wrote:
>>>> On 2013-5-9 22:13, Mel Gorman wrote:
>>>>
>>>>> On Thu, May 09, 2013 at 05:50:05PM +0800, wenchaolinux@gmail.com wrote:
>>>>>>
>>>>>> From: Wenchao Xia <wenchaolinux@gmail.com>
>>>>>>
>>>>>> This series tries to enable the mremap syscall to CoW a private memory
>>>>>> region, just like fork() does. As a result, a user-space application
>>>>>> would get a mirror of that region, which can be used as a snapshot for
>>>>>> further processing.
>>>>>>
>>>>>
>>>>> Why not just fork()? Even if the application was threaded it should be
>>>>> manageable to handle fork just for processing the private memory region
>>>>> in question. I'm having trouble figuring out what sort of application
>>>>> would require an interface like this.
>>>>>
>>>> It has some trouble: parent-child communication and, sometimes,
>>>> page copying.
>>>> I'd like to snapshot a qemu guest's RAM; the current solution is:
>>>> 1) fork()
>>>> 2) pipe guest RAM data from child to parent.
>>>> 3) parent writes down the contents.
>>>>
>>>> To avoid complex communication for data control and to protect the
>>>> file contents, the parent rather than the child handles the data
>>>> through a pipe, but this brings an additional copy. I think an explicit
>>>> API that CoW-maps a memory region inside one process could avoid that
>>>> copy, be faster, CoW fewer pages, and make the user-space code nicer.
>>>
>>> A new Linux-specific API is not portable and not available on existing
>>> hosts. Since QEMU supports non-Linux host operating systems the
>>> fork() approach is preferable.
>>>
>>> If you're worried about the memory copy - which should be benchmarked
>>> - then vmsplice(2) can be used in the child process and splice(2) can
>>> be used in the parent. It probably doesn't help though since QEMU
>>> scans RAM pages to find all-zero pages before sending them over the
>>> socket, and at that point the memory copy might not make much
>>> difference.
>>>
>>> Perhaps other applications can use this new flag better, but for QEMU
>>> I think fork()'s portability is more important than the convenience of
>>> accessing the CoW pages in the same process.
>>
>> Yup, I agree with you that the new syscall sometimes is not a good solution.
>>
>> Currently, we're working on live-update[1], which will be enabled on Qemu
>> first. This feature lets the guest keep running smoothly on a new Qemu
>> binary without a restart, which is useful for applying security updates.
>>
>> In this case, we need to move the guest memory from the old qemu instance
>> to the new one. fork() cannot help because we need to exec() a new
>> instance, after which all memory mappings are destroyed.
>>
>> We tried to enable SPLICE_F_MOVE[2] for vmsplice() to move the memory without
>> a memory copy, but the performance isn't as good as we expected, due to
>> some limitations: the page size, locking, the message-size limit on pipes,
>> etc. Of course, we will continue to improve this, but wenchao's patch seems
>> like a new direction for us.
>>
>> To coordinate with your fork() approach, maybe we can introduce a new VMA
>> flag, something like VM_KEEP_ONEXEC, to tell exec() not to destroy
>> this VMA. How about this, or do you guys have a new idea? We would really
>> appreciate your suggestions.
>>
>> [1] http://marc.info/?l=qemu-devel&m=138597598700844&w=2
>> [2] https://lkml.org/lkml/2013/10/25/285
>
> Hi,
>
Hi Marcelo,

> What is the purpose of snapshotting guest RAM here, in the context of
> local migration?

RAM snapshotting and local migration are two different use cases.
The reason I asked for your suggestions here is that I thought
both need to do the same thing: move memory from one process
to another in an efficient way. What do you think? :)
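
For readers following the thread, the fork()-based snapshot flow quoted
above (fork, pipe guest RAM from child to parent, parent collects the
contents) can be sketched roughly as below. This is only an illustration
under stated assumptions, not QEMU code: "guest RAM" is stood in for by an
anonymous MAP_PRIVATE mapping, and the names start_snapshot,
collect_snapshot and CHUNK are hypothetical. It also shows the extra copy
through the pipe that the thread is concerned about.

```python
import mmap
import os

CHUNK = 64 * 1024  # hypothetical transfer size per write()/read()


def start_snapshot(region):
    """fork() a child that streams its copy-on-write view of `region`.

    Returns (child_pid, read_fd). `region` must be a private mapping;
    with a shared mapping the child would see the parent's later writes.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: its view of the private mapping is frozen at fork() time.
        status = 1
        try:
            os.close(read_fd)
            view = memoryview(region)
            offset = 0
            while offset < len(view):
                # This write is the additional copy mentioned in the thread.
                offset += os.write(write_fd, view[offset:offset + CHUNK])
            status = 0
        finally:
            os._exit(status)
    # Parent: keep only the read end of the pipe.
    os.close(write_fd)
    return pid, read_fd


def collect_snapshot(pid, read_fd):
    """Parent side: drain the pipe until EOF, then reap the child."""
    chunks = []
    while True:
        chunk = os.read(read_fd, CHUNK)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)


if __name__ == "__main__":
    size = 1 << 20
    ram = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    ram[:] = b"A" * size
    pid, fd = start_snapshot(ram)
    ram[:5] = b"DIRTY"               # the "guest" keeps running after fork()
    snap = collect_snapshot(pid, fd)
    print(snap[:5], ram[:5])         # snapshot keeps the pre-fork bytes
```

Note that this only works while both processes share the fork() ancestry;
once the new instance calls exec(), the mappings are gone, which is the
live-update problem described in the quoted message.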
Thread overview: 18+ messages
2013-05-09 9:50 wenchaolinux
2013-05-09 9:50 ` [RFC PATCH V1 1/6] mm: add parameter remove_old in move_huge_pmd() wenchaolinux
2013-05-09 9:50 ` [RFC PATCH V1 2/6] mm : allow copy between different addresses for copy_one_pte() wenchaolinux
2013-05-09 9:50 ` [RFC PATCH V1 3/6] mm : export rss vec helper functions wenchaolinux
2013-05-09 9:50 ` [RFC PATCH V1 4/6] mm : export is_cow_mapping() wenchaolinux
2013-05-09 9:50 ` [RFC PATCH V1 5/6] mm : add parameter remove_old in move_page_tables wenchaolinux
2013-05-09 9:50 ` [RFC PATCH V1 6/6] mm : add new option MREMAP_DUP to mremap() syscall wenchaolinux
2013-05-09 14:13 ` [RFC PATCH V1 0/6] mm: add a new option MREMAP_DUP to mmrep syscall Mel Gorman
2013-05-10 2:28 ` wenchao
2013-05-10 5:11 ` Stefan Hajnoczi
2013-12-17 5:59 ` Xiao Guangrong
2013-12-30 20:23 ` Marcelo Tosatti
2013-12-31 12:06 ` Xiao Guangrong [this message]
2013-12-31 18:53 ` Marcelo Tosatti
2014-01-06 7:41 ` Xiao Guangrong
2013-05-10 9:22 ` Kirill A. Shutemov
2013-05-11 14:16 ` Pavel Emelyanov
2013-05-13 2:40 ` wenchao