linux-mm.kvack.org archive mirror
From: Pavel Emelyanov <xemul@virtuozzo.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Mike Rapoport <mike.rapoport@gmail.com>
Subject: Re: [PATCH 0/1] soft_dirty: fix soft_dirty during THP split
Date: Mon, 22 Aug 2016 19:35:32 +0300
Message-ID: <57BB29D4.8080100@virtuozzo.com>
In-Reply-To: <20160819143712.fehugpmnxmxyydi2@redhat.com>

On 08/19/2016 05:37 PM, Andrea Arcangeli wrote:
> On Fri, Aug 19, 2016 at 04:52:51PM +0300, Pavel Emelyanov wrote:
>> Hm... Are you talking about some in-kernel test, or just any? We have
>> tests in CRIU tree for UFFD (not sure we've wired up the non-cooperative
>> part though).
> 
> Nice. I wasn't aware you had uffd specific tests in CRIU, I'll check.
> 
> I was referring to the tools/testing/selftests/vm/userfault*, but I
> suppose it's fine in CRIU as well. A self-contained test suitable for
> testing/selftests would be nice too as not everyone will run CRIU tests
> to test the kernel.
> 
> Currently what's tested is anon missing, tmpfs missing and hugetlbfs
> missing and they all work (just fixed two tmpfs bugs yesterday thanks
> to the tmpfs test that crashed my workstation when I tried it, now it
> passes fine :).
> 
>> And my main worry about this is COW-sharing. If we have two tasks that
>> fork()-ed from each other and we try to lazily restore a page that
>> is still COW-ed between them, the uffd API doesn't give us anything to
>> do it. So we effectively break COW on lazy restore. Do you have any
>> ideas what can be done about it?
> 
> Building a shared page is tricky, not even khugepaged was doing that
> for anon.
> 
> Kirill extended khugepaged to do it, along with the THP on tmpfs support,
> as it's more important for tmpfs (I haven't yet checked if it landed
> upstream with the rest of tmpfs in 4.8-rc though).
> 
> The main API problem is that the uffd is different between parent and
> child: fork with your non-cooperative patches gives you a new uffd
> that represents the child mm.

Yes.

> To create a shared page among two "mm" the API should be able to
> specify the two "mm" and two "addresses" atomically in the same
> ioctl. And the uffd _is_ the "mm" with the current API.

Well, with the current approach the mm equals the uffd file, so passing
one uffd descriptor into another's ioctl should do the trick.

> So what it takes to do it is to add a UFFDIO_COPY_COW that takes as
> parameter an address for the current "uffd" and a list of "int uffd,
> unsigned long address" pairs.

Yup :)
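
To make the shape of such an argument concrete, here is a rough sketch in C.
Everything in it is hypothetical: UFFDIO_COPY_COW does not exist, and the
struct names, field names and the copy_cow_request_ok() helper are made up
for illustration. Only the general dst/src/len/mode/copy shape is borrowed
from the existing struct uffdio_copy. The helper only shows the kind of
sanity checks such a request would probably need (page alignment, valid
descriptors); it is not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * HYPOTHETICAL layout: UFFDIO_COPY_COW and both structs below are not
 * part of <linux/userfaultfd.h>; they only illustrate the idea.
 */
struct uffdio_cow_target {
	int32_t  uffd;     /* userfaultfd that stands for another mm */
	uint32_t pad;      /* keeps address 8-byte aligned */
	uint64_t address;  /* page-aligned destination in that mm */
};

struct uffdio_copy_cow {
	uint64_t dst;        /* destination in the mm of the uffd the ioctl is issued on */
	uint64_t src;        /* source buffer in the caller */
	uint64_t len;        /* bytes to copy, multiple of the page size */
	uint64_t targets;    /* user pointer to an array of uffdio_cow_target */
	uint32_t nr_targets; /* number of entries in that array */
	uint32_t mode;
	int64_t  copy;       /* out: bytes copied or -errno, as in struct uffdio_copy */
};

/* Returns 1 if the (hypothetical) request is well-formed, 0 otherwise. */
int copy_cow_request_ok(const struct uffdio_copy_cow *req, uint64_t page_size)
{
	const struct uffdio_cow_target *t;
	uint64_t mask;
	uint32_t i;

	/* The alignment checks below rely on a power-of-two page size. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	mask = page_size - 1;

	if (req == NULL || req->len == 0)
		return 0;
	if ((req->dst & mask) || (req->len & mask))
		return 0;
	if (req->nr_targets != 0 && req->targets == 0)
		return 0;

	/* Every extra (uffd, address) pair must name a valid fd and page. */
	t = (const struct uffdio_cow_target *)(uintptr_t)req->targets;
	for (i = 0; i < req->nr_targets; i++) {
		if (t[i].uffd < 0 || (t[i].address & mask))
			return 0;
	}
	return 1;
}
```

The manager would fill one uffdio_cow_target per task that shares the page
and issue the ioctl on the uffd of the remaining task.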

> Even with plain UFFDIO_COPY things should still work solidly, it'll just
> take more memory and it'll break COW during restore. The important
> thing is "break" is as in "allocate more memory", not as in "crashing" :).
> 
>> We have ... readiness to do it :) since once CRIU hits this we'll have to.
> 
> Ok great.
> 
> I also thought about it a bit and I think it's just a matter of
> specifying which uffd should get the notification first. The manager
> will then take the notification first and call a
> UFFDIO_FAULT_PASS to cascade to the second uffd registered in the
> region if the page was missing in the source container, without waking
> up the task blocked in handle_userfault. To find out whether the page
> is missing in the source container you could use pagemap.
> 
> Thanks,
> Andrea
> .
> 
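
For the pagemap check mentioned above, a minimal sketch in C could look like
the following. The bit positions (63 for "present", 62 for "swapped") are the
ones documented for /proc/PID/pagemap. The function names are invented for
this example, and treating "neither present nor swapped" as "missing" is an
assumption about what the manager would want, not something settled in this
thread.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Bits of a 64-bit /proc/PID/pagemap entry. */
#define PM_PRESENT (1ULL << 63)  /* page is present in RAM */
#define PM_SWAP    (1ULL << 62)  /* page is swapped out */

/* Assumption: "missing" means neither present nor in swap. */
int pm_entry_missing(uint64_t entry)
{
	return (entry & (PM_PRESENT | PM_SWAP)) == 0;
}

/*
 * Read the pagemap entry that covers vaddr. pagemap_fd is an open
 * descriptor for /proc/PID/pagemap of the task being inspected.
 * Returns 0 on success, -1 on error or short read.
 */
int pm_read_entry(int pagemap_fd, uintptr_t vaddr, uint64_t *entry)
{
	long page_size = sysconf(_SC_PAGESIZE);
	off_t offset;
	ssize_t n;

	assert(page_size > 0);
	/* One 8-byte entry per virtual page, indexed by page number. */
	offset = (off_t)(vaddr / (uintptr_t)page_size) * (off_t)sizeof(*entry);
	n = pread(pagemap_fd, entry, sizeof(*entry), offset);
	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

A caller would open /proc/PID/pagemap of the source task read-only, call
pm_read_entry() for the faulting address, and pass the result to
pm_entry_missing() before deciding whether to cascade the fault.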

