linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Hugh Dickins <hughd@google.com>, David Hildenbrand <david@redhat.com>
Cc: David Hildenbrand <david@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Ives van Hoorne <ives@codesandbox.io>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@goole.com>,
	Alistair Popple <apopple@nvidia.com>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	Nadav Amit <nadav.amit@gmail.com>,
	Andrea Arcangeli <aarcange@redhat.com>
Subject: Re: [PATCH RFC] mm/userfaultfd: enable writenotify while userfaultfd-wp is enabled for a VMA
Date: Tue, 6 Dec 2022 16:18:28 -0500	[thread overview]
Message-ID: <Y4+xpCRQCazCymdS@x1n> (raw)
In-Reply-To: <22d8e8ac-d75-a66-2650-b4d59f89855e@google.com>

Hi, Hugh,

On Tue, Dec 06, 2022 at 11:09:40AM -0800, Hugh Dickins wrote:
> I have not been following the uffd-wp work, but I believe that David's
> painstaking and excellent account of vm_page_prot is correct.  Peter,
> please I beg you to follow his advice and go for (1) for uffd-wp.

Thanks for jumping in, I will.  Your input definitely matters.

I think the fundamental moot point was whether vm_page_prot is the "safest"
or "default" behavior of vma.

I thought it was the "default" and I don't want to have VM_UFFD_WP regions
to have default no-write attribute comparing to generic ones.  I wanted it
to behave exactly like a normal shmem vma if no one explicitly does
something else.

If it's "safest" and you also agree then I think indeed VM_UFFD_WP falls
into that category; frankly I don't have a solid clue before.  Maybe that
also matches better with the recent vma_wants_manual_pte_write_upgrade(),
or we start to lose write bit in do_numa_page() at least for uffd-wp
protected ranges when recovering the ptes, and it'll be ugly to have:

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3a3d23b3dbe2..718d540b9eb4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2086,7 +2086,7 @@ static inline bool vma_wants_manual_pte_write_upgrade(struct vm_area_struct *vma
         * private mappings, that's always the case when we have write
         * permissions as we properly have to handle COW.
         */
-       if (vma->vm_flags & VM_SHARED)
+       if ((vma->vm_flags & VM_SHARED) && !userfaultfd_wp(vma))
                return vma_wants_writenotify(vma, vma->vm_page_prot);
        return !!(vma->vm_flags & VM_WRITE);

To recover the write bit.

I tried to move forward with the 47ba8bcc33b2 ("mm/migrate: fix read-only
page got writable when recover pte", 2022-11-25) in mm-unstable upstream
because that seems pretty obvious to me and that's verified in the test
beds, but I obviously failed anyway.  Let's adjust the patches with the
best we can have.

> I do not share David's faith in "documented": documented or not,
> depart from safe convention and you will be adding (at least the
> opportunity for) serious bugs.

Yes I agree not having the write bit is safer.  That's also why I wanted to
move the wrprotect into pte_mkuffd_wp.  I assume from "resolving problem"
pov they're the same, but I'll follow your advise.

David, would you please repost a new patch for this one and copy Ives to
make sure it'll work for him in his systems?

I'd suggest to drop the mprotect example, I'll reply later on that to the
other email but I still don't know whether it's a good example for a reader
to understand the problem.

No reproducer needed for numa I think - I guess Ives's test case would be
far enough to verify it if possible.  I also hope what Ives saw was the
numa balancing issue you reported, so maybe it'll resolve all problem he
has.  Then with that verified and queued we can drop the mm/migrate patch.

Thanks,

-- 
Peter Xu



  reply	other threads:[~2022-12-06 21:18 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-02 12:27 David Hildenbrand
2022-12-02 16:33 ` Peter Xu
2022-12-02 16:56   ` David Hildenbrand
2022-12-02 17:11     ` David Hildenbrand
2022-12-05 21:08       ` Peter Xu
2022-12-06  0:46         ` [PATCH] mm/uffd: Always wr-protect pte in pte_mkuffd_wp() kernel test robot
2022-12-06 16:21           ` Peter Xu
2022-12-06 11:43         ` kernel test robot
2022-12-06 16:28         ` [PATCH RFC] mm/userfaultfd: enable writenotify while userfaultfd-wp is enabled for a VMA David Hildenbrand
2022-12-06 19:09           ` Hugh Dickins
2022-12-06 21:18             ` Peter Xu [this message]
2022-12-07 15:32               ` David Hildenbrand
2022-12-07 17:43                 ` Peter Xu
2022-12-07 19:53                   ` David Hildenbrand
2022-12-07 20:14                     ` Peter Xu
2022-12-06 21:27           ` Peter Xu
2022-12-07 13:33             ` David Hildenbrand
2022-12-07 15:59               ` Peter Xu
2022-12-07 20:10                 ` David Hildenbrand
2022-12-08 15:17                   ` Peter Xu
2022-12-06 18:38         ` [PATCH] mm/uffd: Always wr-protect pte in pte_mkuffd_wp() kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y4+xpCRQCazCymdS@x1n \
    --to=peterx@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=david@redhat.com \
    --cc=hughd@google.com \
    --cc=hughd@goole.com \
    --cc=ives@codesandbox.io \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nadav.amit@gmail.com \
    --cc=rppt@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox