From: Axel Rasmussen <axelrasmussen@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Brian Geffon <bgeffon@google.com>,
	 Christian Brauner <brauner@kernel.org>,
	David Hildenbrand <david@redhat.com>,
	 Gaosheng Cui <cuigaosheng1@huawei.com>,
	Huang Ying <ying.huang@intel.com>,
	 Hugh Dickins <hughd@google.com>,
	James Houghton <jthoughton@google.com>,
	 "Jan Alexander Steffens (heftig)" <heftig@archlinux.org>,
	Jiaqi Yan <jiaqiyan@google.com>,
	 Jonathan Corbet <corbet@lwn.net>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	 "Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	 Mike Kravetz <mike.kravetz@oracle.com>,
	"Mike Rapoport (IBM)" <rppt@kernel.org>,
	 Muchun Song <muchun.song@linux.dev>,
	Nadav Amit <namit@vmware.com>,
	 Naoya Horiguchi <naoya.horiguchi@nec.com>,
	Peter Xu <peterx@redhat.com>,
	 Ryan Roberts <ryan.roberts@arm.com>,
	Shuah Khan <shuah@kernel.org>,
	 Suleiman Souhlal <suleiman@google.com>,
	Suren Baghdasaryan <surenb@google.com>,
	 "T.J. Alumbaugh" <talumbau@google.com>,
	Yu Zhao <yuzhao@google.com>,  ZhangPeng <zhangpeng362@huawei.com>,
	linux-doc@vger.kernel.org,  linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,  linux-mm@kvack.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v4 1/8] mm: make PTE_MARKER_SWAPIN_ERROR more general
Date: Mon, 10 Jul 2023 14:59:55 -0700
Message-ID: <CAJHvVch5j=J=d-TqC1bgN6bKLrr0N3W7cwSOAqHf8O3axqapwA@mail.gmail.com>
In-Reply-To: <CAJHvVcgfN5RVXJ_f3tN2UinV_kWCMyCY_g5oKm=BtgQJz-e7gA@mail.gmail.com>

On Mon, Jul 10, 2023 at 10:19 AM Axel Rasmussen
<axelrasmussen@google.com> wrote:
>
> On Sat, Jul 8, 2023 at 6:08 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Fri,  7 Jul 2023 14:55:33 -0700 Axel Rasmussen <axelrasmussen@google.com> wrote:
> >
> > > Future patches will re-use PTE_MARKER_SWAPIN_ERROR to implement
> > > UFFDIO_POISON, so make some various preparations for that:
> > >
> > > First, rename it to just PTE_MARKER_POISONED. The "SWAPIN" can be
> > > confusing since we're going to re-use it for something not really
> > > related to swap. This can be particularly confusing for things like
> > > hugetlbfs, which doesn't support swap whatsoever. Also rename some
> > > various helper functions.
> > >
> > > Next, fix pte marker copying for hugetlbfs. Previously, it would WARN on
> > > seeing a PTE_MARKER_SWAPIN_ERROR, since hugetlbfs doesn't support swap.
> > > But, since we're going to re-use it, we want it to go ahead and copy it
> > > just like non-hugetlbfs memory does today. Since the code to do this is
> > > more complicated now, pull it out into a helper which can be re-used in
> > > both places. While we're at it, also make it slightly more explicit in
> > > its handling of e.g. uffd wp markers.
> > >
> > > For non-hugetlbfs page faults, instead of returning VM_FAULT_SIGBUS for
> > > an error entry, return VM_FAULT_HWPOISON. For most cases this change
> > > doesn't matter, e.g. a userspace program would receive a SIGBUS either
> > > way. But for UFFDIO_POISON, this change will let KVM guests get an MCE
> > > out of the box, instead of giving a SIGBUS to the hypervisor and
> > > requiring it to somehow inject an MCE.
> > >
> > > Finally, for hugetlbfs faults, handle PTE_MARKER_POISONED, and return
> > > VM_FAULT_HWPOISON_LARGE in such cases. Note that this can't happen today
> > > because the lack of swap support means we'll never end up with such a
> > > PTE anyway, but this behavior will be needed once such entries *can*
> > > show up via UFFDIO_POISON.
> > >
> > > --- a/include/linux/mm_inline.h
> > > +++ b/include/linux/mm_inline.h
> > > @@ -523,6 +523,25 @@ static inline bool mm_tlb_flush_nested(struct mm_struct *mm)
> > >       return atomic_read(&mm->tlb_flush_pending) > 1;
> > >  }
> > >
> > > +/*
> > > + * Computes the pte marker to copy from the given source entry into dst_vma.
> > > + * If no marker should be copied, returns 0.
> > > + * The caller should insert a new pte created with make_pte_marker().
> > > + */
> > > +static inline pte_marker copy_pte_marker(
> > > +             swp_entry_t entry, struct vm_area_struct *dst_vma)
> > > +{
> > > +     pte_marker srcm = pte_marker_get(entry);
> > > +     /* Always copy error entries. */
> > > +     pte_marker dstm = srcm & PTE_MARKER_POISONED;
> > > +
> > > +     /* Only copy PTE markers if UFFD register matches. */
> > > +     if ((srcm & PTE_MARKER_UFFD_WP) && userfaultfd_wp(dst_vma))
> > > +             dstm |= PTE_MARKER_UFFD_WP;
> > > +
> > > +     return dstm;
> > > +}
> >
> > Breaks the build with CONFIG_MMU=n (arm allnoconfig).  pte_marker isn't
> > defined.
> >
> > I'll slap #ifdef CONFIG_MMU around this function, but probably something more
> > fine-grained could be used, like CONFIG_PTE_MARKER_UFFD_WP.  Please
> > consider.
>
> Whoops, sorry about this. This function "ought" to be in
> include/linux/swapops.h where it would be inside a #ifdef CONFIG_MMU
> anyway, but it can't be because it uses userfaultfd_wp() so there'd be
> a circular include. I think just wrapping it in CONFIG_MMU is the
> right way.
>
> But, this has also made me realize we need to not advertise
> UFFDIO_POISON as supported unless we have CONFIG_MMU. I don't want
> HAVE_ARCH_USERFAULTFD_WP for that, because it's only enabled on
> x86_64, whereas I want to support at least arm64 as well. I don't see
> a strong reason not to just use CONFIG_MMU for this too; this feature
> depends on the API in swapops.h, which uses that ifdef, so I don't see
> a lot of value out of creating a new but equivalent config option.

Actually, I'm being silly. CONFIG_USERFAULTFD depends on CONFIG_MMU,
so we don't need to worry about most of this.

Andrew's fix to just wrap the helper in CONFIG_MMU is enough.
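
For illustration only, here is a minimal userspace sketch of the guarded helper, mirroring the logic of copy_pte_marker() from the quoted patch. It is not kernel code: the `mock_` names, `struct mock_vma`, the local `#define CONFIG_MMU`, and the flag bit values are all assumptions made so the fragment compiles on its own. The kernel version takes a swp_entry_t and a struct vm_area_struct instead.

```c
#include <stdbool.h>

/* Assumption: pretend this build has an MMU. In the kernel this symbol
 * comes from Kconfig, not from a #define in the header. */
#define CONFIG_MMU 1

#ifdef CONFIG_MMU
/* Stand-in for the kernel's pte_marker type. */
typedef unsigned long pte_marker;

/* Illustrative bit values only; do not rely on them matching the kernel. */
#define PTE_MARKER_UFFD_WP  (1UL << 0)
#define PTE_MARKER_POISONED (1UL << 1)

/* Hypothetical stand-in for struct vm_area_struct. */
struct mock_vma {
	bool uffd_wp_registered;
};

/* Hypothetical stand-in for userfaultfd_wp(). */
static inline bool mock_userfaultfd_wp(const struct mock_vma *vma)
{
	return vma->uffd_wp_registered;
}

/*
 * Same decision logic as the quoted copy_pte_marker(), but taking the
 * source marker directly instead of extracting it from a swap entry.
 * Returns 0 if no marker should be copied.
 */
static inline pte_marker mock_copy_pte_marker(pte_marker srcm,
					      const struct mock_vma *dst_vma)
{
	/* Always copy error (poison) markers. */
	pte_marker dstm = srcm & PTE_MARKER_POISONED;

	/* Copy the uffd-wp marker only if the destination is registered. */
	if ((srcm & PTE_MARKER_UFFD_WP) && mock_userfaultfd_wp(dst_vma))
		dstm |= PTE_MARKER_UFFD_WP;

	return dstm;
}
#endif /* CONFIG_MMU */
```

With the guard in place, a build where CONFIG_MMU is not defined never sees the pte_marker type or the helper, which is the failure mode reported for arm allnoconfig.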

>
> I'll make the needed changes (and also address Peter's comment above)
> and send out a v5.
>
> >
> > btw, both copy_pte_marker() and pte_install_uffd_wp_if_needed() look
> > far too large to justify inlining.  Please review the desirability of
> > this.

As far as inlining goes, I'm not opposed to un-inlining this; I was
mainly copying that pattern from existing helpers in swapops.h.

One question is, if it weren't inline, where should it go? There is no
mm/swapops.c, which would otherwise be the proper place for it, and I
don't see any other good place for the functions to go. The one I'm
introducing isn't userfaultfd-specific, so userfaultfd.c seems wrong.

> >
> >

