From: Axel Rasmussen
Date: Mon, 10 Jul 2023 10:19:54 -0700
Subject: Re: [PATCH v4 1/8] mm: make PTE_MARKER_SWAPIN_ERROR more general
To: Andrew Morton
Cc: Alexander Viro, Brian Geffon, Christian Brauner, David Hildenbrand,
 Gaosheng Cui, Huang Ying, Hugh Dickins, James Houghton,
 "Jan Alexander Steffens (heftig)", Jiaqi Yan, Jonathan Corbet,
 Kefeng Wang, "Liam R. Howlett", Miaohe Lin, Mike Kravetz,
 "Mike Rapoport (IBM)", Muchun Song, Nadav Amit, Naoya Horiguchi,
 Peter Xu, Ryan Roberts, Shuah Khan, Suleiman Souhlal,
 Suren Baghdasaryan, "T.J. Alumbaugh", Yu Zhao, ZhangPeng,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 linux-kselftest@vger.kernel.org
References: <20230707215540.2324998-1-axelrasmussen@google.com>
 <20230707215540.2324998-2-axelrasmussen@google.com>
 <20230708180850.bc938ab49fbfb38b83c367c8@linux-foundation.org>
In-Reply-To: <20230708180850.bc938ab49fbfb38b83c367c8@linux-foundation.org>

On Sat, Jul 8, 2023 at 6:08 PM Andrew Morton
wrote:
>
> On Fri, 7 Jul 2023 14:55:33 -0700 Axel Rasmussen wrote:
>
> > Future patches will re-use PTE_MARKER_SWAPIN_ERROR to implement
> > UFFDIO_POISON, so make some various preparations for that:
> >
> > First, rename it to just PTE_MARKER_POISONED. The "SWAPIN" can be
> > confusing since we're going to re-use it for something not really
> > related to swap. This can be particularly confusing for things like
> > hugetlbfs, which doesn't support swap whatsoever. Also rename some
> > various helper functions.
> >
> > Next, fix pte marker copying for hugetlbfs. Previously, it would WARN on
> > seeing a PTE_MARKER_SWAPIN_ERROR, since hugetlbfs doesn't support swap.
> > But, since we're going to re-use it, we want it to go ahead and copy it
> > just like non-hugetlbfs memory does today. Since the code to do this is
> > more complicated now, pull it out into a helper which can be re-used in
> > both places. While we're at it, also make it slightly more explicit in
> > its handling of e.g. uffd wp markers.
> >
> > For non-hugetlbfs page faults, instead of returning VM_FAULT_SIGBUS for
> > an error entry, return VM_FAULT_HWPOISON. For most cases this change
> > doesn't matter, e.g. a userspace program would receive a SIGBUS either
> > way. But for UFFDIO_POISON, this change will let KVM guests get an MCE
> > out of the box, instead of giving a SIGBUS to the hypervisor and
> > requiring it to somehow inject an MCE.
> >
> > Finally, for hugetlbfs faults, handle PTE_MARKER_POISONED, and return
> > VM_FAULT_HWPOISON_LARGE in such cases. Note that this can't happen today
> > because the lack of swap support means we'll never end up with such a
> > PTE anyway, but this behavior will be needed once such entries *can*
> > show up via UFFDIO_POISON.
> >
> > --- a/include/linux/mm_inline.h
> > +++ b/include/linux/mm_inline.h
> > @@ -523,6 +523,25 @@ static inline bool mm_tlb_flush_nested(struct mm_struct *mm)
> >  	return atomic_read(&mm->tlb_flush_pending) > 1;
> >  }
> >
> > +/*
> > + * Computes the pte marker to copy from the given source entry into dst_vma.
> > + * If no marker should be copied, returns 0.
> > + * The caller should insert a new pte created with make_pte_marker().
> > + */
> > +static inline pte_marker copy_pte_marker(
> > +		swp_entry_t entry, struct vm_area_struct *dst_vma)
> > +{
> > +	pte_marker srcm = pte_marker_get(entry);
> > +	/* Always copy error entries. */
> > +	pte_marker dstm = srcm & PTE_MARKER_POISONED;
> > +
> > +	/* Only copy PTE markers if UFFD register matches. */
> > +	if ((srcm & PTE_MARKER_UFFD_WP) && userfaultfd_wp(dst_vma))
> > +		dstm |= PTE_MARKER_UFFD_WP;
> > +
> > +	return dstm;
> > +}
>
> Breaks the build with CONFIG_MMU=n (arm allnoconfig). pte_marker isn't
> defined.
>
> I'll slap #ifdef CONFIG_MMU around this function, but probably something more
> fine-grained could be used, like CONFIG_PTE_MARKER_UFFD_WP. Please
> consider.

Whoops, sorry about this. This function "ought" to be in
include/linux/swapops.h where it would be inside a #ifdef CONFIG_MMU
anyway, but it can't be because it uses userfaultfd_wp() so there'd be
a circular include. I think just wrapping it in CONFIG_MMU is the
right way.

But, this has also made me realize we need to not advertise
UFFDIO_POISON as supported unless we have CONFIG_MMU. I don't want
HAVE_ARCH_USERFAULTFD_WP for that, because it's only enabled on
x86_64, whereas I want to support at least arm64 as well. I don't see
a strong reason not to just use CONFIG_MMU for this too; this feature
depends on the API in swapops.h, which uses that ifdef, so I don't see
a lot of value out of creating a new but equivalent config option.

I'll make the needed changes (and also address Peter's comment above)
and send out a v5.
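
For illustration only, here is a minimal userspace sketch of the guard
discussed above. It is not the v5 patch. The struct layout, the marker bit
values, the marker-typed first argument (the quoted helper takes a
swp_entry_t), the selfcheck helper, and uffd_poison_supported() are all
assumptions made so the example compiles outside the kernel; only the
copy rule and the CONFIG_MMU guard come from the thread.

```c
/* Userspace mock: types and bit values are stand-ins, not the kernel's. */
#include <assert.h>
#include <stdbool.h>

#define CONFIG_MMU 1	/* remove this line to mimic CONFIG_MMU=n */

/* Hypothetical one-field stand-in for the kernel's vm_area_struct. */
struct vm_area_struct { bool uffd_wp; };

#ifdef CONFIG_MMU
typedef unsigned long pte_marker;	/* only exists when there is an MMU */
#define PTE_MARKER_UFFD_WP	(1UL << 0)	/* illustrative bit values */
#define PTE_MARKER_POISONED	(1UL << 1)

/* Stand-in for the kernel predicate of the same name. */
static inline bool userfaultfd_wp(const struct vm_area_struct *vma)
{
	return vma->uffd_wp;
}

/* Same rule as the quoted helper, but takes a marker directly. */
static inline pte_marker copy_pte_marker(pte_marker srcm,
					 const struct vm_area_struct *dst_vma)
{
	/* Always copy error entries. */
	pte_marker dstm = srcm & PTE_MARKER_POISONED;

	/* Copy the uffd-wp bit only if dst_vma is registered for uffd-wp. */
	if ((srcm & PTE_MARKER_UFFD_WP) && userfaultfd_wp(dst_vma))
		dstm |= PTE_MARKER_UFFD_WP;

	return dstm;
}

/* Hypothetical self-check: without uffd-wp, only the poison bit survives. */
static inline void copy_pte_marker_selfcheck(void)
{
	struct vm_area_struct plain = { .uffd_wp = false };

	assert(copy_pte_marker(PTE_MARKER_POISONED | PTE_MARKER_UFFD_WP,
			       &plain) == PTE_MARKER_POISONED);
}
#endif /* CONFIG_MMU */

/* Hypothetical feature check: advertise poison support only with an MMU. */
static inline bool uffd_poison_supported(void)
{
#ifdef CONFIG_MMU
	return true;
#else
	return false;
#endif
}
```

With the #define removed, nothing that mentions pte_marker is compiled,
and the feature check reports false, which is the behavior the two
paragraphs above ask for from a single config symbol.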
>
> btw, both copy_pte_marker() and pte_install_uffd_wp_if_needed() look
> far too large to justify inlining. Please review the desirability of
> this.
>
>