From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D15FC48260 for ; Tue, 13 Feb 2024 13:33:52 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0EE298D000F; Tue, 13 Feb 2024 08:33:52 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 09E1D8D0001; Tue, 13 Feb 2024 08:33:52 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E33468D000F; Tue, 13 Feb 2024 08:33:51 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id CE32E8D0001 for ; Tue, 13 Feb 2024 08:33:51 -0500 (EST) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 964561C10AE for ; Tue, 13 Feb 2024 13:33:51 +0000 (UTC) X-FDA: 81786873462.20.1707186 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf26.hostedemail.com (Postfix) with ESMTP id B02DE14000B for ; Tue, 13 Feb 2024 13:33:49 +0000 (UTC) Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b="NjW66I/R"; spf=pass (imf26.hostedemail.com: domain of ardb@kernel.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=ardb@kernel.org; dmarc=pass (policy=none) header.from=kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1707831229; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=8a/Vc4TXtqK6gkRK7UgQ33nYv0qwgMBTh6B6I+ok7t4=; b=TT7G9iLje6HaZ/dJUxPzIAnl7I5B0aMe2b+zybq05f0byV7Xzc6ncyy1ch/UrtHkgu50Sc L9/kkBPG+cCCdiC1xrdtp/AvzG8oOJsUvmrdN9YL1mqn7QEQmaW87Mdzz3SfhpxTclhmU3 S4jKUv1MPDbr6lPPh4f54aifrkD/cuY= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1707831229; a=rsa-sha256; cv=none; b=m05Bolbx6b0Hb09k8vAAv7u9kRbhsmBdDxTG6hChWH7+imhbznL5OovYumotJHmGKBUgNT ST1CQZNiMXlLle4dQfSKQ3hWKsnPf19fLw2joO++vvi5gvHC/GxZTrTXVx6oLDqzhhKdhJ j71j+A54eZrK3zp7jqxTdaHvG9l4zHY= ARC-Authentication-Results: i=1; imf26.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b="NjW66I/R"; spf=pass (imf26.hostedemail.com: domain of ardb@kernel.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=ardb@kernel.org; dmarc=pass (policy=none) header.from=kernel.org Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by dfw.source.kernel.org (Postfix) with ESMTP id 75B8C6149D for ; Tue, 13 Feb 2024 13:33:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 20F52C43394 for ; Tue, 13 Feb 2024 13:33:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1707831228; bh=19Y5ahn44aBoKaRpMWdk3LKXXQE0FO77x5Y2b3wdKaw=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=NjW66I/RyK5Frul2lsV16RtBUelOEL2efXL0HqOYl4F16m7wLEGAHyV4kBDhXEsiT gAM9cdHcPSKpwelVKU8x0KBBx3UsrmVTKSa1jgCnIy87XFxBbUd0jdRBp85YdpUu0X 0MyUq2MKRjgo4BqRtwj16TRwyGe8EifSBGdTKzooV6APrXmVrhhG8N/8SsZL5S2Hk7 pCHfsKiZmZJBgKaNtdny7oGew3cHzeUe3G9H32susD2FhAS1sro8Ox9Hfa1B6eBDiL DJ4SXJmH97QizhHKLRswJkmfT482ugHVyyLl8iQxEI+0+bdR+MlxIIUxAFDU9LUL8v 1NVgpw3xO/rww== Received: by mail-lf1-f53.google.com with SMTP id 2adb3069b0e04-5101cd91017so1060607e87.2 for ; Tue, 13 Feb 2024 05:33:48 -0800 (PST) X-Forwarded-Encrypted: i=1; AJvYcCWQx874njouzO5C4SBF77OxGDy+UKIarBb8ubXj9VQ1akT/PdYz0COl2MEiK+DGi3OQaa1x6QdfUzkuuGLXKxWcTJE= X-Gm-Message-State: AOJu0Yyk7QgEDx4syo9hNQ2vewwTpKGnBDxuzY09wMp+7l7TqoiqkR0P S9IkmiFunrOxUkXoYe9ta99VX2KDO/SV5LQObVdpCxkSvvKXMaWDG8c/Dvi0d0oCVwxOBPDf1Uh W9hCAHAw7vuRwIuYD7Q/xtbUG5Ac= X-Google-Smtp-Source: AGHT+IG4e5NfncV3RTrOH8gQhvaJJElu+/4o3kIRvOaccoBXLTJkqtjEBr950PGwXa4lfyB9fWPqF6zsh0CZ/o4MARQ= X-Received: by 2002:ac2:4318:0:b0:511:7b36:933e with SMTP id l24-20020ac24318000000b005117b36933emr5957570lfh.58.1707831226284; Tue, 13 Feb 2024 05:33:46 -0800 (PST) MIME-Version: 1.0 References: <20240202080756.1453939-1-ryan.roberts@arm.com> <20240202080756.1453939-20-ryan.roberts@arm.com> <64395ae4-3a7d-45dd-8f1d-ea6b232829c5@arm.com> <41499621-482f-455b-9f68-b43ea8052557@redhat.com> <1d302d7a-50ab-4ab4-b049-75ed4a71a87d@arm.com> <99e2a92c-f2a2-4e1e-8ce2-08caae2cb7e4@redhat.com> In-Reply-To: From: Ard Biesheuvel Date: Tue, 13 Feb 2024 14:33:34 +0100 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH v5 19/25] arm64/mm: Wire up PTE_CONT for user mappings To: Ryan Roberts Cc: David Hildenbrand , Mark Rutland , Catalin Marinas , Will Deacon , Marc Zyngier , James Morse , Andrey Ryabinin , Andrew Morton , Matthew Wilcox , Kefeng Wang , John Hubbard , Zi Yan , Barry Song <21cnbao@gmail.com>, Alistair Popple , Yang Shi , Nicholas Piggin , Christophe Leroy , "Aneesh Kumar K.V" , "Naveen N. Rao" , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. Peter Anvin" , linux-arm-kernel@lists.infradead.org, x86@kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: B02DE14000B X-Rspam-User: X-Rspamd-Server: rspam11 X-Stat-Signature: dyn5wrtftib8eg4dtifacnkno434sp8x X-HE-Tag: 1707831229-469747 X-HE-Meta: U2FsdGVkX19USt34thJBAR7fnbhS6mobCUUCzlWY+6G1Qpv7Ps+v1cs28kgEy3QPqT7YWR1XkEZCicLbeo8S28t+DI8WCwgs9rvv1dfqyZZEHoBDcBYUe99Ag/Is7OmpDfCg9uxztjQ8sestNPTPX/ly+kevcOb0Y/rTVnbm2YdTCMNI7wOzi5bK/zz0JTeVQVmhnFsmFUuwdzRL5epp99w7vQNhRciBprSCWdX9kO4IVPbpPvsvQlvRfSbl6ogsju2Yn/DapFJJmittQCnx21qp267yT2QsJxm8XR2kHZz0p9sQN1N+MYdRKcllKgaNgHvjjUsx0ol8uiEljJ0VA55xQ3u7vaB1TIBQXKCtxQauDgChMVXQ8Cs6Z+CnvTcQPmqjJGiEzb9vcTY8QEsK4SxUz63cIZ3XVvgnuPRVNmh4HPWyhUaH/Dt/M32Fi+uZUHjW06jHJo5r5B3xsU4f+PwVHPbLNzNxBqYVSDKVnR7YVAWL6qFHvfF05QhTa7dgMlQlpyDOfAggEAZSFGIEweNYY/hPOC5e4UDpbTtePeYNb450EZeNa2DQ2ACOsbABf9Qxf2WhbTP6CY1jLE5xv7SQNnP5dQdnGwyp/Y2RvyKNMOwD/i5TmaCFHnOW2Ebju3vXG6ij8qYVoPOEBhyi8DTDSI0iMo6tpF80ADx+TQLBFFqBuj17bn3KHpkxSpxl+1zZqkBEHj22RDkQ22QXvFyCnryp0n11hDq3/ARklvKQarx4vRSieA1DNsGbIOhfw/l0Yb8/V3zGCMlc0S2FtsFxT/Z+trKPLLSUITn3aRbPdLKv0uEHRhQeIw8InfQwgqZcYkjBLDV7Me1P1JS+412ntLHkVQwMmtdRG/LoupSydRRiypWckK0fd1mG+rlFL341vv52KLFBihvnRu1rX6fbNgqIWTf7EYDbY6HpVyjmWI1PfnfjV9druhjZPmT9hKYsXEFDscKposxTQj0 ssxjFnUM M0gep/ZETylWVES9x2wKNpZkH6QKsr3TJAzDB6MPzNeqUqZw0Z52BVVD18YNFIDYapDFr X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, 13 Feb 2024 at 14:21, Ryan Roberts wrote: > > On 13/02/2024 13:13, David Hildenbrand wrote: > > On 13.02.24 14:06, Ryan Roberts wrote: > >> On 13/02/2024 12:19, David Hildenbrand wrote: > >>> On 13.02.24 13:06, Ryan Roberts wrote: > >>>> On 12/02/2024 20:38, Ryan Roberts wrote: > >>>>> [...] > >>>>> > >>>>>>>>> +static inline bool mm_is_user(struct mm_struct *mm) > >>>>>>>>> +{ > >>>>>>>>> + /* > >>>>>>>>> + * Don't attempt to apply the contig bit to kernel mapping= s, because > >>>>>>>>> + * dynamically adding/removing the contig bit can cause pa= ge faults. > >>>>>>>>> + * These racing faults are ok for user space, since they g= et > >>>>>>>>> serialized > >>>>>>>>> + * on the PTL. But kernel mappings can't tolerate faults. > >>>>>>>>> + */ > >>>>>>>>> + return mm !=3D &init_mm; > >>>>>>>>> +} > >>>>>>>> > >>>>>>>> We also have the efi_mm as a non-user mm, though I don't think w= e > >>>>>>>> manipulate > >>>>>>>> that while it is live, and I'm not sure if that needs any specia= l handling. > >>>>>>> > >>>>>>> Well we never need this function in the hot (order-0 folio) path,= so I > >>>>>>> think I > >>>>>>> could add a check for efi_mm here with performance implication. I= t's > >>>>>>> probably > >>>>>>> safest to explicitly exclude it? What do you think? > >>>>>> > >>>>>> Oops: This should have read "I think I could add a check for efi_m= m here > >>>>>> *without* performance implication" > >>>>> > >>>>> It turns out that efi_mm is only defined when CONFIG_EFI is enabled= . I can do > >>>>> this: > >>>>> > >>>>> return mm !=3D &init_mm && (!IS_ENABLED(CONFIG_EFI) || mm !=3D &efi= _mm); > >>>>> > >>>>> Is that acceptable? This is my preference, but nothing else outside= of efi > >>>>> references this symbol currently. > >>>>> > >>>>> Or perhaps I can convince myself that its safe to treat efi_mm like= userspace. > >>>>> There are a couple of things that need to be garanteed for it to be= safe: > >>>>> > >>>>> - The PFNs of present ptes either need to have an associated st= ruct > >>>>> page or > >>>>> need to have the PTE_SPECIAL bit set (either pte_mkspecial() = or > >>>>> pte_mkdevmap()) > >>>>> > >>>>> - Live mappings must either be static (no changes that could ca= use > >>>>> fold/unfold > >>>>> while live) or the system must be able to tolerate a temporar= y fault > >>>>> > >>>>> Mark suggests efi_mm is not manipulated while live, so that meets t= he latter > >>>>> requirement, but I'm not sure about the former? > >>>> > >>>> I've gone through all the efi code, and conclude that, as Mark sugge= sts, the > >>>> mappings are indeed static. And additionally, the ptes are populated= using only > >>>> the _private_ ptep API, so there is no issue here. As just discussed= with Mark, > >>>> my prefereence is to not make any changes to code, and just add a co= mment > >>>> describing why efi_mm is safe. > >>>> > >>>> Details: > >>>> > >>>> * Registered with ptdump > >>>> * ptep_get_lockless() > >>>> * efi_create_mapping -> create_pgd_mapping =E2=80=A6 -> init_pte: > >>>> * __ptep_get() > >>>> * __set_pte() > >>>> * efi_memattr_apply_permissions -> efi_set_mapping_permissions =E2= =80=A6 -> > >>>> set_permissions > >>>> * __ptep_get() > >>>> * __set_pte() > >>> > >>> Sound good. We could add some VM_WARN_ON if we ever get the efi_mm vi= a the > >>> "official" APIs. > >> > >> We could, but that would lead to the same linkage issue, which I'm try= ing to > >> avoid in the first place: > >> > >> VM_WARN_ON(IS_ENABLED(CONFIG_EFI) && mm =3D=3D efi_mm); > >> > >> This creates new source code dependencies, which I would rather avoid = if > >> possible. > > > > Just a thought, you could have a is_efi_mm() function that abstracts al= l that. > > > > diff --git a/include/linux/efi.h b/include/linux/efi.h > > index c74f47711f0b..152f5fa66a2a 100644 > > --- a/include/linux/efi.h > > +++ b/include/linux/efi.h > > @@ -692,6 +692,15 @@ extern struct efi { > > > > extern struct mm_struct efi_mm; > > > > +static inline void is_efi_mm(struct mm_struct *mm) > > +{ > > +#ifdef CONFIG_EFI > > + return mm =3D=3D &efi_mm; > > +#else > > + return false; > > +#endif > > +} > > + > > static inline int > > efi_guidcmp (efi_guid_t left, efi_guid_t right) > > { > > > > > > That would definitely work, but in that case, I might as well just check = for it > in mm_is_user() (and personally I would change the name to mm_is_efi()): > > > static inline bool mm_is_user(struct mm_struct *mm) > { > return mm !=3D &init_mm && !mm_is_efi(mm); > } > > Any objections? > Any reason not to use IS_ENABLED(CONFIG_EFI) in the above? The extern declaration is visible to the compiler, and any references should disappear before the linker could notice that efi_mm does not exist. In any case, feel free to add Acked-by: Ard Biesheuvel when you roll a patch based on the above, with or without IS_ENABLED(). And as you concluded, efi_mm is indeed set in stone once the runtime regions described by the firmware have been mapped, although this may happen in two passes depending on how the runtime regions are described. But by the time user MMs might exist, efi_mm should effectively be immutable.