Date: Tue, 29 Oct 2024 17:02:23 +0000
From: Catalin Marinas
To: Lorenzo Stoakes
Cc: Vlastimil Babka, Linus Torvalds, "Liam R. Howlett", Mark Brown,
	Andrew Morton, Jann Horn, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, Peter Xu, linux-arm-kernel@lists.infradead.org,
	Will Deacon, Aishwarya TCV
Subject: Re: [PATCH hotfix 6.12 v2 4/8] mm: resolve faulty mmap_region() error path behaviour
References: <0b64edb9-491e-4dcd-8dc1-d3c8a336a49b@suse.cz> <1608957a-d138-4401-98ef-7fbe5fb7c387@suse.cz>

On Tue, Oct 29, 2024 at 04:36:32PM +0000, Lorenzo Stoakes wrote:
> On Tue, Oct 29, 2024 at 04:22:42PM +0000, Catalin Marinas wrote:
> > On Tue, Oct 29, 2024 at 03:16:00PM +0000, Lorenzo Stoakes wrote:
> > > On Tue, Oct 29, 2024 at 03:04:41PM +0000, Catalin Marinas wrote:
> > > > On Mon, Oct 28, 2024 at 10:14:50PM +0000, Lorenzo Stoakes wrote:
> > > > > So continue to check VM_MTE_ALLOWED which arch_calc_vm_flag_bits() sets if
> > > > > MAP_ANON.
> > > > [...]
> > > > > diff --git a/mm/shmem.c b/mm/shmem.c
> > > > > index 4ba1d00fabda..e87f5d6799a7 100644
> > > > > --- a/mm/shmem.c
> > > > > +++ b/mm/shmem.c
> > > > > @@ -2733,9 +2733,6 @@ static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
> > > > >  	if (ret)
> > > > >  		return ret;
> > > > >
> > > > > -	/* arm64 - allow memory tagging on RAM-based files */
> > > > > -	vm_flags_set(vma, VM_MTE_ALLOWED);
> > > >
> > > > This breaks arm64 KVM if the VMM uses shared mappings for the memory
> > > > slots (which is possible). We have kvm_vma_mte_allowed() that checks for
> > > > the VM_MTE_ALLOWED flag as the VMM may not use PROT_MTE/VM_MTE directly.
> > >
> > > Ugh yup missed that thanks.
> > >
> > > > I need to read this thread properly but why not pass the file argument
> > > > to arch_calc_vm_flag_bits() and set VM_MTE_ALLOWED in there?
> > >
> > > Can't really do that as it is entangled in a bunch of other stuff,
> > > e.g. calc_vm_prot_bits() would have to pass file and that's used in a bunch
> > > of places including arch code and... etc. etc.
> >
> > Not calc_vm_prot_bits() but calc_vm_flag_bits().
> > arch_calc_vm_flag_bits() is only implemented by two architectures -
> > arm64 and parisc and calc_vm_flag_bits() is only called from do_mmap().
> >
> > Basically we want to set VM_MTE_ALLOWED early during the mmap() call
> > and, at the time, my thinking was to do it in calc_vm_flag_bits(). The
> > calc_vm_prot_bits() OTOH is also called on the mprotect() path and is
> > responsible for translating PROT_MTE into a VM_MTE flag without any
> > checks. arch_validate_flags() would check if VM_MTE comes together with
> > VM_MTE_ALLOWED. But, as in the KVM case, that's not the only function
> > checking VM_MTE_ALLOWED.
> >
> > Since calc_vm_flag_bits() did not take a file argument, the lazy
> > approach was to add the flag explicitly for shmem (and hugetlbfs in
> > -next). But I think it would be easier to just add the file argument to
> > calc_vm_flag_bits() and do the check in the arch code to return
> > VM_MTE_ALLOWED. AFAICT, this is called before mmap_region() and
> > arch_validate_flags() (unless I missed something in the recent
> > reworking).
>
> I mean I totally get why you're suggesting it

Not sure ;)

> - it's the right _place_ but...
> It would require changes to a ton of code which is no good for a backport
> and we don't _need_ to do it.
>
> I'd rather do the smallest delta at this point, as I am not a huge fan of
> sticking it in here (I mean your point is wholly valid - it's at a better
> place to do so and we can change flags here, it's just - it's not where you
> expect to do this obviously).
>
> I mean for instance in arch/x86/kernel/cpu/sgx/encl.c (a file I'd _really_
> like us not to touch here by the way) we'd have to what pass NULL?

That's calc_vm_prot_bits().
I suggested calc_vm_flag_bits() (I know, confusing names and total lack
of inspiration when we added MTE support). The latter is only called in
one place - do_mmap().

That's what I meant (untested, on top of -next as it has a MAP_HUGETLB
check in there). I don't think it's much worse than your proposal,
assuming that it works:

diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
index 1dbfb56cb313..358bbaaafd41 100644
--- a/arch/arm64/include/asm/mman.h
+++ b/arch/arm64/include/asm/mman.h
@@ -6,6 +6,8 @@

 #ifndef BUILD_VDSO
 #include
+#include
+#include
 #include

 static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
@@ -31,7 +33,7 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
 }
 #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)

-static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
+static inline unsigned long arch_calc_vm_flag_bits(struct file *file, unsigned long flags)
 {
 	/*
 	 * Only allow MTE on anonymous mappings as these are guaranteed to be
@@ -39,12 +41,12 @@ static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
 	 * filesystem supporting MTE (RAM-based).
 	 */
 	if (system_supports_mte() &&
-	    (flags & (MAP_ANONYMOUS | MAP_HUGETLB)))
+	    (flags & (MAP_ANONYMOUS | MAP_HUGETLB) || shmem_file(file)))
 		return VM_MTE_ALLOWED;

 	return 0;
 }
-#define arch_calc_vm_flag_bits(flags) arch_calc_vm_flag_bits(flags)
+#define arch_calc_vm_flag_bits(file, flags) arch_calc_vm_flag_bits(file, flags)

 static inline bool arch_validate_prot(unsigned long prot,
 	unsigned long addr __always_unused)
diff --git a/arch/parisc/include/asm/mman.h b/arch/parisc/include/asm/mman.h
index 89b6beeda0b8..663f587dc789 100644
--- a/arch/parisc/include/asm/mman.h
+++ b/arch/parisc/include/asm/mman.h
@@ -2,6 +2,7 @@
 #ifndef __ASM_MMAN_H__
 #define __ASM_MMAN_H__

+#include
 #include

 /* PARISC cannot allow mdwe as it needs writable stacks */
@@ -11,7 +12,7 @@ static inline bool arch_memory_deny_write_exec_supported(void)
 }
 #define arch_memory_deny_write_exec_supported arch_memory_deny_write_exec_supported

-static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
+static inline unsigned long arch_calc_vm_flag_bits(struct file *file, unsigned long flags)
 {
 	/*
 	 * The stack on parisc grows upwards, so if userspace requests memory
@@ -23,6 +24,6 @@ static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)

 	return 0;
 }
-#define arch_calc_vm_flag_bits(flags) arch_calc_vm_flag_bits(flags)
+#define arch_calc_vm_flag_bits(file, flags) arch_calc_vm_flag_bits(file, flags)

 #endif /* __ASM_MMAN_H__ */
diff --git a/include/linux/mman.h b/include/linux/mman.h
index 8ddca62d6460..a842783ffa62 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -2,6 +2,7 @@
 #ifndef _LINUX_MMAN_H
 #define _LINUX_MMAN_H

+#include
 #include
 #include

@@ -94,7 +95,7 @@ static inline void vm_unacct_memory(long pages)
 #endif

 #ifndef arch_calc_vm_flag_bits
-#define arch_calc_vm_flag_bits(flags) 0
+#define arch_calc_vm_flag_bits(file, flags) 0
 #endif

 #ifndef arch_validate_prot
@@ -151,13 +152,13 @@ calc_vm_prot_bits(unsigned long prot, unsigned long pkey)
  * Combine the mmap "flags" argument into "vm_flags" used internally.
  */
 static inline unsigned long
-calc_vm_flag_bits(unsigned long flags)
+calc_vm_flag_bits(struct file *file, unsigned long flags)
 {
 	return _calc_vm_trans(flags, MAP_GROWSDOWN, VM_GROWSDOWN ) |
 	       _calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) |
 	       _calc_vm_trans(flags, MAP_SYNC, VM_SYNC ) |
 	       _calc_vm_trans(flags, MAP_STACK, VM_NOHUGEPAGE) |
-	       arch_calc_vm_flag_bits(flags);
+	       arch_calc_vm_flag_bits(file, flags);
 }

 unsigned long vm_commit_limit(void);
diff --git a/mm/mmap.c b/mm/mmap.c
index f102314bb500..f904b3bba962 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -344,7 +344,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 	 * to. we assume access permissions have been handled by the open
 	 * of the memory object, so we don't do any here.
 	 */
-	vm_flags |= calc_vm_prot_bits(prot, pkey) | calc_vm_flag_bits(flags) |
+	vm_flags |= calc_vm_prot_bits(prot, pkey) | calc_vm_flag_bits(file, flags) |
 			mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;

 	/* Obtain the address to map to. we verify (or select) it and ensure
diff --git a/mm/nommu.c b/mm/nommu.c
index 635d028d647b..e9b5f527ab5b 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -842,7 +842,7 @@ static unsigned long determine_vm_flags(struct file *file,
 {
 	unsigned long vm_flags;

-	vm_flags = calc_vm_prot_bits(prot, 0) | calc_vm_flag_bits(flags);
+	vm_flags = calc_vm_prot_bits(prot, 0) | calc_vm_flag_bits(file, flags);

 	if (!file) {
 		/*
diff --git a/mm/shmem.c b/mm/shmem.c
index f24a0f34723e..ff194341fddb 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2737,9 +2737,6 @@ static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
 	if (ret)
 		return ret;

-	/* arm64 - allow memory tagging on RAM-based files */
-	vm_flags_set(vma, VM_MTE_ALLOWED);
-
 	file_accessed(file);
 	/* This is anonymous shared memory if it is unlinked at the time of mmap */
 	if (inode->i_nlink)

-- 
Catalin