* [PATCH hotfix 6.12 v4 0/5] fix error handling in mmap_region() and refactor (hotfixes)
@ 2024-10-29 18:11 Lorenzo Stoakes
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 1/5] mm: avoid unsafe VMA hook invocation when error arises on mmap hook Lorenzo Stoakes
` (4 more replies)
0 siblings, 5 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2024-10-29 18:11 UTC (permalink / raw)
To: Andrew Morton
Cc: Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-kernel,
linux-mm, Linus Torvalds, Peter Xu, Catalin Marinas, Will Deacon,
Mark Brown, David S . Miller, Andreas Larsson,
James E . J . Bottomley, Helge Deller
NOTE: This should be applied on mm-hotfixes-unstable in Andrew's mm tree as
it relies on other pending hotfixes.
The mmap_region() function is somewhat terrifying, with spaghetti-like
control flow and numerous means by which issues can arise and incomplete
state, memory leaks and other unpleasantness can occur.
A large amount of the complexity arises from trying to handle errors late
in the process of mapping a VMA, which forms the basis of recently observed
issues with resource leaks and observable inconsistent state.
This series goes to great lengths to simplify how mmap_region() works and
to avoid unwinding errors late on in the process of setting up the VMA for
the new mapping, and equally avoids such operations occurring while the VMA
is in an inconsistent state.
The patches in this series comprise the minimal changes required to resolve
existing issues in mmap_region() error handling, in order that they can be
hotfixed and backported. There is additionally a follow up series which
goes further, separated out from the v1 series and sent and updated
separately.
v4:
* Reworked solution to use arch_calc_vm_flag_bits() as suggested by
Catalin. This also ensures we do not break MTE in a KVM scenario.
v3:
* Added correct handling for arm64 MTE which was otherwise broken, as
reported by Mark Brown.
https://lore.kernel.org/all/cover.1730206735.git.lorenzo.stoakes@oracle.com/
v2:
* Marked first 4 patches as hotfixes, the rest as not.
* Improved comment in vma_close() as per Vlastimil.
* Updated hole byte count as per Jann.
* Updated comment in map_deny_write_exec() as per Jann.
* Dropped unnecessary vma_iter_free() as per Vlastimil, Liam.
* Corrected vms_abort_munmap_vmas() mistaken assumption about nr_pages as
per Vlastimil.
* Changed order of initial checks in mmap_region() to avoid user-visible
side effects as per Vlastimil, Liam.
* Corrected silly incorrect use of vma field.
* Various style corrections as per Liam.
* Fix horrid mistake with merge VMA, reworked the logic to avoid that
nonsense altogether.
* Add fields to map state rather than using vmg fields to avoid
confusion/risk of vmg state changing breaking things.
* Replaced last commit removing merge retry with one that retries the
merge, only sanely.
https://lore.kernel.org/all/cover.1729715266.git.lorenzo.stoakes@oracle.com/
v1:
https://lore.kernel.org/all/cover.1729628198.git.lorenzo.stoakes@oracle.com/
Lorenzo Stoakes (5):
mm: avoid unsafe VMA hook invocation when error arises on mmap hook
mm: unconditionally close VMAs on error
mm: refactor map_deny_write_exec()
mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
mm: resolve faulty mmap_region() error path behaviour
arch/arm64/include/asm/mman.h | 10 ++-
arch/parisc/include/asm/mman.h | 5 +-
include/linux/mman.h | 28 +++++--
mm/internal.h | 45 ++++++++++++
mm/mmap.c | 130 ++++++++++++++++++---------------
mm/mprotect.c | 2 +-
mm/nommu.c | 9 +--
mm/shmem.c | 3 -
mm/vma.c | 14 ++--
mm/vma.h | 6 +-
10 files changed, 159 insertions(+), 93 deletions(-)
--
2.47.0
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH hotfix 6.12 v4 1/5] mm: avoid unsafe VMA hook invocation when error arises on mmap hook
2024-10-29 18:11 [PATCH hotfix 6.12 v4 0/5] fix error handling in mmap_region() and refactor (hotfixes) Lorenzo Stoakes
@ 2024-10-29 18:11 ` Lorenzo Stoakes
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 2/5] mm: unconditionally close VMAs on error Lorenzo Stoakes
` (3 subsequent siblings)
4 siblings, 0 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2024-10-29 18:11 UTC (permalink / raw)
To: Andrew Morton
Cc: Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-kernel,
linux-mm, Linus Torvalds, Peter Xu, Catalin Marinas, Will Deacon,
Mark Brown, David S . Miller, Andreas Larsson,
James E . J . Bottomley, Helge Deller
After an attempted mmap() fails, we are no longer in a situation where we
can safely interact with VMA hooks. This is currently not enforced, meaning
that we need complicated handling to ensure we do not incorrectly call
these hooks.
We can avoid the whole issue by treating the VMA as suspect the moment that
the file->f_op->mmap() hook reports an error, replacing whatever VMA
operations were installed with a dummy empty set of VMA operations.
We do so through a new helper function internal to mm - mmap_file() - which
is both more logically named than the existing call_mmap() function and
correctly isolates handling of the vm_op reassignment to mm.
All the existing invocations of call_mmap() outside of mm are ultimately
nested within the call_mmap() from mm, which we now replace.
It is therefore safe to leave call_mmap() in place as a convenience
function (and to avoid churn). The invokers are:
ovl_file_operations -> mmap -> ovl_mmap() -> backing_file_mmap()
coda_file_operations -> mmap -> coda_file_mmap()
shm_file_operations -> shm_mmap()
shm_file_operations_huge -> shm_mmap()
dma_buf_fops -> dma_buf_mmap_internal -> i915_dmabuf_ops
-> i915_gem_dmabuf_mmap()
None of these callers interacts with vm_ops or mappings in a problematic
way on error; each simply exits quickly.
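The pattern the patch applies is a general one: once a constructor-style
hook fails, neuter the object's ops table so that later generic code can
invoke hooks unconditionally without touching inconsistent driver state. A
minimal standalone C sketch of that shape (names such as obj_setup() and
dummy_ops are illustrative only, not kernel API):

```c
#include <stddef.h>

struct obj;
struct obj_ops {
	int  (*setup)(struct obj *);
	void (*close)(struct obj *);
};

struct obj {
	const struct obj_ops *ops;
	int ready;
};

/* All hooks NULL: invoking "no hook" is always safe. */
static const struct obj_ops dummy_ops = { NULL, NULL };

static int failing_setup(struct obj *o) { (void)o; return -1; }
static void real_close(struct obj *o) { o->ready = 0; }

static int obj_setup(struct obj *o)
{
	int err = o->ops->setup ? o->ops->setup(o) : 0;

	if (err)
		o->ops = &dummy_ops;	/* object is suspect: neuter its hooks */
	return err;
}

static void obj_close(struct obj *o)
{
	if (o->ops->close)
		o->ops->close(o);
	o->ops = &dummy_ops;		/* a second close is now a no-op */
}
```

After a failed obj_setup(), callers can still call obj_close()
unconditionally, mirroring how mmap_file() makes later vm_ops invocations
safe regardless of whether the driver's mmap hook succeeded.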
Reported-by: Jann Horn <jannh@google.com>
Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
Cc: stable <stable@kernel.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Jann Horn <jannh@google.com>
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
mm/internal.h | 27 +++++++++++++++++++++++++++
mm/mmap.c | 6 +++---
mm/nommu.c | 4 ++--
3 files changed, 32 insertions(+), 5 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h
index 16c1f3cd599e..4eab2961e69c 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -108,6 +108,33 @@ static inline void *folio_raw_mapping(const struct folio *folio)
return (void *)(mapping & ~PAGE_MAPPING_FLAGS);
}
+/*
+ * This is a file-backed mapping, and is about to be memory mapped - invoke its
+ * mmap hook and safely handle error conditions. On error, VMA hooks will be
+ * mutated.
+ *
+ * @file: File which backs the mapping.
+ * @vma: VMA which we are mapping.
+ *
+ * Returns: 0 if success, error otherwise.
+ */
+static inline int mmap_file(struct file *file, struct vm_area_struct *vma)
+{
+ int err = call_mmap(file, vma);
+
+ if (likely(!err))
+ return 0;
+
+ /*
+ * OK, we tried to call the file hook for mmap(), but an error
+ * arose. The mapping is in an inconsistent state and we must not invoke
+ * any further hooks on it.
+ */
+ vma->vm_ops = &vma_dummy_vm_ops;
+
+ return err;
+}
+
#ifdef CONFIG_MMU
/* Flags for folio_pte_batch(). */
diff --git a/mm/mmap.c b/mm/mmap.c
index 9841b41e3c76..6e3b25f7728f 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1422,7 +1422,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
/*
* clear PTEs while the vma is still in the tree so that rmap
* cannot race with the freeing later in the truncate scenario.
- * This is also needed for call_mmap(), which is why vm_ops
+ * This is also needed for mmap_file(), which is why vm_ops
* close function is called.
*/
vms_clean_up_area(&vms, &mas_detach);
@@ -1447,7 +1447,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
if (file) {
vma->vm_file = get_file(file);
- error = call_mmap(file, vma);
+ error = mmap_file(file, vma);
if (error)
goto unmap_and_free_vma;
@@ -1470,7 +1470,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
vma_iter_config(&vmi, addr, end);
/*
- * If vm_flags changed after call_mmap(), we should try merge
+ * If vm_flags changed after mmap_file(), we should try merge
* vma again as we may succeed this time.
*/
if (unlikely(vm_flags != vma->vm_flags && vmg.prev)) {
diff --git a/mm/nommu.c b/mm/nommu.c
index 385b0c15add8..f9ccc02458ec 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -885,7 +885,7 @@ static int do_mmap_shared_file(struct vm_area_struct *vma)
{
int ret;
- ret = call_mmap(vma->vm_file, vma);
+ ret = mmap_file(vma->vm_file, vma);
if (ret == 0) {
vma->vm_region->vm_top = vma->vm_region->vm_end;
return 0;
@@ -918,7 +918,7 @@ static int do_mmap_private(struct vm_area_struct *vma,
* happy.
*/
if (capabilities & NOMMU_MAP_DIRECT) {
- ret = call_mmap(vma->vm_file, vma);
+ ret = mmap_file(vma->vm_file, vma);
/* shouldn't return success if we're not sharing */
if (WARN_ON_ONCE(!is_nommu_shared_mapping(vma->vm_flags)))
ret = -ENOSYS;
--
2.47.0
* [PATCH hotfix 6.12 v4 2/5] mm: unconditionally close VMAs on error
2024-10-29 18:11 [PATCH hotfix 6.12 v4 0/5] fix error handling in mmap_region() and refactor (hotfixes) Lorenzo Stoakes
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 1/5] mm: avoid unsafe VMA hook invocation when error arises on mmap hook Lorenzo Stoakes
@ 2024-10-29 18:11 ` Lorenzo Stoakes
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 3/5] mm: refactor map_deny_write_exec() Lorenzo Stoakes
` (2 subsequent siblings)
4 siblings, 0 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2024-10-29 18:11 UTC (permalink / raw)
To: Andrew Morton
Cc: Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-kernel,
linux-mm, Linus Torvalds, Peter Xu, Catalin Marinas, Will Deacon,
Mark Brown, David S . Miller, Andreas Larsson,
James E . J . Bottomley, Helge Deller
Incorrect invocation of VMA callbacks when the VMA is no longer in a
consistent state is bug-prone and risky to perform.
With regard to the important vm_ops->close() callback, we have gone to
great lengths to try to track whether or not we ought to close VMAs.
Rather than doing so and risking making a mistake somewhere, instead
unconditionally close and reset vma->vm_ops to an empty dummy operations
set with a NULL .close operator.
We introduce a new function to do so - vma_close() - and simplify existing
vms logic which tracked whether we needed to close or not.
This simplifies the logic, avoids incorrect double-calling of the .close()
callback and allows us to update error paths to simply call vma_close()
unconditionally - making VMA closure idempotent.
Reported-by: Jann Horn <jannh@google.com>
Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
Cc: stable <stable@kernel.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Jann Horn <jannh@google.com>
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
mm/internal.h | 18 ++++++++++++++++++
mm/mmap.c | 5 ++---
mm/nommu.c | 3 +--
mm/vma.c | 14 +++++---------
mm/vma.h | 4 +---
5 files changed, 27 insertions(+), 17 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h
index 4eab2961e69c..64c2eb0b160e 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -135,6 +135,24 @@ static inline int mmap_file(struct file *file, struct vm_area_struct *vma)
return err;
}
+/*
+ * If the VMA has a close hook then close it, and since closing it might leave
+ * it in an inconsistent state which makes the use of any hooks suspect, clear
+ * them down by installing dummy empty hooks.
+ */
+static inline void vma_close(struct vm_area_struct *vma)
+{
+ if (vma->vm_ops && vma->vm_ops->close) {
+ vma->vm_ops->close(vma);
+
+ /*
+ * The mapping is in an inconsistent state, and no further hooks
+ * may be invoked upon it.
+ */
+ vma->vm_ops = &vma_dummy_vm_ops;
+ }
+}
+
#ifdef CONFIG_MMU
/* Flags for folio_pte_batch(). */
diff --git a/mm/mmap.c b/mm/mmap.c
index 6e3b25f7728f..ac0604f146f6 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1573,8 +1573,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
return addr;
close_and_free_vma:
- if (file && !vms.closed_vm_ops && vma->vm_ops && vma->vm_ops->close)
- vma->vm_ops->close(vma);
+ vma_close(vma);
if (file || vma->vm_file) {
unmap_and_free_vma:
@@ -1934,7 +1933,7 @@ void exit_mmap(struct mm_struct *mm)
do {
if (vma->vm_flags & VM_ACCOUNT)
nr_accounted += vma_pages(vma);
- remove_vma(vma, /* unreachable = */ true, /* closed = */ false);
+ remove_vma(vma, /* unreachable = */ true);
count++;
cond_resched();
vma = vma_next(&vmi);
diff --git a/mm/nommu.c b/mm/nommu.c
index f9ccc02458ec..635d028d647b 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -589,8 +589,7 @@ static int delete_vma_from_mm(struct vm_area_struct *vma)
*/
static void delete_vma(struct mm_struct *mm, struct vm_area_struct *vma)
{
- if (vma->vm_ops && vma->vm_ops->close)
- vma->vm_ops->close(vma);
+ vma_close(vma);
if (vma->vm_file)
fput(vma->vm_file);
put_nommu_region(vma->vm_region);
diff --git a/mm/vma.c b/mm/vma.c
index b21ffec33f8e..7621384d64cf 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -323,11 +323,10 @@ static bool can_vma_merge_right(struct vma_merge_struct *vmg,
/*
* Close a vm structure and free it.
*/
-void remove_vma(struct vm_area_struct *vma, bool unreachable, bool closed)
+void remove_vma(struct vm_area_struct *vma, bool unreachable)
{
might_sleep();
- if (!closed && vma->vm_ops && vma->vm_ops->close)
- vma->vm_ops->close(vma);
+ vma_close(vma);
if (vma->vm_file)
fput(vma->vm_file);
mpol_put(vma_policy(vma));
@@ -1115,9 +1114,7 @@ void vms_clean_up_area(struct vma_munmap_struct *vms,
vms_clear_ptes(vms, mas_detach, true);
mas_set(mas_detach, 0);
mas_for_each(mas_detach, vma, ULONG_MAX)
- if (vma->vm_ops && vma->vm_ops->close)
- vma->vm_ops->close(vma);
- vms->closed_vm_ops = true;
+ vma_close(vma);
}
/*
@@ -1160,7 +1157,7 @@ void vms_complete_munmap_vmas(struct vma_munmap_struct *vms,
/* Remove and clean up vmas */
mas_set(mas_detach, 0);
mas_for_each(mas_detach, vma, ULONG_MAX)
- remove_vma(vma, /* = */ false, vms->closed_vm_ops);
+ remove_vma(vma, /* unreachable = */ false);
vm_unacct_memory(vms->nr_accounted);
validate_mm(mm);
@@ -1684,8 +1681,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
return new_vma;
out_vma_link:
- if (new_vma->vm_ops && new_vma->vm_ops->close)
- new_vma->vm_ops->close(new_vma);
+ vma_close(new_vma);
if (new_vma->vm_file)
fput(new_vma->vm_file);
diff --git a/mm/vma.h b/mm/vma.h
index 55457cb68200..75558b5e9c8c 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -42,7 +42,6 @@ struct vma_munmap_struct {
int vma_count; /* Number of vmas that will be removed */
bool unlock; /* Unlock after the munmap */
bool clear_ptes; /* If there are outstanding PTE to be cleared */
- bool closed_vm_ops; /* call_mmap() was encountered, so vmas may be closed */
/* 1 byte hole */
unsigned long nr_pages; /* Number of pages being removed */
unsigned long locked_vm; /* Number of locked pages */
@@ -198,7 +197,6 @@ static inline void init_vma_munmap(struct vma_munmap_struct *vms,
vms->unmap_start = FIRST_USER_ADDRESS;
vms->unmap_end = USER_PGTABLES_CEILING;
vms->clear_ptes = false;
- vms->closed_vm_ops = false;
}
#endif
@@ -269,7 +267,7 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
unsigned long start, size_t len, struct list_head *uf,
bool unlock);
-void remove_vma(struct vm_area_struct *vma, bool unreachable, bool closed);
+void remove_vma(struct vm_area_struct *vma, bool unreachable);
void unmap_region(struct ma_state *mas, struct vm_area_struct *vma,
struct vm_area_struct *prev, struct vm_area_struct *next);
--
2.47.0
* [PATCH hotfix 6.12 v4 3/5] mm: refactor map_deny_write_exec()
2024-10-29 18:11 [PATCH hotfix 6.12 v4 0/5] fix error handling in mmap_region() and refactor (hotfixes) Lorenzo Stoakes
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 1/5] mm: avoid unsafe VMA hook invocation when error arises on mmap hook Lorenzo Stoakes
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 2/5] mm: unconditionally close VMAs on error Lorenzo Stoakes
@ 2024-10-29 18:11 ` Lorenzo Stoakes
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling Lorenzo Stoakes
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 5/5] mm: resolve faulty mmap_region() error path behaviour Lorenzo Stoakes
4 siblings, 0 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2024-10-29 18:11 UTC (permalink / raw)
To: Andrew Morton
Cc: Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-kernel,
linux-mm, Linus Torvalds, Peter Xu, Catalin Marinas, Will Deacon,
Mark Brown, David S . Miller, Andreas Larsson,
James E . J . Bottomley, Helge Deller
Refactor map_deny_write_exec() so that it does not unnecessarily require a
VMA parameter but rather accepts VMA flags parameters, which allows us to
use this function early in mmap_region() in a subsequent commit.
While we're here, we refactor the function to be more readable and add some
additional documentation.
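The refactored check reduces to a small decision table over the old and new
VMA flags. A standalone sketch of that table (the MMF_HAS_MDWE process-flag
check is elided here, and the flag values mirror the kernel's VM_WRITE and
VM_EXEC bits; this is an illustration, not the kernel function itself):

```c
#include <stdbool.h>

#define VM_WRITE 0x00000002UL
#define VM_EXEC  0x00000004UL

/*
 * old_flags: flags the VMA originally possessed.
 * new_flags: flags we propose to set.
 * Returns true if, with MDWE enabled, the change should be denied.
 */
static bool deny_write_exec(unsigned long old_flags, unsigned long new_flags)
{
	/* If the new VMA is not executable, there is nothing to deny. */
	if (!(new_flags & VM_EXEC))
		return false;

	/* Deny newly writable *and* executable VMAs... */
	if (new_flags & VM_WRITE)
		return true;

	/* ...and previously non-executable VMAs becoming executable. */
	if (!(old_flags & VM_EXEC))
		return true;

	return false;
}
```

Because the function now operates purely on flag words, it can be invoked
before any VMA exists, which is what permits the early check in
mmap_region() in the final patch of the series.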
Reported-by: Jann Horn <jannh@google.com>
Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
Cc: stable <stable@kernel.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Jann Horn <jannh@google.com>
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
include/linux/mman.h | 21 ++++++++++++++++++---
mm/mmap.c | 2 +-
mm/mprotect.c | 2 +-
mm/vma.h | 2 +-
4 files changed, 21 insertions(+), 6 deletions(-)
diff --git a/include/linux/mman.h b/include/linux/mman.h
index bcb201ab7a41..8ddca62d6460 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -188,16 +188,31 @@ static inline bool arch_memory_deny_write_exec_supported(void)
*
* d) mmap(PROT_READ | PROT_EXEC)
* mmap(PROT_READ | PROT_EXEC | PROT_BTI)
+ *
+ * This is only applicable if the user has set the Memory-Deny-Write-Execute
+ * (MDWE) protection mask for the current process.
+ *
+ * @old specifies the VMA flags the VMA originally possessed, and @new the ones
+ * we propose to set.
+ *
+ * Return: false if the proposed change is OK, true if it is not OK and
+ * should be denied.
*/
-static inline bool map_deny_write_exec(struct vm_area_struct *vma, unsigned long vm_flags)
+static inline bool map_deny_write_exec(unsigned long old, unsigned long new)
{
+ /* If MDWE is disabled, we have nothing to deny. */
if (!test_bit(MMF_HAS_MDWE, ¤t->mm->flags))
return false;
- if ((vm_flags & VM_EXEC) && (vm_flags & VM_WRITE))
+ /* If the new VMA is not executable, we have nothing to deny. */
+ if (!(new & VM_EXEC))
+ return false;
+
+ /* Under MDWE we do not accept newly writably executable VMAs... */
+ if (new & VM_WRITE)
return true;
- if (!(vma->vm_flags & VM_EXEC) && (vm_flags & VM_EXEC))
+ /* ...nor previously non-executable VMAs becoming executable. */
+ if (!(old & VM_EXEC))
return true;
return false;
diff --git a/mm/mmap.c b/mm/mmap.c
index ac0604f146f6..ab71d4c3464c 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1505,7 +1505,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
vma_set_anonymous(vma);
}
- if (map_deny_write_exec(vma, vma->vm_flags)) {
+ if (map_deny_write_exec(vma->vm_flags, vma->vm_flags)) {
error = -EACCES;
goto close_and_free_vma;
}
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 0c5d6d06107d..6f450af3252e 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -810,7 +810,7 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
break;
}
- if (map_deny_write_exec(vma, newflags)) {
+ if (map_deny_write_exec(vma->vm_flags, newflags)) {
error = -EACCES;
break;
}
diff --git a/mm/vma.h b/mm/vma.h
index 75558b5e9c8c..d58068c0ff2e 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -42,7 +42,7 @@ struct vma_munmap_struct {
int vma_count; /* Number of vmas that will be removed */
bool unlock; /* Unlock after the munmap */
bool clear_ptes; /* If there are outstanding PTE to be cleared */
- /* 1 byte hole */
+ /* 2 byte hole */
unsigned long nr_pages; /* Number of pages being removed */
unsigned long locked_vm; /* Number of locked pages */
unsigned long nr_accounted; /* Number of VM_ACCOUNT pages */
--
2.47.0
* [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
2024-10-29 18:11 [PATCH hotfix 6.12 v4 0/5] fix error handling in mmap_region() and refactor (hotfixes) Lorenzo Stoakes
` (2 preceding siblings ...)
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 3/5] mm: refactor map_deny_write_exec() Lorenzo Stoakes
@ 2024-10-29 18:11 ` Lorenzo Stoakes
2024-10-30 9:18 ` Vlastimil Babka
2024-10-30 12:00 ` Catalin Marinas
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 5/5] mm: resolve faulty mmap_region() error path behaviour Lorenzo Stoakes
4 siblings, 2 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2024-10-29 18:11 UTC (permalink / raw)
To: Andrew Morton
Cc: Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-kernel,
linux-mm, Linus Torvalds, Peter Xu, Catalin Marinas, Will Deacon,
Mark Brown, David S . Miller, Andreas Larsson,
James E . J . Bottomley, Helge Deller
Currently MTE is permitted in two circumstances (the desire to use MTE
being specified by the VM_MTE flag): where MAP_ANONYMOUS is specified, as
checked by arch_calc_vm_flag_bits() and actualised by setting the
VM_MTE_ALLOWED flag; or where the file backing the mapping is shmem, in
which case we set VM_MTE_ALLOWED in shmem_mmap() when the mmap hook is
activated in mmap_region().
The function that checks that, if VM_MTE is set, VM_MTE_ALLOWED is also set
is the arm64 implementation of arch_validate_flags().
Unfortunately, we intend to refactor mmap_region() to perform this check
earlier, meaning that in the case of a shmem backing we will not have
invoked shmem_mmap() yet, causing the mapping to fail spuriously.
It is inappropriate to set this architecture-specific flag in general mm
code anyway, so a sensible resolution of this issue is to instead move the
check somewhere else.
We resolve this by setting VM_MTE_ALLOWED much earlier in do_mmap(), via
the arch_calc_vm_flag_bits() call.
This is an appropriate place to do this as we already check for the
MAP_ANONYMOUS case here, and the shmem file case is simply a variant of the
same idea - we permit RAM-backed memory.
This requires a modification to the arch_calc_vm_flag_bits() signature to
pass in a pointer to the struct file associated with the mapping, however
this is not too egregious as this is only used by two architectures anyway
- arm64 and parisc.
So this patch performs this adjustment and removes the unnecessary
assignment of VM_MTE_ALLOWED in shmem_mmap().
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Jann Horn <jannh@google.com>
Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
Cc: stable <stable@kernel.org>
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
arch/arm64/include/asm/mman.h | 10 +++++++---
arch/parisc/include/asm/mman.h | 5 +++--
include/linux/mman.h | 7 ++++---
mm/mmap.c | 2 +-
mm/nommu.c | 2 +-
mm/shmem.c | 3 ---
6 files changed, 16 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
index 9e39217b4afb..798d965760d4 100644
--- a/arch/arm64/include/asm/mman.h
+++ b/arch/arm64/include/asm/mman.h
@@ -6,6 +6,8 @@
#ifndef BUILD_VDSO
#include <linux/compiler.h>
+#include <linux/fs.h>
+#include <linux/shmem_fs.h>
#include <linux/types.h>
static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
@@ -31,19 +33,21 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
}
#define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
-static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
+static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
+ unsigned long flags)
{
/*
* Only allow MTE on anonymous mappings as these are guaranteed to be
* backed by tags-capable memory. The vm_flags may be overridden by a
* filesystem supporting MTE (RAM-based).
*/
- if (system_supports_mte() && (flags & MAP_ANONYMOUS))
+ if (system_supports_mte() &&
+ ((flags & MAP_ANONYMOUS) || shmem_file(file)))
return VM_MTE_ALLOWED;
return 0;
}
-#define arch_calc_vm_flag_bits(flags) arch_calc_vm_flag_bits(flags)
+#define arch_calc_vm_flag_bits(file, flags) arch_calc_vm_flag_bits(file, flags)
static inline bool arch_validate_prot(unsigned long prot,
unsigned long addr __always_unused)
diff --git a/arch/parisc/include/asm/mman.h b/arch/parisc/include/asm/mman.h
index 89b6beeda0b8..663f587dc789 100644
--- a/arch/parisc/include/asm/mman.h
+++ b/arch/parisc/include/asm/mman.h
@@ -2,6 +2,7 @@
#ifndef __ASM_MMAN_H__
#define __ASM_MMAN_H__
+#include <linux/fs.h>
#include <uapi/asm/mman.h>
/* PARISC cannot allow mdwe as it needs writable stacks */
@@ -11,7 +12,7 @@ static inline bool arch_memory_deny_write_exec_supported(void)
}
#define arch_memory_deny_write_exec_supported arch_memory_deny_write_exec_supported
-static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
+static inline unsigned long arch_calc_vm_flag_bits(struct file *file, unsigned long flags)
{
/*
* The stack on parisc grows upwards, so if userspace requests memory
@@ -23,6 +24,6 @@ static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
return 0;
}
-#define arch_calc_vm_flag_bits(flags) arch_calc_vm_flag_bits(flags)
+#define arch_calc_vm_flag_bits(file, flags) arch_calc_vm_flag_bits(file, flags)
#endif /* __ASM_MMAN_H__ */
diff --git a/include/linux/mman.h b/include/linux/mman.h
index 8ddca62d6460..bd70af0321e8 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -2,6 +2,7 @@
#ifndef _LINUX_MMAN_H
#define _LINUX_MMAN_H
+#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/percpu_counter.h>
@@ -94,7 +95,7 @@ static inline void vm_unacct_memory(long pages)
#endif
#ifndef arch_calc_vm_flag_bits
-#define arch_calc_vm_flag_bits(flags) 0
+#define arch_calc_vm_flag_bits(file, flags) 0
#endif
#ifndef arch_validate_prot
@@ -151,13 +152,13 @@ calc_vm_prot_bits(unsigned long prot, unsigned long pkey)
* Combine the mmap "flags" argument into "vm_flags" used internally.
*/
static inline unsigned long
-calc_vm_flag_bits(unsigned long flags)
+calc_vm_flag_bits(struct file *file, unsigned long flags)
{
return _calc_vm_trans(flags, MAP_GROWSDOWN, VM_GROWSDOWN ) |
_calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) |
_calc_vm_trans(flags, MAP_SYNC, VM_SYNC ) |
_calc_vm_trans(flags, MAP_STACK, VM_NOHUGEPAGE) |
- arch_calc_vm_flag_bits(flags);
+ arch_calc_vm_flag_bits(file, flags);
}
unsigned long vm_commit_limit(void);
diff --git a/mm/mmap.c b/mm/mmap.c
index ab71d4c3464c..aee5fa08ae5d 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -344,7 +344,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
* to. we assume access permissions have been handled by the open
* of the memory object, so we don't do any here.
*/
- vm_flags |= calc_vm_prot_bits(prot, pkey) | calc_vm_flag_bits(flags) |
+ vm_flags |= calc_vm_prot_bits(prot, pkey) | calc_vm_flag_bits(file, flags) |
mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
/* Obtain the address to map to. we verify (or select) it and ensure
diff --git a/mm/nommu.c b/mm/nommu.c
index 635d028d647b..e9b5f527ab5b 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -842,7 +842,7 @@ static unsigned long determine_vm_flags(struct file *file,
{
unsigned long vm_flags;
- vm_flags = calc_vm_prot_bits(prot, 0) | calc_vm_flag_bits(flags);
+ vm_flags = calc_vm_prot_bits(prot, 0) | calc_vm_flag_bits(file, flags);
if (!file) {
/*
diff --git a/mm/shmem.c b/mm/shmem.c
index 4ba1d00fabda..e87f5d6799a7 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2733,9 +2733,6 @@ static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
if (ret)
return ret;
- /* arm64 - allow memory tagging on RAM-based files */
- vm_flags_set(vma, VM_MTE_ALLOWED);
-
file_accessed(file);
/* This is anonymous shared memory if it is unlinked at the time of mmap */
if (inode->i_nlink)
--
2.47.0
* [PATCH hotfix 6.12 v4 5/5] mm: resolve faulty mmap_region() error path behaviour
2024-10-29 18:11 [PATCH hotfix 6.12 v4 0/5] fix error handling in mmap_region() and refactor (hotfixes) Lorenzo Stoakes
` (3 preceding siblings ...)
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling Lorenzo Stoakes
@ 2024-10-29 18:11 ` Lorenzo Stoakes
4 siblings, 0 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2024-10-29 18:11 UTC (permalink / raw)
To: Andrew Morton
Cc: Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-kernel,
linux-mm, Linus Torvalds, Peter Xu, Catalin Marinas, Will Deacon,
Mark Brown, David S . Miller, Andreas Larsson,
James E . J . Bottomley, Helge Deller
The mmap_region() function is somewhat terrifying, with spaghetti-like
control flow and numerous means by which issues can arise and incomplete
state, memory leaks and other unpleasantness can occur.
A large amount of the complexity arises from trying to handle errors late
in the process of mapping a VMA, which forms the basis of recently observed
issues with resource leaks and observable inconsistent state.
Taking advantage of previous patches in this series we move a number of
checks earlier in the code, simplifying things by moving the core of the
logic into a static internal function __mmap_region().
Doing this allows us to perform a number of checks up front before we do
any real work, and allows us to unwind the writable unmap check
unconditionally as required and to perform a CONFIG_DEBUG_VM_MAPLE_TREE
validation unconditionally also.
We move a number of things here:
1. We preallocate memory for the iterator before we call the file-backed
memory hook, allowing us to exit early and avoid having to perform
complicated and error-prone close/free logic. We carefully free
iterator state on both success and error paths.
2. The enclosing mmap_region() function handles the mapping_map_writable()
logic early. Previously the logic had the mapping_map_writable() at the
point of mapping a newly allocated file-backed VMA, and a matching
mapping_unmap_writable() on success and error paths.
We now do this unconditionally if this is a file-backed, shared writable
mapping. If a driver changes the flags to eliminate VM_MAYWRITE, doing so
does not invalidate the seal check we have just performed, and in any case
we always decrement the counter in the wrapper.
We perform a debug assert to ensure a driver does not attempt to do the
opposite.
3. We also move arch_validate_flags() up into the mmap_region()
function. This is only relevant on arm64 and sparc64, and the check is
only meaningful for SPARC with ADI enabled. We explicitly add a warning
for this arch if a driver invalidates this check, though the code ought
eventually to be fixed to eliminate the need for this.
With all of these measures in place, we no longer need to explicitly close
the VMA on error paths, as we place all checks which might fail prior to a
call to any driver mmap hook.
This eliminates an entire class of errors, makes the code easier to reason
about and more robust.
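The accounting shape described in (2) can be sketched as a wrapper that
brackets the whole operation, balancing the writable-mapping count on every
exit path. All names here are illustrative stand-ins (map_err plays the
role of the __mmap_region() outcome, writable_count that of the mapping's
writable counter); this is a sketch of the error-handling shape, not the
kernel code:

```c
#include <stdbool.h>

static int writable_count;	/* stand-in for the mapping's writable counter */

static int map_writable(void)   { writable_count++; return 0; }
static void unmap_writable(void) { writable_count--; }

/*
 * Take the writable count up front (so the seal check holds while we work),
 * perform the mapping, and drop the count unconditionally afterwards,
 * whether the core operation succeeded or failed.
 */
static int mmap_wrapper(bool shared_writable, int map_err)
{
	int err;

	if (shared_writable && (err = map_writable()))
		return err;

	err = map_err;		/* stand-in for the __mmap_region() result */

	if (shared_writable)
		unmap_writable();	/* dropped on success *and* on error */
	return err;
}
```

Because the acquire and release happen at the same level, no inner error
path needs to remember whether the count is held, which is precisely the
class of bookkeeping the patch removes from __mmap_region().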
Reported-by: Jann Horn <jannh@google.com>
Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
Cc: stable <stable@kernel.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
mm/mmap.c | 119 +++++++++++++++++++++++++++++-------------------------
1 file changed, 65 insertions(+), 54 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index aee5fa08ae5d..79d541f1502b 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1358,20 +1358,18 @@ int do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
return do_vmi_munmap(&vmi, mm, start, len, uf, false);
}
-unsigned long mmap_region(struct file *file, unsigned long addr,
+static unsigned long __mmap_region(struct file *file, unsigned long addr,
unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
struct list_head *uf)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma = NULL;
pgoff_t pglen = PHYS_PFN(len);
- struct vm_area_struct *merge;
unsigned long charged = 0;
struct vma_munmap_struct vms;
struct ma_state mas_detach;
struct maple_tree mt_detach;
unsigned long end = addr + len;
- bool writable_file_mapping = false;
int error;
VMA_ITERATOR(vmi, mm, addr);
VMG_STATE(vmg, mm, &vmi, addr, end, vm_flags, pgoff);
@@ -1445,28 +1443,26 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
vm_flags_init(vma, vm_flags);
vma->vm_page_prot = vm_get_page_prot(vm_flags);
+ if (vma_iter_prealloc(&vmi, vma)) {
+ error = -ENOMEM;
+ goto free_vma;
+ }
+
if (file) {
vma->vm_file = get_file(file);
error = mmap_file(file, vma);
if (error)
- goto unmap_and_free_vma;
-
- if (vma_is_shared_maywrite(vma)) {
- error = mapping_map_writable(file->f_mapping);
- if (error)
- goto close_and_free_vma;
-
- writable_file_mapping = true;
- }
+ goto unmap_and_free_file_vma;
+ /* Drivers cannot alter the address of the VMA. */
+ WARN_ON_ONCE(addr != vma->vm_start);
/*
- * Expansion is handled above, merging is handled below.
- * Drivers should not alter the address of the VMA.
+ * Drivers should not permit writability when previously it was
+ * disallowed.
*/
- if (WARN_ON((addr != vma->vm_start))) {
- error = -EINVAL;
- goto close_and_free_vma;
- }
+ VM_WARN_ON_ONCE(vm_flags != vma->vm_flags &&
+ !(vm_flags & VM_MAYWRITE) &&
+ (vma->vm_flags & VM_MAYWRITE));
vma_iter_config(&vmi, addr, end);
/*
@@ -1474,6 +1470,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
* vma again as we may succeed this time.
*/
if (unlikely(vm_flags != vma->vm_flags && vmg.prev)) {
+ struct vm_area_struct *merge;
+
vmg.flags = vma->vm_flags;
/* If this fails, state is reset ready for a reattempt. */
merge = vma_merge_new_range(&vmg);
@@ -1491,7 +1489,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
vma = merge;
/* Update vm_flags to pick up the change. */
vm_flags = vma->vm_flags;
- goto unmap_writable;
+ goto file_expanded;
}
vma_iter_config(&vmi, addr, end);
}
@@ -1500,26 +1498,15 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
} else if (vm_flags & VM_SHARED) {
error = shmem_zero_setup(vma);
if (error)
- goto free_vma;
+ goto free_iter_vma;
} else {
vma_set_anonymous(vma);
}
- if (map_deny_write_exec(vma->vm_flags, vma->vm_flags)) {
- error = -EACCES;
- goto close_and_free_vma;
- }
-
- /* Allow architectures to sanity-check the vm_flags */
- if (!arch_validate_flags(vma->vm_flags)) {
- error = -EINVAL;
- goto close_and_free_vma;
- }
-
- if (vma_iter_prealloc(&vmi, vma)) {
- error = -ENOMEM;
- goto close_and_free_vma;
- }
+#ifdef CONFIG_SPARC64
+ /* TODO: Fix SPARC ADI! */
+ WARN_ON_ONCE(!arch_validate_flags(vm_flags));
+#endif
/* Lock the VMA since it is modified after insertion into VMA tree */
vma_start_write(vma);
@@ -1533,10 +1520,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
*/
khugepaged_enter_vma(vma, vma->vm_flags);
- /* Once vma denies write, undo our temporary denial count */
-unmap_writable:
- if (writable_file_mapping)
- mapping_unmap_writable(file->f_mapping);
+file_expanded:
file = vma->vm_file;
ksm_add_vma(vma);
expanded:
@@ -1569,23 +1553,17 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
vma_set_page_prot(vma);
- validate_mm(mm);
return addr;
-close_and_free_vma:
- vma_close(vma);
-
- if (file || vma->vm_file) {
-unmap_and_free_vma:
- fput(vma->vm_file);
- vma->vm_file = NULL;
+unmap_and_free_file_vma:
+ fput(vma->vm_file);
+ vma->vm_file = NULL;
- vma_iter_set(&vmi, vma->vm_end);
- /* Undo any partial mapping done by a device driver. */
- unmap_region(&vmi.mas, vma, vmg.prev, vmg.next);
- }
- if (writable_file_mapping)
- mapping_unmap_writable(file->f_mapping);
+ vma_iter_set(&vmi, vma->vm_end);
+ /* Undo any partial mapping done by a device driver. */
+ unmap_region(&vmi.mas, vma, vmg.prev, vmg.next);
+free_iter_vma:
+ vma_iter_free(&vmi);
free_vma:
vm_area_free(vma);
unacct_error:
@@ -1595,10 +1573,43 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
abort_munmap:
vms_abort_munmap_vmas(&vms, &mas_detach);
gather_failed:
- validate_mm(mm);
return error;
}
+unsigned long mmap_region(struct file *file, unsigned long addr,
+ unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
+ struct list_head *uf)
+{
+ unsigned long ret;
+ bool writable_file_mapping = false;
+
+ /* Check to see if MDWE is applicable. */
+ if (map_deny_write_exec(vm_flags, vm_flags))
+ return -EACCES;
+
+ /* Allow architectures to sanity-check the vm_flags. */
+ if (!arch_validate_flags(vm_flags))
+ return -EINVAL;
+
+ /* Map writable and ensure this isn't a sealed memfd. */
+ if (file && is_shared_maywrite(vm_flags)) {
+ int error = mapping_map_writable(file->f_mapping);
+
+ if (error)
+ return error;
+ writable_file_mapping = true;
+ }
+
+ ret = __mmap_region(file, addr, len, vm_flags, pgoff, uf);
+
+ /* Clear our write mapping regardless of error. */
+ if (writable_file_mapping)
+ mapping_unmap_writable(file->f_mapping);
+
+ validate_mm(current->mm);
+ return ret;
+}
+
static int __vm_munmap(unsigned long start, size_t len, bool unlock)
{
int ret;
--
2.47.0
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling Lorenzo Stoakes
@ 2024-10-30 9:18 ` Vlastimil Babka
2024-10-30 10:58 ` Catalin Marinas
2024-10-30 12:00 ` Catalin Marinas
1 sibling, 1 reply; 18+ messages in thread
From: Vlastimil Babka @ 2024-10-30 9:18 UTC (permalink / raw)
To: Lorenzo Stoakes, Andrew Morton
Cc: Liam R . Howlett, Jann Horn, linux-kernel, linux-mm,
Linus Torvalds, Peter Xu, Catalin Marinas, Will Deacon,
Mark Brown, David S . Miller, Andreas Larsson,
James E . J . Bottomley, Helge Deller
On 10/29/24 19:11, Lorenzo Stoakes wrote:
> Currently MTE is permitted in two circumstances (desiring to use MTE having
> been specified by the VM_MTE flag) - where MAP_ANONYMOUS is specified, as
> checked by arch_calc_vm_flag_bits() and actualised by setting the
> VM_MTE_ALLOWED flag, or if the file backing the mapping is shmem, in which
> case we set VM_MTE_ALLOWED in shmem_mmap() when the mmap hook is activated
> in mmap_region().
>
> The function that checks that, if VM_MTE is set, VM_MTE_ALLOWED is also set
> is the arm64 implementation of arch_validate_flags().
>
> Unfortunately, we intend to refactor mmap_region() to perform this check
> earlier, meaning that in the case of a shmem backing we will not have
> invoked shmem_mmap() yet, causing the mapping to fail spuriously.
>
> It is inappropriate to set this architecture-specific flag in general mm
> code anyway, so a sensible resolution of this issue is to instead move the
> check somewhere else.
>
> We resolve this by setting VM_MTE_ALLOWED much earlier in do_mmap(), via
> the arch_calc_vm_flag_bits() call.
>
> This is an appropriate place to do this as we already check for the
> MAP_ANONYMOUS case here, and the shmem file case is simply a variant of the
> same idea - we permit RAM-backed memory.
>
> This requires a modification to the arch_calc_vm_flag_bits() signature to
> pass in a pointer to the struct file associated with the mapping, however
> this is not too egregious as this is only used by two architectures anyway
> - arm64 and parisc.
>
> So this patch performs this adjustment and removes the unnecessary
> assignment of VM_MTE_ALLOWED in shmem_mmap().
>
> Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
> Reported-by: Jann Horn <jannh@google.com>
> Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
> Cc: stable <stable@kernel.org>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> --- a/arch/arm64/include/asm/mman.h
> +++ b/arch/arm64/include/asm/mman.h
> @@ -6,6 +6,8 @@
>
> #ifndef BUILD_VDSO
> #include <linux/compiler.h>
> +#include <linux/fs.h>
> +#include <linux/shmem_fs.h>
> #include <linux/types.h>
>
> static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> @@ -31,19 +33,21 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> }
> #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
>
> -static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
> +static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
> + unsigned long flags)
> {
> /*
> * Only allow MTE on anonymous mappings as these are guaranteed to be
> * backed by tags-capable memory. The vm_flags may be overridden by a
> * filesystem supporting MTE (RAM-based).
We should also eventually remove the last sentence or even replace it with
its negation, or somebody might try reintroducing the pattern that won't
work anymore (wasn't there such a hugetlbfs thing in -next?).
> */
> - if (system_supports_mte() && (flags & MAP_ANONYMOUS))
> + if (system_supports_mte() &&
> + ((flags & MAP_ANONYMOUS) || shmem_file(file)))
> return VM_MTE_ALLOWED;
>
> return 0;
> }
* Re: [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
2024-10-30 9:18 ` Vlastimil Babka
@ 2024-10-30 10:58 ` Catalin Marinas
2024-10-30 11:09 ` Vlastimil Babka
0 siblings, 1 reply; 18+ messages in thread
From: Catalin Marinas @ 2024-10-30 10:58 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Lorenzo Stoakes, Andrew Morton, Liam R . Howlett, Jann Horn,
linux-kernel, linux-mm, Linus Torvalds, Peter Xu, Will Deacon,
Mark Brown, David S . Miller, Andreas Larsson,
James E . J . Bottomley, Helge Deller, Yang Shi
On Wed, Oct 30, 2024 at 10:18:27AM +0100, Vlastimil Babka wrote:
> On 10/29/24 19:11, Lorenzo Stoakes wrote:
> > --- a/arch/arm64/include/asm/mman.h
> > +++ b/arch/arm64/include/asm/mman.h
> > @@ -6,6 +6,8 @@
> >
> > #ifndef BUILD_VDSO
> > #include <linux/compiler.h>
> > +#include <linux/fs.h>
> > +#include <linux/shmem_fs.h>
> > #include <linux/types.h>
> >
> > static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> > @@ -31,19 +33,21 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> > }
> > #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
> >
> > -static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
> > +static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
> > + unsigned long flags)
> > {
> > /*
> > * Only allow MTE on anonymous mappings as these are guaranteed to be
> > * backed by tags-capable memory. The vm_flags may be overridden by a
> > * filesystem supporting MTE (RAM-based).
>
> We should also eventually remove the last sentence or even replace it with
> its negation, or somebody might try reintroducing the pattern that won't
> work anymore (wasn't there such a hugetlbfs thing in -next?).
I agree, we should update this comment as well, though as a fix this
patch is fine for now.
There is indeed a hugetlbfs change in -next adding VM_MTE_ALLOWED. It
should still work after the above change but we'd need to move it over
here (and fix the comment at the same time). We'll probably do it around
-rc1 or maybe earlier once this fix hits mainline. I don't think we have
an equivalent of shmem_file() for hugetlbfs, we'll need to figure
something out.
> > */
> > - if (system_supports_mte() && (flags & MAP_ANONYMOUS))
> > + if (system_supports_mte() &&
> > + ((flags & MAP_ANONYMOUS) || shmem_file(file)))
> > return VM_MTE_ALLOWED;
> >
> > return 0;
> > }
This will conflict with the arm64 for-next/core tree as it's adding
a MAP_HUGETLB check. Trivial resolution though, Stephen will handle it.
--
Catalin
* Re: [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
2024-10-30 10:58 ` Catalin Marinas
@ 2024-10-30 11:09 ` Vlastimil Babka
2024-10-30 11:53 ` Lorenzo Stoakes
2024-10-30 18:30 ` Catalin Marinas
0 siblings, 2 replies; 18+ messages in thread
From: Vlastimil Babka @ 2024-10-30 11:09 UTC (permalink / raw)
To: Catalin Marinas
Cc: Lorenzo Stoakes, Andrew Morton, Liam R . Howlett, Jann Horn,
linux-kernel, linux-mm, Linus Torvalds, Peter Xu, Will Deacon,
Mark Brown, David S . Miller, Andreas Larsson,
James E . J . Bottomley, Helge Deller, Yang Shi
On 10/30/24 11:58, Catalin Marinas wrote:
> On Wed, Oct 30, 2024 at 10:18:27AM +0100, Vlastimil Babka wrote:
>> On 10/29/24 19:11, Lorenzo Stoakes wrote:
>> > --- a/arch/arm64/include/asm/mman.h
>> > +++ b/arch/arm64/include/asm/mman.h
>> > @@ -6,6 +6,8 @@
>> >
>> > #ifndef BUILD_VDSO
>> > #include <linux/compiler.h>
>> > +#include <linux/fs.h>
>> > +#include <linux/shmem_fs.h>
>> > #include <linux/types.h>
>> >
>> > static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
>> > @@ -31,19 +33,21 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
>> > }
>> > #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
>> >
>> > -static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
>> > +static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
>> > + unsigned long flags)
>> > {
>> > /*
>> > * Only allow MTE on anonymous mappings as these are guaranteed to be
>> > * backed by tags-capable memory. The vm_flags may be overridden by a
>> > * filesystem supporting MTE (RAM-based).
>>
>> We should also eventually remove the last sentence or even replace it with
>> its negation, or somebody might try reintroducing the pattern that won't
>> work anymore (wasn't there such a hugetlbfs thing in -next?).
>
> I agree, we should update this comment as well though as a fix this
> patch is fine for now.
>
> There is indeed a hugetlbfs change in -next adding VM_MTE_ALLOWED. It
> should still work after the above change but we'd need to move it over
I guess it will work after the above change, but not after 5/5?
> here (and fix the comment at the same time). We'll probably do it around
> -rc1 or maybe earlier once this fix hits mainline.
I assume this will hopefully go to rc7.
> I don't think we have
> an equivalent of shmem_file() for hugetlbfs, we'll need to figure
> something out.
I've found is_file_hugepages(); could that work? And while adding the
hugetlbfs change here, the comment could be adjusted too, right?
>
>> > */
>> > - if (system_supports_mte() && (flags & MAP_ANONYMOUS))
>> > + if (system_supports_mte() &&
>> > + ((flags & MAP_ANONYMOUS) || shmem_file(file)))
>> > return VM_MTE_ALLOWED;
>> >
>> > return 0;
>> > }
>
> This will conflict with the arm64 for-next/core tree as it's adding
> a MAP_HUGETLB check. Trivial resolution though, Stephen will handle it.
>
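For reference, the combined predicate being discussed might look something
like the following userspace sketch. Everything here is a hypothetical
stand-in for the kernel's shmem_file(), is_file_hugepages(), MAP_ANONYMOUS
and VM_MTE_ALLOWED, and the system_supports_mte() gate is omitted for
brevity - this only illustrates the proposed shape, not a committed
implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAP_ANONYMOUS_SIM	0x20	/* stand-in for MAP_ANONYMOUS */
#define VM_MTE_ALLOWED_SIM	0x1	/* stand-in for VM_MTE_ALLOWED */

struct file_sim {
	bool is_shmem;
	bool is_hugetlbfs;
};

/* Stand-ins for shmem_file() / is_file_hugepages(); NULL means no file. */
static bool shmem_file_sim(const struct file_sim *f)
{
	return f && f->is_shmem;
}

static bool is_file_hugepages_sim(const struct file_sim *f)
{
	return f && f->is_hugetlbfs;
}

/*
 * Sketch of the arch_calc_vm_flag_bits() shape under discussion: permit
 * MTE for anonymous, shmem-backed and hugetlbfs-backed mappings, all
 * decided up front rather than inside a driver mmap hook.
 */
static unsigned long calc_mte_allowed(const struct file_sim *file,
				      unsigned long flags)
{
	if ((flags & MAP_ANONYMOUS_SIM) || shmem_file_sim(file) ||
	    is_file_hugepages_sim(file))
		return VM_MTE_ALLOWED_SIM;

	return 0;
}
```

The point of keying off the backing file rather than MAP_HUGETLB alone is
exactly the case raised below: a hugetlbfs file can be mapped without
passing MAP_HUGETLB to mmap().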
* Re: [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
2024-10-30 11:09 ` Vlastimil Babka
@ 2024-10-30 11:53 ` Lorenzo Stoakes
2024-10-30 12:39 ` Catalin Marinas
2024-10-30 14:58 ` Yang Shi
2024-10-30 18:30 ` Catalin Marinas
1 sibling, 2 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2024-10-30 11:53 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Catalin Marinas, Andrew Morton, Liam R . Howlett, Jann Horn,
linux-kernel, linux-mm, Linus Torvalds, Peter Xu, Will Deacon,
Mark Brown, David S . Miller, Andreas Larsson,
James E . J . Bottomley, Helge Deller, Yang Shi
On Wed, Oct 30, 2024 at 12:09:43PM +0100, Vlastimil Babka wrote:
> On 10/30/24 11:58, Catalin Marinas wrote:
> > On Wed, Oct 30, 2024 at 10:18:27AM +0100, Vlastimil Babka wrote:
> >> On 10/29/24 19:11, Lorenzo Stoakes wrote:
> >> > --- a/arch/arm64/include/asm/mman.h
> >> > +++ b/arch/arm64/include/asm/mman.h
> >> > @@ -6,6 +6,8 @@
> >> >
> >> > #ifndef BUILD_VDSO
> >> > #include <linux/compiler.h>
> >> > +#include <linux/fs.h>
> >> > +#include <linux/shmem_fs.h>
> >> > #include <linux/types.h>
> >> >
> >> > static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> >> > @@ -31,19 +33,21 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> >> > }
> >> > #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
> >> >
> >> > -static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
> >> > +static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
> >> > + unsigned long flags)
> >> > {
> >> > /*
> >> > * Only allow MTE on anonymous mappings as these are guaranteed to be
> >> > * backed by tags-capable memory. The vm_flags may be overridden by a
> >> > * filesystem supporting MTE (RAM-based).
> >>
> >> We should also eventually remove the last sentence or even replace it with
> >> its negation, or somebody might try reintroducing the pattern that won't
> >> work anymore (wasn't there such a hugetlbfs thing in -next?).
> >
> > I agree, we should update this comment as well though as a fix this
> > patch is fine for now.
> >
> > There is indeed a hugetlbfs change in -next adding VM_MTE_ALLOWED. It
> > should still work after the above change but we'd need to move it over
>
> I guess it will work after the above change, but not after 5/5?
>
> > here (and fix the comment at the same time). We'll probably do it around
> > -rc1 or maybe earlier once this fix hits mainline.
>
> I assume this will hopefully go to rc7.
To be clear - this is a CRITICAL fix that MUST land for 6.12. I'd be inclined to
try to get it into an earlier -rc.
>
> > I don't think we have
> > an equivalent of shmem_file() for hugetlbfs, we'll need to figure
> > something out.
>
> I've found is_file_hugepages(), could work? And while adding the hugetlbfs
> change here, the comment could be adjusted too, right?
Right, but the MAP_HUGETLB check should work too? Can we save such changes
that alter any kind of existing behaviour for a later series?
As this is going to be backported (by me...!), I don't want to risk
inadvertent changes.
>
> >
> >> > */
> >> > - if (system_supports_mte() && (flags & MAP_ANONYMOUS))
> >> > + if (system_supports_mte() &&
> >> > + ((flags & MAP_ANONYMOUS) || shmem_file(file)))
> >> > return VM_MTE_ALLOWED;
> >> >
> >> > return 0;
> >> > }
> >
> > This will conflict with the arm64 for-next/core tree as it's adding
> > a MAP_HUGETLB check. Trivial resolution though, Stephen will handle it.
Thanks!
> >
>
* Re: [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling Lorenzo Stoakes
2024-10-30 9:18 ` Vlastimil Babka
@ 2024-10-30 12:00 ` Catalin Marinas
2024-10-30 12:13 ` Lorenzo Stoakes
1 sibling, 1 reply; 18+ messages in thread
From: Catalin Marinas @ 2024-10-30 12:00 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Liam R . Howlett, Vlastimil Babka, Jann Horn,
linux-kernel, linux-mm, Linus Torvalds, Peter Xu, Will Deacon,
Mark Brown, David S . Miller, Andreas Larsson,
James E . J . Bottomley, Helge Deller
On Tue, Oct 29, 2024 at 06:11:47PM +0000, Lorenzo Stoakes wrote:
> Currently MTE is permitted in two circumstances (desiring to use MTE having
> been specified by the VM_MTE flag) - where MAP_ANONYMOUS is specified, as
> checked by arch_calc_vm_flag_bits() and actualised by setting the
> VM_MTE_ALLOWED flag, or if the file backing the mapping is shmem, in which
> case we set VM_MTE_ALLOWED in shmem_mmap() when the mmap hook is activated
> in mmap_region().
>
> The function that checks that, if VM_MTE is set, VM_MTE_ALLOWED is also set
> is the arm64 implementation of arch_validate_flags().
>
> Unfortunately, we intend to refactor mmap_region() to perform this check
> earlier, meaning that in the case of a shmem backing we will not have
> invoked shmem_mmap() yet, causing the mapping to fail spuriously.
>
> It is inappropriate to set this architecture-specific flag in general mm
> code anyway, so a sensible resolution of this issue is to instead move the
> check somewhere else.
>
> We resolve this by setting VM_MTE_ALLOWED much earlier in do_mmap(), via
> the arch_calc_vm_flag_bits() call.
>
> This is an appropriate place to do this as we already check for the
> MAP_ANONYMOUS case here, and the shmem file case is simply a variant of the
> same idea - we permit RAM-backed memory.
>
> This requires a modification to the arch_calc_vm_flag_bits() signature to
> pass in a pointer to the struct file associated with the mapping, however
> this is not too egregious as this is only used by two architectures anyway
> - arm64 and parisc.
>
> So this patch performs this adjustment and removes the unnecessary
> assignment of VM_MTE_ALLOWED in shmem_mmap().
>
> Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
> Reported-by: Jann Horn <jannh@google.com>
> Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
> Cc: stable <stable@kernel.org>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Thanks for respinning this. FTR,
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> @@ -151,13 +152,13 @@ calc_vm_prot_bits(unsigned long prot, unsigned long pkey)
> * Combine the mmap "flags" argument into "vm_flags" used internally.
> */
> static inline unsigned long
> -calc_vm_flag_bits(unsigned long flags)
> +calc_vm_flag_bits(struct file *file, unsigned long flags)
> {
> return _calc_vm_trans(flags, MAP_GROWSDOWN, VM_GROWSDOWN ) |
> _calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) |
> _calc_vm_trans(flags, MAP_SYNC, VM_SYNC ) |
> _calc_vm_trans(flags, MAP_STACK, VM_NOHUGEPAGE) |
> - arch_calc_vm_flag_bits(flags);
> + arch_calc_vm_flag_bits(file, flags);
Nitpick (but please ignore, Andrew picked them up already): one space
alignment off.
--
Catalin
* Re: [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
2024-10-30 12:00 ` Catalin Marinas
@ 2024-10-30 12:13 ` Lorenzo Stoakes
0 siblings, 0 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2024-10-30 12:13 UTC (permalink / raw)
To: Catalin Marinas
Cc: Andrew Morton, Liam R . Howlett, Vlastimil Babka, Jann Horn,
linux-kernel, linux-mm, Linus Torvalds, Peter Xu, Will Deacon,
Mark Brown, David S . Miller, Andreas Larsson,
James E . J . Bottomley, Helge Deller
On Wed, Oct 30, 2024 at 12:00:12PM +0000, Catalin Marinas wrote:
> On Tue, Oct 29, 2024 at 06:11:47PM +0000, Lorenzo Stoakes wrote:
> > Currently MTE is permitted in two circumstances (desiring to use MTE having
> > been specified by the VM_MTE flag) - where MAP_ANONYMOUS is specified, as
> > checked by arch_calc_vm_flag_bits() and actualised by setting the
> > VM_MTE_ALLOWED flag, or if the file backing the mapping is shmem, in which
> > case we set VM_MTE_ALLOWED in shmem_mmap() when the mmap hook is activated
> > in mmap_region().
> >
> > The function that checks that, if VM_MTE is set, VM_MTE_ALLOWED is also set
> > is the arm64 implementation of arch_validate_flags().
> >
> > Unfortunately, we intend to refactor mmap_region() to perform this check
> > earlier, meaning that in the case of a shmem backing we will not have
> > invoked shmem_mmap() yet, causing the mapping to fail spuriously.
> >
> > It is inappropriate to set this architecture-specific flag in general mm
> > code anyway, so a sensible resolution of this issue is to instead move the
> > check somewhere else.
> >
> > We resolve this by setting VM_MTE_ALLOWED much earlier in do_mmap(), via
> > the arch_calc_vm_flag_bits() call.
> >
> > This is an appropriate place to do this as we already check for the
> > MAP_ANONYMOUS case here, and the shmem file case is simply a variant of the
> > same idea - we permit RAM-backed memory.
> >
> > This requires a modification to the arch_calc_vm_flag_bits() signature to
> > pass in a pointer to the struct file associated with the mapping, however
> > this is not too egregious as this is only used by two architectures anyway
> > - arm64 and parisc.
> >
> > So this patch performs this adjustment and removes the unnecessary
> > assignment of VM_MTE_ALLOWED in shmem_mmap().
> >
> > Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
> > Reported-by: Jann Horn <jannh@google.com>
> > Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
> > Cc: stable <stable@kernel.org>
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>
> Thanks for respinning this. FTR,
>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Thanks!
>
> > @@ -151,13 +152,13 @@ calc_vm_prot_bits(unsigned long prot, unsigned long pkey)
> > * Combine the mmap "flags" argument into "vm_flags" used internally.
> > */
> > static inline unsigned long
> > -calc_vm_flag_bits(unsigned long flags)
> > +calc_vm_flag_bits(struct file *file, unsigned long flags)
> > {
> > return _calc_vm_trans(flags, MAP_GROWSDOWN, VM_GROWSDOWN ) |
> > _calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) |
> > _calc_vm_trans(flags, MAP_SYNC, VM_SYNC ) |
> > _calc_vm_trans(flags, MAP_STACK, VM_NOHUGEPAGE) |
> > - arch_calc_vm_flag_bits(flags);
> > + arch_calc_vm_flag_bits(file, flags);
>
> Nitpick (but please ignore, Andrew picked them up already): one space
> alignment off.
Ack yeah, I saw that at the time but didn't quite know how best to resolve it,
as my editor wanted to put in tabs but the line was already a mix of tabs +
spaces, which renders differently in the diff than in the actual code... but
in any case, good that it's resolved!
>
> --
> Catalin
Cheers, Lorenzo
* Re: [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
2024-10-30 11:53 ` Lorenzo Stoakes
@ 2024-10-30 12:39 ` Catalin Marinas
2024-10-30 15:00 ` Yang Shi
2024-10-30 14:58 ` Yang Shi
1 sibling, 1 reply; 18+ messages in thread
From: Catalin Marinas @ 2024-10-30 12:39 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Vlastimil Babka, Andrew Morton, Liam R . Howlett, Jann Horn,
linux-kernel, linux-mm, Linus Torvalds, Peter Xu, Will Deacon,
Mark Brown, David S . Miller, Andreas Larsson,
James E . J . Bottomley, Helge Deller, Yang Shi
On Wed, Oct 30, 2024 at 11:53:06AM +0000, Lorenzo Stoakes wrote:
> On Wed, Oct 30, 2024 at 12:09:43PM +0100, Vlastimil Babka wrote:
> > On 10/30/24 11:58, Catalin Marinas wrote:
> > > On Wed, Oct 30, 2024 at 10:18:27AM +0100, Vlastimil Babka wrote:
> > >> On 10/29/24 19:11, Lorenzo Stoakes wrote:
> > >> > --- a/arch/arm64/include/asm/mman.h
> > >> > +++ b/arch/arm64/include/asm/mman.h
> > >> > @@ -6,6 +6,8 @@
> > >> >
> > >> > #ifndef BUILD_VDSO
> > >> > #include <linux/compiler.h>
> > >> > +#include <linux/fs.h>
> > >> > +#include <linux/shmem_fs.h>
> > >> > #include <linux/types.h>
> > >> >
> > >> > static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> > >> > @@ -31,19 +33,21 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> > >> > }
> > >> > #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
> > >> >
> > >> > -static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
> > >> > +static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
> > >> > + unsigned long flags)
> > >> > {
> > >> > /*
> > >> > * Only allow MTE on anonymous mappings as these are guaranteed to be
> > >> > * backed by tags-capable memory. The vm_flags may be overridden by a
> > >> > * filesystem supporting MTE (RAM-based).
> > >>
> > >> We should also eventually remove the last sentence or even replace it with
> > >> its negation, or somebody might try reintroducing the pattern that won't
> > >> work anymore (wasn't there such a hugetlbfs thing in -next?).
> > >
> > > I agree, we should update this comment as well though as a fix this
> > > patch is fine for now.
> > >
> > > There is indeed a hugetlbfs change in -next adding VM_MTE_ALLOWED. It
> > > should still work after the above change but we'd need to move it over
> >
> > I guess it will work after the above change, but not after 5/5?
> >
> > > here (and fix the comment at the same time). We'll probably do it around
> > > -rc1 or maybe earlier once this fix hits mainline.
> >
> > I assume this will hopefully go to rc7.
>
> To be clear - this is a CRITICAL fix that MUST land for 6.12. I'd be inclined to
> try to get it to an earlier rc-.
Ah, good point. So after this series is merged at rc6/rc7, the new
MTE+hugetlbfs in -next won't work. Not an issue, it can be sorted out
later.
> > > I don't think we have
> > > an equivalent of shmem_file() for hugetlbfs, we'll need to figure
> > > something out.
> >
> > I've found is_file_hugepages(), could work? And while adding the hugetlbfs
> > change here, the comment could be adjusted too, right?
>
> Right but the MAP_HUGETLB should work to? Can we save such changes that
> alter any kind of existing behaviour to later series?
>
> As this is going to be backported (by me...!) and I don't want to risk
> inadvertant changes.
MAP_HUGETLB and is_file_hugepages() fixes can go in after 6.13-rc1. This
series is fine as is, we wouldn't backport any MAP_HUGETLB changes
anyway since the flag check wasn't the only issue that needed addressing
for hugetlb MTE mappings.
--
Catalin
* Re: [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
2024-10-30 11:53 ` Lorenzo Stoakes
2024-10-30 12:39 ` Catalin Marinas
@ 2024-10-30 14:58 ` Yang Shi
2024-10-30 15:08 ` Lorenzo Stoakes
1 sibling, 1 reply; 18+ messages in thread
From: Yang Shi @ 2024-10-30 14:58 UTC (permalink / raw)
To: Lorenzo Stoakes, Vlastimil Babka
Cc: Catalin Marinas, Andrew Morton, Liam R . Howlett, Jann Horn,
linux-kernel, linux-mm, Linus Torvalds, Peter Xu, Will Deacon,
Mark Brown, David S . Miller, Andreas Larsson,
James E . J . Bottomley, Helge Deller
On 10/30/24 4:53 AM, Lorenzo Stoakes wrote:
> On Wed, Oct 30, 2024 at 12:09:43PM +0100, Vlastimil Babka wrote:
>> On 10/30/24 11:58, Catalin Marinas wrote:
>>> On Wed, Oct 30, 2024 at 10:18:27AM +0100, Vlastimil Babka wrote:
>>>> On 10/29/24 19:11, Lorenzo Stoakes wrote:
>>>>> --- a/arch/arm64/include/asm/mman.h
>>>>> +++ b/arch/arm64/include/asm/mman.h
>>>>> @@ -6,6 +6,8 @@
>>>>>
>>>>> #ifndef BUILD_VDSO
>>>>> #include <linux/compiler.h>
>>>>> +#include <linux/fs.h>
>>>>> +#include <linux/shmem_fs.h>
>>>>> #include <linux/types.h>
>>>>>
>>>>> static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
>>>>> @@ -31,19 +33,21 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
>>>>> }
>>>>> #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
>>>>>
>>>>> -static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
>>>>> +static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
>>>>> + unsigned long flags)
>>>>> {
>>>>> /*
>>>>> * Only allow MTE on anonymous mappings as these are guaranteed to be
>>>>> * backed by tags-capable memory. The vm_flags may be overridden by a
>>>>> * filesystem supporting MTE (RAM-based).
>>>> We should also eventually remove the last sentence or even replace it with
>>>> its negation, or somebody might try reintroducing the pattern that won't
>>>> work anymore (wasn't there such a hugetlbfs thing in -next?).
>>> I agree, we should update this comment as well though as a fix this
>>> patch is fine for now.
>>>
>>> There is indeed a hugetlbfs change in -next adding VM_MTE_ALLOWED. It
>>> should still work after the above change but we'd need to move it over
>> I guess it will work after the above change, but not after 5/5?
>>
>>> here (and fix the comment at the same time). We'll probably do it around
>>> -rc1 or maybe earlier once this fix hits mainline.
>> I assume this will hopefully go to rc7.
> To be clear - this is a CRITICAL fix that MUST land for 6.12. I'd be inclined to
> try to get it to an earlier rc-.
>
>>> I don't think we have
>>> an equivalent of shmem_file() for hugetlbfs, we'll need to figure
>>> something out.
>> I've found is_file_hugepages(), could work? And while adding the hugetlbfs
>> change here, the comment could be adjusted too, right?
> Right, but MAP_HUGETLB should work too? Can we save such changes that
> alter any kind of existing behaviour for a later series?
We need both, because mmap()ing a hugetlbfs file may not use MAP_HUGETLB.
>
> As this is going to be backported (by me...!) and I don't want to risk
> inadvertent changes.
>
>>>>> */
>>>>> - if (system_supports_mte() && (flags & MAP_ANONYMOUS))
>>>>> + if (system_supports_mte() &&
>>>>> + ((flags & MAP_ANONYMOUS) || shmem_file(file)))
>>>>> return VM_MTE_ALLOWED;
>>>>>
>>>>> return 0;
>>>>> }
>>> This will conflict with the arm64 for-next/core tree as it's adding
>>> a MAP_HUGETLB check. Trivial resolution though, Stephen will handle it.
> Thanks!
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
2024-10-30 12:39 ` Catalin Marinas
@ 2024-10-30 15:00 ` Yang Shi
0 siblings, 0 replies; 18+ messages in thread
From: Yang Shi @ 2024-10-30 15:00 UTC (permalink / raw)
To: Catalin Marinas, Lorenzo Stoakes
Cc: Vlastimil Babka, Andrew Morton, Liam R . Howlett, Jann Horn,
linux-kernel, linux-mm, Linus Torvalds, Peter Xu, Will Deacon,
Mark Brown, David S . Miller, Andreas Larsson,
James E . J . Bottomley, Helge Deller
On 10/30/24 5:39 AM, Catalin Marinas wrote:
> On Wed, Oct 30, 2024 at 11:53:06AM +0000, Lorenzo Stoakes wrote:
>> On Wed, Oct 30, 2024 at 12:09:43PM +0100, Vlastimil Babka wrote:
>>> On 10/30/24 11:58, Catalin Marinas wrote:
>>>> On Wed, Oct 30, 2024 at 10:18:27AM +0100, Vlastimil Babka wrote:
>>>>> On 10/29/24 19:11, Lorenzo Stoakes wrote:
>>>>>> --- a/arch/arm64/include/asm/mman.h
>>>>>> +++ b/arch/arm64/include/asm/mman.h
>>>>>> @@ -6,6 +6,8 @@
>>>>>>
>>>>>> #ifndef BUILD_VDSO
>>>>>> #include <linux/compiler.h>
>>>>>> +#include <linux/fs.h>
>>>>>> +#include <linux/shmem_fs.h>
>>>>>> #include <linux/types.h>
>>>>>>
>>>>>> static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
>>>>>> @@ -31,19 +33,21 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
>>>>>> }
>>>>>> #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
>>>>>>
>>>>>> -static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
>>>>>> +static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
>>>>>> + unsigned long flags)
>>>>>> {
>>>>>> /*
>>>>>> * Only allow MTE on anonymous mappings as these are guaranteed to be
>>>>>> * backed by tags-capable memory. The vm_flags may be overridden by a
>>>>>> * filesystem supporting MTE (RAM-based).
>>>>> We should also eventually remove the last sentence or even replace it with
>>>>> its negation, or somebody might try reintroducing the pattern that won't
>>>>> work anymore (wasn't there such a hugetlbfs thing in -next?).
>>>> I agree, we should update this comment as well though as a fix this
>>>> patch is fine for now.
>>>>
>>>> There is indeed a hugetlbfs change in -next adding VM_MTE_ALLOWED. It
>>>> should still work after the above change but we'd need to move it over
>>> I guess it will work after the above change, but not after 5/5?
>>>
>>>> here (and fix the comment at the same time). We'll probably do it around
>>>> -rc1 or maybe earlier once this fix hits mainline.
>>> I assume this will hopefully go to rc7.
>> To be clear - this is a CRITICAL fix that MUST land for 6.12. I'd be inclined to
>> try to get it to an earlier rc-.
> Ah, good point. So after this series is merged at rc6/rc7, the new
> MTE+hugetlbfs in -next won't work. Not an issue, it can be sorted out
> later.
>
>>>> I don't think we have
>>>> an equivalent of shmem_file() for hugetlbfs, we'll need to figure
>>>> something out.
>>> I've found is_file_hugepages(), could work? And while adding the hugetlbfs
>>> change here, the comment could be adjusted too, right?
>> Right, but MAP_HUGETLB should work too? Can we save such changes that
>> alter any kind of existing behaviour for a later series?
>>
>> As this is going to be backported (by me...!) and I don't want to risk
>> inadvertent changes.
> MAP_HUGETLB and is_file_hugepages() fixes can go in after 6.13-rc1. This
> series is fine as is, we wouldn't backport any MAP_HUGETLB changes
> anyway since the flag check wasn't the only issue that needed addressing
> for hugetlb MTE mappings.
I agree. The fix looks trivial.
>
* Re: [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
2024-10-30 14:58 ` Yang Shi
@ 2024-10-30 15:08 ` Lorenzo Stoakes
2024-10-30 15:48 ` Yang Shi
0 siblings, 1 reply; 18+ messages in thread
From: Lorenzo Stoakes @ 2024-10-30 15:08 UTC (permalink / raw)
To: Yang Shi
Cc: Vlastimil Babka, Catalin Marinas, Andrew Morton,
Liam R . Howlett, Jann Horn, linux-kernel, linux-mm,
Linus Torvalds, Peter Xu, Will Deacon, Mark Brown,
David S . Miller, Andreas Larsson, James E . J . Bottomley,
Helge Deller
On Wed, Oct 30, 2024 at 07:58:33AM -0700, Yang Shi wrote:
>
>
> On 10/30/24 4:53 AM, Lorenzo Stoakes wrote:
> > On Wed, Oct 30, 2024 at 12:09:43PM +0100, Vlastimil Babka wrote:
> > > On 10/30/24 11:58, Catalin Marinas wrote:
> > > > On Wed, Oct 30, 2024 at 10:18:27AM +0100, Vlastimil Babka wrote:
> > > > > On 10/29/24 19:11, Lorenzo Stoakes wrote:
> > > > > > --- a/arch/arm64/include/asm/mman.h
> > > > > > +++ b/arch/arm64/include/asm/mman.h
> > > > > > @@ -6,6 +6,8 @@
> > > > > >
> > > > > > #ifndef BUILD_VDSO
> > > > > > #include <linux/compiler.h>
> > > > > > +#include <linux/fs.h>
> > > > > > +#include <linux/shmem_fs.h>
> > > > > > #include <linux/types.h>
> > > > > >
> > > > > > static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> > > > > > @@ -31,19 +33,21 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> > > > > > }
> > > > > > #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
> > > > > >
> > > > > > -static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
> > > > > > +static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
> > > > > > + unsigned long flags)
> > > > > > {
> > > > > > /*
> > > > > > * Only allow MTE on anonymous mappings as these are guaranteed to be
> > > > > > * backed by tags-capable memory. The vm_flags may be overridden by a
> > > > > > * filesystem supporting MTE (RAM-based).
> > > > > We should also eventually remove the last sentence or even replace it with
> > > > > its negation, or somebody might try reintroducing the pattern that won't
> > > > > work anymore (wasn't there such a hugetlbfs thing in -next?).
> > > > I agree, we should update this comment as well though as a fix this
> > > > patch is fine for now.
> > > >
> > > > There is indeed a hugetlbfs change in -next adding VM_MTE_ALLOWED. It
> > > > should still work after the above change but we'd need to move it over
> > > I guess it will work after the above change, but not after 5/5?
> > >
> > > > here (and fix the comment at the same time). We'll probably do it around
> > > > -rc1 or maybe earlier once this fix hits mainline.
> > > I assume this will hopefully go to rc7.
> > To be clear - this is a CRITICAL fix that MUST land for 6.12. I'd be inclined to
> > try to get it to an earlier rc-.
> >
> > > > I don't think we have
> > > > an equivalent of shmem_file() for hugetlbfs, we'll need to figure
> > > > something out.
> > > I've found is_file_hugepages(), could work? And while adding the hugetlbfs
> > > change here, the comment could be adjusted too, right?
> > Right, but MAP_HUGETLB should work too? Can we save such changes that
> > alter any kind of existing behaviour for a later series?
>
> We need both, because mmap()ing a hugetlbfs file may not use MAP_HUGETLB.
Right yeah, we could create a memfd with MFD_HUGETLB for instance and map
that...
Perhaps somebody could propose the 6.13 change (as this series is just
focused on the hotfix)?
Note that we absolutely plan to try to merge this in 6.12 (it is a critical
fix for a few separate issues).
I guess since we already have something in the arm64 tree adding
MAP_HUGETLB we could rebase that and add an is_file_hugepages() check there
to cover that case too?
(Though I note that shm_file_operations_huge also sets FOP_HUGE_PAGES,
which this predicate picks up; not sure if we're OK with that? But that
discussion is better had, I think, in whichever thread this hugetlb change
came from.)
Catalin, perhaps?
>
> >
> > As this is going to be backported (by me...!) and I don't want to risk
> > inadvertent changes.
> >
> > > > > > */
> > > > > > - if (system_supports_mte() && (flags & MAP_ANONYMOUS))
> > > > > > + if (system_supports_mte() &&
> > > > > > + ((flags & MAP_ANONYMOUS) || shmem_file(file)))
> > > > > > return VM_MTE_ALLOWED;
> > > > > >
> > > > > > return 0;
> > > > > > }
> > > > This will conflict with the arm64 for-next/core tree as it's adding
> > > > a MAP_HUGETLB check. Trivial resolution though, Stephen will handle it.
> > Thanks!
> >
>
Thanks all!
* Re: [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
2024-10-30 15:08 ` Lorenzo Stoakes
@ 2024-10-30 15:48 ` Yang Shi
0 siblings, 0 replies; 18+ messages in thread
From: Yang Shi @ 2024-10-30 15:48 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Vlastimil Babka, Catalin Marinas, Andrew Morton,
Liam R . Howlett, Jann Horn, linux-kernel, linux-mm,
Linus Torvalds, Peter Xu, Will Deacon, Mark Brown,
David S . Miller, Andreas Larsson, James E . J . Bottomley,
Helge Deller
On 10/30/24 8:08 AM, Lorenzo Stoakes wrote:
> On Wed, Oct 30, 2024 at 07:58:33AM -0700, Yang Shi wrote:
>>
>> On 10/30/24 4:53 AM, Lorenzo Stoakes wrote:
>>> On Wed, Oct 30, 2024 at 12:09:43PM +0100, Vlastimil Babka wrote:
>>>> On 10/30/24 11:58, Catalin Marinas wrote:
>>>>> On Wed, Oct 30, 2024 at 10:18:27AM +0100, Vlastimil Babka wrote:
>>>>>> On 10/29/24 19:11, Lorenzo Stoakes wrote:
>>>>>>> --- a/arch/arm64/include/asm/mman.h
>>>>>>> +++ b/arch/arm64/include/asm/mman.h
>>>>>>> @@ -6,6 +6,8 @@
>>>>>>>
>>>>>>> #ifndef BUILD_VDSO
>>>>>>> #include <linux/compiler.h>
>>>>>>> +#include <linux/fs.h>
>>>>>>> +#include <linux/shmem_fs.h>
>>>>>>> #include <linux/types.h>
>>>>>>>
>>>>>>> static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
>>>>>>> @@ -31,19 +33,21 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
>>>>>>> }
>>>>>>> #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
>>>>>>>
>>>>>>> -static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
>>>>>>> +static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
>>>>>>> + unsigned long flags)
>>>>>>> {
>>>>>>> /*
>>>>>>> * Only allow MTE on anonymous mappings as these are guaranteed to be
>>>>>>> * backed by tags-capable memory. The vm_flags may be overridden by a
>>>>>>> * filesystem supporting MTE (RAM-based).
>>>>>> We should also eventually remove the last sentence or even replace it with
>>>>>> its negation, or somebody might try reintroducing the pattern that won't
>>>>>> work anymore (wasn't there such a hugetlbfs thing in -next?).
>>>>> I agree, we should update this comment as well though as a fix this
>>>>> patch is fine for now.
>>>>>
>>>>> There is indeed a hugetlbfs change in -next adding VM_MTE_ALLOWED. It
>>>>> should still work after the above change but we'd need to move it over
>>>> I guess it will work after the above change, but not after 5/5?
>>>>
>>>>> here (and fix the comment at the same time). We'll probably do it around
>>>>> -rc1 or maybe earlier once this fix hits mainline.
>>>> I assume this will hopefully go to rc7.
>>> To be clear - this is a CRITICAL fix that MUST land for 6.12. I'd be inclined to
>>> try to get it to an earlier rc-.
>>>
>>>>> I don't think we have
>>>>> an equivalent of shmem_file() for hugetlbfs, we'll need to figure
>>>>> something out.
>>>> I've found is_file_hugepages(), could work? And while adding the hugetlbfs
>>>> change here, the comment could be adjusted too, right?
>>> Right, but MAP_HUGETLB should work too? Can we save such changes that
>>> alter any kind of existing behaviour for a later series?
>> We need both, because mmap()ing a hugetlbfs file may not use MAP_HUGETLB.
> Right yeah, we could create a memfd with MFD_HUGETLB for instance and map
> that...
>
> Perhaps somebody could propose the 6.13 change (as this series is just
> focused on the hotfix)?
Once this series goes in at rc7, we (Catalin and I) will need to rebase the
hugetlb MTE patches anyway due to the conflict. But it should be trivial.
>
> Note that we absolutely plan to try to merge this in 6.12 (it is a critical
> fix for a few separate issues).
>
> I guess since we already have something in the arm64 tree adding
> MAP_HUGETLB we could rebase that and add a is_file_hugepages() there to
> cover off that case too?
Yes
>
> (Though I note that shm_file_operations_huge also sets FOP_HUGE_PAGES,
> which this predicate picks up; not sure if we're OK with that? But that
> discussion is better had, I think, in whichever thread this hugetlb change
> came from.)
It is OK. SHM_HUGETLB actually uses hugetlbfs.
>
> Catalin, perhaps?
>
>>> As this is going to be backported (by me...!) and I don't want to risk
>>> inadvertent changes.
>>>
>>>>>>> */
>>>>>>> - if (system_supports_mte() && (flags & MAP_ANONYMOUS))
>>>>>>> + if (system_supports_mte() &&
>>>>>>> + ((flags & MAP_ANONYMOUS) || shmem_file(file)))
>>>>>>> return VM_MTE_ALLOWED;
>>>>>>>
>>>>>>> return 0;
>>>>>>> }
>>>>> This will conflict with the arm64 for-next/core tree as it's adding
>>>>> a MAP_HUGETLB check. Trivial resolution though, Stephen will handle it.
>>> Thanks!
>>>
> Thanks all!
* Re: [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling
2024-10-30 11:09 ` Vlastimil Babka
2024-10-30 11:53 ` Lorenzo Stoakes
@ 2024-10-30 18:30 ` Catalin Marinas
1 sibling, 0 replies; 18+ messages in thread
From: Catalin Marinas @ 2024-10-30 18:30 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Lorenzo Stoakes, Andrew Morton, Liam R . Howlett, Jann Horn,
linux-kernel, linux-mm, Linus Torvalds, Peter Xu, Will Deacon,
Mark Brown, David S . Miller, Andreas Larsson,
James E . J . Bottomley, Helge Deller, Yang Shi
On Wed, Oct 30, 2024 at 12:09:43PM +0100, Vlastimil Babka wrote:
> On 10/30/24 11:58, Catalin Marinas wrote:
> > On Wed, Oct 30, 2024 at 10:18:27AM +0100, Vlastimil Babka wrote:
> >> On 10/29/24 19:11, Lorenzo Stoakes wrote:
> >> > --- a/arch/arm64/include/asm/mman.h
> >> > +++ b/arch/arm64/include/asm/mman.h
> >> > @@ -6,6 +6,8 @@
> >> >
> >> > #ifndef BUILD_VDSO
> >> > #include <linux/compiler.h>
> >> > +#include <linux/fs.h>
> >> > +#include <linux/shmem_fs.h>
> >> > #include <linux/types.h>
> >> >
> >> > static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> >> > @@ -31,19 +33,21 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> >> > }
> >> > #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
> >> >
> >> > -static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
> >> > +static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
> >> > + unsigned long flags)
> >> > {
> >> > /*
> >> > * Only allow MTE on anonymous mappings as these are guaranteed to be
> >> > * backed by tags-capable memory. The vm_flags may be overridden by a
> >> > * filesystem supporting MTE (RAM-based).
> >>
> >> We should also eventually remove the last sentence or even replace it with
> >> its negation, or somebody might try reintroducing the pattern that won't
> >> work anymore (wasn't there such a hugetlbfs thing in -next?).
> >
> > I agree, we should update this comment as well though as a fix this
> > patch is fine for now.
> >
> > There is indeed a hugetlbfs change in -next adding VM_MTE_ALLOWED. It
> > should still work after the above change but we'd need to move it over
>
> I guess it will work after the above change, but not after 5/5?
>
> > here (and fix the comment at the same time). We'll probably do it around
> > -rc1 or maybe earlier once this fix hits mainline.
>
> I assume this will hopefully go to rc7.
>
> > I don't think we have
> > an equivalent of shmem_file() for hugetlbfs, we'll need to figure
> > something out.
>
> I've found is_file_hugepages(), could work? And while adding the hugetlbfs
> change here, the comment could be adjusted too, right?
Right, thanks for the hint.
I guess the conflict resolution in -next will be something like:
----------------8<----------------------------------
diff --cc arch/arm64/include/asm/mman.h
index 798d965760d4,65bc2b07f666..8b9b819196e5
--- a/arch/arm64/include/asm/mman.h
+++ b/arch/arm64/include/asm/mman.h
@@@ -42,7 -39,7 +42,7 @@@ static inline unsigned long arch_calc_v
* filesystem supporting MTE (RAM-based).
*/
if (system_supports_mte() &&
- ((flags & MAP_ANONYMOUS) || shmem_file(file)))
- (flags & (MAP_ANONYMOUS | MAP_HUGETLB)))
++ ((flags & (MAP_ANONYMOUS | MAP_HUGETLB)) || shmem_file(file)))
return VM_MTE_ALLOWED;
return 0;
----------------8<----------------------------------
The fix-up for hugetlbfs is something like:
----------------8<----------------------------------
diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
index 8b9b819196e5..988eff8269a6 100644
--- a/arch/arm64/include/asm/mman.h
+++ b/arch/arm64/include/asm/mman.h
@@ -6,6 +6,7 @@
#ifndef BUILD_VDSO
#include <linux/compiler.h>
+#include <linux/hugetlb.h>
#include <linux/fs.h>
#include <linux/shmem_fs.h>
#include <linux/types.h>
@@ -37,12 +38,12 @@ static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
unsigned long flags)
{
/*
- * Only allow MTE on anonymous mappings as these are guaranteed to be
- * backed by tags-capable memory. The vm_flags may be overridden by a
- * filesystem supporting MTE (RAM-based).
+ * Only allow MTE on anonymous, shmem and hugetlb mappings as these
+ * are guaranteed to be backed by tags-capable memory.
*/
if (system_supports_mte() &&
- ((flags & (MAP_ANONYMOUS | MAP_HUGETLB)) || shmem_file(file)))
+ ((flags & (MAP_ANONYMOUS | MAP_HUGETLB)) || shmem_file(file) ||
+ (file && is_file_hugepages(file))))
return VM_MTE_ALLOWED;
return 0;
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index f26b3b53d7de..5cf327337e22 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -110,7 +110,7 @@ static int hugetlbfs_file_mmap(struct file *file, struct vm_area_struct *vma)
* way when do_mmap unwinds (may be important on powerpc
* and ia64).
*/
- vm_flags_set(vma, VM_HUGETLB | VM_DONTEXPAND | VM_MTE_ALLOWED);
+ vm_flags_set(vma, VM_HUGETLB | VM_DONTEXPAND);
vma->vm_ops = &hugetlb_vm_ops;
ret = seal_check_write(info->seals, vma);
----------------8<----------------------------------
We still have VM_DATA_DEFAULT_FLAGS but I think this is fine, the flag
is set by the arch code. This is only to allow mprotect(PROT_MTE) on brk
ranges if any user app wants to do that.
I did not specifically require that only the arch code sets
VM_MTE_ALLOWED but I'd expect it to be the case unless we get some
obscure arm-specific driver that wants to allow MTE on mmap for on-chip
memory (very unlikely though).
If that 'if' block were split into multiple ones, it would become harder
to read. shmem_file() does check for !file but is_file_hugepages()
doesn't, so we might as well put them under the same 'if' block. And
thinking about it, the current arm64 code seems broken, as it allows
mmap(MAP_ANONYMOUS | MAP_HUGETLB) but that didn't actually work properly
prior to Yang's patch in -next.
--
Catalin
end of thread, other threads:[~2024-10-30 18:30 UTC | newest]
Thread overview: 18+ messages
2024-10-29 18:11 [PATCH hotfix 6.12 v4 0/5] fix error handling in mmap_region() and refactor (hotfixes) Lorenzo Stoakes
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 1/5] mm: avoid unsafe VMA hook invocation when error arises on mmap hook Lorenzo Stoakes
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 2/5] mm: unconditionally close VMAs on error Lorenzo Stoakes
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 3/5] mm: refactor map_deny_write_exec() Lorenzo Stoakes
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 4/5] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling Lorenzo Stoakes
2024-10-30 9:18 ` Vlastimil Babka
2024-10-30 10:58 ` Catalin Marinas
2024-10-30 11:09 ` Vlastimil Babka
2024-10-30 11:53 ` Lorenzo Stoakes
2024-10-30 12:39 ` Catalin Marinas
2024-10-30 15:00 ` Yang Shi
2024-10-30 14:58 ` Yang Shi
2024-10-30 15:08 ` Lorenzo Stoakes
2024-10-30 15:48 ` Yang Shi
2024-10-30 18:30 ` Catalin Marinas
2024-10-30 12:00 ` Catalin Marinas
2024-10-30 12:13 ` Lorenzo Stoakes
2024-10-29 18:11 ` [PATCH hotfix 6.12 v4 5/5] mm: resolve faulty mmap_region() error path behaviour Lorenzo Stoakes