Re: [PATCH v2 0/6] mm: Optimize mseal checks

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* Re: [PATCH v2 0/6] mm: Optimize mseal checks
       [not found] <20240807211309.2729719-1-pedro.falcato@gmail.com>
@ 2024-08-08 23:12 ` Andrew Morton
  2024-08-09  0:34   ` Pedro Falcato
       [not found] ` <20240807211309.2729719-2-pedro.falcato@gmail.com>
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 26+ messages in thread
From: Andrew Morton @ 2024-08-08 23:12 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Liam R. Howlett, Vlastimil Babka, Lorenzo Stoakes, linux-mm,
	linux-kernel, oliver.sang, torvalds, jeffxu, Michael Ellerman

On Wed,  7 Aug 2024 22:13:03 +0100 Pedro Falcato <pedro.falcato@gmail.com> wrote:

> This series also depends on (and will eventually very slightly conflict with)
> the powerpc series that removes arch_unmap[2].

That's awkward.  Please describe the dependency?


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/6] mm: Optimize mseal checks
  2024-08-08 23:12 ` [PATCH v2 0/6] mm: Optimize mseal checks Andrew Morton
@ 2024-08-09  0:34   ` Pedro Falcato
  2024-08-09  1:02     ` Jeff Xu
  0 siblings, 1 reply; 26+ messages in thread
From: Pedro Falcato @ 2024-08-09  0:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Liam R. Howlett, Vlastimil Babka, Lorenzo Stoakes, linux-mm,
	linux-kernel, oliver.sang, torvalds, jeffxu, Michael Ellerman

On Fri, Aug 9, 2024 at 12:12 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Wed,  7 Aug 2024 22:13:03 +0100 Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> > This series also depends on (and will eventually very slightly conflict with)
> > the powerpc series that removes arch_unmap[2].
>
> That's awkward.  Please describe the dependency?

One of the transformations done in this patch series (patch 2) assumes
that arch_unmap either doesn't exist or does nothing.
PPC is the only architecture with an arch_unmap implementation, and
through the series I linked they're going to make it work via
->close().

What's the easiest way to deal with this? Can the PPC series go
through the mm tree?

I could also possibly work around this on my end, and either limit the
terribleness to the ppc arch_unmap code or limit the effectiveness of
the patch set a bit. But both of these options feel like somewhat
fighting the inevitable.

What do you think?

Thanks,
Pedro


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/6] mm: Optimize mseal checks
  2024-08-09  0:34   ` Pedro Falcato
@ 2024-08-09  1:02     ` Jeff Xu
  2024-08-09 19:34       ` Liam R. Howlett
  2024-08-15 22:10       ` Jeff Xu
  0 siblings, 2 replies; 26+ messages in thread
From: Jeff Xu @ 2024-08-09  1:02 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Andrew Morton, Liam R. Howlett, Vlastimil Babka, Lorenzo Stoakes,
	linux-mm, linux-kernel, oliver.sang, torvalds, Michael Ellerman

On Thu, Aug 8, 2024 at 5:34 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> On Fri, Aug 9, 2024 at 12:12 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Wed,  7 Aug 2024 22:13:03 +0100 Pedro Falcato <pedro.falcato@gmail.com> wrote:
> >
> > > This series also depends on (and will eventually very slightly conflict with)
> > > the powerpc series that removes arch_unmap[2].
> >
> > That's awkward.  Please describe the dependency?
>
> One of the transformations done in this patch series (patch 2) assumes
> that arch_unmap either doesn't exist or does nothing.
> PPC is the only architecture with an arch_unmap implementation, and
> through the series I linked they're going to make it work via
> ->close().
>
> What's the easiest way to deal with this? Can the PPC series go
> through the mm tree?
>
This patch can't be merged until arch_unmap() is all removed (ppc change)

Also I'm still doing a test/reviewing for this patch,  perhaps it is
better to wait till my test is done.

Thanks
-Jeff


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/6] mm: Move can_modify_vma to mm/internal.h
       [not found] ` <20240807211309.2729719-2-pedro.falcato@gmail.com>
@ 2024-08-09  9:57   ` Lorenzo Stoakes
  0 siblings, 0 replies; 26+ messages in thread
From: Lorenzo Stoakes @ 2024-08-09  9:57 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Andrew Morton, Liam R. Howlett, Vlastimil Babka, linux-mm,
	linux-kernel, oliver.sang, torvalds, jeffxu, Michael Ellerman

On Wed, Aug 07, 2024 at 10:13:04PM GMT, Pedro Falcato wrote:
> Move can_modify_vma to internal.h so it can be inlined properly (with
> the intent to remove can_modify_mm callsites).

This is kind of petty, but it'd be nice to move this to mm/vma.h instead as it's
a vma-specific thing.

This also makes it testable under vma userland tests.

>
> Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com>
> ---
>  mm/internal.h | 24 ++++++++++++++++++++++++
>  mm/mseal.c    | 17 -----------------
>  2 files changed, 24 insertions(+), 17 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 52f7fc4e8ac..90f50f3c4cf 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1353,6 +1353,24 @@ static inline int can_do_mseal(unsigned long flags)
>  	return 0;
>  }
>
> +static inline bool vma_is_sealed(struct vm_area_struct *vma)
> +{
> +	return (vma->vm_flags & VM_SEALED);
> +}
> +
> +/*
> + * check if a vma is sealed for modification.
> + * return true, if modification is allowed.
> + */
> +static inline bool can_modify_vma(struct vm_area_struct *vma)
> +{
> +	if (unlikely(vma_is_sealed(vma)))
> +		return false;
> +
> +	return true;
> +}
> +
> +
>  bool can_modify_mm(struct mm_struct *mm, unsigned long start,
>  		unsigned long end);
>  bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start,
> @@ -1374,6 +1392,12 @@ static inline bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start,
>  {
>  	return true;
>  }
> +
> +static inline bool can_modify_vma(struct vm_area_struct *vma)
> +{
> +	return true;
> +}
> +
>  #endif
>
>  #ifdef CONFIG_SHRINKER_DEBUG
> diff --git a/mm/mseal.c b/mm/mseal.c
> index bf783bba8ed..4591ae8d29c 100644
> --- a/mm/mseal.c
> +++ b/mm/mseal.c
> @@ -16,28 +16,11 @@
>  #include <linux/sched.h>
>  #include "internal.h"
>
> -static inline bool vma_is_sealed(struct vm_area_struct *vma)
> -{
> -	return (vma->vm_flags & VM_SEALED);
> -}
> -
>  static inline void set_vma_sealed(struct vm_area_struct *vma)
>  {
>  	vm_flags_set(vma, VM_SEALED);
>  }
>
> -/*
> - * check if a vma is sealed for modification.
> - * return true, if modification is allowed.
> - */
> -static bool can_modify_vma(struct vm_area_struct *vma)
> -{
> -	if (unlikely(vma_is_sealed(vma)))
> -		return false;
> -
> -	return true;
> -}
> -
>  static bool is_madv_discard(int behavior)
>  {
>  	return	behavior &
> --
> 2.46.0
>


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 4/6] mm/mremap: Replace can_modify_mm with can_modify_vma
       [not found] ` <20240807211309.2729719-5-pedro.falcato@gmail.com>
@ 2024-08-09 16:05   ` Liam R. Howlett
  2024-08-09 18:45     ` Pedro Falcato
  0 siblings, 1 reply; 26+ messages in thread
From: Liam R. Howlett @ 2024-08-09 16:05 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Andrew Morton, Vlastimil Babka, Lorenzo Stoakes, linux-mm,
	linux-kernel, oliver.sang, torvalds, jeffxu, Michael Ellerman

* Pedro Falcato <pedro.falcato@gmail.com> [240807 17:13]:
> Delegate all can_modify checks to the proper places. Unmap checks are
> done in do_unmap (et al).
> 
> This patch allows for mremap partial failure in certain cases (for
> instance, when destination VMAs aren't sealed, but the source VMA is).
> It shouldn't be too troublesome, as you'd need to go out of your way to
> do illegal operations on a VMA.

As mseal() is supposed to be a security thing, is the illegal operation
not a concern?

> 
> Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com>
> ---
> v2:
> 	- Removed a superfluous check in mremap (Jeff Xu)
> 
>  mm/mremap.c | 30 ++++--------------------------
>  1 file changed, 4 insertions(+), 26 deletions(-)
> 
> diff --git a/mm/mremap.c b/mm/mremap.c
> index e7ae140fc64..35afb3e38a8 100644
> --- a/mm/mremap.c
> +++ b/mm/mremap.c
> @@ -821,6 +821,10 @@ static struct vm_area_struct *vma_to_resize(unsigned long addr,
>  	if (!vma)
>  		return ERR_PTR(-EFAULT);
>  
> +	/* Don't allow vma expansion when it has already been sealed */
> +	if (!can_modify_vma(vma))
> +		return ERR_PTR(-EPERM);
> +
>  	/*
>  	 * !old_len is a special case where an attempt is made to 'duplicate'
>  	 * a mapping.  This makes no sense for private mappings as it will
> @@ -902,19 +906,6 @@ static unsigned long mremap_to(unsigned long addr, unsigned long old_len,
>  	if ((mm->map_count + 2) >= sysctl_max_map_count - 3)
>  		return -ENOMEM;
>  
> -	/*
> -	 * In mremap_to().
> -	 * Move a VMA to another location, check if src addr is sealed.
> -	 *
> -	 * Place can_modify_mm here because mremap_to()
> -	 * does its own checking for address range, and we only
> -	 * check the sealing after passing those checks.
> -	 *
> -	 * can_modify_mm assumes we have acquired the lock on MM.
> -	 */
> -	if (unlikely(!can_modify_mm(mm, addr, addr + old_len)))
> -		return -EPERM;
> -
>  	if (flags & MREMAP_FIXED) {
>  		/*
>  		 * In mremap_to().
> @@ -1079,19 +1070,6 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
>  		goto out;
>  	}
>  
> -	/*
> -	 * Below is shrink/expand case (not mremap_to())
> -	 * Check if src address is sealed, if so, reject.
> -	 * In other words, prevent shrinking or expanding a sealed VMA.
> -	 *
> -	 * Place can_modify_mm here so we can keep the logic related to
> -	 * shrink/expand together.
> -	 */
> -	if (unlikely(!can_modify_mm(mm, addr, addr + old_len))) {
> -		ret = -EPERM;
> -		goto out;
> -	}
> -
>  	/*
>  	 * Always allow a shrinking remap: that just unmaps
>  	 * the unnecessary pages..
> -- 
> 2.46.0
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/6] mm/munmap: Replace can_modify_mm with can_modify_vma
       [not found] ` <20240807211309.2729719-3-pedro.falcato@gmail.com>
@ 2024-08-09 16:15   ` Liam R. Howlett
  2024-08-09 16:48     ` Liam R. Howlett
  0 siblings, 1 reply; 26+ messages in thread
From: Liam R. Howlett @ 2024-08-09 16:15 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Andrew Morton, Vlastimil Babka, Lorenzo Stoakes, linux-mm,
	linux-kernel, oliver.sang, torvalds, jeffxu, Michael Ellerman

* Pedro Falcato <pedro.falcato@gmail.com> [240807 17:13]:
> We were doing an extra mmap tree traversal just to check if the entire
> range is modifiable. This can be done when we iterate through the VMAs
> instead.
> 
> Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com>
> ---
>  mm/mmap.c | 13 +------------
>  mm/vma.c  | 23 ++++++++++++-----------
>  2 files changed, 13 insertions(+), 23 deletions(-)
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 4a9c2329b09..c1c7a7d00f5 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1740,18 +1740,7 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>  		unsigned long start, unsigned long end, struct list_head *uf,
>  		bool unlock)
>  {
> -	struct mm_struct *mm = vma->vm_mm;
> -
> -	/*
> -	 * Check if memory is sealed before arch_unmap.
> -	 * Prevent unmapping a sealed VMA.
> -	 * can_modify_mm assumes we have acquired the lock on MM.
> -	 */
> -	if (unlikely(!can_modify_mm(mm, start, end)))
> -		return -EPERM;
> -
> -	arch_unmap(mm, start, end);
> -	return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
> +	return do_vmi_align_munmap(vmi, vma, vma->vm_mm, start, end, uf, unlock);
>  }
>  
>  /*
> diff --git a/mm/vma.c b/mm/vma.c
> index bf0546fe6ea..7a121bcc907 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -712,6 +712,12 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>  		if (end < vma->vm_end && mm->map_count >= sysctl_max_map_count)
>  			goto map_count_exceeded;
>  
> +		/* Don't bother splitting the VMA if we can't unmap it anyway */
> +		if (!can_modify_vma(vma)) {
> +			error = -EPERM;
> +			goto start_split_failed;
> +		}
> +

Would this check be better placed in __split_vma()?  It could replace
both this and the next chunk of code.

>  		error = __split_vma(vmi, vma, start, 1);
>  		if (error)
>  			goto start_split_failed;
> @@ -723,6 +729,11 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>  	 */
>  	next = vma;
>  	do {
> +		if (!can_modify_vma(vma)) {
> +			error = -EPERM;
> +			goto modify_vma_failed;
> +		}
> +
>  		/* Does it split the end? */
>  		if (next->vm_end > end) {
>  			error = __split_vma(vmi, next, end, 0);
> @@ -815,6 +826,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>  	__mt_destroy(&mt_detach);
>  	return 0;
>  
> +modify_vma_failed:
>  clear_tree_failed:
>  userfaultfd_error:
>  munmap_gather_failed:
> @@ -860,17 +872,6 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
>  	if (end == start)
>  		return -EINVAL;
>  
> -	/*
> -	 * Check if memory is sealed before arch_unmap.
> -	 * Prevent unmapping a sealed VMA.
> -	 * can_modify_mm assumes we have acquired the lock on MM.
> -	 */
> -	if (unlikely(!can_modify_mm(mm, start, end)))
> -		return -EPERM;
> -
> -	 /* arch_unmap() might do unmaps itself.  */
> -	arch_unmap(mm, start, end);
> -
>  	/* Find the first overlapping VMA */
>  	vma = vma_find(vmi, end);
>  	if (!vma) {
> -- 
> 2.46.0
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 5/6] mseal: Replace can_modify_mm_madv with a vma variant
       [not found] ` <20240807211309.2729719-6-pedro.falcato@gmail.com>
@ 2024-08-09 16:27   ` Liam R. Howlett
  0 siblings, 0 replies; 26+ messages in thread
From: Liam R. Howlett @ 2024-08-09 16:27 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Andrew Morton, Vlastimil Babka, Lorenzo Stoakes, linux-mm,
	linux-kernel, oliver.sang, torvalds, jeffxu, Michael Ellerman

* Pedro Falcato <pedro.falcato@gmail.com> [240807 17:13]:
> Replace can_modify_mm_madv() with a single vma variant, and associated
> checks in madvise.
> 
> While we're at it, also invert the order of checks in:
>  if (unlikely(is_ro_anon(vma) && !can_modify_vma(vma))
> 
> Checking if we can modify the vma itself (through vm_flags) is
> certainly cheaper than is_ro_anon() due to arch_vma_access_permitted()
> looking at e.g pkeys registers (with extra branches) in some
> architectures.

Doesn't this also allow for partial madvise success?  If you pass a
range across vmas, then it will fail once it encounters an mseal'ed vma.
This change should probably be reflected in the change log.


> 
> Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com>
> ---
>  mm/internal.h |  6 ++----
>  mm/madvise.c  | 13 +++----------
>  mm/mseal.c    | 17 ++++-------------
>  3 files changed, 9 insertions(+), 27 deletions(-)
> 
> diff --git a/mm/internal.h b/mm/internal.h
> index 90f50f3c4cf..3f9a5c17626 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1373,8 +1373,7 @@ static inline bool can_modify_vma(struct vm_area_struct *vma)
>  
>  bool can_modify_mm(struct mm_struct *mm, unsigned long start,
>  		unsigned long end);
> -bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start,
> -		unsigned long end, int behavior);
> +bool can_modify_vma_madv(struct vm_area_struct *vma, int behavior);
>  #else
>  static inline int can_do_mseal(unsigned long flags)
>  {
> @@ -1387,8 +1386,7 @@ static inline bool can_modify_mm(struct mm_struct *mm, unsigned long start,
>  	return true;
>  }
>  
> -static inline bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start,
> -		unsigned long end, int behavior)
> +static inline bool can_modify_vma_madv(struct vm_area_struct *vma, int behavior)
>  {
>  	return true;
>  }
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 89089d84f8d..4e64770be16 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1031,6 +1031,9 @@ static int madvise_vma_behavior(struct vm_area_struct *vma,
>  	struct anon_vma_name *anon_name;
>  	unsigned long new_flags = vma->vm_flags;
>  
> +	if (unlikely(!can_modify_vma_madv(vma, behavior)))
> +		return -EPERM;
> +
>  	switch (behavior) {
>  	case MADV_REMOVE:
>  		return madvise_remove(vma, prev, start, end);
> @@ -1448,15 +1451,6 @@ int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int beh
>  	start = untagged_addr_remote(mm, start);
>  	end = start + len;
>  
> -	/*
> -	 * Check if the address range is sealed for do_madvise().
> -	 * can_modify_mm_madv assumes we have acquired the lock on MM.
> -	 */
> -	if (unlikely(!can_modify_mm_madv(mm, start, end, behavior))) {
> -		error = -EPERM;
> -		goto out;
> -	}
> -

Funny, this check stopped populate madvise operations.  The new code
does not, which is probably better and fine.

>  	blk_start_plug(&plug);
>  	switch (behavior) {
>  	case MADV_POPULATE_READ:
> @@ -1470,7 +1464,6 @@ int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int beh
>  	}
>  	blk_finish_plug(&plug);
>  
> -out:
>  	if (write)
>  		mmap_write_unlock(mm);
>  	else
> diff --git a/mm/mseal.c b/mm/mseal.c
> index 4591ae8d29c..6559242dd05 100644
> --- a/mm/mseal.c
> +++ b/mm/mseal.c
> @@ -67,24 +67,15 @@ bool can_modify_mm(struct mm_struct *mm, unsigned long start, unsigned long end)
>  }
>  
>  /*
> - * Check if the vmas of a memory range are allowed to be modified by madvise.
> - * the memory ranger can have a gap (unallocated memory).
> - * return true, if it is allowed.
> + * Check if a vma is allowed to be modified by madvise.
>   */
> -bool can_modify_mm_madv(struct mm_struct *mm, unsigned long start, unsigned long end,
> -		int behavior)
> +bool can_modify_vma_madv(struct vm_area_struct *vma, int behavior)
>  {
> -	struct vm_area_struct *vma;
> -
> -	VMA_ITERATOR(vmi, mm, start);
> -
>  	if (!is_madv_discard(behavior))
>  		return true;
>  
> -	/* going through each vma to check. */
> -	for_each_vma_range(vmi, vma, end)
> -		if (unlikely(is_ro_anon(vma) && !can_modify_vma(vma)))
> -			return false;
> +	if (unlikely(!can_modify_vma(vma) && is_ro_anon(vma)))
> +		return false;
>  
>  	/* Allow by default. */
>  	return true;
> -- 
> 2.46.0
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/6] mm/munmap: Replace can_modify_mm with can_modify_vma
  2024-08-09 16:15   ` [PATCH v2 2/6] mm/munmap: " Liam R. Howlett
@ 2024-08-09 16:48     ` Liam R. Howlett
  2024-08-09 18:53       ` Pedro Falcato
  0 siblings, 1 reply; 26+ messages in thread
From: Liam R. Howlett @ 2024-08-09 16:48 UTC (permalink / raw)
  To: Pedro Falcato, Andrew Morton, Vlastimil Babka, Lorenzo Stoakes,
	linux-mm, linux-kernel, oliver.sang, torvalds, jeffxu,
	Michael Ellerman

* Liam R. Howlett <Liam.Howlett@oracle.com> [240809 12:15]:
> * Pedro Falcato <pedro.falcato@gmail.com> [240807 17:13]:
> > We were doing an extra mmap tree traversal just to check if the entire
> > range is modifiable. This can be done when we iterate through the VMAs
> > instead.
> > 
> > Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com>
> > ---
> >  mm/mmap.c | 13 +------------
> >  mm/vma.c  | 23 ++++++++++++-----------
> >  2 files changed, 13 insertions(+), 23 deletions(-)
> > 
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 4a9c2329b09..c1c7a7d00f5 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -1740,18 +1740,7 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> >  		unsigned long start, unsigned long end, struct list_head *uf,
> >  		bool unlock)
> >  {
> > -	struct mm_struct *mm = vma->vm_mm;
> > -
> > -	/*
> > -	 * Check if memory is sealed before arch_unmap.
> > -	 * Prevent unmapping a sealed VMA.
> > -	 * can_modify_mm assumes we have acquired the lock on MM.
> > -	 */
> > -	if (unlikely(!can_modify_mm(mm, start, end)))
> > -		return -EPERM;
> > -
> > -	arch_unmap(mm, start, end);
> > -	return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
> > +	return do_vmi_align_munmap(vmi, vma, vma->vm_mm, start, end, uf, unlock);
> >  }
> >  
> >  /*
> > diff --git a/mm/vma.c b/mm/vma.c
> > index bf0546fe6ea..7a121bcc907 100644
> > --- a/mm/vma.c
> > +++ b/mm/vma.c
> > @@ -712,6 +712,12 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> >  		if (end < vma->vm_end && mm->map_count >= sysctl_max_map_count)
> >  			goto map_count_exceeded;
> >  
> > +		/* Don't bother splitting the VMA if we can't unmap it anyway */
> > +		if (!can_modify_vma(vma)) {
> > +			error = -EPERM;
> > +			goto start_split_failed;
> > +		}
> > +
> 
> Would this check be better placed in __split_vma()?  It could replace
> both this and the next chunk of code.

not quite.

> 
> >  		error = __split_vma(vmi, vma, start, 1);
> >  		if (error)
> >  			goto start_split_failed;
> > @@ -723,6 +729,11 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> >  	 */
> >  	next = vma;
> >  	do {
> > +		if (!can_modify_vma(vma)) {
> > +			error = -EPERM;
> > +			goto modify_vma_failed;
> > +		}
> > +

This chunk would need to be moved below the end check so that we catch
full vma unmaps.

> >  		/* Does it split the end? */
> >  		if (next->vm_end > end) {
> >  			error = __split_vma(vmi, next, end, 0);
> > @@ -815,6 +826,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> >  	__mt_destroy(&mt_detach);
> >  	return 0;
> >  
> > +modify_vma_failed:
> >  clear_tree_failed:
> >  userfaultfd_error:
> >  munmap_gather_failed:
> > @@ -860,17 +872,6 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
> >  	if (end == start)
> >  		return -EINVAL;
> >  
> > -	/*
> > -	 * Check if memory is sealed before arch_unmap.
> > -	 * Prevent unmapping a sealed VMA.
> > -	 * can_modify_mm assumes we have acquired the lock on MM.
> > -	 */
> > -	if (unlikely(!can_modify_mm(mm, start, end)))
> > -		return -EPERM;
> > -
> > -	 /* arch_unmap() might do unmaps itself.  */
> > -	arch_unmap(mm, start, end);
> > -
> >  	/* Find the first overlapping VMA */
> >  	vma = vma_find(vmi, end);
> >  	if (!vma) {
> > -- 
> > 2.46.0
> > 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 4/6] mm/mremap: Replace can_modify_mm with can_modify_vma
  2024-08-09 16:05   ` [PATCH v2 4/6] mm/mremap: Replace can_modify_mm with can_modify_vma Liam R. Howlett
@ 2024-08-09 18:45     ` Pedro Falcato
  2024-08-12 15:22       ` Liam R. Howlett
  0 siblings, 1 reply; 26+ messages in thread
From: Pedro Falcato @ 2024-08-09 18:45 UTC (permalink / raw)
  To: Liam R. Howlett, Pedro Falcato, Andrew Morton, Vlastimil Babka,
	Lorenzo Stoakes, linux-mm, linux-kernel, oliver.sang, torvalds,
	jeffxu, Michael Ellerman

On Fri, Aug 9, 2024 at 5:06 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>
> * Pedro Falcato <pedro.falcato@gmail.com> [240807 17:13]:
> > Delegate all can_modify checks to the proper places. Unmap checks are
> > done in do_unmap (et al).
> >
> > This patch allows for mremap partial failure in certain cases (for
> > instance, when destination VMAs aren't sealed, but the source VMA is).
> > It shouldn't be too troublesome, as you'd need to go out of your way to
> > do illegal operations on a VMA.
>
> As mseal() is supposed to be a security thing, is the illegal operation
> not a concern?

My 3 cents (note: I'm not a security guy):

- Linux m*() operations have been allowed to partially fail for ages.
POSIX only disallows this in the munmap case (which is why we need all
that detached VMA logic), but not in any other case. We have a lot of
other failure points in these syscalls, and would require extensive
refactoring to patch this up (very likely with an inevitable
performance regression, as we saw in this case).

- Despite allowing for partial failure, this patch set always keeps
the sealed VMAs untouched (so that invariant isn't broken). The munmap
semantics remain untouched (and POSIXly correct) due to the detached
VMA logic.

- I personally have not heard of a single attack making use of this,
and the performance hit is very measurable and exists _for every
user_, despite mseal being a very niche feature (I cannot find a
single user of mseal upstream, both in debian code search, github,
chromium, v8, glibc, and what have you).

I remember SIGKILL on a bad operation being a possibility during the
mseal patch set talks, and if people are really scared of this partial
stuff, I would guess that's still a possibility. There are no users
after all...

-- 
Pedro


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/6] mm/munmap: Replace can_modify_mm with can_modify_vma
  2024-08-09 16:48     ` Liam R. Howlett
@ 2024-08-09 18:53       ` Pedro Falcato
  2024-08-09 19:24         ` Liam R. Howlett
  0 siblings, 1 reply; 26+ messages in thread
From: Pedro Falcato @ 2024-08-09 18:53 UTC (permalink / raw)
  To: Liam R. Howlett, Pedro Falcato, Andrew Morton, Vlastimil Babka,
	Lorenzo Stoakes, linux-mm, linux-kernel, oliver.sang, torvalds,
	jeffxu, Michael Ellerman

On Fri, Aug 9, 2024 at 5:48 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>
> * Liam R. Howlett <Liam.Howlett@oracle.com> [240809 12:15]:
> > * Pedro Falcato <pedro.falcato@gmail.com> [240807 17:13]:
> > > We were doing an extra mmap tree traversal just to check if the entire
> > > range is modifiable. This can be done when we iterate through the VMAs
> > > instead.
> > >
> > > Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com>
> > > ---
> > >  mm/mmap.c | 13 +------------
> > >  mm/vma.c  | 23 ++++++++++++-----------
> > >  2 files changed, 13 insertions(+), 23 deletions(-)
> > >
> > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > index 4a9c2329b09..c1c7a7d00f5 100644
> > > --- a/mm/mmap.c
> > > +++ b/mm/mmap.c
> > > @@ -1740,18 +1740,7 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > >             unsigned long start, unsigned long end, struct list_head *uf,
> > >             bool unlock)
> > >  {
> > > -   struct mm_struct *mm = vma->vm_mm;
> > > -
> > > -   /*
> > > -    * Check if memory is sealed before arch_unmap.
> > > -    * Prevent unmapping a sealed VMA.
> > > -    * can_modify_mm assumes we have acquired the lock on MM.
> > > -    */
> > > -   if (unlikely(!can_modify_mm(mm, start, end)))
> > > -           return -EPERM;
> > > -
> > > -   arch_unmap(mm, start, end);
> > > -   return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
> > > +   return do_vmi_align_munmap(vmi, vma, vma->vm_mm, start, end, uf, unlock);
> > >  }
> > >
> > >  /*
> > > diff --git a/mm/vma.c b/mm/vma.c
> > > index bf0546fe6ea..7a121bcc907 100644
> > > --- a/mm/vma.c
> > > +++ b/mm/vma.c
> > > @@ -712,6 +712,12 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > >             if (end < vma->vm_end && mm->map_count >= sysctl_max_map_count)
> > >                     goto map_count_exceeded;
> > >
> > > +           /* Don't bother splitting the VMA if we can't unmap it anyway */
> > > +           if (!can_modify_vma(vma)) {
> > > +                   error = -EPERM;
> > > +                   goto start_split_failed;
> > > +           }
> > > +
> >
> > Would this check be better placed in __split_vma()?  It could replace
> > both this and the next chunk of code.
>
> not quite.

Yeah, I was going to say that splitting a sealed VMA is okay (and we
allow it on mlock and madvise).

>
> >
> > >             error = __split_vma(vmi, vma, start, 1);
> > >             if (error)
> > >                     goto start_split_failed;
> > > @@ -723,6 +729,11 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > >      */
> > >     next = vma;
> > >     do {
> > > +           if (!can_modify_vma(vma)) {
> > > +                   error = -EPERM;
> > > +                   goto modify_vma_failed;
> > > +           }
> > > +
>
> This chunk would need to be moved below the end check so that we catch
> full vma unmaps.

Why below the end check? I believe we can avoid the split? Is there
something I'm missing?
But I did find a bug, what I really seem to want is:

 +           if (!can_modify_vma(next)) {
instead of (vma). It's somewhat concerning how the mseal selftests
didn't trip on this?

-- 
Pedro


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/6] mm/munmap: Replace can_modify_mm with can_modify_vma
  2024-08-09 18:53       ` Pedro Falcato
@ 2024-08-09 19:24         ` Liam R. Howlett
  2024-08-12 14:30           ` Jeff Xu
  0 siblings, 1 reply; 26+ messages in thread
From: Liam R. Howlett @ 2024-08-09 19:24 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Andrew Morton, Vlastimil Babka, Lorenzo Stoakes, linux-mm,
	linux-kernel, oliver.sang, torvalds, jeffxu, Michael Ellerman

* Pedro Falcato <pedro.falcato@gmail.com> [240809 14:53]:
> On Fri, Aug 9, 2024 at 5:48 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> >
> > * Liam R. Howlett <Liam.Howlett@oracle.com> [240809 12:15]:
> > > * Pedro Falcato <pedro.falcato@gmail.com> [240807 17:13]:
> > > > We were doing an extra mmap tree traversal just to check if the entire
> > > > range is modifiable. This can be done when we iterate through the VMAs
> > > > instead.
> > > >
> > > > Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com>
> > > > ---
> > > >  mm/mmap.c | 13 +------------
> > > >  mm/vma.c  | 23 ++++++++++++-----------
> > > >  2 files changed, 13 insertions(+), 23 deletions(-)
> > > >
> > > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > > index 4a9c2329b09..c1c7a7d00f5 100644
> > > > --- a/mm/mmap.c
> > > > +++ b/mm/mmap.c
> > > > @@ -1740,18 +1740,7 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > > >             unsigned long start, unsigned long end, struct list_head *uf,
> > > >             bool unlock)
> > > >  {
> > > > -   struct mm_struct *mm = vma->vm_mm;
> > > > -
> > > > -   /*
> > > > -    * Check if memory is sealed before arch_unmap.
> > > > -    * Prevent unmapping a sealed VMA.
> > > > -    * can_modify_mm assumes we have acquired the lock on MM.
> > > > -    */
> > > > -   if (unlikely(!can_modify_mm(mm, start, end)))
> > > > -           return -EPERM;
> > > > -
> > > > -   arch_unmap(mm, start, end);
> > > > -   return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
> > > > +   return do_vmi_align_munmap(vmi, vma, vma->vm_mm, start, end, uf, unlock);
> > > >  }
> > > >
> > > >  /*
> > > > diff --git a/mm/vma.c b/mm/vma.c
> > > > index bf0546fe6ea..7a121bcc907 100644
> > > > --- a/mm/vma.c
> > > > +++ b/mm/vma.c
> > > > @@ -712,6 +712,12 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > > >             if (end < vma->vm_end && mm->map_count >= sysctl_max_map_count)
> > > >                     goto map_count_exceeded;
> > > >
> > > > +           /* Don't bother splitting the VMA if we can't unmap it anyway */
> > > > +           if (!can_modify_vma(vma)) {
> > > > +                   error = -EPERM;
> > > > +                   goto start_split_failed;
> > > > +           }
> > > > +
> > >
> > > Would this check be better placed in __split_vma()?  It could replace
> > > both this and the next chunk of code.
> >
> > not quite.
> 
> Yeah, I was going to say that splitting a sealed VMA is okay (and we
> allow it on mlock and madvise).

splitting a sealed vma wasn't supposed to be okay.. but it is Jeff's
feature.  Jeff?

> 
> >
> > >
> > > >             error = __split_vma(vmi, vma, start, 1);
> > > >             if (error)
> > > >                     goto start_split_failed;
> > > > @@ -723,6 +729,11 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > > >      */
> > > >     next = vma;
> > > >     do {
> > > > +           if (!can_modify_vma(vma)) {
> > > > +                   error = -EPERM;
> > > > +                   goto modify_vma_failed;
> > > > +           }
> > > > +
> >
> > This chunk would need to be moved below the end check so that we catch
> > full vma unmaps.
> 
> Why below the end check? I believe we can avoid the split? Is there
> something I'm missing?
> But I did find a bug, what I really seem to want is:
> 
>  +           if (!can_modify_vma(next)) {

Good catch.

> instead of (vma). It's somewhat concerning how the mseal selftests
> didn't trip on this?

the end check will call split and will fail in there, if you move the
code as I suggested.

That means, if we aren't splitting, we still have to check the vma, so
the check is necessary.




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/6] mm: Optimize mseal checks
  2024-08-09  1:02     ` Jeff Xu
@ 2024-08-09 19:34       ` Liam R. Howlett
  2024-08-15 22:10       ` Jeff Xu
  1 sibling, 0 replies; 26+ messages in thread
From: Liam R. Howlett @ 2024-08-09 19:34 UTC (permalink / raw)
  To: Jeff Xu
  Cc: Pedro Falcato, Andrew Morton, Vlastimil Babka, Lorenzo Stoakes,
	linux-mm, linux-kernel, oliver.sang, torvalds, Michael Ellerman

* Jeff Xu <jeffxu@google.com> [240808 21:03]:
> On Thu, Aug 8, 2024 at 5:34 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
> >
> > On Fri, Aug 9, 2024 at 12:12 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > >
> > > On Wed,  7 Aug 2024 22:13:03 +0100 Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > >
> > > > This series also depends on (and will eventually very slightly conflict with)
> > > > the powerpc series that removes arch_unmap[2].
> > >
> > > That's awkward.  Please describe the dependency?
> >
> > One of the transformations done in this patch series (patch 2) assumes
> > that arch_unmap either doesn't exist or does nothing.
> > PPC is the only architecture with an arch_unmap implementation, and
> > through the series I linked they're going to make it work via
> > ->close().
> >
> > What's the easiest way to deal with this? Can the PPC series go
> > through the mm tree?
> >
> This patch can't be merged until arch_unmap() is all removed (ppc change)

I would say that you should just leave the arch_unmap() call in place
for now.

Are we seriously worried that a powerpc user will try to unmap the VDSO
and then get an error that it was mseal'ed after setting the VDSO
pointer to NULL?  I'm more worried about the Perseid meteor shower
hitting me - aka the sky falling.

Note that we could already run into an error after setting the vdso
pointer to null today.

...

Thanks,
Liam


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/6] mm/munmap: Replace can_modify_mm with can_modify_vma
  2024-08-09 19:24         ` Liam R. Howlett
@ 2024-08-12 14:30           ` Jeff Xu
  2024-08-12 16:57             ` Liam R. Howlett
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff Xu @ 2024-08-12 14:30 UTC (permalink / raw)
  To: Liam R. Howlett, Pedro Falcato, Andrew Morton, Vlastimil Babka,
	Lorenzo Stoakes, linux-mm, linux-kernel, oliver.sang, torvalds,
	jeffxu, Michael Ellerman, Kees Cook

+ Kees who commented on mseal() series before. Please keep Kees in the
cc for future response to this series.

On Fri, Aug 9, 2024 at 12:25 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>
> * Pedro Falcato <pedro.falcato@gmail.com> [240809 14:53]:
> > On Fri, Aug 9, 2024 at 5:48 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > >
> > > * Liam R. Howlett <Liam.Howlett@oracle.com> [240809 12:15]:
> > > > * Pedro Falcato <pedro.falcato@gmail.com> [240807 17:13]:
> > > > > We were doing an extra mmap tree traversal just to check if the entire
> > > > > range is modifiable. This can be done when we iterate through the VMAs
> > > > > instead.
> > > > >
> > > > > Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com>
> > > > > ---
> > > > >  mm/mmap.c | 13 +------------
> > > > >  mm/vma.c  | 23 ++++++++++++-----------
> > > > >  2 files changed, 13 insertions(+), 23 deletions(-)
> > > > >
> > > > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > > > index 4a9c2329b09..c1c7a7d00f5 100644
> > > > > --- a/mm/mmap.c
> > > > > +++ b/mm/mmap.c
> > > > > @@ -1740,18 +1740,7 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > > > >             unsigned long start, unsigned long end, struct list_head *uf,
> > > > >             bool unlock)
> > > > >  {
> > > > > -   struct mm_struct *mm = vma->vm_mm;
> > > > > -
> > > > > -   /*
> > > > > -    * Check if memory is sealed before arch_unmap.
> > > > > -    * Prevent unmapping a sealed VMA.
> > > > > -    * can_modify_mm assumes we have acquired the lock on MM.
> > > > > -    */
> > > > > -   if (unlikely(!can_modify_mm(mm, start, end)))
> > > > > -           return -EPERM;
> > > > > -
> > > > > -   arch_unmap(mm, start, end);
> > > > > -   return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
> > > > > +   return do_vmi_align_munmap(vmi, vma, vma->vm_mm, start, end, uf, unlock);
> > > > >  }
> > > > >
> > > > >  /*
> > > > > diff --git a/mm/vma.c b/mm/vma.c
> > > > > index bf0546fe6ea..7a121bcc907 100644
> > > > > --- a/mm/vma.c
> > > > > +++ b/mm/vma.c
> > > > > @@ -712,6 +712,12 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > > > >             if (end < vma->vm_end && mm->map_count >= sysctl_max_map_count)
> > > > >                     goto map_count_exceeded;
> > > > >
> > > > > +           /* Don't bother splitting the VMA if we can't unmap it anyway */
> > > > > +           if (!can_modify_vma(vma)) {
> > > > > +                   error = -EPERM;
> > > > > +                   goto start_split_failed;
> > > > > +           }
> > > > > +
> > > >
> > > > Would this check be better placed in __split_vma()?  It could replace
> > > > both this and the next chunk of code.
> > >
> > > not quite.
> >
> > Yeah, I was going to say that splitting a sealed VMA is okay (and we
> > allow it on mlock and madvise).
>
> splitting a sealed vma wasn't supposed to be okay.. but it is Jeff's
> feature.  Jeff?
>
Splitting a sealed VMA is wrong.
Whoever wants to split a sealed VMA should  answer this question
first: what is the use case for splitting a sealed VMA?

The V2 series doesn't have selftest change which indicates lack of
testing. The out-of-loop check is positioned nearer to the API entry
point and separated from internal business logic, thereby minimizing
the testing requirements. However, as we move the sealing check
further inward and intertwine it with business logic, greater test
coverage becomes necessary to ensure  the correctness of  sealing
is preserved.




-Jeff
















> >
> > >
> > > >
> > > > >             error = __split_vma(vmi, vma, start, 1);
> > > > >             if (error)
> > > > >                     goto start_split_failed;
> > > > > @@ -723,6 +729,11 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > > > >      */
> > > > >     next = vma;
> > > > >     do {
> > > > > +           if (!can_modify_vma(vma)) {
> > > > > +                   error = -EPERM;
> > > > > +                   goto modify_vma_failed;
> > > > > +           }
> > > > > +
> > >
> > > This chunk would need to be moved below the end check so that we catch
> > > full vma unmaps.
> >
> > Why below the end check? I believe we can avoid the split? Is there
> > something I'm missing?
> > But I did find a bug, what I really seem to want is:
> >
> >  +           if (!can_modify_vma(next)) {
>
> Good catch.
>
> > instead of (vma). It's somewhat concerning how the mseal selftests
> > didn't trip on this?
>
> the end check will call split and will fail in there, if you move the
> code as I suggested.
>
> That means, if we aren't splitting, we still have to check the vma, so
> the check is necessary.
>
>


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 4/6] mm/mremap: Replace can_modify_mm with can_modify_vma
  2024-08-09 18:45     ` Pedro Falcato
@ 2024-08-12 15:22       ` Liam R. Howlett
  0 siblings, 0 replies; 26+ messages in thread
From: Liam R. Howlett @ 2024-08-12 15:22 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Andrew Morton, Vlastimil Babka, Lorenzo Stoakes, linux-mm,
	linux-kernel, oliver.sang, torvalds, jeffxu, Michael Ellerman

* Pedro Falcato <pedro.falcato@gmail.com> [240809 14:45]:
> On Fri, Aug 9, 2024 at 5:06 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> >
> > * Pedro Falcato <pedro.falcato@gmail.com> [240807 17:13]:
> > > Delegate all can_modify checks to the proper places. Unmap checks are
> > > done in do_unmap (et al).
> > >
> > > This patch allows for mremap partial failure in certain cases (for
> > > instance, when destination VMAs aren't sealed, but the source VMA is).
> > > It shouldn't be too troublesome, as you'd need to go out of your way to
> > > do illegal operations on a VMA.
> >
> > As mseal() is supposed to be a security thing, is the illegal operation
> > not a concern?
> 
> My 3 cents (note: I'm not a security guy):
> 
> - Linux m*() operations have been allowed to partially fail for ages.
> POSIX only disallows this in the munmap case (which is why we need all
> that detached VMA logic), but not in any other case. We have a lot of
> other failure points in these syscalls, and would require extensive
> refactoring to patch this up (very likely with an inevitable
> performance regression, as we saw in this case).
> 
> - Despite allowing for partial failure, this patch set always keeps
> the sealed VMAs untouched (so that invariant isn't broken). The munmap
> semantics remain untouched (and POSIXly correct) due to the detached
> VMA logic.
> 
> - I personally have not heard of a single attack making use of this,
> and the performance hit is very measurable and exists _for every
> user_, despite mseal being a very niche feature (I cannot find a
> single user of mseal upstream, both in debian code search, github,
> chromium, v8, glibc, and what have you).
> 

...

I really have no disagreement with the above statements, but looking at
this further: vma_to_resize() is called in 2 places:
1. mremap() syscall
mremap() calls vma_lookup() and then later calls vma_to_resize() which
also calls vma_lookup() in the first 5 lines of the function.

2. mremap_to() static function
mremap_to() is called only from mreamp(), but earlier than
vma_to_resize().

If we move the vma check to mremap() after finding the vma, then we can
avoid partial failures due to mseal().  We should probably check as much
as possible there, but that change would be too large to fix a
regression.

iow the check was in the wrong place and was the wrong check, but we can
use your check and move it up ~15 lines and everything will be the same
and faster.

For a later patch, there is an opportunity to even make this faster by
passing through the vma to vma_to_resize().  We could remove another
walk of the vma tree.  Probably not necessary to fix the regression, but
it would at least reduce the instruction count - if not a performance
increase (depending on cache use).

Thanks,
Liam

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/6] mm/munmap: Replace can_modify_mm with can_modify_vma
  2024-08-12 14:30           ` Jeff Xu
@ 2024-08-12 16:57             ` Liam R. Howlett
  2024-08-12 17:38               ` Jeff Xu
  0 siblings, 1 reply; 26+ messages in thread
From: Liam R. Howlett @ 2024-08-12 16:57 UTC (permalink / raw)
  To: Jeff Xu
  Cc: Pedro Falcato, Andrew Morton, Vlastimil Babka, Lorenzo Stoakes,
	linux-mm, linux-kernel, oliver.sang, torvalds, Michael Ellerman,
	Kees Cook

* Jeff Xu <jeffxu@google.com> [240812 10:30]:
> + Kees who commented on mseal() series before. Please keep Kees in the
> cc for future response to this series.
> 
> On Fri, Aug 9, 2024 at 12:25 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> >
> > * Pedro Falcato <pedro.falcato@gmail.com> [240809 14:53]:
> > > On Fri, Aug 9, 2024 at 5:48 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > > >
> > > > * Liam R. Howlett <Liam.Howlett@oracle.com> [240809 12:15]:
> > > > > * Pedro Falcato <pedro.falcato@gmail.com> [240807 17:13]:
> > > > > > We were doing an extra mmap tree traversal just to check if the entire
> > > > > > range is modifiable. This can be done when we iterate through the VMAs
> > > > > > instead.
> > > > > >
> > > > > > Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com>
> > > > > > ---
> > > > > >  mm/mmap.c | 13 +------------
> > > > > >  mm/vma.c  | 23 ++++++++++++-----------
> > > > > >  2 files changed, 13 insertions(+), 23 deletions(-)
> > > > > >
> > > > > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > > > > index 4a9c2329b09..c1c7a7d00f5 100644
> > > > > > --- a/mm/mmap.c
> > > > > > +++ b/mm/mmap.c
> > > > > > @@ -1740,18 +1740,7 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > > > > >             unsigned long start, unsigned long end, struct list_head *uf,
> > > > > >             bool unlock)
> > > > > >  {
> > > > > > -   struct mm_struct *mm = vma->vm_mm;
> > > > > > -
> > > > > > -   /*
> > > > > > -    * Check if memory is sealed before arch_unmap.
> > > > > > -    * Prevent unmapping a sealed VMA.
> > > > > > -    * can_modify_mm assumes we have acquired the lock on MM.
> > > > > > -    */
> > > > > > -   if (unlikely(!can_modify_mm(mm, start, end)))
> > > > > > -           return -EPERM;
> > > > > > -
> > > > > > -   arch_unmap(mm, start, end);
> > > > > > -   return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
> > > > > > +   return do_vmi_align_munmap(vmi, vma, vma->vm_mm, start, end, uf, unlock);
> > > > > >  }
> > > > > >
> > > > > >  /*
> > > > > > diff --git a/mm/vma.c b/mm/vma.c
> > > > > > index bf0546fe6ea..7a121bcc907 100644
> > > > > > --- a/mm/vma.c
> > > > > > +++ b/mm/vma.c
> > > > > > @@ -712,6 +712,12 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > > > > >             if (end < vma->vm_end && mm->map_count >= sysctl_max_map_count)
> > > > > >                     goto map_count_exceeded;
> > > > > >
> > > > > > +           /* Don't bother splitting the VMA if we can't unmap it anyway */
> > > > > > +           if (!can_modify_vma(vma)) {
> > > > > > +                   error = -EPERM;
> > > > > > +                   goto start_split_failed;
> > > > > > +           }
> > > > > > +
> > > > >
> > > > > Would this check be better placed in __split_vma()?  It could replace
> > > > > both this and the next chunk of code.
> > > >
> > > > not quite.
> > >
> > > Yeah, I was going to say that splitting a sealed VMA is okay (and we
> > > allow it on mlock and madvise).
> >
> > splitting a sealed vma wasn't supposed to be okay.. but it is Jeff's
> > feature.  Jeff?
> >
> Splitting a sealed VMA is wrong.
> Whoever wants to split a sealed VMA should  answer this question
> first: what is the use case for splitting a sealed VMA?

If we lower the checks to __split_vma() and anywhere that does entire
vma modifications, then we would avoid modifications of the vma.  This
particular loop breaks on the final vma, if it needs splitting, so we'd
still need the check in the main loop to ensure the full vma isn't
mseal()'ed.  Splitting the start/end could be covered by the
__split_vma() function.

> 
> The V2 series doesn't have selftest change which indicates lack of
> testing. The out-of-loop check is positioned nearer to the API entry
> point and separated from internal business logic, thereby minimizing
> the testing requirements. However, as we move the sealing check
> further inward and intertwine it with business logic, greater test
> coverage becomes necessary to ensure  the correctness of  sealing
> is preserved.

I would have expected more complete test coverage and not limited to
what is expected to happen with an up front test.  Changes may happen
that you aren't Cc'ed on (or when you are away, etc) that could cause a
silent failure which could go undetected for a prolonged period of time.

If splitting a vma isn't okay, then it would be safer to test at least
in some scenarios in the upstream tests.  Ideally, all tests are
upstream so everyone can run the testing.

Thanks,
Liam


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/6] mm/munmap: Replace can_modify_mm with can_modify_vma
  2024-08-12 16:57             ` Liam R. Howlett
@ 2024-08-12 17:38               ` Jeff Xu
  2024-08-12 19:25                 ` Liam R. Howlett
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff Xu @ 2024-08-12 17:38 UTC (permalink / raw)
  To: Liam R. Howlett, Jeff Xu, Pedro Falcato, Andrew Morton,
	Vlastimil Babka, Lorenzo Stoakes, linux-mm, linux-kernel,
	oliver.sang, torvalds, Michael Ellerman, Kees Cook

On Mon, Aug 12, 2024 at 9:58 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>
> * Jeff Xu <jeffxu@google.com> [240812 10:30]:
> > + Kees who commented on mseal() series before. Please keep Kees in the
> > cc for future response to this series.
> >
> > On Fri, Aug 9, 2024 at 12:25 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > >
> > > * Pedro Falcato <pedro.falcato@gmail.com> [240809 14:53]:
> > > > On Fri, Aug 9, 2024 at 5:48 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > > > >
> > > > > * Liam R. Howlett <Liam.Howlett@oracle.com> [240809 12:15]:
> > > > > > * Pedro Falcato <pedro.falcato@gmail.com> [240807 17:13]:
> > > > > > > We were doing an extra mmap tree traversal just to check if the entire
> > > > > > > range is modifiable. This can be done when we iterate through the VMAs
> > > > > > > instead.
> > > > > > >
> > > > > > > Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com>
> > > > > > > ---
> > > > > > >  mm/mmap.c | 13 +------------
> > > > > > >  mm/vma.c  | 23 ++++++++++++-----------
> > > > > > >  2 files changed, 13 insertions(+), 23 deletions(-)
> > > > > > >
> > > > > > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > > > > > index 4a9c2329b09..c1c7a7d00f5 100644
> > > > > > > --- a/mm/mmap.c
> > > > > > > +++ b/mm/mmap.c
> > > > > > > @@ -1740,18 +1740,7 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > > > > > >             unsigned long start, unsigned long end, struct list_head *uf,
> > > > > > >             bool unlock)
> > > > > > >  {
> > > > > > > -   struct mm_struct *mm = vma->vm_mm;
> > > > > > > -
> > > > > > > -   /*
> > > > > > > -    * Check if memory is sealed before arch_unmap.
> > > > > > > -    * Prevent unmapping a sealed VMA.
> > > > > > > -    * can_modify_mm assumes we have acquired the lock on MM.
> > > > > > > -    */
> > > > > > > -   if (unlikely(!can_modify_mm(mm, start, end)))
> > > > > > > -           return -EPERM;
> > > > > > > -
> > > > > > > -   arch_unmap(mm, start, end);
> > > > > > > -   return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
> > > > > > > +   return do_vmi_align_munmap(vmi, vma, vma->vm_mm, start, end, uf, unlock);
> > > > > > >  }
> > > > > > >
> > > > > > >  /*
> > > > > > > diff --git a/mm/vma.c b/mm/vma.c
> > > > > > > index bf0546fe6ea..7a121bcc907 100644
> > > > > > > --- a/mm/vma.c
> > > > > > > +++ b/mm/vma.c
> > > > > > > @@ -712,6 +712,12 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > > > > > >             if (end < vma->vm_end && mm->map_count >= sysctl_max_map_count)
> > > > > > >                     goto map_count_exceeded;
> > > > > > >
> > > > > > > +           /* Don't bother splitting the VMA if we can't unmap it anyway */
> > > > > > > +           if (!can_modify_vma(vma)) {
> > > > > > > +                   error = -EPERM;
> > > > > > > +                   goto start_split_failed;
> > > > > > > +           }
> > > > > > > +
> > > > > >
> > > > > > Would this check be better placed in __split_vma()?  It could replace
> > > > > > both this and the next chunk of code.
> > > > >
> > > > > not quite.
> > > >
> > > > Yeah, I was going to say that splitting a sealed VMA is okay (and we
> > > > allow it on mlock and madvise).
> > >
> > > splitting a sealed vma wasn't supposed to be okay.. but it is Jeff's
> > > feature.  Jeff?
> > >
> > Splitting a sealed VMA is wrong.
> > Whoever wants to split a sealed VMA should  answer this question
> > first: what is the use case for splitting a sealed VMA?
>
> If we lower the checks to __split_vma() and anywhere that does entire
> vma modifications, then we would avoid modifications of the vma.  This
> particular loop breaks on the final vma, if it needs splitting, so we'd
> still need the check in the main loop to ensure the full vma isn't
> mseal()'ed.  Splitting the start/end could be covered by the
> __split_vma() function.
>
> >
> > The V2 series doesn't have selftest change which indicates lack of
> > testing. The out-of-loop check is positioned nearer to the API entry
> > point and separated from internal business logic, thereby minimizing
> > the testing requirements. However, as we move the sealing check
> > further inward and intertwine it with business logic, greater test
> > coverage becomes necessary to ensure  the correctness of  sealing
> > is preserved.
>
> I would have expected more complete test coverage and not limited to
> what is expected to happen with an up front test.  Changes may happen
> that you aren't Cc'ed on (or when you are away, etc) that could cause a
> silent failure which could go undetected for a prolonged period of time.
>
> If splitting a vma isn't okay, then it would be safer to test at least
> in some scenarios in the upstream tests.  Ideally, all tests are
> upstream so everyone can run the testing.
>
We will want to be careful about our expectation of  test coverage
offered in selftest. When adding mseal, I added 40+ test cases to
ensure mseal didn't regress on existing mm api, i.e. you can take the
mseal test , make a small modification (removing seal=1) and run on an
old version of kernel and they will pass. I think it is wrong to
expect the selftest is all it takes to find a regression if the dev is
doing a  *** major design/feature change ***. While it is possible to
write test cases to guide all future changes, doing so requires much
bigger effort with diminishing returns, essentially  it is testing an
"impossible to reach cases" in existing code.

Thanks
-Jeff

> Thanks,
> Liam


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/6] mm/munmap: Replace can_modify_mm with can_modify_vma
  2024-08-12 17:38               ` Jeff Xu
@ 2024-08-12 19:25                 ` Liam R. Howlett
  0 siblings, 0 replies; 26+ messages in thread
From: Liam R. Howlett @ 2024-08-12 19:25 UTC (permalink / raw)
  To: Jeff Xu
  Cc: Pedro Falcato, Andrew Morton, Vlastimil Babka, Lorenzo Stoakes,
	linux-mm, linux-kernel, oliver.sang, torvalds, Michael Ellerman,
	Kees Cook

* Jeff Xu <jeffxu@google.com> [240812 13:38]:
> On Mon, Aug 12, 2024 at 9:58 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> >
> > * Jeff Xu <jeffxu@google.com> [240812 10:30]:
> > > + Kees who commented on mseal() series before. Please keep Kees in the
> > > cc for future response to this series.
> > >
> > > On Fri, Aug 9, 2024 at 12:25 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > > >
> > > > * Pedro Falcato <pedro.falcato@gmail.com> [240809 14:53]:
> > > > > On Fri, Aug 9, 2024 at 5:48 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > > > > >
> > > > > > * Liam R. Howlett <Liam.Howlett@oracle.com> [240809 12:15]:
> > > > > > > * Pedro Falcato <pedro.falcato@gmail.com> [240807 17:13]:
> > > > > > > > We were doing an extra mmap tree traversal just to check if the entire
> > > > > > > > range is modifiable. This can be done when we iterate through the VMAs
> > > > > > > > instead.
> > > > > > > >
> > > > > > > > Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com>
> > > > > > > > ---
> > > > > > > >  mm/mmap.c | 13 +------------
> > > > > > > >  mm/vma.c  | 23 ++++++++++++-----------
> > > > > > > >  2 files changed, 13 insertions(+), 23 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > > > > > > index 4a9c2329b09..c1c7a7d00f5 100644
> > > > > > > > --- a/mm/mmap.c
> > > > > > > > +++ b/mm/mmap.c
> > > > > > > > @@ -1740,18 +1740,7 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > > > > > > >             unsigned long start, unsigned long end, struct list_head *uf,
> > > > > > > >             bool unlock)
> > > > > > > >  {
> > > > > > > > -   struct mm_struct *mm = vma->vm_mm;
> > > > > > > > -
> > > > > > > > -   /*
> > > > > > > > -    * Check if memory is sealed before arch_unmap.
> > > > > > > > -    * Prevent unmapping a sealed VMA.
> > > > > > > > -    * can_modify_mm assumes we have acquired the lock on MM.
> > > > > > > > -    */
> > > > > > > > -   if (unlikely(!can_modify_mm(mm, start, end)))
> > > > > > > > -           return -EPERM;
> > > > > > > > -
> > > > > > > > -   arch_unmap(mm, start, end);
> > > > > > > > -   return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
> > > > > > > > +   return do_vmi_align_munmap(vmi, vma, vma->vm_mm, start, end, uf, unlock);
> > > > > > > >  }
> > > > > > > >
> > > > > > > >  /*
> > > > > > > > diff --git a/mm/vma.c b/mm/vma.c
> > > > > > > > index bf0546fe6ea..7a121bcc907 100644
> > > > > > > > --- a/mm/vma.c
> > > > > > > > +++ b/mm/vma.c
> > > > > > > > @@ -712,6 +712,12 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > > > > > > >             if (end < vma->vm_end && mm->map_count >= sysctl_max_map_count)
> > > > > > > >                     goto map_count_exceeded;
> > > > > > > >
> > > > > > > > +           /* Don't bother splitting the VMA if we can't unmap it anyway */
> > > > > > > > +           if (!can_modify_vma(vma)) {
> > > > > > > > +                   error = -EPERM;
> > > > > > > > +                   goto start_split_failed;
> > > > > > > > +           }
> > > > > > > > +
> > > > > > >
> > > > > > > Would this check be better placed in __split_vma()?  It could replace
> > > > > > > both this and the next chunk of code.
> > > > > >
> > > > > > not quite.
> > > > >
> > > > > Yeah, I was going to say that splitting a sealed VMA is okay (and we
> > > > > allow it on mlock and madvise).
> > > >
> > > > splitting a sealed vma wasn't supposed to be okay.. but it is Jeff's
> > > > feature.  Jeff?
> > > >
> > > Splitting a sealed VMA is wrong.
> > > Whoever wants to split a sealed VMA should  answer this question
> > > first: what is the use case for splitting a sealed VMA?
> >
> > If we lower the checks to __split_vma() and anywhere that does entire
> > vma modifications, then we would avoid modifications of the vma.  This
> > particular loop breaks on the final vma, if it needs splitting, so we'd
> > still need the check in the main loop to ensure the full vma isn't
> > mseal()'ed.  Splitting the start/end could be covered by the
> > __split_vma() function.
> >
> > >
> > > The V2 series doesn't have selftest change which indicates lack of
> > > testing. The out-of-loop check is positioned nearer to the API entry
> > > point and separated from internal business logic, thereby minimizing
> > > the testing requirements. However, as we move the sealing check
> > > further inward and intertwine it with business logic, greater test
> > > coverage becomes necessary to ensure  the correctness of  sealing
> > > is preserved.
> >
> > I would have expected more complete test coverage and not limited to
> > what is expected to happen with an up front test.  Changes may happen
> > that you aren't Cc'ed on (or when you are away, etc) that could cause a
> > silent failure which could go undetected for a prolonged period of time.
> >
> > If splitting a vma isn't okay, then it would be safer to test at least
> > in some scenarios in the upstream tests.  Ideally, all tests are
> > upstream so everyone can run the testing.
> >
> We will want to be careful about our expectation of  test coverage
> offered in selftest. When adding mseal, I added 40+ test cases to
> ensure mseal didn't regress on existing mm api, i.e. you can take the
> mseal test , make a small modification (removing seal=1) and run on an
> old version of kernel and they will pass. I think it is wrong to
> expect the selftest is all it takes to find a regression if the dev is
> doing a  *** major design/feature change ***. While it is possible to
> write test cases to guide all future changes, doing so requires much
> bigger effort with diminishing returns, essentially  it is testing an
> "impossible to reach cases" in existing code.

One of the main points of self testing is to ensure that someone doesn't
break your feature.  Your tests do no accomplish that, otherwise you
wouldn't be requesting more tests with these changes - this isn't a bug
fix.

The initial commit also does not say that the vma won't be split so it
doesn't seem fair to blame someone for not testing something that wasn't
tested before nor existed as a restriction in the documentation [1].
Note that splitting a vma just increases the vma count and doesn't
change attributes of the vma or affect the memory. Pedro wasn't involved
in the initial discussions, so he doesn't have the background about how
mseal() was treated differently.

Some of the language in your responses are troubling and seem to
indicate that the design/feature was done in a way to optimise
backporting (avoiding the business logic that has changed), and to
minimize the testing requirement.

Although I can understand (empathize, even) that you cannot test
everything from the start; you can grow your test set to avoid future
regressions and misunderstandings on how things should work.  (If the
dont_split_vma() tests fail, then I would probably question if I can
split a vma..)

Splitting a vma is a pretty fundamental operation, along with unmapping,
mapping, and merging.  I would expect a self tests to, at least, cover
the fundamentals, if they are to be affected.  If testing is lacking,
can you please provide test cases so that we can validate fixes for your
feature so it works as you intended/need it to work?  We don't know if
we are violating rules that aren't clear.

We are now waiting for you to run some tests (since Thursday) to see if
these patches are okay, but we have no idea what those tests are or
where they exist.  This is problematic as you seem to be the only one
able to complete a task subject to interpretation - as you might
imagine, this would be a problem for Chromium if your contact
information was to change, or you were away for several months.

You may consider this a major design/feature change, but it is a
necessary change to get back the performance and to keep mseal()
functioning.

Thanks,
Liam

[1]. https://lore.kernel.org/linux-mm/20240415163527.626541-1-jeffxu@chromium.org/



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/6] mm: Optimize mseal checks
       [not found] <20240807211309.2729719-1-pedro.falcato@gmail.com>
                   ` (4 preceding siblings ...)
       [not found] ` <20240807211309.2729719-6-pedro.falcato@gmail.com>
@ 2024-08-15 21:11 ` Jeff Xu
  5 siblings, 0 replies; 26+ messages in thread
From: Jeff Xu @ 2024-08-15 21:11 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Andrew Morton, Liam R. Howlett, Vlastimil Babka, Lorenzo Stoakes,
	linux-mm, linux-kernel, oliver.sang, torvalds, jeffxu,
	Michael Ellerman

On Wed, Aug 7, 2024 at 2:13 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> [based on mm-unstable, 98808d08]
>
> Optimize mseal checks by removing the separate can_modify_mm() step, and
> just doing checks on the individual vmas, when various operations are
> themselves iterating through the tree. This provides a nice speedup and restores
> performance parity with pre-mseal[3].
>
> This series also depends on (and will eventually very slightly conflict with)
> the powerpc series that removes arch_unmap[2].
>
> will-it-scale mmap1_process[1] -t 1 results:
>
> commit 3450fe2b574b4345e4296ccae395149e1a357fee:
>
> min:277605 max:277605 total:277605
> min:281784 max:281784 total:281784
> min:277238 max:277238 total:277238
> min:281761 max:281761 total:281761
> min:274279 max:274279 total:274279
> min:254854 max:254854 total:254854
> measurement
> min:269143 max:269143 total:269143
> min:270454 max:270454 total:270454
> min:243523 max:243523 total:243523
> min:251148 max:251148 total:251148
> min:209669 max:209669 total:209669
> min:190426 max:190426 total:190426
> min:231219 max:231219 total:231219
> min:275364 max:275364 total:275364
> min:266540 max:266540 total:266540
> min:242572 max:242572 total:242572
> min:284469 max:284469 total:284469
> min:278882 max:278882 total:278882
> min:283269 max:283269 total:283269
> min:281204 max:281204 total:281204
>
> After this patch set:
>
> min:280580 max:280580 total:280580
> min:290514 max:290514 total:290514
> min:291006 max:291006 total:291006
> min:290352 max:290352 total:290352
> min:294582 max:294582 total:294582
> min:293075 max:293075 total:293075
> measurement
> min:295613 max:295613 total:295613
> min:294070 max:294070 total:294070
> min:293193 max:293193 total:293193
> min:291631 max:291631 total:291631
> min:295278 max:295278 total:295278
> min:293782 max:293782 total:293782
> min:290361 max:290361 total:290361
> min:294517 max:294517 total:294517
> min:293750 max:293750 total:293750
> min:293572 max:293572 total:293572
> min:295239 max:295239 total:295239
> min:292932 max:292932 total:292932
> min:293319 max:293319 total:293319
> min:294954 max:294954 total:294954
>
> This was a Completely Unscientific test but seems to show there were around 5-10% gains on ops per second.
>
It is likely true that the test " was Completely Unscientific".
Without the necessary information, such as stddiv, stderr, and large
enough sample size, it is impossible to prove the test can reasonably
measure the impact of the patch. Unless you add that information, it
would be prudent to omit this test data from the cover letter.

> Oliver performed his own tests and showed[3] a similar ~5% gain in them.
>
> [1]: mmap1_process does mmap and munmap in a loop. I didn't bother testing multithreading cases.
> [2]: https://lore.kernel.org/all/20240807124103.85644-1-mpe@ellerman.id.au/
> [3]: https://lore.kernel.org/all/ZrMMJfe9aXSWxJz6@xsang-OptiPlex-9020/
> Link: https://lore.kernel.org/all/202408041602.caa0372-oliver.sang@intel.com/
>
> Changes in v2:
>  - Rebased on top of mm-unstable
>  - Removed a superfluous check in mremap (Jeff Xu)
Thanks for taking this suggestion. When I posted that comment, I was
studying the mremap change in V2, I'm not 100% sure if it is possible,
or reasonable the code is written that way (even before mseal is
added.)

This week, I wrote more self-tests [1] for the mremap to reason about
the code handling of mremap across the VMA boundary, as part of an
effort to test your patch.

During the process I realized that we can probably improve the mremap
code further, even without considering the sealing,  so I made a refactor
patch for mremap. [2].

Testing is 90% of the time, if not more, when I modify the kernel
code. If you agree with my refactor on mremap, please feel free to
rebase  or include it as part of your patch series.

Thanks
-Jeff

[1] https://lore.kernel.org/linux-mm/20240814071424.2655666-2-jeffxu@chromium.org/
[2] https://lore.kernel.org/linux-mm/20240814071424.2655666-3-jeffxu@chromium.org/


> Pedro Falcato (6):
>   mm: Move can_modify_vma to mm/internal.h
>   mm/munmap: Replace can_modify_mm with can_modify_vma
>   mm/mprotect: Replace can_modify_mm with can_modify_vma
>   mm/mremap: Replace can_modify_mm with can_modify_vma
>   mseal: Replace can_modify_mm_madv with a vma variant
>   mm: Remove can_modify_mm()
>
>  mm/internal.h | 30 ++++++++++++++++++++--------
>  mm/madvise.c  | 13 +++---------
>  mm/mmap.c     | 13 +-----------
>  mm/mprotect.c | 12 +++--------
>  mm/mremap.c   | 30 ++++------------------------
>  mm/mseal.c    | 55 ++++-----------------------------------------------
>  mm/vma.c      | 23 ++++++++++-----------
>  7 files changed, 49 insertions(+), 127 deletions(-)
>
>
> base-commit: 98808d08fc0f78ee638e0c0a88020fbbaf581ec6
> --
> 2.46.0
>
>


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/6] mm: Optimize mseal checks
  2024-08-09  1:02     ` Jeff Xu
  2024-08-09 19:34       ` Liam R. Howlett
@ 2024-08-15 22:10       ` Jeff Xu
  2024-08-16  1:58         ` Pedro Falcato
  1 sibling, 1 reply; 26+ messages in thread
From: Jeff Xu @ 2024-08-15 22:10 UTC (permalink / raw)
  To: Jeff Xu
  Cc: Pedro Falcato, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, linux-mm, linux-kernel, oliver.sang, torvalds,
	Michael Ellerman

Hi Andrew, Pedro

On Thu, Aug 8, 2024 at 6:03 PM Jeff Xu <jeffxu@google.com> wrote:
>
> On Thu, Aug 8, 2024 at 5:34 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
> >
> > On Fri, Aug 9, 2024 at 12:12 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > >
> > > On Wed,  7 Aug 2024 22:13:03 +0100 Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > >
> > > > This series also depends on (and will eventually very slightly conflict with)
> > > > the powerpc series that removes arch_unmap[2].
> > >
> > > That's awkward.  Please describe the dependency?
> >
> > One of the transformations done in this patch series (patch 2) assumes
> > that arch_unmap either doesn't exist or does nothing.
> > PPC is the only architecture with an arch_unmap implementation, and
> > through the series I linked they're going to make it work via
> > ->close().
> >
> > What's the easiest way to deal with this? Can the PPC series go
> > through the mm tree?
> >
> This patch can't be merged until arch_unmap() is all removed (ppc change)
>
> Also I'm still doing a test/reviewing for this patch,  perhaps it is
> better to wait till my test is done.
>
Sorry that I'm late for updating this thread.

With removing arch_unmap() change landed , there is no dependency for
the patch. However: I have other comments:

1. Testing
Testing is 90% of work when I modify kernel code and send a patch.  So
I'm a little disappointed that this patch doesn't have any test
updates or add new tests. Which makes me less confident about the
regression risk on mseal itself, i.e. a sealed mapping being
overwritten by mprotect/mmap/mremap/munmap.  I have posted the comment
in  [1], and I would like to repeat it to stress my point.

The V2 series doesn't have selftest change which indicates lack of
testing. The out-of-loop check is positioned nearer to the API entry
point and separated from internal business logic, thereby minimizing
the testing requirements. However, as we move the sealing check
further inward and intertwine it with business logic, greater test
coverage becomes necessary to ensure  the correctness of  sealing
is preserved.

Yes. I promised to run some tests, which I did, with the existing self
test (that passed),  also I added more tests in the mremap selftest.
However I'm bound by the time that I can spend on this  (my other
duties and deliverables), I can't test it as much as I like to for
in-loop change (in a time frame demanded by a dev in this ml). Because
this patch is not getting tested as it should be, my confidence for
the V2 patch is low .

2 perf testing
stress-ng is not stable in my test with Chromebook, and I'm requesting
 Oliver to take more samples [2] . This due diligence assures that
this patch accurately fulfills its purpose. The in-loop approach adds
complexity to the code, i.e. future dev is harder to understand the
sealing logic. Additionally, it sacrifices a security feature that
makes it harder for an attacker to modify mapping (currently if an
attacker uses munmap with a large address range, if one of the
addresses is sealed, the entire range is not modified. In the in-loop
approach,  memory will be unmapped till it hits the sealed memory).
Therefore, I would like to ascertain the gain.

3 mremap refactor work.
I posted mremap refactor work [3] , which is aligned with the
direction that we want to do in-loop change, and it also (imo)  a
better version of handling error cases for mremap across multiple vma
boundaries. That patch set can be applied on its own, and the test
cases added also enhance the existing selftest. I hope my patch can be
reviewed, and if passing perf test/approved, applied to mm first.

Thanks
Best regards,
-Jeff

[1] https://lore.kernel.org/linux-mm/20240814071424.2655666-2-jeffxu@chromium.org/
[2] https://lore.kernel.org/linux-mm/CABi2SkXtZLojx3AQU4C=41NtBPGjVB2+fv_KWziOqyXRQ8P7Bg@mail.gmail.com/
[3] https://lore.kernel.org/linux-mm/20240814071424.2655666-1-jeffxu@chromium.org/

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/6] mm: Optimize mseal checks
  2024-08-15 22:10       ` Jeff Xu
@ 2024-08-16  1:58         ` Pedro Falcato
  2024-08-16 17:07           ` Jeff Xu
  2024-08-16 17:30           ` Jeff Xu
  0 siblings, 2 replies; 26+ messages in thread
From: Pedro Falcato @ 2024-08-16  1:58 UTC (permalink / raw)
  To: Jeff Xu
  Cc: Jeff Xu, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, linux-mm, linux-kernel, oliver.sang, torvalds,
	Michael Ellerman

On Thu, Aug 15, 2024 at 11:10 PM Jeff Xu <jeffxu@chromium.org> wrote:
>
> Hi Andrew, Pedro
>
> On Thu, Aug 8, 2024 at 6:03 PM Jeff Xu <jeffxu@google.com> wrote:
> >
> > On Thu, Aug 8, 2024 at 5:34 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > >
> > > On Fri, Aug 9, 2024 at 12:12 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > > >
> > > > On Wed,  7 Aug 2024 22:13:03 +0100 Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > > >
> > > > > This series also depends on (and will eventually very slightly conflict with)
> > > > > the powerpc series that removes arch_unmap[2].
> > > >
> > > > That's awkward.  Please describe the dependency?
> > >
> > > One of the transformations done in this patch series (patch 2) assumes
> > > that arch_unmap either doesn't exist or does nothing.
> > > PPC is the only architecture with an arch_unmap implementation, and
> > > through the series I linked they're going to make it work via
> > > ->close().
> > >
> > > What's the easiest way to deal with this? Can the PPC series go
> > > through the mm tree?
> > >
> > This patch can't be merged until arch_unmap() is all removed (ppc change)
> >
> > Also I'm still doing a test/reviewing for this patch,  perhaps it is
> > better to wait till my test is done.
> >
> Sorry that I'm late for updating this thread.
>
> With removing arch_unmap() change landed , there is no dependency for
> the patch. However: I have other comments:
>
> 1. Testing
> Testing is 90% of work when I modify kernel code and send a patch.  So
> I'm a little disappointed that this patch doesn't have any test
> updates or add new tests. Which makes me less confident about the
> regression risk on mseal itself, i.e. a sealed mapping being
> overwritten by mprotect/mmap/mremap/munmap.  I have posted the comment
> in  [1], and I would like to repeat it to stress my point.
>
> The V2 series doesn't have selftest change which indicates lack of
> testing. The out-of-loop check is positioned nearer to the API entry
> point and separated from internal business logic, thereby minimizing
> the testing requirements. However, as we move the sealing check
> further inward and intertwine it with business logic, greater test
> coverage becomes necessary to ensure  the correctness of  sealing
> is preserved.

Sorry, cut the crap. Your backhanded concerns are very funny.
mseal was _your_ feature. You wrote the tests. You either didn't write
or didn't understand what kinds of tests need/should be done,
otherwise you would've done them. I wrote some code to fix the awful
performance hit. It is a _fix_, not a feature. Your previous mseal
tests should've covered this. If they didn't, that's okay (we all make
mistakes), but pushing your problems onto me is seriously laughable.

I want to stress this bit: There's no mseal feature, there's no
business logic. There's mmap, munmap, mprotect, madvise, mremap (among
others). These are the things people actually care about, and need to
stay fast. Memory management is a core part of the kernel. All of the
very pretty abstractions you're talking about aren't applicable to
kernel code (for any kernel, really) in general. No one wants to pay
the cost of walking through a data structure 2 or 3 times just to
"separate the business logic" for a random, minimally simple feature.

>
> Yes. I promised to run some tests, which I did, with the existing self
> test (that passed),  also I added more tests in the mremap selftest.
> However I'm bound by the time that I can spend on this  (my other
> duties and deliverables), I can't test it as much as I like to for
> in-loop change (in a time frame demanded by a dev in this ml). Because
> this patch is not getting tested as it should be, my confidence for
> the V2 patch is low .

Ok so you: tried to explain to me how to run selftests in v1 (when I
actively did _before sending_, and found a bug in your tests, and
wrote about it in-depth), pledge to "run some tests", never get back
to us, push the "the testsuite I wrote is lacking" concern onto me,
send a whole separate parallel patch set that tries to address _one_
of the regressions with complete disregard for the work done here,
complain about a lack of time, and now say a backhanded "your
confidence for the V2 patch is low".

I seriously have no words.
I want to stress I have no way to test real software that uses mseal
because APPARENTLY THERE IS NONE. THE FEATURE ADDED EXPLICITLY FOR
CHROMIUM IS NOT USED BY UPSTREAM CHROMIUM.

>
> 2 perf testing
> stress-ng is not stable in my test with Chromebook, and I'm requesting
>  Oliver to take more samples [2] . This due diligence assures that
> this patch accurately fulfills its purpose. The in-loop approach adds
> complexity to the code, i.e. future dev is harder to understand the
> sealing logic. Additionally, it sacrifices a security feature that
> makes it harder for an attacker to modify mapping (currently if an
> attacker uses munmap with a large address range, if one of the
> addresses is sealed, the entire range is not modified. In the in-loop
> approach,  memory will be unmapped till it hits the sealed memory).

Wrong. munmap is atomic and always has been. It's required by POSIX.

I would also ask you to back these partial mprotect (assuming that's
what you meant) concerns with a real attack that takes this into
account, instead of hand waving "security".
While noting that all of these operations can already partially fail,
this is not new (and is explicitly permitted by POSIX for
syscalls-not-named-munmap).

> Therefore, I would like to ascertain the gain.

The gain is real. I've proven it with will-it-scale, Oliver ran his
tests and found the regression to be gone (and they do seem to do
proper statistical analysis).
I would wager you to find a workload or hardware where *doing
substantially less work* makes for slower code.

>
> 3 mremap refactor work.

mremap refactoring is not related to these mmap regressions. In the v3
I'm cleaning up and sending out tomorrow, I minimally change mremap
(as Liam wanted).

-- 
Pedro

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/6] mm: Optimize mseal checks
  2024-08-16  1:58         ` Pedro Falcato
@ 2024-08-16 17:07           ` Jeff Xu
  2024-08-16 18:09             ` Pedro Falcato
  2024-08-16 17:30           ` Jeff Xu
  1 sibling, 1 reply; 26+ messages in thread
From: Jeff Xu @ 2024-08-16 17:07 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Jeff Xu, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, linux-mm, linux-kernel, oliver.sang, torvalds,
	Michael Ellerman

Hi Pedro

On Thu, Aug 15, 2024 at 6:58 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> On Thu, Aug 15, 2024 at 11:10 PM Jeff Xu <jeffxu@chromium.org> wrote:
> >
> > Hi Andrew, Pedro
> >
> > On Thu, Aug 8, 2024 at 6:03 PM Jeff Xu <jeffxu@google.com> wrote:
> > >
> > > On Thu, Aug 8, 2024 at 5:34 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > > >
> > > > On Fri, Aug 9, 2024 at 12:12 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > > > >
> > > > > On Wed,  7 Aug 2024 22:13:03 +0100 Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > > > >
> > > > > > This series also depends on (and will eventually very slightly conflict with)
> > > > > > the powerpc series that removes arch_unmap[2].
> > > > >
> > > > > That's awkward.  Please describe the dependency?
> > > >
> > > > One of the transformations done in this patch series (patch 2) assumes
> > > > that arch_unmap either doesn't exist or does nothing.
> > > > PPC is the only architecture with an arch_unmap implementation, and
> > > > through the series I linked they're going to make it work via
> > > > ->close().
> > > >
> > > > What's the easiest way to deal with this? Can the PPC series go
> > > > through the mm tree?
> > > >
> > > This patch can't be merged until arch_unmap() is all removed (ppc change)
> > >
> > > Also I'm still doing a test/reviewing for this patch,  perhaps it is
> > > better to wait till my test is done.
> > >
> > Sorry that I'm late for updating this thread.
> >
> > With removing arch_unmap() change landed , there is no dependency for
> > the patch. However: I have other comments:
> >
> > 1. Testing
> > Testing is 90% of work when I modify kernel code and send a patch.  So
> > I'm a little disappointed that this patch doesn't have any test
> > updates or add new tests. Which makes me less confident about the
> > regression risk on mseal itself, i.e. a sealed mapping being
> > overwritten by mprotect/mmap/mremap/munmap.  I have posted the comment
> > in  [1], and I would like to repeat it to stress my point.
> >
> > The V2 series doesn't have selftest change which indicates lack of
> > testing. The out-of-loop check is positioned nearer to the API entry
> > point and separated from internal business logic, thereby minimizing
> > the testing requirements. However, as we move the sealing check
> > further inward and intertwine it with business logic, greater test
> > coverage becomes necessary to ensure  the correctness of  sealing
> > is preserved.
>
> Sorry, cut the crap. Your backhanded concerns are very funny.
> mseal was _your_ feature. You wrote the tests. You either didn't write
> or didn't understand what kinds of tests need/should be done,
> otherwise you would've done them. I wrote some code to fix the awful
> performance hit. It is a _fix_, not a feature. Your previous mseal
> tests should've covered this. If they didn't, that's okay (we all make
> mistakes), but pushing your problems onto me is seriously laughable.
>
I posted the comments about the lack of a test update in V2 last
monday, there is no response from you until Thursday night (which is
OK). If you were expecting me to update the test cases and to support
your patch series, you should explicitly call it out.

So your point here is that you wrote the code, passed  the existing
selftest, fixed the perf, and the job is done. If the test doesn't
cover the new  cases you added, it is not your problem, you only need
to care about perf part.

This is how regression happened in past, e.g.

commit 77795f900e2a07c1cbedc375789aefb43843b6c2
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Tue Jun 6 14:29:12 2023 -0400

    mm/mprotect: fix do_mprotect_pkey() limit check

    The return of do_mprotect_pkey() can still be incorrectly returned as
    success if there is a gap that spans to or beyond the end address passed
    in.  Update the check to ensure that the end address has indeed been seen.

    Link: https://lore.kernel.org/all/CABi2SkXjN+5iFoBhxk71t3cmunTk-s=rB4T7qo0UQRh17s49PQ@mail.gmail.com/
    Link: https://lkml.kernel.org/r/20230606182912.586576-1-Liam.Howlett@oracle.com
    Fixes: 82f951340f25 ("mm/mprotect: fix do_mprotect_pkey() return on error")
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Reported-by: Jeff Xu <jeffxu@chromium.org>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Had not I wrote selftest to discover this mprotect regression, it
would go unnoticed  and stay that way.

My point is: the existing selftest for mseal  is written for out-loop,
and will not fully test in-loop. Your patch has made a big change to
mseal, more tests are needed.

To move forward from this situation for your patch series, either you
or me will need to update the tests. How about sharing the load ?

> I seriously have no words.
> I want to stress I have no way to test real software that uses mseal
> because APPARENTLY THERE IS NONE. THE FEATURE ADDED EXPLICITLY FOR
> CHROMIUM IS NOT USED BY UPSTREAM CHROMIUM.
>
The testing is about making sure that sealing is preserved after
mmap/mremap/munmap/mprotect call, there is no real software that will
do that kind of testing, even in the future, selftest is the only way.

Security features never come quickly, adding  syscall to the kernel is
the first step to allow apps to use it.

> >
> > 2 perf testing
> > stress-ng is not stable in my test with Chromebook, and I'm requesting
> >  Oliver to take more samples [2] . This due diligence assures that
> > this patch accurately fulfills its purpose. The in-loop approach adds
> > complexity to the code, i.e. future dev is harder to understand the
> > sealing logic. Additionally, it sacrifices a security feature that
> > makes it harder for an attacker to modify mapping (currently if an
> > attacker uses munmap with a large address range, if one of the
> > addresses is sealed, the entire range is not modified. In the in-loop
> > approach,  memory will be unmapped till it hits the sealed memory).
>
> Wrong. munmap is atomic and always has been. It's required by POSIX.
>
Please run this test on the latest kernel branch to verify:

static void test_munmap_free_multiple_ranges(bool seal)
{
        void *ptr;
        unsigned long page_size = getpagesize();
        unsigned long size = 8 * page_size;
        int ret;
        int prot;

        setup_single_address(size, &ptr);
        FAIL_TEST_IF_FALSE(ptr != (void *)-1);

        /* unmap one page from beginning. */
        ret = sys_munmap(ptr, page_size);
        FAIL_TEST_IF_FALSE(!ret);

        /* unmap one page from middle. */
        ret = sys_munmap(ptr + 4 * page_size, page_size);
        FAIL_TEST_IF_FALSE(!ret);

        /* seal the last page */
        if (seal) {
                ret = sys_mseal(ptr + 7 * page_size, page_size);
                FAIL_TEST_IF_FALSE(!ret);
        }

        /* munmap all 8  pages from beginning */
        ret = sys_munmap(ptr, 8 * page_size);
        if (seal) {
                FAIL_TEST_IF_FALSE(ret < 0);

                /* verify none of existing pages in  the range are unmapped */
                size = get_vma_size(ptr + page_size, &prot);
                FAIL_TEST_IF_FALSE(size == 3 * page_size);
                FAIL_TEST_IF_FALSE(prot == 4);

                size = get_vma_size(ptr +  5 * page_size, &prot);
                FAIL_TEST_IF_FALSE(size == 2 * page_size);
                FAIL_TEST_IF_FALSE(prot == 4);

                size = get_vma_size(ptr +  7 * page_size, &prot);
                FAIL_TEST_IF_FALSE(size == 1 * page_size);
                FAIL_TEST_IF_FALSE(prot == 4);
        } else {
                FAIL_TEST_IF_FALSE(!ret);

                /* verify all pages are unmapped */
                for (int i = 0; i < 8; i++) {
                        size = get_vma_size(ptr, &prot);
                        FAIL_TEST_IF_FALSE(size == 0);
                }
        }

        REPORT_TEST_PASS();
}

test_munmap_free_multiple_ranges(true)
test_munmap_free_multiple_ranges(false)

> I would also ask you to back these partial mprotect (assuming that's
> what you meant) concerns with a real attack that takes this into
> account, instead of hand waving "security".
> While noting that all of these operations can already partially fail,
> this is not new (and is explicitly permitted by POSIX for
> syscalls-not-named-munmap).
>
As defence gets stronger, with seccomp,selinux,landlock, attackers now
have to find an easier route.

> > Therefore, I would like to ascertain the gain.
>
> The gain is real. I've proven it with will-it-scale, Oliver ran his
> tests and found the regression to be gone (and they do seem to do
> proper statistical analysis).
> I would wager you to find a workload or hardware where *doing
> substantially less work* makes for slower code.
>
> >
> > 3 mremap refactor work.
>
> mremap refactoring is not related to these mmap regressions. In the v3
> I'm cleaning up and sending out tomorrow, I minimally change mremap
> (as Liam wanted).
>
If the test issue is not resolved, V3 will be in the same situation as V2.

Best Regards,
-Jeff

> --
> Pedro


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/6] mm: Optimize mseal checks
  2024-08-16  1:58         ` Pedro Falcato
  2024-08-16 17:07           ` Jeff Xu
@ 2024-08-16 17:30           ` Jeff Xu
  1 sibling, 0 replies; 26+ messages in thread
From: Jeff Xu @ 2024-08-16 17:30 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Jeff Xu, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, linux-mm, linux-kernel, oliver.sang, torvalds,
	Michael Ellerman

Hi Pedro,

(resent,  previous email has html link)

On Thu, Aug 15, 2024 at 6:58 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> On Thu, Aug 15, 2024 at 11:10 PM Jeff Xu <jeffxu@chromium.org> wrote:
> >
> > Hi Andrew, Pedro
> >
> > On Thu, Aug 8, 2024 at 6:03 PM Jeff Xu <jeffxu@google.com> wrote:
> > >
> > > On Thu, Aug 8, 2024 at 5:34 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > > >
> > > > On Fri, Aug 9, 2024 at 12:12 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> > > > >
> > > > > On Wed,  7 Aug 2024 22:13:03 +0100 Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > > > >
> > > > > > This series also depends on (and will eventually very slightly conflict with)
> > > > > > the powerpc series that removes arch_unmap[2].
> > > > >
> > > > > That's awkward.  Please describe the dependency?
> > > >
> > > > One of the transformations done in this patch series (patch 2) assumes
> > > > that arch_unmap either doesn't exist or does nothing.
> > > > PPC is the only architecture with an arch_unmap implementation, and
> > > > through the series I linked they're going to make it work via
> > > > ->close().
> > > >
> > > > What's the easiest way to deal with this? Can the PPC series go
> > > > through the mm tree?
> > > >
> > > This patch can't be merged until arch_unmap() is all removed (ppc change)
> > >
> > > Also I'm still doing a test/reviewing for this patch,  perhaps it is
> > > better to wait till my test is done.
> > >
> > Sorry that I'm late for updating this thread.
> >
> > With removing arch_unmap() change landed , there is no dependency for
> > the patch. However: I have other comments:
> >
> > 1. Testing
> > Testing is 90% of work when I modify kernel code and send a patch.  So
> > I'm a little disappointed that this patch doesn't have any test
> > updates or add new tests. Which makes me less confident about the
> > regression risk on mseal itself, i.e. a sealed mapping being
> > overwritten by mprotect/mmap/mremap/munmap.  I have posted the comment
> > in  [1], and I would like to repeat it to stress my point.
> >
> > The V2 series doesn't have selftest change which indicates lack of
> > testing. The out-of-loop check is positioned nearer to the API entry
> > point and separated from internal business logic, thereby minimizing
> > the testing requirements. However, as we move the sealing check
> > further inward and intertwine it with business logic, greater test
> > coverage becomes necessary to ensure  the correctness of  sealing
> > is preserved.
>
> Sorry, cut the crap. Your backhanded concerns are very funny.
> mseal was _your_ feature. You wrote the tests. You either didn't write
> or didn't understand what kinds of tests need/should be done,
> otherwise you would've done them. I wrote some code to fix the awful
> performance hit. It is a _fix_, not a feature. Your previous mseal
> tests should've covered this. If they didn't, that's okay (we all make
> mistakes), but pushing your problems onto me is seriously laughable.
>
I posted the comments about the lack of a test update in V2 last
monday, there is no response from you until Thursday night (which is
OK). If you were expecting me to update the test cases and to support
your patch series, you should explicitly call it out.

So your point here is that you wrote the code, passed  the existing
selftest, fixed the perf, and the job is done. If the test doesn't
cover the new  cases you added, it is not your problem, you only need
to care about perf part.

This is how regression happened in past, e.g.

commit 77795f900e2a07c1cbedc375789aefb43843b6c2
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Tue Jun 6 14:29:12 2023 -0400

    mm/mprotect: fix do_mprotect_pkey() limit check

    The return of do_mprotect_pkey() can still be incorrectly returned as
    success if there is a gap that spans to or beyond the end address passed
    in.  Update the check to ensure that the end address has indeed been seen.

    Link: https://lore.kernel.org/all/CABi2SkXjN+5iFoBhxk71t3cmunTk-s=rB4T7qo0UQRh17s49PQ@mail.gmail.com/
    Link: https://lkml.kernel.org/r/20230606182912.586576-1-Liam.Howlett@oracle.com
    Fixes: 82f951340f25 ("mm/mprotect: fix do_mprotect_pkey() return on error")
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Reported-by: Jeff Xu <jeffxu@chromium.org>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


Had not I wrote selftest to discover this mprotect regression, it
would go unnoticed  and stay that way.

My point is: the existing selftest for mseal  is written for out-loop,
and will not fully test in-loop. Your patch has made a big change to
mseal, more tests are needed.

To move forward from this situation for your patch series, either you
or me will need to update the tests. How about sharing the load ?

> I want to stress this bit: There's no mseal feature, there's no
> business logic. There's mmap, munmap, mprotect, madvise, mremap (among
> others). These are the things people actually care about, and need to
> stay fast. Memory management is a core part of the kernel. All of the
> very pretty abstractions you're talking about aren't applicable to
> kernel code (for any kernel, really) in general. No one wants to pay
> the cost of walking through a data structure 2 or 3 times just to
> "separate the business logic" for a random, minimally simple feature.
>
The testing is about making sure that sealing is preserved after
mmap/mremap/munmap/mprotect call, there is no real software that will
do that kind of testing, even in the future, selftest is the only way.

Security features never come quickly, adding  syscall to the kernel is
the first step to allow apps to use it.

> >
> > Yes. I promised to run some tests, which I did, with the existing self
> > test (that passed),  also I added more tests in the mremap selftest.
> > However I'm bound by the time that I can spend on this  (my other
> > duties and deliverables), I can't test it as much as I like to for
> > in-loop change (in a time frame demanded by a dev in this ml). Because
> > this patch is not getting tested as it should be, my confidence for
> > the V2 patch is low .
>
> Ok so you: tried to explain to me how to run selftests in v1 (when I
> actively did _before sending_, and found a bug in your tests, and
> wrote about it in-depth), pledge to "run some tests", never get back
> to us, push the "the testsuite I wrote is lacking" concern onto me,
> send a whole separate parallel patch set that tries to address _one_
> of the regressions with complete disregard for the work done here,
> complain about a lack of time, and now say a backhanded "your
> confidence for the V2 patch is low".
>
> I seriously have no words.
> I want to stress I have no way to test real software that uses mseal
> because APPARENTLY THERE IS NONE. THE FEATURE ADDED EXPLICITLY FOR
> CHROMIUM IS NOT USED BY UPSTREAM CHROMIUM.
>
> >
> > 2 perf testing
> > stress-ng is not stable in my test with Chromebook, and I'm requesting
> >  Oliver to take more samples [2] . This due diligence assures that
> > this patch accurately fulfills its purpose. The in-loop approach adds
> > complexity to the code, i.e. future dev is harder to understand the
> > sealing logic. Additionally, it sacrifices a security feature that
> > makes it harder for an attacker to modify mapping (currently if an
> > attacker uses munmap with a large address range, if one of the
> > addresses is sealed, the entire range is not modified. In the in-loop
> > approach,  memory will be unmapped till it hits the sealed memory).
>
> Wrong. munmap is atomic and always has been. It's required by POSIX.
>
Please run this test on the latest kernel branch to verify:

static void test_munmap_free_multiple_ranges(bool seal)
{
        void *ptr;
        unsigned long page_size = getpagesize();
        unsigned long size = 8 * page_size;
        int ret;
        int prot;

        setup_single_address(size, &ptr);
        FAIL_TEST_IF_FALSE(ptr != (void *)-1);

        /* unmap one page from beginning. */
        ret = sys_munmap(ptr, page_size);
        FAIL_TEST_IF_FALSE(!ret);

        /* unmap one page from middle. */
        ret = sys_munmap(ptr + 4 * page_size, page_size);
        FAIL_TEST_IF_FALSE(!ret);

        /* seal the last page */
        if (seal) {
                ret = sys_mseal(ptr + 7 * page_size, page_size);
                FAIL_TEST_IF_FALSE(!ret);
        }

        /* munmap all 8  pages from beginning */
        ret = sys_munmap(ptr, 8 * page_size);
        if (seal) {
                FAIL_TEST_IF_FALSE(ret < 0);

                /* verify none of existing pages in  the range are unmapped */
                size = get_vma_size(ptr + page_size, &prot);
                FAIL_TEST_IF_FALSE(size == 3 * page_size);
                FAIL_TEST_IF_FALSE(prot == 4);

                size = get_vma_size(ptr +  5 * page_size, &prot);
                FAIL_TEST_IF_FALSE(size == 2 * page_size);
                FAIL_TEST_IF_FALSE(prot == 4);

                size = get_vma_size(ptr +  7 * page_size, &prot);
                FAIL_TEST_IF_FALSE(size == 1 * page_size);
                FAIL_TEST_IF_FALSE(prot == 4);
        } else {
                FAIL_TEST_IF_FALSE(!ret);

                /* verify all pages are unmapped */
                for (int i = 0; i < 8; i++) {
                        size = get_vma_size(ptr, &prot);
                        FAIL_TEST_IF_FALSE(size == 0);
                }
        }

        REPORT_TEST_PASS();
}

test_munmap_free_multiple_ranges(true)
test_munmap_free_multiple_ranges(false)

> I would also ask you to back these partial mprotect (assuming that's
> what you meant) concerns with a real attack that takes this into
> account, instead of hand waving "security".
> While noting that all of these operations can already partially fail,
> this is not new (and is explicitly permitted by POSIX for
> syscalls-not-named-munmap).
>
As defence gets stronger, with seccomp,selinux,landlock, attackers now
have to find an easier route.

> > Therefore, I would like to ascertain the gain.
>
> The gain is real. I've proven it with will-it-scale, Oliver ran his
> tests and found the regression to be gone (and they do seem to do
> proper statistical analysis).
> I would wager you to find a workload or hardware where *doing
> substantially less work* makes for slower code.
>
> >
> > 3 mremap refactor work.
>
> mremap refactoring is not related to these mmap regressions. In the v3
> I'm cleaning up and sending out tomorrow, I minimally change mremap
> (as Liam wanted).
>
If the test issue is not resolved, V3 will be in the same situation as V2.

Best Regards,
-Jeff

> --
> Pedro


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/6] mm: Optimize mseal checks
  2024-08-16 17:07           ` Jeff Xu
@ 2024-08-16 18:09             ` Pedro Falcato
  2024-08-16 18:20               ` Jeff Xu
  0 siblings, 1 reply; 26+ messages in thread
From: Pedro Falcato @ 2024-08-16 18:09 UTC (permalink / raw)
  To: Jeff Xu
  Cc: Jeff Xu, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, linux-mm, linux-kernel, oliver.sang, torvalds,
	Michael Ellerman

On Fri, Aug 16, 2024 at 6:07 PM Jeff Xu <jeffxu@chromium.org> wrote:
> Please run this test on the latest kernel branch to verify:
>
> static void test_munmap_free_multiple_ranges(bool seal)
> {
>         void *ptr;
>         unsigned long page_size = getpagesize();
>         unsigned long size = 8 * page_size;
>         int ret;
>         int prot;
>
>         setup_single_address(size, &ptr);
>         FAIL_TEST_IF_FALSE(ptr != (void *)-1);
>
>         /* unmap one page from beginning. */
>         ret = sys_munmap(ptr, page_size);
>         FAIL_TEST_IF_FALSE(!ret);
>
>         /* unmap one page from middle. */
>         ret = sys_munmap(ptr + 4 * page_size, page_size);
>         FAIL_TEST_IF_FALSE(!ret);
>
>         /* seal the last page */
>         if (seal) {
>                 ret = sys_mseal(ptr + 7 * page_size, page_size);
>                 FAIL_TEST_IF_FALSE(!ret);
>         }
>
>         /* munmap all 8  pages from beginning */
>         ret = sys_munmap(ptr, 8 * page_size);
>         if (seal) {
>                 FAIL_TEST_IF_FALSE(ret < 0);
>
>                 /* verify none of existing pages in  the range are unmapped */
>                 size = get_vma_size(ptr + page_size, &prot);
>                 FAIL_TEST_IF_FALSE(size == 3 * page_size);
>                 FAIL_TEST_IF_FALSE(prot == 4);
>
>                 size = get_vma_size(ptr +  5 * page_size, &prot);
>                 FAIL_TEST_IF_FALSE(size == 2 * page_size);
>                 FAIL_TEST_IF_FALSE(prot == 4);
>
>                 size = get_vma_size(ptr +  7 * page_size, &prot);
>                 FAIL_TEST_IF_FALSE(size == 1 * page_size);
>                 FAIL_TEST_IF_FALSE(prot == 4);
>         } else {
>                 FAIL_TEST_IF_FALSE(!ret);
>
>                 /* verify all pages are unmapped */
>                 for (int i = 0; i < 8; i++) {
>                         size = get_vma_size(ptr, &prot);
>                         FAIL_TEST_IF_FALSE(size == 0);
>                 }
>         }
>
>         REPORT_TEST_PASS();
> }
>

FWIW this test does not work correctly on my end due to sealed VMAs
getting merged. I hacked up setup_single_address to work around that,
and the test does pass on both 6.10.5 and my local mseal changes
branch.

-- 
Pedro


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/6] mm: Optimize mseal checks
  2024-08-16 18:09             ` Pedro Falcato
@ 2024-08-16 18:20               ` Jeff Xu
  2024-08-16 18:26                 ` Pedro Falcato
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff Xu @ 2024-08-16 18:20 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Jeff Xu, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, linux-mm, linux-kernel, oliver.sang, torvalds,
	Michael Ellerman

On Fri, Aug 16, 2024 at 11:09 AM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> On Fri, Aug 16, 2024 at 6:07 PM Jeff Xu <jeffxu@chromium.org> wrote:
> > Please run this test on the latest kernel branch to verify:
> >
> > static void test_munmap_free_multiple_ranges(bool seal)
> > {
> >         void *ptr;
> >         unsigned long page_size = getpagesize();
> >         unsigned long size = 8 * page_size;
> >         int ret;
> >         int prot;
> >
> >         setup_single_address(size, &ptr);
> >         FAIL_TEST_IF_FALSE(ptr != (void *)-1);
> >
> >         /* unmap one page from beginning. */
> >         ret = sys_munmap(ptr, page_size);
> >         FAIL_TEST_IF_FALSE(!ret);
> >
> >         /* unmap one page from middle. */
> >         ret = sys_munmap(ptr + 4 * page_size, page_size);
> >         FAIL_TEST_IF_FALSE(!ret);
> >
> >         /* seal the last page */
> >         if (seal) {
> >                 ret = sys_mseal(ptr + 7 * page_size, page_size);
> >                 FAIL_TEST_IF_FALSE(!ret);
> >         }
> >
> >         /* munmap all 8  pages from beginning */
> >         ret = sys_munmap(ptr, 8 * page_size);
> >         if (seal) {
> >                 FAIL_TEST_IF_FALSE(ret < 0);
> >
> >                 /* verify none of existing pages in  the range are unmapped */
> >                 size = get_vma_size(ptr + page_size, &prot);
> >                 FAIL_TEST_IF_FALSE(size == 3 * page_size);
> >                 FAIL_TEST_IF_FALSE(prot == 4);
> >
> >                 size = get_vma_size(ptr +  5 * page_size, &prot);
> >                 FAIL_TEST_IF_FALSE(size == 2 * page_size);
> >                 FAIL_TEST_IF_FALSE(prot == 4);
> >
> >                 size = get_vma_size(ptr +  7 * page_size, &prot);
> >                 FAIL_TEST_IF_FALSE(size == 1 * page_size);
> >                 FAIL_TEST_IF_FALSE(prot == 4);
> >         } else {
> >                 FAIL_TEST_IF_FALSE(!ret);
> >
> >                 /* verify all pages are unmapped */
> >                 for (int i = 0; i < 8; i++) {
> >                         size = get_vma_size(ptr, &prot);
> >                         FAIL_TEST_IF_FALSE(size == 0);
> >                 }
> >         }
> >
> >         REPORT_TEST_PASS();
> > }
> >
>
> FWIW this test does not work correctly on my end due to sealed VMAs
> getting merged. I hacked up setup_single_address to work around that,
> and the test does pass on both 6.10.5 and my local mseal changes
> branch.
Yes. you would need to comment out other tests and run this test only,
it didn't consider the case that sealed vma will merge with another
sealed vma (created from another test)

The test didn't pass with the V2 patch (the seal = true) case.

-Jeff

>
> --
> Pedro


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/6] mm: Optimize mseal checks
  2024-08-16 18:20               ` Jeff Xu
@ 2024-08-16 18:26                 ` Pedro Falcato
  2024-08-16 18:42                   ` Jeff Xu
  0 siblings, 1 reply; 26+ messages in thread
From: Pedro Falcato @ 2024-08-16 18:26 UTC (permalink / raw)
  To: Jeff Xu
  Cc: Jeff Xu, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, linux-mm, linux-kernel, oliver.sang, torvalds,
	Michael Ellerman

On Fri, Aug 16, 2024 at 7:20 PM Jeff Xu <jeffxu@chromium.org> wrote:
>
> On Fri, Aug 16, 2024 at 11:09 AM Pedro Falcato <pedro.falcato@gmail.com> wrote:
> >
> > On Fri, Aug 16, 2024 at 6:07 PM Jeff Xu <jeffxu@chromium.org> wrote:
> > > Please run this test on the latest kernel branch to verify:
> > >
> > > static void test_munmap_free_multiple_ranges(bool seal)
> > > {
> > >         void *ptr;
> > >         unsigned long page_size = getpagesize();
> > >         unsigned long size = 8 * page_size;
> > >         int ret;
> > >         int prot;
> > >
> > >         setup_single_address(size, &ptr);
> > >         FAIL_TEST_IF_FALSE(ptr != (void *)-1);
> > >
> > >         /* unmap one page from beginning. */
> > >         ret = sys_munmap(ptr, page_size);
> > >         FAIL_TEST_IF_FALSE(!ret);
> > >
> > >         /* unmap one page from middle. */
> > >         ret = sys_munmap(ptr + 4 * page_size, page_size);
> > >         FAIL_TEST_IF_FALSE(!ret);
> > >
> > >         /* seal the last page */
> > >         if (seal) {
> > >                 ret = sys_mseal(ptr + 7 * page_size, page_size);
> > >                 FAIL_TEST_IF_FALSE(!ret);
> > >         }
> > >
> > >         /* munmap all 8  pages from beginning */
> > >         ret = sys_munmap(ptr, 8 * page_size);
> > >         if (seal) {
> > >                 FAIL_TEST_IF_FALSE(ret < 0);
> > >
> > >                 /* verify none of existing pages in  the range are unmapped */
> > >                 size = get_vma_size(ptr + page_size, &prot);
> > >                 FAIL_TEST_IF_FALSE(size == 3 * page_size);
> > >                 FAIL_TEST_IF_FALSE(prot == 4);
> > >
> > >                 size = get_vma_size(ptr +  5 * page_size, &prot);
> > >                 FAIL_TEST_IF_FALSE(size == 2 * page_size);
> > >                 FAIL_TEST_IF_FALSE(prot == 4);
> > >
> > >                 size = get_vma_size(ptr +  7 * page_size, &prot);
> > >                 FAIL_TEST_IF_FALSE(size == 1 * page_size);
> > >                 FAIL_TEST_IF_FALSE(prot == 4);
> > >         } else {
> > >                 FAIL_TEST_IF_FALSE(!ret);
> > >
> > >                 /* verify all pages are unmapped */
> > >                 for (int i = 0; i < 8; i++) {
> > >                         size = get_vma_size(ptr, &prot);
> > >                         FAIL_TEST_IF_FALSE(size == 0);
> > >                 }
> > >         }
> > >
> > >         REPORT_TEST_PASS();
> > > }
> > >
> >
> > FWIW this test does not work correctly on my end due to sealed VMAs
> > getting merged. I hacked up setup_single_address to work around that,
> > and the test does pass on both 6.10.5 and my local mseal changes
> > branch.
> Yes. you would need to comment out other tests and run this test only,
> it didn't consider the case that sealed vma will merge with another
> sealed vma (created from another test)
>
> The test didn't pass with the V2 patch (the seal = true) case.

Because we... found a bug in my munmap changes. The fixed v3 I'm
planning to send out does indeed pass.

-- 
Pedro


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/6] mm: Optimize mseal checks
  2024-08-16 18:26                 ` Pedro Falcato
@ 2024-08-16 18:42                   ` Jeff Xu
  0 siblings, 0 replies; 26+ messages in thread
From: Jeff Xu @ 2024-08-16 18:42 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Jeff Xu, Andrew Morton, Liam R. Howlett, Vlastimil Babka,
	Lorenzo Stoakes, linux-mm, linux-kernel, oliver.sang, torvalds,
	Michael Ellerman

On Fri, Aug 16, 2024 at 11:26 AM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> On Fri, Aug 16, 2024 at 7:20 PM Jeff Xu <jeffxu@chromium.org> wrote:
> >
> > On Fri, Aug 16, 2024 at 11:09 AM Pedro Falcato <pedro.falcato@gmail.com> wrote:
> > >
> > > On Fri, Aug 16, 2024 at 6:07 PM Jeff Xu <jeffxu@chromium.org> wrote:
> > > > Please run this test on the latest kernel branch to verify:
> > > >
> > > > static void test_munmap_free_multiple_ranges(bool seal)
> > > > {
> > > >         void *ptr;
> > > >         unsigned long page_size = getpagesize();
> > > >         unsigned long size = 8 * page_size;
> > > >         int ret;
> > > >         int prot;
> > > >
> > > >         setup_single_address(size, &ptr);
> > > >         FAIL_TEST_IF_FALSE(ptr != (void *)-1);
> > > >
> > > >         /* unmap one page from beginning. */
> > > >         ret = sys_munmap(ptr, page_size);
> > > >         FAIL_TEST_IF_FALSE(!ret);
> > > >
> > > >         /* unmap one page from middle. */
> > > >         ret = sys_munmap(ptr + 4 * page_size, page_size);
> > > >         FAIL_TEST_IF_FALSE(!ret);
> > > >
> > > >         /* seal the last page */
> > > >         if (seal) {
> > > >                 ret = sys_mseal(ptr + 7 * page_size, page_size);
> > > >                 FAIL_TEST_IF_FALSE(!ret);
> > > >         }
> > > >
> > > >         /* munmap all 8  pages from beginning */
> > > >         ret = sys_munmap(ptr, 8 * page_size);
> > > >         if (seal) {
> > > >                 FAIL_TEST_IF_FALSE(ret < 0);
> > > >
> > > >                 /* verify none of existing pages in  the range are unmapped */
> > > >                 size = get_vma_size(ptr + page_size, &prot);
> > > >                 FAIL_TEST_IF_FALSE(size == 3 * page_size);
> > > >                 FAIL_TEST_IF_FALSE(prot == 4);
> > > >
> > > >                 size = get_vma_size(ptr +  5 * page_size, &prot);
> > > >                 FAIL_TEST_IF_FALSE(size == 2 * page_size);
> > > >                 FAIL_TEST_IF_FALSE(prot == 4);
> > > >
> > > >                 size = get_vma_size(ptr +  7 * page_size, &prot);
> > > >                 FAIL_TEST_IF_FALSE(size == 1 * page_size);
> > > >                 FAIL_TEST_IF_FALSE(prot == 4);
> > > >         } else {
> > > >                 FAIL_TEST_IF_FALSE(!ret);
> > > >
> > > >                 /* verify all pages are unmapped */
> > > >                 for (int i = 0; i < 8; i++) {
> > > >                         size = get_vma_size(ptr, &prot);
> > > >                         FAIL_TEST_IF_FALSE(size == 0);
> > > >                 }
> > > >         }
> > > >
> > > >         REPORT_TEST_PASS();
> > > > }
> > > >
> > >
> > > FWIW this test does not work correctly on my end due to sealed VMAs
> > > getting merged. I hacked up setup_single_address to work around that,
> > > and the test does pass on both 6.10.5 and my local mseal changes
> > > branch.
> > Yes. you would need to comment out other tests and run this test only,
> > it didn't consider the case that sealed vma will merge with another
> > sealed vma (created from another test)
> >
> > The test didn't pass with the V2 patch (the seal = true) case.
>
> Because we... found a bug in my munmap changes. The fixed v3 I'm
> planning to send out does indeed pass.
>
OK, I think you got my point.
Glad to hear that you are doing more testing.

Thanks
-Jeff

> --
> Pedro


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2024-08-16 18:43 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20240807211309.2729719-1-pedro.falcato@gmail.com>
2024-08-08 23:12 ` [PATCH v2 0/6] mm: Optimize mseal checks Andrew Morton
2024-08-09  0:34   ` Pedro Falcato
2024-08-09  1:02     ` Jeff Xu
2024-08-09 19:34       ` Liam R. Howlett
2024-08-15 22:10       ` Jeff Xu
2024-08-16  1:58         ` Pedro Falcato
2024-08-16 17:07           ` Jeff Xu
2024-08-16 18:09             ` Pedro Falcato
2024-08-16 18:20               ` Jeff Xu
2024-08-16 18:26                 ` Pedro Falcato
2024-08-16 18:42                   ` Jeff Xu
2024-08-16 17:30           ` Jeff Xu
     [not found] ` <20240807211309.2729719-2-pedro.falcato@gmail.com>
2024-08-09  9:57   ` [PATCH v2 1/6] mm: Move can_modify_vma to mm/internal.h Lorenzo Stoakes
     [not found] ` <20240807211309.2729719-5-pedro.falcato@gmail.com>
2024-08-09 16:05   ` [PATCH v2 4/6] mm/mremap: Replace can_modify_mm with can_modify_vma Liam R. Howlett
2024-08-09 18:45     ` Pedro Falcato
2024-08-12 15:22       ` Liam R. Howlett
     [not found] ` <20240807211309.2729719-3-pedro.falcato@gmail.com>
2024-08-09 16:15   ` [PATCH v2 2/6] mm/munmap: " Liam R. Howlett
2024-08-09 16:48     ` Liam R. Howlett
2024-08-09 18:53       ` Pedro Falcato
2024-08-09 19:24         ` Liam R. Howlett
2024-08-12 14:30           ` Jeff Xu
2024-08-12 16:57             ` Liam R. Howlett
2024-08-12 17:38               ` Jeff Xu
2024-08-12 19:25                 ` Liam R. Howlett
     [not found] ` <20240807211309.2729719-6-pedro.falcato@gmail.com>
2024-08-09 16:27   ` [PATCH v2 5/6] mseal: Replace can_modify_mm_madv with a vma variant Liam R. Howlett
2024-08-15 21:11 ` [PATCH v2 0/6] mm: Optimize mseal checks Jeff Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox