linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm/mmap: Align the length parameter of munmap with hugepage size
@ 2024-07-10  5:45 Haoran Jang
  2024-07-10  8:24 ` Lorenzo Stoakes
  0 siblings, 1 reply; 4+ messages in thread
From: Haoran Jang @ 2024-07-10  5:45 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, lstoakes, vbabka, Liam.Howlett, akpm, Haoran Jiang

From: Haoran Jiang <jianghaoran@kylinos.cn>

munmap hugepge mappings, if the length of the range to munmap
is not aligned with hugepage size,munmap will fail.
In the hugetlb_vm_op_split function, an error will be returned
if startaddr+len is not hugepage size aligned.

before this patch:
in "tools/testing/selftests/mm/hugepage-mremap.c"
modify DEFAULT_LENGTH_MB to 3M,compile and run,
the following error message is displayed

-------------------------
running ./hugepage-mremap
-------------------------
TAP version 13
1..1
Map haddr: Returned address is 0x7eaa40000000
Map daddr: Returned address is 0x7daa40000000
Map vaddr: Returned address is 0x7faa40000000
Address returned by mmap() = 0x7cb34b000000
Mremap: Returned address is 0x7faa40000000
First hex is 0
First hex is 3020100
Bail out! mremap: Expected failure, but call succeeded

Signed-off-by: Haoran Jiang <jianghaoran@kylinos.cn>
---
 mm/mmap.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 83b4682ec85c..0b3a60bf9b6f 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2733,7 +2733,15 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
 	if ((offset_in_page(start)) || start > TASK_SIZE || len > TASK_SIZE-start)
 		return -EINVAL;
 
-	end = start + PAGE_ALIGN(len);
+	vma = find_vma(mm, start);
+	if (!vma) {
+		if (unlock)
+			mmap_write_unlock(mm);
+		return 0;
+	}
+
+	end = start + ALIGN(len, vma_kernel_pagesize(vma));
+
 	if (end == start)
 		return -EINVAL;
 
-- 
2.43.0



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] mm/mmap: Align the length parameter of munmap with hugepage size
  2024-07-10  5:45 [PATCH] mm/mmap: Align the length parameter of munmap with hugepage size Haoran Jang
@ 2024-07-10  8:24 ` Lorenzo Stoakes
  2024-07-11 11:32   ` Haoran Jiang
  0 siblings, 1 reply; 4+ messages in thread
From: Lorenzo Stoakes @ 2024-07-10  8:24 UTC (permalink / raw)
  To: Haoran Jang; +Cc: linux-mm, linux-kernel, vbabka, Liam.Howlett, akpm

On Wed, Jul 10, 2024 at 01:45:58PM GMT, Haoran Jang wrote:
> From: Haoran Jiang <jianghaoran@kylinos.cn>
>
> munmap hugepge mappings, if the length of the range to munmap
> is not aligned with hugepage size,munmap will fail.
> In the hugetlb_vm_op_split function, an error will be returned
> if startaddr+len is not hugepage size aligned.
>
> before this patch:
> in "tools/testing/selftests/mm/hugepage-mremap.c"
> modify DEFAULT_LENGTH_MB to 3M,compile and run,
> the following error message is displayed
>
> -------------------------
> running ./hugepage-mremap
> -------------------------
> TAP version 13
> 1..1
> Map haddr: Returned address is 0x7eaa40000000
> Map daddr: Returned address is 0x7daa40000000
> Map vaddr: Returned address is 0x7faa40000000
> Address returned by mmap() = 0x7cb34b000000
> Mremap: Returned address is 0x7faa40000000
> First hex is 0
> First hex is 3020100
> Bail out! mremap: Expected failure, but call succeeded
>
> Signed-off-by: Haoran Jiang <jianghaoran@kylinos.cn>
> ---
>  mm/mmap.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 83b4682ec85c..0b3a60bf9b6f 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2733,7 +2733,15 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
>  	if ((offset_in_page(start)) || start > TASK_SIZE || len > TASK_SIZE-start)
>  		return -EINVAL;
>
> -	end = start + PAGE_ALIGN(len);
> +	vma = find_vma(mm, start);
> +	if (!vma) {
> +		if (unlock)
> +			mmap_write_unlock(mm);
> +		return 0;
> +	}

I really don't like this, firstly we're duplicating the VMA lookup (we
vma_find() below), and we fail to use the iterator here, and also we are
duplicating the unlock logic.

Also the semantics seem wrong, we are looking for a VMA that ends at or
after start, so you're just checking to see if start is past the last VMA
in the mm aren't you?

This doesn't seem to be accomplishing anything too useful, unless I'm
missing something?

> +
> +	end = start + ALIGN(len, vma_kernel_pagesize(vma));
> +

This seems to be the 'action' part of the change, but I'm concerned this is
completely broken, because you're using the result of find_vma() passed
into vma_kernel_pagesize() which could find a VMA _after_ the input range,
and end up unmapping a far wider range...

I'm also wondering if we should be doing some hugetlb-specific logic here,
or whether that belongs elsewhere?

Liam can chime in on that.

>  	if (end == start)
>  		return -EINVAL;
>
> --
> 2.43.0
>


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] mm/mmap: Align the length parameter of munmap with hugepage size
  2024-07-10  8:24 ` Lorenzo Stoakes
@ 2024-07-11 11:32   ` Haoran Jiang
  2024-07-11 13:22     ` Lorenzo Stoakes
  0 siblings, 1 reply; 4+ messages in thread
From: Haoran Jiang @ 2024-07-11 11:32 UTC (permalink / raw)
  To: Lorenzo Stoakes; +Cc: linux-mm, linux-kernel, vbabka, Liam.Howlett, akpm

On Wed, 2024-07-10 at 09:24 +0100, Lorenzo Stoakes wrote:
> On Wed, Jul 10, 2024 at 01:45:58PM GMT, Haoran Jang wrote:
> > From: Haoran Jiang <jianghaoran@kylinos.cn>
> > 
> > munmap hugepge mappings, if the length of the range to munmap
> > is not aligned with hugepage size,munmap will fail.
> > In the hugetlb_vm_op_split function, an error will be returned
> > if startaddr+len is not hugepage size aligned.
> > 
> > before this patch:
> > in "tools/testing/selftests/mm/hugepage-mremap.c"
> > modify DEFAULT_LENGTH_MB to 3M,compile and run,
> > the following error message is displayed
> > 
> > -------------------------
> > running ./hugepage-mremap
> > -------------------------
> > TAP version 13
> > 1..1
> > Map haddr: Returned address is 0x7eaa40000000
> > Map daddr: Returned address is 0x7daa40000000
> > Map vaddr: Returned address is 0x7faa40000000
> > Address returned by mmap() = 0x7cb34b000000
> > Mremap: Returned address is 0x7faa40000000
> > First hex is 0
> > First hex is 3020100
> > Bail out! mremap: Expected failure, but call succeeded
> > 
> > Signed-off-by: Haoran Jiang <jianghaoran@kylinos.cn>
> > ---
> >  mm/mmap.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> > 
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 83b4682ec85c..0b3a60bf9b6f 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -2733,7 +2733,15 @@ int do_vmi_munmap(struct vma_iterator *vmi,
> > struct mm_struct *mm,
> >  	if ((offset_in_page(start)) || start > TASK_SIZE || len >
> > TASK_SIZE-start)
> >  		return -EINVAL;
> > 
> > -	end = start + PAGE_ALIGN(len);
> > +	vma = find_vma(mm, start);
> > +	if (!vma) {
> > +		if (unlock)
> > +			mmap_write_unlock(mm);
> > +		return 0;
> > +	}
> 
> I really don't like this, firstly we're duplicating the VMA lookup
> (we
> vma_find() below), and we fail to use the iterator here, and also we
> are
> duplicating the unlock logic.
> 
> Also the semantics seem wrong, we are looking for a VMA that ends at
> or
> after start, so you're just checking to see if start is past the last
> VMA
> in the mm aren't you?
> 
> This doesn't seem to be accomplishing anything too useful, unless I'm
> missing something?
> 
> > +
> > +	end = start + ALIGN(len, vma_kernel_pagesize(vma));
> > +
> 
> This seems to be the 'action' part of the change, but I'm concerned
> this is
> completely broken, because you're using the result of find_vma()
> passed
> into vma_kernel_pagesize() which could find a VMA _after_ the input
> range,
> and end up unmapping a far wider range...
> 
> I'm also wondering if we should be doing some hugetlb-specific logic
> here,
> or whether that belongs elsewhere?
> 
> Liam can chime in on that.
> 
> >  	if (end == start)
> >  		return -EINVAL;
> > 
> > --
> > 2.43.0
> > 
1, While performing an MMAP operation,The length aligned with hugepage
size.

unsigned long ksys_mmap_pgoff(unsigned long addr, unsigned long len,
			      unsigned long prot, unsigned long flags,
			      unsigned long fd, unsigned long pgoff)
{
    ...
    if (is_file_hugepages(file)) {
	len = ALIGN(len, huge_page_size(hstate_file(file)));
    ...
}

2,During the munmap, do_vmi_align_munmap->__split_vma(vmi, next, end,
0)->hugetlb_vm_op_split.It will determine whether the end address is
aligned with  hugepage size, and if the end address is not aligned, 
return fail. Is there expect the application to align the length?

 hugetlb_vm_op_split(struct vm_area_struct *vma, unsigned long addr)
 {
 	if (addr & ~(huge_page_mask(hstate_vma(vma))))
		return -EINVAL;
 
 }

3,Or after the vma_find (vmi, end), recalculate the end address ?

ex:
vma = vma_find(vmi, end);
...
if (is_vm_hugetlb_page(vma))
{
	hugepage_size = huge_page_size(hstate_vma(vma));
	end = start + ALIGN(len, hugepage_size);
}


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] mm/mmap: Align the length parameter of munmap with hugepage size
  2024-07-11 11:32   ` Haoran Jiang
@ 2024-07-11 13:22     ` Lorenzo Stoakes
  0 siblings, 0 replies; 4+ messages in thread
From: Lorenzo Stoakes @ 2024-07-11 13:22 UTC (permalink / raw)
  To: Haoran Jiang; +Cc: linux-mm, linux-kernel, vbabka, Liam.Howlett, akpm

On Thu, Jul 11, 2024 at 07:32:23PM GMT, Haoran Jiang wrote:
> On Wed, 2024-07-10 at 09:24 +0100, Lorenzo Stoakes wrote:
> > On Wed, Jul 10, 2024 at 01:45:58PM GMT, Haoran Jang wrote:
> > > From: Haoran Jiang <jianghaoran@kylinos.cn>
> > >
> > > munmap hugepge mappings, if the length of the range to munmap
> > > is not aligned with hugepage size,munmap will fail.
> > > In the hugetlb_vm_op_split function, an error will be returned
> > > if startaddr+len is not hugepage size aligned.
> > >
> > > before this patch:
> > > in "tools/testing/selftests/mm/hugepage-mremap.c"
> > > modify DEFAULT_LENGTH_MB to 3M,compile and run,
> > > the following error message is displayed
> > >
> > > -------------------------
> > > running ./hugepage-mremap
> > > -------------------------
> > > TAP version 13
> > > 1..1
> > > Map haddr: Returned address is 0x7eaa40000000
> > > Map daddr: Returned address is 0x7daa40000000
> > > Map vaddr: Returned address is 0x7faa40000000
> > > Address returned by mmap() = 0x7cb34b000000
> > > Mremap: Returned address is 0x7faa40000000
> > > First hex is 0
> > > First hex is 3020100
> > > Bail out! mremap: Expected failure, but call succeeded
> > >
> > > Signed-off-by: Haoran Jiang <jianghaoran@kylinos.cn>
> > > ---
> > >  mm/mmap.c | 10 +++++++++-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > index 83b4682ec85c..0b3a60bf9b6f 100644
> > > --- a/mm/mmap.c
> > > +++ b/mm/mmap.c
> > > @@ -2733,7 +2733,15 @@ int do_vmi_munmap(struct vma_iterator *vmi,
> > > struct mm_struct *mm,
> > >  	if ((offset_in_page(start)) || start > TASK_SIZE || len >
> > > TASK_SIZE-start)
> > >  		return -EINVAL;
> > >
> > > -	end = start + PAGE_ALIGN(len);
> > > +	vma = find_vma(mm, start);
> > > +	if (!vma) {
> > > +		if (unlock)
> > > +			mmap_write_unlock(mm);
> > > +		return 0;
> > > +	}
> >
> > I really don't like this, firstly we're duplicating the VMA lookup
> > (we
> > vma_find() below), and we fail to use the iterator here, and also we
> > are
> > duplicating the unlock logic.
> >
> > Also the semantics seem wrong, we are looking for a VMA that ends at
> > or
> > after start, so you're just checking to see if start is past the last
> > VMA
> > in the mm aren't you?
> >
> > This doesn't seem to be accomplishing anything too useful, unless I'm
> > missing something?
> >
> > > +
> > > +	end = start + ALIGN(len, vma_kernel_pagesize(vma));
> > > +
> >
> > This seems to be the 'action' part of the change, but I'm concerned
> > this is
> > completely broken, because you're using the result of find_vma()
> > passed
> > into vma_kernel_pagesize() which could find a VMA _after_ the input
> > range,
> > and end up unmapping a far wider range...
> >
> > I'm also wondering if we should be doing some hugetlb-specific logic
> > here,
> > or whether that belongs elsewhere?
> >
> > Liam can chime in on that.
> >
> > >  	if (end == start)
> > >  		return -EINVAL;
> > >
> > > --
> > > 2.43.0
> > >
> 1, While performing an MMAP operation,The length aligned with hugepage
> size.
>
> unsigned long ksys_mmap_pgoff(unsigned long addr, unsigned long len,
> 			      unsigned long prot, unsigned long flags,
> 			      unsigned long fd, unsigned long pgoff)
> {
>     ...
>     if (is_file_hugepages(file)) {
> 	len = ALIGN(len, huge_page_size(hstate_file(file)));
>     ...
> }
>
> 2,During the munmap, do_vmi_align_munmap->__split_vma(vmi, next, end,
> 0)->hugetlb_vm_op_split.It will determine whether the end address is
> aligned with  hugepage size, and if the end address is not aligned,
> return fail. Is there expect the application to align the length?
>
>  hugetlb_vm_op_split(struct vm_area_struct *vma, unsigned long addr)
>  {
>  	if (addr & ~(huge_page_mask(hstate_vma(vma))))
> 		return -EINVAL;
>
>  }
>
> 3,Or after the vma_find (vmi, end), recalculate the end address ?
>
> ex:
> vma = vma_find(vmi, end);
> ...
> if (is_vm_hugetlb_page(vma))
> {
> 	hugepage_size = huge_page_size(hstate_vma(vma));
> 	end = start + ALIGN(len, hugepage_size);
> }

I'm confused as to what you're saying here? I'm addressing your original
patch, which was flawed in that it could end up unmapping the incorrect
VMA.

I guess you're basically asking whether generally we expect applications to
be aligned to huge page size?

According to the man page:


   Huge page (Huge TLB) mappings

       For mappings that employ huge pages, the requirements for the
       arguments of mmap() and munmap() differ somewhat from the
       requirements for mappings that use the native system page size.

       For mmap(), offset must be a multiple of the underlying huge page
       size.  The system automatically aligns length to be a multiple of
       the underlying huge page size.

       For munmap(), addr, and length must both be a multiple of the
       underlying huge page size.


So yeah, I think this kills this patch unfortunately. This is expected
behaviour and your adjustment of the test is invalid.

I do understand why you wanted to adjust this though, but it seems that we
explicitly want this behaviour, so sorry about that, but thank you for
taking a look at this regardless! :)

I would suggest that it's probably a product of there being quite a large
difference between rounding up to base page size (which is expected and
customary for these kinds of operations) vs. treating a 1 byte length as
meaning you want to free, for instance, 2 MiB of data.

Equally, it would otherwise cause some confusion around THP pages which are
huge but can be split.

So sadly I think this is a nack for the patch/concept.


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-07-11 13:22 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-07-10  5:45 [PATCH] mm/mmap: Align the length parameter of munmap with hugepage size Haoran Jang
2024-07-10  8:24 ` Lorenzo Stoakes
2024-07-11 11:32   ` Haoran Jiang
2024-07-11 13:22     ` Lorenzo Stoakes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox