* [PATCH v1] mm: fix unmap_mapping_range high bits shift bug
@ 2023-12-20  5:28 jiajun.xie
From: jiajun.xie @ 2023-12-20  5:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel, jiajun.xie.sh

From: Jiajun Xie <jiajun.xie.sh@gmail.com>

The bug happens when the highest bit of holebegin is 1. Suppose
holebegin is 0x8000000111111000; after the shift, hba becomes
0xfff8000000111111, and vma_interval_tree_foreach then either fails
the lookup or returns the wrong result.

An example of the erroneous call sequence:
- mmap(..., offset=0x8000000111111000)
  |- syscall(mmap, ... unsigned long, off):
     |- ksys_mmap_pgoff( ... , off >> PAGE_SHIFT);

  Here pgoff is correctly shifted to 0x8000000111111,
  but passing 0x8000000111111000 as holebegin to
  unmap_mapping_range() then produces the wrong result, as shown below:

- unmap_mapping_range(..., loff_t const holebegin)
  |- pgoff_t hba = holebegin >> PAGE_SHIFT;
          /* hba = 0xfff8000000111111 unexpectedly */

Casting holebegin to an unsigned type before the shift fixes the bug.
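
For illustration only (not part of the patch), a minimal userspace sketch
of the sign-extension effect, assuming PAGE_SHIFT is 12 and using the
constants from the example above:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* loff_t is a signed 64-bit type; force the example offset into int64_t. */
	int64_t holebegin = (int64_t)0x8000000111111000ULL;

	/* Right-shifting the negative signed value sign-extends
	 * (implementation-defined in C, arithmetic shift in practice). */
	uint64_t bad  = (uint64_t)(holebegin >> 12);

	/* Casting to an unsigned type first gives a logical shift. */
	uint64_t good = (uint64_t)holebegin >> 12;

	printf("signed shift:   0x%016llx\n", (unsigned long long)bad);
	printf("unsigned shift: 0x%016llx\n", (unsigned long long)good);
	/* prints 0xfff8000000111111 and 0x0008000000111111 respectively */
	return 0;
}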

Signed-off-by: Jiajun Xie <jiajun.xie.sh@gmail.com>
---
 mm/memory.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 5c757fba8..6e0712d06 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3624,8 +3624,8 @@ EXPORT_SYMBOL_GPL(unmap_mapping_pages);
 void unmap_mapping_range(struct address_space *mapping,
 		loff_t const holebegin, loff_t const holelen, int even_cows)
 {
-	pgoff_t hba = holebegin >> PAGE_SHIFT;
-	pgoff_t hlen = (holelen + PAGE_SIZE - 1) >> PAGE_SHIFT;
+	pgoff_t hba = (pgoff_t)(holebegin) >> PAGE_SHIFT;
+	pgoff_t hlen = ((pgoff_t)(holelen) + PAGE_SIZE - 1) >> PAGE_SHIFT;
 
 	/* Check for overflow. */
 	if (sizeof(holelen) > sizeof(hlen)) {
-- 
2.34.1




* Re: [PATCH v1] mm: fix unmap_mapping_range high bits shift bug
@ 2023-12-20 17:53 ` Andrew Morton
From: Andrew Morton @ 2023-12-20 17:53 UTC (permalink / raw)
  To: jiajun.xie; +Cc: linux-mm, linux-kernel

On Wed, 20 Dec 2023 13:28:39 +0800 "jiajun.xie" <jiajun.xie.sh@gmail.com> wrote:

> From: Jiajun Xie <jiajun.xie.sh@gmail.com>
> 
> The bug happens when the highest bit of holebegin is 1. Suppose
> holebegin is 0x8000000111111000; after the shift, hba becomes
> 0xfff8000000111111, and vma_interval_tree_foreach then either fails
> the lookup or returns the wrong result.
> 
> An example of the erroneous call sequence:
> - mmap(..., offset=0x8000000111111000)
>   |- syscall(mmap, ... unsigned long, off):
>      |- ksys_mmap_pgoff( ... , off >> PAGE_SHIFT);
> 
>   Here pgoff is correctly shifted to 0x8000000111111,
>   but passing 0x8000000111111000 as holebegin to
>   unmap_mapping_range() then produces the wrong result, as shown below:
> 
> - unmap_mapping_range(..., loff_t const holebegin)
>   |- pgoff_t hba = holebegin >> PAGE_SHIFT;
>           /* hba = 0xfff8000000111111 unexpectedly */
> 
> Casting holebegin to an unsigned type before the shift fixes the bug.
> 

Thanks.  Are you able to describe the runtime effects of this
(obviously bad, but it's good to spell it out) and under what
circumstances it occurs?



* Re: [PATCH v1] mm: fix unmap_mapping_range high bits shift bug
@ 2023-12-21  5:40   ` Jiajun Xie
From: Jiajun Xie @ 2023-12-21  5:40 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel

On Thu, Dec 21, 2023 at 1:53 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Wed, 20 Dec 2023 13:28:39 +0800 "jiajun.xie" <jiajun.xie.sh@gmail.com> wrote:
>
> > From: Jiajun Xie <jiajun.xie.sh@gmail.com>
> >
> > The bug happens when the highest bit of holebegin is 1. Suppose
> > holebegin is 0x8000000111111000; after the shift, hba becomes
> > 0xfff8000000111111, and vma_interval_tree_foreach then either fails
> > the lookup or returns the wrong result.
> >
> > An example of the erroneous call sequence:
> > - mmap(..., offset=0x8000000111111000)
> >   |- syscall(mmap, ... unsigned long, off):
> >      |- ksys_mmap_pgoff( ... , off >> PAGE_SHIFT);
> >
> >   Here pgoff is correctly shifted to 0x8000000111111,
> >   but passing 0x8000000111111000 as holebegin to
> >   unmap_mapping_range() then produces the wrong result, as shown below:
> >
> > - unmap_mapping_range(..., loff_t const holebegin)
> >   |- pgoff_t hba = holebegin >> PAGE_SHIFT;
> >           /* hba = 0xfff8000000111111 unexpectedly */
> >
> > Casting holebegin to an unsigned type before the shift fixes the bug.
> >
>
> Thanks.  Are you able to describe the runtime effects of this
> (obviously bad, but it's good to spell it out) and under what
> circumstances it occurs?

Thanks for the quick reply.

The issue happens in heterogeneous computing, where the
device (e.g. a GPU) and the host share the same virtual address space.

A simple workflow pattern that hits the issue is:
        /* host */
    1. userspace first mmaps a file-backed VA range with a specified offset.
                        e.g. (offset=0x800..., mmap return: va_a)
    2. userspace writes some data to the corresponding system page.
                         e.g. (va_a = 0xAABB)
        /* device */
    3. the GPU workload touches the VA, triggers a GPU fault and notifies the host.
        /* host */
    4. the host receives the GPU fault notification, then it will:
            4.1 unmap the host pages and also take care of the CPU TLB
                  (using unmap_mapping_range with offset=0x800...)
            4.2 migrate the system page to the device
            4.3 set up the device page table and resolve the device fault.
        /* device */
    5. the GPU workload continues, accesses va_a and gets 0xAABB.
    6. the GPU workload continues and writes 0xBBCC to va_a.
        /* host */
    7. userspace accesses va_a; as expected, it will:
            7.1 trigger a CPU VM fault.
            7.2 have the driver handle the fault and migrate the GPU-local page back to the host.
    8. userspace then correctly gets 0xBBCC from va_a.
    9. done

But in step 4.1, if we hit the bug this patch fixes, userspace never
triggers a CPU fault and still sees the old value, 0xAABB (a sketch of
step 4.1 follows below).
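
For context, a minimal editorial sketch of the host-side step 4.1 above;
the function name, parameters and the even_cows value are illustrative
assumptions, not code from any real driver:

/* Hypothetical handler invoked when the device reports a fault (step 4.1). */
static void host_handle_device_fault(struct address_space *mapping,
				     loff_t file_offset, loff_t len)
{
	/*
	 * With a file offset such as 0x8000000111111000, the buggy
	 * signed shift inside unmap_mapping_range() computed a bogus
	 * hba, the host PTEs were never zapped, and the CPU kept
	 * seeing the stale data (0xAABB in the example above).
	 */
	unmap_mapping_range(mapping, file_offset, len, 1);

	/*
	 * Steps 4.2 and 4.3 (migrate the page to device memory and
	 * set up the device page table) are driver-specific and
	 * omitted here.
	 */
}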



* Re: [PATCH v1] mm: fix unmap_mapping_range high bits shift bug
@ 2023-12-21 22:08     ` Andrew Morton
From: Andrew Morton @ 2023-12-21 22:08 UTC (permalink / raw)
  To: Jiajun Xie; +Cc: linux-mm, linux-kernel

On Thu, 21 Dec 2023 13:40:11 +0800 Jiajun Xie <jiajun.xie.sh@gmail.com> wrote:

> > (obviously bad, but it's good to spell it out) and under what
> > circumstances it occurs?
> 
> Thanks for the quick reply.
> 
> The issue happens in heterogeneous computing, where the
> device (e.g. a GPU) and the host share the same virtual address space.
> 
> A simple workflow pattern that hits the issue is:
>         /* host */
>     1. userspace first mmaps a file-backed VA range with a specified offset.
>                         e.g. (offset=0x800..., mmap return: va_a)
>     2. userspace writes some data to the corresponding system page.
>                          e.g. (va_a = 0xAABB)
>         /* device */
>     3. the GPU workload touches the VA, triggers a GPU fault and notifies the host.
>         /* host */
>     4. the host receives the GPU fault notification, then it will:
>             4.1 unmap the host pages and also take care of the CPU TLB
>                   (using unmap_mapping_range with offset=0x800...)
>             4.2 migrate the system page to the device
>             4.3 set up the device page table and resolve the device fault.
>         /* device */
>     5. the GPU workload continues, accesses va_a and gets 0xAABB.
>     6. the GPU workload continues and writes 0xBBCC to va_a.
>         /* host */
>     7. userspace accesses va_a; as expected, it will:
>             7.1 trigger a CPU VM fault.
>             7.2 have the driver handle the fault and migrate the GPU-local page back to the host.
>     8. userspace then correctly gets 0xBBCC from va_a.
>     9. done
> 
> But in step 4.1, if we hit the bug this patch fixes, userspace never
> triggers a CPU fault and still sees the old value, 0xAABB.

Thanks.  Based on the above, I added cc:stable to the changelog so the
fix will be backported into earlier kernels (it looks like that's 20+
years worth!).  And I pasted the above text into that changelog.



