linux-mm.kvack.org archive mirror
* [PATCH] futex: bugfix for futex-key conflict when futex use hugepage
@ 2013-04-08  6:34 jiang.biao2
  0 siblings, 0 replies; 11+ messages in thread
From: jiang.biao2 @ 2013-04-08  6:34 UTC
  To: linux-mm, linux-kernel
  Cc: Zhang Yi, Ma Chenggong, Liu Dong, Cui Yunfeng, Lu Zhongjun, Jiang Biao

From: Zhang Yi <zhang.yi20@zte.com.cn>

For a futex shared between processes, the futex key is determined by the
page offset, the mapping host, and the mapping index of the user-space
address.
User applications that place futexes in a hugepage may hit futex-key
conflicts. Assume there are two or more futexes in different normal pages
of the same hugepage, and each futex has the same offset within its normal
page: all of these futexes then get the same futex key.
In that case the futexes cannot be told apart, and waiters may not be
woken correctly.

This patch adds the index of the normal page within the compound page to
the offset field of the futex key.
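
The collision and the proposed fix can be modelled in user space. The
sketch below is only an illustration, not kernel code: struct model_key,
model_key_old(), model_key_patched() and the MODEL_* constants are
hypothetical names, 4 KB base pages are assumed, and the offset is held in
an unsigned long for simplicity (the kernel field is an int).

```c
#include <assert.h>

#define MODEL_PAGE_SHIFT 12                       /* assumption: 4 KB base pages */
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)
#define MODEL_OFF_INODE  1UL                      /* stands in for FUT_OFF_INODE */

/* Simplified stand-in for the shared (inode-based) part of a futex key. */
struct model_key {
	unsigned long inode;   /* stands in for page_head->mapping->host */
	unsigned long pgoff;   /* stands in for page_head->index */
	unsigned long offset;  /* in-page offset plus flag bits */
};

/* Old scheme: only the offset inside the normal page enters the key. */
static struct model_key model_key_old(unsigned long inode,
				      unsigned long head_pgoff,
				      unsigned long byte_off)
{
	struct model_key k;

	assert(byte_off % 4 == 0);  /* futex words are naturally aligned */
	k.inode = inode;
	k.pgoff = head_pgoff;
	k.offset = (byte_off % MODEL_PAGE_SIZE) | MODEL_OFF_INODE;
	return k;
}

/* Patched scheme: the tail-page index is folded into the offset as well. */
static struct model_key model_key_patched(unsigned long inode,
					  unsigned long head_pgoff,
					  unsigned long byte_off)
{
	struct model_key k = model_key_old(inode, head_pgoff, byte_off);
	unsigned long comp_idx = byte_off >> MODEL_PAGE_SHIFT;

	k.offset |= comp_idx << MODEL_PAGE_SHIFT;
	return k;
}

/* Two futexes are indistinguishable when every key field matches. */
static int model_key_equal(struct model_key a, struct model_key b)
{
	return a.inode == b.inode && a.pgoff == b.pgoff && a.offset == b.offset;
}
```

With byte offsets 0 and MODEL_PAGE_SIZE inside the same hugepage, the old
model yields equal keys, while the patched model yields different ones.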

Steps to reproduce the bug:
1. The 1st thread maps a hugetlbfs file, uses the returned address as the
1st mutex's address, and uses the returned address plus PAGE_SIZE as the
2nd mutex's address.
2. The 1st thread initializes both mutexes with the pshared attribute and
locks both of them.
3. The 1st thread creates the 2nd thread, which blocks on the 1st mutex.
4. The 1st thread creates the 3rd thread, which blocks on the 2nd mutex.
5. The 1st thread unlocks the 2nd mutex; the 3rd thread cannot take the
2nd mutex and may block forever.

Signed-off-by: Zhang Yi <zhang.yi20@zte.com.cn>
Tested-by: Ma Chenggong <ma.chenggong@zte.com.cn>
Reviewed-by: Liu Dong <liu.dong3@zte.com.cn>
Reviewed-by: Cui Yunfeng <cui.yunfeng@zte.com.cn>
Reviewed-by: Lu Zhongjun <lu.zhongjun@zte.com.cn>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>

diff -uprN orig/linux-3.9-rc5/include/linux/mm.h new/linux-3.9-rc5/include/linux/mm.h
--- orig/linux-3.9-rc5/include/linux/mm.h       2013-03-31 22:12:43.000000000 +0000
+++ new/linux-3.9-rc5/include/linux/mm.h        2013-04-03 11:01:19.671403000 +0000
@@ -502,6 +502,20 @@ static inline void set_compound_order(st
        page[1].lru.prev = (void *)order;
 }
 
+static inline void set_page_compound_index(struct page *page, int index)
+{
+       if (PageHead(page))
+               return;
+       page->index = index;
+}
+
+static inline int get_page_compound_index(struct page *page)
+{
+       if (PageHead(page))
+               return 0;
+       return page->index;
+}
+
 #ifdef CONFIG_MMU
 /*
  * Do pte_mkwrite, but only if the vma says VM_WRITE.  We do this when
diff -uprN orig/linux-3.9-rc5/kernel/futex.c new/linux-3.9-rc5/kernel/futex.c
--- orig/linux-3.9-rc5/kernel/futex.c   2013-03-31 22:12:43.000000000 +0000
+++ new/linux-3.9-rc5/kernel/futex.c    2013-04-03 11:03:42.168663000 +0000
@@ -239,7 +239,7 @@ get_futex_key(u32 __user *uaddr, int fsh
        unsigned long address = (unsigned long)uaddr;
        struct mm_struct *mm = current->mm;
        struct page *page, *page_head;
-       int err, ro = 0;
+       int err, ro = 0, compound_index = 0;
 
        /*
         * The futex address must be "naturally" aligned.
@@ -299,6 +299,7 @@ again:
                         * freed from under us.
                         */
                        if (page != page_head) {
+                               compound_index = get_page_compound_index(page);
                                get_page(page_head);
                                put_page(page);
                        }
@@ -311,6 +312,7 @@ again:
 #else
        page_head = compound_head(page);
        if (page != page_head) {
+               compound_index = get_page_compound_index(page);
                get_page(page_head);
                put_page(page);
        }
@@ -363,7 +365,7 @@ again:
                key->private.mm = mm;
                key->private.address = address;
        } else {
-               key->both.offset |= FUT_OFF_INODE; /* inode-based key */
+               key->both.offset |= (compound_index << PAGE_SHIFT) | FUT_OFF_INODE; /* inode-based key */
                key->shared.inode = page_head->mapping->host;
                key->shared.pgoff = page_head->index;
        }
diff -uprN orig/linux-3.9-rc5/mm/hugetlb.c new/linux-3.9-rc5/mm/hugetlb.c
--- orig/linux-3.9-rc5/mm/hugetlb.c     2013-03-31 22:12:43.000000000 +0000
+++ new/linux-3.9-rc5/mm/hugetlb.c      2013-04-03 11:02:10.556132000 +0000
@@ -667,6 +667,7 @@ static void prep_compound_gigantic_page(
        for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
                __SetPageTail(p);
                set_page_count(p, 0);
+               set_page_compound_index(p, i);
                p->first_page = page;
        }
 }
diff -uprN orig/linux-3.9-rc5/mm/page_alloc.c new/linux-3.9-rc5/mm/page_alloc.c
--- orig/linux-3.9-rc5/mm/page_alloc.c  2013-03-31 22:12:43.000000000 +0000
+++ new/linux-3.9-rc5/mm/page_alloc.c   2013-04-03 11:01:47.933353000 +0000
@@ -361,6 +361,7 @@ void prep_compound_page(struct page *pag
                struct page *p = page + i;
                __SetPageTail(p);
                set_page_count(p, 0);
+               set_page_compound_index(p, i);
                p->first_page = page;
        }
 }

* [PATCH] futex: bugfix for futex-key conflict when futex use hugepage
@ 2013-04-16  3:37 zhang.yi20
  2013-04-16 17:57 ` Darren Hart
  2013-04-16 18:37 ` Dave Hansen
  0 siblings, 2 replies; 11+ messages in thread
From: zhang.yi20 @ 2013-04-16  3:37 UTC
  To: linux-kernel, linux-mm
  Cc: Peter Zijlstra, Darren Hart, Thomas Gleixner, Ingo Molnar

Hello,

For a futex shared between processes, the futex key is determined by the
page offset, the mapping host, and the mapping index of the user-space
address.
User applications that place futexes in a hugepage may hit futex-key
conflicts. Assume there are two or more futexes in different normal pages
of the same hugepage, and each futex has the same offset within its normal
page: all of these futexes then get the same futex key.
In that case the futexes cannot be told apart, and waiters may not be
woken correctly.

This patch adds the index of the normal page within the compound page to
the offset field of the futex key.

Steps to reproduce the bug:
1. The 1st thread maps a hugetlbfs file, uses the returned address as the
1st mutex's address, and uses the returned address plus PAGE_SIZE as the
2nd mutex's address.
2. The 1st thread initializes both mutexes with the pshared attribute and
locks both of them.
3. The 1st thread creates the 2nd thread, which blocks on the 1st mutex.
4. The 1st thread creates the 3rd thread, which blocks on the 2nd mutex.
5. The 1st thread unlocks the 2nd mutex; the 3rd thread cannot take the
2nd mutex and may block forever.
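
The steps above could be sketched roughly as follows. This is not the
testcase from the thread: repro_hugepage_futex(), waiter(), the 2 MB
hugepage size, the timeout and the sleep-based ordering are all assumptions
made for illustration, and a timed lock is used so that an affected kernel
reports ETIMEDOUT instead of hanging.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define HUGE_LEN  (2UL * 1024 * 1024)  /* assumption: 2 MB hugepages */
#define WAIT_SECS 5                    /* arbitrary timeout instead of hanging */

/* Block on the mutex for up to WAIT_SECS and return the timedlock result. */
static void *waiter(void *arg)
{
	pthread_mutex_t *m = arg;
	struct timespec ts;
	int rc;

	clock_gettime(CLOCK_REALTIME, &ts);
	ts.tv_sec += WAIT_SECS;
	rc = pthread_mutex_timedlock(m, &ts);
	if (rc == 0)
		pthread_mutex_unlock(m);
	return (void *)(intptr_t)rc;
}

/*
 * path must name a file on a hugetlbfs mount. Returns 0 if the 3rd thread
 * got the 2nd mutex, ETIMEDOUT if it stayed blocked, -1 if setup failed.
 */
static int repro_hugepage_futex(const char *path)
{
	long page = sysconf(_SC_PAGESIZE);
	pthread_mutexattr_t attr;
	pthread_mutex_t *m1, *m2;
	pthread_t t2, t3;
	void *res3 = NULL;
	char *base;
	int fd;

	assert(page > 0 && 2 * page <= (long)HUGE_LEN);
	fd = open(path, O_CREAT | O_RDWR, 0600);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, HUGE_LEN) != 0) {
		close(fd);
		unlink(path);
		return -1;
	}
	base = mmap(NULL, HUGE_LEN, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);
	unlink(path);
	if (base == MAP_FAILED)
		return -1;

	/* Step 1: two mutexes, one normal page apart, in the same hugepage. */
	m1 = (pthread_mutex_t *)base;
	m2 = (pthread_mutex_t *)(base + page);

	/* Step 2: process-shared mutexes, both locked by the 1st thread. */
	pthread_mutexattr_init(&attr);
	pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
	pthread_mutex_init(m1, &attr);
	pthread_mutex_init(m2, &attr);
	pthread_mutexattr_destroy(&attr);
	pthread_mutex_lock(m1);
	pthread_mutex_lock(m2);

	/* Steps 3 and 4: the sleeps are a crude way to let each thread block. */
	pthread_create(&t2, NULL, waiter, m1);
	sleep(1);
	pthread_create(&t3, NULL, waiter, m2);
	sleep(1);

	/* Step 5: unlock the 2nd mutex and see whether the 3rd thread runs. */
	pthread_mutex_unlock(m2);
	pthread_join(t3, &res3);

	pthread_mutex_unlock(m1);
	pthread_join(t2, NULL);
	munmap(base, HUGE_LEN);
	return (int)(intptr_t)res3;
}
```

A caller would pass a file name on its own hugetlbfs mount (for example a
hypothetical /mnt/huge/futex-test) and build with -pthread; a return value
of ETIMEDOUT would suggest the 3rd thread was never woken.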

Signed-off-by: Zhang Yi <zhang.yi20@zte.com.cn>
Tested-by: Ma Chenggong <ma.chenggong@zte.com.cn>
Reviewed-by: Liu Dong <liu.dong3@zte.com.cn>
Reviewed-by: Cui Yunfeng <cui.yunfeng@zte.com.cn>
Reviewed-by: Lu Zhongjun <lu.zhongjun@zte.com.cn>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>

diff -uprN orig/linux-3.9-rc7/include/linux/mm.h new/linux-3.9-rc7/include/linux/mm.h
--- orig/linux-3.9-rc7/include/linux/mm.h       2013-04-15 00:45:16.000000000 +0000
+++ new/linux-3.9-rc7/include/linux/mm.h        2013-04-16 11:21:59.573458000 +0000
@@ -502,6 +502,20 @@ static inline void set_compound_order(st
        page[1].lru.prev = (void *)order;
 }
 
+static inline void set_page_compound_index(struct page *page, int index)
+{
+       if (PageHead(page))
+               return;
+       page->index = index;
+}
+
+static inline int get_page_compound_index(struct page *page)
+{
+       if (PageHead(page))
+               return 0;
+       return page->index;
+}
+
 #ifdef CONFIG_MMU
 /*
  * Do pte_mkwrite, but only if the vma says VM_WRITE.  We do this when
diff -uprN orig/linux-3.9-rc7/kernel/futex.c new/linux-3.9-rc7/kernel/futex.c
--- orig/linux-3.9-rc7/kernel/futex.c   2013-04-15 00:45:16.000000000 +0000
+++ new/linux-3.9-rc7/kernel/futex.c    2013-04-16 11:13:30.069887000 +0000
@@ -239,7 +239,7 @@ get_futex_key(u32 __user *uaddr, int fsh
        unsigned long address = (unsigned long)uaddr;
        struct mm_struct *mm = current->mm;
        struct page *page, *page_head;
-       int err, ro = 0;
+       int err, ro = 0, comp_idx = 0;
 
        /*
         * The futex address must be "naturally" aligned.
@@ -299,6 +299,7 @@ again:
                         * freed from under us.
                         */
                        if (page != page_head) {
+                               comp_idx = get_page_compound_index(page);
                                get_page(page_head);
                                put_page(page);
                        }
@@ -311,6 +312,7 @@ again:
 #else
        page_head = compound_head(page);
        if (page != page_head) {
+               comp_idx = get_page_compound_index(page);
                get_page(page_head);
                put_page(page);
        }
@@ -363,7 +365,8 @@ again:
                key->private.mm = mm;
                key->private.address = address;
        } else {
-               key->both.offset |= FUT_OFF_INODE; /* inode-based key */
+               key->both.offset |= (comp_idx << PAGE_SHIFT)
+                                   | FUT_OFF_INODE; /* inode-based key */
                key->shared.inode = page_head->mapping->host;
                key->shared.pgoff = page_head->index;
        }
diff -uprN orig/linux-3.9-rc7/mm/hugetlb.c new/linux-3.9-rc7/mm/hugetlb.c
--- orig/linux-3.9-rc7/mm/hugetlb.c     2013-04-15 00:45:16.000000000 +0000
+++ new/linux-3.9-rc7/mm/hugetlb.c      2013-04-16 10:23:02.658531000 +0000
@@ -667,6 +667,7 @@ static void prep_compound_gigantic_page(
        for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
                __SetPageTail(p);
                set_page_count(p, 0);
+               set_page_compound_index(p, i);
                p->first_page = page;
        }
 }
diff -uprN orig/linux-3.9-rc7/mm/page_alloc.c new/linux-3.9-rc7/mm/page_alloc.c
--- orig/linux-3.9-rc7/mm/page_alloc.c  2013-04-15 00:45:16.000000000 +0000
+++ new/linux-3.9-rc7/mm/page_alloc.c   2013-04-16 10:23:16.452393000 +0000
@@ -361,6 +361,7 @@ void prep_compound_page(struct page *pag
                struct page *p = page + i;
                __SetPageTail(p);
                set_page_count(p, 0);
+               set_page_compound_index(p, i);
                p->first_page = page;
        }
 }


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: Re: [PATCH] futex: bugfix for futex-key conflict when futex use hugepage
  2013-04-16 17:57 ` Darren Hart
@ 2013-04-17  9:55 zhang.yi20
  2013-04-17 14:18 ` Darren Hart
  -1 siblings, 1 reply; 11+ messages in thread
From: zhang.yi20 @ 2013-04-17  9:55 UTC
  To: Darren Hart
  Cc: linux-kernel, linux-mm, Ingo Molnar, Peter Zijlstra, Thomas Gleixner

Darren Hart <dvhart@linux.intel.com> wrote on 2013/04/17 01:57:10:

> Again, a functional testcase in futextest would be a good idea. This
> helps validate the patch and also can be used to identify regressions in
> the future.

I will post the testcase code later.

> 
> What is the max value of comp_idx? Are we at risk of truncating it?
> Looks like not really from my initial look.
> 
> This also needs a comment in futex.h describing the usage of the offset
> field in union futex_key as well as above get_futex_key describing the
> key for shared mappings.
> 
> 

As far as I know, the maximum size of one hugepage on x86 CPUs is 1 GB.
Can some other CPUs support larger hugepages, even larger than 4 GB? If
so, we can change the type of 'offset' from int to long to avoid
truncation.
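
The truncation question can be checked with a little arithmetic. The
sketch below is only an illustration under stated assumptions: 4 KB base
pages, a 32-bit signed 'offset' field, and hypothetical helper names
max_comp_idx() and fits_in_int_offset().

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12  /* assumption: 4 KB base pages */

/* Largest tail-page index inside a hugepage of the given size in bytes. */
static uint64_t max_comp_idx(uint64_t huge_page_bytes)
{
	assert(huge_page_bytes >= (1ULL << MODEL_PAGE_SHIFT));
	return (huge_page_bytes >> MODEL_PAGE_SHIFT) - 1;
}

/*
 * Returns 1 if (comp_idx << PAGE_SHIFT) for the last tail page still fits
 * in a non-negative 32-bit int, and 0 if it would be truncated.
 */
static int fits_in_int_offset(uint64_t huge_page_bytes)
{
	uint64_t shifted = max_comp_idx(huge_page_bytes) << MODEL_PAGE_SHIFT;

	return shifted <= (uint64_t)INT32_MAX;
}
```

Under these assumptions a 1 GB hugepage has a largest index of 262143,
which shifts to 0x3FFFF000 and fits; a 4 GB hugepage would shift to
0xFFFFF000 and would not fit in an int.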



Thread overview: 11+ messages
2013-04-08  6:34 [PATCH] futex: bugfix for futex-key conflict when futex use hugepage jiang.biao2
2013-04-16  3:37 zhang.yi20
2013-04-16 17:57 ` Darren Hart
2013-04-16 18:37 ` Dave Hansen
2013-04-16 18:47   ` Darren Hart
2013-04-17  9:55 zhang.yi20
2013-04-17 14:18 ` Darren Hart
2013-04-17 15:26   ` Dave Hansen
2013-04-17 15:51     ` Darren Hart
2013-04-18  8:05       ` zhang.yi20
2013-04-18 14:34         ` Darren Hart
2013-04-19  2:13           ` zhang.yi20
2013-04-19  2:42             ` Darren Hart
2013-04-19  2:45             ` Darren Hart
