linux-mm.kvack.org archive mirror
* [patch] optimize follow_hugetlb_page
@ 2006-03-09 11:26 Chen, Kenneth W
  2006-03-10  3:54 ` Andrew Morton
  0 siblings, 1 reply; 2+ messages in thread
From: Chen, Kenneth W @ 2006-03-09 11:26 UTC (permalink / raw)
  To: wli, 'Andrew Morton'; +Cc: linux-mm

follow_hugetlb_page() walks a range of user virtual addresses and
fills in an array of struct page * that is passed in via the
argument list.  It also takes a reference on each page via
get_page().  For a compound page, get_page() actually traverses
back to the head page via the page_private() macro and then adds a
reference count to the head page.  Since we are doing a virt-to-pte
lookup, the kernel already has a struct page pointer to the head
page.  So instead of descending into the constituent small page's
struct page and then following a link back to the head page,
optimize this by incrementing the reference count directly on the
head page.
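
For reference, the compound-page path inside get_page() that this
patch sidesteps looked roughly like the following in kernels of that
era (a sketch for illustration, not a verbatim copy of
include/linux/mm.h):

	static inline void get_page(struct page *page)
	{
		/* A tail page chases page_private() back to the head
		 * page before the refcount is taken -- this is the
		 * pointer round trip the patch avoids. */
		if (unlikely(PageCompound(page)))
			page = (struct page *)page_private(page);
		atomic_inc(&page->_count);
	}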

The benefit is that we don't take a cache miss on accessing the
struct page for the corresponding user address and, more
importantly, we don't pollute the cache with a "not very useful"
round trip of pointer chasing.  This gives a moderate performance
gain on an I/O intensive database transaction workload.


Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>


--- ./mm/hugetlb.c.orig	2006-02-22 18:57:37.218102659 -0800
+++ ./mm/hugetlb.c	2006-02-22 20:49:33.008059453 -0800
@@ -521,10 +521,9 @@ int follow_hugetlb_page(struct mm_struct
 			struct page **pages, struct vm_area_struct **vmas,
 			unsigned long *position, int *length, int i)
 {
-	unsigned long vpfn, vaddr = *position;
+	unsigned long pidx, vaddr = *position;
 	int remainder = *length;
 
-	vpfn = vaddr/PAGE_SIZE;
 	spin_lock(&mm->page_table_lock);
 	while (vaddr < vma->vm_end && remainder) {
 		pte_t *pte;
@@ -552,19 +551,23 @@ int follow_hugetlb_page(struct mm_struct
 			break;
 		}
 
-		if (pages) {
-			page = &pte_page(*pte)[vpfn % (HPAGE_SIZE/PAGE_SIZE)];
-			get_page(page);
-			pages[i] = page;
-		}
+		pidx = (vaddr & ~HPAGE_MASK) >> PAGE_SHIFT;
+		page = pte_page(*pte);
+same_page:
+		get_page(page);
+		if (pages)
+			pages[i] = page + pidx;
 
 		if (vmas)
 			vmas[i] = vma;
 
 		vaddr += PAGE_SIZE;
-		++vpfn;
+		++pidx;
 		--remainder;
 		++i;
+		if (vaddr < vma->vm_end && remainder &&
+		    pidx < HPAGE_SIZE/PAGE_SIZE)
+			goto same_page;
 	}
 	spin_unlock(&mm->page_table_lock);
 	*length = remainder;




* Re: [patch] optimize follow_hugetlb_page
  2006-03-09 11:26 [patch] optimize follow_hugetlb_page Chen, Kenneth W
@ 2006-03-10  3:54 ` Andrew Morton
  0 siblings, 0 replies; 2+ messages in thread
From: Andrew Morton @ 2006-03-10  3:54 UTC (permalink / raw)
  To: Chen, Kenneth W; +Cc: wli, linux-mm, David Gibson

"Chen, Kenneth W" <kenneth.w.chen@intel.com> wrote:
>
>  follow_hugetlb_page() walks a range of user virtual addresses and
>  fills in an array of struct page * that is passed in via the
>  argument list.  It also takes a reference on each page via
>  get_page().  For a compound page, get_page() actually traverses
>  back to the head page via the page_private() macro and then adds a
>  reference count to the head page.  Since we are doing a virt-to-pte
>  lookup, the kernel already has a struct page pointer to the head
>  page.  So instead of descending into the constituent small page's
>  struct page and then following a link back to the head page,
>  optimize this by incrementing the reference count directly on the
>  head page.
> 
>  The benefit is that we don't take a cache miss on accessing the
>  struct page for the corresponding user address and, more
>  importantly, we don't pollute the cache with a "not very useful"
>  round trip of pointer chasing.  This gives a moderate performance
>  gain on an I/O intensive database transaction workload.
> 
> 
>  Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
> 
> 
>  --- ./mm/hugetlb.c.orig	2006-02-22 18:57:37.218102659 -0800
>  +++ ./mm/hugetlb.c	2006-02-22 20:49:33.008059453 -0800
>  @@ -521,10 +521,9 @@ int follow_hugetlb_page(struct mm_struct
>   			struct page **pages, struct vm_area_struct **vmas,
>   			unsigned long *position, int *length, int i)
>   {
>  -	unsigned long vpfn, vaddr = *position;
>  +	unsigned long pidx, vaddr = *position;

So I spent some time trying to divine what "pidx" means, and ended up
deciding that it doesn't.  So I renamed it to pfn_offset and, being a kind
soul, I added a comment to help out the next guy.

The patch assumes that all pageframes which represent a compound page are
contiguously laid out in mem_map[].  Which is reasonable, I guess.
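
Concretely, the optimized path derives each small page's struct page
by plain pointer arithmetic from the head page.  A hypothetical
helper (not part of the patch) makes the assumption explicit:

	/*
	 * Returns the struct page backing the base page at vaddr within
	 * a huge page whose head pageframe is 'head'.  Only valid if the
	 * tail pageframes immediately follow the head in mem_map[].
	 */
	static struct page *hugetlb_subpage(struct page *head,
					    unsigned long vaddr)
	{
		unsigned long pfn_offset = (vaddr & ~HPAGE_MASK) >> PAGE_SHIFT;

		return head + pfn_offset;
	}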


--- devel/mm/hugetlb.c~optimize-follow_hugetlb_page	2006-03-09 19:46:04.000000000 -0800
+++ devel-akpm/mm/hugetlb.c	2006-03-09 19:51:34.000000000 -0800
@@ -661,10 +661,10 @@ int follow_hugetlb_page(struct mm_struct
 			struct page **pages, struct vm_area_struct **vmas,
 			unsigned long *position, int *length, int i)
 {
-	unsigned long vpfn, vaddr = *position;
+	unsigned long pfn_offset;
+	unsigned long vaddr = *position;
 	int remainder = *length;
 
-	vpfn = vaddr/PAGE_SIZE;
 	spin_lock(&mm->page_table_lock);
 	while (vaddr < vma->vm_end && remainder) {
 		pte_t *pte;
@@ -692,19 +692,28 @@ int follow_hugetlb_page(struct mm_struct
 			break;
 		}
 
-		if (pages) {
-			page = &pte_page(*pte)[vpfn % (HPAGE_SIZE/PAGE_SIZE)];
-			get_page(page);
-			pages[i] = page;
-		}
+		pfn_offset = (vaddr & ~HPAGE_MASK) >> PAGE_SHIFT;
+		page = pte_page(*pte);
+same_page:
+		get_page(page);
+		if (pages)
+			pages[i] = page + pfn_offset;
 
 		if (vmas)
 			vmas[i] = vma;
 
 		vaddr += PAGE_SIZE;
-		++vpfn;
+		++pfn_offset;
 		--remainder;
 		++i;
+		if (vaddr < vma->vm_end && remainder &&
+				pfn_offset < HPAGE_SIZE/PAGE_SIZE) {
+			/*
+			 * We use pfn_offset to avoid touching the pageframes
+			 * of this compound page.
+			 */
+			goto same_page;
+		}
 	}
 	spin_unlock(&mm->page_table_lock);
 	*length = remainder;
_


