* [RFC] prefault optimization
@ 2003-08-08  0:20 Adam Litke
  2003-08-08  1:37 ` Andrew Morton
  2003-08-18 13:19 ` Mel Gorman
  0 siblings, 2 replies; 7+ messages in thread
From: Adam Litke @ 2003-08-08  0:20 UTC (permalink / raw)
  To: linux-mm; +Cc: Martin J. Bligh

This patch attempts to reduce page fault overhead for mmap'd files.  All 
pages in the page cache that will be managed by the current vma are 
instantiated in the page table.  This boots, but some applications fail 
(e.g. make).  I am probably missing a corner case somewhere.  Let me know
what you think.
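
For reference, here is a throwaway userspace harness (not part of the
patch; the test file name is arbitrary) showing the overhead being
targeted: it maps a file, touches one byte per page, and prints the
minor-fault delta reported by getrusage().  If prefaulting kicks in,
that delta should end up well below one fault per page touched.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/resource.h>

int main(int argc, char **argv)
{
	struct rusage before, after;
	struct stat st;
	volatile char sum = 0;
	char *map;
	long i, pagesize = getpagesize();
	int fd = open(argc > 1 ? argv[1] : "/etc/services", O_RDONLY);

	if (fd < 0 || fstat(fd, &st) < 0 || st.st_size == 0) {
		fprintf(stderr, "need a readable, non-empty file\n");
		return 1;
	}
	map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	if (map == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	getrusage(RUSAGE_SELF, &before);
	for (i = 0; i < st.st_size; i += pagesize)
		sum += map[i];			/* one read touch per page */
	getrusage(RUSAGE_SELF, &after);

	printf("pages touched: %ld, minor faults taken: %ld\n",
	       (long)(st.st_size + pagesize - 1) / pagesize,
	       after.ru_minflt - before.ru_minflt);
	munmap(map, st.st_size);
	close(fd);
	return 0;
}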

--Adam Litke

diff -urN linux-2.5.73-virgin/mm/memory.c linux-2.5.73-vm/mm/memory.c
--- linux-2.5.73-virgin/mm/memory.c	2003-06-22 11:32:43.000000000 -0700
+++ linux-2.5.73-vm/mm/memory.c	2003-08-07 13:01:48.000000000 -0700
@@ -1328,6 +1328,47 @@
  	return ret;
  }

+/* Try to reduce overhead from page faults by grabbing pages from the page
+ * cache and instantiating the page table entries for this vma
+ */
+static void
+do_pre_fault(struct mm_struct *mm, struct vm_area_struct *vma, pmd_t *pmd,
+		const pte_t *page_table)
+{
+	unsigned long vm_end_pgoff, offset, address;
+	struct address_space *mapping;
+	struct page *new_page;
+	pte_t *pte, entry;
+	struct pte_chain *pte_chain;
+	
+	/* the file offset corresponding to the end of this vma */
+	vm_end_pgoff = ((vma->vm_end - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
+	mapping = vma->vm_file->f_dentry->d_inode->i_mapping;
+
+	/* Iterate through all pages managed by this vma */
+	for(offset = vma->vm_pgoff; offset < vm_end_pgoff; ++offset)
+	{
+		address = vma->vm_start + ((offset - vma->vm_pgoff) << PAGE_SHIFT);
+		pte = pte_offset_map(pmd, address);
+		if(pte_none(*pte)) { /* don't touch instantiated ptes */
+			new_page = find_get_page(mapping, offset);
+			if(!new_page)
+				continue;
+			
+			/* This code taken directly from do_no_page() */
+			pte_chain = pte_chain_alloc(GFP_KERNEL);
+			++mm->rss;
+			flush_icache_page(vma, new_page);
+			entry = mk_pte(new_page, vma->vm_page_prot);
+			set_pte(pte, entry);
+			pte_chain = page_add_rmap(new_page, pte, pte_chain);
+			pte_unmap(page_table);
+			update_mmu_cache(vma, address, *pte);
+			pte_chain_free(pte_chain);
+		}
+	}
+}
+
  /*
   * do_no_page() tries to create a new page mapping. It aggressively
   * tries to share with existing pages, but makes a separate copy if
@@ -1405,6 +1446,8 @@
  		set_pte(page_table, entry);
  		pte_chain = page_add_rmap(new_page, page_table, pte_chain);
  		pte_unmap(page_table);
+		/* if (!write_access) */
+		do_pre_fault(mm, vma, pmd, page_table);
  	} else {
  		/* One of our sibling threads was faster, back out. */
  		pte_unmap(page_table);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <aart@kvack.org>

* Re: [RFC] prefault optimization
@ 2003-08-15 16:08 Adam Litke
  0 siblings, 0 replies; 7+ messages in thread
From: Adam Litke @ 2003-08-15 16:08 UTC (permalink / raw)
  To: linux-mm; +Cc: Martin J. Bligh, Andrew Morton

Here is the latest on the pre-fault code I posted last week.  I fixed it
up some in response to comments I received.  There is still a bug which
causes some programs to segfault (e.g. gcc).  Interestingly, man fails
the first time it is run, but subsequent runs are successful.  Please
take a look and let me know what you think.  Any ideas about that bug?

On Thu, 2003-08-07 at 18:37, Andrew Morton wrote:
> I'd like to see it using find_get_pages() though.

This implementation is simple but somewhat wasteful.  My basic testing
shows around 30% of pages returned from find_get_pages() aren't used.
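
(Sketch only, not in the patch below, reusing the vma_nr_pages() helper it
introduces: capping the request at the number of pages the vma actually
covers would keep find_get_pages() from handing back pages past vm_end
that only get pinned and released again.)

	/* hypothetical tweak: bound the batch by the vma length */
	unsigned int want = vma_nr_pages(vma) < PTRS_PER_PTE ?
				vma_nr_pages(vma) : PTRS_PER_PTE;

	num_pages = find_get_pages(mapping, vma->vm_pgoff, want, pages);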

> And find a way to hold the pte page's atomic kmap across the whole pte page

I map the pte page once at the beginning but have to drop that mapping
whenever I need to allocate a pte_chain.  Perhaps it could be done a
better way.

> Perhaps it can use install_page() as well, rather than open-coding it?

It seems that install_page() does too much for what I need.  For starters,
it zaps the existing pte.  There is also no need to do the pgd lookup every
time, because I already know the correct pte entry to use.

> Cannot do a sleeping allocation while holding the atomic kmap from pte_offset_map().

I took a dirty approach to this one.  Is it ok to hold the
page_table_lock throughout this function?

> And the pte_chain handling can be optimised:

I think I am pretty close here.  In my brief test 10% of mapped pages
required a call to pte_chain_alloc.

--Adam

diff -urN linux-2.5.73-virgin/include/asm-i386/pgtable.h linux-2.5.73-vm/include/asm-i386/pgtable.h
--- linux-2.5.73-virgin/include/asm-i386/pgtable.h	2003-06-22 11:33:04.000000000 -0700
+++ linux-2.5.73-vm/include/asm-i386/pgtable.h	2003-08-13 07:08:18.000000000 -0700
@@ -299,12 +299,15 @@
  	((pte_t *)kmap_atomic(pmd_page(*(dir)),KM_PTE0) + pte_index(address))
  #define pte_offset_map_nested(dir, address) \
  	((pte_t *)kmap_atomic(pmd_page(*(dir)),KM_PTE1) + pte_index(address))
+#define pte_base_map(dir) \
+	((pte_t *)kmap_atomic(pmd_page(*(dir)),KM_PTE0))
  #define pte_unmap(pte) kunmap_atomic(pte, KM_PTE0)
  #define pte_unmap_nested(pte) kunmap_atomic(pte, KM_PTE1)
  #else
  #define pte_offset_map(dir, address) \
  	((pte_t *)page_address(pmd_page(*(dir))) + pte_index(address))
  #define pte_offset_map_nested(dir, address) pte_offset_map(dir, address)
+#define pte_base_map(dir) ((pte_t *)page_address(pmd_page(*(dir))))
  #define pte_unmap(pte) do { } while (0)
  #define pte_unmap_nested(pte) do { } while (0)
  #endif
diff -urN linux-2.5.73-virgin/mm/memory.c linux-2.5.73-vm/mm/memory.c
--- linux-2.5.73-virgin/mm/memory.c	2003-06-22 11:32:43.000000000 -0700
+++ linux-2.5.73-vm/mm/memory.c	2003-08-14 07:57:54.000000000 -0700
@@ -1328,6 +1328,74 @@
  	return ret;
  }

+#define vma_nr_pages(vma) \
+	((vma->vm_end - vma->vm_start) >> PAGE_SHIFT)
+
+/* Try to reduce overhead from page faults by grabbing pages from the page
+ * cache and instantiating the page table entries for this vma
+ */
+
+unsigned long prefault_entered = 0;
+unsigned long prefault_pages_mapped = 0;
+unsigned long prefault_pte_alloc = 0;
+unsigned long prefault_unused_pages = 0;
+
+static void
+do_pre_fault(struct mm_struct *mm, struct vm_area_struct *vma, pmd_t *pmd)
+{
+	unsigned long offset, address;
+	struct address_space *mapping;
+	struct page *new_page;
+	pte_t *pte, *pte_base;
+	struct pte_chain *pte_chain;
+	unsigned int i, num_pages;
+	struct page **pages;
+	
+	/* debug */ ++prefault_entered;
+	pages = kmalloc(PTRS_PER_PTE * sizeof(struct page*), GFP_KERNEL);
+	mapping = vma->vm_file->f_dentry->d_inode->i_mapping;
+	num_pages = find_get_pages(mapping, vma->vm_pgoff, PTRS_PER_PTE, pages);
+
+	pte_chain = pte_chain_alloc(GFP_KERNEL);
+	pte_base = pte_base_map(pmd);
+
+	/* Iterate through all pages managed by this vma */
+	for (i = 0; i < num_pages; ++i)
+	{
+		new_page = pages[i];
+		if (new_page->index >= (vma->vm_pgoff + vma_nr_pages(vma)))
+			break; /* The rest of the pages are not in this vma */
+		offset = new_page->index - vma->vm_pgoff;
+		address = vma->vm_start + (offset << PAGE_SHIFT);
+		pte = pte_base + pte_index(address);
+		if (pte_none(*pte)) {
+			mm->rss++;
+			flush_icache_page(vma, new_page);
+			set_pte(pte, mk_pte(new_page, vma->vm_page_prot));
+			if (pte_chain == NULL) {
+				pte_unmap(pte_base);
+				/* debug */ ++prefault_pte_alloc;
+				pte_chain = pte_chain_alloc(GFP_KERNEL);
+				pte_base = pte_base_map(pmd);
+			}
+			pte_chain = page_add_rmap(new_page, pte, pte_chain);
+			update_mmu_cache(vma, address, *pte);
+			/* debug */ ++prefault_pages_mapped;
+			pages[i] = NULL;
+		}
+	}
+	pte_unmap(pte_base);
+	pte_chain_free(pte_chain);
+
+	/* Release the pages we did not successfully add */
+	for (i = 0; i < num_pages; ++i)
+		if (pages[i]) {
+			/* debug */ ++prefault_unused_pages;
+			page_cache_release(pages[i]);
+		}
+	kfree(pages);
+}
+
  /*
   * do_no_page() tries to create a new page mapping. It aggressively
   * tries to share with existing pages, but makes a separate copy if
@@ -1416,6 +1484,8 @@

  	/* no need to invalidate: a not-present page shouldn't be cached */
  	update_mmu_cache(vma, address, entry);
+
+	do_pre_fault(mm, vma, pmd);
  	spin_unlock(&mm->page_table_lock);
  	ret = VM_FAULT_MAJOR;
  	goto out;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <aart@kvack.org>


Thread overview: 7+ messages
2003-08-08  0:20 [RFC] prefault optimization Adam Litke
2003-08-08  1:37 ` Andrew Morton
2003-08-14 21:45   ` Adam Litke
2003-08-14 21:47   ` Adam Litke
2003-08-18 13:19 ` Mel Gorman
2003-08-18 17:15   ` Martin J. Bligh
2003-08-15 16:08 Adam Litke
