linux-mm.kvack.org archive mirror
* [PATCH 1/15] PTI: clean page table interface
@ 2005-05-21  2:43 Paul Davies
  2005-05-21  2:53 ` [PATCH 2/15] PTI: Add general files and directories Paul Cameron Davies
  2005-05-28  8:53 ` [PATCH 1/15] PTI: clean page table interface Christoph Hellwig
From: Paul Davies @ 2005-05-21  2:43 UTC
  To: linux-mm

Here are a set of 15 patches against 2.6.12-rc4 to provide a clean
page table interface so that alternate page tables can be fitted
to Linux in the future.  This patch set is produced on behalf of
the Gelato research group at the University of New South Wales.

LMbench results are included at the end of this patch set.  The
results are very good, although the mmap latency figures were
slightly higher than expected.

I look forward to any feedback that will assist me in putting
together a page table interface that will benefit the whole Linux
community.

Paul C Davies (for Gelato@UNSW)

Patch 1 of 15.

			GENERAL INFORMATION

The current page table implementation is tightly interwoven with
the rest of the virtual memory code.  This makes it difficult to
implement new page tables, or to change the existing implementation.

This patch series attempts to abstract out the page table, so that
architectures can replace it with one that is more friendly if they
wish.  It's probable that architectures such as i386 and ARM, where
the hardware walks the current page table directly, will not want to
change it; but IA64 amongst others may wish to try page tables more
suited to huge sparse virtual memory layouts, or page tables that can
be hardware walked.

A new Kconfig option allows selecting the format; at present it's a
choice of one entry, but that will change in the future.

LMbench and similar microbenchmarks show no significant performance
degradation after the full patch set is applied, on i386, Pentium-4
or IA64 McKinley.  The patch set passes all VM tests in the LTP test
suite (ltp-20050505).

There are 15 patches.  The general story is:
	* Introduce the architecture independent interface minus
	  iterators.
	* Move relevant code behind interface.
	* Go through each function in the general interface and call
	  it.
	* Introduce iterators.
	* Go through and call all iterators.
Up to this point, all architectures run through the new interface by default.
	* Now introduce the ia64 mlpt specific interface.
	* Move architecture specific mlpt code behind interface and
	  call the new interface.

The first patch introduces the architecture-independent interface
minus the iterators.  Kconfig options for architectures other than
i386 and IA64 will be added in a later patch series.
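
To give a feel for the calling convention, here is a minimal sketch
of a fault-path caller using the new interface, modelled on the
conversions made in patches 6 and 7.  It is illustrative only: the
function example_touch and its locals are not part of the patch set.

	#include <linux/page_table.h>

	static int example_touch(struct mm_struct *mm, unsigned long addr)
	{
		pte_t *pte;

		spin_lock(&mm->page_table_lock);
		/* build_page_table() allocates any missing pud/pmd/pte
		 * levels and returns a mapped pte, or NULL on failure */
		pte = build_page_table(mm, addr);
		if (!pte) {
			spin_unlock(&mm->page_table_lock);
			return -ENOMEM;
		}
		/* ... set the pte here ... */
		pte_unmap(pte);
		spin_unlock(&mm->page_table_lock);
		return 0;
	}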

 arch/i386/Kconfig         |    2 
 arch/ia64/Kconfig         |    2 
 include/mm/mlpt-generic.h |  190 ++++++++++++++++++++++++++++++++++++++++++++++
 mm/Kconfig                |   16 +++
 4 files changed, 210 insertions(+)

Index: linux-2.6.12-rc4/include/mm/mlpt-generic.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc4/include/mm/mlpt-generic.h	2005-05-19 17:04:00.000000000 +1000
@@ -0,0 +1,190 @@
+#ifndef _MM_MLPT_GENERIC_H
+#define _MM_MLPT_GENERIC_H 1
+
+#include <linux/highmem.h>
+#include <asm/tlb.h>
+
+/**
+ * init_page_table - initialise a user process page table 
+ *
+ * Returns the address of the page table
+ *
+ * Creates a new page table.  This consists of a zeroed out pgd.
+ */
+
+static inline pgd_t *init_page_table(void)
+{
+	return pgd_alloc(NULL);
+}
+
+/**
+ * free_page_table - frees a user process page table 
+ * @pgd: the pointer to the page table
+ *
+ * Returns void
+ *
+ * Frees the page table.  It assumes that the rest of the page table has been 
+ * torn down prior to this.
+ */
+
+static inline void free_page_table(pgd_t *pgd)
+{
+	pgd_free(pgd);
+}
+
+/**
+ * lookup_page_table - looks up any page table 
+ * @mm: the address space that owns the page table
+ * @address: The virtual address we are trying to find the pte for 
+ *
+ * Returns a pointer to a pte.
+ *
+ * Look up the kernel or user page table.
+ */
+
+static inline pte_t *lookup_page_table(struct mm_struct *mm, unsigned long address)
+{ 
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	if (mm) { /* Look up user page table */
+		pgd = pgd_offset(mm, address);
+		if (pgd_none_or_clear_bad(pgd))
+			return NULL;
+	} else { /* Look up kernel page table */
+		pgd = pgd_offset_k(address);
+		if (pgd_none_or_clear_bad(pgd))
+			return NULL;
+	}
+
+	pud = pud_offset(pgd, address);
+	if (pud_none_or_clear_bad(pud)) {
+		return NULL;
+	}
+
+	pmd = pmd_offset(pud, address);
+	if (pmd_none_or_clear_bad(pmd)) {
+		return NULL;
+	}
+
+	pte = pte_offset_map(pmd, address);
+
+	return pte;
+}
+
+/**
+ * build_page_table - builds a user process page table.
+ * @mm: the address space that owns the page table.
+ * @address: The virtual address for which we are adding a mapping.
+ *
+ * Returns a pointer to a pte.
+ *
+ * Builds the pud/pmd/pte directories for a page table if required.
+ * This function readies the page table for insertion.
+ */
+
+static inline pte_t *build_page_table(struct mm_struct *mm, unsigned long address)
+{
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	pgd = pgd_offset(mm, address);
+
+	if (!pgd) {
+		return NULL;
+	}
+
+	pud = pud_alloc(mm, pgd, address);
+	if (!pud) {
+		return NULL;
+	}
+
+	pmd = pmd_alloc(mm, pud, address);
+	if (!pmd) {
+		return NULL;
+	}
+
+	pte = pte_alloc_map(mm, pmd, address);
+
+	return pte;
+}
+
+/**
+ * lookup_nested_pte - looks up a nested pte.
+ * @mm: the address space that owns the page table.
+ * @address: The virtual address for which we are adding a mapping.
+ *
+ * Returns a pointer to the pte to be unmapped.
+ *
+ * This function looks up a user page table for a nested pte. 
+ */
+
+static inline pte_t *lookup_nested_pte(struct mm_struct *mm, unsigned long address)
+{ 
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte = NULL;
+
+	pgd = pgd_offset(mm, address);
+	if (pgd_none_or_clear_bad(pgd))
+		goto end;
+
+	pud = pud_offset(pgd, address);
+	if (pud_none_or_clear_bad(pud))
+		goto end;
+
+	pmd = pmd_offset(pud, address);
+	if (pmd_none_or_clear_bad(pmd))
+		goto end;
+
+	pte = pte_offset_map_nested(pmd, address);
+	if (pte_none(*pte)) {
+		pte_unmap_nested(pte);
+		pte = NULL;
+	}
+end:
+	return pte;
+}
+
+/**
+ * lookup_page_table_gate - looks up a page table.
+ * @mm: the address space that owns the page table.
+ * @start: The virtual address we are looking up
+ *
+ * Returns a pointer to the pte to be unmapped.
+ *
+ * This function looks up a page table.  The gate varies with the 
+ * architecture.  
+ */
+
+static inline pte_t *lookup_page_table_gate(struct mm_struct *mm, unsigned long start)
+{
+	unsigned long pg = start & PAGE_MASK;
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	if (pg > TASK_SIZE)
+		pgd = pgd_offset_k(pg);
+	else
+		pgd = pgd_offset_gate(mm, pg);
+	BUG_ON(pgd_none(*pgd));
+	pud = pud_offset(pgd, pg);
+	BUG_ON(pud_none(*pud));
+	pmd = pmd_offset(pud, pg);
+	BUG_ON(pmd_none(*pmd));
+	pte = pte_offset_map(pmd, pg);
+
+	return pte;
+}
+
+void free_pgtables(struct mmu_gather **tlb, struct vm_area_struct *vma,
+		   unsigned long floor, unsigned long ceiling);
+
+#endif
Index: linux-2.6.12-rc4/mm/Kconfig
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc4/mm/Kconfig	2005-05-19 17:04:00.000000000 +1000
@@ -0,0 +1,16 @@
+choice
+	prompt "Page Table Format"
+	default MLPT
+
+config MLPT
+	bool "MLPT"
+	help
+	  Linux will offer a choice of page table formats for different
+	  purposes.  The Multi-Level Page Table is the standard (old)
+	  page table, which can be walked directly by many
+	  architectures.
+	  Typically each architecture will have, as well as Linux's
+	  page tables, its own hardware-walked tables that act as a
+	  software-loaded cache of the kernel tables.
+
+endchoice
Index: linux-2.6.12-rc4/arch/i386/Kconfig
===================================================================
--- linux-2.6.12-rc4.orig/arch/i386/Kconfig	2005-05-19 17:02:57.000000000 +1000
+++ linux-2.6.12-rc4/arch/i386/Kconfig	2005-05-19 17:04:00.000000000 +1000
@@ -703,6 +703,8 @@
 	  with major 203 and minors 0 to 31 for /dev/cpu/0/cpuid to
 	  /dev/cpu/31/cpuid.
 
+source "mm/Kconfig"
+
 source "drivers/firmware/Kconfig"
 
 choice
Index: linux-2.6.12-rc4/arch/ia64/Kconfig
===================================================================
--- linux-2.6.12-rc4.orig/arch/ia64/Kconfig	2005-05-19 17:02:57.000000000 +1000
+++ linux-2.6.12-rc4/arch/ia64/Kconfig	2005-05-19 17:04:00.000000000 +1000
@@ -342,6 +342,8 @@
 	depends on IOSAPIC && EXPERIMENTAL
 	default y
 
+source "mm/Kconfig"
+
 source "drivers/firmware/Kconfig"
 
 source "fs/Kconfig.binfmt"


* [PATCH 2/15] PTI: Add general files and directories
  2005-05-21  2:43 [PATCH 1/15] PTI: clean page table interface Paul Davies
@ 2005-05-21  2:53 ` Paul Cameron Davies
  2005-05-21  3:08   ` [PATCH 3/15] PTI: move mlpt behind interface Paul Cameron Davies
  2005-05-28  8:53 ` [PATCH 1/15] PTI: clean page table interface Christoph Hellwig
From: Paul Cameron Davies @ 2005-05-21  2:53 UTC
  To: Paul Davies; +Cc: linux-mm

Patch 2 of 15.

This patch adds the files and directories for architecture-independent
mlpt code to sit behind a clean page table interface.

 	*mlpt.c is to contain the mlpt specific functions to be moved
 	 behind the interface.
 	*page_table.h is for including general page table implementations;
 	 in this case, the incumbent mlpt.
 	*pgtable-mlpt.h and tlb-mlpt.h are for mlpt abstractions from
 	 the generic pgtable.h and tlb.h
 	*mm-mlpt.h is for mlpt abstractions from mm.h

  include/asm-generic/pgtable-mlpt.h |    4 ++++
  include/asm-generic/tlb-mlpt.h     |    4 ++++
  include/linux/page_table.h         |   12 ++++++++++++
  include/mm/mm-mlpt.h               |    4 ++++
  mm/Makefile                        |    2 ++
  mm/fixed-mlpt/Makefile             |    3 +++
  mm/fixed-mlpt/mlpt.c               |    1 +
  7 files changed, 30 insertions(+)

Index: linux-2.6.12-rc4/mm/fixed-mlpt/mlpt.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc4/mm/fixed-mlpt/mlpt.c	2005-05-19 17:08:37.000000000 +1000
@@ -0,0 +1 @@
+#include <linux/page_table.h>
Index: linux-2.6.12-rc4/mm/Makefile
===================================================================
--- linux-2.6.12-rc4.orig/mm/Makefile	2005-05-19 17:08:34.000000000 +1000
+++ linux-2.6.12-rc4/mm/Makefile	2005-05-19 17:08:37.000000000 +1000
@@ -7,6 +7,8 @@
  			   mlock.o mmap.o mprotect.o mremap.o msync.o rmap.o \
  			   vmalloc.o

+mmu-$(CONFIG_MMU)	+= fixed-mlpt/
+
  obj-y			:= bootmem.o filemap.o mempool.o oom_kill.o fadvise.o \
  			   page_alloc.o page-writeback.o pdflush.o \
  			   readahead.o slab.o swap.o truncate.o vmscan.o \
Index: linux-2.6.12-rc4/mm/fixed-mlpt/Makefile
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc4/mm/fixed-mlpt/Makefile	2005-05-19 17:08:37.000000000 +1000
@@ -0,0 +1,3 @@
+#Makefile for mm/fixed-mlpt/
+
+obj-y	:= mlpt.o
Index: linux-2.6.12-rc4/include/linux/page_table.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc4/include/linux/page_table.h	2005-05-19 17:08:37.000000000 +1000
@@ -0,0 +1,12 @@
+#ifndef _LINUX_PAGE_TABLE_H
+#define _LINUX_PAGE_TABLE_H 1
+
+#include <linux/config.h>
+#include <asm/pgtable.h>
+
+#ifdef CONFIG_MLPT
+#include <asm/pgalloc.h>
+#include <mm/mlpt-generic.h>
+#endif
+
+#endif
Index: linux-2.6.12-rc4/include/mm/mm-mlpt.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc4/include/mm/mm-mlpt.h	2005-05-19 17:08:37.000000000 +1000
@@ -0,0 +1,4 @@
+#ifndef _MM_MM_MLPT_H
+#define _MM_MM_MLPT_H 1
+
+#endif
Index: linux-2.6.12-rc4/include/asm-generic/pgtable-mlpt.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc4/include/asm-generic/pgtable-mlpt.h	2005-05-19 17:08:37.000000000 +1000
@@ -0,0 +1,4 @@
+#ifndef _ASM_GENERIC_PGTABLE_MLPT_H
+#define _ASM_GENERIC_PGTABLE_MLPT_H 1
+
+#endif
Index: linux-2.6.12-rc4/include/asm-generic/tlb-mlpt.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc4/include/asm-generic/tlb-mlpt.h	2005-05-19 17:08:37.000000000 +1000
@@ -0,0 +1,4 @@
+#ifndef _ASM_GENERIC_TLB_MLPT_H
+#define _ASM_GENERIC_TLB_MLPT_H 1
+
+#endif


* [PATCH 3/15] PTI: move mlpt behind interface
  2005-05-21  2:53 ` [PATCH 2/15] PTI: Add general files and directories Paul Cameron Davies
@ 2005-05-21  3:08   ` Paul Cameron Davies
  2005-05-21  3:15     ` [PATCH 4/15] PTI: move mlpt behind interface cont Paul Cameron Davies
From: Paul Cameron Davies @ 2005-05-21  3:08 UTC
  To: linux-mm

Patch 3 of 15.

This patch starts to rearrange the code, to separate
page-table-specific code into a new file.

 	*The patch moves free_pgtables() away and makes free_pgd_range()
 	 static.  This breaks hugetlbfs, but that is to be fixed up in a
 	 later patch set.
 	*The prototype for free_pgtables() is removed from mm.h, as it
 	 now resides in mlpt-generic.h.
 	*free_pgtables() is now being called through mlpt-generic.h via
 	 page_table.h.
 	*mlpt-dependent code is abstracted from mm.h to mm-mlpt.h.
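
The resulting layering, sketched as a comment (a summary of the
patch, not literal code from it):

	/*
	 *  mm/mmap.c, mm/memory.c      #include <linux/page_table.h>
	 *  include/linux/page_table.h  -> <mm/mlpt-generic.h> under CONFIG_MLPT
	 *  include/mm/mlpt-generic.h   declares free_pgtables()
	 *  mm/fixed-mlpt/mlpt.c        defines free_pgtables();
	 *                              free_pgd_range() is now static here
	 */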

  include/linux/mm.h        |   40 +--------
  include/mm/mlpt-generic.h |    3
  include/mm/mm-mlpt.h      |   32 +++++++
  mm/fixed-mlpt/mlpt.c      |  193 ++++++++++++++++++++++++++++++++++++++++++++++
  mm/memory.c               |  177 ------------------------------------------
  mm/mmap.c                 |    1
  6 files changed, 233 insertions(+), 213 deletions(-)

Index: linux-2.6.12-rc4/mm/fixed-mlpt/mlpt.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/fixed-mlpt/mlpt.c	2005-05-19 17:24:29.000000000 +1000
+++ linux-2.6.12-rc4/mm/fixed-mlpt/mlpt.c	2005-05-19 17:24:49.000000000 +1000
@@ -1 +1,194 @@
+#include <linux/kernel_stat.h>
+#include <linux/mm.h>
+#include <linux/hugetlb.h>
+#include <linux/mman.h>
+#include <linux/swap.h>
+#include <linux/highmem.h>
+#include <linux/pagemap.h>
+#include <linux/rmap.h>
+#include <linux/module.h>
+#include <linux/init.h>
  #include <linux/page_table.h>
+
+#include <asm/uaccess.h>
+#include <asm/tlb.h>
+#include <asm/tlbflush.h>
+
+#include <linux/swapops.h>
+#include <linux/elf.h>
+
+
+/*
+ * Note: this doesn't free the actual pages themselves. That
+ * has been handled earlier when unmapping all the memory regions.
+ */
+static void free_pte_range(struct mmu_gather *tlb, pmd_t *pmd)
+{
+	struct page *page = pmd_page(*pmd);
+	pmd_clear(pmd);
+	pte_free_tlb(tlb, page);
+	dec_page_state(nr_page_table_pages);
+	tlb->mm->nr_ptes--;
+}
+
+static inline void free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
+				unsigned long addr, unsigned long end,
+				unsigned long floor, unsigned long ceiling)
+{
+	pmd_t *pmd;
+	unsigned long next;
+	unsigned long start;
+
+	start = addr;
+	pmd = pmd_offset(pud, addr);
+	do {
+		next = pmd_addr_end(addr, end);
+		if (pmd_none_or_clear_bad(pmd))
+			continue;
+		free_pte_range(tlb, pmd);
+	} while (pmd++, addr = next, addr != end);
+
+	start &= PUD_MASK;
+	if (start < floor)
+		return;
+	if (ceiling) {
+		ceiling &= PUD_MASK;
+		if (!ceiling)
+			return;
+	}
+	if (end - 1 > ceiling - 1)
+		return;
+
+	pmd = pmd_offset(pud, start);
+	pud_clear(pud);
+	pmd_free_tlb(tlb, pmd);
+}
+
+static inline void free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
+				unsigned long addr, unsigned long end,
+				unsigned long floor, unsigned long ceiling)
+{
+	pud_t *pud;
+	unsigned long next;
+	unsigned long start;
+
+	start = addr;
+	pud = pud_offset(pgd, addr);
+	do {
+		next = pud_addr_end(addr, end);
+		if (pud_none_or_clear_bad(pud))
+			continue;
+		free_pmd_range(tlb, pud, addr, next, floor, ceiling);
+	} while (pud++, addr = next, addr != end);
+
+	start &= PGDIR_MASK;
+	if (start < floor)
+		return;
+	if (ceiling) {
+		ceiling &= PGDIR_MASK;
+		if (!ceiling)
+			return;
+	}
+	if (end - 1 > ceiling - 1)
+		return;
+
+	pud = pud_offset(pgd, start);
+	pgd_clear(pgd);
+	pud_free_tlb(tlb, pud);
+}
+
+/*
+ * This function frees user-level page tables of a process.
+ *
+ * Must be called with pagetable lock held.
+ */
+static void free_pgd_range(struct mmu_gather **tlb,
+			unsigned long addr, unsigned long end,
+			unsigned long floor, unsigned long ceiling)
+{
+	pgd_t *pgd;
+	unsigned long next;
+	unsigned long start;
+
+	/*
+	 * The next few lines have given us lots of grief...
+	 *
+	 * Why are we testing PMD* at this top level?  Because often
+	 * there will be no work to do at all, and we'd prefer not to
+	 * go all the way down to the bottom just to discover that.
+	 *
+	 * Why all these "- 1"s?  Because 0 represents both the bottom
+	 * of the address space and the top of it (using -1 for the
+	 * top wouldn't help much: the masks would do the wrong thing).
+	 * The rule is that addr 0 and floor 0 refer to the bottom of
+	 * the address space, but end 0 and ceiling 0 refer to the top
+	 * Comparisons need to use "end - 1" and "ceiling - 1" (though
+	 * that end 0 case should be mythical).
+	 *
+	 * Wherever addr is brought up or ceiling brought down, we must
+	 * be careful to reject "the opposite 0" before it confuses the
+	 * subsequent tests.  But what about where end is brought down
+	 * by PMD_SIZE below? no, end can't go down to 0 there.
+	 *
+	 * Whereas we round start (addr) and ceiling down, by different
+	 * masks at different levels, in order to test whether a table
+	 * now has no other vmas using it, so can be freed, we don't
+	 * bother to round floor or end up - the tests don't need that.
+	 */
+
+	addr &= PMD_MASK;
+	if (addr < floor) {
+		addr += PMD_SIZE;
+		if (!addr)
+			return;
+	}
+	if (ceiling) {
+		ceiling &= PMD_MASK;
+		if (!ceiling)
+			return;
+	}
+	if (end - 1 > ceiling - 1)
+		end -= PMD_SIZE;
+	if (addr > end - 1)
+		return;
+
+	start = addr;
+	pgd = pgd_offset((*tlb)->mm, addr);
+	do {
+		next = pgd_addr_end(addr, end);
+		if (pgd_none_or_clear_bad(pgd))
+			continue;
+		free_pud_range(*tlb, pgd, addr, next, floor, ceiling);
+	} while (pgd++, addr = next, addr != end);
+
+	if (!tlb_is_full_mm(*tlb))
+		flush_tlb_pgtables((*tlb)->mm, start, end);
+}
+
+void free_pgtables(struct mmu_gather **tlb, struct vm_area_struct *vma,
+		unsigned long floor, unsigned long ceiling)
+{
+	while (vma) {
+		struct vm_area_struct *next = vma->vm_next;
+		unsigned long addr = vma->vm_start;
+
+		if (is_hugepage_only_range(vma->vm_mm, addr, HPAGE_SIZE)) {
+			hugetlb_free_pgd_range(tlb, addr, vma->vm_end,
+				floor, next? next->vm_start: ceiling);
+		} else {
+			/*
+			 * Optimization: gather nearby vmas into one call down
+			 */
+			while (next && next->vm_start <= vma->vm_end + PMD_SIZE
+			  && !is_hugepage_only_range(vma->vm_mm, next->vm_start,
+							HPAGE_SIZE)) {
+				vma = next;
+				next = vma->vm_next;
+			}
+			free_pgd_range(tlb, addr, vma->vm_end,
+				floor, next? next->vm_start: ceiling);
+		}
+		vma = next;
+	}
+}
+
Index: linux-2.6.12-rc4/include/mm/mlpt-generic.h
===================================================================
--- linux-2.6.12-rc4.orig/include/mm/mlpt-generic.h	2005-05-19 17:24:29.000000000 +1000
+++ linux-2.6.12-rc4/include/mm/mlpt-generic.h	2005-05-19 17:24:49.000000000 +1000
@@ -6,8 +6,7 @@

  /**
   * init_page_table - initialise a user process page table
- *
- * Returns the address of the page table
+ * Returns the address of the page table
   *
   * Creates a new page table.  This consists of a zeroed out pgd.
   */
Index: linux-2.6.12-rc4/mm/mmap.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/mmap.c	2005-05-19 17:24:29.000000000 +1000
+++ linux-2.6.12-rc4/mm/mmap.c	2005-05-19 17:24:49.000000000 +1000
@@ -24,6 +24,7 @@
  #include <linux/mount.h>
  #include <linux/mempolicy.h>
  #include <linux/rmap.h>
+#include <linux/page_table.h>

  #include <asm/uaccess.h>
  #include <asm/cacheflush.h>
Index: linux-2.6.12-rc4/mm/memory.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/memory.c	2005-05-19 17:24:29.000000000 +1000
+++ linux-2.6.12-rc4/mm/memory.c	2005-05-19 17:24:49.000000000 +1000
@@ -48,12 +48,11 @@
  #include <linux/rmap.h>
  #include <linux/module.h>
  #include <linux/init.h>
+#include <linux/page_table.h>

-#include <asm/pgalloc.h>
  #include <asm/uaccess.h>
  #include <asm/tlb.h>
  #include <asm/tlbflush.h>
-#include <asm/pgtable.h>

  #include <linux/swapops.h>
  #include <linux/elf.h>
@@ -106,180 +105,6 @@
  	pmd_clear(pmd);
  }

-/*
- * Note: this doesn't free the actual pages themselves. That
- * has been handled earlier when unmapping all the memory regions.
- */
-static void free_pte_range(struct mmu_gather *tlb, pmd_t *pmd)
-{
-	struct page *page = pmd_page(*pmd);
-	pmd_clear(pmd);
-	pte_free_tlb(tlb, page);
-	dec_page_state(nr_page_table_pages);
-	tlb->mm->nr_ptes--;
-}
-
-static inline void free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
-				unsigned long addr, unsigned long end,
-				unsigned long floor, unsigned long ceiling)
-{
-	pmd_t *pmd;
-	unsigned long next;
-	unsigned long start;
-
-	start = addr;
-	pmd = pmd_offset(pud, addr);
-	do {
-		next = pmd_addr_end(addr, end);
-		if (pmd_none_or_clear_bad(pmd))
-			continue;
-		free_pte_range(tlb, pmd);
-	} while (pmd++, addr = next, addr != end);
-
-	start &= PUD_MASK;
-	if (start < floor)
-		return;
-	if (ceiling) {
-		ceiling &= PUD_MASK;
-		if (!ceiling)
-			return;
-	}
-	if (end - 1 > ceiling - 1)
-		return;
-
-	pmd = pmd_offset(pud, start);
-	pud_clear(pud);
-	pmd_free_tlb(tlb, pmd);
-}
-
-static inline void free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
-				unsigned long addr, unsigned long end,
-				unsigned long floor, unsigned long ceiling)
-{
-	pud_t *pud;
-	unsigned long next;
-	unsigned long start;
-
-	start = addr;
-	pud = pud_offset(pgd, addr);
-	do {
-		next = pud_addr_end(addr, end);
-		if (pud_none_or_clear_bad(pud))
-			continue;
-		free_pmd_range(tlb, pud, addr, next, floor, ceiling);
-	} while (pud++, addr = next, addr != end);
-
-	start &= PGDIR_MASK;
-	if (start < floor)
-		return;
-	if (ceiling) {
-		ceiling &= PGDIR_MASK;
-		if (!ceiling)
-			return;
-	}
-	if (end - 1 > ceiling - 1)
-		return;
-
-	pud = pud_offset(pgd, start);
-	pgd_clear(pgd);
-	pud_free_tlb(tlb, pud);
-}
-
-/*
- * This function frees user-level page tables of a process.
- *
- * Must be called with pagetable lock held.
- */
-void free_pgd_range(struct mmu_gather **tlb,
-			unsigned long addr, unsigned long end,
-			unsigned long floor, unsigned long ceiling)
-{
-	pgd_t *pgd;
-	unsigned long next;
-	unsigned long start;
-
-	/*
-	 * The next few lines have given us lots of grief...
-	 *
-	 * Why are we testing PMD* at this top level?  Because often
-	 * there will be no work to do at all, and we'd prefer not to
-	 * go all the way down to the bottom just to discover that.
-	 *
-	 * Why all these "- 1"s?  Because 0 represents both the bottom
-	 * of the address space and the top of it (using -1 for the
-	 * top wouldn't help much: the masks would do the wrong thing).
-	 * The rule is that addr 0 and floor 0 refer to the bottom of
-	 * the address space, but end 0 and ceiling 0 refer to the top
-	 * Comparisons need to use "end - 1" and "ceiling - 1" (though
-	 * that end 0 case should be mythical).
-	 *
-	 * Wherever addr is brought up or ceiling brought down, we must
-	 * be careful to reject "the opposite 0" before it confuses the
-	 * subsequent tests.  But what about where end is brought down
-	 * by PMD_SIZE below? no, end can't go down to 0 there.
-	 *
-	 * Whereas we round start (addr) and ceiling down, by different
-	 * masks at different levels, in order to test whether a table
-	 * now has no other vmas using it, so can be freed, we don't
-	 * bother to round floor or end up - the tests don't need that.
-	 */
-
-	addr &= PMD_MASK;
-	if (addr < floor) {
-		addr += PMD_SIZE;
-		if (!addr)
-			return;
-	}
-	if (ceiling) {
-		ceiling &= PMD_MASK;
-		if (!ceiling)
-			return;
-	}
-	if (end - 1 > ceiling - 1)
-		end -= PMD_SIZE;
-	if (addr > end - 1)
-		return;
-
-	start = addr;
-	pgd = pgd_offset((*tlb)->mm, addr);
-	do {
-		next = pgd_addr_end(addr, end);
-		if (pgd_none_or_clear_bad(pgd))
-			continue;
-		free_pud_range(*tlb, pgd, addr, next, floor, ceiling);
-	} while (pgd++, addr = next, addr != end);
-
-	if (!tlb_is_full_mm(*tlb))
-		flush_tlb_pgtables((*tlb)->mm, start, end);
-}
-
-void free_pgtables(struct mmu_gather **tlb, struct vm_area_struct *vma,
-		unsigned long floor, unsigned long ceiling)
-{
-	while (vma) {
-		struct vm_area_struct *next = vma->vm_next;
-		unsigned long addr = vma->vm_start;
-
-		if (is_hugepage_only_range(vma->vm_mm, addr, HPAGE_SIZE)) {
-			hugetlb_free_pgd_range(tlb, addr, vma->vm_end,
-				floor, next? next->vm_start: ceiling);
-		} else {
-			/*
-			 * Optimization: gather nearby vmas into one call down
-			 */
-			while (next && next->vm_start <= vma->vm_end + PMD_SIZE
-			  && !is_hugepage_only_range(vma->vm_mm, next->vm_start,
-							HPAGE_SIZE)) {
-				vma = next;
-				next = vma->vm_next;
-			}
-			free_pgd_range(tlb, addr, vma->vm_end,
-				floor, next? next->vm_start: ceiling);
-		}
-		vma = next;
-	}
-}
-
  pte_t fastcall *pte_alloc_map(struct mm_struct *mm, pmd_t *pmd,
  				unsigned long address)
  {
Index: linux-2.6.12-rc4/include/linux/mm.h
===================================================================
--- linux-2.6.12-rc4.orig/include/linux/mm.h	2005-05-19 17:24:29.000000000 +1000
+++ linux-2.6.12-rc4/include/linux/mm.h	2005-05-19 17:24:49.000000000 +1000
@@ -587,10 +587,6 @@
  		struct vm_area_struct *start_vma, unsigned long start_addr,
  		unsigned long end_addr, unsigned long *nr_accounted,
  		struct zap_details *);
-void free_pgd_range(struct mmu_gather **tlb, unsigned long addr,
-		unsigned long end, unsigned long floor, unsigned long ceiling);
-void free_pgtables(struct mmu_gather **tlb, struct vm_area_struct *start_vma,
-		unsigned long floor, unsigned long ceiling);
  int copy_page_range(struct mm_struct *dst, struct mm_struct *src,
  			struct vm_area_struct *vma);
  int zeromap_page_range(struct vm_area_struct *vma, unsigned long from,
@@ -605,10 +601,11 @@
  }

  extern int vmtruncate(struct inode * inode, loff_t offset);
-extern pud_t *FASTCALL(__pud_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address));
-extern pmd_t *FASTCALL(__pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address));
-extern pte_t *FASTCALL(pte_alloc_kernel(struct mm_struct *mm, pmd_t *pmd, unsigned long address));
-extern pte_t *FASTCALL(pte_alloc_map(struct mm_struct *mm, pmd_t *pmd, unsigned long address));
+
+#ifdef CONFIG_MLPT
+#include <mm/mm-mlpt.h>
+#endif
+
  extern int install_page(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, struct page *page, pgprot_t prot);
  extern int install_file_pte(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, unsigned long pgoff, pgprot_t prot);
  extern int handle_mm_fault(struct mm_struct *mm,struct vm_area_struct *vma, unsigned long address, int write_access);
@@ -654,33 +651,6 @@
  extern struct shrinker *set_shrinker(int, shrinker_t);
  extern void remove_shrinker(struct shrinker *shrinker);

-/*
- * On a two-level or three-level page table, this ends up being trivial.  Thus
- * the inlining and the symmetry break with pte_alloc_map() that does all
- * of this out-of-line.
- */
-/*
- * The following ifdef needed to get the 4level-fixup.h header to work.
- * Remove it when 4level-fixup.h has been removed.
- */
-#ifdef CONFIG_MMU
-#ifndef __ARCH_HAS_4LEVEL_HACK
-static inline pud_t *pud_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address)
-{
-	if (pgd_none(*pgd))
-		return __pud_alloc(mm, pgd, address);
-	return pud_offset(pgd, address);
-}
-
-static inline pmd_t *pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
-{
-	if (pud_none(*pud))
-		return __pmd_alloc(mm, pud, address);
-	return pmd_offset(pud, address);
-}
-#endif
-#endif /* CONFIG_MMU */
-
  extern void free_area_init(unsigned long * zones_size);
  extern void free_area_init_node(int nid, pg_data_t *pgdat,
  	unsigned long * zones_size, unsigned long zone_start_pfn,
Index: linux-2.6.12-rc4/include/mm/mm-mlpt.h
===================================================================
--- linux-2.6.12-rc4.orig/include/mm/mm-mlpt.h	2005-05-19 17:24:27.000000000 +1000
+++ linux-2.6.12-rc4/include/mm/mm-mlpt.h	2005-05-19 17:29:17.000000000 +1000
@@ -1,4 +1,36 @@
  #ifndef _MM_MM_MLPT_H
  #define _MM_MM_MLPT_H 1

+extern pud_t *FASTCALL(__pud_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address));
+extern pmd_t *FASTCALL(__pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address));
+extern pte_t *FASTCALL(pte_alloc_kernel(struct mm_struct *mm, pmd_t *pmd, unsigned long address));
+extern pte_t *FASTCALL(pte_alloc_map(struct mm_struct *mm, pmd_t *pmd, unsigned long address));
+
+/*
+ * On a two-level or three-level page table, this ends up being trivial.  Thus
+ * the inlining and the symmetry break with pte_alloc_map() that does all
+ * of this out-of-line.
+ */
+/*
+ * The following ifdef needed to get the 4level-fixup.h header to work.
+ * Remove it when 4level-fixup.h has been removed.
+ */
+#ifdef CONFIG_MMU
+#ifndef __ARCH_HAS_4LEVEL_HACK
+static inline pud_t *pud_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address)
+{
+	if (pgd_none(*pgd))
+		return __pud_alloc(mm, pgd, address);
+	return pud_offset(pgd, address);
+}
+
+static inline pmd_t *pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
+{
+	if (pud_none(*pud))
+		return __pmd_alloc(mm, pud, address);
+	return pmd_offset(pud, address);
+}
+#endif
+#endif /* CONFIG_MMU */
+
  #endif


* [PATCH 4/15] PTI: move mlpt behind interface cont.
  2005-05-21  3:08   ` [PATCH 3/15] PTI: move mlpt behind interface Paul Cameron Davies
@ 2005-05-21  3:15     ` Paul Cameron Davies
  2005-05-21  3:26       ` [PATCH 5/15] PTI: Finish moving mlpt behind interface Paul Cameron Davies
From: Paul Cameron Davies @ 2005-05-21  3:15 UTC
  To: linux-mm

Patch 4 of 15.

This patch continues moving mlpt code behind the interface.
 	*mlpt directory allocation functions are moved behind the
 	 page table interface to mlpt.c from memory.c.
 	*Their prototypes were abstracted from mm.h to mm-mlpt.h
 	 in a previous patch.
 	*Functions for clearing bad pgds, pmds and puds are moved
 	 from memory.c to mlpt.c also.  The prototypes for these
 	 functions are abstracted in the next patch.

  mm/fixed-mlpt/mlpt.c |  146 ++++++++++++++++++++++++++++++++++++++++++++++++++
  mm/memory.c          |  147 ---------------------------------------------------
  2 files changed, 146 insertions(+), 147 deletions(-)

Index: linux-2.6.12-rc4/mm/memory.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/memory.c	2005-05-19 17:40:54.000000000 +1000
+++ linux-2.6.12-rc4/mm/memory.c	2005-05-19 17:41:04.000000000 +1000
@@ -82,82 +82,6 @@
  EXPORT_SYMBOL(vmalloc_earlyreserve);

  /*
- * If a p?d_bad entry is found while walking page tables, report
- * the error, before resetting entry to p?d_none.  Usually (but
- * very seldom) called out from the p?d_none_or_clear_bad macros.
- */
-
-void pgd_clear_bad(pgd_t *pgd)
-{
-	pgd_ERROR(*pgd);
-	pgd_clear(pgd);
-}
-
-void pud_clear_bad(pud_t *pud)
-{
-	pud_ERROR(*pud);
-	pud_clear(pud);
-}
-
-void pmd_clear_bad(pmd_t *pmd)
-{
-	pmd_ERROR(*pmd);
-	pmd_clear(pmd);
-}
-
-pte_t fastcall *pte_alloc_map(struct mm_struct *mm, pmd_t *pmd,
-				unsigned long address)
-{
-	if (!pmd_present(*pmd)) {
-		struct page *new;
-
-		spin_unlock(&mm->page_table_lock);
-		new = pte_alloc_one(mm, address);
-		spin_lock(&mm->page_table_lock);
-		if (!new)
-			return NULL;
-		/*
-		 * Because we dropped the lock, we should re-check the
-		 * entry, as somebody else could have populated it..
-		 */
-		if (pmd_present(*pmd)) {
-			pte_free(new);
-			goto out;
-		}
-		mm->nr_ptes++;
-		inc_page_state(nr_page_table_pages);
-		pmd_populate(mm, pmd, new);
-	}
-out:
-	return pte_offset_map(pmd, address);
-}
-
-pte_t fastcall * pte_alloc_kernel(struct mm_struct *mm, pmd_t *pmd, unsigned long address)
-{
-	if (!pmd_present(*pmd)) {
-		pte_t *new;
-
-		spin_unlock(&mm->page_table_lock);
-		new = pte_alloc_one_kernel(mm, address);
-		spin_lock(&mm->page_table_lock);
-		if (!new)
-			return NULL;
-
-		/*
-		 * Because we dropped the lock, we should re-check the
-		 * entry, as somebody else could have populated it..
-		 */
-		if (pmd_present(*pmd)) {
-			pte_free_kernel(new);
-			goto out;
-		}
-		pmd_populate_kernel(mm, pmd, new);
-	}
-out:
-	return pte_offset_kernel(pmd, address);
-}
-
-/*
   * copy one vm_area from one task to the other. Assumes the page tables
   * already present in the new task to be cleared in the whole range
   * covered by this vma.
@@ -1890,77 +1814,6 @@
  	return VM_FAULT_OOM;
  }

-#ifndef __PAGETABLE_PUD_FOLDED
-/*
- * Allocate page upper directory.
- *
- * We've already handled the fast-path in-line, and we own the
- * page table lock.
- */
-pud_t fastcall *__pud_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address)
-{
-	pud_t *new;
-
-	spin_unlock(&mm->page_table_lock);
-	new = pud_alloc_one(mm, address);
-	spin_lock(&mm->page_table_lock);
-	if (!new)
-		return NULL;
-
-	/*
-	 * Because we dropped the lock, we should re-check the
-	 * entry, as somebody else could have populated it..
-	 */
-	if (pgd_present(*pgd)) {
-		pud_free(new);
-		goto out;
-	}
-	pgd_populate(mm, pgd, new);
- out:
-	return pud_offset(pgd, address);
-}
-#endif /* __PAGETABLE_PUD_FOLDED */
-
-#ifndef __PAGETABLE_PMD_FOLDED
-/*
- * Allocate page middle directory.
- *
- * We've already handled the fast-path in-line, and we own the
- * page table lock.
- */
-pmd_t fastcall *__pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
-{
-	pmd_t *new;
-
-	spin_unlock(&mm->page_table_lock);
-	new = pmd_alloc_one(mm, address);
-	spin_lock(&mm->page_table_lock);
-	if (!new)
-		return NULL;
-
-	/*
-	 * Because we dropped the lock, we should re-check the
-	 * entry, as somebody else could have populated it..
-	 */
-#ifndef __ARCH_HAS_4LEVEL_HACK
-	if (pud_present(*pud)) {
-		pmd_free(new);
-		goto out;
-	}
-	pud_populate(mm, pud, new);
-#else
-	if (pgd_present(*pud)) {
-		pmd_free(new);
-		goto out;
-	}
-	pgd_populate(mm, pud, new);
-#endif /* __ARCH_HAS_4LEVEL_HACK */
-
- out:
-	return pmd_offset(pud, address);
-}
-#endif /* __PAGETABLE_PMD_FOLDED */
-
  int make_pages_present(unsigned long addr, unsigned long end)
  {
  	int ret, len, write;
Index: linux-2.6.12-rc4/mm/fixed-mlpt/mlpt.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/fixed-mlpt/mlpt.c	2005-05-19 17:40:54.000000000 +1000
+++ linux-2.6.12-rc4/mm/fixed-mlpt/mlpt.c	2005-05-19 17:41:04.000000000 +1000
@@ -192,3 +192,149 @@
  	}
  }

+/*
+ * If a p?d_bad entry is found while walking page tables, report
+ * the error, before resetting entry to p?d_none.  Usually (but
+ * very seldom) called out from the p?d_none_or_clear_bad macros.
+ */
+
+void pgd_clear_bad(pgd_t *pgd)
+{
+	pgd_ERROR(*pgd);
+	pgd_clear(pgd);
+}
+
+void pud_clear_bad(pud_t *pud)
+{
+	pud_ERROR(*pud);
+	pud_clear(pud);
+}
+
+void pmd_clear_bad(pmd_t *pmd)
+{
+	pmd_ERROR(*pmd);
+	pmd_clear(pmd);
+}
+
+pte_t fastcall *pte_alloc_map(struct mm_struct *mm, pmd_t *pmd,
+				unsigned long address)
+{
+	if (!pmd_present(*pmd)) {
+		struct page *new;
+
+		spin_unlock(&mm->page_table_lock);
+		new = pte_alloc_one(mm, address);
+		spin_lock(&mm->page_table_lock);
+		if (!new)
+			return NULL;
+		/*
+		 * Because we dropped the lock, we should re-check the
+		 * entry, as somebody else could have populated it..
+		 */
+		if (pmd_present(*pmd)) {
+			pte_free(new);
+			goto out;
+		}
+		mm->nr_ptes++;
+		inc_page_state(nr_page_table_pages);
+		pmd_populate(mm, pmd, new);
+	}
+out:
+	return pte_offset_map(pmd, address);
+}
+
+pte_t fastcall * pte_alloc_kernel(struct mm_struct *mm, pmd_t *pmd, unsigned long address)
+{
+	if (!pmd_present(*pmd)) {
+		pte_t *new;
+
+		spin_unlock(&mm->page_table_lock);
+		new = pte_alloc_one_kernel(mm, address);
+		spin_lock(&mm->page_table_lock);
+		if (!new)
+			return NULL;
+
+		/*
+		 * Because we dropped the lock, we should re-check the
+		 * entry, as somebody else could have populated it..
+		 */
+		if (pmd_present(*pmd)) {
+			pte_free_kernel(new);
+			goto out;
+		}
+		pmd_populate_kernel(mm, pmd, new);
+	}
+out:
+	return pte_offset_kernel(pmd, address);
+}
+
+#ifndef __PAGETABLE_PUD_FOLDED
+/*
+ * Allocate page upper directory.
+ *
+ * We've already handled the fast-path in-line, and we own the
+ * page table lock.
+ */
+pud_t fastcall *__pud_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address)
+{
+	pud_t *new;
+
+	spin_unlock(&mm->page_table_lock);
+	new = pud_alloc_one(mm, address);
+	spin_lock(&mm->page_table_lock);
+	if (!new)
+		return NULL;
+
+	/*
+	 * Because we dropped the lock, we should re-check the
+	 * entry, as somebody else could have populated it..
+	 */
+	if (pgd_present(*pgd)) {
+		pud_free(new);
+		goto out;
+	}
+	pgd_populate(mm, pgd, new);
+ out:
+	return pud_offset(pgd, address);
+}
+#endif /* __PAGETABLE_PUD_FOLDED */
+
+#ifndef __PAGETABLE_PMD_FOLDED
+/*
+ * Allocate page middle directory.
+ *
+ * We've already handled the fast-path in-line, and we own the
+ * page table lock.
+ */
+pmd_t fastcall *__pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
+{
+	pmd_t *new;
+
+	spin_unlock(&mm->page_table_lock);
+	new = pmd_alloc_one(mm, address);
+	spin_lock(&mm->page_table_lock);
+	if (!new)
+		return NULL;
+
+	/*
+	 * Because we dropped the lock, we should re-check the
+	 * entry, as somebody else could have populated it..
+	 */
+#ifndef __ARCH_HAS_4LEVEL_HACK
+	if (pud_present(*pud)) {
+		pmd_free(new);
+		goto out;
+	}
+	pud_populate(mm, pud, new);
+#else
+	if (pgd_present(*pud)) {
+		pmd_free(new);
+		goto out;
+	}
+	pgd_populate(mm, pud, new);
+#endif /* __ARCH_HAS_4LEVEL_HACK */
+
+ out:
+	return pmd_offset(pud, address);
+}
+#endif /* __PAGETABLE_PMD_FOLDED */


* [PATCH 5/15] PTI: Finish moving mlpt behind interface
  2005-05-21  3:15     ` [PATCH 4/15] PTI: move mlpt behind interface cont Paul Cameron Davies
@ 2005-05-21  3:26       ` Paul Cameron Davies
  2005-05-21  3:47         ` [PATCH 6/15] PTI: Start calling the interface Paul Cameron Davies
From: Paul Cameron Davies @ 2005-05-21  3:26 UTC
  To: linux-mm

Patch 5 of 15.

This patch completes moving general mlpt code behind the
page table interface.

 	*It abstracts mlpt-dependent code from the general
 	 pgtable.h and the general tlb.h to pgtable-mlpt.h
 	 and tlb-mlpt.h respectively.
 	*The prototypes for clearing bad pgds etc. are moved in the
 	 process.

  include/asm-generic/pgtable-mlpt.h |   74 +++++++++++++++++++++++++++++++++++++
  include/asm-generic/pgtable.h      |   71 +----------------------------------
  include/asm-generic/tlb-mlpt.h     |   20 ++++++++++
  include/asm-generic/tlb.h          |   20 +---------
  4 files changed, 98 insertions(+), 87 deletions(-)

Index: linux-2.6.12-rc4/include/asm-generic/pgtable.h
===================================================================
--- linux-2.6.12-rc4.orig/include/asm-generic/pgtable.h	2005-05-17 21:45:09.000000000 +1000
+++ linux-2.6.12-rc4/include/asm-generic/pgtable.h	2005-05-18 00:41:18.000000000 +1000
@@ -131,81 +131,14 @@
  #define page_test_and_clear_young(page) (0)
  #endif

-#ifndef __HAVE_ARCH_PGD_OFFSET_GATE
-#define pgd_offset_gate(mm, addr)	pgd_offset(mm, addr)
-#endif
-
  #ifndef __HAVE_ARCH_LAZY_MMU_PROT_UPDATE
  #define lazy_mmu_prot_update(pte)	do { } while (0)
  #endif

-/*
- * When walking page tables, get the address of the next boundary,
- * or the end address of the range if that comes earlier.  Although no
- * vma end wraps to 0, rounded up __boundary may wrap to 0 throughout.
- */
-
-#define pgd_addr_end(addr, end)						\
-({	unsigned long __boundary = ((addr) + PGDIR_SIZE) & PGDIR_MASK;	\
-	(__boundary - 1 < (end) - 1)? __boundary: (end);		\
-})
-
-#ifndef pud_addr_end
-#define pud_addr_end(addr, end)						\
-({	unsigned long __boundary = ((addr) + PUD_SIZE) & PUD_MASK;	\
-	(__boundary - 1 < (end) - 1)? __boundary: (end);		\
-})
-#endif

-#ifndef pmd_addr_end
-#define pmd_addr_end(addr, end)						\
-({	unsigned long __boundary = ((addr) + PMD_SIZE) & PMD_MASK;	\
-	(__boundary - 1 < (end) - 1)? __boundary: (end);		\
-})
+#ifdef CONFIG_MLPT
+#include <asm-generic/pgtable-mlpt.h>
  #endif

-#ifndef __ASSEMBLY__
-/*
- * When walking page tables, we usually want to skip any p?d_none entries;
- * and any p?d_bad entries - reporting the error before resetting to none.
- * Do the tests inline, but report and clear the bad entry in mm/memory.c.
- */
-void pgd_clear_bad(pgd_t *);
-void pud_clear_bad(pud_t *);
-void pmd_clear_bad(pmd_t *);
-
-static inline int pgd_none_or_clear_bad(pgd_t *pgd)
-{
-	if (pgd_none(*pgd))
-		return 1;
-	if (unlikely(pgd_bad(*pgd))) {
-		pgd_clear_bad(pgd);
-		return 1;
-	}
-	return 0;
-}
-
-static inline int pud_none_or_clear_bad(pud_t *pud)
-{
-	if (pud_none(*pud))
-		return 1;
-	if (unlikely(pud_bad(*pud))) {
-		pud_clear_bad(pud);
-		return 1;
-	}
-	return 0;
-}
-
-static inline int pmd_none_or_clear_bad(pmd_t *pmd)
-{
-	if (pmd_none(*pmd))
-		return 1;
-	if (unlikely(pmd_bad(*pmd))) {
-		pmd_clear_bad(pmd);
-		return 1;
-	}
-	return 0;
-}
-#endif /* !__ASSEMBLY__ */

  #endif /* _ASM_GENERIC_PGTABLE_H */
Index: linux-2.6.12-rc4/include/asm-generic/pgtable-mlpt.h
===================================================================
--- linux-2.6.12-rc4.orig/include/asm-generic/pgtable-mlpt.h	2005-05-18 00:30:14.000000000 +1000
+++ linux-2.6.12-rc4/include/asm-generic/pgtable-mlpt.h	2005-05-18 00:41:05.000000000 +1000
@@ -1,4 +1,78 @@
  #ifndef _ASM_GENERIC_PGTABLE_MLPT_H
  #define _ASM_GENERIC_PGTABLE_MLPT_H 1

+#ifndef __HAVE_ARCH_PGD_OFFSET_GATE
+#define pgd_offset_gate(mm, addr)	pgd_offset(mm, addr)
+#endif
+
+/*
+ * When walking page tables, get the address of the next boundary,
+ * or the end address of the range if that comes earlier.  Although no
+ * vma end wraps to 0, rounded up __boundary may wrap to 0 throughout.
+ */
+
+#define pgd_addr_end(addr, end)						\
+({	unsigned long __boundary = ((addr) + PGDIR_SIZE) & PGDIR_MASK;	\
+	(__boundary - 1 < (end) - 1)? __boundary: (end);		\
+})
+
+#ifndef pud_addr_end
+#define pud_addr_end(addr, end)						\
+({	unsigned long __boundary = ((addr) + PUD_SIZE) & PUD_MASK;	\
+	(__boundary - 1 < (end) - 1)? __boundary: (end);		\
+})
+#endif
+
+#ifndef pmd_addr_end
+#define pmd_addr_end(addr, end)						\
+({	unsigned long __boundary = ((addr) + PMD_SIZE) & PMD_MASK;	\
+	(__boundary - 1 < (end) - 1)? __boundary: (end);		\
+})
+#endif
+
+#ifndef __ASSEMBLY__
+/*
+ * When walking page tables, we usually want to skip any p?d_none entries;
+ * and any p?d_bad entries - reporting the error before resetting to none.
+ * Do the tests inline, but report and clear the bad entry in mm/memory.c.
+ */
+void pgd_clear_bad(pgd_t *);
+void pud_clear_bad(pud_t *);
+void pmd_clear_bad(pmd_t *);
+
+static inline int pgd_none_or_clear_bad(pgd_t *pgd)
+{
+	if (pgd_none(*pgd))
+		return 1;
+	if (unlikely(pgd_bad(*pgd))) {
+		pgd_clear_bad(pgd);
+		return 1;
+	}
+	return 0;
+}
+
+static inline int pud_none_or_clear_bad(pud_t *pud)
+{
+	if (pud_none(*pud))
+		return 1;
+	if (unlikely(pud_bad(*pud))) {
+		pud_clear_bad(pud);
+		return 1;
+	}
+	return 0;
+}
+
+static inline int pmd_none_or_clear_bad(pmd_t *pmd)
+{
+	if (pmd_none(*pmd))
+		return 1;
+	if (unlikely(pmd_bad(*pmd))) {
+		pmd_clear_bad(pmd);
+		return 1;
+	}
+	return 0;
+}
+#endif /* !__ASSEMBLY__ */
+
+
  #endif
Index: linux-2.6.12-rc4/include/asm-generic/tlb.h
===================================================================
--- linux-2.6.12-rc4.orig/include/asm-generic/tlb.h	2005-05-07 15:20:31.000000000 +1000
+++ linux-2.6.12-rc4/include/asm-generic/tlb.h	2005-05-18 00:54:19.000000000 +1000
@@ -135,26 +135,10 @@
  		__tlb_remove_tlb_entry(tlb, ptep, address);	\
  	} while (0)

-#define pte_free_tlb(tlb, ptep)					\
-	do {							\
-		tlb->need_flush = 1;				\
-		__pte_free_tlb(tlb, ptep);			\
-	} while (0)
-
-#ifndef __ARCH_HAS_4LEVEL_HACK
-#define pud_free_tlb(tlb, pudp)					\
-	do {							\
-		tlb->need_flush = 1;				\
-		__pud_free_tlb(tlb, pudp);			\
-	} while (0)
+#ifdef CONFIG_MLPT
+#include <asm-generic/tlb-mlpt.h>
  #endif

-#define pmd_free_tlb(tlb, pmdp)					\
-	do {							\
-		tlb->need_flush = 1;				\
-		__pmd_free_tlb(tlb, pmdp);			\
-	} while (0)
-
  #define tlb_migrate_finish(mm) do {} while (0)

  #endif /* _ASM_GENERIC__TLB_H */
Index: linux-2.6.12-rc4/include/asm-generic/tlb-mlpt.h
===================================================================
--- linux-2.6.12-rc4.orig/include/asm-generic/tlb-mlpt.h	2005-05-18 00:30:14.000000000 +1000
+++ linux-2.6.12-rc4/include/asm-generic/tlb-mlpt.h	2005-05-18 00:54:03.000000000 +1000
@@ -1,4 +1,24 @@
  #ifndef _ASM_GENERIC_TLB_MLPT_H
  #define _ASM_GENERIC_TLB_MLPT_H 1

+#define pte_free_tlb(tlb, ptep)					\
+	do {							\
+		tlb->need_flush = 1;				\
+		__pte_free_tlb(tlb, ptep);			\
+	} while (0)
+
+#ifndef __ARCH_HAS_4LEVEL_HACK
+#define pud_free_tlb(tlb, pudp)					\
+	do {							\
+		tlb->need_flush = 1;				\
+		__pud_free_tlb(tlb, pudp);			\
+	} while (0)
+#endif
+
+#define pmd_free_tlb(tlb, pmdp)					\
+	do {							\
+		tlb->need_flush = 1;				\
+		__pmd_free_tlb(tlb, pmdp);			\
+	} while (0)
+
  #endif


* [PATCH 6/15] PTI: Start calling the interface
  2005-05-21  3:26       ` [PATCH 5/15] PTI: Finish moving mlpt behind interface Paul Cameron Davies
@ 2005-05-21  3:47         ` Paul Cameron Davies
  2005-05-21  3:54           ` [PATCH 7/15] PTI: continue calling interface Paul Cameron Davies
From: Paul Cameron Davies @ 2005-05-21  3:47 UTC
  To: linux-mm

Patch 6 of 15.

This patch starts calling the interface; the mlpt now actually
runs through it.

 	*fork.c calls init_page_table and free_page_table to create
 	 and delete user page tables as part of the forking process.
 	*exec.c calls build_page_table in install_arg_page.
 	*fremap.c calls build_page_table in install_page and
 	 in install_file_pte.
 	*mremap.c calls lookup_nested_pte, lookup_page_table and
 	 build_page_table from the general interface in move_one_page.
 	*A number of functions appear to vanish from mremap.c, but
 	 their functionality has simply moved behind the page table
 	 interface, as the sketch below illustrates.
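
As an illustration of the recurring pattern (condensed from
alloc_one_pte_map(), removed in the diff below; error handling
elided), the old open-coded allocation walk

	pgd = pgd_offset(mm, addr);
	pud = pud_alloc(mm, pgd, addr);
	if (!pud)
		return NULL;
	pmd = pmd_alloc(mm, pud, addr);
	if (pmd)
		pte = pte_alloc_map(mm, pmd, addr);

collapses at each call site to a single interface call:

	pte = build_page_table(mm, addr);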

  fs/exec.c     |   17 +++---------
  kernel/fork.c |    6 +---
  mm/fremap.c   |   33 ++-----------------------
  mm/mremap.c   |   76 +++-------------------------------------------------------
  4 files changed, 14 insertions(+), 118 deletions(-)

Index: linux-2.6.12-rc4/kernel/fork.c
===================================================================
--- linux-2.6.12-rc4.orig/kernel/fork.c	2005-05-19 17:24:20.000000000 +1000
+++ linux-2.6.12-rc4/kernel/fork.c	2005-05-19 17:55:08.000000000 +1000
@@ -41,9 +41,7 @@
  #include <linux/profile.h>
  #include <linux/rmap.h>
  #include <linux/acct.h>
-
-#include <asm/pgtable.h>
-#include <asm/pgalloc.h>
+#include <linux/page_table.h>
  #include <asm/uaccess.h>
  #include <asm/mmu_context.h>
  #include <asm/cacheflush.h>
@@ -286,7 +284,7 @@

  static inline int mm_alloc_pgd(struct mm_struct * mm)
  {
-	mm->pgd = pgd_alloc(mm);
+	mm->pgd = init_page_table();
  	if (unlikely(!mm->pgd))
  		return -ENOMEM;
  	return 0;
Index: linux-2.6.12-rc4/fs/exec.c
===================================================================
--- linux-2.6.12-rc4.orig/fs/exec.c	2005-05-19 17:24:20.000000000 +1000
+++ linux-2.6.12-rc4/fs/exec.c	2005-05-19 17:55:08.000000000 +1000
@@ -48,6 +48,7 @@
  #include <linux/syscalls.h>
  #include <linux/rmap.h>
  #include <linux/acct.h>
+#include <linux/page_table.h>

  #include <asm/uaccess.h>
  #include <asm/mmu_context.h>
@@ -302,25 +303,15 @@
  			struct page *page, unsigned long address)
  {
  	struct mm_struct *mm = vma->vm_mm;
-	pgd_t * pgd;
-	pud_t * pud;
-	pmd_t * pmd;
  	pte_t * pte;

  	if (unlikely(anon_vma_prepare(vma)))
  		goto out_sig;
-
+
  	flush_dcache_page(page);
-	pgd = pgd_offset(mm, address);
+	spin_lock(&mm->page_table_lock);

-	spin_lock(&mm->page_table_lock);
-	pud = pud_alloc(mm, pgd, address);
-	if (!pud)
-		goto out;
-	pmd = pmd_alloc(mm, pud, address);
-	if (!pmd)
-		goto out;
-	pte = pte_alloc_map(mm, pmd, address);
+	pte = build_page_table(mm, address);
  	if (!pte)
  		goto out;
  	if (!pte_none(*pte)) {
Index: linux-2.6.12-rc4/mm/fremap.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/fremap.c	2005-05-19 17:24:20.000000000 +1000
+++ linux-2.6.12-rc4/mm/fremap.c	2005-05-19 17:55:08.000000000 +1000
@@ -15,6 +15,7 @@
  #include <linux/rmap.h>
  #include <linux/module.h>
  #include <linux/syscalls.h>
+#include <linux/page_table.h>

  #include <asm/mmu_context.h>
  #include <asm/cacheflush.h>
@@ -60,23 +61,10 @@
  	pgoff_t size;
  	int err = -ENOMEM;
  	pte_t *pte;
-	pmd_t *pmd;
-	pud_t *pud;
-	pgd_t *pgd;
  	pte_t pte_val;

-	pgd = pgd_offset(mm, addr);
  	spin_lock(&mm->page_table_lock);
-
-	pud = pud_alloc(mm, pgd, addr);
-	if (!pud)
-		goto err_unlock;
-
-	pmd = pmd_alloc(mm, pud, addr);
-	if (!pmd)
-		goto err_unlock;
-
-	pte = pte_alloc_map(mm, pmd, addr);
+	pte = build_page_table(mm, addr);
  	if (!pte)
  		goto err_unlock;

@@ -107,7 +95,6 @@
  }
  EXPORT_SYMBOL(install_page);

-
  /*
   * Install a file pte to a given virtual memory address, release any
   * previously existing mapping.
@@ -117,23 +104,10 @@
  {
  	int err = -ENOMEM;
  	pte_t *pte;
-	pmd_t *pmd;
-	pud_t *pud;
-	pgd_t *pgd;
  	pte_t pte_val;

-	pgd = pgd_offset(mm, addr);
  	spin_lock(&mm->page_table_lock);
-
-	pud = pud_alloc(mm, pgd, addr);
-	if (!pud)
-		goto err_unlock;
-
-	pmd = pmd_alloc(mm, pud, addr);
-	if (!pmd)
-		goto err_unlock;
-
-	pte = pte_alloc_map(mm, pmd, addr);
+	pte = build_page_table(mm, addr);
  	if (!pte)
  		goto err_unlock;

@@ -151,7 +125,6 @@
  	return err;
  }

-
  /***
   * sys_remap_file_pages - remap arbitrary pages of a shared backing store
   *                        file within an existing vma.
Index: linux-2.6.12-rc4/mm/mremap.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/mremap.c	2005-05-19 17:24:20.000000000 +1000
+++ linux-2.6.12-rc4/mm/mremap.c	2005-05-19 17:55:08.000000000 +1000
@@ -17,78 +17,12 @@
  #include <linux/highmem.h>
  #include <linux/security.h>
  #include <linux/syscalls.h>
+#include <linux/page_table.h>

  #include <asm/uaccess.h>
  #include <asm/cacheflush.h>
  #include <asm/tlbflush.h>

-static pte_t *get_one_pte_map_nested(struct mm_struct *mm, unsigned long addr)
-{
-	pgd_t *pgd;
-	pud_t *pud;
-	pmd_t *pmd;
-	pte_t *pte = NULL;
-
-	pgd = pgd_offset(mm, addr);
-	if (pgd_none_or_clear_bad(pgd))
-		goto end;
-
-	pud = pud_offset(pgd, addr);
-	if (pud_none_or_clear_bad(pud))
-		goto end;
-
-	pmd = pmd_offset(pud, addr);
-	if (pmd_none_or_clear_bad(pmd))
-		goto end;
-
-	pte = pte_offset_map_nested(pmd, addr);
-	if (pte_none(*pte)) {
-		pte_unmap_nested(pte);
-		pte = NULL;
-	}
-end:
-	return pte;
-}
-
-static pte_t *get_one_pte_map(struct mm_struct *mm, unsigned long addr)
-{
-	pgd_t *pgd;
-	pud_t *pud;
-	pmd_t *pmd;
-
-	pgd = pgd_offset(mm, addr);
-	if (pgd_none_or_clear_bad(pgd))
-		return NULL;
-
-	pud = pud_offset(pgd, addr);
-	if (pud_none_or_clear_bad(pud))
-		return NULL;
-
-	pmd = pmd_offset(pud, addr);
-	if (pmd_none_or_clear_bad(pmd))
-		return NULL;
-
-	return pte_offset_map(pmd, addr);
-}
-
-static inline pte_t *alloc_one_pte_map(struct mm_struct *mm, unsigned long addr)
-{
-	pgd_t *pgd;
-	pud_t *pud;
-	pmd_t *pmd;
-	pte_t *pte = NULL;
-
-	pgd = pgd_offset(mm, addr);
-
-	pud = pud_alloc(mm, pgd, addr);
-	if (!pud)
-		return NULL;
-	pmd = pmd_alloc(mm, pud, addr);
-	if (pmd)
-		pte = pte_alloc_map(mm, pmd, addr);
-	return pte;
-}
-
  static int
  move_one_page(struct vm_area_struct *vma, unsigned long old_addr,
  		struct vm_area_struct *new_vma, unsigned long new_addr)
@@ -113,25 +47,25 @@
  	}
  	spin_lock(&mm->page_table_lock);

-	src = get_one_pte_map_nested(mm, old_addr);
+	src = lookup_nested_pte(mm, old_addr);
  	if (src) {
  		/*
  		 * Look to see whether alloc_one_pte_map needs to perform a
  		 * memory allocation.  If it does then we need to drop the
  		 * atomic kmap
  		 */
-		dst = get_one_pte_map(mm, new_addr);
+		dst = lookup_page_table(mm, new_addr);
  		if (unlikely(!dst)) {
  			pte_unmap_nested(src);
  			if (mapping)
  				spin_unlock(&mapping->i_mmap_lock);
-			dst = alloc_one_pte_map(mm, new_addr);
+			dst = build_page_table(mm, new_addr);
  			if (mapping && !spin_trylock(&mapping->i_mmap_lock)) {
  				spin_unlock(&mm->page_table_lock);
  				spin_lock(&mapping->i_mmap_lock);
  				spin_lock(&mm->page_table_lock);
  			}
-			src = get_one_pte_map_nested(mm, old_addr);
+			src = lookup_nested_pte(mm, old_addr);
  		}
  		/*
  		 * Since alloc_one_pte_map can drop and re-acquire


* [PATCH 7/15] PTI: continue calling interface
  2005-05-21  3:47         ` [PATCH 6/15] PTI: Start calling the interface Paul Cameron Davies
@ 2005-05-21  3:54           ` Paul Cameron Davies
  2005-05-21  4:04             ` [PATCH 8/15] PTI: Keep " Paul Cameron Davies
From: Paul Cameron Davies @ 2005-05-21  3:54 UTC
  To: linux-mm

Patch 7 of 15.

This patch continues to call the new interface.

 	*lookup_page_table is called in page_check_address in rmap.c
 	*build_page_table is called in handle_mm_fault.
 	*handle_pte_fault, do_file_page and do_anonymous_page are no
 	 longer passed pmds; lookup_page_table is called later on
 	 instead, to avoid passing the pmds.
 	*do_no_page is not passed a pmd anymore.  lookup_page_table
 	 is called instead to get the relevant pte.
 	*do_swap_page and do_wp_page are no longer passed pmds.
 	 lookup_page_table is called instead.
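
The shape of these conversions, condensed from the diff below
(illustrative fragments only): where a fault handler used to
re-derive the pte from a pmd threaded through its arguments,

	page_table = pte_offset_map(pmd, address);

it now drops the pmd parameter and asks the page table directly:

	page_table = lookup_page_table(mm, address);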

  mm/memory.c |   59 ++++++++++++++++++++++++-----------------------------------
  mm/rmap.c   |   27 ++++++++++-----------------
  2 files changed, 34 insertions(+), 52 deletions(-)

Index: linux-2.6.12-rc4/mm/rmap.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/rmap.c	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/mm/rmap.c	2005-05-19 18:01:20.000000000 +1000
@@ -53,6 +53,7 @@
  #include <linux/init.h>
  #include <linux/rmap.h>
  #include <linux/rcupdate.h>
+#include <linux/page_table.h>

  #include <asm/tlbflush.h>

@@ -250,9 +251,6 @@
  static pte_t *page_check_address(struct page *page, struct mm_struct *mm,
  					unsigned long address)
  {
-	pgd_t *pgd;
-	pud_t *pud;
-	pmd_t *pmd;
  	pte_t *pte;

  	/*
@@ -260,20 +258,15 @@
  	 * munmap, fork, etc...
  	 */
  	spin_lock(&mm->page_table_lock);
-	pgd = pgd_offset(mm, address);
-	if (likely(pgd_present(*pgd))) {
-		pud = pud_offset(pgd, address);
-		if (likely(pud_present(*pud))) {
-			pmd = pmd_offset(pud, address);
-			if (likely(pmd_present(*pmd))) {
-				pte = pte_offset_map(pmd, address);
-				if (likely(pte_present(*pte) &&
-					   page_to_pfn(page) == pte_pfn(*pte)))
-					return pte;
-				pte_unmap(pte);
-			}
-		}
-	}
+	pte = lookup_page_table(mm, address);
+	if(!pte)
+		goto out_unlock;
+	if (likely(pte_present(*pte) &&
+	   page_to_pfn(page) == pte_pfn(*pte)))
+		return pte;
+	pte_unmap(pte);
+
+out_unlock:
  	spin_unlock(&mm->page_table_lock);
  	return ERR_PTR(-ENOENT);
  }
Index: linux-2.6.12-rc4/mm/memory.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/memory.c	2005-05-19 17:41:04.000000000 +1000
+++ linux-2.6.12-rc4/mm/memory.c	2005-05-19 18:01:20.000000000 +1000
@@ -993,7 +993,7 @@
   * with the page_table_lock released.
   */
  static int do_wp_page(struct mm_struct *mm, struct vm_area_struct * vma,
-	unsigned long address, pte_t *page_table, pmd_t *pmd, pte_t pte)
+	unsigned long address, pte_t *page_table, pte_t pte)
  {
  	struct page *old_page, *new_page;
  	unsigned long pfn = pte_pfn(pte);
@@ -1053,7 +1053,8 @@
  	 * Re-check the pte - we dropped the lock
  	 */
  	spin_lock(&mm->page_table_lock);
-	page_table = pte_offset_map(pmd, address);
+	page_table = lookup_page_table(mm, address);
+
  	if (likely(pte_same(*page_table, pte))) {
  		if (PageAnon(old_page))
  			dec_mm_counter(mm, anon_rss);
@@ -1405,7 +1406,7 @@
   */
  static int do_swap_page(struct mm_struct * mm,
  	struct vm_area_struct * vma, unsigned long address,
-	pte_t *page_table, pmd_t *pmd, pte_t orig_pte, int write_access)
+	pte_t *page_table, pte_t orig_pte, int write_access)
  {
  	struct page *page;
  	swp_entry_t entry = pte_to_swp_entry(orig_pte);
@@ -1424,7 +1425,7 @@
  			 * we released the page table lock.
  			 */
  			spin_lock(&mm->page_table_lock);
-			page_table = pte_offset_map(pmd, address);
+			page_table = lookup_page_table(mm, address);
  			if (likely(pte_same(*page_table, orig_pte)))
  				ret = VM_FAULT_OOM;
  			else
@@ -1448,7 +1449,7 @@
  	 * released the page table lock.
  	 */
  	spin_lock(&mm->page_table_lock);
-	page_table = pte_offset_map(pmd, address);
+	page_table = lookup_page_table(mm, address);
  	if (unlikely(!pte_same(*page_table, orig_pte))) {
  		pte_unmap(page_table);
  		spin_unlock(&mm->page_table_lock);
@@ -1478,7 +1479,7 @@

  	if (write_access) {
  		if (do_wp_page(mm, vma, address,
-				page_table, pmd, pte) == VM_FAULT_OOM)
+				page_table, pte) == VM_FAULT_OOM)
  			ret = VM_FAULT_OOM;
  		goto out;
  	}
@@ -1499,7 +1500,7 @@
   */
  static int
  do_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,
-		pte_t *page_table, pmd_t *pmd, int write_access,
+		pte_t *page_table, int write_access,
  		unsigned long addr)
  {
  	pte_t entry;
@@ -1521,8 +1522,7 @@
  			goto no_mem;

  		spin_lock(&mm->page_table_lock);
-		page_table = pte_offset_map(pmd, addr);
-
+		page_table = lookup_page_table(mm, addr);
  		if (!pte_none(*page_table)) {
  			pte_unmap(page_table);
  			page_cache_release(page);
@@ -1565,7 +1565,7 @@
   */
  static int
  do_no_page(struct mm_struct *mm, struct vm_area_struct *vma,
-	unsigned long address, int write_access, pte_t *page_table, pmd_t *pmd)
+	unsigned long address, int write_access, pte_t *page_table)
  {
  	struct page * new_page;
  	struct address_space *mapping = NULL;
@@ -1576,7 +1576,7 @@

  	if (!vma->vm_ops || !vma->vm_ops->nopage)
  		return do_anonymous_page(mm, vma, page_table,
-					pmd, write_access, address);
+					write_access, address);
  	pte_unmap(page_table);
  	spin_unlock(&mm->page_table_lock);

@@ -1631,7 +1631,7 @@
  		page_cache_release(new_page);
  		goto retry;
  	}
-	page_table = pte_offset_map(pmd, address);
+	page_table = lookup_page_table(mm, address);

  	/*
  	 * This silly early PAGE_DIRTY setting removes a race
@@ -1685,7 +1685,7 @@
   * nonlinear vmas.
   */
  static int do_file_page(struct mm_struct * mm, struct vm_area_struct * vma,
-	unsigned long address, int write_access, pte_t *pte, pmd_t *pmd)
+	unsigned long address, int write_access, pte_t *pte)
  {
  	unsigned long pgoff;
  	int err;
@@ -1698,7 +1698,7 @@
  	if (!vma->vm_ops || !vma->vm_ops->populate ||
  			(write_access && !(vma->vm_flags & VM_SHARED))) {
  		pte_clear(mm, address, pte);
-		return do_no_page(mm, vma, address, write_access, pte, pmd);
+		return do_no_page(mm, vma, address, write_access, pte);
  	}

  	pgoff = pte_to_pgoff(*pte);
@@ -1706,7 +1706,8 @@
  	pte_unmap(pte);
  	spin_unlock(&mm->page_table_lock);

-	err = vma->vm_ops->populate(vma, address & PAGE_MASK, PAGE_SIZE, vma->vm_page_prot, pgoff, 0);
+	err = vma->vm_ops->populate(vma, address & PAGE_MASK, PAGE_SIZE,
+		vma->vm_page_prot, pgoff, 0);
  	if (err == -ENOMEM)
  		return VM_FAULT_OOM;
  	if (err)
@@ -1737,8 +1738,8 @@
   */
  static inline int handle_pte_fault(struct mm_struct *mm,
  	struct vm_area_struct * vma, unsigned long address,
-	int write_access, pte_t *pte, pmd_t *pmd)
-{
+	int write_access, pte_t *pte)
+{
  	pte_t entry;

  	entry = *pte;
@@ -1749,15 +1750,15 @@
  		 * drop the lock.
  		 */
  		if (pte_none(entry))
-			return do_no_page(mm, vma, address, write_access, pte, pmd);
+			return do_no_page(mm, vma, address, write_access, pte);
  		if (pte_file(entry))
-			return do_file_page(mm, vma, address, write_access, pte, pmd);
-		return do_swap_page(mm, vma, address, pte, pmd, entry, write_access);
+			return do_file_page(mm, vma, address, write_access, pte);
+		return do_swap_page(mm, vma, address, pte, entry, write_access);
  	}

  	if (write_access) {
  		if (!pte_write(entry))
-			return do_wp_page(mm, vma, address, pte, pmd, entry);
+			return do_wp_page(mm, vma, address, pte, entry);

  		entry = pte_mkdirty(entry);
  	}
@@ -1776,9 +1777,6 @@
  int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct * vma,
  		unsigned long address, int write_access)
  {
-	pgd_t *pgd;
-	pud_t *pud;
-	pmd_t *pmd;
  	pte_t *pte;

  	__set_current_state(TASK_RUNNING);
@@ -1792,22 +1790,13 @@
  	 * We need the page table lock to synchronize with kswapd
  	 * and the SMP-safe atomic PTE updates.
  	 */
-	pgd = pgd_offset(mm, address);
  	spin_lock(&mm->page_table_lock);

-	pud = pud_alloc(mm, pgd, address);
-	if (!pud)
-		goto oom;
-
-	pmd = pmd_alloc(mm, pud, address);
-	if (!pmd)
-		goto oom;
-
-	pte = pte_alloc_map(mm, pmd, address);
+	pte = build_page_table(mm, address);
  	if (!pte)
  		goto oom;

-	return handle_pte_fault(mm, vma, address, write_access, pte, pmd);
+	return handle_pte_fault(mm, vma, address, write_access, pte);

   oom:
  	spin_unlock(&mm->page_table_lock);


* [PATCH 8/15] PTI: Keep calling interface
  2005-05-21  3:54           ` [PATCH 7/15] PTI: continue calling interface Paul Cameron Davies
@ 2005-05-21  4:04             ` Paul Cameron Davies
  2005-05-21  4:12               ` [PATCH 9/15] PTI: Introduce iterators Paul Cameron Davies
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Cameron Davies @ 2005-05-21  4:04 UTC (permalink / raw)
  To: linux-mm

Patch 8 of 15.

This patch continues working through memory.c, calling the new page
table interface.

 	*follow_page now looks up the page table via the page table
 	 interface.  This breaks hugetlbfs, which will be fixed in a
 	 later patch series.
 	*untouched_anonymous_page looks up the page table via the
 	 page table interface.
 	*get_user_pages calls lookup_page_table_gate from the new
 	 interface.
 	*vmalloc_to_page looks up the kernel page table via
 	 lookup_page_table (see the sketch below).
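
As a sketch, the vmalloc_to_page lookup reduces to the pattern
below; as used in this patch, passing a NULL mm asks the interface
for the kernel page table:

	pte_t *ptep = lookup_page_table(NULL, addr);
	struct page *page = NULL;

	if (ptep) {
		if (pte_present(*ptep))
			page = pte_page(*ptep);
		pte_unmap(ptep);
	}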

 mm/memory.c |   94 +++++++++++-------------------------------------------
  1 files changed, 18 insertions(+), 76 deletions(-)

Index: linux-2.6.12-rc4/mm/memory.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/memory.c	2005-05-18 13:05:49.000000000 +1000
+++ linux-2.6.12-rc4/mm/memory.c	2005-05-18 13:48:29.000000000 +1000
@@ -528,33 +528,12 @@
  static struct page *
  __follow_page(struct mm_struct *mm, unsigned long address, int read, int write)
  {
-	pgd_t *pgd;
-	pud_t *pud;
-	pmd_t *pmd;
  	pte_t *ptep, pte;
  	unsigned long pfn;
  	struct page *page;
-
-	page = follow_huge_addr(mm, address, write);
-	if (! IS_ERR(page))
-		return page;
-
-	pgd = pgd_offset(mm, address);
-	if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd)))
-		goto out;
-
-	pud = pud_offset(pgd, address);
-	if (pud_none(*pud) || unlikely(pud_bad(*pud)))
-		goto out;

-	pmd = pmd_offset(pud, address);
-	if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
-		goto out;
-	if (pmd_huge(*pmd))
-		return follow_huge_pmd(mm, address, pmd, write);
-
-	ptep = pte_offset_map(pmd, address);
-	if (!ptep)
+	ptep = lookup_page_table(mm, address);
+	if(!ptep)
  		goto out;

  	pte = *ptep;
@@ -605,37 +584,20 @@
  	return page;
  }

-
  static inline int
  untouched_anonymous_page(struct mm_struct* mm, struct vm_area_struct *vma,
  			 unsigned long address)
  {
-	pgd_t *pgd;
-	pud_t *pud;
-	pmd_t *pmd;
-
  	/* Check if the vma is for an anonymous mapping. */
  	if (vma->vm_ops && vma->vm_ops->nopage)
  		return 0;
-
-	/* Check if page directory entry exists. */
-	pgd = pgd_offset(mm, address);
-	if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd)))
-		return 1;
-
-	pud = pud_offset(pgd, address);
-	if (pud_none(*pud) || unlikely(pud_bad(*pud)))
-		return 1;
-
-	/* Check if page middle directory entry exists. */
-	pmd = pmd_offset(pud, address);
-	if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
-		return 1;
-
+
  	/* There is a pte slot for 'address' in 'mm'. */
-	return 0;
-}
+	if(lookup_page_table(mm, address))
+		return 0;

+	return 1;
+}

  int get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
  		unsigned long start, int len, int write, int force,
@@ -657,24 +619,11 @@

  		vma = find_extend_vma(mm, start);
  		if (!vma && in_gate_area(tsk, start)) {
-			unsigned long pg = start & PAGE_MASK;
 			struct vm_area_struct *gate_vma = get_gate_vma(tsk);
-			pgd_t *pgd;
-			pud_t *pud;
-			pmd_t *pmd;
  			pte_t *pte;
  			if (write) /* user gate pages are read-only */
  				return i ? : -EFAULT;
-			if (pg > TASK_SIZE)
-				pgd = pgd_offset_k(pg);
-			else
-				pgd = pgd_offset_gate(mm, pg);
-			BUG_ON(pgd_none(*pgd));
-			pud = pud_offset(pgd, pg);
-			BUG_ON(pud_none(*pud));
-			pmd = pmd_offset(pud, pg);
-			BUG_ON(pmd_none(*pmd));
-			pte = pte_offset_map(pmd, pg);
+			pte = lookup_page_table_gate(mm, start);
  			BUG_ON(pte_none(*pte));
  			if (pages) {
  				pages[i] = pte_page(*pte);
@@ -1831,24 +1780,17 @@
  {
  	unsigned long addr = (unsigned long) vmalloc_addr;
  	struct page *page = NULL;
-	pgd_t *pgd = pgd_offset_k(addr);
-	pud_t *pud;
-	pmd_t *pmd;
  	pte_t *ptep, pte;
-
-	if (!pgd_none(*pgd)) {
-		pud = pud_offset(pgd, addr);
-		if (!pud_none(*pud)) {
-			pmd = pmd_offset(pud, addr);
-			if (!pmd_none(*pmd)) {
-				ptep = pte_offset_map(pmd, addr);
-				pte = *ptep;
-				if (pte_present(pte))
-					page = pte_page(pte);
-				pte_unmap(ptep);
-			}
-		}
-	}
+
+	ptep = lookup_page_table(NULL, addr);
+	if(!ptep)
+		return page;
+
+	pte = *ptep;
+	if (pte_present(pte))
+		page = pte_page(pte);
+	pte_unmap(ptep);
+
  	return page;
  }



* [PATCH 9/15] PTI: Introduce iterators
  2005-05-21  4:04             ` [PATCH 8/15] PTI: Keep " Paul Cameron Davies
@ 2005-05-21  4:12               ` Paul Cameron Davies
  2005-05-21  4:19                 ` [PATCH 10/15] PTI: Call iterators Paul Cameron Davies
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Cameron Davies @ 2005-05-21  4:12 UTC (permalink / raw)
  To: linux-mm

Patch 9 of 15.

This patch introduces three iterators to complete the
architecture-independent component of the page table interface.
Each iterator is passed a callback function that operates on every
pte it visits, and each may also be passed a struct carrying
parameters for that callback (a sketch of the pattern follows the
list).

 	*page_table_build_iterator: builds the page table over the
 	 given range of addresses.
 	*page_table_read_iterator: walks the existing page table over
 	 the given range and applies the callback to each pte present.
 	*page_table_dual_iterator: reads a source page table while
 	 building a matching destination page table.
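
As a sketch of the callback pattern (count_present_pte is a made-up
example, not part of the patch; per the iterator code below, a
non-zero return from the callback stops the walk):

	static int count_present_pte(struct mm_struct *mm, pte_t *pte,
					unsigned long addr, void *data)
	{
		if (pte_present(*pte))
			(*(unsigned long *)data)++;
		return 0;	/* non-zero would abort the walk */
	}

	unsigned long count = 0;

	spin_lock(&mm->page_table_lock);
	page_table_read_iterator(mm, start, end, count_present_pte, &count);
	spin_unlock(&mm->page_table_lock);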

  include/mm/mlpt-generic.h   |    1
  include/mm/mlpt-iterators.h |  348 ++++++++++++++++++++++++++++++++++++++++++++
  2 files changed, 349 insertions(+)

Index: linux-2.6.12-rc4/include/mm/mlpt-iterators.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc4/include/mm/mlpt-iterators.h	2005-05-19 18:12:36.000000000 +1000
@@ -0,0 +1,348 @@
+#ifndef MLPT_ITERATORS_H
+#define MLPT_ITERATORS_H 1
+
+typedef int (*pte_callback_t)(struct mm_struct *, pte_t *, unsigned long, void *);
+
+static void unmap_pte(struct mm_struct *mm, pte_t *pte)
+{
+	if (mm == &init_mm)
+		return;
+
+	pte_unmap(pte);
+}
+
+static pte_t *pte_alloc(struct mm_struct *mm, pmd_t *pmd, unsigned long address)
+{
+	if (mm == &init_mm)
+		return pte_alloc_kernel(&init_mm, pmd, address);
+
+	return pte_alloc_map(mm, pmd, address);
+}
+
+static int build_iterator_pte_range(struct mm_struct *mm, pmd_t *pmd, unsigned long addr,
+	unsigned long end, pte_callback_t func, void *args)
+{
+	pte_t *pte;
+	int err;
+
+	pte = pte_alloc(mm, pmd, addr);
+	if (!pte)
+		return -ENOMEM;
+	do {
+		err = func(mm, pte, addr, args);
+		if (err)
+			return err;
+	} while (pte++, addr += PAGE_SIZE, addr != end);
+
+	unmap_pte(mm, pte - 1);
+
+	return 0;
+}
+
+static inline int build_iterator_pmd_range(struct mm_struct *mm, pud_t *pud,
+	unsigned long addr, unsigned long end, pte_callback_t func, void *args)
+{
+	pmd_t *pmd;
+	unsigned long next;
+
+	pmd = pmd_alloc(mm, pud, addr);
+	if (!pmd)
+		return -ENOMEM;
+	do {
+		next = pmd_addr_end(addr, end);
+		if (build_iterator_pte_range(mm, pmd, addr, next, func, args))
+			return -ENOMEM;
+	} while (pmd++, addr = next, addr != end);
+
+	return 0;
+}
+
+static inline int build_iterator_pud_range(struct mm_struct *mm, pgd_t *pgd,
+	unsigned long addr, unsigned long end, pte_callback_t func, void *args)
+{
+	pud_t *pud;
+	unsigned long next;
+
+	pud = pud_alloc(mm, pgd, addr);
+	if (!pud)
+		return -ENOMEM;
+
+	do {
+		next = pud_addr_end(addr, end);
+		if (build_iterator_pmd_range(mm, pud, addr, next, func, args))
+			return -ENOMEM;
+	} while (pud++, addr = next, addr != end);
+
+	return 0;
+}
+
+/**
+ * page_table_build_iterator - THE BUILD ITERATOR
+ * @mm: the address space that owns the page table
+ * @addr: the address to start building at
+ * @end: the last address in the build range
+ * @func: the function to operate on the pte
+ * @args: the arguments to pass to the function
+ *
+ * Returns int.  Indicates error
+ *
+ * Builds the page table between the given range of addresses.  func
+ * operates on each pte according to args supplied.
+ */
+
+static inline int page_table_build_iterator(struct mm_struct *mm,
+	unsigned long addr, unsigned long end, pte_callback_t func, void *args)
+{
+	unsigned long next;
+	int err;
+	pgd_t *pgd;
+
+	if (mm == &init_mm)
+		pgd = pgd_offset_k(addr);
+	else
+		pgd = pgd_offset(mm, addr);
+
+	do {
+		next = pgd_addr_end(addr, end);
+		err = build_iterator_pud_range(mm, pgd, addr, next, func, args);
+		if (err)
+			break;
+	} while (pgd++, addr = next, addr != end);
+
+	return err;
+}
+
+static pte_t *pte_offset(struct mm_struct *mm, pmd_t *pmd, unsigned long address)
+{
+	if (mm == &init_mm)
+		return pte_offset_kernel(pmd, address);
+
+	return pte_offset_map(pmd, address);
+}
+
+
+static int read_iterator_pte_range(struct mm_struct *mm, pmd_t *pmd,
+	unsigned long addr, unsigned long end, pte_callback_t func, void *args)
+{
+	pte_t *pte;
+	int ret=0;
+
+	pte = pte_offset(mm, pmd, addr);
+
+	do {
+		ret = func(mm, pte, addr, args);
+		if (ret)
+			return ret;
+	} while (pte++, addr += PAGE_SIZE, addr != end);
+
+	unmap_pte(mm, pte - 1);
+
+	return ret;
+}
+
+
+static inline int read_iterator_pmd_range(struct mm_struct *mm, pud_t *pud,
+	unsigned long addr, unsigned long end, pte_callback_t func, void *args)
+{
+	pmd_t *pmd;
+	unsigned long next;
+	int ret=0;
+
+	pmd = pmd_offset(pud, addr);
+	do {
+		next = pmd_addr_end(addr, end);
+		if (pmd_none_or_clear_bad(pmd))
+			continue;
+		ret = read_iterator_pte_range(mm, pmd, addr, next, func, args);
+		if(ret)
+			break;
+	} while (pmd++, addr = next, addr != end);
+	return ret;
+}
+
+
+static inline int read_iterator_pud_range(struct mm_struct *mm, pgd_t *pgd,
+	unsigned long addr, unsigned long end, pte_callback_t func, void *args)
+{
+	pud_t *pud;
+	unsigned long next;
+	int ret=0;
+
+	pud = pud_offset(pgd, addr);
+	do {
+		next = pud_addr_end(addr, end);
+		if (pud_none_or_clear_bad(pud))
+			continue;
+		ret = read_iterator_pmd_range(mm, pud, addr, next, func, args);
+		if(ret)
+			break;
+	} while (pud++, addr = next, addr != end);
+	return ret;
+}
+
+/**
+ * page_table_read_iterator - THE READ ITERATOR
+ * @mm: the address space that owns the page table
+ * @addr: the address to start building at
+ * @end: the last address in the build range
+ * @func: the function to operate on the pte
+ * @args: the arguments to pass to the function
+ *
+ * Returns int.  Indicates error
+ *
+ * Reads the page table between the given range of addresses.  func
+ * operates on each pte according to args supplied.
+ */
+
+static inline int page_table_read_iterator(struct mm_struct *mm,
+	unsigned long addr, unsigned long end, pte_callback_t func, void *args)
+{
+	unsigned long next;
+	pgd_t *pgd;
+	int ret=0;
+
+	if (mm == &init_mm)
+		pgd = pgd_offset_k(addr);
+	else
+		pgd = pgd_offset(mm, addr);
+
+	do {
+		next = pgd_addr_end(addr, end);
+		if (pgd_none_or_clear_bad(pgd))
+			continue;
+		ret = read_iterator_pud_range(mm, pgd, addr, next, func, args);
+		if(ret)
+			break;
+	} while (pgd++, addr = next, addr != end);
+
+	return ret;
+}
+
+typedef int (*pte_rw_iterator_callback_t)(struct mm_struct *, struct mm_struct *,
+	pte_t *, pte_t *, unsigned long, void *);
+
+
+static int dual_pte_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
+		pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long addr, unsigned long end,
+		pte_rw_iterator_callback_t func, void *args)
+{
+	pte_t *src_pte, *dst_pte;
+	int progress;
+
+again:
+	dst_pte = pte_alloc_map(dst_mm, dst_pmd, addr);
+	if (!dst_pte)
+		return -ENOMEM;
+	src_pte = pte_offset_map_nested(src_pmd, addr);
+
+	progress = 0;
+	spin_lock(&src_mm->page_table_lock);
+	do {
+		/*
+		 * We are holding two locks at this point - either of them
+		 * could generate latencies in another task on another CPU.
+		 */
+		if (progress >= 32 && (need_resched() ||
+		    need_lockbreak(&src_mm->page_table_lock) ||
+		    need_lockbreak(&dst_mm->page_table_lock)))
+			break;
+		if (pte_none(*src_pte)) {
+			progress++;
+			continue;
+		}
+		func(dst_mm, src_mm, dst_pte, src_pte, addr, args);
+		progress += 8;
+	} while (dst_pte++, src_pte++, addr += PAGE_SIZE, addr != end);
+	spin_unlock(&src_mm->page_table_lock);
+
+	pte_unmap_nested(src_pte - 1);
+	pte_unmap(dst_pte - 1);
+	cond_resched_lock(&dst_mm->page_table_lock);
+	if (addr != end)
+		goto again;
+	return 0;
+}
+
+static inline int dual_pmd_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
+		pud_t *dst_pud, pud_t *src_pud, unsigned long addr, unsigned long end,
+		pte_rw_iterator_callback_t func, void *args)
+{
+	pmd_t *src_pmd, *dst_pmd;
+	unsigned long next;
+
+	dst_pmd = pmd_alloc(dst_mm, dst_pud, addr);
+	if (!dst_pmd)
+		return -ENOMEM;
+	src_pmd = pmd_offset(src_pud, addr);
+	do {
+		next = pmd_addr_end(addr, end);
+		if (pmd_none_or_clear_bad(src_pmd))
+			continue;
+		if (dual_pte_range(dst_mm, src_mm, dst_pmd, src_pmd,
+						addr, next, func, args))
+			return -ENOMEM;
+	} while (dst_pmd++, src_pmd++, addr = next, addr != end);
+	return 0;
+}
+
+static inline int dual_pud_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
+		pgd_t *dst_pgd, pgd_t *src_pgd, unsigned long addr, unsigned long end,
+		pte_rw_iterator_callback_t func, void *args)
+{
+	pud_t *src_pud, *dst_pud;
+	unsigned long next;
+
+	dst_pud = pud_alloc(dst_mm, dst_pgd, addr);
+	if (!dst_pud)
+		return -ENOMEM;
+	src_pud = pud_offset(src_pgd, addr);
+	do {
+		next = pud_addr_end(addr, end);
+		if (pud_none_or_clear_bad(src_pud))
+			continue;
+		if (dual_pmd_range(dst_mm, src_mm, dst_pud, src_pud,
+						addr, next, func, args))
+			return -ENOMEM;
+	} while (dst_pud++, src_pud++, addr = next, addr != end);
+	return 0;
+}
+
+/**
+ * page_table_dual_iterator - THE READ WRITE ITERATOR
+ * @dst_mm: the address space that owns the destination page table
+ * @src_mm: the address space that owns the source page table
+ * @addr: the address to start building at
+ * @end: the last address in the build range
+ * @func: the function to operate on the pte
+ * @args: the arguments to pass to the function
+ *
+ * Returns int.  Indicates error
+ *
+ * Reads the source page table and builds a replica page table.
+ * func operates on the ptes in the source and destination page tables.
+ */
+
+static inline int page_table_dual_iterator(struct mm_struct *dst_mm, struct mm_struct *src_mm,
+	unsigned long addr, unsigned long end, pte_rw_iterator_callback_t func, void *args)
+{
+	pgd_t *src_pgd;
+	pgd_t *dst_pgd;
+	unsigned long next;
+
+	dst_pgd = pgd_offset(dst_mm, addr);
+	src_pgd = pgd_offset(src_mm, addr);
+	do {
+		next = pgd_addr_end(addr, end);
+		if (pgd_none_or_clear_bad(src_pgd))
+			continue;
+
+		if (dual_pud_range(dst_mm, src_mm, dst_pgd,
+			src_pgd, addr, next, func, args))
+			return -ENOMEM;
+
+	} while (dst_pgd++, src_pgd++, addr = next, addr != end);
+	return 0;
+}
+
+
+#endif
Index: linux-2.6.12-rc4/include/mm/mlpt-generic.h
===================================================================
--- linux-2.6.12-rc4.orig/include/mm/mlpt-generic.h	2005-05-19 17:24:49.000000000 +1000
+++ linux-2.6.12-rc4/include/mm/mlpt-generic.h	2005-05-19 18:12:36.000000000 +1000
@@ -3,6 +3,7 @@

  #include <linux/highmem.h>
  #include <asm/tlb.h>
+#include <mm/mlpt-iterators.h>

  /**
   * init_page_table - initialise a user process page table


* [PATCH 10/15] PTI: Call iterators
  2005-05-21  4:12               ` [PATCH 9/15] PTI: Introduce iterators Paul Cameron Davies
@ 2005-05-21  4:19                 ` Paul Cameron Davies
  2005-05-21  4:58                   ` [PATCH 11/15] PTI: Continue calling iterators Paul Cameron Davies
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Cameron Davies @ 2005-05-21  4:19 UTC (permalink / raw)
  To: linux-mm

Patch 10 of 15.

page_table_dual_iterator is called to abstract
 	*copy_page_range in memory.c (outlined below).
 	*This is the only call to this specialised
 	 iterator.

page_table_build_iterator is called to abstract
 	*zeromap_page_range in memory.c.
 	*remap_pfn_range in memory.c.
 	*map_vm_area in vmalloc.c.
 	*These are all the calls to this iterator.
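
In outline, copy_page_range reduces to a single iterator call
(a sketch of the diff below, not the verbatim result):

	struct copy_page_range_struct data;

	data.vm_flags = vma->vm_flags;
	err = page_table_dual_iterator(dst_mm, src_mm,
			vma->vm_start, vma->vm_end, copy_one_pte, &data);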

 mm/memory.c  |  268 ++++++++++-------------------------------------------
  mm/vmalloc.c |   70 +++------------
  2 files changed, 62 insertions(+), 276 deletions(-)

Index: linux-2.6.12-rc4/mm/vmalloc.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/vmalloc.c	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/mm/vmalloc.c	2005-05-19 18:15:26.000000000 +1000
@@ -15,6 +15,7 @@
  #include <linux/interrupt.h>

  #include <linux/vmalloc.h>
+#include <linux/page_table.h>

  #include <asm/uaccess.h>
  #include <asm/tlbflush.h>
@@ -83,76 +84,37 @@
  	flush_tlb_kernel_range((unsigned long) area->addr, end);
  }

-static int vmap_pte_range(pmd_t *pmd, unsigned long addr,
-			unsigned long end, pgprot_t prot, struct page ***pages)
+struct map_vm_area_struct
  {
-	pte_t *pte;
-
-	pte = pte_alloc_kernel(&init_mm, pmd, addr);
-	if (!pte)
-		return -ENOMEM;
-	do {
-		struct page *page = **pages;
-		WARN_ON(!pte_none(*pte));
-		if (!page)
-			return -ENOMEM;
-		set_pte_at(&init_mm, addr, pte, mk_pte(page, prot));
-		(*pages)++;
-	} while (pte++, addr += PAGE_SIZE, addr != end);
-	return 0;
-}
+	struct page ***pages;
+	pgprot_t prot;
+};

-static inline int vmap_pmd_range(pud_t *pud, unsigned long addr,
-			unsigned long end, pgprot_t prot, struct page ***pages)
+int map_vm_range(struct mm_struct *mm, pte_t *pte, unsigned long addr, void *data)
  {
-	pmd_t *pmd;
-	unsigned long next;
+	struct page *page = **(((struct map_vm_area_struct *)data)->pages);

-	pmd = pmd_alloc(&init_mm, pud, addr);
-	if (!pmd)
+	WARN_ON(!pte_none(*pte));
+	if (!page)
  		return -ENOMEM;
-	do {
-		next = pmd_addr_end(addr, end);
-		if (vmap_pte_range(pmd, addr, next, prot, pages))
-			return -ENOMEM;
-	} while (pmd++, addr = next, addr != end);
-	return 0;
-}
-
-static inline int vmap_pud_range(pgd_t *pgd, unsigned long addr,
-			unsigned long end, pgprot_t prot, struct page ***pages)
-{
-	pud_t *pud;
-	unsigned long next;
-
-	pud = pud_alloc(&init_mm, pgd, addr);
-	if (!pud)
-		return -ENOMEM;
-	do {
-		next = pud_addr_end(addr, end);
-		if (vmap_pmd_range(pud, addr, next, prot, pages))
-			return -ENOMEM;
-	} while (pud++, addr = next, addr != end);
+	set_pte_at(&init_mm, addr, pte,
+		mk_pte(page, (((struct map_vm_area_struct *)data)->prot)));
+	(*(((struct map_vm_area_struct *)data)->pages))++;
  	return 0;
  }

  int map_vm_area(struct vm_struct *area, pgprot_t prot, struct page ***pages)
  {
-	pgd_t *pgd;
-	unsigned long next;
  	unsigned long addr = (unsigned long) area->addr;
  	unsigned long end = addr + area->size - PAGE_SIZE;
  	int err;
+	struct map_vm_area_struct data;

+	data.pages = pages;
+	data.prot = prot;
  	BUG_ON(addr >= end);
-	pgd = pgd_offset_k(addr);
  	spin_lock(&init_mm.page_table_lock);
-	do {
-		next = pgd_addr_end(addr, end);
-		err = vmap_pud_range(pgd, addr, next, prot, pages);
-		if (err)
-			break;
-	} while (pgd++, addr = next, addr != end);
+	err = page_table_build_iterator(&init_mm, addr, end, map_vm_range, &data);
  	spin_unlock(&init_mm.page_table_lock);
  	flush_cache_vmap((unsigned long) area->addr, end);
  	return err;
Index: linux-2.6.12-rc4/mm/memory.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/memory.c	2005-05-19 18:04:27.000000000 +1000
+++ linux-2.6.12-rc4/mm/memory.c	2005-05-19 18:15:26.000000000 +1000
@@ -90,14 +90,21 @@
   * but may be dropped within p[mg]d_alloc() and pte_alloc_map().
   */

-static inline void
-copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
-		pte_t *dst_pte, pte_t *src_pte, unsigned long vm_flags,
-		unsigned long addr)
+struct copy_page_range_struct
+{
+	unsigned long vm_flags;
+};
+
+static inline int
+copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm, pte_t *dst_pte,
+	pte_t *src_pte, unsigned long addr, void *data)
  {
  	pte_t pte = *src_pte;
  	struct page *page;
  	unsigned long pfn;
+	unsigned long vm_flags;
+
+	vm_flags = ((struct copy_page_range_struct *)data)->vm_flags;

  	/* pte contains position in swap or file, so copy. */
  	if (unlikely(!pte_present(pte))) {
@@ -111,7 +118,7 @@
  			}
  		}
  		set_pte_at(dst_mm, addr, dst_pte, pte);
-		return;
+		return 0;
  	}

  	pfn = pte_pfn(pte);
@@ -126,7 +133,7 @@

  	if (!page || PageReserved(page)) {
  		set_pte_at(dst_mm, addr, dst_pte, pte);
-		return;
+		return 0;
  	}

  	/*
@@ -151,116 +158,21 @@
  		inc_mm_counter(dst_mm, anon_rss);
  	set_pte_at(dst_mm, addr, dst_pte, pte);
  	page_dup_rmap(page);
-}
-
-static int copy_pte_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
-		pmd_t *dst_pmd, pmd_t *src_pmd, struct vm_area_struct *vma,
-		unsigned long addr, unsigned long end)
-{
-	pte_t *src_pte, *dst_pte;
-	unsigned long vm_flags = vma->vm_flags;
-	int progress;
-
-again:
-	dst_pte = pte_alloc_map(dst_mm, dst_pmd, addr);
-	if (!dst_pte)
-		return -ENOMEM;
-	src_pte = pte_offset_map_nested(src_pmd, addr);
-
-	progress = 0;
-	spin_lock(&src_mm->page_table_lock);
-	do {
-		/*
-		 * We are holding two locks at this point - either of them
-		 * could generate latencies in another task on another CPU.
-		 */
-		if (progress >= 32 && (need_resched() ||
-		    need_lockbreak(&src_mm->page_table_lock) ||
-		    need_lockbreak(&dst_mm->page_table_lock)))
-			break;
-		if (pte_none(*src_pte)) {
-			progress++;
-			continue;
-		}
-		copy_one_pte(dst_mm, src_mm, dst_pte, src_pte, vm_flags, addr);
-		progress += 8;
-	} while (dst_pte++, src_pte++, addr += PAGE_SIZE, addr != end);
-	spin_unlock(&src_mm->page_table_lock);
-
-	pte_unmap_nested(src_pte - 1);
-	pte_unmap(dst_pte - 1);
-	cond_resched_lock(&dst_mm->page_table_lock);
-	if (addr != end)
-		goto again;
-	return 0;
-}
-
-static inline int copy_pmd_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
-		pud_t *dst_pud, pud_t *src_pud, struct vm_area_struct *vma,
-		unsigned long addr, unsigned long end)
-{
-	pmd_t *src_pmd, *dst_pmd;
-	unsigned long next;
-
-	dst_pmd = pmd_alloc(dst_mm, dst_pud, addr);
-	if (!dst_pmd)
-		return -ENOMEM;
-	src_pmd = pmd_offset(src_pud, addr);
-	do {
-		next = pmd_addr_end(addr, end);
-		if (pmd_none_or_clear_bad(src_pmd))
-			continue;
-		if (copy_pte_range(dst_mm, src_mm, dst_pmd, src_pmd,
-						vma, addr, next))
-			return -ENOMEM;
-	} while (dst_pmd++, src_pmd++, addr = next, addr != end);
-	return 0;
-}
-
-static inline int copy_pud_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
-		pgd_t *dst_pgd, pgd_t *src_pgd, struct vm_area_struct *vma,
-		unsigned long addr, unsigned long end)
-{
-	pud_t *src_pud, *dst_pud;
-	unsigned long next;
-
-	dst_pud = pud_alloc(dst_mm, dst_pgd, addr);
-	if (!dst_pud)
-		return -ENOMEM;
-	src_pud = pud_offset(src_pgd, addr);
-	do {
-		next = pud_addr_end(addr, end);
-		if (pud_none_or_clear_bad(src_pud))
-			continue;
-		if (copy_pmd_range(dst_mm, src_mm, dst_pud, src_pud,
-						vma, addr, next))
-			return -ENOMEM;
-	} while (dst_pud++, src_pud++, addr = next, addr != end);
  	return 0;
  }

  int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
  		struct vm_area_struct *vma)
  {
-	pgd_t *src_pgd, *dst_pgd;
-	unsigned long next;
  	unsigned long addr = vma->vm_start;
  	unsigned long end = vma->vm_end;
+	int err;
+	struct copy_page_range_struct data;

-	if (is_vm_hugetlb_page(vma))
-		return copy_hugetlb_page_range(dst_mm, src_mm, vma);
+	data.vm_flags = vma->vm_flags;

-	dst_pgd = pgd_offset(dst_mm, addr);
-	src_pgd = pgd_offset(src_mm, addr);
-	do {
-		next = pgd_addr_end(addr, end);
-		if (pgd_none_or_clear_bad(src_pgd))
-			continue;
-		if (copy_pud_range(dst_mm, src_mm, dst_pgd, src_pgd,
-						vma, addr, next))
-			return -ENOMEM;
-	} while (dst_pgd++, src_pgd++, addr = next, addr != end);
-	return 0;
+	err = page_table_dual_iterator(dst_mm, src_mm, addr, end, copy_one_pte, &data);
+	return err;
  }

  static void zap_pte_range(struct mmu_gather *tlb, pmd_t *pmd,
@@ -718,76 +630,33 @@

  EXPORT_SYMBOL(get_user_pages);

-static int zeromap_pte_range(struct mm_struct *mm, pmd_t *pmd,
-			unsigned long addr, unsigned long end, pgprot_t prot)
+struct zeromap_struct
  {
-	pte_t *pte;
+	pgprot_t prot;
+};

-	pte = pte_alloc_map(mm, pmd, addr);
-	if (!pte)
-		return -ENOMEM;
-	do {
-		pte_t zero_pte = pte_wrprotect(mk_pte(ZERO_PAGE(addr), prot));
-		BUG_ON(!pte_none(*pte));
-		set_pte_at(mm, addr, pte, zero_pte);
-	} while (pte++, addr += PAGE_SIZE, addr != end);
-	pte_unmap(pte - 1);
-	return 0;
-}
-
-static inline int zeromap_pmd_range(struct mm_struct *mm, pud_t *pud,
-			unsigned long addr, unsigned long end, pgprot_t prot)
+int zero_range(struct mm_struct *mm, pte_t *pte, unsigned long addr, void *data)
  {
-	pmd_t *pmd;
-	unsigned long next;
-
-	pmd = pmd_alloc(mm, pud, addr);
-	if (!pmd)
-		return -ENOMEM;
-	do {
-		next = pmd_addr_end(addr, end);
-		if (zeromap_pte_range(mm, pmd, addr, next, prot))
-			return -ENOMEM;
-	} while (pmd++, addr = next, addr != end);
-	return 0;
-}
-
-static inline int zeromap_pud_range(struct mm_struct *mm, pgd_t *pgd,
-			unsigned long addr, unsigned long end, pgprot_t prot)
-{
-	pud_t *pud;
-	unsigned long next;
-
-	pud = pud_alloc(mm, pgd, addr);
-	if (!pud)
-		return -ENOMEM;
-	do {
-		next = pud_addr_end(addr, end);
-		if (zeromap_pmd_range(mm, pud, addr, next, prot))
-			return -ENOMEM;
-	} while (pud++, addr = next, addr != end);
+	pte_t zero_pte = pte_wrprotect(mk_pte(ZERO_PAGE(addr),
+		((struct zeromap_struct *)data)->prot));
+	BUG_ON(!pte_none(*pte));
+	set_pte_at(mm, addr, pte, zero_pte);
  	return 0;
  }

  int zeromap_page_range(struct vm_area_struct *vma,
  			unsigned long addr, unsigned long size, pgprot_t prot)
  {
-	pgd_t *pgd;
-	unsigned long next;
  	unsigned long end = addr + size;
  	struct mm_struct *mm = vma->vm_mm;
  	int err;
+	struct zeromap_struct data;

+	data.prot = prot;
  	BUG_ON(addr >= end);
-	pgd = pgd_offset(mm, addr);
  	flush_cache_range(vma, addr, end);
  	spin_lock(&mm->page_table_lock);
-	do {
-		next = pgd_addr_end(addr, end);
-		err = zeromap_pud_range(mm, pgd, addr, next, prot);
-		if (err)
-			break;
-	} while (pgd++, addr = next, addr != end);
+	err = page_table_build_iterator(mm, addr, end, zero_range, &data);
  	spin_unlock(&mm->page_table_lock);
  	return err;
  }
@@ -797,74 +666,32 @@
   * mappings are removed. any references to nonexistent pages results
   * in null mappings (currently treated as "copy-on-access")
   */
-static int remap_pte_range(struct mm_struct *mm, pmd_t *pmd,
-			unsigned long addr, unsigned long end,
-			unsigned long pfn, pgprot_t prot)
-{
-	pte_t *pte;

-	pte = pte_alloc_map(mm, pmd, addr);
-	if (!pte)
-		return -ENOMEM;
-	do {
-		BUG_ON(!pte_none(*pte));
-		if (!pfn_valid(pfn) || PageReserved(pfn_to_page(pfn)))
-			set_pte_at(mm, addr, pte, pfn_pte(pfn, prot));
-		pfn++;
-	} while (pte++, addr += PAGE_SIZE, addr != end);
-	pte_unmap(pte - 1);
-	return 0;
-}
-
-static inline int remap_pmd_range(struct mm_struct *mm, pud_t *pud,
-			unsigned long addr, unsigned long end,
-			unsigned long pfn, pgprot_t prot)
+struct remap_pfn_struct
  {
-	pmd_t *pmd;
-	unsigned long next;
-
-	pfn -= addr >> PAGE_SHIFT;
-	pmd = pmd_alloc(mm, pud, addr);
-	if (!pmd)
-		return -ENOMEM;
-	do {
-		next = pmd_addr_end(addr, end);
-		if (remap_pte_range(mm, pmd, addr, next,
-				pfn + (addr >> PAGE_SHIFT), prot))
-			return -ENOMEM;
-	} while (pmd++, addr = next, addr != end);
-	return 0;
-}
+	unsigned long pfn;
+	pgprot_t prot;
+};

-static inline int remap_pud_range(struct mm_struct *mm, pgd_t *pgd,
-			unsigned long addr, unsigned long end,
-			unsigned long pfn, pgprot_t prot)
+int remap_pfn_one(struct mm_struct *mm, pte_t *pte, unsigned long address, void *data)
  {
-	pud_t *pud;
-	unsigned long next;
+	unsigned long pfn = ((struct remap_pfn_struct *)data)->pfn;
+	pgprot_t prot = ((struct remap_pfn_struct *)data)->prot;

-	pfn -= addr >> PAGE_SHIFT;
-	pud = pud_alloc(mm, pgd, addr);
-	if (!pud)
-		return -ENOMEM;
-	do {
-		next = pud_addr_end(addr, end);
-		if (remap_pmd_range(mm, pud, addr, next,
-				pfn + (addr >> PAGE_SHIFT), prot))
-			return -ENOMEM;
-	} while (pud++, addr = next, addr != end);
+	pfn += (address >> PAGE_SHIFT);
+	BUG_ON(!pte_none(*pte));
+	if (!pfn_valid(pfn) || PageReserved(pfn_to_page(pfn)))
+		set_pte_at(mm, address, pte, pfn_pte(pfn, prot));
  	return 0;
  }

-/*  Note: this is only safe if the mm semaphore is held when called. */
  int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
  		    unsigned long pfn, unsigned long size, pgprot_t prot)
  {
-	pgd_t *pgd;
-	unsigned long next;
  	unsigned long end = addr + size;
  	struct mm_struct *mm = vma->vm_mm;
  	int err;
+	struct remap_pfn_struct data;

  	/*
  	 * Physically remapped pages are special. Tell the
@@ -878,19 +705,16 @@

  	BUG_ON(addr >= end);
  	pfn -= addr >> PAGE_SHIFT;
-	pgd = pgd_offset(mm, addr);
+	data.pfn = pfn;
+	data.prot = prot;
+
  	flush_cache_range(vma, addr, end);
  	spin_lock(&mm->page_table_lock);
-	do {
-		next = pgd_addr_end(addr, end);
-		err = remap_pud_range(mm, pgd, addr, next,
-				pfn + (addr >> PAGE_SHIFT), prot);
-		if (err)
-			break;
-	} while (pgd++, addr = next, addr != end);
+	err = page_table_build_iterator(mm, addr, end, remap_pfn_one, &data);
  	spin_unlock(&mm->page_table_lock);
  	return err;
  }
+
  EXPORT_SYMBOL(remap_pfn_range);

  /*


* [PATCH 11/15] PTI: Continue calling iterators
  2005-05-21  4:19                 ` [PATCH 10/15] PTI: Call iterators Paul Cameron Davies
@ 2005-05-21  4:58                   ` Paul Cameron Davies
  2005-05-21  5:04                     ` [PATCH 12/15] PTI: Finish " Paul Cameron Davies
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Cameron Davies @ 2005-05-21  4:58 UTC (permalink / raw)
  To: linux-mm

Patch 11 of 15.

This patch starts calling the read iterator.

 	*It abstracts unmap_page_range in memory.c.
 	*It abstracts unmap_vm_area in vmalloc.c.
 	*It abstracts change_protection in mprotect.c (the shape of
 	 the conversion is sketched below).
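
Each conversion has the same shape; change_protection, for example,
reduces to roughly the following (a sketch of the diff below):

	struct change_prot_struct data;

	data.newprot = newprot;
	flush_cache_range(vma, addr, end);
	spin_lock(&mm->page_table_lock);
	page_table_read_iterator(mm, addr, end, change_prot_pte, &data);
	flush_tlb_range(vma, start, end);
	spin_unlock(&mm->page_table_lock);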

 mm/memory.c   |  174 ++++++++++++++++++++++++----------------------------
  mm/mprotect.c |   78 +++++++-------------------
  mm/vmalloc.c  |   52 +----------------
  3 files changed, 98 insertions(+), 206 deletions(-)

Index: linux-2.6.12-rc4/mm/memory.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/memory.c	2005-05-19 18:15:26.000000000 +1000
+++ linux-2.6.12-rc4/mm/memory.c	2005-05-19 18:21:20.000000000 +1000
@@ -175,127 +175,97 @@
  	return err;
  }

-static void zap_pte_range(struct mmu_gather *tlb, pmd_t *pmd,
-				unsigned long addr, unsigned long end,
-				struct zap_details *details)
+struct unmap_page_range_struct
  {
-	pte_t *pte;
+	struct mmu_gather *tlb;
+	struct zap_details *details;
+};

-	pte = pte_offset_map(pmd, addr);
-	do {
-		pte_t ptent = *pte;
-		if (pte_none(ptent))
-			continue;
-		if (pte_present(ptent)) {
-			struct page *page = NULL;
-			unsigned long pfn = pte_pfn(ptent);
-			if (pfn_valid(pfn)) {
-				page = pfn_to_page(pfn);
-				if (PageReserved(page))
-					page = NULL;
-			}
-			if (unlikely(details) && page) {
-				/*
-				 * unmap_shared_mapping_pages() wants to
-				 * invalidate cache without truncating:
-				 * unmap shared but keep private pages.
-				 */
-				if (details->check_mapping &&
-				    details->check_mapping != page->mapping)
-					continue;
-				/*
-				 * Each page->index must be checked when
-				 * invalidating or truncating nonlinear.
-				 */
-				if (details->nonlinear_vma &&
-				    (page->index < details->first_index ||
-				     page->index > details->last_index))
-					continue;
-			}
-			ptent = ptep_get_and_clear(tlb->mm, addr, pte);
-			tlb_remove_tlb_entry(tlb, pte, addr);
-			if (unlikely(!page))
-				continue;
-			if (unlikely(details) && details->nonlinear_vma
-			    && linear_page_index(details->nonlinear_vma,
-						addr) != page->index)
-				set_pte_at(tlb->mm, addr, pte,
-					   pgoff_to_pte(page->index));
-			if (pte_dirty(ptent))
-				set_page_dirty(page);
-			if (PageAnon(page))
-				dec_mm_counter(tlb->mm, anon_rss);
-			else if (pte_young(ptent))
-				mark_page_accessed(page);
-			tlb->freed++;
-			page_remove_rmap(page);
-			tlb_remove_page(tlb, page);
-			continue;
+static int zap_one_pte(struct mm_struct *mm, pte_t *pte, unsigned long addr, void *data)
+{
+	struct mmu_gather *tlb = ((struct unmap_page_range_struct *)data)->tlb;
+	struct zap_details *details = ((struct unmap_page_range_struct *)data)->details;
+
+	pte_t ptent = *pte;
+	if (pte_present(ptent)) {
+		struct page *page = NULL;
+		unsigned long pfn = pte_pfn(ptent);
+		if (pfn_valid(pfn)) {
+			page = pfn_to_page(pfn);
+			if (PageReserved(page))
+				page = NULL;
  		}
-		/*
-		 * If details->check_mapping, we leave swap entries;
-		 * if details->nonlinear_vma, we leave file entries.
-		 */
-		if (unlikely(details))
-			continue;
-		if (!pte_file(ptent))
-			free_swap_and_cache(pte_to_swp_entry(ptent));
-		pte_clear(tlb->mm, addr, pte);
-	} while (pte++, addr += PAGE_SIZE, addr != end);
-	pte_unmap(pte - 1);
-}

-static inline void zap_pmd_range(struct mmu_gather *tlb, pud_t *pud,
-				unsigned long addr, unsigned long end,
-				struct zap_details *details)
-{
-	pmd_t *pmd;
-	unsigned long next;
+		if (unlikely(details) && page) {
+			/*
+			 * unmap_shared_mapping_pages() wants to
+			 * invalidate cache without truncating:
+			 * unmap shared but keep private pages.
+			 */
+			if (details->check_mapping &&
+			    details->check_mapping != page->mapping)
+				return 0;
+			/*
+			 * Each page->index must be checked when
+			 * invalidating or truncating nonlinear.
+			 */
+			if (details->nonlinear_vma &&
+			    (page->index < details->first_index ||
+			     page->index > details->last_index))
+				return 0;
+		}

-	pmd = pmd_offset(pud, addr);
-	do {
-		next = pmd_addr_end(addr, end);
-		if (pmd_none_or_clear_bad(pmd))
-			continue;
-		zap_pte_range(tlb, pmd, addr, next, details);
-	} while (pmd++, addr = next, addr != end);
-}
+		ptent = ptep_get_and_clear(tlb->mm, addr, pte);
+		tlb_remove_tlb_entry(tlb, pte, addr);
+		if (unlikely(!page))
+			return 0;

-static inline void zap_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
-				unsigned long addr, unsigned long end,
-				struct zap_details *details)
-{
-	pud_t *pud;
-	unsigned long next;
+		if (unlikely(details) && details->nonlinear_vma
+		    && linear_page_index(details->nonlinear_vma,
+					addr) != page->index)
+			set_pte_at(tlb->mm, addr, pte,
+				   pgoff_to_pte(page->index));
+		if (pte_dirty(ptent))
+			set_page_dirty(page);
+		if (PageAnon(page))
+			dec_mm_counter(tlb->mm, anon_rss);
+		else if (pte_young(ptent))
+			mark_page_accessed(page);
+		tlb->freed++;
+		page_remove_rmap(page);
+		tlb_remove_page(tlb, page);
+		return 0;
+
+	}

-	pud = pud_offset(pgd, addr);
-	do {
-		next = pud_addr_end(addr, end);
-		if (pud_none_or_clear_bad(pud))
-			continue;
-		zap_pmd_range(tlb, pud, addr, next, details);
-	} while (pud++, addr = next, addr != end);
+	/*
+	 * If details->check_mapping, we leave swap entries;
+	 * if details->nonlinear_vma, we leave file entries.
+	 */
+	if (unlikely(details))
+		return 0;
+
+	if (!pte_file(ptent))
+		free_swap_and_cache(pte_to_swp_entry(ptent));
+	pte_clear(tlb->mm, addr, pte);
+	return 0;
  }

  static void unmap_page_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
  				unsigned long addr, unsigned long end,
  				struct zap_details *details)
  {
-	pgd_t *pgd;
-	unsigned long next;
+	struct unmap_page_range_struct data;

  	if (details && !details->check_mapping && !details->nonlinear_vma)
  		details = NULL;

+	data.tlb = tlb;
+	data.details = details;
+
  	BUG_ON(addr >= end);
  	tlb_start_vma(tlb, vma);
-	pgd = pgd_offset(vma->vm_mm, addr);
-	do {
-		next = pgd_addr_end(addr, end);
-		if (pgd_none_or_clear_bad(pgd))
-			continue;
-		zap_pud_range(tlb, pgd, addr, next, details);
-	} while (pgd++, addr = next, addr != end);
+	page_table_read_iterator(vma->vm_mm, addr, end, zap_one_pte, &data);
  	tlb_end_vma(tlb, vma);
  }

Index: linux-2.6.12-rc4/mm/vmalloc.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/vmalloc.c	2005-05-19 18:15:26.000000000 +1000
+++ linux-2.6.12-rc4/mm/vmalloc.c	2005-05-19 18:21:20.000000000 +1000
@@ -24,63 +24,21 @@
  DEFINE_RWLOCK(vmlist_lock);
  struct vm_struct *vmlist;

-static void vunmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end)
+int unmap_vm_pte(struct mm_struct *mm, pte_t *pte, unsigned long address, void *args)
  {
-	pte_t *pte;
-
-	pte = pte_offset_kernel(pmd, addr);
-	do {
-		pte_t ptent = ptep_get_and_clear(&init_mm, addr, pte);
-		WARN_ON(!pte_none(ptent) && !pte_present(ptent));
-	} while (pte++, addr += PAGE_SIZE, addr != end);
-}
-
-static inline void vunmap_pmd_range(pud_t *pud, unsigned long addr,
-						unsigned long end)
-{
-	pmd_t *pmd;
-	unsigned long next;
-
-	pmd = pmd_offset(pud, addr);
-	do {
-		next = pmd_addr_end(addr, end);
-		if (pmd_none_or_clear_bad(pmd))
-			continue;
-		vunmap_pte_range(pmd, addr, next);
-	} while (pmd++, addr = next, addr != end);
-}
-
-static inline void vunmap_pud_range(pgd_t *pgd, unsigned long addr,
-						unsigned long end)
-{
-	pud_t *pud;
-	unsigned long next;
-
-	pud = pud_offset(pgd, addr);
-	do {
-		next = pud_addr_end(addr, end);
-		if (pud_none_or_clear_bad(pud))
-			continue;
-		vunmap_pmd_range(pud, addr, next);
-	} while (pud++, addr = next, addr != end);
+	pte_t ptent = ptep_get_and_clear(&init_mm, address, pte);
+	WARN_ON(!pte_none(ptent) && !pte_present(ptent));
+	return 0;
  }

  void unmap_vm_area(struct vm_struct *area)
  {
-	pgd_t *pgd;
-	unsigned long next;
  	unsigned long addr = (unsigned long) area->addr;
  	unsigned long end = addr + area->size;

  	BUG_ON(addr >= end);
-	pgd = pgd_offset_k(addr);
  	flush_cache_vunmap(addr, end);
-	do {
-		next = pgd_addr_end(addr, end);
-		if (pgd_none_or_clear_bad(pgd))
-			continue;
-		vunmap_pud_range(pgd, addr, next);
-	} while (pgd++, addr = next, addr != end);
+	page_table_read_iterator(&init_mm, addr, end, unmap_vm_pte, NULL);
  	flush_tlb_kernel_range((unsigned long) area->addr, end);
  }

Index: linux-2.6.12-rc4/mm/mprotect.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/mprotect.c	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/mm/mprotect.c	2005-05-19 18:21:20.000000000 +1000
@@ -19,82 +19,46 @@
  #include <linux/mempolicy.h>
  #include <linux/personality.h>
  #include <linux/syscalls.h>
+#include <linux/page_table.h>

  #include <asm/uaccess.h>
-#include <asm/pgtable.h>
  #include <asm/cacheflush.h>
  #include <asm/tlbflush.h>

-static void change_pte_range(struct mm_struct *mm, pmd_t *pmd,
-		unsigned long addr, unsigned long end, pgprot_t newprot)
-{
-	pte_t *pte;
-
-	pte = pte_offset_map(pmd, addr);
-	do {
-		if (pte_present(*pte)) {
-			pte_t ptent;
-
-			/* Avoid an SMP race with hardware updated dirty/clean
-			 * bits by wiping the pte and then setting the new pte
-			 * into place.
-			 */
-			ptent = pte_modify(ptep_get_and_clear(mm, addr, pte), newprot);
-			set_pte_at(mm, addr, pte, ptent);
-			lazy_mmu_prot_update(ptent);
-		}
-	} while (pte++, addr += PAGE_SIZE, addr != end);
-	pte_unmap(pte - 1);
-}
-
-static inline void change_pmd_range(struct mm_struct *mm, pud_t *pud,
-		unsigned long addr, unsigned long end, pgprot_t newprot)
+struct change_prot_struct
  {
-	pmd_t *pmd;
-	unsigned long next;
-
-	pmd = pmd_offset(pud, addr);
-	do {
-		next = pmd_addr_end(addr, end);
-		if (pmd_none_or_clear_bad(pmd))
-			continue;
-		change_pte_range(mm, pmd, addr, next, newprot);
-	} while (pmd++, addr = next, addr != end);
-}
+	pgprot_t newprot;
+};

-static inline void change_pud_range(struct mm_struct *mm, pgd_t *pgd,
-		unsigned long addr, unsigned long end, pgprot_t newprot)
+int change_prot_pte(struct mm_struct *mm, pte_t *pte, unsigned long address, void *data)
  {
-	pud_t *pud;
-	unsigned long next;
-
-	pud = pud_offset(pgd, addr);
-	do {
-		next = pud_addr_end(addr, end);
-		if (pud_none_or_clear_bad(pud))
-			continue;
-		change_pmd_range(mm, pud, addr, next, newprot);
-	} while (pud++, addr = next, addr != end);
+	if (pte_present(*pte)) {
+		pte_t ptent;
+		/* Avoid an SMP race with hardware updated dirty/clean
+		 * bits by wiping the pte and then setting the new pte
+		 * into place.
+		 */
+		ptent = pte_modify(ptep_get_and_clear(mm, address, pte),
+			((struct change_prot_struct *)data)->newprot);
+		set_pte_at(mm, address, pte, ptent);
+		lazy_mmu_prot_update(ptent);
+		return 0;
+	}
+	return 0;
  }

  static void change_protection(struct vm_area_struct *vma,
  		unsigned long addr, unsigned long end, pgprot_t newprot)
  {
  	struct mm_struct *mm = vma->vm_mm;
-	pgd_t *pgd;
-	unsigned long next;
  	unsigned long start = addr;
+	struct change_prot_struct data;

+	data.newprot = newprot;
  	BUG_ON(addr >= end);
-	pgd = pgd_offset(mm, addr);
  	flush_cache_range(vma, addr, end);
  	spin_lock(&mm->page_table_lock);
-	do {
-		next = pgd_addr_end(addr, end);
-		if (pgd_none_or_clear_bad(pgd))
-			continue;
-		change_pud_range(mm, pgd, addr, next, newprot);
-	} while (pgd++, addr = next, addr != end);
+	page_table_read_iterator(mm, addr, end, change_prot_pte, &data);
  	flush_tlb_range(vma, start, end);
  	spin_unlock(&mm->page_table_lock);
  }


* [PATCH 12/15] PTI: Finish calling iterators
  2005-05-21  4:58                   ` [PATCH 11/15] PTI: Continue calling iterators Paul Cameron Davies
@ 2005-05-21  5:04                     ` Paul Cameron Davies
  2005-05-21  5:09                       ` [PATCH 13/15] PTI: Add files and IA64 part of interface Paul Cameron Davies
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Cameron Davies @ 2005-05-21  5:04 UTC (permalink / raw)
  To: linux-mm

Patch 12 of 15.

This patch continues to call the read iterator.

 	*It abstracts sync_page_range in msync.c.
 	*It abstracts unuse_vma in swapfile.c (its early-exit
 	 callback is sketched below).
 	*It abstracts verify_pages in mempolicy.c.
 	*It abstracts try_to_unmap_cluster in rmap.c; some defines
 	 move to mlpt-iterators.h as part of this process.
 	*This finishes all the calls to the read iterator.
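
One detail worth noting: a callback's non-zero return value aborts
the walk, which is how unuse_vma stops at the first matching pte.
The core of its callback looks like this (sketch of the diff below):

	if (unlikely(pte_same(*pte, swp_pte))) {
		unuse_pte(vma, pte, address, entry, page);
		pte_unmap(pte);
		return 1;	/* found it: stop iterating */
	}
	return 0;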

  include/mm/mlpt-iterators.h |    3 +
  mm/mempolicy.c              |   64 +++++++++++--------------
  mm/msync.c                  |   89 +++++++++--------------------------
  mm/rmap.c                   |  111 +++++++++++++++++++++-----------------------
  mm/swapfile.c               |   91 ++++++++++--------------------------
  5 files changed, 133 insertions(+), 225 deletions(-)

Index: linux-2.6.12-rc4/mm/msync.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/msync.c	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/mm/msync.c	2005-05-19 18:27:40.000000000 +1000
@@ -13,8 +13,8 @@
  #include <linux/mman.h>
  #include <linux/hugetlb.h>
  #include <linux/syscalls.h>
+#include <linux/page_table.h>

-#include <asm/pgtable.h>
  #include <asm/tlbflush.h>

  /*
@@ -22,85 +22,42 @@
   * threads/the swapper from ripping pte's out from under us.
   */

-static void sync_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
-				unsigned long addr, unsigned long end)
-{
-	pte_t *pte;
-
-	pte = pte_offset_map(pmd, addr);
-	do {
-		unsigned long pfn;
-		struct page *page;
-
-		if (!pte_present(*pte))
-			continue;
-		pfn = pte_pfn(*pte);
-		if (!pfn_valid(pfn))
-			continue;
-		page = pfn_to_page(pfn);
-		if (PageReserved(page))
-			continue;
-
-		if (ptep_clear_flush_dirty(vma, addr, pte) ||
-		    page_test_and_clear_dirty(page))
-			set_page_dirty(page);
-	} while (pte++, addr += PAGE_SIZE, addr != end);
-	pte_unmap(pte - 1);
-}
-
-static inline void sync_pmd_range(struct vm_area_struct *vma, pud_t *pud,
-				unsigned long addr, unsigned long end)
+struct sync_page_struct
  {
-	pmd_t *pmd;
-	unsigned long next;
-
-	pmd = pmd_offset(pud, addr);
-	do {
-		next = pmd_addr_end(addr, end);
-		if (pmd_none_or_clear_bad(pmd))
-			continue;
-		sync_pte_range(vma, pmd, addr, next);
-	} while (pmd++, addr = next, addr != end);
-}
+	struct vm_area_struct *vma;
+};

-static inline void sync_pud_range(struct vm_area_struct *vma, pgd_t *pgd,
-				unsigned long addr, unsigned long end)
+int sync_range_pte(struct mm_struct *mm, pte_t *pte, unsigned long address, void *data)
  {
-	pud_t *pud;
-	unsigned long next;
+	unsigned long pfn;
+	struct page *page;

-	pud = pud_offset(pgd, addr);
-	do {
-		next = pud_addr_end(addr, end);
-		if (pud_none_or_clear_bad(pud))
-			continue;
-		sync_pmd_range(vma, pud, addr, next);
-	} while (pud++, addr = next, addr != end);
+	if (!pte_present(*pte))
+		return 0;
+	pfn = pte_pfn(*pte);
+	if (!pfn_valid(pfn))
+		return 0;
+	page = pfn_to_page(pfn);
+	if (PageReserved(page))
+		return 0;
+
+	if (ptep_clear_flush_dirty(((struct sync_page_struct *)data)->vma, address, pte) ||
+	    page_test_and_clear_dirty(page))
+		set_page_dirty(page);
+	return 0;
  }

  static void sync_page_range(struct vm_area_struct *vma,
  				unsigned long addr, unsigned long end)
  {
  	struct mm_struct *mm = vma->vm_mm;
-	pgd_t *pgd;
-	unsigned long next;
-
-	/* For hugepages we can't go walking the page table normally,
-	 * but that's ok, hugetlbfs is memory based, so we don't need
-	 * to do anything more on an msync() */
-	if (is_vm_hugetlb_page(vma))
-		return;
+	struct sync_page_struct data;

+	data.vma = vma;
  	BUG_ON(addr >= end);
-	pgd = pgd_offset(mm, addr);
  	flush_cache_range(vma, addr, end);
  	spin_lock(&mm->page_table_lock);
-	do {
-		next = pgd_addr_end(addr, end);
-		if (pgd_none_or_clear_bad(pgd))
-			continue;
-		sync_pud_range(vma, pgd, addr, next);
-	} while (pgd++, addr = next, addr != end);
+	page_table_read_iterator(mm, addr, end, sync_range_pte, &data);
  	spin_unlock(&mm->page_table_lock);
  }

Index: linux-2.6.12-rc4/mm/swapfile.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/swapfile.c	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/mm/swapfile.c	2005-05-19 18:27:40.000000000 +1000
@@ -26,8 +26,8 @@
  #include <linux/security.h>
  #include <linux/backing-dev.h>
  #include <linux/syscalls.h>
+#include <linux/page_table.h>

-#include <asm/pgtable.h>
  #include <asm/tlbflush.h>
  #include <linux/swapops.h>

@@ -435,70 +435,35 @@
  	activate_page(page);
  }

-static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
-				unsigned long addr, unsigned long end,
-				swp_entry_t entry, struct page *page)
+struct unuse_vma_struct
  {
-	pte_t *pte;
-	pte_t swp_pte = swp_entry_to_pte(entry);
-
-	pte = pte_offset_map(pmd, addr);
-	do {
-		/*
-		 * swapoff spends a _lot_ of time in this loop!
-		 * Test inline before going to call unuse_pte.
-		 */
-		if (unlikely(pte_same(*pte, swp_pte))) {
-			unuse_pte(vma, pte, addr, entry, page);
-			pte_unmap(pte);
-			return 1;
-		}
-	} while (pte++, addr += PAGE_SIZE, addr != end);
-	pte_unmap(pte - 1);
-	return 0;
-}
-
-static inline int unuse_pmd_range(struct vm_area_struct *vma, pud_t *pud,
-				unsigned long addr, unsigned long end,
-				swp_entry_t entry, struct page *page)
-{
-	pmd_t *pmd;
-	unsigned long next;
-
-	pmd = pmd_offset(pud, addr);
-	do {
-		next = pmd_addr_end(addr, end);
-		if (pmd_none_or_clear_bad(pmd))
-			continue;
-		if (unuse_pte_range(vma, pmd, addr, next, entry, page))
-			return 1;
-	} while (pmd++, addr = next, addr != end);
-	return 0;
-}
+	struct vm_area_struct *vma;
+	swp_entry_t entry;
+	struct page *page;
+};

-static inline int unuse_pud_range(struct vm_area_struct *vma, pgd_t *pgd,
-				unsigned long addr, unsigned long end,
-				swp_entry_t entry, struct page *page)
+int unuse_vma_pte(struct mm_struct *mm, pte_t *pte, unsigned long address, void *data)
  {
-	pud_t *pud;
-	unsigned long next;
-
-	pud = pud_offset(pgd, addr);
-	do {
-		next = pud_addr_end(addr, end);
-		if (pud_none_or_clear_bad(pud))
-			continue;
-		if (unuse_pmd_range(vma, pud, addr, next, entry, page))
-			return 1;
-	} while (pud++, addr = next, addr != end);
+	pte_t swp_pte = swp_entry_to_pte(((struct unuse_vma_struct *)data)->entry);
+	/*
+	 * swapoff spends a _lot_ of time in this loop!
+	 * Test inline before going to call unuse_pte.
+	 */
+	if (unlikely(pte_same(*pte, swp_pte))) {
+		unuse_pte(((struct unuse_vma_struct *)data)->vma, pte, address,
+			((struct unuse_vma_struct *)data)->entry,
+			((struct unuse_vma_struct *)data)->page);
+		pte_unmap(pte);
+		return 1;
+	}
  	return 0;
  }

  static int unuse_vma(struct vm_area_struct *vma,
  				swp_entry_t entry, struct page *page)
  {
-	pgd_t *pgd;
-	unsigned long addr, end, next;
+	unsigned long addr, end;
+	struct unuse_vma_struct data;

  	if (page->mapping) {
  		addr = page_address_in_vma(page, vma);
@@ -510,15 +475,11 @@
  		addr = vma->vm_start;
  		end = vma->vm_end;
  	}
-
-	pgd = pgd_offset(vma->vm_mm, addr);
-	do {
-		next = pgd_addr_end(addr, end);
-		if (pgd_none_or_clear_bad(pgd))
-			continue;
-		if (unuse_pud_range(vma, pgd, addr, next, entry, page))
-			return 1;
-	} while (pgd++, addr = next, addr != end);
+
+	data.vma = vma;
+	data.entry = entry;
+	data.page = page;
+	if (page_table_read_iterator(vma->vm_mm, addr, end, unuse_vma_pte,
+				&data))
+		return 1;
  	return 0;
  }
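
One cosmetic note on the new callbacks: the repeated
((struct unuse_vma_struct *)data)-> casts can be hoisted into a
single local pointer.  An equivalent spelling of unuse_vma_pte()
(same behaviour as the code above, not part of the patch):

int unuse_vma_pte(struct mm_struct *mm, pte_t *pte,
		unsigned long address, void *data)
{
	struct unuse_vma_struct *ctx = data;
	pte_t swp_pte = swp_entry_to_pte(ctx->entry);

	/*
	 * swapoff spends a _lot_ of time in this loop!
	 * Test inline before going to call unuse_pte.
	 */
	if (unlikely(pte_same(*pte, swp_pte))) {
		unuse_pte(ctx->vma, pte, address, ctx->entry, ctx->page);
		pte_unmap(pte);
		return 1;
	}
	return 0;
}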

Index: linux-2.6.12-rc4/mm/mempolicy.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/mempolicy.c	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/mm/mempolicy.c	2005-05-19 18:27:40.000000000 +1000
@@ -76,6 +76,7 @@
  #include <linux/init.h>
  #include <linux/compat.h>
  #include <linux/mempolicy.h>
+#include <linux/page_table.h>
  #include <asm/tlbflush.h>
  #include <asm/uaccess.h>

@@ -238,46 +239,37 @@
  }

  /* Ensure all existing pages follow the policy. */
+
+struct verify_pages_struct
+{
+	unsigned long *nodes;
+};
+
+int verify_page(struct mm_struct *mm, pte_t *pte, unsigned long address, void *data)
+{
+	struct page *p;
+	unsigned long *nodes = ((struct verify_pages_struct *)data)->nodes;
+
+	p = NULL;
+	if (pte_present(*pte))
+		p = pte_page(*pte);
+	pte_unmap(pte);
+	if (p) {
+		unsigned nid = page_to_nid(p);
+		if (!test_bit(nid, nodes))
+			return -EIO;
+	}
+	return 0;
+}
+
  static int
  verify_pages(struct mm_struct *mm,
  	     unsigned long addr, unsigned long end, unsigned long *nodes)
  {
-	while (addr < end) {
-		struct page *p;
-		pte_t *pte;
-		pmd_t *pmd;
-		pud_t *pud;
-		pgd_t *pgd;
-		pgd = pgd_offset(mm, addr);
-		if (pgd_none(*pgd)) {
-			unsigned long next = (addr + PGDIR_SIZE) & PGDIR_MASK;
-			if (next > addr)
-				break;
-			addr = next;
-			continue;
-		}
-		pud = pud_offset(pgd, addr);
-		if (pud_none(*pud)) {
-			addr = (addr + PUD_SIZE) & PUD_MASK;
-			continue;
-		}
-		pmd = pmd_offset(pud, addr);
-		if (pmd_none(*pmd)) {
-			addr = (addr + PMD_SIZE) & PMD_MASK;
-			continue;
-		}
-		p = NULL;
-		pte = pte_offset_map(pmd, addr);
-		if (pte_present(*pte))
-			p = pte_page(*pte);
-		pte_unmap(pte);
-		if (p) {
-			unsigned nid = page_to_nid(p);
-			if (!test_bit(nid, nodes))
-				return -EIO;
-		}
-		addr += PAGE_SIZE;
-	}
+	struct verify_pages_struct data;
+
+	data.nodes = nodes;
+	if (page_table_read_iterator(mm, addr, end, verify_page, &data))
+		return -EIO;
  	return 0;
  }

Index: linux-2.6.12-rc4/include/mm/mlpt-iterators.h
===================================================================
--- linux-2.6.12-rc4.orig/include/mm/mlpt-iterators.h	2005-05-19 18:12:36.000000000 +1000
+++ linux-2.6.12-rc4/include/mm/mlpt-iterators.h	2005-05-19 18:27:40.000000000 +1000
@@ -344,5 +344,8 @@
  	return 0;
  }

+#define CLUSTER_SIZE	min(32*PAGE_SIZE, PMD_SIZE)
+#define CLUSTER_MASK	(~(CLUSTER_SIZE - 1))
+

  #endif
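
For a sense of the numbers in the CLUSTER_SIZE definition moved
above (illustrative, not from the patch): with 4 KB pages on i386,
32*PAGE_SIZE is 128 KB against a 4 MB PMD_SIZE, and with 16 KB pages
on ia64 it is 512 KB against 32 MB, so CLUSTER_SIZE resolves to 32
pages in both cases.  The min() guards any configuration where 32
pages could span more than one pte directory.
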
Index: linux-2.6.12-rc4/mm/rmap.c
===================================================================
--- linux-2.6.12-rc4.orig/mm/rmap.c	2005-05-19 18:01:20.000000000 +1000
+++ linux-2.6.12-rc4/mm/rmap.c	2005-05-19 18:27:40.000000000 +1000
@@ -609,22 +609,63 @@
  * there there won't be many ptes located within the scan cluster.  In this case
   * maybe we could scan further - to the end of the pte page, perhaps.
   */
-#define CLUSTER_SIZE	min(32*PAGE_SIZE, PMD_SIZE)
-#define CLUSTER_MASK	(~(CLUSTER_SIZE - 1))
+
+struct unmap_cluster_struct
+{
+	unsigned int *mapcount;
+	struct vm_area_struct *vma;
+};
+
+int unmap_cluster(struct mm_struct *mm, pte_t *pte, unsigned long address, void *data)
+{
+	unsigned int *mapcount = ((struct unmap_cluster_struct *)data)->mapcount;
+	struct vm_area_struct *vma = ((struct unmap_cluster_struct *)data)->vma;
+
+	unsigned long pfn;
+	struct page *page;
+	pte_t pteval;
+
+	if (!pte_present(*pte))
+		return 0;
+
+	pfn = pte_pfn(*pte);
+	if (!pfn_valid(pfn))
+		return 0;
+
+	page = pfn_to_page(pfn);
+	BUG_ON(PageAnon(page));
+	if (PageReserved(page))
+		return 0;
+
+	if (ptep_clear_flush_young(vma, address, pte))
+		return 0;
+
+	/* Nuke the page table entry. */
+	flush_cache_page(vma, address, pfn);
+	pteval = ptep_clear_flush(vma, address, pte);
+
+	/* If nonlinear, store the file page offset in the pte. */
+	if (page->index != linear_page_index(vma, address))
+		set_pte_at(mm, address, pte, pgoff_to_pte(page->index));
+
+	/* Move the dirty bit to the physical page now the pte is gone. */
+	if (pte_dirty(pteval))
+		set_page_dirty(page);
+
+	page_remove_rmap(page);
+	page_cache_release(page);
+	dec_mm_counter(mm, rss);
+	(*mapcount)--;
+	return 0;
+}

  static void try_to_unmap_cluster(unsigned long cursor,
  	unsigned int *mapcount, struct vm_area_struct *vma)
  {
  	struct mm_struct *mm = vma->vm_mm;
-	pgd_t *pgd;
-	pud_t *pud;
-	pmd_t *pmd;
-	pte_t *pte;
-	pte_t pteval;
-	struct page *page;
  	unsigned long address;
  	unsigned long end;
-	unsigned long pfn;
+	struct unmap_cluster_struct data;

  	/*
  	 * We need the page_table_lock to protect us from page faults,
@@ -639,56 +680,10 @@
  	if (end > vma->vm_end)
  		end = vma->vm_end;

-	pgd = pgd_offset(mm, address);
-	if (!pgd_present(*pgd))
-		goto out_unlock;
-
-	pud = pud_offset(pgd, address);
-	if (!pud_present(*pud))
-		goto out_unlock;
-
-	pmd = pmd_offset(pud, address);
-	if (!pmd_present(*pmd))
-		goto out_unlock;
-
-	for (pte = pte_offset_map(pmd, address);
-			address < end; pte++, address += PAGE_SIZE) {
-
-		if (!pte_present(*pte))
-			continue;
-
-		pfn = pte_pfn(*pte);
-		if (!pfn_valid(pfn))
-			continue;
-
-		page = pfn_to_page(pfn);
-		BUG_ON(PageAnon(page));
-		if (PageReserved(page))
-			continue;
-
-		if (ptep_clear_flush_young(vma, address, pte))
-			continue;
-
-		/* Nuke the page table entry. */
-		flush_cache_page(vma, address, pfn);
-		pteval = ptep_clear_flush(vma, address, pte);
-
-		/* If nonlinear, store the file page offset in the pte. */
-		if (page->index != linear_page_index(vma, address))
-			set_pte_at(mm, address, pte, pgoff_to_pte(page->index));
-
-		/* Move the dirty bit to the physical page now the pte is gone. */
-		if (pte_dirty(pteval))
-			set_page_dirty(page);
-
-		page_remove_rmap(page);
-		page_cache_release(page);
-		dec_mm_counter(mm, rss);
-		(*mapcount)--;
-	}
+	data.mapcount = mapcount;
+	data.vma = vma;
+	page_table_read_iterator(mm, address, end, unmap_cluster, &data);

-	pte_unmap(pte);
-out_unlock:
  	spin_unlock(&mm->page_table_lock);
  }


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: aart@kvack.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 13/15] PTI: Add files and IA64 part of interface
  2005-05-21  5:04                     ` [PATCH 12/15] PTI: Finish " Paul Cameron Davies
@ 2005-05-21  5:09                       ` Paul Cameron Davies
  2005-05-21  5:15                         ` [PATCH 14/15] PTI: Move IA64 mlpt code behind interface Paul Cameron Davies
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Cameron Davies @ 2005-05-21  5:09 UTC (permalink / raw)
  To: linux-mm

Patch 13 of 15.

This patch adds the new files required by the IA64 architecture
to achieve a truly independent page table interface.  Architectures
other than IA64 will also require an architecture-dependent interface
component to achieve the same independence.

 	*The architecture-dependent interface is to go in
 	 include/asm-ia64/mlpt.h.  This will be hooked into the general
 	 page table interface.
 	*mlpt specific code in include/asm-ia64/pgtable.h is to be
 	 abstracted to include/asm-ia64/pgtable-mlpt.h.
 	*mlpt specific code for the ia64 architecture is to be shifted
 	 behind the interface into mlpt-ia64.c.

  arch/ia64/mm/Makefile               |    2
  arch/ia64/mm/fixed-mlpt/Makefile    |    5 +
  arch/ia64/mm/fixed-mlpt/mlpt-ia64.c |    1
  include/asm-ia64/mlpt.h             |  108 ++++++++++++++++++++++++++++++++++++
  include/asm-ia64/pgtable-mlpt.h     |    5 +
  include/asm-ia64/pgtable.h          |    7 ++
  6 files changed, 128 insertions(+)

Index: linux-2.6.12-rc4/include/asm-ia64/pgtable.h
===================================================================
--- linux-2.6.12-rc4.orig/include/asm-ia64/pgtable.h	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/include/asm-ia64/pgtable.h	2005-05-19 18:32:00.000000000 +1000
@@ -20,6 +20,10 @@
  #include <asm/system.h>
  #include <asm/types.h>

+#ifdef CONFIG_MLPT
+#include <asm/pgtable-mlpt.h>
+#endif
+
  #define IA64_MAX_PHYS_BITS	50	/* max. number of physical address bits (architected) */

  /*
@@ -561,7 +565,10 @@
  #define __HAVE_ARCH_PGD_OFFSET_GATE
  #define __HAVE_ARCH_LAZY_MMU_PROT_UPDATE

+#ifdef CONFIG_MLPT
  #include <asm-generic/pgtable-nopud.h>
+#endif
+
  #include <asm-generic/pgtable.h>

  #endif /* _ASM_IA64_PGTABLE_H */
Index: linux-2.6.12-rc4/include/asm-ia64/pgtable-mlpt.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc4/include/asm-ia64/pgtable-mlpt.h	2005-05-19 18:32:00.000000000 +1000
@@ -0,0 +1,5 @@
+#ifndef ASM_IA64_PGTABLE_MLPT_H
+#define ASM_IA64_PGTABLE_MLPT_H 1
+
+#endif
+
Index: linux-2.6.12-rc4/arch/ia64/mm/fixed-mlpt/mlpt-ia64.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc4/arch/ia64/mm/fixed-mlpt/mlpt-ia64.c	2005-05-19 18:32:00.000000000 +1000
@@ -0,0 +1 @@
+
Index: linux-2.6.12-rc4/arch/ia64/mm/fixed-mlpt/Makefile
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc4/arch/ia64/mm/fixed-mlpt/Makefile	2005-05-19 18:32:00.000000000 +1000
@@ -0,0 +1,5 @@
+#
+# Makefile
+#
+
+obj-y := mlpt-ia64.o
Index: linux-2.6.12-rc4/arch/ia64/mm/Makefile
===================================================================
--- linux-2.6.12-rc4.orig/arch/ia64/mm/Makefile	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/arch/ia64/mm/Makefile	2005-05-19 18:32:00.000000000 +1000
@@ -4,6 +4,8 @@

  obj-y := init.o fault.o tlb.o extable.o

+obj-y += fixed-mlpt/
+
  obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
  obj-$(CONFIG_NUMA)	   += numa.o
  obj-$(CONFIG_DISCONTIGMEM) += discontig.o
Index: linux-2.6.12-rc4/include/asm-ia64/mlpt.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc4/include/asm-ia64/mlpt.h	2005-05-19 18:32:00.000000000 +1000
@@ -0,0 +1,108 @@
+#ifndef MLPT_IA64_H
+#define MLPT_IA64_H 1
+
+#include <linux/bootmem.h>
+
+static inline pte_t *lookup_kernel_page_table(unsigned long address)
+{
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	pgd = pgd_offset_k(address);
+	if (pgd_none_or_clear_bad(pgd))
+		return NULL;
+
+	pud = pud_offset(pgd, address);
+	if (pud_none_or_clear_bad(pud)) {
+		return NULL;
+	}
+
+	pmd = pmd_offset(pud, address);
+	if (pmd_none_or_clear_bad(pmd)) {
+		return NULL;
+	}
+
+	pte = pte_offset_kernel(pmd, address);
+
+	return pte;
+}
+
+
+/**
+ * build_kernel_page_table - readies the kernel page table for an insertion.
+ * @address: The virtual address for which we are adding a mapping.
+ *
+ * Returns a pointer to a pte.
+ *
+ * Builds the pud/pmd/pte directories for the kernel page table if
+ * required, readying it for insertion of the new mapping.
+ */
+
+static inline pte_t *build_kernel_page_table(unsigned long address)
+{
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	pgd = pgd_offset_k(address);
+
+	if (!pgd) {
+		return NULL;
+	}
+
+	pud = pud_alloc(&init_mm, pgd, address);
+	if (!pud) {
+		return NULL;
+	}
+
+	pmd = pmd_alloc(&init_mm, pud, address);
+	if (!pmd) {
+		return NULL;
+	}
+
+	pte = pte_alloc_map(&init_mm, pmd, address);
+
+	return pte;
+}
+
+
+/**
+ * build_memory_map - builds the kernel page table for the memory map
+ * @address: The virtual address for which we are adding a mapping.
+ * @node: the node whose bootmem supplies any missing directory pages.
+ *
+ * Returns a pointer to the pte to be mapped.
+ *
+ * This function builds the kernel page table for the virtual memory
+ * map, allocating missing directories from node-local bootmem.
+ */
+
+static inline pte_t *build_memory_map(unsigned long address, int node)
+{
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+
+	pgd = pgd_offset_k(address);
+	if (pgd_none(*pgd))
+		pgd_populate(&init_mm, pgd,
+			     alloc_bootmem_pages_node(
+				     NODE_DATA(node), PAGE_SIZE));
+	pud = pud_offset(pgd, address);
+
+	if (pud_none(*pud))
+		pud_populate(&init_mm, pud,
+			     alloc_bootmem_pages_node(
+				     NODE_DATA(node), PAGE_SIZE));
+	pmd = pmd_offset(pud, address);
+
+	if (pmd_none(*pmd))
+		pmd_populate_kernel(&init_mm, pmd,
+				    alloc_bootmem_pages_node(
+					    NODE_DATA(node), PAGE_SIZE));
+	return pte_offset_kernel(pmd, address);
+}
+
+#endif
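
Taken together these helpers cover the kernel-side page table
operations that patch 15 converts.  A sketch of the intended call
pattern (condensed from the patch 15 hunks, with the surrounding
declarations assumed):

	/* fault.c: is a kernel page actually mapped? */
	pte_t *ptep = lookup_kernel_page_table(address);
	if (ptep && pte_present(*ptep))
		/* mapped */;

	/* init.c: insert a page, building directories on demand */
	spin_lock(&init_mm.page_table_lock);
	pte = build_kernel_page_table(address);
	if (pte && pte_none(*pte))
		set_pte(pte, mk_pte(page, pgprot));
	spin_unlock(&init_mm.page_table_lock);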

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: aart@kvack.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 14/15] PTI: Move IA64 mlpt code behind interface
  2005-05-21  5:09                       ` [PATCH 13/15] PTI: Add files and IA64 part of interface Paul Cameron Davies
@ 2005-05-21  5:15                         ` Paul Cameron Davies
  2005-05-21  5:27                           ` [PATCH 15/15] PTI: Call IA64 interface Paul Cameron Davies
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Cameron Davies @ 2005-05-21  5:15 UTC (permalink / raw)
  To: linux-mm

Patch 14 of 15.

This patch moves the mlpt-specific code for the ia64 architecture
behind the architecture-specific part of the page table interface.

 	*Shifted some defines from init.c to mlpt-ia64.c.
 	*Moved the set of functions that controls the size of the page
 	 table quicklist from init.c to mlpt-ia64.c.
 	*Went through the files that include pgalloc.h, which is now
 	 part of the page table implementation, and made them include
 	 page_table.h instead, which is implementation independent.

  arch/ia64/kernel/process.c          |    2
  arch/ia64/kernel/smp.c              |    3 -
  arch/ia64/kernel/smpboot.c          |    3 -
  arch/ia64/mm/contig.c               |    5 +-
  arch/ia64/mm/discontig.c            |    2
  arch/ia64/mm/fixed-mlpt/mlpt-ia64.c |   82 ++++++++++++++++++++++++++++++++++++
  arch/ia64/mm/init.c                 |   56 +-----------------------
  arch/ia64/mm/tlb.c                  |    2
  arch/ia64/sn/kernel/sn2/cache.c     |    1
  include/asm-ia64/tlb.h              |   20 +-------
  10 files changed, 97 insertions(+), 79 deletions(-)

Index: linux-2.6.12-rc4/arch/ia64/mm/init.c
===================================================================
--- linux-2.6.12-rc4.orig/arch/ia64/mm/init.c	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/arch/ia64/mm/init.c	2005-05-19 18:36:14.000000000 +1000
@@ -20,7 +20,9 @@
  #include <linux/swap.h>
  #include <linux/proc_fs.h>
  #include <linux/bitops.h>
+#include <linux/page_table.h>

+#include <asm/mlpt.h>
  #include <asm/a.out.h>
  #include <asm/dma.h>
  #include <asm/ia32.h>
@@ -28,7 +30,6 @@
  #include <asm/machvec.h>
  #include <asm/numa.h>
  #include <asm/patch.h>
-#include <asm/pgalloc.h>
  #include <asm/sal.h>
  #include <asm/sections.h>
  #include <asm/system.h>
@@ -39,9 +40,6 @@

  DEFINE_PER_CPU(struct mmu_gather, mmu_gathers);

-DEFINE_PER_CPU(unsigned long *, __pgtable_quicklist);
-DEFINE_PER_CPU(long, __pgtable_quicklist_size);
-
  extern void ia64_tlb_init (void);

  unsigned long MAX_DMA_ADDRESS = PAGE_OFFSET + 0x100000000UL;
@@ -56,53 +54,6 @@
  struct page *zero_page_memmap_ptr;	/* map entry for zero page */
  EXPORT_SYMBOL(zero_page_memmap_ptr);

-#define MIN_PGT_PAGES			25UL
-#define MAX_PGT_FREES_PER_PASS		16L
-#define PGT_FRACTION_OF_NODE_MEM	16
-
-static inline long
-max_pgt_pages(void)
-{
-	u64 node_free_pages, max_pgt_pages;
-
-#ifndef	CONFIG_NUMA
-	node_free_pages = nr_free_pages();
-#else
-	node_free_pages = nr_free_pages_pgdat(NODE_DATA(numa_node_id()));
-#endif
-	max_pgt_pages = node_free_pages / PGT_FRACTION_OF_NODE_MEM;
-	max_pgt_pages = max(max_pgt_pages, MIN_PGT_PAGES);
-	return max_pgt_pages;
-}
-
-static inline long
-min_pages_to_free(void)
-{
-	long pages_to_free;
-
-	pages_to_free = pgtable_quicklist_size - max_pgt_pages();
-	pages_to_free = min(pages_to_free, MAX_PGT_FREES_PER_PASS);
-	return pages_to_free;
-}
-
-void
-check_pgt_cache(void)
-{
-	long pages_to_free;
-
-	if (unlikely(pgtable_quicklist_size <= MIN_PGT_PAGES))
-		return;
-
-	preempt_disable();
-	while (unlikely((pages_to_free = min_pages_to_free()) > 0)) {
-		while (pages_to_free--) {
-			free_page((unsigned long)pgtable_quicklist_alloc());
-		}
-		preempt_enable();
-		preempt_disable();
-	}
-	preempt_enable();
-}

  void
  lazy_mmu_prot_update (pte_t pte)
@@ -555,10 +506,11 @@
  	pg_data_t *pgdat;
  	int i;
  	static struct kcore_list kcore_mem, kcore_vmem, kcore_kernel;
-
+#ifdef CONFIG_MLPT
  	BUG_ON(PTRS_PER_PGD * sizeof(pgd_t) != PAGE_SIZE);
  	BUG_ON(PTRS_PER_PMD * sizeof(pmd_t) != PAGE_SIZE);
  	BUG_ON(PTRS_PER_PTE * sizeof(pte_t) != PAGE_SIZE);
+#endif

  #ifdef CONFIG_PCI
  	/*
Index: linux-2.6.12-rc4/arch/ia64/kernel/process.c
===================================================================
--- linux-2.6.12-rc4.orig/arch/ia64/kernel/process.c	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/arch/ia64/kernel/process.c	2005-05-19 18:36:14.000000000 +1000
@@ -27,13 +27,13 @@
  #include <linux/efi.h>
  #include <linux/interrupt.h>
  #include <linux/delay.h>
+#include <linux/page_table.h>

  #include <asm/cpu.h>
  #include <asm/delay.h>
  #include <asm/elf.h>
  #include <asm/ia32.h>
  #include <asm/irq.h>
-#include <asm/pgalloc.h>
  #include <asm/processor.h>
  #include <asm/sal.h>
  #include <asm/tlbflush.h>
Index: linux-2.6.12-rc4/arch/ia64/kernel/smp.c
===================================================================
--- linux-2.6.12-rc4.orig/arch/ia64/kernel/smp.c	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/arch/ia64/kernel/smp.c	2005-05-19 18:36:14.000000000 +1000
@@ -30,6 +30,7 @@
  #include <linux/delay.h>
  #include <linux/efi.h>
  #include <linux/bitops.h>
+#include <linux/page_table.h>

  #include <asm/atomic.h>
  #include <asm/current.h>
@@ -38,8 +39,6 @@
  #include <asm/io.h>
  #include <asm/irq.h>
  #include <asm/page.h>
-#include <asm/pgalloc.h>
-#include <asm/pgtable.h>
  #include <asm/processor.h>
  #include <asm/ptrace.h>
  #include <asm/sal.h>
Index: linux-2.6.12-rc4/arch/ia64/kernel/smpboot.c
===================================================================
--- linux-2.6.12-rc4.orig/arch/ia64/kernel/smpboot.c	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/arch/ia64/kernel/smpboot.c	2005-05-19 18:36:14.000000000 +1000
@@ -41,6 +41,7 @@
  #include <linux/efi.h>
  #include <linux/percpu.h>
  #include <linux/bitops.h>
+#include <linux/page_table.h>

  #include <asm/atomic.h>
  #include <asm/cache.h>
@@ -52,8 +53,6 @@
  #include <asm/machvec.h>
  #include <asm/mca.h>
  #include <asm/page.h>
-#include <asm/pgalloc.h>
-#include <asm/pgtable.h>
  #include <asm/processor.h>
  #include <asm/ptrace.h>
  #include <asm/sal.h>
Index: linux-2.6.12-rc4/arch/ia64/mm/contig.c
===================================================================
--- linux-2.6.12-rc4.orig/arch/ia64/mm/contig.c	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/arch/ia64/mm/contig.c	2005-05-19 18:36:14.000000000 +1000
@@ -19,10 +19,9 @@
  #include <linux/efi.h>
  #include <linux/mm.h>
  #include <linux/swap.h>
+#include <linux/page_table.h>

  #include <asm/meminit.h>
-#include <asm/pgalloc.h>
-#include <asm/pgtable.h>
  #include <asm/sections.h>
  #include <asm/mca.h>

@@ -61,8 +60,10 @@
  	printk("%d reserved pages\n", reserved);
  	printk("%d pages shared\n", shared);
  	printk("%d pages swap cached\n", cached);
+#ifdef CONFIG_MLPT
  	printk("%ld pages in page table cache\n",
  		pgtable_quicklist_total_size());
+#endif
  }

  /* physical address where the bootmem map is located */
Index: linux-2.6.12-rc4/arch/ia64/mm/discontig.c
===================================================================
--- linux-2.6.12-rc4.orig/arch/ia64/mm/discontig.c	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/arch/ia64/mm/discontig.c	2005-05-19 18:36:14.000000000 +1000
@@ -21,7 +21,7 @@
  #include <linux/acpi.h>
  #include <linux/efi.h>
  #include <linux/nodemask.h>
-#include <asm/pgalloc.h>
+#include <linux/page_table.h>
  #include <asm/tlb.h>
  #include <asm/meminit.h>
  #include <asm/numa.h>
Index: linux-2.6.12-rc4/arch/ia64/mm/tlb.c
===================================================================
--- linux-2.6.12-rc4.orig/arch/ia64/mm/tlb.c	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/arch/ia64/mm/tlb.c	2005-05-19 18:36:14.000000000 +1000
@@ -16,10 +16,10 @@
  #include <linux/sched.h>
  #include <linux/smp.h>
  #include <linux/mm.h>
+#include <linux/page_table.h>

  #include <asm/delay.h>
  #include <asm/mmu_context.h>
-#include <asm/pgalloc.h>
  #include <asm/pal.h>
  #include <asm/tlbflush.h>

Index: linux-2.6.12-rc4/include/asm-ia64/tlb.h
===================================================================
--- linux-2.6.12-rc4.orig/include/asm-ia64/tlb.h	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/include/asm-ia64/tlb.h	2005-05-19 18:36:14.000000000 +1000
@@ -224,22 +224,8 @@
  	__tlb_remove_tlb_entry(tlb, ptep, addr);	\
  } while (0)

-#define pte_free_tlb(tlb, ptep)				\
-do {							\
-	tlb->need_flush = 1;				\
-	__pte_free_tlb(tlb, ptep);			\
-} while (0)
-
-#define pmd_free_tlb(tlb, ptep)				\
-do {							\
-	tlb->need_flush = 1;				\
-	__pmd_free_tlb(tlb, ptep);			\
-} while (0)
-
-#define pud_free_tlb(tlb, pudp)				\
-do {							\
-	tlb->need_flush = 1;				\
-	__pud_free_tlb(tlb, pudp);			\
-} while (0)
+#ifdef CONFIG_MLPT
+#include <asm-generic/tlb-mlpt.h>
+#endif

  #endif /* _ASM_IA64_TLB_H */
Index: linux-2.6.12-rc4/arch/ia64/sn/kernel/sn2/cache.c
===================================================================
--- linux-2.6.12-rc4.orig/arch/ia64/sn/kernel/sn2/cache.c	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/arch/ia64/sn/kernel/sn2/cache.c	2005-05-19 18:36:14.000000000 +1000
@@ -7,7 +7,6 @@
   *
   */
  #include <linux/module.h>
-#include <asm/pgalloc.h>

  /**
  * sn_flush_all_caches - flush a range of address from all caches (incl. L4)
Index: linux-2.6.12-rc4/arch/ia64/mm/fixed-mlpt/mlpt-ia64.c
===================================================================
--- linux-2.6.12-rc4.orig/arch/ia64/mm/fixed-mlpt/mlpt-ia64.c	2005-05-19 18:32:00.000000000 +1000
+++ linux-2.6.12-rc4/arch/ia64/mm/fixed-mlpt/mlpt-ia64.c	2005-05-19 18:36:14.000000000 +1000
@@ -1 +1,83 @@
+#include <linux/config.h>
+#include <linux/kernel.h>
+#include <linux/init.h>

+#include <linux/bootmem.h>
+#include <linux/efi.h>
+#include <linux/elf.h>
+#include <linux/mm.h>
+#include <linux/mmzone.h>
+#include <linux/module.h>
+#include <linux/personality.h>
+#include <linux/reboot.h>
+#include <linux/slab.h>
+#include <linux/swap.h>
+#include <linux/proc_fs.h>
+#include <linux/bitops.h>
+#include <linux/page_table.h>
+
+#include <asm/a.out.h>
+#include <asm/dma.h>
+#include <asm/ia32.h>
+#include <asm/io.h>
+#include <asm/machvec.h>
+#include <asm/numa.h>
+#include <asm/patch.h>
+#include <asm/sal.h>
+#include <asm/sections.h>
+#include <asm/system.h>
+#include <asm/tlb.h>
+#include <asm/uaccess.h>
+#include <asm/unistd.h>
+#include <asm/mca.h>
+
+DEFINE_PER_CPU(unsigned long *, __pgtable_quicklist);
+DEFINE_PER_CPU(long, __pgtable_quicklist_size);
+
+#define MIN_PGT_PAGES			25UL
+#define MAX_PGT_FREES_PER_PASS		16L
+#define PGT_FRACTION_OF_NODE_MEM	16
+
+static inline long
+max_pgt_pages(void)
+{
+	u64 node_free_pages, max_pgt_pages;
+
+#ifndef	CONFIG_NUMA
+	node_free_pages = nr_free_pages();
+#else
+	node_free_pages = nr_free_pages_pgdat(NODE_DATA(numa_node_id()));
+#endif
+	max_pgt_pages = node_free_pages / PGT_FRACTION_OF_NODE_MEM;
+	max_pgt_pages = max(max_pgt_pages, MIN_PGT_PAGES);
+	return max_pgt_pages;
+}
+
+static inline long
+min_pages_to_free(void)
+{
+	long pages_to_free;
+
+	pages_to_free = pgtable_quicklist_size - max_pgt_pages();
+	pages_to_free = min(pages_to_free, MAX_PGT_FREES_PER_PASS);
+	return pages_to_free;
+}
+
+void
+check_pgt_cache(void)
+{
+	long pages_to_free;
+
+	if (unlikely(pgtable_quicklist_size <= MIN_PGT_PAGES))
+		return;
+
+	preempt_disable();
+	while (unlikely((pages_to_free = min_pages_to_free()) > 0)) {
+		while (pages_to_free--) {
+			free_page((unsigned long)pgtable_quicklist_alloc());
+		}
+		preempt_enable();
+		preempt_disable();
+	}
+	preempt_enable();
+}
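
A worked example of the sizing policy above (numbers illustrative):
on a node with 65536 free pages, max_pgt_pages() is 65536/16 = 4096,
with a floor of MIN_PGT_PAGES = 25.  Once the quicklist grows past
that, check_pgt_cache() releases at most MAX_PGT_FREES_PER_PASS = 16
pages per pass, re-enabling preemption between passes, until the
list is back under the cap.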

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: aart@kvack.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 15/15] PTI: Call IA64 interface
  2005-05-21  5:15                         ` [PATCH 14/15] PTI: Move IA64 mlpt code behind interface Paul Cameron Davies
@ 2005-05-21  5:27                           ` Paul Cameron Davies
  2005-05-21  5:46                             ` PTI: Patch 10/15 URL Paul Cameron Davies
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Cameron Davies @ 2005-05-21  5:27 UTC (permalink / raw)
  To: linux-mm

Patch 15 of 15.

The final patch in the series.  This patch goes through
and calls the functions in the IA64-specific page table
interface.  This includes:

 	*call lookup_kernel_page_table in fault.c.
 	*call build_kernel_page_table in put_kernel_page.
 	*call build_memory_map in create_mem_map_page_table.

  arch/ia64/mm/fault.c |   22 ++++------------------
  arch/ia64/mm/init.c  |   35 ++++-------------------------------
  2 files changed, 8 insertions(+), 49 deletions(-)

Index: linux-2.6.12-rc4/arch/ia64/mm/fault.c
===================================================================
--- linux-2.6.12-rc4.orig/arch/ia64/mm/fault.c	2005-05-19 17:01:14.000000000 +1000
+++ linux-2.6.12-rc4/arch/ia64/mm/fault.c	2005-05-19 18:40:11.000000000 +1000
@@ -9,8 +9,9 @@
  #include <linux/mm.h>
  #include <linux/smp_lock.h>
  #include <linux/interrupt.h>
+#include <linux/page_table.h>

-#include <asm/pgtable.h>
+#include <asm/mlpt.h>
  #include <asm/processor.h>
  #include <asm/system.h>
  #include <asm/uaccess.h>
@@ -50,27 +51,12 @@
  static int
  mapped_kernel_page_is_present (unsigned long address)
  {
-	pgd_t *pgd;
-	pud_t *pud;
-	pmd_t *pmd;
  	pte_t *ptep, pte;

-	pgd = pgd_offset_k(address);
-	if (pgd_none(*pgd) || pgd_bad(*pgd))
-		return 0;
-
-	pud = pud_offset(pgd, address);
-	if (pud_none(*pud) || pud_bad(*pud))
-		return 0;
-
-	pmd = pmd_offset(pud, address);
-	if (pmd_none(*pmd) || pmd_bad(*pmd))
-		return 0;
-
-	ptep = pte_offset_kernel(pmd, address);
+	ptep = lookup_kernel_page_table(address);
  	if (!ptep)
  		return 0;
-
+
  	pte = *ptep;
  	return pte_present(pte);
  }
Index: linux-2.6.12-rc4/arch/ia64/mm/init.c
===================================================================
--- linux-2.6.12-rc4.orig/arch/ia64/mm/init.c	2005-05-19 18:36:14.000000000 +1000
+++ linux-2.6.12-rc4/arch/ia64/mm/init.c	2005-05-19 18:40:11.000000000 +1000
@@ -215,27 +215,15 @@
  struct page *
  put_kernel_page (struct page *page, unsigned long address, pgprot_t pgprot)
  {
-	pgd_t *pgd;
-	pud_t *pud;
-	pmd_t *pmd;
  	pte_t *pte;

  	if (!PageReserved(page))
  		printk(KERN_ERR "put_kernel_page: page at 0x%p not in 
reserved memory\n",
  		       page_address(page));

-	pgd = pgd_offset_k(address);		/* note: this is NOT pgd_offset()! */
-
  	spin_lock(&init_mm.page_table_lock);
  	{
-		pud = pud_alloc(&init_mm, pgd, address);
-		if (!pud)
-			goto out;
-
-		pmd = pmd_alloc(&init_mm, pud, address);
-		if (!pmd)
-			goto out;
-		pte = pte_alloc_map(&init_mm, pmd, address);
+		pte = build_kernel_page_table(address);
  		if (!pte)
  			goto out;
  		if (!pte_none(*pte)) {
@@ -349,9 +337,6 @@
  	unsigned long address, start_page, end_page;
  	struct page *map_start, *map_end;
  	int node;
-	pgd_t *pgd;
-	pud_t *pud;
-	pmd_t *pmd;
  	pte_t *pte;

  	map_start = vmem_map + (__pa(start) >> PAGE_SHIFT);
@@ -362,22 +347,10 @@
  	node = paddr_to_nid(__pa(start));

  	for (address = start_page; address < end_page; address += PAGE_SIZE) {
-		pgd = pgd_offset_k(address);
-		if (pgd_none(*pgd))
-			pgd_populate(&init_mm, pgd, alloc_bootmem_pages_node(NODE_DATA(node), PAGE_SIZE));
-		pud = pud_offset(pgd, address);
-
-		if (pud_none(*pud))
-			pud_populate(&init_mm, pud, alloc_bootmem_pages_node(NODE_DATA(node), PAGE_SIZE));
-		pmd = pmd_offset(pud, address);
-
-		if (pmd_none(*pmd))
-			pmd_populate_kernel(&init_mm, pmd, alloc_bootmem_pages_node(NODE_DATA(node), PAGE_SIZE));
-		pte = pte_offset_kernel(pmd, address);
-
+		pte = build_memory_map(address, node);
  		if (pte_none(*pte))
-			set_pte(pte, pfn_pte(__pa(alloc_bootmem_pages_node(NODE_DATA(node), PAGE_SIZE)) >> PAGE_SHIFT,
-					     PAGE_KERNEL));
+			set_pte(pte, pfn_pte(__pa(alloc_bootmem_pages_node(NODE_DATA(node),
+				PAGE_SIZE)) >> PAGE_SHIFT, PAGE_KERNEL));
  	}
  	return 0;
  }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: aart@kvack.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* PTI: Patch 10/15 URL
  2005-05-21  5:27                           ` [PATCH 15/15] PTI: Call IA64 interface Paul Cameron Davies
@ 2005-05-21  5:46                             ` Paul Cameron Davies
  2005-05-21  5:47                               ` PTI: LMbench results Paul Cameron Davies
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Cameron Davies @ 2005-05-21  5:46 UTC (permalink / raw)
  To: linux-mm

Patch 10 can be downloaded from:

http://www.gelato.unsw.edu.au/cvs/cvs/kernel/page_table_interface/

pti-call-iterators-10.patch

I apologise for things not being in a single thread.  I have had
a couple of mail problems while posting.


Paul Davies
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: aart@kvack.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* PTI: LMbench results
  2005-05-21  5:46                             ` PTI: Patch 10/15 URL Paul Cameron Davies
@ 2005-05-21  5:47                               ` Paul Cameron Davies
  0 siblings, 0 replies; 19+ messages in thread
From: Paul Cameron Davies @ 2005-05-21  5:47 UTC (permalink / raw)
  To: linux-mm

Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
                                  null     null                       open  signal   signal    fork    execve  /bin/sh
kernel                           call      I/O     stat    fstat    close  install   handle  process  process  process
-----------------------------  -------  -------  -------  -------  -------  -------  -------  -------  -------  -------
2.6.12-rc4                       0.316  0.45743    2.184    0.591    3.586  0.587    2.915    115.0    684.8   3449.7
   s.d. (5 runs)                  0.000  0.00152    0.006    0.002    0.017  0.000    0.019      0.0     13.8      9.4
2.6.13-rc4pti                    0.316  0.45813    2.198    0.588    3.509  0.608    2.832    120.1    678.0   3475.1
   s.d. (5 runs)                  0.000  0.00014    0.020    0.001    0.020  0.001    0.037      0.0     11.7     19.2

File select - times in microseconds - smaller is better
-------------------------------------------------------
                                 select   select   select   select   select  select   select   select
kernel                           10 fd   100 fd   250 fd   500 fd   10 tcp  100 tcp  250 tcp  500 tcp
-----------------------------  -------  -------  -------  -------  -------  -------  -------  -------
2.6.12-rc4                       1.999   11.546   27.426   53.804    2.798  19.1855  46.4584  91.8376
   s.d.                           0.004    0.004    0.013    0.019    0.005  0.00785  0.01316  0.03462
2.6.13-rc4pti                    2.030   11.571   27.445   53.788    2.791  19.1657  46.4454  91.7666
   s.d.                           0.003    0.008    0.010    0.026    0.004  0.01685  0.03803  0.05652

Context switching with 0K - times in microseconds - smaller is better
---------------------------------------------------------------------
                                 2proc/0k   4proc/0k   8proc/0k  16proc/0k  32proc/0k  64proc/0k  96proc/0k
kernel                         ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch
-----------------------------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
2.6.12-rc4                        13.136     13.788     11.394      8.872  8.632     10.442     11.396
   s.d.                             5.930      3.110      1.692      1.773  0.841      0.885      0.630
2.6.13-rc4pti                     13.082     14.514     11.336     10.090  8.666      9.996     10.662
   s.d.                             6.000      2.952      1.688      0.860  0.897      0.888      0.552

Context switching with 4K - times in microseconds - smaller is better
---------------------------------------------------------------------
                                 2proc/4k   4proc/4k   8proc/4k  16proc/4k  32proc/4k  64proc/4k  96proc/4k
kernel                         ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch
-----------------------------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
2.6.12-rc4                        13.674     12.524      9.546     10.002  10.194     12.810     14.578
   s.d.                             5.878      3.668      2.670      0.701  1.029      0.414      0.381
2.6.13-rc4pti                     13.558     13.780     10.098     10.220  9.662     12.934     14.600
   s.d.                             6.133      3.618      2.449      0.034  0.854      0.646      0.560

Context switching with 8K - times in microseconds - smaller is better
---------------------------------------------------------------------
                                 2proc/8k   4proc/8k   8proc/8k  16proc/8k  32proc/8k  64proc/8k  96proc/8k
kernel                         ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch
-----------------------------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
2.6.12-rc4                        11.450     14.246     10.588      9.480  9.898     13.528     17.054
   s.d.                             7.310      3.511      2.347      0.850  1.259      0.680      0.762
2.6.13-rc4pti                     16.656     12.896      9.896      9.240  10.658     13.844     17.104
   s.d.                             0.143      3.537      1.382      1.511  1.918      0.856      0.691

Context switching with 16K - times in microseconds - smaller is better
----------------------------------------------------------------------
                                2proc/16k  4proc/16k  8proc/16k  16prc/16k  32prc/16k  64prc/16k  96prc/16k
kernel                         ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch
-----------------------------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
2.6.12-rc4                        17.688     13.784     11.712     10.436  11.328     17.222     21.424
   s.d.                             0.008      3.579      2.312      1.745  1.733      1.403      1.394
2.6.13-rc4pti                     17.672     13.818     10.966     11.194  13.000     17.180     21.148
   s.d.                             0.049      3.665      1.315      0.799  0.875      0.631      0.873

Context switching with 32K - times in microseconds - smaller is better
----------------------------------------------------------------------
                                2proc/32k  4proc/32k  8proc/32k  16prc/32k  32prc/32k  64prc/32k  96prc/32k
kernel                         ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch
-----------------------------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
2.6.12-rc4                        19.474     15.790     14.398     13.948  17.786     30.478     43.438
   s.d.                             0.265      3.501      1.524      1.454  1.028      1.713      0.980
2.6.13-rc4pti                     19.394     15.678     11.830     13.026  17.366     32.206     43.038
   s.d.                             0.103      3.589      1.988      1.914  2.064      2.087      1.551

Context switching with 64K - times in microseconds - smaller is better
----------------------------------------------------------------------
                                2proc/64k  4proc/64k  8proc/64k  16prc/64k  32prc/64k  64prc/64k  96prc/64k
kernel                         ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch  ctx swtch
-----------------------------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
2.6.12-rc4                        20.930     18.720     18.142     20.558  46.034     80.218     86.914
   s.d.                             2.754      2.992      1.737      1.393  4.058      1.774      2.466
2.6.13-rc4pti                     22.596     21.158     20.558     29.526  51.312     81.384     88.340
   s.d.                             0.185      3.291      1.060      8.992  4.053      2.880      3.548

File create/delete and VM system latencies in microseconds - smaller is better
----------------------------------------------------------------------------
                                  0K       0K       1K       1K       4K  4K      10K      10K     Mmap     Prot    Page
kernel                         Create   Delete   Create   Delete   Create  Delete   Create   Delete   Latency  Fault   Fault
------------------------------ -------  -------  -------  -------  -------  -------  -------  -------  -------  ------  ------
2.6.12-rc4                       44.78    16.93    64.39    31.40    64.48  31.37    83.36    34.13   2173.6   1.087    1.00
   s.d.                            0.05     0.07     0.06     0.08     0.20  0.15     0.07     0.03     15.9   0.021    0.00
2.6.13-rc4pti                    44.71    16.94    64.53    31.49    64.77  31.44    84.07    34.19   2306.4   1.173    1.00
   s.d.                            0.03     0.04     0.07     0.07     0.18  0.16     1.03     0.03     15.6   0.018    0.00

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
kernel                           Pipe   AF/Unix     UDP   RPC/UDP     TCP  RPC/TCP  TCPconn
-----------------------------  -------  -------  -------  -------  -------  -------  -------
2.6.12-rc4                     164.930   80.218   87.183  53.8829  70.5775  88.0925   50.104
   s.d.                           7.174    2.140  39.9387  0.66410  29.6098  22.7606    0.247
2.6.13-rc4pti                  156.261   80.508  73.4638  53.9942  60.9479  71.4832   50.257
   s.d.                          11.973    2.678  38.7197  0.63384  24.5805  18.2365    0.261

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------
                                                              File     Mmap  Bcopy    Bcopy   Memory   Memory
kernel                           Pipe   AF/Unix    TCP     reread   reread  (libc)   (hand)     read    write
-----------------------------  -------  -------  -------  -------  -------  -------  -------  -------  -------
2.6.12-rc4                     1250.44  1729.22  1240.64  1102.10   590.12  660.54   387.10   589.91   556.02
   s.d.                            5.43   267.98     4.80     1.52     0.48  2.09     0.52     0.50     2.66
2.6.13-rc4pti                  1070.91  1770.60  1245.81  1102.00   589.94  660.65   387.23   590.16   555.78
   s.d.                          301.43   124.01     5.33     1.79     0.44  2.52     0.58     0.42     2.80

*Local* More Communication bandwidths in MB/s - bigger is better
----------------------------------------------------------------
                                   File     Mmap  Aligned  Partial  Partial  Partial  Partial
OS                                open     open    Bcopy    Bcopy     Mmap  Mmap     Mmap    Bzero
                                  close    close   (libc)   (hand)     read  write   rd/wrt     copy     HTTP
-----------------------------  -------  -------  -------  -------  -------  -------  -------  -------  -------
2.6.12-rc4                     1102.20   558.53   664.23   677.96   772.99  1529.92   473.22  2345.51   16.256
   s.d.                            1.71     0.66     2.47     2.49     0.53  32.52     0.51     9.35    0.412
2.6.13-rc4pti                  1102.76   557.95   669.52   677.41   773.41  1521.63   473.11  2344.87   16.152
   s.d.                            0.54     0.18     9.33     3.27     0.54  33.58     0.60    11.55    0.151

Memory latencies in nanoseconds - smaller is better
---------------------------------------------------
kernel                          Mhz     L1 $     L2 $    Main mem
-----------------------------  -----  -------  -------  ---------
2.6.12-rc4                       900    2.227    6.686     121.45
   s.d.                             0    0.412    0.412       0.41
2.6.13-rc4pti                    900    2.227    6.686     121.44
   s.d.                             0    0.151    0.151       0.15

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: aart@kvack.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/15] PTI: clean page table interface
  2005-05-21  2:43 [PATCH 1/15] PTI: clean page table interface Paul Davies
  2005-05-21  2:53 ` [PATCH 2/15] PTI: Add general files and directories Paul Cameron Davies
@ 2005-05-28  8:53 ` Christoph Hellwig
  2005-05-30  5:16   ` Paul Davies
  1 sibling, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2005-05-28  8:53 UTC (permalink / raw)
  To: Paul Davies; +Cc: linux-mm

On Sat, May 21, 2005 at 12:43:31PM +1000, Paul Davies wrote:
> Here are a set of 15 patches against 2.6.12-rc4 to provide a clean
> page table interface so that alternate page tables can be fitted
> to Linux in the future.  This patch set is produced on behalf of
> the Gelato research group at the University of New South Wales.
> 
> LMbench results are included at the end of this patch set.  The
> results are very good although the mmap latency figures were
> slightly higher than expected.
> 
> I look forward to any feedback that will assist me in putting
> together a page table interface that will benefit the whole linux
> community. 

I've not looked over it a lot, but your code organization is a bit odd
and non-standard:

 - generic implementations for per-arch abstractions go into asm-generic
   and every asm-foo/ header that wants to use it includes it.  In your
   case that would be an asm-generic/page_table.h for the generic 3level
   page tables.  Please avoid #includes for generic implementations from
   architecture-independent headers guarded by CONFIG_ symbols.
 - I don't think the subdirectory under mm/ makes sense.  Just call the
   file mm/3level-page-table.c or something.
 - similarly, please avoid the include/mm directory.  It might or might not
   make sense to have a subdirectory for mm headers, but please don't
   start one as part of a large patch series.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: aart@kvack.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: PTI: clean page table interface
  2005-05-28  8:53 ` [PATCH 1/15] PTI: clean page table interface Christoph Hellwig
@ 2005-05-30  5:16   ` Paul Davies
  0 siblings, 0 replies; 19+ messages in thread
From: Paul Davies @ 2005-05-30  5:16 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-mm

On 28/05/05 09:53 +0100, Christoph Hellwig wrote:
> I've not looked over it a lot, but your code organization is a bit odd
> and non-standard:
> 
>  - generic implementations for per-arch abstractions go into asm-generic
>    and every asm-foo/ header that wants to use it includes it.  In your
>    case that would be an asm-generic/page_table.h for the generic 3level
>    page tables.  Please avoid #includes for generic implementations from
>    architecture-independent headers guarded by CONFIG_ symbols.
>  - I don't think the subdirectory under mm/ makes sense.  Just call the
>    file mm/3level-page-table.c or something.
>  - similarly, please avoid the include/mm directory.  It might or might not
>    make sense to have a subdirectory for mm headers, but please don't
>    start one as part of a large patch series.

Thank you for your pointers regarding the code organisation.  I will be
taking your advice which will appear in the next iteration of patches.
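
The layout Christoph describes would look roughly like this
(hypothetical header names, not code from any posted patch):

/* include/asm-generic/page_table.h: generic 3-level implementation */
#ifndef _ASM_GENERIC_PAGE_TABLE_H
#define _ASM_GENERIC_PAGE_TABLE_H
/* generic init_page_table(), iterators, ... */
#endif

/* include/asm-ia64/page_table.h: an architecture opts in explicitly */
#ifndef _ASM_IA64_PAGE_TABLE_H
#define _ASM_IA64_PAGE_TABLE_H
#include <asm-generic/page_table.h>
#endif

Generic code then includes <asm/page_table.h> unconditionally, with
no CONFIG_ guards in architecture-independent headers.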

We have a guarded page table implementation at UNSW (originally conceived
of by Jochen Liedtke).  We are testing it in Linux as an alternative to
the MLPT.  After the current patches (to achieve a clean interface), we have
a GPT patch set which includes directories mm/fixed-mlpt and mm/gpt.

The GPT is far more sophisticated than the MLPT and is written across a
number of files.  Having a directory for each page table implementation
makes sense when you see alternate page tables side by side.

I am writing a patch[0/15] to give a brief explanation of what we are doing
at UNSW and to explain the interface a little better.  

Please let me know if there is anything else that would assist.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: aart@kvack.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2005-05-30  5:16 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-05-21  2:43 [PATCH 1/15] PTI: clean page table interface Paul Davies
2005-05-21  2:53 ` [PATCH 2/15] PTI: Add general files and directories Paul Cameron Davies
2005-05-21  3:08   ` [PATCH 3/15] PTI: move mlpt behind interface Paul Cameron Davies
2005-05-21  3:15     ` [PATCH 4/15] PTI: move mlpt behind interface cont Paul Cameron Davies
2005-05-21  3:26       ` [PATCH 5/15] PTI: Finish moving mlpt behind interface Paul Cameron Davies
2005-05-21  3:47         ` [PATCH 6/15] PTI: Start calling the interface Paul Cameron Davies
2005-05-21  3:54           ` [PATCH 7/15] PTI: continue calling interface Paul Cameron Davies
2005-05-21  4:04             ` [PATCH 8/15] PTI: Keep " Paul Cameron Davies
2005-05-21  4:12               ` [PATCH 9/15] PTI: Introduce iterators Paul Cameron Davies
2005-05-21  4:19                 ` [PATCH 10/15] PTI: Call iterators Paul Cameron Davies
2005-05-21  4:58                   ` [PATCH 11/15] PTI: Continue calling iterators Paul Cameron Davies
2005-05-21  5:04                     ` [PATCH 12/15] PTI: Finish " Paul Cameron Davies
2005-05-21  5:09                       ` [PATCH 13/15] PTI: Add files and IA64 part of interface Paul Cameron Davies
2005-05-21  5:15                         ` [PATCH 14/15] PTI: Move IA64 mlpt code behind interface Paul Cameron Davies
2005-05-21  5:27                           ` [PATCH 15/15] PTI: Call IA64 interface Paul Cameron Davies
2005-05-21  5:46                             ` PTI: Patch 10/15 URL Paul Cameron Davies
2005-05-21  5:47                               ` PTI: LMbench results Paul Cameron Davies
2005-05-28  8:53 ` [PATCH 1/15] PTI: clean page table interface Christoph Hellwig
2005-05-30  5:16   ` Paul Davies

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox