* [PATCH 0/11] alternate 4-level page tables patches (take 2)
@ 2004-12-22 9:50 Nick Piggin
2004-12-22 9:52 ` [PATCH 1/11] parentheses to x86-64 macro Nick Piggin
2004-12-22 10:18 ` [PATCH 0/11] alternate 4-level page tables patches (take 2) Andi Kleen
0 siblings, 2 replies; 13+ messages in thread
From: Nick Piggin @ 2004-12-22 9:50 UTC (permalink / raw)
To: Linus Torvalds, Andrew Morton, Andi Kleen, Hugh Dickins,
Linux Memory Management
OK, it turned out that the fallback header I sent out earlier seemed
to do the right thing on both ia64 and x86_64 (3-level) without any
real changes. Combined with i386 !PAE, that covers the 2-level and
3-level implementations... so with any luck it will work on all arches.
So in the following series there is:
- a minor shuffling of hunks between patches;
- a slight improvement to the clear_page_range patch;
- a fix for an off-by-one bug in clear_pud_range;
- the inlining patch dropped;
- the inclusion of the fallback header.
Theoretically, all architectures should continue to work as before.
Comments? Any consensus as to which way we want to go? I don't want to
inflame tempers by continuing this line of work, just to provoke discussion.
Thanks,
Nick
* [PATCH 1/11] parentheses to x86-64 macro
2004-12-22 9:50 [PATCH 0/11] alternate 4-level page tables patches (take 2) Nick Piggin
@ 2004-12-22 9:52 ` Nick Piggin
2004-12-22 9:53 ` [PATCH 2/11] generic 3-level nopmd folding header Nick Piggin
2004-12-22 10:18 ` [PATCH 0/11] alternate 4-level page tables patches (take 2) Andi Kleen
1 sibling, 1 reply; 13+ messages in thread
From: Nick Piggin @ 2004-12-22 9:52 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, Andi Kleen, Hugh Dickins, Linux Memory Management
[-- Attachment #1: Type: text/plain, Size: 97 bytes --]
1/11
Not strictly a 4-level patch, but this macro spat a warning at me at one
stage during my travels.
[-- Attachment #2: x86-64-fix-macro.patch --]
[-- Type: text/plain, Size: 893 bytes --]
Add parentheses around the argument of x86-64's pgd_index() macro
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
---
linux-2.6-npiggin/include/asm-x86_64/pgtable.h | 2 +-
1 files changed, 1 insertion(+), 1 deletion(-)
diff -puN include/asm-x86_64/pgtable.h~x86-64-fix-macro include/asm-x86_64/pgtable.h
--- linux-2.6/include/asm-x86_64/pgtable.h~x86-64-fix-macro 2004-12-22 20:29:41.000000000 +1100
+++ linux-2.6-npiggin/include/asm-x86_64/pgtable.h 2004-12-22 20:35:55.000000000 +1100
@@ -311,7 +311,7 @@ static inline int pmd_large(pmd_t pte) {
/* PGD - Level3 access */
/* to find an entry in a page-table-directory. */
-#define pgd_index(address) ((address >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))
+#define pgd_index(address) (((address) >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))
static inline pgd_t *__pgd_offset_k(pgd_t *pgd, unsigned long address)
{
return pgd + pgd_index(address);
_
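To illustrate why the unparenthesised argument is dangerous, here is a
standalone sketch (not part of the patch; PGDIR_SHIFT and PTRS_PER_PGD
are made-up values and 64-bit longs are assumed). The shift operator
binds tighter than bitwise AND, so passing an expression like
addr & mask applies the shift to the wrong operand:

#include <stdio.h>

#define PGDIR_SHIFT     39
#define PTRS_PER_PGD    512

/* before the fix: the argument is substituted without parentheses */
#define pgd_index_old(address) ((address >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))
/* after the fix */
#define pgd_index_new(address) (((address) >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))

int main(void)
{
        unsigned long addr = 0x0000700000000000UL;
        unsigned long mask = ~0UL;

        /*
         * In the old macro the shift lands on 'mask' alone:
         * addr & (mask >> PGDIR_SHIFT) is 0, so it prints 0
         * instead of the correct index 224.
         */
        printf("old: %lu\n", pgd_index_old(addr & mask));
        printf("new: %lu\n", pgd_index_new(addr & mask));
        return 0;
}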
* [PATCH 2/11] generic 3-level nopmd folding header
2004-12-22 9:52 ` [PATCH 1/11] parentheses to x86-64 macro Nick Piggin
@ 2004-12-22 9:53 ` Nick Piggin
2004-12-22 9:54 ` [PATCH 3/11] convert i386 to generic nopmd header Nick Piggin
0 siblings, 1 reply; 13+ messages in thread
From: Nick Piggin @ 2004-12-22 9:53 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, Andi Kleen, Hugh Dickins, Linux Memory Management
[-- Attachment #1: Type: text/plain, Size: 5 bytes --]
2/11
[-- Attachment #2: 3level-compat.patch --]
[-- Type: text/plain, Size: 2305 bytes --]
A generic header to fold the 3-level pagetable interface down to 2 levels.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
---
linux-2.6-npiggin/include/asm-generic/pgtable-nopmd.h | 59 ++++++++++++++++++
1 files changed, 59 insertions(+)
diff -puN /dev/null include/asm-generic/pgtable-nopmd.h
--- /dev/null 2004-09-06 19:38:39.000000000 +1000
+++ linux-2.6-npiggin/include/asm-generic/pgtable-nopmd.h 2004-12-22 20:35:57.000000000 +1100
@@ -0,0 +1,59 @@
+#ifndef _PGTABLE_NOPMD_H
+#define _PGTABLE_NOPMD_H
+
+#ifndef __ASSEMBLY__
+
+/*
+ * Having the pmd type consist of a pgd gets the size right, and allows
+ * us to conceptually access the pgd entry that this pmd is folded into
+ * without casting.
+ */
+typedef struct { pgd_t pgd; } pmd_t;
+
+#define PMD_SHIFT PGDIR_SHIFT
+#define PTRS_PER_PMD 1
+#define PMD_SIZE (1UL << PMD_SHIFT)
+#define PMD_MASK (~(PMD_SIZE-1))
+
+/*
+ * The "pgd_xxx()" functions here are trivial for a folded two-level
+ * setup: the pmd is never bad, and a pmd always exists (as it's folded
+ * into the pgd entry)
+ */
+static inline int pgd_none(pgd_t pgd) { return 0; }
+static inline int pgd_bad(pgd_t pgd) { return 0; }
+static inline int pgd_present(pgd_t pgd) { return 1; }
+static inline void pgd_clear(pgd_t *pgd) { }
+#define pmd_ERROR(pmd) (pgd_ERROR((pmd).pgd))
+
+#define pgd_populate(mm, pmd, pte) do { } while (0)
+#define pgd_populate_kernel(mm, pmd, pte) do { } while (0)
+
+/*
+ * (pmds are folded into pgds so this doesn't get actually called,
+ * but the define is needed for a generic inline function.)
+ */
+#define set_pgd(pgdptr, pgdval) set_pmd((pmd_t *)(pgdptr), (pmd_t) { pgdval })
+
+static inline pmd_t * pmd_offset(pgd_t * pgd, unsigned long address)
+{
+ return (pmd_t *)pgd;
+}
+
+#define pmd_val(x) (pgd_val((x).pgd))
+#define __pmd(x) ((pmd_t) { __pgd(x) } )
+
+#define pgd_page(pgd) (pmd_page((pmd_t){ pgd }))
+#define pgd_page_kernel(pgd) (pmd_page_kernel((pmd_t){ pgd }))
+
+/*
+ * allocating and freeing a pmd is trivial: the 1-entry pmd is
+ * inside the pgd, so has no extra memory associated with it.
+ */
+#define pmd_alloc_one(mm, address) NULL
+#define pmd_free(x) do { } while (0)
+#define __pmd_free_tlb(tlb, x) do { } while (0)
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _PGTABLE_NOPMD_H */
_
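To see what the folding buys, consider a sketch of the kind of walk this
header keeps source-compatible (walk_to_pte is a made-up name, not code
from the patch). On an architecture that includes pgtable-nopmd.h,
pgd_none() and pgd_bad() are constant 0 and pmd_offset() is just a cast,
so the compiler reduces this to a plain two-level walk:

static pte_t *walk_to_pte(struct mm_struct *mm, unsigned long addr)
{
        pgd_t *pgd = pgd_offset(mm, addr);
        pmd_t *pmd;

        if (pgd_none(*pgd) || pgd_bad(*pgd))   /* constant 0 when folded */
                return NULL;
        pmd = pmd_offset(pgd, addr);           /* (pmd_t *)pgd when folded */
        if (pmd_none(*pmd) || pmd_bad(*pmd))
                return NULL;
        return pte_offset_kernel(pmd, addr);
}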
* [PATCH 3/11] convert i386 to generic nopmd header
2004-12-22 9:53 ` [PATCH 2/11] generic 3-level nopmd folding header Nick Piggin
@ 2004-12-22 9:54 ` Nick Piggin
2004-12-22 9:54 ` [PATCH 4/11] split copy_page_range Nick Piggin
0 siblings, 1 reply; 13+ messages in thread
From: Nick Piggin @ 2004-12-22 9:54 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, Andi Kleen, Hugh Dickins, Linux Memory Management
[-- Attachment #1: Type: text/plain, Size: 5 bytes --]
3/11
[-- Attachment #2: 3level-i386-cleanup.patch --]
[-- Type: text/plain, Size: 9391 bytes --]
Adapt the i386 architecture to use the generic 2-level folding header,
just to show how it is done.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
---
linux-2.6-npiggin/include/asm-i386/mmzone.h | 1
linux-2.6-npiggin/include/asm-i386/page.h | 6 --
linux-2.6-npiggin/include/asm-i386/pgalloc.h | 17 +++----
linux-2.6-npiggin/include/asm-i386/pgtable-2level-defs.h | 2
linux-2.6-npiggin/include/asm-i386/pgtable-2level.h | 33 +++------------
linux-2.6-npiggin/include/asm-i386/pgtable-3level.h | 11 +++++
linux-2.6-npiggin/include/asm-i386/pgtable.h | 13 +----
7 files changed, 31 insertions(+), 52 deletions(-)
diff -puN include/asm-i386/pgtable-2level.h~3level-i386-cleanup include/asm-i386/pgtable-2level.h
--- linux-2.6/include/asm-i386/pgtable-2level.h~3level-i386-cleanup 2004-12-22 20:31:43.000000000 +1100
+++ linux-2.6-npiggin/include/asm-i386/pgtable-2level.h 2004-12-22 20:31:43.000000000 +1100
@@ -1,44 +1,22 @@
#ifndef _I386_PGTABLE_2LEVEL_H
#define _I386_PGTABLE_2LEVEL_H
+#include <asm-generic/pgtable-nopmd.h>
+
#define pte_ERROR(e) \
printk("%s:%d: bad pte %08lx.\n", __FILE__, __LINE__, (e).pte_low)
-#define pmd_ERROR(e) \
- printk("%s:%d: bad pmd %08lx.\n", __FILE__, __LINE__, pmd_val(e))
#define pgd_ERROR(e) \
printk("%s:%d: bad pgd %08lx.\n", __FILE__, __LINE__, pgd_val(e))
/*
- * The "pgd_xxx()" functions here are trivial for a folded two-level
- * setup: the pgd is never bad, and a pmd always exists (as it's folded
- * into the pgd entry)
- */
-static inline int pgd_none(pgd_t pgd) { return 0; }
-static inline int pgd_bad(pgd_t pgd) { return 0; }
-static inline int pgd_present(pgd_t pgd) { return 1; }
-#define pgd_clear(xp) do { } while (0)
-
-/*
* Certain architectures need to do special things when PTEs
* within a page table are directly modified. Thus, the following
* hook is made available.
*/
#define set_pte(pteptr, pteval) (*(pteptr) = pteval)
#define set_pte_atomic(pteptr, pteval) set_pte(pteptr,pteval)
-/*
- * (pmds are folded into pgds so this doesn't get actually called,
- * but the define is needed for a generic inline function.)
- */
-#define set_pmd(pmdptr, pmdval) (*(pmdptr) = pmdval)
-#define set_pgd(pgdptr, pgdval) (*(pgdptr) = pgdval)
+#define set_pmd(pmdptr, pmdval) (*(pmdptr) = (pmdval))
-#define pgd_page(pgd) \
-((unsigned long) __va(pgd_val(pgd) & PAGE_MASK))
-
-static inline pmd_t * pmd_offset(pgd_t * dir, unsigned long address)
-{
- return (pmd_t *) dir;
-}
#define ptep_get_and_clear(xp) __pte(xchg(&(xp)->pte_low, 0))
#define pte_same(a, b) ((a).pte_low == (b).pte_low)
#define pte_page(x) pfn_to_page(pte_pfn(x))
@@ -47,6 +25,11 @@ static inline pmd_t * pmd_offset(pgd_t *
#define pfn_pte(pfn, prot) __pte(((pfn) << PAGE_SHIFT) | pgprot_val(prot))
#define pfn_pmd(pfn, prot) __pmd(((pfn) << PAGE_SHIFT) | pgprot_val(prot))
+#define pmd_page(pmd) (pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT))
+
+#define pmd_page_kernel(pmd) \
+((unsigned long) __va(pmd_val(pmd) & PAGE_MASK))
+
/*
* All present user pages are user-executable:
*/
diff -puN include/asm-i386/page.h~3level-i386-cleanup include/asm-i386/page.h
--- linux-2.6/include/asm-i386/page.h~3level-i386-cleanup 2004-12-22 20:31:43.000000000 +1100
+++ linux-2.6-npiggin/include/asm-i386/page.h 2004-12-22 20:31:43.000000000 +1100
@@ -46,11 +46,12 @@ typedef struct { unsigned long pte_low,
typedef struct { unsigned long long pmd; } pmd_t;
typedef struct { unsigned long long pgd; } pgd_t;
typedef struct { unsigned long long pgprot; } pgprot_t;
+#define pmd_val(x) ((x).pmd)
#define pte_val(x) ((x).pte_low | ((unsigned long long)(x).pte_high << 32))
+#define __pmd(x) ((pmd_t) { (x) } )
#define HPAGE_SHIFT 21
#else
typedef struct { unsigned long pte_low; } pte_t;
-typedef struct { unsigned long pmd; } pmd_t;
typedef struct { unsigned long pgd; } pgd_t;
typedef struct { unsigned long pgprot; } pgprot_t;
#define boot_pte_t pte_t /* or would you rather have a typedef */
@@ -66,13 +67,10 @@ typedef struct { unsigned long pgprot; }
#define HAVE_ARCH_HUGETLB_UNMAPPED_AREA
#endif
-
-#define pmd_val(x) ((x).pmd)
#define pgd_val(x) ((x).pgd)
#define pgprot_val(x) ((x).pgprot)
#define __pte(x) ((pte_t) { (x) } )
-#define __pmd(x) ((pmd_t) { (x) } )
#define __pgd(x) ((pgd_t) { (x) } )
#define __pgprot(x) ((pgprot_t) { (x) } )
diff -puN include/asm-i386/pgtable-2level-defs.h~3level-i386-cleanup include/asm-i386/pgtable-2level-defs.h
--- linux-2.6/include/asm-i386/pgtable-2level-defs.h~3level-i386-cleanup 2004-12-22 20:31:43.000000000 +1100
+++ linux-2.6-npiggin/include/asm-i386/pgtable-2level-defs.h 2004-12-22 20:31:43.000000000 +1100
@@ -12,8 +12,6 @@
* the i386 is two-level, so we don't really have any
* PMD directory physically.
*/
-#define PMD_SHIFT 22
-#define PTRS_PER_PMD 1
#define PTRS_PER_PTE 1024
diff -puN include/asm-i386/pgtable-3level.h~3level-i386-cleanup include/asm-i386/pgtable-3level.h
--- linux-2.6/include/asm-i386/pgtable-3level.h~3level-i386-cleanup 2004-12-22 20:31:43.000000000 +1100
+++ linux-2.6-npiggin/include/asm-i386/pgtable-3level.h 2004-12-22 20:35:54.000000000 +1100
@@ -70,9 +70,18 @@ static inline void set_pte(pte_t *ptep,
*/
static inline void pgd_clear (pgd_t * pgd) { }
+#define pmd_page(pmd) (pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT))
+
+#define pmd_page_kernel(pmd) \
+((unsigned long) __va(pmd_val(pmd) & PAGE_MASK))
+
#define pgd_page(pgd) \
+((struct page *) __va(pgd_val(pgd) & PAGE_MASK))
+
+#define pgd_page_kernel(pgd) \
((unsigned long) __va(pgd_val(pgd) & PAGE_MASK))
+
/* Find an entry in the second-level page table.. */
#define pmd_offset(dir, address) ((pmd_t *) pgd_page(*(dir)) + \
pmd_index(address))
@@ -142,4 +151,6 @@ static inline pmd_t pfn_pmd(unsigned lon
#define __pte_to_swp_entry(pte) ((swp_entry_t){ (pte).pte_high })
#define __swp_entry_to_pte(x) ((pte_t){ 0, (x).val })
+#define __pmd_free_tlb(tlb, x) do { } while (0)
+
#endif /* _I386_PGTABLE_3LEVEL_H */
diff -puN include/asm-i386/pgalloc.h~3level-i386-cleanup include/asm-i386/pgalloc.h
--- linux-2.6/include/asm-i386/pgalloc.h~3level-i386-cleanup 2004-12-22 20:31:43.000000000 +1100
+++ linux-2.6-npiggin/include/asm-i386/pgalloc.h 2004-12-22 20:35:54.000000000 +1100
@@ -10,12 +10,10 @@
#define pmd_populate_kernel(mm, pmd, pte) \
set_pmd(pmd, __pmd(_PAGE_TABLE + __pa(pte)))
-static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd, struct page *pte)
-{
- set_pmd(pmd, __pmd(_PAGE_TABLE +
- ((unsigned long long)page_to_pfn(pte) <<
- (unsigned long long) PAGE_SHIFT)));
-}
+#define pmd_populate(mm, pmd, pte) \
+ set_pmd(pmd, __pmd(_PAGE_TABLE + \
+ ((unsigned long long)page_to_pfn(pte) << \
+ (unsigned long long) PAGE_SHIFT)))
/*
* Allocate and free page tables.
*/
@@ -39,16 +37,15 @@ static inline void pte_free(struct page
#define __pte_free_tlb(tlb,pte) tlb_remove_page((tlb),(pte))
+#ifdef CONFIG_X86_PAE
/*
- * allocating and freeing a pmd is trivial: the 1-entry pmd is
- * inside the pgd, so has no extra memory associated with it.
- * (In the PAE case we free the pmds as part of the pgd.)
+ * In the PAE case we free the pmds as part of the pgd.
*/
-
#define pmd_alloc_one(mm, addr) ({ BUG(); ((pmd_t *)2); })
#define pmd_free(x) do { } while (0)
#define __pmd_free_tlb(tlb,x) do { } while (0)
#define pgd_populate(mm, pmd, pte) BUG()
+#endif
#define check_pgt_cache() do { } while (0)
diff -puN include/asm-i386/pgtable.h~3level-i386-cleanup include/asm-i386/pgtable.h
--- linux-2.6/include/asm-i386/pgtable.h~3level-i386-cleanup 2004-12-22 20:31:43.000000000 +1100
+++ linux-2.6-npiggin/include/asm-i386/pgtable.h 2004-12-22 20:35:54.000000000 +1100
@@ -50,12 +50,12 @@ void paging_init(void);
*/
#ifdef CONFIG_X86_PAE
# include <asm/pgtable-3level-defs.h>
+# define PMD_SIZE (1UL << PMD_SHIFT)
+# define PMD_MASK (~(PMD_SIZE-1))
#else
# include <asm/pgtable-2level-defs.h>
#endif
-#define PMD_SIZE (1UL << PMD_SHIFT)
-#define PMD_MASK (~(PMD_SIZE-1))
#define PGDIR_SIZE (1UL << PGDIR_SHIFT)
#define PGDIR_MASK (~(PGDIR_SIZE-1))
@@ -293,15 +293,8 @@ static inline pte_t pte_modify(pte_t pte
#define page_pte(page) page_pte_prot(page, __pgprot(0))
-#define pmd_page_kernel(pmd) \
-((unsigned long) __va(pmd_val(pmd) & PAGE_MASK))
-
-#ifndef CONFIG_DISCONTIGMEM
-#define pmd_page(pmd) (pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT))
-#endif /* !CONFIG_DISCONTIGMEM */
-
#define pmd_large(pmd) \
- ((pmd_val(pmd) & (_PAGE_PSE|_PAGE_PRESENT)) == (_PAGE_PSE|_PAGE_PRESENT))
+((pmd_val(pmd) & (_PAGE_PSE|_PAGE_PRESENT)) == (_PAGE_PSE|_PAGE_PRESENT))
/*
* the pgd page can be thought of an array like this: pgd_t[PTRS_PER_PGD]
diff -puN include/asm-i386/mmzone.h~3level-i386-cleanup include/asm-i386/mmzone.h
--- linux-2.6/include/asm-i386/mmzone.h~3level-i386-cleanup 2004-12-22 20:31:43.000000000 +1100
+++ linux-2.6-npiggin/include/asm-i386/mmzone.h 2004-12-22 20:31:44.000000000 +1100
@@ -116,7 +116,6 @@ static inline struct pglist_data *pfn_to
(unsigned long)(__page - __zone->zone_mem_map) \
+ __zone->zone_start_pfn; \
})
-#define pmd_page(pmd) (pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT))
#ifdef CONFIG_X86_NUMAQ /* we have contiguous memory on NUMA-Q */
#define pfn_valid(pfn) ((pfn) < num_physpages)
_
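For any other two-level architecture the recipe is the same. A
hypothetical, heavily abbreviated arch header after conversion might
look like the sketch below: the architecture keeps its own pgd_t,
PGDIR_SHIFT, PTRS_PER_PGD, pgd_ERROR, set_pmd, pmd_page and
pmd_page_kernel, and the generic header supplies everything the folding
makes trivial.

#ifndef _EXAMPLEARCH_PGTABLE_H
#define _EXAMPLEARCH_PGTABLE_H

#include <asm-generic/pgtable-nopmd.h>  /* folds the pmd into the pgd */

#define pgd_ERROR(e) \
        printk("%s:%d: bad pgd %08lx.\n", __FILE__, __LINE__, pgd_val(e))

/* the generic set_pgd() is defined in terms of this */
#define set_pmd(pmdptr, pmdval) (*(pmdptr) = (pmdval))

/* the generic pgd_page()/pgd_page_kernel() are defined in terms of these */
#define pmd_page(pmd)           pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT)
#define pmd_page_kernel(pmd) \
        ((unsigned long) __va(pmd_val(pmd) & PAGE_MASK))

#endif /* _EXAMPLEARCH_PGTABLE_H */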
* [PATCH 4/11] split copy_page_range
2004-12-22 9:54 ` [PATCH 3/11] convert i386 to generic nopmd header Nick Piggin
@ 2004-12-22 9:54 ` Nick Piggin
2004-12-22 9:55 ` [PATCH 5/11] replace clear_page_tables with clear_page_range Nick Piggin
0 siblings, 1 reply; 13+ messages in thread
From: Nick Piggin @ 2004-12-22 9:54 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, Andi Kleen, Hugh Dickins, Linux Memory Management
[-- Attachment #1: Type: text/plain, Size: 5 bytes --]
4/11
[-- Attachment #2: 3level-split-copy_page_range.patch --]
[-- Type: text/plain, Size: 8698 bytes --]
Split copy_page_range into the usual set of page table walking functions.
This is needed to keep the complexity manageable when moving to 4 levels.
Split out from Andi Kleen's 4level patch.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
---
linux-2.6-npiggin/mm/memory.c | 290 ++++++++++++++++++++++--------------------
1 files changed, 152 insertions(+), 138 deletions(-)
diff -puN mm/memory.c~3level-split-copy_page_range mm/memory.c
--- linux-2.6/mm/memory.c~3level-split-copy_page_range 2004-12-22 20:31:44.000000000 +1100
+++ linux-2.6-npiggin/mm/memory.c 2004-12-22 20:35:58.000000000 +1100
@@ -204,165 +204,179 @@ pte_t fastcall * pte_alloc_kernel(struct
out:
return pte_offset_kernel(pmd, address);
}
-#define PTE_TABLE_MASK ((PTRS_PER_PTE-1) * sizeof(pte_t))
-#define PMD_TABLE_MASK ((PTRS_PER_PMD-1) * sizeof(pmd_t))
/*
* copy one vm_area from one task to the other. Assumes the page tables
* already present in the new task to be cleared in the whole range
* covered by this vma.
*
- * 08Jan98 Merged into one routine from several inline routines to reduce
- * variable count and make things faster. -jj
- *
* dst->page_table_lock is held on entry and exit,
- * but may be dropped within pmd_alloc() and pte_alloc_map().
+ * but may be dropped within p[mg]d_alloc() and pte_alloc_map().
*/
+
+static inline void
+copy_swap_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm, pte_t pte)
+{
+ if (pte_file(pte))
+ return;
+ swap_duplicate(pte_to_swp_entry(pte));
+ if (list_empty(&dst_mm->mmlist)) {
+ spin_lock(&mmlist_lock);
+ list_add(&dst_mm->mmlist, &src_mm->mmlist);
+ spin_unlock(&mmlist_lock);
+ }
+}
+
+static inline void
+copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
+ pte_t *dst_pte, pte_t *src_pte, unsigned long vm_flags,
+ unsigned long addr)
+{
+ pte_t pte = *src_pte;
+ struct page *page;
+ unsigned long pfn;
+
+ /* pte contains position in swap, so copy. */
+ if (!pte_present(pte)) {
+ copy_swap_pte(dst_mm, src_mm, pte);
+ set_pte(dst_pte, pte);
+ return;
+ }
+ pfn = pte_pfn(pte);
+ /* the pte points outside of valid memory, the
+ * mapping is assumed to be good, meaningful
+ * and not mapped via rmap - duplicate the
+ * mapping as is.
+ */
+ page = NULL;
+ if (pfn_valid(pfn))
+ page = pfn_to_page(pfn);
+
+ if (!page || PageReserved(page)) {
+ set_pte(dst_pte, pte);
+ return;
+ }
+
+ /*
+ * If it's a COW mapping, write protect it both
+ * in the parent and the child
+ */
+ if ((vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE) {
+ ptep_set_wrprotect(src_pte);
+ pte = *src_pte;
+ }
+
+ /*
+ * If it's a shared mapping, mark it clean in
+ * the child
+ */
+ if (vm_flags & VM_SHARED)
+ pte = pte_mkclean(pte);
+ pte = pte_mkold(pte);
+ get_page(page);
+ dst_mm->rss++;
+ if (PageAnon(page))
+ dst_mm->anon_rss++;
+ set_pte(dst_pte, pte);
+ page_dup_rmap(page);
+}
+
+static int copy_pte_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
+ pmd_t *dst_pmd, pmd_t *src_pmd, struct vm_area_struct *vma,
+ unsigned long addr, unsigned long end)
+{
+ pte_t *src_pte, *dst_pte;
+ pte_t *s, *d;
+ unsigned long vm_flags = vma->vm_flags;
+
+ d = dst_pte = pte_alloc_map(dst_mm, dst_pmd, addr);
+ if (!dst_pte)
+ return -ENOMEM;
+
+ spin_lock(&src_mm->page_table_lock);
+ s = src_pte = pte_offset_map_nested(src_pmd, addr);
+ for (; addr < end; addr += PAGE_SIZE, s++, d++) {
+ if (pte_none(*s))
+ continue;
+ copy_one_pte(dst_mm, src_mm, d, s, vm_flags, addr);
+ }
+ pte_unmap_nested(src_pte);
+ pte_unmap(dst_pte);
+ spin_unlock(&src_mm->page_table_lock);
+ cond_resched_lock(&dst_mm->page_table_lock);
+ return 0;
+}
+
+static int copy_pmd_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
+ pgd_t *dst_pgd, pgd_t *src_pgd, struct vm_area_struct *vma,
+ unsigned long addr, unsigned long end)
+{
+ pmd_t *src_pmd, *dst_pmd;
+ int err = 0;
+ unsigned long next;
+
+ src_pmd = pmd_offset(src_pgd, addr);
+ dst_pmd = pmd_alloc(dst_mm, dst_pgd, addr);
+ if (!dst_pmd)
+ return -ENOMEM;
+
+ for (; addr < end; addr = next, src_pmd++, dst_pmd++) {
+ next = (addr + PMD_SIZE) & PMD_MASK;
+ if (next > end)
+ next = end;
+ if (pmd_none(*src_pmd))
+ continue;
+ if (pmd_bad(*src_pmd)) {
+ pmd_ERROR(*src_pmd);
+ pmd_clear(src_pmd);
+ continue;
+ }
+ err = copy_pte_range(dst_mm, src_mm, dst_pmd, src_pmd,
+ vma, addr, next);
+ if (err)
+ break;
+ }
+ return err;
+}
+
int copy_page_range(struct mm_struct *dst, struct mm_struct *src,
- struct vm_area_struct *vma)
+ struct vm_area_struct *vma)
{
- pgd_t * src_pgd, * dst_pgd;
- unsigned long address = vma->vm_start;
- unsigned long end = vma->vm_end;
- unsigned long cow;
+ pgd_t *src_pgd, *dst_pgd;
+ unsigned long addr, start, end, next;
+ int err = 0;
if (is_vm_hugetlb_page(vma))
return copy_hugetlb_page_range(dst, src, vma);
- cow = (vma->vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
- src_pgd = pgd_offset(src, address)-1;
- dst_pgd = pgd_offset(dst, address)-1;
-
- for (;;) {
- pmd_t * src_pmd, * dst_pmd;
-
- src_pgd++; dst_pgd++;
-
- /* copy_pmd_range */
-
+ start = vma->vm_start;
+ src_pgd = pgd_offset(src, start);
+ dst_pgd = pgd_offset(dst, start);
+
+ end = vma->vm_end;
+ addr = start;
+ while (addr && (addr < end-1)) {
+ next = (addr + PGDIR_SIZE) & PGDIR_MASK;
+ if (next > end || next <= addr)
+ next = end;
if (pgd_none(*src_pgd))
- goto skip_copy_pmd_range;
- if (unlikely(pgd_bad(*src_pgd))) {
+ continue;
+ if (pgd_bad(*src_pgd)) {
pgd_ERROR(*src_pgd);
pgd_clear(src_pgd);
-skip_copy_pmd_range: address = (address + PGDIR_SIZE) & PGDIR_MASK;
- if (!address || (address >= end))
- goto out;
continue;
}
+ err = copy_pmd_range(dst, src, dst_pgd, src_pgd,
+ vma, addr, next);
+ if (err)
+ break;
- src_pmd = pmd_offset(src_pgd, address);
- dst_pmd = pmd_alloc(dst, dst_pgd, address);
- if (!dst_pmd)
- goto nomem;
-
- do {
- pte_t * src_pte, * dst_pte;
-
- /* copy_pte_range */
-
- if (pmd_none(*src_pmd))
- goto skip_copy_pte_range;
- if (unlikely(pmd_bad(*src_pmd))) {
- pmd_ERROR(*src_pmd);
- pmd_clear(src_pmd);
-skip_copy_pte_range:
- address = (address + PMD_SIZE) & PMD_MASK;
- if (address >= end)
- goto out;
- goto cont_copy_pmd_range;
- }
-
- dst_pte = pte_alloc_map(dst, dst_pmd, address);
- if (!dst_pte)
- goto nomem;
- spin_lock(&src->page_table_lock);
- src_pte = pte_offset_map_nested(src_pmd, address);
- do {
- pte_t pte = *src_pte;
- struct page *page;
- unsigned long pfn;
-
- /* copy_one_pte */
-
- if (pte_none(pte))
- goto cont_copy_pte_range_noset;
- /* pte contains position in swap, so copy. */
- if (!pte_present(pte)) {
- if (!pte_file(pte)) {
- swap_duplicate(pte_to_swp_entry(pte));
- if (list_empty(&dst->mmlist)) {
- spin_lock(&mmlist_lock);
- list_add(&dst->mmlist,
- &src->mmlist);
- spin_unlock(&mmlist_lock);
- }
- }
- set_pte(dst_pte, pte);
- goto cont_copy_pte_range_noset;
- }
- pfn = pte_pfn(pte);
- /* the pte points outside of valid memory, the
- * mapping is assumed to be good, meaningful
- * and not mapped via rmap - duplicate the
- * mapping as is.
- */
- page = NULL;
- if (pfn_valid(pfn))
- page = pfn_to_page(pfn);
-
- if (!page || PageReserved(page)) {
- set_pte(dst_pte, pte);
- goto cont_copy_pte_range_noset;
- }
-
- /*
- * If it's a COW mapping, write protect it both
- * in the parent and the child
- */
- if (cow) {
- ptep_set_wrprotect(src_pte);
- pte = *src_pte;
- }
-
- /*
- * If it's a shared mapping, mark it clean in
- * the child
- */
- if (vma->vm_flags & VM_SHARED)
- pte = pte_mkclean(pte);
- pte = pte_mkold(pte);
- get_page(page);
- dst->rss++;
- if (PageAnon(page))
- dst->anon_rss++;
- set_pte(dst_pte, pte);
- page_dup_rmap(page);
-cont_copy_pte_range_noset:
- address += PAGE_SIZE;
- if (address >= end) {
- pte_unmap_nested(src_pte);
- pte_unmap(dst_pte);
- goto out_unlock;
- }
- src_pte++;
- dst_pte++;
- } while ((unsigned long)src_pte & PTE_TABLE_MASK);
- pte_unmap_nested(src_pte-1);
- pte_unmap(dst_pte-1);
- spin_unlock(&src->page_table_lock);
- cond_resched_lock(&dst->page_table_lock);
-cont_copy_pmd_range:
- src_pmd++;
- dst_pmd++;
- } while ((unsigned long)src_pmd & PMD_TABLE_MASK);
+ src_pgd++;
+ dst_pgd++;
+ addr = next;
}
-out_unlock:
- spin_unlock(&src->page_table_lock);
-out:
- return 0;
-nomem:
- return -ENOMEM;
+
+ return err;
}
static void zap_pte_range(struct mmu_gather *tlb,
_
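The recurring idiom in these walkers is next = (addr + PMD_SIZE) &
PMD_MASK, which rounds addr up to the next pmd boundary before clamping
to end. A standalone userspace sketch (made-up addresses, using the
i386 PAE value PMD_SHIFT = 21, i.e. 2MB pmds) shows the ranges it
generates:

#include <stdio.h>

#define PMD_SHIFT       21                      /* i386 PAE value */
#define PMD_SIZE        (1UL << PMD_SHIFT)
#define PMD_MASK        (~(PMD_SIZE-1))

int main(void)
{
        unsigned long addr = 0x00345000UL, end = 0x00a00000UL, next;

        for (; addr < end; addr = next) {
                next = (addr + PMD_SIZE) & PMD_MASK;    /* next 2MB boundary */
                if (next > end)
                        next = end;
                /* prints [0x345000,0x400000) [0x400000,0x600000) ... */
                printf("[%#lx,%#lx)\n", addr, next);
        }
        return 0;
}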
* [PATCH 5/11] replace clear_page_tables with clear_page_range
2004-12-22 9:54 ` [PATCH 4/11] split copy_page_range Nick Piggin
@ 2004-12-22 9:55 ` Nick Piggin
2004-12-22 9:56 ` [PATCH 6/11] introduce 4-level nopud folding header Nick Piggin
0 siblings, 1 reply; 13+ messages in thread
From: Nick Piggin @ 2004-12-22 9:55 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, Andi Kleen, Hugh Dickins, Linux Memory Management
[-- Attachment #1: Type: text/plain, Size: 5 bytes --]
5/11
[-- Attachment #2: 3level-clear_page_range.patch --]
[-- Type: text/plain, Size: 8852 bytes --]
Rename clear_page_tables to clear_page_range. clear_page_range takes byte
ranges, and aggressively frees page table pages. This may be useful for
controlling page table memory consumption on 4-level architectures (and
even 3-level ones).
Possible downsides are:
- flush_tlb_pgtables gets called more often (only a problem for sparc64
AFAIKS).
- the opportunistic "expand to fill PGDIR_SIZE hole" logic, which under
the old system ensured something actually got done, is still in place.
This could sometimes make unmapping small regions less efficient. There
are some other solutions to look at if that turns out to be the case.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
---
linux-2.6-npiggin/arch/i386/mm/pgtable.c | 2
linux-2.6-npiggin/arch/ia64/mm/hugetlbpage.c | 15 -----
linux-2.6-npiggin/include/linux/mm.h | 2
linux-2.6-npiggin/mm/memory.c | 80 ++++++++++++++++-----------
linux-2.6-npiggin/mm/mmap.c | 24 +++-----
5 files changed, 63 insertions(+), 60 deletions(-)
diff -puN include/linux/mm.h~3level-clear_page_range include/linux/mm.h
--- linux-2.6/include/linux/mm.h~3level-clear_page_range 2004-12-22 20:31:45.000000000 +1100
+++ linux-2.6-npiggin/include/linux/mm.h 2004-12-22 20:35:56.000000000 +1100
@@ -566,7 +566,7 @@ int unmap_vmas(struct mmu_gather **tlbp,
struct vm_area_struct *start_vma, unsigned long start_addr,
unsigned long end_addr, unsigned long *nr_accounted,
struct zap_details *);
-void clear_page_tables(struct mmu_gather *tlb, unsigned long first, int nr);
+void clear_page_range(struct mmu_gather *tlb, unsigned long addr, unsigned long end);
int copy_page_range(struct mm_struct *dst, struct mm_struct *src,
struct vm_area_struct *vma);
int zeromap_page_range(struct vm_area_struct *vma, unsigned long from,
diff -puN mm/memory.c~3level-clear_page_range mm/memory.c
--- linux-2.6/mm/memory.c~3level-clear_page_range 2004-12-22 20:31:45.000000000 +1100
+++ linux-2.6-npiggin/mm/memory.c 2004-12-22 20:35:56.000000000 +1100
@@ -100,58 +100,76 @@ static inline void copy_cow_page(struct
* Note: this doesn't free the actual pages themselves. That
* has been handled earlier when unmapping all the memory regions.
*/
-static inline void free_one_pmd(struct mmu_gather *tlb, pmd_t * dir)
+static inline void clear_pmd_range(struct mmu_gather *tlb, pmd_t *pmd, unsigned long start, unsigned long end)
{
struct page *page;
- if (pmd_none(*dir))
+ if (pmd_none(*pmd))
return;
- if (unlikely(pmd_bad(*dir))) {
- pmd_ERROR(*dir);
- pmd_clear(dir);
+ if (unlikely(pmd_bad(*pmd))) {
+ pmd_ERROR(*pmd);
+ pmd_clear(pmd);
return;
}
- page = pmd_page(*dir);
- pmd_clear(dir);
- dec_page_state(nr_page_table_pages);
- tlb->mm->nr_ptes--;
- pte_free_tlb(tlb, page);
+ if (!(start & ~PMD_MASK) && !(end & ~PMD_MASK)) {
+ page = pmd_page(*pmd);
+ pmd_clear(pmd);
+ dec_page_state(nr_page_table_pages);
+ tlb->mm->nr_ptes--;
+ pte_free_tlb(tlb, page);
+ }
}
-static inline void free_one_pgd(struct mmu_gather *tlb, pgd_t * dir)
+static inline void clear_pgd_range(struct mmu_gather *tlb, pgd_t *pgd, unsigned long start, unsigned long end)
{
- int j;
- pmd_t * pmd;
+ unsigned long addr = start, next;
+ pmd_t *pmd, *__pmd;
- if (pgd_none(*dir))
+ if (pgd_none(*pgd))
return;
- if (unlikely(pgd_bad(*dir))) {
- pgd_ERROR(*dir);
- pgd_clear(dir);
+ if (unlikely(pgd_bad(*pgd))) {
+ pgd_ERROR(*pgd);
+ pgd_clear(pgd);
return;
}
- pmd = pmd_offset(dir, 0);
- pgd_clear(dir);
- for (j = 0; j < PTRS_PER_PMD ; j++)
- free_one_pmd(tlb, pmd+j);
- pmd_free_tlb(tlb, pmd);
+
+ pmd = __pmd = pmd_offset(pgd, start);
+ do {
+ next = (addr + PMD_SIZE) & PMD_MASK;
+ if (next > end || next <= addr)
+ next = end;
+
+ clear_pmd_range(tlb, pmd, addr, next);
+ pmd++;
+ addr = next;
+ } while (addr && (addr <= end - 1));
+
+ if (!(start & ~PGDIR_MASK) && !(end & ~PGDIR_MASK)) {
+ pgd_clear(pgd);
+ pmd_free_tlb(tlb, __pmd);
+ }
}
/*
- * This function clears all user-level page tables of a process - this
- * is needed by execve(), so that old pages aren't in the way.
+ * This function clears user-level page tables of a process.
*
* Must be called with pagetable lock held.
*/
-void clear_page_tables(struct mmu_gather *tlb, unsigned long first, int nr)
+void clear_page_range(struct mmu_gather *tlb, unsigned long start, unsigned long end)
{
- pgd_t * page_dir = tlb->mm->pgd;
+ unsigned long addr = start, next;
+ unsigned long i, nr = pgd_index(end + PGDIR_SIZE-1) - pgd_index(start);
+ pgd_t * pgd = pgd_offset(tlb->mm, start);
- page_dir += first;
- do {
- free_one_pgd(tlb, page_dir);
- page_dir++;
- } while (--nr);
+ for (i = 0; i < nr; i++) {
+ next = (addr + PGDIR_SIZE) & PGDIR_MASK;
+ if (next > end || next <= addr)
+ next = end;
+
+ clear_pgd_range(tlb, pgd, addr, next);
+ pgd++;
+ addr = next;
+ }
}
pte_t fastcall * pte_alloc_map(struct mm_struct *mm, pmd_t *pmd, unsigned long address)
diff -puN mm/mmap.c~3level-clear_page_range mm/mmap.c
--- linux-2.6/mm/mmap.c~3level-clear_page_range 2004-12-22 20:31:45.000000000 +1100
+++ linux-2.6-npiggin/mm/mmap.c 2004-12-22 20:31:45.000000000 +1100
@@ -1474,7 +1474,6 @@ static void free_pgtables(struct mmu_gat
{
unsigned long first = start & PGDIR_MASK;
unsigned long last = end + PGDIR_SIZE - 1;
- unsigned long start_index, end_index;
struct mm_struct *mm = tlb->mm;
if (!prev) {
@@ -1499,23 +1498,18 @@ static void free_pgtables(struct mmu_gat
last = next->vm_start;
}
if (prev->vm_end > first)
- first = prev->vm_end + PGDIR_SIZE - 1;
+ first = prev->vm_end;
break;
}
no_mmaps:
if (last < first) /* for arches with discontiguous pgd indices */
return;
- /*
- * If the PGD bits are not consecutive in the virtual address, the
- * old method of shifting the VA >> by PGDIR_SHIFT doesn't work.
- */
- start_index = pgd_index(first);
- if (start_index < FIRST_USER_PGD_NR)
- start_index = FIRST_USER_PGD_NR;
- end_index = pgd_index(last);
- if (end_index > start_index) {
- clear_page_tables(tlb, start_index, end_index - start_index);
- flush_tlb_pgtables(mm, first & PGDIR_MASK, last & PGDIR_MASK);
+ if (first < FIRST_USER_PGD_NR * PGDIR_SIZE)
+ first = FIRST_USER_PGD_NR * PGDIR_SIZE;
+ /* No point trying to free anything if we're in the same pte page */
+ if ((first & PMD_MASK) < (last & PMD_MASK)) {
+ clear_page_range(tlb, first, last);
+ flush_tlb_pgtables(mm, first, last);
}
}
@@ -1844,7 +1838,9 @@ void exit_mmap(struct mm_struct *mm)
~0UL, &nr_accounted, NULL);
vm_unacct_memory(nr_accounted);
BUG_ON(mm->map_count); /* This is just debugging */
- clear_page_tables(tlb, FIRST_USER_PGD_NR, USER_PTRS_PER_PGD);
+ clear_page_range(tlb, FIRST_USER_PGD_NR * PGDIR_SIZE,
+ (TASK_SIZE + PGDIR_SIZE - 1) & PGDIR_MASK);
+
tlb_finish_mmu(tlb, 0, MM_VM_SIZE(mm));
vma = mm->mmap;
diff -puN arch/i386/mm/pgtable.c~3level-clear_page_range arch/i386/mm/pgtable.c
--- linux-2.6/arch/i386/mm/pgtable.c~3level-clear_page_range 2004-12-22 20:31:45.000000000 +1100
+++ linux-2.6-npiggin/arch/i386/mm/pgtable.c 2004-12-22 20:35:54.000000000 +1100
@@ -252,6 +252,6 @@ void pgd_free(pgd_t *pgd)
if (PTRS_PER_PMD > 1)
for (i = 0; i < USER_PTRS_PER_PGD; ++i)
kmem_cache_free(pmd_cache, (void *)__va(pgd_val(pgd[i])-1));
- /* in the non-PAE case, clear_page_tables() clears user pgd entries */
+ /* in the non-PAE case, clear_page_range() clears user pgd entries */
kmem_cache_free(pgd_cache, pgd);
}
diff -puN arch/ia64/mm/hugetlbpage.c~3level-clear_page_range arch/ia64/mm/hugetlbpage.c
--- linux-2.6/arch/ia64/mm/hugetlbpage.c~3level-clear_page_range 2004-12-22 20:31:45.000000000 +1100
+++ linux-2.6-npiggin/arch/ia64/mm/hugetlbpage.c 2004-12-22 20:35:53.000000000 +1100
@@ -187,7 +187,6 @@ void hugetlb_free_pgtables(struct mmu_ga
{
unsigned long first = start & HUGETLB_PGDIR_MASK;
unsigned long last = end + HUGETLB_PGDIR_SIZE - 1;
- unsigned long start_index, end_index;
struct mm_struct *mm = tlb->mm;
if (!prev) {
@@ -212,23 +211,13 @@ void hugetlb_free_pgtables(struct mmu_ga
last = next->vm_start;
}
if (prev->vm_end > first)
- first = prev->vm_end + HUGETLB_PGDIR_SIZE - 1;
+ first = prev->vm_end;
break;
}
no_mmaps:
if (last < first) /* for arches with discontiguous pgd indices */
return;
- /*
- * If the PGD bits are not consecutive in the virtual address, the
- * old method of shifting the VA >> by PGDIR_SHIFT doesn't work.
- */
-
- start_index = pgd_index(htlbpage_to_page(first));
- end_index = pgd_index(htlbpage_to_page(last));
-
- if (end_index > start_index) {
- clear_page_tables(tlb, start_index, end_index - start_index);
- }
+ clear_page_range(tlb, first, last);
}
void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start, unsigned long end)
_
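The test that decides whether a lower-level page table can be freed is
!(start & ~PMD_MASK) && !(end & ~PMD_MASK): only when both ends of the
cleared range sit exactly on pmd boundaries is the pte page underneath
guaranteed to be fully covered. A standalone sketch of that condition
(can_free_pte_page and the addresses are made up; 2MB pmds assumed):

#include <stdio.h>

#define PMD_SHIFT       21
#define PMD_SIZE        (1UL << PMD_SHIFT)
#define PMD_MASK        (~(PMD_SIZE-1))

/* the same condition clear_pmd_range() checks before pte_free_tlb() */
static int can_free_pte_page(unsigned long start, unsigned long end)
{
        return !(start & ~PMD_MASK) && !(end & ~PMD_MASK);
}

int main(void)
{
        /* the cleared range covers the whole 2MB span: the pte page can go */
        printf("%d\n", can_free_pte_page(0x00400000UL, 0x00600000UL)); /* 1 */
        /* the range stops short of the boundary: the pte page must stay */
        printf("%d\n", can_free_pte_page(0x00400000UL, 0x005ff000UL)); /* 0 */
        return 0;
}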
* [PATCH 6/11] introduce 4-level nopud folding header
2004-12-22 9:55 ` [PATCH 5/11] replace clear_page_tables with clear_page_range Nick Piggin
@ 2004-12-22 9:56 ` Nick Piggin
2004-12-22 9:57 ` [PATCH 7/11] convert Linux to 4-level page tables Nick Piggin
0 siblings, 1 reply; 13+ messages in thread
From: Nick Piggin @ 2004-12-22 9:56 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, Andi Kleen, Hugh Dickins, Linux Memory Management
[-- Attachment #1: Type: text/plain, Size: 5 bytes --]
6/11
[-- Attachment #2: 4level-compat.patch --]
[-- Type: text/plain, Size: 6138 bytes --]
Generic headers to fold the 4-level pagetable into 3 levels.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
---
linux-2.6-npiggin/include/asm-generic/pgtable-nopmd.h | 45 +++++++-------
linux-2.6-npiggin/include/asm-generic/pgtable-nopud.h | 56 ++++++++++++++++++
linux-2.6-npiggin/include/asm-generic/tlb.h | 6 +
3 files changed, 85 insertions(+), 22 deletions(-)
diff -puN /dev/null include/asm-generic/pgtable-nopud.h
--- /dev/null 2004-09-06 19:38:39.000000000 +1000
+++ linux-2.6-npiggin/include/asm-generic/pgtable-nopud.h 2004-12-22 20:31:45.000000000 +1100
@@ -0,0 +1,56 @@
+#ifndef _PGTABLE_NOPUD_H
+#define _PGTABLE_NOPUD_H
+
+#ifndef __ASSEMBLY__
+
+/*
+ * Having the pud type consist of a pgd gets the size right, and allows
+ * us to conceptually access the pgd entry that this pud is folded into
+ * without casting.
+ */
+typedef struct { pgd_t pgd; } pud_t;
+
+#define PUD_SHIFT PGDIR_SHIFT
+#define PTRS_PER_PUD 1
+#define PUD_SIZE (1UL << PUD_SHIFT)
+#define PUD_MASK (~(PUD_SIZE-1))
+
+/*
+ * The "pgd_xxx()" functions here are trivial for a folded two-level
+ * setup: the pud is never bad, and a pud always exists (as it's folded
+ * into the pgd entry)
+ */
+static inline int pgd_none(pgd_t pgd) { return 0; }
+static inline int pgd_bad(pgd_t pgd) { return 0; }
+static inline int pgd_present(pgd_t pgd) { return 1; }
+static inline void pgd_clear(pgd_t *pgd) { }
+#define pud_ERROR(pud) (pgd_ERROR((pud).pgd))
+
+#define pgd_populate(mm, pgd, pud) do { } while (0)
+/*
+ * (puds are folded into pgds so this doesn't get actually called,
+ * but the define is needed for a generic inline function.)
+ */
+#define set_pgd(pgdptr, pgdval) set_pud((pud_t *)(pgdptr), (pud_t) { pgdval })
+
+static inline pud_t * pud_offset(pgd_t * pgd, unsigned long address)
+{
+ return (pud_t *)pgd;
+}
+
+#define pud_val(x) (pgd_val((x).pgd))
+#define __pud(x) ((pud_t) { __pgd(x) } )
+
+#define pgd_page(pgd) (pud_page((pud_t){ pgd }))
+#define pgd_page_kernel(pgd) (pud_page_kernel((pud_t){ pgd }))
+
+/*
+ * allocating and freeing a pud is trivial: the 1-entry pud is
+ * inside the pgd, so has no extra memory associated with it.
+ */
+#define pud_alloc_one(mm, address) NULL
+#define pud_free(x) do { } while (0)
+#define __pud_free_tlb(tlb, x) do { } while (0)
+
+#endif /* __ASSEMBLY__ */
+#endif /* _PGTABLE_NOPUD_H */
diff -puN include/asm-generic/pgtable-nopmd.h~4level-compat include/asm-generic/pgtable-nopmd.h
--- linux-2.6/include/asm-generic/pgtable-nopmd.h~4level-compat 2004-12-22 20:31:45.000000000 +1100
+++ linux-2.6-npiggin/include/asm-generic/pgtable-nopmd.h 2004-12-22 20:31:45.000000000 +1100
@@ -3,52 +3,53 @@
#ifndef __ASSEMBLY__
+#include <asm-generic/pgtable-nopud.h>
+
/*
- * Having the pmd type consist of a pgd gets the size right, and allows
- * us to conceptually access the pgd entry that this pmd is folded into
+ * Having the pmd type consist of a pud gets the size right, and allows
+ * us to conceptually access the pud entry that this pmd is folded into
* without casting.
*/
-typedef struct { pgd_t pgd; } pmd_t;
+typedef struct { pud_t pud; } pmd_t;
-#define PMD_SHIFT PGDIR_SHIFT
+#define PMD_SHIFT PUD_SHIFT
#define PTRS_PER_PMD 1
#define PMD_SIZE (1UL << PMD_SHIFT)
#define PMD_MASK (~(PMD_SIZE-1))
/*
- * The "pgd_xxx()" functions here are trivial for a folded two-level
+ * The "pud_xxx()" functions here are trivial for a folded two-level
* setup: the pmd is never bad, and a pmd always exists (as it's folded
- * into the pgd entry)
+ * into the pud entry)
*/
-static inline int pgd_none(pgd_t pgd) { return 0; }
-static inline int pgd_bad(pgd_t pgd) { return 0; }
-static inline int pgd_present(pgd_t pgd) { return 1; }
-static inline void pgd_clear(pgd_t *pgd) { }
-#define pmd_ERROR(pmd) (pgd_ERROR((pmd).pgd))
+static inline int pud_none(pud_t pud) { return 0; }
+static inline int pud_bad(pud_t pud) { return 0; }
+static inline int pud_present(pud_t pud) { return 1; }
+static inline void pud_clear(pud_t *pud) { }
+#define pmd_ERROR(pmd) (pud_ERROR((pmd).pud))
-#define pgd_populate(mm, pmd, pte) do { } while (0)
-#define pgd_populate_kernel(mm, pmd, pte) do { } while (0)
+#define pud_populate(mm, pmd, pte) do { } while (0)
/*
- * (pmds are folded into pgds so this doesn't get actually called,
+ * (pmds are folded into puds so this doesn't get actually called,
* but the define is needed for a generic inline function.)
*/
-#define set_pgd(pgdptr, pgdval) set_pmd((pmd_t *)(pgdptr), (pmd_t) { pgdval })
+#define set_pud(pudptr, pudval) set_pmd((pmd_t *)(pudptr), (pmd_t) { pudval })
-static inline pmd_t * pmd_offset(pgd_t * pgd, unsigned long address)
+static inline pmd_t * pmd_offset(pud_t * pud, unsigned long address)
{
- return (pmd_t *)pgd;
+ return (pmd_t *)pud;
}
-#define pmd_val(x) (pgd_val((x).pgd))
-#define __pmd(x) ((pmd_t) { __pgd(x) } )
+#define pmd_val(x) (pud_val((x).pud))
+#define __pmd(x) ((pmd_t) { __pud(x) } )
-#define pgd_page(pgd) (pmd_page((pmd_t){ pgd }))
-#define pgd_page_kernel(pgd) (pmd_page_kernel((pmd_t){ pgd }))
+#define pud_page(pud) (pmd_page((pmd_t){ pud }))
+#define pud_page_kernel(pud) (pmd_page_kernel((pmd_t){ pud }))
/*
* allocating and freeing a pmd is trivial: the 1-entry pmd is
- * inside the pgd, so has no extra memory associated with it.
+ * inside the pud, so has no extra memory associated with it.
*/
#define pmd_alloc_one(mm, address) NULL
#define pmd_free(x) do { } while (0)
diff -puN include/asm-generic/tlb.h~4level-compat include/asm-generic/tlb.h
--- linux-2.6/include/asm-generic/tlb.h~4level-compat 2004-12-22 20:31:45.000000000 +1100
+++ linux-2.6-npiggin/include/asm-generic/tlb.h 2004-12-22 20:35:55.000000000 +1100
@@ -141,6 +141,12 @@ static inline void tlb_remove_page(struc
__pte_free_tlb(tlb, ptep); \
} while (0)
+#define pud_free_tlb(tlb, pudp) \
+ do { \
+ tlb->need_flush = 1; \
+ __pud_free_tlb(tlb, pudp); \
+ } while (0)
+
#define pmd_free_tlb(tlb, pmdp) \
do { \
tlb->need_flush = 1; \
_
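Note how the folded types nest: a fully folded pmd_t is
struct { pud_t pud; } and pud_t is struct { pgd_t pgd; }, so all three
types have the same size and a folded entry can be viewed at any level
without casting. A standalone sketch (with a simplified stand-in for
the arch's pgd_t):

#include <assert.h>

typedef struct { unsigned long pgd; } pgd_t;    /* stand-in for the arch type */
typedef struct { pgd_t pgd; } pud_t;            /* as in pgtable-nopud.h */
typedef struct { pud_t pud; } pmd_t;            /* as in pgtable-nopmd.h */

int main(void)
{
        pmd_t pmd = { { { 0x1000 } } };

        /* one word of memory, viewable at any level without casting */
        assert(sizeof(pmd_t) == sizeof(pgd_t));
        assert(pmd.pud.pgd.pgd == 0x1000);
        return 0;
}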
* [PATCH 7/11] convert Linux to 4-level page tables
2004-12-22 9:56 ` [PATCH 6/11] introduce 4-level nopud folding header Nick Piggin
@ 2004-12-22 9:57 ` Nick Piggin
2004-12-22 9:59 ` [PATCH 8/11] introduce fallback header Nick Piggin
0 siblings, 1 reply; 13+ messages in thread
From: Nick Piggin @ 2004-12-22 9:57 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, Andi Kleen, Hugh Dickins, Linux Memory Management
[-- Attachment #1: Type: text/plain, Size: 5 bytes --]
7/11
[-- Attachment #2: 4level-core-patch.patch --]
[-- Type: text/plain, Size: 43986 bytes --]
Extend the Linux MM to 4-level page tables.
This is the core patch for mm/*, fs/* and include/linux/*. It breaks
all architectures, which will be fixed in separate patches.
The conversion is quite straightforward. All the functions walking the
page table hierarchy have been changed to deal with another level at
the top. The additional level is called pml4.
mm/memory.c has changed a lot because it did most of the heavy lifting
here. Most of the changes here are extensions of the previous code.
Signed-off-by: Andi Kleen <ak@suse.de>
Converted to use the pud_t 'page upper' level between pgd and pmd instead of
Andi's pml4 level above pgd.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
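The conversion pattern for allocation paths, reduced to a sketch
(alloc_to_pte is a made-up name; compare install_arg_page() in the diff
below): every place that did pgd_offset() + pmd_alloc() +
pte_alloc_map() gains one pud_alloc() step in the middle.

static pte_t *alloc_to_pte(struct mm_struct *mm, unsigned long addr)
{
        pgd_t *pgd = pgd_offset(mm, addr);
        pud_t *pud;
        pmd_t *pmd;

        /* caller holds mm->page_table_lock, as before this patch */
        pud = pud_alloc(mm, pgd, addr);         /* cannot fail when folded */
        if (!pud)
                return NULL;
        pmd = pmd_alloc(mm, pud, addr);
        if (!pmd)
                return NULL;
        return pte_alloc_map(mm, pmd, addr);
}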
---
linux-2.6-npiggin/drivers/char/drm/drm_memory.h | 3
linux-2.6-npiggin/fs/exec.c | 6
linux-2.6-npiggin/include/linux/init_task.h | 2
linux-2.6-npiggin/include/linux/mm.h | 20 -
linux-2.6-npiggin/mm/fremap.c | 18 -
linux-2.6-npiggin/mm/memory.c | 408 ++++++++++++++++++------
linux-2.6-npiggin/mm/mempolicy.c | 22 +
linux-2.6-npiggin/mm/mprotect.c | 65 ++-
linux-2.6-npiggin/mm/mremap.c | 29 +
linux-2.6-npiggin/mm/msync.c | 55 ++-
linux-2.6-npiggin/mm/rmap.c | 21 +
linux-2.6-npiggin/mm/swapfile.c | 81 +++-
linux-2.6-npiggin/mm/vmalloc.c | 113 ++++--
13 files changed, 644 insertions(+), 199 deletions(-)
diff -puN fs/exec.c~4level-core-patch fs/exec.c
--- linux-2.6/fs/exec.c~4level-core-patch 2004-12-22 20:31:46.000000000 +1100
+++ linux-2.6-npiggin/fs/exec.c 2004-12-22 20:31:46.000000000 +1100
@@ -300,6 +300,7 @@ void install_arg_page(struct vm_area_str
{
struct mm_struct *mm = vma->vm_mm;
pgd_t * pgd;
+ pud_t * pud;
pmd_t * pmd;
pte_t * pte;
@@ -310,7 +311,10 @@ void install_arg_page(struct vm_area_str
pgd = pgd_offset(mm, address);
spin_lock(&mm->page_table_lock);
- pmd = pmd_alloc(mm, pgd, address);
+ pud = pud_alloc(mm, pgd, address);
+ if (!pud)
+ goto out;
+ pmd = pmd_alloc(mm, pud, address);
if (!pmd)
goto out;
pte = pte_alloc_map(mm, pmd, address);
diff -puN include/linux/init_task.h~4level-core-patch include/linux/init_task.h
--- linux-2.6/include/linux/init_task.h~4level-core-patch 2004-12-22 20:31:46.000000000 +1100
+++ linux-2.6-npiggin/include/linux/init_task.h 2004-12-22 20:31:46.000000000 +1100
@@ -34,7 +34,7 @@
#define INIT_MM(name) \
{ \
.mm_rb = RB_ROOT, \
- .pgd = swapper_pg_dir, \
+ .pgd = swapper_pg_dir, \
.mm_users = ATOMIC_INIT(2), \
.mm_count = ATOMIC_INIT(1), \
.mmap_sem = __RWSEM_INITIALIZER(name.mmap_sem), \
diff -puN include/linux/mm.h~4level-core-patch include/linux/mm.h
--- linux-2.6/include/linux/mm.h~4level-core-patch 2004-12-22 20:31:46.000000000 +1100
+++ linux-2.6-npiggin/include/linux/mm.h 2004-12-22 20:35:55.000000000 +1100
@@ -581,7 +581,8 @@ static inline void unmap_shared_mapping_
}
extern int vmtruncate(struct inode * inode, loff_t offset);
-extern pmd_t *FASTCALL(__pmd_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address));
+extern pud_t *FASTCALL(__pud_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address));
+extern pmd_t *FASTCALL(__pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address));
extern pte_t *FASTCALL(pte_alloc_kernel(struct mm_struct *mm, pmd_t *pmd, unsigned long address));
extern pte_t *FASTCALL(pte_alloc_map(struct mm_struct *mm, pmd_t *pmd, unsigned long address));
extern int install_page(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, struct page *page, pgprot_t prot);
@@ -626,15 +627,22 @@ extern struct shrinker *set_shrinker(int
extern void remove_shrinker(struct shrinker *shrinker);
/*
- * On a two-level page table, this ends up being trivial. Thus the
- * inlining and the symmetry break with pte_alloc_map() that does all
+ * On a two-level or three-level page table, this ends up being trivial. Thus
+ * the inlining and the symmetry break with pte_alloc_map() that does all
* of this out-of-line.
*/
-static inline pmd_t *pmd_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address)
+static inline pud_t *pud_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address)
{
if (pgd_none(*pgd))
- return __pmd_alloc(mm, pgd, address);
- return pmd_offset(pgd, address);
+ return __pud_alloc(mm, pgd, address);
+ return pud_offset(pgd, address);
+}
+
+static inline pmd_t *pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
+{
+ if (pud_none(*pud))
+ return __pmd_alloc(mm, pud, address);
+ return pmd_offset(pud, address);
}
extern void free_area_init(unsigned long * zones_size);
diff -puN include/linux/sched.h~4level-core-patch include/linux/sched.h
diff -puN kernel/fork.c~4level-core-patch kernel/fork.c
diff -puN mm/fremap.c~4level-core-patch mm/fremap.c
--- linux-2.6/mm/fremap.c~4level-core-patch 2004-12-22 20:31:46.000000000 +1100
+++ linux-2.6-npiggin/mm/fremap.c 2004-12-22 20:31:46.000000000 +1100
@@ -60,14 +60,19 @@ int install_page(struct mm_struct *mm, s
pgoff_t size;
int err = -ENOMEM;
pte_t *pte;
- pgd_t *pgd;
pmd_t *pmd;
+ pud_t *pud;
+ pgd_t *pgd;
pte_t pte_val;
pgd = pgd_offset(mm, addr);
spin_lock(&mm->page_table_lock);
+
+ pud = pud_alloc(mm, pgd, addr);
+ if (!pud)
+ goto err_unlock;
- pmd = pmd_alloc(mm, pgd, addr);
+ pmd = pmd_alloc(mm, pud, addr);
if (!pmd)
goto err_unlock;
@@ -112,14 +117,19 @@ int install_file_pte(struct mm_struct *m
{
int err = -ENOMEM;
pte_t *pte;
- pgd_t *pgd;
pmd_t *pmd;
+ pud_t *pud;
+ pgd_t *pgd;
pte_t pte_val;
pgd = pgd_offset(mm, addr);
spin_lock(&mm->page_table_lock);
+
+ pud = pud_alloc(mm, pgd, addr);
+ if (!pud)
+ goto err_unlock;
- pmd = pmd_alloc(mm, pgd, addr);
+ pmd = pmd_alloc(mm, pud, addr);
if (!pmd)
goto err_unlock;
diff -puN mm/memory.c~4level-core-patch mm/memory.c
--- linux-2.6/mm/memory.c~4level-core-patch 2004-12-22 20:31:46.000000000 +1100
+++ linux-2.6-npiggin/mm/memory.c 2004-12-22 20:35:55.000000000 +1100
@@ -34,6 +34,8 @@
*
* 16.07.99 - Support of BIGMEM added by Gerhard Wichert, Siemens AG
* (Gerhard.Wichert@pdb.siemens.de)
+ *
+ * Aug/Sep 2004 Changed to four level page tables (Andi Kleen)
*/
#include <linux/kernel_stat.h>
@@ -120,11 +122,42 @@ static inline void clear_pmd_range(struc
}
}
-static inline void clear_pgd_range(struct mmu_gather *tlb, pgd_t *pgd, unsigned long start, unsigned long end)
+static inline void clear_pud_range(struct mmu_gather *tlb, pud_t *pud, unsigned long start, unsigned long end)
{
unsigned long addr = start, next;
pmd_t *pmd, *__pmd;
+ if (pud_none(*pud))
+ return;
+ if (unlikely(pud_bad(*pud))) {
+ pud_ERROR(*pud);
+ pud_clear(pud);
+ return;
+ }
+
+ pmd = __pmd = pmd_offset(pud, start);
+ do {
+ next = (addr + PMD_SIZE) & PMD_MASK;
+ if (next > end || next <= addr)
+ next = end;
+
+ clear_pmd_range(tlb, pmd, addr, next);
+ pmd++;
+ addr = next;
+ } while (addr && (addr < end));
+
+ if (!(start & ~PUD_MASK) && !(end & ~PUD_MASK)) {
+ pud_clear(pud);
+ pmd_free_tlb(tlb, __pmd);
+ }
+}
+
+
+static inline void clear_pgd_range(struct mmu_gather *tlb, pgd_t *pgd, unsigned long start, unsigned long end)
+{
+ unsigned long addr = start, next;
+ pud_t *pud, *__pud;
+
if (pgd_none(*pgd))
return;
if (unlikely(pgd_bad(*pgd))) {
@@ -133,20 +166,20 @@ static inline void clear_pgd_range(struc
return;
}
- pmd = __pmd = pmd_offset(pgd, start);
+ pud = __pud = pud_offset(pgd, start);
do {
- next = (addr + PMD_SIZE) & PMD_MASK;
+ next = (addr + PUD_SIZE) & PUD_MASK;
if (next > end || next <= addr)
next = end;
- clear_pmd_range(tlb, pmd, addr, next);
- pmd++;
+ clear_pud_range(tlb, pud, addr, next);
+ pud++;
addr = next;
- } while (addr && (addr <= end - 1));
+ } while (addr && (addr < end));
if (!(start & ~PGDIR_MASK) && !(end & ~PGDIR_MASK)) {
pgd_clear(pgd);
- pmd_free_tlb(tlb, __pmd);
+ pud_free_tlb(tlb, __pud);
}
}
@@ -326,15 +359,15 @@ static int copy_pte_range(struct mm_stru
}
static int copy_pmd_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
- pgd_t *dst_pgd, pgd_t *src_pgd, struct vm_area_struct *vma,
+ pud_t *dst_pud, pud_t *src_pud, struct vm_area_struct *vma,
unsigned long addr, unsigned long end)
{
pmd_t *src_pmd, *dst_pmd;
int err = 0;
unsigned long next;
- src_pmd = pmd_offset(src_pgd, addr);
- dst_pmd = pmd_alloc(dst_mm, dst_pgd, addr);
+ src_pmd = pmd_offset(src_pud, addr);
+ dst_pmd = pmd_alloc(dst_mm, dst_pud, addr);
if (!dst_pmd)
return -ENOMEM;
@@ -357,6 +390,38 @@ static int copy_pmd_range(struct mm_stru
return err;
}
+static int copy_pud_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
+ pgd_t *dst_pgd, pgd_t *src_pgd, struct vm_area_struct *vma,
+ unsigned long addr, unsigned long end)
+{
+ pud_t *src_pud, *dst_pud;
+ int err = 0;
+ unsigned long next;
+
+ src_pud = pud_offset(src_pgd, addr);
+ dst_pud = pud_alloc(dst_mm, dst_pgd, addr);
+ if (!dst_pud)
+ return -ENOMEM;
+
+ for (; addr < end; addr = next, src_pud++, dst_pud++) {
+ next = (addr + PUD_SIZE) & PUD_MASK;
+ if (next > end)
+ next = end;
+ if (pud_none(*src_pud))
+ continue;
+ if (pud_bad(*src_pud)) {
+ pud_ERROR(*src_pud);
+ pud_clear(src_pud);
+ continue;
+ }
+ err = copy_pmd_range(dst_mm, src_mm, dst_pud, src_pud,
+ vma, addr, next);
+ if (err)
+ break;
+ }
+ return err;
+}
+
int copy_page_range(struct mm_struct *dst, struct mm_struct *src,
struct vm_area_struct *vma)
{
@@ -384,7 +449,7 @@ int copy_page_range(struct mm_struct *ds
pgd_clear(src_pgd);
continue;
}
- err = copy_pmd_range(dst, src, dst_pgd, src_pgd,
+ err = copy_pud_range(dst, src, dst_pgd, src_pgd,
vma, addr, next);
if (err)
break;
@@ -481,23 +546,23 @@ static void zap_pte_range(struct mmu_gat
}
static void zap_pmd_range(struct mmu_gather *tlb,
- pgd_t * dir, unsigned long address,
+ pud_t *pud, unsigned long address,
unsigned long size, struct zap_details *details)
{
pmd_t * pmd;
unsigned long end;
- if (pgd_none(*dir))
+ if (pud_none(*pud))
return;
- if (unlikely(pgd_bad(*dir))) {
- pgd_ERROR(*dir);
- pgd_clear(dir);
+ if (unlikely(pud_bad(*pud))) {
+ pud_ERROR(*pud);
+ pud_clear(pud);
return;
}
- pmd = pmd_offset(dir, address);
+ pmd = pmd_offset(pud, address);
end = address + size;
- if (end > ((address + PGDIR_SIZE) & PGDIR_MASK))
- end = ((address + PGDIR_SIZE) & PGDIR_MASK);
+ if (end > ((address + PUD_SIZE) & PUD_MASK))
+ end = ((address + PUD_SIZE) & PUD_MASK);
do {
zap_pte_range(tlb, pmd, address, end - address, details);
address = (address + PMD_SIZE) & PMD_MASK;
@@ -505,20 +570,46 @@ static void zap_pmd_range(struct mmu_gat
} while (address && (address < end));
}
+static void zap_pud_range(struct mmu_gather *tlb,
+ pgd_t * pgd, unsigned long address,
+ unsigned long end, struct zap_details *details)
+{
+ pud_t * pud;
+
+ if (pgd_none(*pgd))
+ return;
+ if (unlikely(pgd_bad(*pgd))) {
+ pgd_ERROR(*pgd);
+ pgd_clear(pgd);
+ return;
+ }
+ pud = pud_offset(pgd, address);
+ do {
+ zap_pmd_range(tlb, pud, address, end - address, details);
+ address = (address + PUD_SIZE) & PUD_MASK;
+ pud++;
+ } while (address && (address < end));
+}
+
static void unmap_page_range(struct mmu_gather *tlb,
struct vm_area_struct *vma, unsigned long address,
unsigned long end, struct zap_details *details)
{
- pgd_t * dir;
+ unsigned long next;
+ pgd_t *pgd;
+ int i;
BUG_ON(address >= end);
- dir = pgd_offset(vma->vm_mm, address);
+ pgd = pgd_offset(vma->vm_mm, address);
tlb_start_vma(tlb, vma);
- do {
- zap_pmd_range(tlb, dir, address, end - address, details);
- address = (address + PGDIR_SIZE) & PGDIR_MASK;
- dir++;
- } while (address && (address < end));
+ for (i = pgd_index(address); i <= pgd_index(end-1); i++) {
+ next = (address + PGDIR_SIZE) & PGDIR_MASK;
+ if (next <= address || next > end)
+ next = end;
+ zap_pud_range(tlb, pgd, address, next, details);
+ address = next;
+ pgd++;
+ }
tlb_end_vma(tlb, vma);
}
@@ -660,6 +751,7 @@ struct page *
follow_page(struct mm_struct *mm, unsigned long address, int write)
{
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *ptep, pte;
unsigned long pfn;
@@ -673,13 +765,15 @@ follow_page(struct mm_struct *mm, unsign
if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd)))
goto out;
- pmd = pmd_offset(pgd, address);
- if (pmd_none(*pmd))
+ pud = pud_offset(pgd, address);
+ if (pud_none(*pud) || unlikely(pud_bad(*pud)))
+ goto out;
+
+ pmd = pmd_offset(pud, address);
+ if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
goto out;
if (pmd_huge(*pmd))
return follow_huge_pmd(mm, address, pmd, write);
- if (unlikely(pmd_bad(*pmd)))
- goto out;
ptep = pte_offset_map(pmd, address);
if (!ptep)
@@ -723,6 +817,7 @@ untouched_anonymous_page(struct mm_struc
unsigned long address)
{
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
/* Check if the vma is for an anonymous mapping. */
@@ -734,8 +829,12 @@ untouched_anonymous_page(struct mm_struc
if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd)))
return 1;
+ pud = pud_offset(pgd, address);
+ if (pud_none(*pud) || unlikely(pud_bad(*pud)))
+ return 1;
+
/* Check if page middle directory entry exists. */
- pmd = pmd_offset(pgd, address);
+ pmd = pmd_offset(pud, address);
if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
return 1;
@@ -767,6 +866,7 @@ int get_user_pages(struct task_struct *t
unsigned long pg = start & PAGE_MASK;
struct vm_area_struct *gate_vma = get_gate_vma(tsk);
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte;
if (write) /* user gate pages are read-only */
@@ -776,7 +876,9 @@ int get_user_pages(struct task_struct *t
else
pgd = pgd_offset_gate(mm, pg);
BUG_ON(pgd_none(*pgd));
- pmd = pmd_offset(pgd, pg);
+ pud = pud_offset(pgd, pg);
+ BUG_ON(pud_none(*pud));
+ pmd = pmd_offset(pud, pg);
BUG_ON(pmd_none(*pmd));
pte = pte_offset_map(pmd, pg);
BUG_ON(pte_none(*pte));
@@ -889,16 +991,16 @@ static void zeromap_pte_range(pte_t * pt
} while (address && (address < end));
}
-static inline int zeromap_pmd_range(struct mm_struct *mm, pmd_t * pmd, unsigned long address,
- unsigned long size, pgprot_t prot)
+static inline int zeromap_pmd_range(struct mm_struct *mm, pmd_t * pmd,
+ unsigned long address, unsigned long size, pgprot_t prot)
{
unsigned long base, end;
- base = address & PGDIR_MASK;
- address &= ~PGDIR_MASK;
+ base = address & PUD_MASK;
+ address &= ~PUD_MASK;
end = address + size;
- if (end > PGDIR_SIZE)
- end = PGDIR_SIZE;
+ if (end > PUD_SIZE)
+ end = PUD_SIZE;
do {
pte_t * pte = pte_alloc_map(mm, pmd, base + address);
if (!pte)
@@ -911,31 +1013,64 @@ static inline int zeromap_pmd_range(stru
return 0;
}
-int zeromap_page_range(struct vm_area_struct *vma, unsigned long address, unsigned long size, pgprot_t prot)
+static inline int zeromap_pud_range(struct mm_struct *mm, pud_t * pud,
+ unsigned long address,
+ unsigned long size, pgprot_t prot)
+{
+ unsigned long base, end;
+ int error = 0;
+
+ base = address & PGDIR_MASK;
+ address &= ~PGDIR_MASK;
+ end = address + size;
+ if (end > PGDIR_SIZE)
+ end = PGDIR_SIZE;
+ do {
+ pmd_t * pmd = pmd_alloc(mm, pud, base + address);
+ error = -ENOMEM;
+ if (!pmd)
+ break;
+ error = zeromap_pmd_range(mm, pmd, address, end - address, prot);
+ if (error)
+ break;
+ address = (address + PUD_SIZE) & PUD_MASK;
+ pud++;
+ } while (address && (address < end));
+ return 0;
+}
+
+int zeromap_page_range(struct vm_area_struct *vma, unsigned long address,
+ unsigned long size, pgprot_t prot)
{
+ int i;
int error = 0;
- pgd_t * dir;
+ pgd_t * pgd;
unsigned long beg = address;
unsigned long end = address + size;
+ unsigned long next;
struct mm_struct *mm = vma->vm_mm;
- dir = pgd_offset(mm, address);
+ pgd = pgd_offset(mm, address);
flush_cache_range(vma, beg, end);
- if (address >= end)
- BUG();
+ BUG_ON(address >= end);
+ BUG_ON(end > vma->vm_end);
spin_lock(&mm->page_table_lock);
- do {
- pmd_t *pmd = pmd_alloc(mm, dir, address);
+ for (i = pgd_index(address); i <= pgd_index(end-1); i++) {
+ pud_t *pud = pud_alloc(mm, pgd, address);
error = -ENOMEM;
- if (!pmd)
+ if (!pud)
break;
- error = zeromap_pmd_range(mm, pmd, address, end - address, prot);
+ next = (address + PGDIR_SIZE) & PGDIR_MASK;
+ if (next <= beg || next > end)
+ next = end;
+ error = zeromap_pud_range(mm, pud, address,
+ next - address, prot);
if (error)
break;
- address = (address + PGDIR_SIZE) & PGDIR_MASK;
- dir++;
- } while (address && (address < end));
+ address = next;
+ pgd++;
+ }
/*
* Why flush? zeromap_pte_range has a BUG_ON for !pte_none()
*/
@@ -949,8 +1084,9 @@ int zeromap_page_range(struct vm_area_st
* mappings are removed. any references to nonexistent pages results
* in null mappings (currently treated as "copy-on-access")
*/
-static inline void remap_pte_range(pte_t * pte, unsigned long address, unsigned long size,
- unsigned long pfn, pgprot_t prot)
+static inline void
+remap_pte_range(pte_t * pte, unsigned long address, unsigned long size,
+ unsigned long pfn, pgprot_t prot)
{
unsigned long end;
@@ -968,22 +1104,24 @@ static inline void remap_pte_range(pte_t
} while (address && (address < end));
}
-static inline int remap_pmd_range(struct mm_struct *mm, pmd_t * pmd, unsigned long address, unsigned long size,
- unsigned long pfn, pgprot_t prot)
+static inline int
+remap_pmd_range(struct mm_struct *mm, pmd_t * pmd, unsigned long address,
+ unsigned long size, unsigned long pfn, pgprot_t prot)
{
unsigned long base, end;
- base = address & PGDIR_MASK;
- address &= ~PGDIR_MASK;
+ base = address & PUD_MASK;
+ address &= ~PUD_MASK;
end = address + size;
- if (end > PGDIR_SIZE)
- end = PGDIR_SIZE;
- pfn -= address >> PAGE_SHIFT;
+ if (end > PUD_SIZE)
+ end = PUD_SIZE;
+ pfn -= (address >> PAGE_SHIFT);
do {
pte_t * pte = pte_alloc_map(mm, pmd, base + address);
if (!pte)
return -ENOMEM;
- remap_pte_range(pte, base + address, end - address, pfn + (address >> PAGE_SHIFT), prot);
+ remap_pte_range(pte, base + address, end - address,
+ (address >> PAGE_SHIFT) + pfn, prot);
pte_unmap(pte);
address = (address + PMD_SIZE) & PMD_MASK;
pmd++;
@@ -991,20 +1129,50 @@ static inline int remap_pmd_range(struct
return 0;
}
+static inline int remap_pud_range(struct mm_struct *mm, pud_t * pud,
+ unsigned long address, unsigned long size,
+ unsigned long pfn, pgprot_t prot)
+{
+ unsigned long base, end;
+ int error;
+
+ base = address & PGDIR_MASK;
+ address &= ~PGDIR_MASK;
+ end = address + size;
+ if (end > PGDIR_SIZE)
+ end = PGDIR_SIZE;
+ pfn -= address >> PAGE_SHIFT;
+ do {
+ pmd_t *pmd = pmd_alloc(mm, pud, base+address);
+ error = -ENOMEM;
+ if (!pmd)
+ break;
+ error = remap_pmd_range(mm, pmd, base + address, end - address,
+ (address >> PAGE_SHIFT) + pfn, prot);
+ if (error)
+ break;
+ address = (address + PUD_SIZE) & PUD_MASK;
+ pud++;
+ } while (address && (address < end));
+ return error;
+}
+
/* Note: this is only safe if the mm semaphore is held when called. */
-int remap_pfn_range(struct vm_area_struct *vma, unsigned long from, unsigned long pfn, unsigned long size, pgprot_t prot)
+int remap_pfn_range(struct vm_area_struct *vma, unsigned long from,
+ unsigned long pfn, unsigned long size, pgprot_t prot)
{
int error = 0;
- pgd_t * dir;
+ pgd_t *pgd;
unsigned long beg = from;
unsigned long end = from + size;
+ unsigned long next;
struct mm_struct *mm = vma->vm_mm;
+ int i;
pfn -= from >> PAGE_SHIFT;
- dir = pgd_offset(mm, from);
+ pgd = pgd_offset(mm, from);
flush_cache_range(vma, beg, end);
- if (from >= end)
- BUG();
+ BUG_ON(from >= end);
/*
* Physically remapped pages are special. Tell the
@@ -1015,25 +1183,32 @@ int remap_pfn_range(struct vm_area_struc
* this region.
*/
vma->vm_flags |= VM_IO | VM_RESERVED;
+
spin_lock(&mm->page_table_lock);
- do {
- pmd_t *pmd = pmd_alloc(mm, dir, from);
+ for (i = pgd_index(beg); i <= pgd_index(end-1); i++) {
+ pud_t *pud = pud_alloc(mm, pgd, from);
error = -ENOMEM;
- if (!pmd)
+ if (!pud)
break;
- error = remap_pmd_range(mm, pmd, from, end - from, pfn + (from >> PAGE_SHIFT), prot);
+ next = (from + PGDIR_SIZE) & PGDIR_MASK;
+ if (next > end || next <= from)
+ next = end;
+ error = remap_pud_range(mm, pud, from, end - from,
+ pfn + (from >> PAGE_SHIFT), prot);
if (error)
break;
- from = (from + PGDIR_SIZE) & PGDIR_MASK;
- dir++;
- } while (from && (from < end));
+ from = next;
+ pgd++;
+ }
/*
* Why flush? remap_pte_range has a BUG_ON for !pte_none()
*/
flush_tlb_range(vma, beg, end);
spin_unlock(&mm->page_table_lock);
+
return error;
}
+
EXPORT_SYMBOL(remap_pfn_range);
/*
@@ -1725,13 +1900,14 @@ static inline int handle_pte_fault(struc
* By the time we get here, we already hold the mm semaphore
*/
int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct * vma,
- unsigned long address, int write_access)
+ unsigned long address, int write_access)
{
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
+ pte_t *pte;
__set_current_state(TASK_RUNNING);
- pgd = pgd_offset(mm, address);
inc_page_state(pgfault);
@@ -1742,18 +1918,63 @@ int handle_mm_fault(struct mm_struct *mm
* We need the page table lock to synchronize with kswapd
* and the SMP-safe atomic PTE updates.
*/
+ pgd = pgd_offset(mm, address);
spin_lock(&mm->page_table_lock);
- pmd = pmd_alloc(mm, pgd, address);
- if (pmd) {
- pte_t * pte = pte_alloc_map(mm, pmd, address);
- if (pte)
- return handle_pte_fault(mm, vma, address, write_access, pte, pmd);
- }
+ pud = pud_alloc(mm, pgd, address);
+ if (!pud)
+ goto oom;
+
+ pmd = pmd_alloc(mm, pud, address);
+ if (!pmd)
+ goto oom;
+
+ pte = pte_alloc_map(mm, pmd, address);
+ if (!pte)
+ goto oom;
+
+ return handle_pte_fault(mm, vma, address, write_access, pte, pmd);
+
+ oom:
spin_unlock(&mm->page_table_lock);
return VM_FAULT_OOM;
}
+#if (PTRS_PER_PGD > 1)
+/*
+ * Allocate page upper directory.
+ *
+ * We've already handled the fast-path in-line, and we own the
+ * page table lock.
+ *
+ * On a two-level or three-level page table, this ends up actually being
+ * entirely optimized away.
+ */
+pud_t fastcall *__pud_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address)
+{
+ pud_t *new;
+
+ spin_unlock(&mm->page_table_lock);
+ new = pud_alloc_one(mm, address);
+ spin_lock(&mm->page_table_lock);
+ if (!new)
+ return NULL;
+
+ /*
+ * Because we dropped the lock, we should re-check the
+ * entry, as somebody else could have populated it..
+ */
+ if (pgd_present(*pgd)) {
+ pud_free(new);
+ goto out;
+ }
+ pgd_populate(mm, pgd, new);
+out:
+ return pud_offset(pgd, address);
+}
+#endif
+
+#if (PTRS_PER_PUD > 1)
/*
* Allocate page middle directory.
*
@@ -1763,7 +1984,7 @@ int handle_mm_fault(struct mm_struct *mm
* On a two-level page table, this ends up actually being entirely
* optimized away.
*/
-pmd_t fastcall *__pmd_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address)
+pmd_t fastcall *__pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
{
pmd_t *new;
@@ -1777,14 +1998,15 @@ pmd_t fastcall *__pmd_alloc(struct mm_st
* Because we dropped the lock, we should re-check the
* entry, as somebody else could have populated it..
*/
- if (pgd_present(*pgd)) {
+ if (pud_present(*pud)) {
pmd_free(new);
goto out;
}
- pgd_populate(mm, pgd, new);
+ pud_populate(mm, pud, new);
out:
- return pmd_offset(pgd, address);
+ return pmd_offset(pud, address);
}
+#endif
int make_pages_present(unsigned long addr, unsigned long end)
{
@@ -1815,17 +2037,21 @@ struct page * vmalloc_to_page(void * vma
unsigned long addr = (unsigned long) vmalloc_addr;
struct page *page = NULL;
pgd_t *pgd = pgd_offset_k(addr);
+ pud_t *pud;
pmd_t *pmd;
pte_t *ptep, pte;
if (!pgd_none(*pgd)) {
- pmd = pmd_offset(pgd, addr);
- if (!pmd_none(*pmd)) {
- ptep = pte_offset_map(pmd, addr);
- pte = *ptep;
- if (pte_present(pte))
- page = pte_page(pte);
- pte_unmap(ptep);
+ pud = pud_offset(pgd, addr);
+ if (!pud_none(*pud)) {
+ pmd = pmd_offset(pud, addr);
+ if (!pmd_none(*pmd)) {
+ ptep = pte_offset_map(pmd, addr);
+ pte = *ptep;
+ if (pte_present(pte))
+ page = pte_page(pte);
+ pte_unmap(ptep);
+ }
}
}
return page;
diff -puN mm/mempolicy.c~4level-core-patch mm/mempolicy.c
--- linux-2.6/mm/mempolicy.c~4level-core-patch 2004-12-22 20:31:46.000000000 +1100
+++ linux-2.6-npiggin/mm/mempolicy.c 2004-12-22 20:31:46.000000000 +1100
@@ -234,18 +234,29 @@ static struct mempolicy *mpol_new(int mo
/* Ensure all existing pages follow the policy. */
static int
-verify_pages(unsigned long addr, unsigned long end, unsigned long *nodes)
+verify_pages(struct mm_struct *mm,
+ unsigned long addr, unsigned long end, unsigned long *nodes)
{
while (addr < end) {
struct page *p;
pte_t *pte;
pmd_t *pmd;
- pgd_t *pgd = pgd_offset_k(addr);
+ pud_t *pud;
+ pgd_t *pgd;
+ pgd = pgd_offset(mm, addr);
if (pgd_none(*pgd)) {
- addr = (addr + PGDIR_SIZE) & PGDIR_MASK;
+ unsigned long next = (addr + PGDIR_SIZE) & PGDIR_MASK;
+ if (next <= addr)
+ break;
+ addr = next;
+ continue;
+ }
+ pud = pud_offset(pgd, addr);
+ if (pud_none(*pud)) {
+ addr = (addr + PUD_SIZE) & PUD_MASK;
continue;
}
- pmd = pmd_offset(pgd, addr);
+ pmd = pmd_offset(pud, addr);
if (pmd_none(*pmd)) {
addr = (addr + PMD_SIZE) & PMD_MASK;
continue;
@@ -283,7 +294,8 @@ check_range(struct mm_struct *mm, unsign
if (prev && prev->vm_end < vma->vm_start)
return ERR_PTR(-EFAULT);
if ((flags & MPOL_MF_STRICT) && !is_vm_hugetlb_page(vma)) {
- err = verify_pages(vma->vm_start, vma->vm_end, nodes);
+ err = verify_pages(vma->vm_mm,
+ vma->vm_start, vma->vm_end, nodes);
if (err) {
first = ERR_PTR(err);
break;
diff -puN mm/mmap.c~4level-core-patch mm/mmap.c
diff -puN mm/mprotect.c~4level-core-patch mm/mprotect.c
--- linux-2.6/mm/mprotect.c~4level-core-patch 2004-12-22 20:31:46.000000000 +1100
+++ linux-2.6-npiggin/mm/mprotect.c 2004-12-22 20:31:46.000000000 +1100
@@ -62,12 +62,38 @@ change_pte_range(pmd_t *pmd, unsigned lo
}
static inline void
-change_pmd_range(pgd_t *pgd, unsigned long address,
+change_pmd_range(pud_t *pud, unsigned long address,
unsigned long size, pgprot_t newprot)
{
pmd_t * pmd;
unsigned long end;
+ if (pud_none(*pud))
+ return;
+ if (pud_bad(*pud)) {
+ pud_ERROR(*pud);
+ pud_clear(pud);
+ return;
+ }
+ pmd = pmd_offset(pud, address);
+ address &= ~PUD_MASK;
+ end = address + size;
+ if (end > PUD_SIZE)
+ end = PUD_SIZE;
+ do {
+ change_pte_range(pmd, address, end - address, newprot);
+ address = (address + PMD_SIZE) & PMD_MASK;
+ pmd++;
+ } while (address && (address < end));
+}
+
+static inline void
+change_pud_range(pgd_t *pgd, unsigned long address,
+ unsigned long size, pgprot_t newprot)
+{
+ pud_t * pud;
+ unsigned long end;
+
if (pgd_none(*pgd))
return;
if (pgd_bad(*pgd)) {
@@ -75,15 +101,15 @@ change_pmd_range(pgd_t *pgd, unsigned lo
pgd_clear(pgd);
return;
}
- pmd = pmd_offset(pgd, address);
+ pud = pud_offset(pgd, address);
address &= ~PGDIR_MASK;
end = address + size;
if (end > PGDIR_SIZE)
end = PGDIR_SIZE;
do {
- change_pte_range(pmd, address, end - address, newprot);
- address = (address + PMD_SIZE) & PMD_MASK;
- pmd++;
+ change_pmd_range(pud, address, end - address, newprot);
+ address = (address + PUD_SIZE) & PUD_MASK;
+ pud++;
} while (address && (address < end));
}
@@ -91,22 +117,25 @@ static void
change_protection(struct vm_area_struct *vma, unsigned long start,
unsigned long end, pgprot_t newprot)
{
- pgd_t *dir;
- unsigned long beg = start;
+ struct mm_struct *mm = current->mm;
+ pgd_t *pgd;
+ unsigned long beg = start, next;
+ int i;
- dir = pgd_offset(current->mm, start);
+ pgd = pgd_offset(mm, start);
flush_cache_range(vma, beg, end);
- if (start >= end)
- BUG();
- spin_lock(&current->mm->page_table_lock);
- do {
- change_pmd_range(dir, start, end - start, newprot);
- start = (start + PGDIR_SIZE) & PGDIR_MASK;
- dir++;
- } while (start && (start < end));
+ BUG_ON(start >= end);
+ spin_lock(&mm->page_table_lock);
+ for (i = pgd_index(start); i <= pgd_index(end-1); i++) {
+ next = (start + PGDIR_SIZE) & PGDIR_MASK;
+ if (next <= start || next > end)
+ next = end;
+ change_pud_range(pgd, start, next - start, newprot);
+ start = next;
+ pgd++;
+ }
flush_tlb_range(vma, beg, end);
- spin_unlock(&current->mm->page_table_lock);
- return;
+ spin_unlock(&mm->page_table_lock);
}
static int
diff -puN mm/mremap.c~4level-core-patch mm/mremap.c
--- linux-2.6/mm/mremap.c~4level-core-patch 2004-12-22 20:31:46.000000000 +1100
+++ linux-2.6-npiggin/mm/mremap.c 2004-12-22 20:31:46.000000000 +1100
@@ -25,19 +25,24 @@
static pte_t *get_one_pte_map_nested(struct mm_struct *mm, unsigned long addr)
{
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte = NULL;
pgd = pgd_offset(mm, addr);
if (pgd_none(*pgd))
goto end;
- if (pgd_bad(*pgd)) {
- pgd_ERROR(*pgd);
- pgd_clear(pgd);
+
+ pud = pud_offset(pgd, addr);
+ if (pud_none(*pud))
+ goto end;
+ if (pud_bad(*pud)) {
+ pud_ERROR(*pud);
+ pud_clear(pud);
goto end;
}
- pmd = pmd_offset(pgd, addr);
+ pmd = pmd_offset(pud, addr);
if (pmd_none(*pmd))
goto end;
if (pmd_bad(*pmd)) {
@@ -58,12 +63,17 @@ end:
static pte_t *get_one_pte_map(struct mm_struct *mm, unsigned long addr)
{
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pgd = pgd_offset(mm, addr);
if (pgd_none(*pgd))
return NULL;
- pmd = pmd_offset(pgd, addr);
+
+ pud = pud_offset(pgd, addr);
+ if (pud_none(*pud))
+ return NULL;
+ pmd = pmd_offset(pud, addr);
if (!pmd_present(*pmd))
return NULL;
return pte_offset_map(pmd, addr);
@@ -71,10 +81,17 @@ static pte_t *get_one_pte_map(struct mm_
static inline pte_t *alloc_one_pte_map(struct mm_struct *mm, unsigned long addr)
{
+ pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte = NULL;
- pmd = pmd_alloc(mm, pgd_offset(mm, addr), addr);
+ pgd = pgd_offset(mm, addr);
+
+ pud = pud_alloc(mm, pgd, addr);
+ if (!pud)
+ return NULL;
+ pmd = pmd_alloc(mm, pud, addr);
if (pmd)
pte = pte_alloc_map(mm, pmd, addr);
return pte;
diff -puN mm/msync.c~4level-core-patch mm/msync.c
--- linux-2.6/mm/msync.c~4level-core-patch 2004-12-22 20:31:46.000000000 +1100
+++ linux-2.6-npiggin/mm/msync.c 2004-12-22 20:31:46.000000000 +1100
@@ -67,13 +67,39 @@ static int filemap_sync_pte_range(pmd_t
return error;
}
-static inline int filemap_sync_pmd_range(pgd_t * pgd,
+static inline int filemap_sync_pmd_range(pud_t * pud,
unsigned long address, unsigned long end,
struct vm_area_struct *vma, unsigned int flags)
{
pmd_t * pmd;
int error;
+ if (pud_none(*pud))
+ return 0;
+ if (pud_bad(*pud)) {
+ pud_ERROR(*pud);
+ pud_clear(pud);
+ return 0;
+ }
+ pmd = pmd_offset(pud, address);
+ if ((address & PUD_MASK) != (end & PUD_MASK))
+ end = (address & PUD_MASK) + PUD_SIZE;
+ error = 0;
+ do {
+ error |= filemap_sync_pte_range(pmd, address, end, vma, flags);
+ address = (address + PMD_SIZE) & PMD_MASK;
+ pmd++;
+ } while (address && (address < end));
+ return error;
+}
+
+static inline int filemap_sync_pud_range(pgd_t *pgd,
+ unsigned long address, unsigned long end,
+ struct vm_area_struct *vma, unsigned int flags)
+{
+ pud_t *pud;
+ int error;
+
if (pgd_none(*pgd))
return 0;
if (pgd_bad(*pgd)) {
@@ -81,14 +107,14 @@ static inline int filemap_sync_pmd_range
pgd_clear(pgd);
return 0;
}
- pmd = pmd_offset(pgd, address);
+ pud = pud_offset(pgd, address);
if ((address & PGDIR_MASK) != (end & PGDIR_MASK))
end = (address & PGDIR_MASK) + PGDIR_SIZE;
error = 0;
do {
- error |= filemap_sync_pte_range(pmd, address, end, vma, flags);
- address = (address + PMD_SIZE) & PMD_MASK;
- pmd++;
+ error |= filemap_sync_pmd_range(pud, address, end, vma, flags);
+ address = (address + PUD_SIZE) & PUD_MASK;
+ pud++;
} while (address && (address < end));
return error;
}
@@ -96,8 +122,10 @@ static inline int filemap_sync_pmd_range
static int filemap_sync(struct vm_area_struct * vma, unsigned long address,
size_t size, unsigned int flags)
{
- pgd_t * dir;
+ pgd_t *pgd;
unsigned long end = address + size;
+ unsigned long next;
+ int i;
int error = 0;
/* Acquire the lock early; it may be possible to avoid dropping
@@ -105,7 +133,7 @@ static int filemap_sync(struct vm_area_s
*/
spin_lock(&vma->vm_mm->page_table_lock);
- dir = pgd_offset(vma->vm_mm, address);
+ pgd = pgd_offset(vma->vm_mm, address);
flush_cache_range(vma, address, end);
/* For hugepages we can't go walking the page table normally,
@@ -116,11 +144,14 @@ static int filemap_sync(struct vm_area_s
if (address >= end)
BUG();
- do {
- error |= filemap_sync_pmd_range(dir, address, end, vma, flags);
- address = (address + PGDIR_SIZE) & PGDIR_MASK;
- dir++;
- } while (address && (address < end));
+ for (i = pgd_index(address); i <= pgd_index(end-1); i++) {
+ next = (address + PGDIR_SIZE) & PGDIR_MASK;
+ if (next <= address || next > end)
+ next = end;
+ error |= filemap_sync_pud_range(pgd, address, next, vma, flags);
+ address = next;
+ pgd++;
+ }
/*
* Why flush ? filemap_sync_pte already flushed the tlbs with the
* dirty bits.
diff -puN mm/rmap.c~4level-core-patch mm/rmap.c
--- linux-2.6/mm/rmap.c~4level-core-patch 2004-12-22 20:31:46.000000000 +1100
+++ linux-2.6-npiggin/mm/rmap.c 2004-12-22 20:31:46.000000000 +1100
@@ -259,6 +259,7 @@ static int page_referenced_one(struct pa
struct mm_struct *mm = vma->vm_mm;
unsigned long address;
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte;
int referenced = 0;
@@ -275,7 +276,11 @@ static int page_referenced_one(struct pa
if (!pgd_present(*pgd))
goto out_unlock;
- pmd = pmd_offset(pgd, address);
+ pud = pud_offset(pgd, address);
+ if (!pud_present(*pud))
+ goto out_unlock;
+
+ pmd = pmd_offset(pud, address);
if (!pmd_present(*pmd))
goto out_unlock;
@@ -502,6 +507,7 @@ static int try_to_unmap_one(struct page
struct mm_struct *mm = vma->vm_mm;
unsigned long address;
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte;
pte_t pteval;
@@ -523,7 +529,11 @@ static int try_to_unmap_one(struct page
if (!pgd_present(*pgd))
goto out_unlock;
- pmd = pmd_offset(pgd, address);
+ pud = pud_offset(pgd, address);
+ if (!pud_present(*pud))
+ goto out_unlock;
+
+ pmd = pmd_offset(pud, address);
if (!pmd_present(*pmd))
goto out_unlock;
@@ -631,6 +641,7 @@ static void try_to_unmap_cluster(unsigne
{
struct mm_struct *mm = vma->vm_mm;
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte;
pte_t pteval;
@@ -656,7 +667,11 @@ static void try_to_unmap_cluster(unsigne
if (!pgd_present(*pgd))
goto out_unlock;
- pmd = pmd_offset(pgd, address);
+ pud = pud_offset(pgd, address);
+ if (!pud_present(*pud))
+ goto out_unlock;
+
+ pmd = pmd_offset(pud, address);
if (!pmd_present(*pmd))
goto out_unlock;
diff -puN mm/swapfile.c~4level-core-patch mm/swapfile.c
--- linux-2.6/mm/swapfile.c~4level-core-patch 2004-12-22 20:31:46.000000000 +1100
+++ linux-2.6-npiggin/mm/swapfile.c 2004-12-22 20:31:46.000000000 +1100
@@ -486,27 +486,27 @@ static unsigned long unuse_pmd(struct vm
}
/* vma->vm_mm->page_table_lock is held */
-static unsigned long unuse_pgd(struct vm_area_struct * vma, pgd_t *dir,
- unsigned long address, unsigned long size,
+static unsigned long unuse_pud(struct vm_area_struct * vma, pud_t *pud,
+ unsigned long address, unsigned long size, unsigned long offset,
swp_entry_t entry, struct page *page)
{
pmd_t * pmd;
- unsigned long offset, end;
+ unsigned long end;
unsigned long foundaddr;
- if (pgd_none(*dir))
+ if (pud_none(*pud))
return 0;
- if (pgd_bad(*dir)) {
- pgd_ERROR(*dir);
- pgd_clear(dir);
+ if (pud_bad(*pud)) {
+ pud_ERROR(*pud);
+ pud_clear(pud);
return 0;
}
- pmd = pmd_offset(dir, address);
- offset = address & PGDIR_MASK;
- address &= ~PGDIR_MASK;
+ pmd = pmd_offset(pud, address);
+ offset += address & PUD_MASK;
+ address &= ~PUD_MASK;
end = address + size;
- if (end > PGDIR_SIZE)
- end = PGDIR_SIZE;
+ if (end > PUD_SIZE)
+ end = PUD_SIZE;
if (address >= end)
BUG();
do {
@@ -521,12 +521,48 @@ static unsigned long unuse_pgd(struct vm
}
/* vma->vm_mm->page_table_lock is held */
+static unsigned long unuse_pgd(struct vm_area_struct * vma, pgd_t *pgd,
+ unsigned long address, unsigned long size,
+ swp_entry_t entry, struct page *page)
+{
+ pud_t * pud;
+ unsigned long offset;
+ unsigned long foundaddr;
+ unsigned long end;
+
+ if (pgd_none(*pgd))
+ return 0;
+ if (pgd_bad(*pgd)) {
+ pgd_ERROR(*pgd);
+ pgd_clear(pgd);
+ return 0;
+ }
+ pud = pud_offset(pgd, address);
+ offset = address & PGDIR_MASK;
+ address &= ~PGDIR_MASK;
+ end = address + size;
+ if (end > PGDIR_SIZE)
+ end = PGDIR_SIZE;
+ BUG_ON(address >= end);
+ do {
+ foundaddr = unuse_pud(vma, pud, address, end - address,
+ offset, entry, page);
+ if (foundaddr)
+ return foundaddr;
+ address = (address + PUD_SIZE) & PUD_MASK;
+ pud++;
+ } while (address && (address < end));
+ return 0;
+}
+
+/* vma->vm_mm->page_table_lock is held */
static unsigned long unuse_vma(struct vm_area_struct * vma,
swp_entry_t entry, struct page *page)
{
- pgd_t *pgdir;
- unsigned long start, end;
+ pgd_t *pgd;
+ unsigned long start, end, next;
unsigned long foundaddr;
+ int i;
if (page->mapping) {
start = page_address_in_vma(page, vma);
@@ -538,15 +574,18 @@ static unsigned long unuse_vma(struct vm
start = vma->vm_start;
end = vma->vm_end;
}
- pgdir = pgd_offset(vma->vm_mm, start);
- do {
- foundaddr = unuse_pgd(vma, pgdir, start, end - start,
- entry, page);
+ pgd = pgd_offset(vma->vm_mm, start);
+ for (i = pgd_index(start); i <= pgd_index(end-1); i++) {
+ next = (start + PGDIR_SIZE) & PGDIR_MASK;
+ if (next > end || next <= start)
+ next = end;
+ foundaddr = unuse_pgd(vma, pgd, start, next - start, entry, page);
if (foundaddr)
return foundaddr;
- start = (start + PGDIR_SIZE) & PGDIR_MASK;
- pgdir++;
- } while (start && (start < end));
+ start = next;
+ pgd++;
+ }
return 0;
}
diff -puN mm/vmalloc.c~4level-core-patch mm/vmalloc.c
--- linux-2.6/mm/vmalloc.c~4level-core-patch 2004-12-22 20:31:46.000000000 +1100
+++ linux-2.6-npiggin/mm/vmalloc.c 2004-12-22 20:31:46.000000000 +1100
@@ -56,25 +56,25 @@ static void unmap_area_pte(pmd_t *pmd, u
} while (address < end);
}
-static void unmap_area_pmd(pgd_t *dir, unsigned long address,
+static void unmap_area_pmd(pud_t *pud, unsigned long address,
unsigned long size)
{
unsigned long end;
pmd_t *pmd;
- if (pgd_none(*dir))
+ if (pud_none(*pud))
return;
- if (pgd_bad(*dir)) {
- pgd_ERROR(*dir);
- pgd_clear(dir);
+ if (pud_bad(*pud)) {
+ pud_ERROR(*pud);
+ pud_clear(pud);
return;
}
- pmd = pmd_offset(dir, address);
- address &= ~PGDIR_MASK;
+ pmd = pmd_offset(pud, address);
+ address &= ~PUD_MASK;
end = address + size;
- if (end > PGDIR_SIZE)
- end = PGDIR_SIZE;
+ if (end > PUD_SIZE)
+ end = PUD_SIZE;
do {
unmap_area_pte(pmd, address, end - address);
@@ -83,6 +83,33 @@ static void unmap_area_pmd(pgd_t *dir, u
} while (address < end);
}
+static void unmap_area_pud(pgd_t *pgd, unsigned long address,
+ unsigned long size)
+{
+ pud_t *pud;
+ unsigned long end;
+
+ if (pgd_none(*pgd))
+ return;
+ if (pgd_bad(*pgd)) {
+ pgd_ERROR(*pgd);
+ pgd_clear(pgd);
+ return;
+ }
+
+ pud = pud_offset(pgd, address);
+ address &= ~PGDIR_MASK;
+ end = address + size;
+ if (end > PGDIR_SIZE)
+ end = PGDIR_SIZE;
+
+ do {
+ unmap_area_pmd(pud, address, end - address);
+ address = (address + PUD_SIZE) & PUD_MASK;
+ pud++;
+ } while (address && (address < end));
+}
+
static int map_area_pte(pte_t *pte, unsigned long address,
unsigned long size, pgprot_t prot,
struct page ***pages)
@@ -96,7 +123,6 @@ static int map_area_pte(pte_t *pte, unsi
do {
struct page *page = **pages;
-
WARN_ON(!pte_none(*pte));
if (!page)
return -ENOMEM;
@@ -115,11 +141,11 @@ static int map_area_pmd(pmd_t *pmd, unsi
{
unsigned long base, end;
- base = address & PGDIR_MASK;
- address &= ~PGDIR_MASK;
+ base = address & PUD_MASK;
+ address &= ~PUD_MASK;
end = address + size;
- if (end > PGDIR_SIZE)
- end = PGDIR_SIZE;
+ if (end > PUD_SIZE)
+ end = PUD_SIZE;
do {
pte_t * pte = pte_alloc_kernel(&init_mm, pmd, base + address);
@@ -134,19 +160,41 @@ static int map_area_pmd(pmd_t *pmd, unsi
return 0;
}
+static int map_area_pud(pud_t *pud, unsigned long address,
+ unsigned long end, pgprot_t prot,
+ struct page ***pages)
+{
+ do {
+ pmd_t *pmd = pmd_alloc(&init_mm, pud, address);
+ if (!pmd)
+ return -ENOMEM;
+ if (map_area_pmd(pmd, address, end - address, prot, pages))
+ return -ENOMEM;
+ address = (address + PUD_SIZE) & PUD_MASK;
+ pud++;
+ } while (address && address < end);
+
+ return 0;
+}
+
void unmap_vm_area(struct vm_struct *area)
{
unsigned long address = (unsigned long) area->addr;
unsigned long end = (address + area->size);
- pgd_t *dir;
+ unsigned long next;
+ pgd_t *pgd;
+ int i;
- dir = pgd_offset_k(address);
+ pgd = pgd_offset_k(address);
flush_cache_vunmap(address, end);
- do {
- unmap_area_pmd(dir, address, end - address);
- address = (address + PGDIR_SIZE) & PGDIR_MASK;
- dir++;
- } while (address && (address < end));
+ for (i = pgd_index(address); i <= pgd_index(end-1); i++) {
+ next = (address + PGDIR_SIZE) & PGDIR_MASK;
+ if (next <= address || next > end)
+ next = end;
+ unmap_area_pud(pgd, address, next - address);
+ address = next;
+ pgd++;
+ }
flush_tlb_kernel_range((unsigned long) area->addr, end);
}
@@ -154,25 +202,30 @@ int map_vm_area(struct vm_struct *area,
{
unsigned long address = (unsigned long) area->addr;
unsigned long end = address + (area->size-PAGE_SIZE);
- pgd_t *dir;
+ unsigned long next;
+ pgd_t *pgd;
int err = 0;
+ int i;
- dir = pgd_offset_k(address);
+ pgd = pgd_offset_k(address);
spin_lock(&init_mm.page_table_lock);
- do {
- pmd_t *pmd = pmd_alloc(&init_mm, dir, address);
- if (!pmd) {
+ for (i = pgd_index(address); i <= pgd_index(end-1); i++) {
+ pud_t *pud = pud_alloc(&init_mm, pgd, address);
+ if (!pud) {
err = -ENOMEM;
break;
}
- if (map_area_pmd(pmd, address, end - address, prot, pages)) {
+ next = (address + PGDIR_SIZE) & PGDIR_MASK;
+ if (next < address || next > end)
+ next = end;
+ if (map_area_pud(pud, address, next, prot, pages)) {
err = -ENOMEM;
break;
}
- address = (address + PGDIR_SIZE) & PGDIR_MASK;
- dir++;
- } while (address && (address < end));
+ address = next;
+ pgd++;
+ }
spin_unlock(&init_mm.page_table_lock);
flush_cache_vmap((unsigned long) area->addr, end);
diff -puN drivers/char/drm/drm_memory.h~4level-core-patch drivers/char/drm/drm_memory.h
--- linux-2.6/drivers/char/drm/drm_memory.h~4level-core-patch 2004-12-22 20:31:46.000000000 +1100
+++ linux-2.6-npiggin/drivers/char/drm/drm_memory.h 2004-12-22 20:31:46.000000000 +1100
@@ -125,7 +125,8 @@ static inline unsigned long
drm_follow_page (void *vaddr)
{
pgd_t *pgd = pgd_offset_k((unsigned long) vaddr);
- pmd_t *pmd = pmd_offset(pgd, (unsigned long) vaddr);
+ pud_t *pud = pud_offset(pgd, (unsigned long) vaddr);
+ pmd_t *pmd = pmd_offset(pud, (unsigned long) vaddr);
pte_t *ptep = pte_offset_kernel(pmd, (unsigned long) vaddr);
return pte_pfn(*ptep) << PAGE_SHIFT;
}
_
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 8/11] introduce fallback header
2004-12-22 9:57 ` [PATCH 7/11] convert Linux to 4-level page tables Nick Piggin
@ 2004-12-22 9:59 ` Nick Piggin
2004-12-22 10:00 ` [PATCH 9/11] convert i386 to generic nopud header Nick Piggin
0 siblings, 1 reply; 13+ messages in thread
From: Nick Piggin @ 2004-12-22 9:59 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, Andi Kleen, Hugh Dickins, Linux Memory Management
[-- Attachment #1: Type: text/plain, Size: 5 bytes --]
8/11
[-- Attachment #2: 4level-fallback.patch --]
[-- Type: text/plain, Size: 15586 bytes --]
Add a temporary "fallback" header so architectures can run with the 4-level
page tables patch without modification. All architectures should be
converted to use the folding headers (include/asm-generic/pgtable-nop?d.h)
as soon as possible, and the fallback header removed.
Make all architectures include the fallback header, except i386, which was
converted earlier to use pgtable-nopmd.h under the 3-level system; that
conversion is not compatible with the fallback header.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
---
linux-2.6-npiggin/include/asm-alpha/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-arm/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-arm26/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-cris/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-generic/4level-fixup.h | 34 +++++++++++++++++++
linux-2.6-npiggin/include/asm-generic/tlb.h | 2 +
linux-2.6-npiggin/include/asm-h8300/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-ia64/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-m32r/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-m68k/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-m68knommu/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-mips/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-parisc/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-ppc/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-ppc64/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-s390/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-sh/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-sh64/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-sparc/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-sparc64/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-um/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-v850/pgtable.h | 2 +
linux-2.6-npiggin/include/asm-x86_64/pgtable.h | 2 +
linux-2.6-npiggin/include/linux/mm.h | 6 +++
linux-2.6-npiggin/mm/memory.c | 25 +++++++++++++
25 files changed, 109 insertions(+)
diff -puN /dev/null include/asm-generic/4level-fixup.h
--- /dev/null 2004-09-06 19:38:39.000000000 +1000
+++ linux-2.6-npiggin/include/asm-generic/4level-fixup.h 2004-12-22 20:38:01.000000000 +1100
@@ -0,0 +1,34 @@
+#ifndef _4LEVEL_FIXUP_H
+#define _4LEVEL_FIXUP_H
+
+#define __ARCH_HAS_4LEVEL_HACK
+
+#define PUD_SIZE PGDIR_SIZE
+#define PUD_MASK PGDIR_MASK
+#define PTRS_PER_PUD 1
+
+#define pud_t pgd_t
+
+#define pmd_alloc(mm, pud, address) \
+({ pmd_t *ret; \
+ if (pgd_none(*pud)) \
+ ret = __pmd_alloc(mm, pud, address); \
+ else \
+ ret = pmd_offset(pud, address); \
+ ret; \
+})
+
+#define pud_alloc(mm, pgd, address) (pgd)
+#define pud_offset(pgd, start) (pgd)
+#define pud_none(pud) 0
+#define pud_bad(pud) 0
+#define pud_present(pud) 1
+#define pud_ERROR(pud) do { } while (0)
+#define pud_clear(pud) do { } while (0)
+
+#undef pud_free_tlb
+#define pud_free_tlb(tlb, x) do { } while (0)
+#define pud_free(x) do { } while (0)
+#define __pud_free_tlb(tlb, x) do { } while (0)
+
+#endif
diff -puN include/linux/mm.h~4level-fallback include/linux/mm.h
--- linux-2.6/include/linux/mm.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/linux/mm.h 2004-12-22 20:36:07.000000000 +1100
@@ -631,6 +631,11 @@ extern void remove_shrinker(struct shrin
* the inlining and the symmetry break with pte_alloc_map() that does all
* of this out-of-line.
*/
+/*
+ * The following ifdef is needed to get the 4level-fixup.h header to work.
+ * Remove it when 4level-fixup.h has been removed.
+ */
+#ifndef __ARCH_HAS_4LEVEL_HACK
static inline pud_t *pud_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address)
{
if (pgd_none(*pgd))
@@ -644,6 +649,7 @@ static inline pmd_t *pmd_alloc(struct mm
return __pmd_alloc(mm, pud, address);
return pmd_offset(pud, address);
}
+#endif
extern void free_area_init(unsigned long * zones_size);
extern void free_area_init_node(int nid, pg_data_t *pgdat,
diff -puN mm/memory.c~4level-fallback mm/memory.c
--- linux-2.6/mm/memory.c~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/mm/memory.c 2004-12-22 20:36:07.000000000 +1100
@@ -1940,6 +1940,7 @@ int handle_mm_fault(struct mm_struct *mm
return VM_FAULT_OOM;
}
+#ifndef __ARCH_HAS_4LEVEL_HACK
#if (PTRS_PER_PGD > 1)
/*
* Allocate page upper directory.
@@ -2007,6 +2008,30 @@ out:
return pmd_offset(pud, address);
}
#endif
+#else
+pmd_t fastcall *__pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
+{
+ pmd_t *new;
+
+ spin_unlock(&mm->page_table_lock);
+ new = pmd_alloc_one(mm, address);
+ spin_lock(&mm->page_table_lock);
+ if (!new)
+ return NULL;
+
+ /*
+ * Because we dropped the lock, we should re-check the
+ * entry, as somebody else could have populated it..
+ */
+ if (pgd_present(*pud)) {
+ pmd_free(new);
+ goto out;
+ }
+ pgd_populate(mm, pud, new);
+out:
+ return pmd_offset(pud, address);
+}
+#endif
int make_pages_present(unsigned long addr, unsigned long end)
{
diff -puN include/asm-alpha/pgtable.h~4level-fallback include/asm-alpha/pgtable.h
--- linux-2.6/include/asm-alpha/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-alpha/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -1,6 +1,8 @@
#ifndef _ALPHA_PGTABLE_H
#define _ALPHA_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
/*
* This file contains the functions and defines necessary to modify and use
* the Alpha page table tree.
diff -puN include/asm-arm/pgtable.h~4level-fallback include/asm-arm/pgtable.h
--- linux-2.6/include/asm-arm/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-arm/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -10,6 +10,8 @@
#ifndef _ASMARM_PGTABLE_H
#define _ASMARM_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
#include <asm/memory.h>
#include <asm/proc-fns.h>
#include <asm/arch/vmalloc.h>
diff -puN include/asm-arm26/pgtable.h~4level-fallback include/asm-arm26/pgtable.h
--- linux-2.6/include/asm-arm26/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-arm26/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -11,6 +11,8 @@
#ifndef _ASMARM_PGTABLE_H
#define _ASMARM_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
#include <linux/config.h>
#include <asm/memory.h>
diff -puN include/asm-cris/pgtable.h~4level-fallback include/asm-cris/pgtable.h
--- linux-2.6/include/asm-cris/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-cris/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -5,6 +5,8 @@
#ifndef _CRIS_PGTABLE_H
#define _CRIS_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
#ifndef __ASSEMBLY__
#include <linux/config.h>
#include <linux/sched.h>
diff -puN include/asm-generic/pgtable.h~4level-fallback include/asm-generic/pgtable.h
diff -puN include/asm-h8300/pgtable.h~4level-fallback include/asm-h8300/pgtable.h
--- linux-2.6/include/asm-h8300/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-h8300/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -1,6 +1,8 @@
#ifndef _H8300_PGTABLE_H
#define _H8300_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
#include <linux/config.h>
#include <linux/slab.h>
#include <asm/processor.h>
diff -puN include/asm-i386/pgtable.h~4level-fallback include/asm-i386/pgtable.h
diff -puN include/asm-ia64/pgtable.h~4level-fallback include/asm-ia64/pgtable.h
--- linux-2.6/include/asm-ia64/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-ia64/pgtable.h 2004-12-22 20:38:07.000000000 +1100
@@ -1,6 +1,8 @@
#ifndef _ASM_IA64_PGTABLE_H
#define _ASM_IA64_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
/*
* This file contains the functions and defines necessary to modify and use
* the IA-64 page table tree.
diff -puN include/asm-m32r/pgtable.h~4level-fallback include/asm-m32r/pgtable.h
--- linux-2.6/include/asm-m32r/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-m32r/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -1,6 +1,8 @@
#ifndef _ASM_M32R_PGTABLE_H
#define _ASM_M32R_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
/* $Id$ */
/*
diff -puN include/asm-m68k/pgtable.h~4level-fallback include/asm-m68k/pgtable.h
--- linux-2.6/include/asm-m68k/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-m68k/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -1,6 +1,8 @@
#ifndef _M68K_PGTABLE_H
#define _M68K_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
#include <linux/config.h>
#include <asm/setup.h>
diff -puN include/asm-m68knommu/pgtable.h~4level-fallback include/asm-m68knommu/pgtable.h
--- linux-2.6/include/asm-m68knommu/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-m68knommu/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -1,6 +1,8 @@
#ifndef _M68KNOMMU_PGTABLE_H
#define _M68KNOMMU_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
/*
* (C) Copyright 2000-2002, Greg Ungerer <gerg@snapgear.com>
*/
diff -puN include/asm-mips/pgtable.h~4level-fallback include/asm-mips/pgtable.h
--- linux-2.6/include/asm-mips/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-mips/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -8,6 +8,8 @@
#ifndef _ASM_PGTABLE_H
#define _ASM_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
#include <linux/config.h>
#ifdef CONFIG_MIPS32
#include <asm/pgtable-32.h>
diff -puN include/asm-parisc/pgtable.h~4level-fallback include/asm-parisc/pgtable.h
--- linux-2.6/include/asm-parisc/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-parisc/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -1,6 +1,8 @@
#ifndef _PARISC_PGTABLE_H
#define _PARISC_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
#include <linux/config.h>
#include <asm/fixmap.h>
diff -puN include/asm-ppc/pgtable.h~4level-fallback include/asm-ppc/pgtable.h
--- linux-2.6/include/asm-ppc/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-ppc/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -2,6 +2,8 @@
#ifndef _PPC_PGTABLE_H
#define _PPC_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
#include <linux/config.h>
#ifndef __ASSEMBLY__
diff -puN include/asm-ppc64/pgtable.h~4level-fallback include/asm-ppc64/pgtable.h
--- linux-2.6/include/asm-ppc64/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-ppc64/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -1,6 +1,8 @@
#ifndef _PPC64_PGTABLE_H
#define _PPC64_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
/*
* This file contains the functions and defines necessary to modify and use
* the ppc64 hashed page table.
diff -puN include/asm-s390/pgtable.h~4level-fallback include/asm-s390/pgtable.h
--- linux-2.6/include/asm-s390/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-s390/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -13,6 +13,8 @@
#ifndef _ASM_S390_PGTABLE_H
#define _ASM_S390_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
/*
* The Linux memory management assumes a three-level page table setup. For
* s390 31 bit we "fold" the mid level into the top-level page table, so
diff -puN include/asm-sh/pgtable.h~4level-fallback include/asm-sh/pgtable.h
--- linux-2.6/include/asm-sh/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-sh/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -1,6 +1,8 @@
#ifndef __ASM_SH_PGTABLE_H
#define __ASM_SH_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
/*
* Copyright (C) 1999 Niibe Yutaka
* Copyright (C) 2002, 2003, 2004 Paul Mundt
diff -puN include/asm-sh64/pgtable.h~4level-fallback include/asm-sh64/pgtable.h
--- linux-2.6/include/asm-sh64/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-sh64/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -1,6 +1,8 @@
#ifndef __ASM_SH64_PGTABLE_H
#define __ASM_SH64_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
/*
* This file is subject to the terms and conditions of the GNU General Public
* License. See the file "COPYING" in the main directory of this archive
diff -puN include/asm-sparc/pgtable.h~4level-fallback include/asm-sparc/pgtable.h
--- linux-2.6/include/asm-sparc/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-sparc/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -9,6 +9,8 @@
* Copyright (C) 1998 Jakub Jelinek (jj@sunsite.mff.cuni.cz)
*/
+#include <asm-generic/4level-fixup.h>
+
#include <linux/config.h>
#include <linux/spinlock.h>
#include <linux/swap.h>
diff -puN include/asm-sparc64/pgtable.h~4level-fallback include/asm-sparc64/pgtable.h
--- linux-2.6/include/asm-sparc64/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-sparc64/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -12,6 +12,8 @@
* the SpitFire page tables.
*/
+#include <asm-generic/4level-fixup.h>
+
#include <linux/config.h>
#include <asm/spitfire.h>
#include <asm/asi.h>
diff -puN include/asm-um/pgtable.h~4level-fallback include/asm-um/pgtable.h
--- linux-2.6/include/asm-um/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-um/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -7,6 +7,8 @@
#ifndef __UM_PGTABLE_H
#define __UM_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
#include "linux/sched.h"
#include "asm/processor.h"
#include "asm/page.h"
diff -puN include/asm-v850/pgtable.h~4level-fallback include/asm-v850/pgtable.h
--- linux-2.6/include/asm-v850/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-v850/pgtable.h 2004-12-22 20:36:07.000000000 +1100
@@ -1,6 +1,8 @@
#ifndef __V850_PGTABLE_H__
#define __V850_PGTABLE_H__
+#include <asm-generic/4level-fixup.h>
+
#include <linux/config.h>
#include <asm/page.h>
diff -puN include/asm-x86_64/pgtable.h~4level-fallback include/asm-x86_64/pgtable.h
--- linux-2.6/include/asm-x86_64/pgtable.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-x86_64/pgtable.h 2004-12-22 20:38:06.000000000 +1100
@@ -1,6 +1,8 @@
#ifndef _X86_64_PGTABLE_H
#define _X86_64_PGTABLE_H
+#include <asm-generic/4level-fixup.h>
+
/*
* This file contains the functions and defines necessary to modify and use
* the x86-64 page table tree.
diff -puN include/asm-generic/tlb.h~4level-fallback include/asm-generic/tlb.h
--- linux-2.6/include/asm-generic/tlb.h~4level-fallback 2004-12-22 20:36:07.000000000 +1100
+++ linux-2.6-npiggin/include/asm-generic/tlb.h 2004-12-22 20:36:07.000000000 +1100
@@ -141,11 +141,13 @@ static inline void tlb_remove_page(struc
__pte_free_tlb(tlb, ptep); \
} while (0)
+#ifndef __ARCH_HAS_4LEVEL_HACK
#define pud_free_tlb(tlb, pudp) \
do { \
tlb->need_flush = 1; \
__pud_free_tlb(tlb, pudp); \
} while (0)
+#endif
#define pmd_free_tlb(tlb, pmdp) \
do { \
_
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 9/11] convert i386 to generic nopud header
2004-12-22 9:59 ` [PATCH 8/11] introduce fallback header Nick Piggin
@ 2004-12-22 10:00 ` Nick Piggin
2004-12-22 10:00 ` [PATCH 10/11] convert ia64 " Nick Piggin
0 siblings, 1 reply; 13+ messages in thread
From: Nick Piggin @ 2004-12-22 10:00 UTC (permalink / raw)
To: Nick Piggin
Cc: Linus Torvalds, Andrew Morton, Andi Kleen, Hugh Dickins,
Linux Memory Management
[-- Attachment #1: Type: text/plain, Size: 5 bytes --]
9/11
[-- Attachment #2: 4level-architecture-changes-for-i386.patch --]
[-- Type: text/plain, Size: 14555 bytes --]
i386 works with both 2-level and 3-level page tables.
Signed-off-by: Andi Kleen <ak@suse.de>
Converted to use pud_t by Nick Piggin
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
---
linux-2.6-npiggin/arch/i386/kernel/acpi/sleep.c | 3 +-
linux-2.6-npiggin/arch/i386/kernel/vm86.c | 11 ++++++++-
linux-2.6-npiggin/arch/i386/mm/fault.c | 13 ++++++++--
linux-2.6-npiggin/arch/i386/mm/hugetlbpage.c | 8 +++++-
linux-2.6-npiggin/arch/i386/mm/init.c | 18 ++++++++++-----
linux-2.6-npiggin/arch/i386/mm/ioremap.c | 7 +++++
linux-2.6-npiggin/arch/i386/mm/pageattr.c | 14 ++++++++---
linux-2.6-npiggin/arch/i386/mm/pgtable.c | 12 ++++++++--
linux-2.6-npiggin/include/asm-i386/pgalloc.h | 3 --
linux-2.6-npiggin/include/asm-i386/pgtable-3level.h | 24 ++++++++++----------
linux-2.6-npiggin/include/asm-i386/pgtable.h | 1
11 files changed, 81 insertions(+), 33 deletions(-)
diff -puN arch/i386/kernel/acpi/sleep.c~4level-architecture-changes-for-i386 arch/i386/kernel/acpi/sleep.c
--- linux-2.6/arch/i386/kernel/acpi/sleep.c~4level-architecture-changes-for-i386 2004-12-22 20:31:49.000000000 +1100
+++ linux-2.6-npiggin/arch/i386/kernel/acpi/sleep.c 2004-12-22 20:31:49.000000000 +1100
@@ -7,6 +7,7 @@
#include <linux/acpi.h>
#include <linux/bootmem.h>
+#include <asm/current.h> /* XXX remove me */
#include <asm/smp.h>
@@ -24,7 +25,7 @@ static void init_low_mapping(pgd_t *pgd,
int pgd_ofs = 0;
while ((pgd_ofs < pgd_limit) && (pgd_ofs + USER_PTRS_PER_PGD < PTRS_PER_PGD)) {
- set_pgd(pgd, *(pgd+USER_PTRS_PER_PGD));
+ set_pgd(pgd, (*(pgd+USER_PTRS_PER_PGD)));
pgd_ofs++, pgd++;
}
}
diff -puN arch/i386/kernel/vm86.c~4level-architecture-changes-for-i386 arch/i386/kernel/vm86.c
--- linux-2.6/arch/i386/kernel/vm86.c~4level-architecture-changes-for-i386 2004-12-22 20:31:49.000000000 +1100
+++ linux-2.6-npiggin/arch/i386/kernel/vm86.c 2004-12-22 20:31:49.000000000 +1100
@@ -137,6 +137,7 @@ struct pt_regs * fastcall save_v86_state
static void mark_screen_rdonly(struct task_struct * tsk)
{
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte, *mapped;
int i;
@@ -151,7 +152,15 @@ static void mark_screen_rdonly(struct ta
pgd_clear(pgd);
goto out;
}
- pmd = pmd_offset(pgd, 0xA0000);
+ pud = pud_offset(pgd, 0xA0000);
+ if (pud_none(*pud))
+ goto out;
+ if (pud_bad(*pud)) {
+ pud_ERROR(*pud);
+ pud_clear(pud);
+ goto out;
+ }
+ pmd = pmd_offset(pud, 0xA0000);
if (pmd_none(*pmd))
goto out;
if (pmd_bad(*pmd)) {
diff -puN arch/i386/mm/fault.c~4level-architecture-changes-for-i386 arch/i386/mm/fault.c
--- linux-2.6/arch/i386/mm/fault.c~4level-architecture-changes-for-i386 2004-12-22 20:31:49.000000000 +1100
+++ linux-2.6-npiggin/arch/i386/mm/fault.c 2004-12-22 20:31:49.000000000 +1100
@@ -518,6 +518,7 @@ vmalloc_fault:
int index = pgd_index(address);
unsigned long pgd_paddr;
pgd_t *pgd, *pgd_k;
+ pud_t *pud, *pud_k;
pmd_t *pmd, *pmd_k;
pte_t *pte_k;
@@ -530,11 +531,17 @@ vmalloc_fault:
/*
* set_pgd(pgd, *pgd_k); here would be useless on PAE
- * and redundant with the set_pmd() on non-PAE.
+ * and redundant with the set_pmd() on non-PAE. As would
+ * set_pud.
*/
- pmd = pmd_offset(pgd, address);
- pmd_k = pmd_offset(pgd_k, address);
+ pud = pud_offset(pgd, address);
+ pud_k = pud_offset(pgd_k, address);
+ if (!pud_present(*pud_k))
+ goto no_context;
+
+ pmd = pmd_offset(pud, address);
+ pmd_k = pmd_offset(pud_k, address);
if (!pmd_present(*pmd_k))
goto no_context;
set_pmd(pmd, *pmd_k);
diff -puN arch/i386/mm/hugetlbpage.c~4level-architecture-changes-for-i386 arch/i386/mm/hugetlbpage.c
--- linux-2.6/arch/i386/mm/hugetlbpage.c~4level-architecture-changes-for-i386 2004-12-22 20:31:49.000000000 +1100
+++ linux-2.6-npiggin/arch/i386/mm/hugetlbpage.c 2004-12-22 20:31:49.000000000 +1100
@@ -21,20 +21,24 @@
static pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr)
{
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd = NULL;
pgd = pgd_offset(mm, addr);
- pmd = pmd_alloc(mm, pgd, addr);
+ pud = pud_alloc(mm, pgd, addr);
+ pmd = pmd_alloc(mm, pud, addr);
return (pte_t *) pmd;
}
static pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
{
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd = NULL;
pgd = pgd_offset(mm, addr);
- pmd = pmd_offset(pgd, addr);
+ pud = pud_offset(pgd, addr);
+ pmd = pmd_offset(pud, addr);
return (pte_t *) pmd;
}
diff -puN arch/i386/mm/init.c~4level-architecture-changes-for-i386 arch/i386/mm/init.c
--- linux-2.6/arch/i386/mm/init.c~4level-architecture-changes-for-i386 2004-12-22 20:31:49.000000000 +1100
+++ linux-2.6-npiggin/arch/i386/mm/init.c 2004-12-22 20:31:49.000000000 +1100
@@ -54,15 +54,18 @@ static int noinline do_test_wp_bit(void)
*/
static pmd_t * __init one_md_table_init(pgd_t *pgd)
{
+ pud_t *pud;
pmd_t *pmd_table;
#ifdef CONFIG_X86_PAE
pmd_table = (pmd_t *) alloc_bootmem_low_pages(PAGE_SIZE);
set_pgd(pgd, __pgd(__pa(pmd_table) | _PAGE_PRESENT));
- if (pmd_table != pmd_offset(pgd, 0))
+ pud = pud_offset(pgd, 0);
+ if (pmd_table != pmd_offset(pud, 0))
BUG();
#else
- pmd_table = pmd_offset(pgd, 0);
+ pud = pud_offset(pgd, 0);
+ pmd_table = pmd_offset(pud, 0);
#endif
return pmd_table;
@@ -100,6 +103,7 @@ static pte_t * __init one_page_table_ini
static void __init page_table_range_init (unsigned long start, unsigned long end, pgd_t *pgd_base)
{
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
int pgd_idx, pmd_idx;
unsigned long vaddr;
@@ -112,8 +116,8 @@ static void __init page_table_range_init
for ( ; (pgd_idx < PTRS_PER_PGD) && (vaddr != end); pgd++, pgd_idx++) {
if (pgd_none(*pgd))
one_md_table_init(pgd);
-
- pmd = pmd_offset(pgd, vaddr);
+ pud = pud_offset(pgd, vaddr);
+ pmd = pmd_offset(pud, vaddr);
for (; (pmd_idx < PTRS_PER_PMD) && (vaddr != end); pmd++, pmd_idx++) {
if (pmd_none(*pmd))
one_page_table_init(pmd);
@@ -233,7 +237,7 @@ EXPORT_SYMBOL(kmap_prot);
EXPORT_SYMBOL(kmap_pte);
#define kmap_get_fixmap_pte(vaddr) \
- pte_offset_kernel(pmd_offset(pgd_offset_k(vaddr), (vaddr)), (vaddr))
+ pte_offset_kernel(pmd_offset(pud_offset(pgd_offset_k(vaddr), vaddr), (vaddr)), (vaddr))
void __init kmap_init(void)
{
@@ -249,6 +253,7 @@ void __init kmap_init(void)
void __init permanent_kmaps_init(pgd_t *pgd_base)
{
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte;
unsigned long vaddr;
@@ -257,7 +262,8 @@ void __init permanent_kmaps_init(pgd_t *
page_table_range_init(vaddr, vaddr + PAGE_SIZE*LAST_PKMAP, pgd_base);
pgd = swapper_pg_dir + pgd_index(vaddr);
- pmd = pmd_offset(pgd, vaddr);
+ pud = pud_offset(pgd, vaddr);
+ pmd = pmd_offset(pud, vaddr);
pte = pte_offset_kernel(pmd, vaddr);
pkmap_page_table = pte;
}
diff -puN arch/i386/mm/ioremap.c~4level-architecture-changes-for-i386 arch/i386/mm/ioremap.c
--- linux-2.6/arch/i386/mm/ioremap.c~4level-architecture-changes-for-i386 2004-12-22 20:31:49.000000000 +1100
+++ linux-2.6-npiggin/arch/i386/mm/ioremap.c 2004-12-22 20:31:49.000000000 +1100
@@ -80,9 +80,14 @@ static int remap_area_pages(unsigned lon
BUG();
spin_lock(&init_mm.page_table_lock);
do {
+ pud_t *pud;
pmd_t *pmd;
- pmd = pmd_alloc(&init_mm, dir, address);
+
error = -ENOMEM;
+ pud = pud_alloc(&init_mm, dir, address);
+ if (!pud)
+ break;
+ pmd = pmd_alloc(&init_mm, pud, address);
if (!pmd)
break;
if (remap_area_pmd(pmd, address, end - address,
diff -puN arch/i386/mm/pageattr.c~4level-architecture-changes-for-i386 arch/i386/mm/pageattr.c
--- linux-2.6/arch/i386/mm/pageattr.c~4level-architecture-changes-for-i386 2004-12-22 20:31:49.000000000 +1100
+++ linux-2.6-npiggin/arch/i386/mm/pageattr.c 2004-12-22 20:31:49.000000000 +1100
@@ -19,11 +19,15 @@ static struct list_head df_list = LIST_H
pte_t *lookup_address(unsigned long address)
{
- pgd_t *pgd = pgd_offset_k(address);
+ pgd_t *pgd = pgd_offset_k(address);
+ pud_t *pud;
pmd_t *pmd;
if (pgd_none(*pgd))
return NULL;
- pmd = pmd_offset(pgd, address);
+ pud = pud_offset(pgd, address);
+ if (pud_none(*pud))
+ return NULL;
+ pmd = pmd_offset(pud, address);
if (pmd_none(*pmd))
return NULL;
if (pmd_large(*pmd))
@@ -77,9 +81,11 @@ static void set_pmd_pte(pte_t *kpte, uns
spin_lock_irqsave(&pgd_lock, flags);
for (page = pgd_list; page; page = (struct page *)page->index) {
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pgd = (pgd_t *)page_address(page) + pgd_index(address);
- pmd = pmd_offset(pgd, address);
+ pud = pud_offset(pgd, address);
+ pmd = pmd_offset(pud, address);
set_pte_atomic((pte_t *)pmd, pte);
}
spin_unlock_irqrestore(&pgd_lock, flags);
@@ -92,7 +98,7 @@ static void set_pmd_pte(pte_t *kpte, uns
static inline void revert_page(struct page *kpte_page, unsigned long address)
{
pte_t *linear = (pte_t *)
- pmd_offset(pgd_offset(&init_mm, address), address);
+ pmd_offset(pud_offset(pgd_offset_k(address), address), address);
set_pmd_pte(linear, address,
pfn_pte((__pa(address) & LARGE_PAGE_MASK) >> PAGE_SHIFT,
PAGE_KERNEL_LARGE));
diff -puN arch/i386/mm/pgtable.c~4level-architecture-changes-for-i386 arch/i386/mm/pgtable.c
--- linux-2.6/arch/i386/mm/pgtable.c~4level-architecture-changes-for-i386 2004-12-22 20:31:49.000000000 +1100
+++ linux-2.6-npiggin/arch/i386/mm/pgtable.c 2004-12-22 20:31:49.000000000 +1100
@@ -62,6 +62,7 @@ void show_mem(void)
static void set_pte_pfn(unsigned long vaddr, unsigned long pfn, pgprot_t flags)
{
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -70,7 +71,12 @@ static void set_pte_pfn(unsigned long va
BUG();
return;
}
- pmd = pmd_offset(pgd, vaddr);
+ pud = pud_offset(pgd, vaddr);
+ if (pud_none(*pud)) {
+ BUG();
+ return;
+ }
+ pmd = pmd_offset(pud, vaddr);
if (pmd_none(*pmd)) {
BUG();
return;
@@ -95,6 +101,7 @@ static void set_pte_pfn(unsigned long va
void set_pmd_pfn(unsigned long vaddr, unsigned long pfn, pgprot_t flags)
{
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
if (vaddr & (PMD_SIZE-1)) { /* vaddr is misaligned */
@@ -110,7 +117,8 @@ void set_pmd_pfn(unsigned long vaddr, un
printk ("set_pmd_pfn: pgd_none\n");
return; /* BUG(); */
}
- pmd = pmd_offset(pgd, vaddr);
+ pud = pud_offset(pgd, vaddr);
+ pmd = pmd_offset(pud, vaddr);
set_pmd(pmd, pfn_pmd(pfn, flags));
/*
* It's enough to flush this one mapping.
diff -puN include/asm-i386/mmu_context.h~4level-architecture-changes-for-i386 include/asm-i386/mmu_context.h
diff -puN include/asm-i386/page.h~4level-architecture-changes-for-i386 include/asm-i386/page.h
diff -puN include/asm-i386/pgalloc.h~4level-architecture-changes-for-i386 include/asm-i386/pgalloc.h
--- linux-2.6/include/asm-i386/pgalloc.h~4level-architecture-changes-for-i386 2004-12-22 20:31:49.000000000 +1100
+++ linux-2.6-npiggin/include/asm-i386/pgalloc.h 2004-12-22 20:31:49.000000000 +1100
@@ -17,7 +17,6 @@
/*
* Allocate and free page tables.
*/
-
extern pgd_t *pgd_alloc(struct mm_struct *);
extern void pgd_free(pgd_t *pgd);
@@ -44,7 +43,7 @@ static inline void pte_free(struct page
#define pmd_alloc_one(mm, addr) ({ BUG(); ((pmd_t *)2); })
#define pmd_free(x) do { } while (0)
#define __pmd_free_tlb(tlb,x) do { } while (0)
-#define pgd_populate(mm, pmd, pte) BUG()
+#define pud_populate(mm, pmd, pte) BUG()
#endif
#define check_pgt_cache() do { } while (0)
diff -puN include/asm-i386/pgtable.h~4level-architecture-changes-for-i386 include/asm-i386/pgtable.h
--- linux-2.6/include/asm-i386/pgtable.h~4level-architecture-changes-for-i386 2004-12-22 20:31:49.000000000 +1100
+++ linux-2.6-npiggin/include/asm-i386/pgtable.h 2004-12-22 20:31:49.000000000 +1100
@@ -303,6 +303,7 @@ static inline pte_t pte_modify(pte_t pte
* control the given virtual address
*/
#define pgd_index(address) (((address) >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))
+#define pgd_index_k(addr) pgd_index(addr)
/*
* pgd_offset() returns a (pgd_t *)
diff -puN include/asm-i386/pgtable-2level.h~4level-architecture-changes-for-i386 include/asm-i386/pgtable-2level.h
diff -puN include/asm-i386/pgtable-3level.h~4level-architecture-changes-for-i386 include/asm-i386/pgtable-3level.h
--- linux-2.6/include/asm-i386/pgtable-3level.h~4level-architecture-changes-for-i386 2004-12-22 20:31:49.000000000 +1100
+++ linux-2.6-npiggin/include/asm-i386/pgtable-3level.h 2004-12-22 20:31:49.000000000 +1100
@@ -1,6 +1,8 @@
#ifndef _I386_PGTABLE_3LEVEL_H
#define _I386_PGTABLE_3LEVEL_H
+#include <asm-generic/pgtable-nopud.h>
+
/*
* Intel Physical Address Extension (PAE) Mode - three-level page
* tables on PPro+ CPUs.
@@ -15,9 +17,9 @@
#define pgd_ERROR(e) \
printk("%s:%d: bad pgd %p(%016Lx).\n", __FILE__, __LINE__, &(e), pgd_val(e))
-static inline int pgd_none(pgd_t pgd) { return 0; }
-static inline int pgd_bad(pgd_t pgd) { return 0; }
-static inline int pgd_present(pgd_t pgd) { return 1; }
+#define pud_none(pud) 0
+#define pud_bad(pud) 0
+#define pud_present(pud) 1
/*
* Is the pte executable?
@@ -59,8 +61,8 @@ static inline void set_pte(pte_t *ptep,
set_64bit((unsigned long long *)(pteptr),pte_val(pteval))
#define set_pmd(pmdptr,pmdval) \
set_64bit((unsigned long long *)(pmdptr),pmd_val(pmdval))
-#define set_pgd(pgdptr,pgdval) \
- set_64bit((unsigned long long *)(pgdptr),pgd_val(pgdval))
+#define set_pud(pudptr,pudval) \
+ set_64bit((unsigned long long *)(pudptr),pud_val(pudval))
/*
* Pentium-II erratum A13: in PAE mode we explicitly have to flush
@@ -68,22 +70,22 @@ static inline void set_pte(pte_t *ptep,
* We do not let the generic code free and clear pgd entries due to
* this erratum.
*/
-static inline void pgd_clear (pgd_t * pgd) { }
+static inline void pud_clear (pud_t * pud) { }
#define pmd_page(pmd) (pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT))
#define pmd_page_kernel(pmd) \
((unsigned long) __va(pmd_val(pmd) & PAGE_MASK))
-#define pgd_page(pgd) \
-((struct page *) __va(pgd_val(pgd) & PAGE_MASK))
+#define pud_page(pud) \
+((struct page *) __va(pud_val(pud) & PAGE_MASK))
-#define pgd_page_kernel(pgd) \
-((unsigned long) __va(pgd_val(pgd) & PAGE_MASK))
+#define pud_page_kernel(pud) \
+((unsigned long) __va(pud_val(pud) & PAGE_MASK))
/* Find an entry in the second-level page table.. */
-#define pmd_offset(dir, address) ((pmd_t *) pgd_page(*(dir)) + \
+#define pmd_offset(pud, address) ((pmd_t *) pud_page(*(pud)) + \
pmd_index(address))
static inline pte_t ptep_get_and_clear(pte_t *ptep)
_
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 10/11] convert ia64 to generic nopud header
2004-12-22 10:00 ` [PATCH 9/11] convert i386 to generic nopud header Nick Piggin
@ 2004-12-22 10:00 ` Nick Piggin
2004-12-22 10:01 ` [PATCH 11/11] convert x86_64 to 4 level page tables Nick Piggin
0 siblings, 1 reply; 13+ messages in thread
From: Nick Piggin @ 2004-12-22 10:00 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, Andi Kleen, Hugh Dickins, Linux Memory Management
[-- Attachment #1: Type: text/plain, Size: 6 bytes --]
10/11
[-- Attachment #2: 4level-ia64.patch --]
[-- Type: text/plain, Size: 6975 bytes --]
Convert the ia64 architecture over to 4-level page tables.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
---
linux-2.6-npiggin/arch/ia64/mm/fault.c | 7 ++++++-
linux-2.6-npiggin/arch/ia64/mm/hugetlbpage.c | 20 ++++++++++++++------
linux-2.6-npiggin/arch/ia64/mm/init.c | 14 ++++++++++++--
linux-2.6-npiggin/include/asm-ia64/pgalloc.h | 5 ++---
linux-2.6-npiggin/include/asm-ia64/pgtable.h | 16 ++++++++--------
linux-2.6-npiggin/include/asm-ia64/tlb.h | 6 ++++++
6 files changed, 48 insertions(+), 20 deletions(-)
diff -puN include/asm-ia64/pgtable.h~4level-ia64 include/asm-ia64/pgtable.h
--- linux-2.6/include/asm-ia64/pgtable.h~4level-ia64 2004-12-22 20:31:53.000000000 +1100
+++ linux-2.6-npiggin/include/asm-ia64/pgtable.h 2004-12-22 20:32:20.000000000 +1100
@@ -1,8 +1,6 @@
#ifndef _ASM_IA64_PGTABLE_H
#define _ASM_IA64_PGTABLE_H
-#include <asm-generic/4level-fixup.h>
-
/*
* This file contains the functions and defines necessary to modify and use
* the IA-64 page table tree.
@@ -256,11 +254,12 @@ ia64_phys_addr_valid (unsigned long addr
#define pmd_page_kernel(pmd) ((unsigned long) __va(pmd_val(pmd) & _PFN_MASK))
#define pmd_page(pmd) virt_to_page((pmd_val(pmd) + PAGE_OFFSET))
-#define pgd_none(pgd) (!pgd_val(pgd))
-#define pgd_bad(pgd) (!ia64_phys_addr_valid(pgd_val(pgd)))
-#define pgd_present(pgd) (pgd_val(pgd) != 0UL)
-#define pgd_clear(pgdp) (pgd_val(*(pgdp)) = 0UL)
-#define pgd_page(pgd) ((unsigned long) __va(pgd_val(pgd) & _PFN_MASK))
+#define pud_none(pud) (!pud_val(pud))
+#define pud_bad(pud) (!ia64_phys_addr_valid(pud_val(pud)))
+#define pud_present(pud) (pud_val(pud) != 0UL)
+#define pud_clear(pudp) (pud_val(*(pudp)) = 0UL)
+
+#define pud_page(pud) ((unsigned long) __va(pud_val(pud) & _PFN_MASK))
/*
* The following have defined behavior only work if pte_present() is true.
@@ -330,7 +329,7 @@ pgd_offset (struct mm_struct *mm, unsign
/* Find an entry in the second-level page table.. */
#define pmd_offset(dir,addr) \
- ((pmd_t *) pgd_page(*(dir)) + (((addr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1)))
+ ((pmd_t *) pud_page(*(dir)) + (((addr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1)))
/*
* Find an entry in the third-level page table. This looks more complicated than it
@@ -563,5 +562,6 @@ do { \
#define __HAVE_ARCH_PTE_SAME
#define __HAVE_ARCH_PGD_OFFSET_GATE
#include <asm-generic/pgtable.h>
+#include <asm-generic/pgtable-nopud.h>
#endif /* _ASM_IA64_PGTABLE_H */
diff -puN arch/ia64/mm/fault.c~4level-ia64 arch/ia64/mm/fault.c
--- linux-2.6/arch/ia64/mm/fault.c~4level-ia64 2004-12-22 20:31:53.000000000 +1100
+++ linux-2.6-npiggin/arch/ia64/mm/fault.c 2004-12-22 20:32:06.000000000 +1100
@@ -51,6 +51,7 @@ static int
mapped_kernel_page_is_present (unsigned long address)
{
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *ptep, pte;
@@ -58,7 +59,11 @@ mapped_kernel_page_is_present (unsigned
if (pgd_none(*pgd) || pgd_bad(*pgd))
return 0;
- pmd = pmd_offset(pgd, address);
+ pud = pud_offset(pgd, address);
+ if (pud_none(*pud) || pud_bad(*pud))
+ return 0;
+
+ pmd = pmd_offset(pud, address);
if (pmd_none(*pmd) || pmd_bad(*pmd))
return 0;
diff -puN arch/ia64/mm/hugetlbpage.c~4level-ia64 arch/ia64/mm/hugetlbpage.c
--- linux-2.6/arch/ia64/mm/hugetlbpage.c~4level-ia64 2004-12-22 20:31:53.000000000 +1100
+++ linux-2.6-npiggin/arch/ia64/mm/hugetlbpage.c 2004-12-22 20:32:06.000000000 +1100
@@ -29,13 +29,17 @@ huge_pte_alloc (struct mm_struct *mm, un
{
unsigned long taddr = htlbpage_to_page(addr);
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte = NULL;
pgd = pgd_offset(mm, taddr);
- pmd = pmd_alloc(mm, pgd, taddr);
- if (pmd)
- pte = pte_alloc_map(mm, pmd, taddr);
+ pud = pud_alloc(mm, pgd, taddr);
+ if (pud) {
+ pmd = pmd_alloc(mm, pud, taddr);
+ if (pmd)
+ pte = pte_alloc_map(mm, pmd, taddr);
+ }
return pte;
}
@@ -44,14 +48,18 @@ huge_pte_offset (struct mm_struct *mm, u
{
unsigned long taddr = htlbpage_to_page(addr);
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte = NULL;
pgd = pgd_offset(mm, taddr);
if (pgd_present(*pgd)) {
- pmd = pmd_offset(pgd, taddr);
- if (pmd_present(*pmd))
- pte = pte_offset_map(pmd, taddr);
+ pud = pud_offset(pgd, taddr);
+ if (pud_present(*pud)) {
+ pmd = pmd_offset(pud, taddr);
+ if (pmd_present(*pmd))
+ pte = pte_offset_map(pmd, taddr);
+ }
}
return pte;
diff -puN arch/ia64/mm/init.c~4level-ia64 arch/ia64/mm/init.c
--- linux-2.6/arch/ia64/mm/init.c~4level-ia64 2004-12-22 20:31:53.000000000 +1100
+++ linux-2.6-npiggin/arch/ia64/mm/init.c 2004-12-22 20:32:06.000000000 +1100
@@ -237,6 +237,7 @@ struct page *
put_kernel_page (struct page *page, unsigned long address, pgprot_t pgprot)
{
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -248,7 +249,11 @@ put_kernel_page (struct page *page, unsi
spin_lock(&init_mm.page_table_lock);
{
- pmd = pmd_alloc(&init_mm, pgd, address);
+ pud = pud_alloc(&init_mm, pgd, address);
+ if (!pud)
+ goto out;
+
+ pmd = pmd_alloc(&init_mm, pud, address);
if (!pmd)
goto out;
pte = pte_alloc_map(&init_mm, pmd, address);
@@ -381,6 +386,7 @@ create_mem_map_page_table (u64 start, u6
struct page *map_start, *map_end;
int node;
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte;
@@ -395,7 +401,11 @@ create_mem_map_page_table (u64 start, u6
pgd = pgd_offset_k(address);
if (pgd_none(*pgd))
pgd_populate(&init_mm, pgd, alloc_bootmem_pages_node(NODE_DATA(node), PAGE_SIZE));
- pmd = pmd_offset(pgd, address);
+ pud = pud_offset(pgd, address);
+
+ if (pud_none(*pud))
+ pud_populate(&init_mm, pud, alloc_bootmem_pages_node(NODE_DATA(node), PAGE_SIZE));
+ pmd = pmd_offset(pud, address);
if (pmd_none(*pmd))
pmd_populate_kernel(&init_mm, pmd, alloc_bootmem_pages_node(NODE_DATA(node), PAGE_SIZE));
diff -puN include/asm-ia64/pgalloc.h~4level-ia64 include/asm-ia64/pgalloc.h
--- linux-2.6/include/asm-ia64/pgalloc.h~4level-ia64 2004-12-22 20:31:53.000000000 +1100
+++ linux-2.6-npiggin/include/asm-ia64/pgalloc.h 2004-12-22 20:32:06.000000000 +1100
@@ -79,12 +79,11 @@ pgd_free (pgd_t *pgd)
}
static inline void
-pgd_populate (struct mm_struct *mm, pgd_t *pgd_entry, pmd_t *pmd)
+pud_populate (struct mm_struct *mm, pud_t *pud_entry, pmd_t *pmd)
{
- pgd_val(*pgd_entry) = __pa(pmd);
+ pud_val(*pud_entry) = __pa(pmd);
}
-
static inline pmd_t*
pmd_alloc_one_fast (struct mm_struct *mm, unsigned long addr)
{
diff -puN include/asm-ia64/tlb.h~4level-ia64 include/asm-ia64/tlb.h
--- linux-2.6/include/asm-ia64/tlb.h~4level-ia64 2004-12-22 20:31:53.000000000 +1100
+++ linux-2.6-npiggin/include/asm-ia64/tlb.h 2004-12-22 20:32:06.000000000 +1100
@@ -236,4 +236,10 @@ do { \
__pmd_free_tlb(tlb, ptep); \
} while (0)
+#define pud_free_tlb(tlb, pudp) \
+do { \
+ tlb->need_flush = 1; \
+ __pud_free_tlb(tlb, pudp); \
+} while (0)
+
#endif /* _ASM_IA64_TLB_H */
_
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 11/11] convert x86_64 to 4 level page tables
2004-12-22 10:00 ` [PATCH 10/11] convert ia64 " Nick Piggin
@ 2004-12-22 10:01 ` Nick Piggin
0 siblings, 0 replies; 13+ messages in thread
From: Nick Piggin @ 2004-12-22 10:01 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andrew Morton, Andi Kleen, Hugh Dickins, Linux Memory Management
[-- Attachment #1: Type: text/plain, Size: 6 bytes --]
11/11
[-- Attachment #2: 4level-x86-64.patch --]
[-- Type: text/plain, Size: 44714 bytes --]
From: Andi Kleen <ak@suse.de>
Converted to true 4-level page tables. The per-process address space
is expanded to 47 bits now; the supported physical address space is 46 bits.
Lmbench fork/exit numbers are down a few percent because the kernel has
to walk many more pagetables, but some planned future optimizations will
hopefully recover it.
See Documentation/x86_64/mm.txt for more details on the memory map.
Converted to pud_t by Nick Piggin.
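For reference, the 47 bit user figure falls straight out of the new
constants (back-of-envelope arithmetic, not code from this patch):
4k pages give 12 offset bits, each 512-entry level adds 9 index bits,
so four levels translate 12 + 4*9 = 48 bits, and the [48:63] sign
extension hole leaves 2^47 bytes per half; hence TASK_SIZE =
0x800000000000. Likewise pgd_index(__PAGE_OFFSET) =
(0xffff810000000000 >> 39) & 511 = 258, so the new pgd_alloc() below
zeroes slots 0-257 for userspace and copies slots 258-511 from
init_level4_pgt, which is why the kernel half of a per-process pgd
needs no maintenance after fork.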
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
---
linux-2.6-npiggin/Documentation/x86_64/mm.txt | 168 ++-------------------
linux-2.6-npiggin/arch/x86_64/ia32/syscall32.c | 31 ++-
linux-2.6-npiggin/arch/x86_64/kernel/acpi/sleep.c | 8 -
linux-2.6-npiggin/arch/x86_64/kernel/head.S | 1
linux-2.6-npiggin/arch/x86_64/kernel/init_task.c | 2
linux-2.6-npiggin/arch/x86_64/kernel/reboot.c | 2
linux-2.6-npiggin/arch/x86_64/kernel/setup64.c | 13 -
linux-2.6-npiggin/arch/x86_64/mm/fault.c | 111 ++++++++-----
linux-2.6-npiggin/arch/x86_64/mm/init.c | 101 +++++-------
linux-2.6-npiggin/arch/x86_64/mm/ioremap.c | 43 ++++-
linux-2.6-npiggin/arch/x86_64/mm/pageattr.c | 34 ++--
linux-2.6-npiggin/include/asm-x86_64/e820.h | 3
linux-2.6-npiggin/include/asm-x86_64/mmu_context.h | 5
linux-2.6-npiggin/include/asm-x86_64/page.h | 12 -
linux-2.6-npiggin/include/asm-x86_64/pda.h | 1
linux-2.6-npiggin/include/asm-x86_64/pgalloc.h | 38 ++++
linux-2.6-npiggin/include/asm-x86_64/pgtable.h | 140 +++++++----------
linux-2.6-npiggin/include/asm-x86_64/processor.h | 4
18 files changed, 314 insertions(+), 403 deletions(-)
diff -puN Documentation/x86_64/mm.txt~4level-x86-64 Documentation/x86_64/mm.txt
--- linux-2.6/Documentation/x86_64/mm.txt~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/Documentation/x86_64/mm.txt 2004-12-22 20:33:05.000000000 +1100
@@ -1,148 +1,24 @@
-The paging design used on the x86-64 linux kernel port in 2.4.x provides:
-o per process virtual address space limit of 512 Gigabytes
-o top of userspace stack located at address 0x0000007fffffffff
-o PAGE_OFFSET = 0xffff800000000000
-o start of the kernel = 0xffffffff800000000
-o global RAM per system 2^64-PAGE_OFFSET-sizeof(kernel) = 128 Terabytes - 2 Gigabytes
-o no need of any common code change
-o no need to use highmem to handle the 128 Terabytes of RAM
-
-Description:
-
- Userspace is able to modify and it sees only the 3rd/2nd/1st level
- pagetables (pgd_offset() implicitly walks the 1st slot of the 4th
- level pagetable and it returns an entry into the 3rd level pagetable).
- This is where the per-process 512 Gigabytes limit cames from.
-
- The common code pgd is the PDPE, the pmd is the PDE, the
- pte is the PTE. The PML4E remains invisible to the common
- code.
-
- The kernel uses all the first 47 bits of the negative half
- of the virtual address space to build the direct mapping using
- 2 Mbytes page size. The kernel virtual addresses have bit number
- 47 always set to 1 (and in turn also bits 48-63 are set to 1 too,
- due the sign extension). This is where the 128 Terabytes - 2 Gigabytes global
- limit of RAM cames from.
-
- Since the per-process limit is 512 Gigabytes (due to kernel common
- code 3 level pagetable limitation), the higher virtual address mapped
- into userspace is 0x7fffffffff and it makes sense to use it
- as the top of the userspace stack to allow the stack to grow as
- much as possible.
-
- Setting the PAGE_OFFSET to 2^39 (after the last userspace
- virtual address) wouldn't make much difference compared to
- setting PAGE_OFFSET to 0xffff800000000000 because we have an
- hole into the virtual address space. The last byte mapped by the
- 255th slot in the 4th level pagetable is at virtual address
- 0x00007fffffffffff and the first byte mapped by the 256th slot in the
- 4th level pagetable is at address 0xffff800000000000. Due to this
- hole we can't trivially build a direct mapping across all the
- 512 slots of the 4th level pagetable, so we simply use only the
- second (negative) half of the 4th level pagetable for that purpose
- (that provides us 128 Terabytes of contigous virtual addresses).
- Strictly speaking we could build a direct mapping also across the hole
- using some DISCONTIGMEM trick, but we don't need such a large
- direct mapping right now.
-
-Future:
-
- During 2.5.x we can break the 512 Gigabytes per-process limit
- possibly by removing from the common code any knowledge about the
- architectural dependent physical layout of the virtual to physical
- mapping.
-
- Once the 512 Gigabytes limit will be removed the kernel stack will
- be moved (most probably to virtual address 0x00007fffffffffff).
- Nothing will break in userspace due that move, as nothing breaks
- in IA32 compiling the kernel with CONFIG_2G.
-
-Linus agreed on not breaking common code and to live with the 512 Gigabytes
-per-process limitation for the 2.4.x timeframe and he has given me and Andi
-some very useful hints... (thanks! :)
-
-Thanks also to H. Peter Anvin for his interesting and useful suggestions on
-the x86-64-discuss lists!
-
-Other memory management related issues follows:
-
-PAGE_SIZE:
-
- If somebody is wondering why these days we still have a so small
- 4k pagesize (16 or 32 kbytes would be much better for performance
- of course), the PAGE_SIZE have to remain 4k for 32bit apps to
- provide 100% backwards compatible IA32 API (we can't allow silent
- fs corruption or as best a loss of coherency with the page cache
- by allocating MAP_SHARED areas in MAP_ANONYMOUS memory with a
- do_mmap_fake). I think it could be possible to have a dynamic page
- size between 32bit and 64bit apps but it would need extremely
- intrusive changes in the common code as first for page cache and
- we sure don't want to depend on them right now even if the
- hardware would support that.
-
-PAGETABLE SIZE:
-
- In turn we can't afford to have pagetables larger than 4k because
- we could not be able to allocate them due physical memory
- fragmentation, and failing to allocate the kernel stack is a minor
- issue compared to failing the allocation of a pagetable. If we
- fail the allocation of a pagetable the only thing we can do is to
- sched_yield polling the freelist (deadlock prone) or to segfault
- the task (not even the sighandler would be sure to run).
-
-KERNEL STACK:
-
- 1st stage:
-
- The kernel stack will be at first allocated with an order 2 allocation
- (16k) (the utilization of the stack for a 64bit platform really
- isn't exactly the double of a 32bit platform because the local
- variables may not be all 64bit wide, but not much less). This will
- make things even worse than they are right now on IA32 with
- respect of failing fork/clone due memory fragmentation.
-
- 2nd stage:
-
- We'll benchmark if reserving one register as task_struct
- pointer will improve performance of the kernel (instead of
- recalculating the task_struct pointer starting from the stack
- pointer each time). My guess is that recalculating will be faster
- but it worth a try.
-
- If reserving one register for the task_struct pointer
- will be faster we can as well split task_struct and kernel
- stack. task_struct can be a slab allocation or a
- PAGE_SIZEd allocation, and the kernel stack can then be
- allocated in a order 1 allocation. Really this is risky,
- since 8k on a 64bit platform is going to be less than 7k
- on a 32bit platform but we could try it out. This would
- reduce the fragmentation problem of an order of magnitude
- making it equal to the current IA32.
-
- We must also consider the x86-64 seems to provide in hardware a
- per-irq stack that could allow us to remove the irq handler
- footprint from the regular per-process-stack, so it could allow
- us to live with a smaller kernel stack compared to the other
- linux architectures.
-
- 3rd stage:
-
- Before going into production if we still have the order 2
- allocation we can add a sysctl that allows the kernel stack to be
- allocated with vmalloc during memory fragmentation. This have to
- remain turned off during benchmarks :) but it should be ok in real
- life.
-
-Order of PAGE_CACHE_SIZE and other allocations:
-
- On the long run we can increase the PAGE_CACHE_SIZE to be
- an order 2 allocations and also the slab/buffercache etc.ec..
- could be all done with order 2 allocations. To make the above
- to work we should change lots of common code thus it can be done
- only once the basic port will be in a production state. Having
- a working PAGE_CACHE_SIZE would be a benefit also for
- IA32 and other architectures of course.
+<previous description obsolete, deleted>
-Andrea <andrea@suse.de> SuSE
+Virtual memory map with 4 level page tables:
+
+0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm
+hole caused by [48:63] sign extension
+ffff800000000000 - ffff80ffffffffff (=40 bits) guard hole
+ffff810000000000 - ffffc0ffffffffff (=46 bits) direct mapping of phys. memory
+ffffc10000000000 - ffffc1ffffffffff (=40 bits) hole
+ffffc20000000000 - ffffe1ffffffffff (=45 bits) vmalloc/ioremap space
+... unused hole ...
+ffffffff80000000 - ffffffff82800000 (=40MB) kernel text mapping, from phys 0
+... unused hole ...
+ffffffff88000000 - fffffffffff00000 (=1919MB) module mapping space
+
+vmalloc space is lazily synchronized into the different PML4 pages of
+the processes using the page fault handler, with init_level4_pgt as
+reference.
+
+Current x86-64 implementations only support 40 bits of address space,
+but we support up to 46 bits. This expands into MBZ space in the page tables.
+
+-Andi Kleen, Jul 2004
diff -puN arch/x86_64/ia32/syscall32.c~4level-x86-64 arch/x86_64/ia32/syscall32.c
--- linux-2.6/arch/x86_64/ia32/syscall32.c~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/arch/x86_64/ia32/syscall32.c 2004-12-22 20:33:05.000000000 +1100
@@ -40,23 +40,30 @@ static int use_sysenter = -1;
*/
int __map_syscall32(struct mm_struct *mm, unsigned long address)
{
+ pgd_t *pgd;
+ pud_t *pud;
pte_t *pte;
pmd_t *pmd;
- int err = 0;
+ int err = -ENOMEM;
spin_lock(&mm->page_table_lock);
- pmd = pmd_alloc(mm, pgd_offset(mm, address), address);
- if (pmd && (pte = pte_alloc_map(mm, pmd, address)) != NULL) {
- if (pte_none(*pte)) {
- set_pte(pte,
- mk_pte(virt_to_page(syscall32_page),
- PAGE_KERNEL_VSYSCALL));
+ pgd = pgd_offset(mm, address);
+ pud = pud_alloc(mm, pgd, address);
+ if (pud) {
+ pmd = pmd_alloc(mm, pud, address);
+ if (pmd && (pte = pte_alloc_map(mm, pmd, address)) != NULL) {
+ if (pte_none(*pte)) {
+ set_pte(pte,
+ mk_pte(virt_to_page(syscall32_page),
+ PAGE_KERNEL_VSYSCALL));
+ }
+ /* Flush only the local CPU. Other CPUs taking a fault
+ will just end up here again.
+ This is probably not needed, just paranoia. */
+ __flush_tlb_one(address);
+ err = 0;
}
- /* Flush only the local CPU. Other CPUs taking a fault
- will just end up here again */
- __flush_tlb_one(address);
- } else
- err = -ENOMEM;
+ }
spin_unlock(&mm->page_table_lock);
return err;
}
diff -puN arch/x86_64/kernel/acpi/sleep.c~4level-x86-64 arch/x86_64/kernel/acpi/sleep.c
--- linux-2.6/arch/x86_64/kernel/acpi/sleep.c~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/arch/x86_64/kernel/acpi/sleep.c 2004-12-22 20:33:05.000000000 +1100
@@ -61,9 +61,13 @@ extern char wakeup_start, wakeup_end;
extern unsigned long FASTCALL(acpi_copy_wakeup_routine(unsigned long));
+static pgd_t low_ptr;
+
static void init_low_mapping(void)
{
- cpu_pda[0].level4_pgt[0] = cpu_pda[0].level4_pgt[pml4_index(PAGE_OFFSET)];
+ pgd_t *slot0 = pgd_offset(current->mm, 0UL);
+ low_ptr = *slot0;
+ set_pgd(slot0, *pgd_offset(current->mm, PAGE_OFFSET));
flush_tlb_all();
}
@@ -97,7 +101,7 @@ int acpi_save_state_disk (void)
*/
void acpi_restore_state_mem (void)
{
- cpu_pda[0].level4_pgt[0] = 0;
+ set_pgd(pgd_offset(current->mm, 0UL), low_ptr);
flush_tlb_all();
}
diff -puN arch/x86_64/kernel/head.S~4level-x86-64 arch/x86_64/kernel/head.S
--- linux-2.6/arch/x86_64/kernel/head.S~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/arch/x86_64/kernel/head.S 2004-12-22 20:33:05.000000000 +1100
@@ -225,7 +225,6 @@ ENTRY(init_level4_pgt)
.quad 0x0000000000103007 /* -> level3_kernel_pgt */
.org 0x2000
-/* Kernel does not "know" about 4-th level of page tables. */
ENTRY(level3_ident_pgt)
.quad 0x0000000000104007
.fill 511,8,0
diff -puN arch/x86_64/kernel/init_task.c~4level-x86-64 arch/x86_64/kernel/init_task.c
--- linux-2.6/arch/x86_64/kernel/init_task.c~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/arch/x86_64/kernel/init_task.c 2004-12-22 20:33:05.000000000 +1100
@@ -47,5 +47,3 @@ EXPORT_SYMBOL(init_task);
DEFINE_PER_CPU(struct tss_struct, init_tss) ____cacheline_maxaligned_in_smp;
#define ALIGN_TO_4K __attribute__((section(".data.init_task")))
-
-pgd_t boot_vmalloc_pgt[512] ALIGN_TO_4K;
diff -puN arch/x86_64/kernel/setup64.c~4level-x86-64 arch/x86_64/kernel/setup64.c
--- linux-2.6/arch/x86_64/kernel/setup64.c~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/arch/x86_64/kernel/setup64.c 2004-12-22 20:33:05.000000000 +1100
@@ -66,7 +66,7 @@ __setup("noexec=", nonx_setup);
/*
* Great future plan:
- * Declare PDA itself and support (irqstack,tss,pml4) as per cpu data.
+ * Declare PDA itself and support (irqstack,tss,pgd) as per cpu data.
* Always point %gs to its beginning
*/
void __init setup_per_cpu_areas(void)
@@ -100,7 +100,6 @@ void __init setup_per_cpu_areas(void)
void pda_init(int cpu)
{
- pml4_t *level4;
struct x8664_pda *pda = &cpu_pda[cpu];
/* Setup up data that may be needed in __get_free_pages early */
@@ -119,22 +118,14 @@ void pda_init(int cpu)
/* others are initialized in smpboot.c */
pda->pcurrent = &init_task;
pda->irqstackptr = boot_cpu_stack;
- level4 = init_level4_pgt;
} else {
- level4 = (pml4_t *)__get_free_pages(GFP_ATOMIC, 0);
- if (!level4)
- panic("Cannot allocate top level page for cpu %d", cpu);
pda->irqstackptr = (char *)
__get_free_pages(GFP_ATOMIC, IRQSTACK_ORDER);
if (!pda->irqstackptr)
panic("cannot allocate irqstack for cpu %d", cpu);
}
- pda->level4_pgt = (unsigned long *)level4;
- if (level4 != init_level4_pgt)
- memcpy(level4, &init_level4_pgt, PAGE_SIZE);
- set_pml4(level4 + 510, mk_kernel_pml4(__pa_symbol(boot_vmalloc_pgt)));
- asm volatile("movq %0,%%cr3" :: "r" (__pa(level4)));
+ asm volatile("movq %0,%%cr3" :: "r" (__pa_symbol(&init_level4_pgt)));
pda->irqstackptr += IRQSTACKSIZE-64;
}
diff -puN arch/x86_64/mm/fault.c~4level-x86-64 arch/x86_64/mm/fault.c
--- linux-2.6/arch/x86_64/mm/fault.c~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/arch/x86_64/mm/fault.c 2004-12-22 20:33:05.000000000 +1100
@@ -143,25 +143,25 @@ static int bad_address(void *p)
void dump_pagetable(unsigned long address)
{
- pml4_t *pml4;
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte;
- asm("movq %%cr3,%0" : "=r" (pml4));
+ asm("movq %%cr3,%0" : "=r" (pgd));
- pml4 = __va((unsigned long)pml4 & PHYSICAL_PAGE_MASK);
- pml4 += pml4_index(address);
- printk("PML4 %lx ", pml4_val(*pml4));
- if (bad_address(pml4)) goto bad;
- if (!pml4_present(*pml4)) goto ret;
-
- pgd = __pgd_offset_k((pgd_t *)pml4_page(*pml4), address);
+ pgd = __va((unsigned long)pgd & PHYSICAL_PAGE_MASK);
+ pgd += pgd_index(address);
+ printk("PGD %lx ", pgd_val(*pgd));
if (bad_address(pgd)) goto bad;
- printk("PGD %lx ", pgd_val(*pgd));
- if (!pgd_present(*pgd)) goto ret;
+ if (!pgd_present(*pgd)) goto ret;
+
+ pud = __pud_offset_k((pud_t *)pgd_page(*pgd), address);
+ if (bad_address(pud)) goto bad;
+ printk("PUD %lx ", pud_val(*pud));
+ if (!pud_present(*pud)) goto ret;
- pmd = pmd_offset(pgd, address);
+ pmd = pmd_offset(pud, address);
if (bad_address(pmd)) goto bad;
printk("PMD %lx ", pmd_val(*pmd));
if (!pmd_present(*pmd)) goto ret;
@@ -232,7 +232,53 @@ static noinline void pgtable_bad(unsigne
do_exit(SIGKILL);
}
-int page_fault_trace;
+/*
+ * Handle a fault on the vmalloc or module mapping area
+ */
+static int vmalloc_fault(unsigned long address)
+{
+ pgd_t *pgd, *pgd_ref;
+ pud_t *pud, *pud_ref;
+ pmd_t *pmd, *pmd_ref;
+ pte_t *pte, *pte_ref;
+
+ /* Copy kernel mappings over when needed. This can also
happen within a race in page table update. In the latter
case just flush. */
+
+ pgd = pgd_offset(current->mm ?: &init_mm, address);
+ pgd_ref = pgd_offset_k(address);
+ if (pgd_none(*pgd_ref))
+ return -1;
+ if (pgd_none(*pgd))
+ set_pgd(pgd, *pgd_ref);
+
+ /* Below here mismatches are bugs because these lower tables
+ are shared */
+
+ pud = pud_offset(pgd, address);
+ pud_ref = pud_offset(pgd_ref, address);
+ if (pud_none(*pud_ref))
+ return -1;
+ if (pud_none(*pud) || pud_page(*pud) != pud_page(*pud_ref))
+ BUG();
+ pmd = pmd_offset(pud, address);
+ pmd_ref = pmd_offset(pud_ref, address);
+ if (pmd_none(*pmd_ref))
+ return -1;
+ if (pmd_none(*pmd) || pmd_page(*pmd) != pmd_page(*pmd_ref))
+ BUG();
+ pte_ref = pte_offset_kernel(pmd_ref, address);
+ if (!pte_present(*pte_ref))
+ return -1;
+ pte = pte_offset_kernel(pmd, address);
+ if (!pte_present(*pte) || pte_page(*pte) != pte_page(*pte_ref))
+ BUG();
+ __flush_tlb_all();
+ return 0;
+}
+
+int page_fault_trace = 0;
int exception_trace = 1;
/*
@@ -300,8 +346,11 @@ asmlinkage void do_page_fault(struct pt_
* protection error (error_code & 1) == 0.
*/
if (unlikely(address >= TASK_SIZE)) {
- if (!(error_code & 5))
- goto vmalloc_fault;
+ if (!(error_code & 5)) {
+ if (vmalloc_fault(address) < 0)
+ goto bad_area_nosemaphore;
+ return;
+ }
/*
* Don't take the mm semaphore here. If we fixup a prefetch
* fault we could otherwise deadlock.
@@ -310,7 +359,7 @@ asmlinkage void do_page_fault(struct pt_
}
if (unlikely(error_code & (1 << 3)))
- goto page_table_corruption;
+ pgtable_bad(address, regs, error_code);
/*
* If we're in an interrupt or have no user
@@ -524,34 +573,4 @@ do_sigbus:
info.si_addr = (void __user *)address;
force_sig_info(SIGBUS, &info, tsk);
return;
-
-vmalloc_fault:
- {
- pgd_t *pgd;
- pmd_t *pmd;
- pte_t *pte;
-
- /*
- * x86-64 has the same kernel 3rd level pages for all CPUs.
- * But for vmalloc/modules the TLB synchronization works lazily,
- * so it can happen that we get a page fault for something
- * that is really already in the page table. Just check if it
- * is really there and when yes flush the local TLB.
- */
- pgd = pgd_offset_k(address);
- if (!pgd_present(*pgd))
- goto bad_area_nosemaphore;
- pmd = pmd_offset(pgd, address);
- if (!pmd_present(*pmd))
- goto bad_area_nosemaphore;
- pte = pte_offset_kernel(pmd, address);
- if (!pte_present(*pte))
- goto bad_area_nosemaphore;
-
- __flush_tlb_all();
- return;
- }
-
-page_table_corruption:
- pgtable_bad(address, regs, error_code);
}
diff -puN arch/x86_64/mm/init.c~4level-x86-64 arch/x86_64/mm/init.c
--- linux-2.6/arch/x86_64/mm/init.c~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/arch/x86_64/mm/init.c 2004-12-22 20:33:05.000000000 +1100
@@ -108,28 +108,28 @@ static void *spp_getpage(void)
static void set_pte_phys(unsigned long vaddr,
unsigned long phys, pgprot_t prot)
{
- pml4_t *level4;
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte, new_pte;
Dprintk("set_pte_phys %lx to %lx\n", vaddr, phys);
- level4 = pml4_offset_k(vaddr);
- if (pml4_none(*level4)) {
- printk("PML4 FIXMAP MISSING, it should be setup in head.S!\n");
+ pgd = pgd_offset_k(vaddr);
+ if (pgd_none(*pgd)) {
+ printk("PGD FIXMAP MISSING, it should be setup in head.S!\n");
return;
}
- pgd = level3_offset_k(level4, vaddr);
- if (pgd_none(*pgd)) {
+ pud = pud_offset(pgd, vaddr);
+ if (pud_none(*pud)) {
pmd = (pmd_t *) spp_getpage();
- set_pgd(pgd, __pgd(__pa(pmd) | _KERNPG_TABLE | _PAGE_USER));
- if (pmd != pmd_offset(pgd, 0)) {
- printk("PAGETABLE BUG #01! %p <-> %p\n", pmd, pmd_offset(pgd,0));
+ set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE | _PAGE_USER));
+ if (pmd != pmd_offset(pud, 0)) {
+ printk("PAGETABLE BUG #01! %p <-> %p\n", pmd, pmd_offset(pud,0));
return;
}
}
- pmd = pmd_offset(pgd, vaddr);
+ pmd = pmd_offset(pud, vaddr);
if (pmd_none(*pmd)) {
pte = (pte_t *) spp_getpage();
set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE | _PAGE_USER));
@@ -210,31 +210,31 @@ static __init void unmap_low_page(int i)
ti->allocated = 0;
}
-static void __init phys_pgd_init(pgd_t *pgd, unsigned long address, unsigned long end)
+static void __init phys_pud_init(pud_t *pud, unsigned long address, unsigned long end)
{
long i, j;
- i = pgd_index(address);
- pgd = pgd + i;
- for (; i < PTRS_PER_PGD; pgd++, i++) {
+ i = pud_index(address);
+ pud = pud + i;
+ for (; i < PTRS_PER_PUD; pud++, i++) {
int map;
unsigned long paddr, pmd_phys;
pmd_t *pmd;
- paddr = (address & PML4_MASK) + i*PGDIR_SIZE;
+ paddr = address + i*PUD_SIZE;
if (paddr >= end) {
- for (; i < PTRS_PER_PGD; i++, pgd++)
- set_pgd(pgd, __pgd(0));
+ for (; i < PTRS_PER_PUD; i++, pud++)
+ set_pud(pud, __pud(0));
break;
}
- if (!e820_mapped(paddr, paddr+PGDIR_SIZE, 0)) {
- set_pgd(pgd, __pgd(0));
+ if (!e820_mapped(paddr, paddr+PUD_SIZE, 0)) {
+ set_pud(pud, __pud(0));
continue;
}
pmd = alloc_low_page(&map, &pmd_phys);
- set_pgd(pgd, __pgd(pmd_phys | _KERNPG_TABLE));
+ set_pud(pud, __pud(pmd_phys | _KERNPG_TABLE));
for (j = 0; j < PTRS_PER_PMD; pmd++, j++, paddr += PMD_SIZE) {
unsigned long pe;
@@ -260,7 +260,7 @@ void __init init_memory_mapping(void)
unsigned long adr;
unsigned long end;
unsigned long next;
- unsigned long pgds, pmds, tables;
+ unsigned long puds, pmds, tables;
Dprintk("init_memory_mapping\n");
@@ -273,9 +273,9 @@ void __init init_memory_mapping(void)
* discovered.
*/
- pgds = (end + PGDIR_SIZE - 1) >> PGDIR_SHIFT;
+ puds = (end + PUD_SIZE - 1) >> PUD_SHIFT;
pmds = (end + PMD_SIZE - 1) >> PMD_SHIFT;
- tables = round_up(pgds*8, PAGE_SIZE) + round_up(pmds * 8, PAGE_SIZE);
+ tables = round_up(puds*8, PAGE_SIZE) + round_up(pmds * 8, PAGE_SIZE);
table_start = find_e820_area(0x8000, __pa_symbol(&_text), tables);
if (table_start == -1UL)
@@ -288,13 +288,13 @@ void __init init_memory_mapping(void)
for (adr = PAGE_OFFSET; adr < end; adr = next) {
int map;
- unsigned long pgd_phys;
- pgd_t *pgd = alloc_low_page(&map, &pgd_phys);
- next = adr + PML4_SIZE;
+ unsigned long pud_phys;
+ pud_t *pud = alloc_low_page(&map, &pud_phys);
+ next = adr + PGDIR_SIZE;
if (next > end)
next = end;
- phys_pgd_init(pgd, adr-PAGE_OFFSET, next-PAGE_OFFSET);
- set_pml4(init_level4_pgt + pml4_index(adr), mk_kernel_pml4(pgd_phys));
+ phys_pud_init(pud, adr-PAGE_OFFSET, next-PAGE_OFFSET);
+ set_pgd(init_level4_pgt + pgd_index(adr), mk_kernel_pgd(pud_phys));
unmap_low_page(map);
}
asm volatile("movq %%cr4,%0" : "=r" (mmu_cr4_features));
@@ -306,25 +306,12 @@ void __init init_memory_mapping(void)
extern struct x8664_pda cpu_pda[NR_CPUS];
-static unsigned long low_pml4[NR_CPUS];
-
-void swap_low_mappings(void)
-{
- int i;
- for (i = 0; i < NR_CPUS; i++) {
- unsigned long t;
- if (!cpu_pda[i].level4_pgt)
- continue;
- t = cpu_pda[i].level4_pgt[0];
- cpu_pda[i].level4_pgt[0] = low_pml4[i];
- low_pml4[i] = t;
- }
- flush_tlb_all();
-}
-
+/* Assumes all CPUs still execute in init_mm */
void zap_low_mappings(void)
{
- swap_low_mappings();
+ pgd_t *pgd = pgd_offset_k(0UL);
+ pgd_clear(pgd);
+ flush_tlb_all();
}
#ifndef CONFIG_DISCONTIGMEM
@@ -361,10 +348,14 @@ void __init clear_kernel_mapping(unsigne
for (; address < end; address += LARGE_PAGE_SIZE) {
pgd_t *pgd = pgd_offset_k(address);
- pmd_t *pmd;
- if (!pgd || pgd_none(*pgd))
+ pud_t *pud;
+ pmd_t *pmd;
+ if (pgd_none(*pgd))
+ continue;
+ pud = pud_offset(pgd, address);
+ if (pud_none(*pud))
continue;
- pmd = pmd_offset(pgd, address);
+ pmd = pmd_offset(pud, address);
if (!pmd || pmd_none(*pmd))
continue;
if (0 == (pmd_val(*pmd) & _PAGE_PSE)) {
@@ -531,29 +522,29 @@ void __init reserve_bootmem_generic(unsi
int kern_addr_valid(unsigned long addr)
{
unsigned long above = ((long)addr) >> __VIRTUAL_MASK_SHIFT;
- pml4_t *pml4;
pgd_t *pgd;
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte;
if (above != 0 && above != -1UL)
return 0;
- pml4 = pml4_offset_k(addr);
- if (pml4_none(*pml4))
+ pgd = pgd_offset_k(addr);
+ if (pgd_none(*pgd))
return 0;
- pgd = pgd_offset_k(addr);
- if (pgd_none(*pgd))
+ pud = pud_offset(pgd, addr);
+ if (pud_none(*pud))
return 0;
- pmd = pmd_offset(pgd, addr);
+ pmd = pmd_offset(pud, addr);
if (pmd_none(*pmd))
return 0;
if (pmd_large(*pmd))
return pfn_valid(pmd_pfn(*pmd));
- pte = pte_offset_kernel(pmd, addr);
+ pte = pte_offset_kernel(pmd, addr);
if (pte_none(*pte))
return 0;
return pfn_valid(pte_pfn(*pte));
diff -puN arch/x86_64/mm/ioremap.c~4level-x86-64 arch/x86_64/mm/ioremap.c
--- linux-2.6/arch/x86_64/mm/ioremap.c~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/arch/x86_64/mm/ioremap.c 2004-12-22 20:33:05.000000000 +1100
@@ -49,10 +49,10 @@ static inline int remap_area_pmd(pmd_t *
{
unsigned long end;
- address &= ~PGDIR_MASK;
+ address &= ~PUD_MASK;
end = address + size;
- if (end > PGDIR_SIZE)
- end = PGDIR_SIZE;
+ if (end > PUD_SIZE)
+ end = PUD_SIZE;
phys_addr -= address;
if (address >= end)
BUG();
@@ -67,31 +67,54 @@ static inline int remap_area_pmd(pmd_t *
return 0;
}
+static inline int remap_area_pud(pud_t * pud, unsigned long address, unsigned long size,
+ unsigned long phys_addr, unsigned long flags)
+{
+ unsigned long end;
+
+ address &= ~PGDIR_MASK;
+ end = address + size;
+ if (end > PGDIR_SIZE)
+ end = PGDIR_SIZE;
+ phys_addr -= address;
+ if (address >= end)
+ BUG();
+ do {
+ pmd_t * pmd = pmd_alloc(&init_mm, pud, address);
+ if (!pmd)
+ return -ENOMEM;
+ remap_area_pmd(pmd, address, end - address, address + phys_addr, flags);
+ address = (address + PUD_SIZE) & PUD_MASK;
+ pud++;
+ } while (address && (address < end));
+ return 0;
+}
+
static int remap_area_pages(unsigned long address, unsigned long phys_addr,
unsigned long size, unsigned long flags)
{
int error;
- pgd_t * dir;
+ pgd_t *pgd;
unsigned long end = address + size;
phys_addr -= address;
- dir = pgd_offset_k(address);
+ pgd = pgd_offset_k(address);
flush_cache_all();
if (address >= end)
BUG();
spin_lock(&init_mm.page_table_lock);
do {
- pmd_t *pmd;
- pmd = pmd_alloc(&init_mm, dir, address);
+ pud_t *pud;
+ pud = pud_alloc(&init_mm, pgd, address);
error = -ENOMEM;
- if (!pmd)
+ if (!pud)
break;
- if (remap_area_pmd(pmd, address, end - address,
+ if (remap_area_pud(pud, address, end - address,
phys_addr + address, flags))
break;
error = 0;
address = (address + PGDIR_SIZE) & PGDIR_MASK;
- dir++;
+ pgd++;
} while (address && (address < end));
spin_unlock(&init_mm.page_table_lock);
flush_tlb_all();
diff -puN arch/x86_64/mm/pageattr.c~4level-x86-64 arch/x86_64/mm/pageattr.c
--- linux-2.6/arch/x86_64/mm/pageattr.c~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/arch/x86_64/mm/pageattr.c 2004-12-22 20:33:05.000000000 +1100
@@ -16,12 +16,16 @@
static inline pte_t *lookup_address(unsigned long address)
{
- pgd_t *pgd = pgd_offset_k(address);
+ pgd_t *pgd = pgd_offset_k(address);
+ pud_t *pud;
pmd_t *pmd;
pte_t *pte;
- if (!pgd || !pgd_present(*pgd))
+ if (pgd_none(*pgd))
+ return NULL;
+ pud = pud_offset(pgd, address);
+ if (!pud_present(*pud))
return NULL;
- pmd = pmd_offset(pgd, address);
+ pmd = pmd_offset(pud, address);
if (!pmd_present(*pmd))
return NULL;
if (pmd_large(*pmd))
@@ -98,16 +102,20 @@ static inline void save_page(unsigned lo
*/
static void revert_page(unsigned long address, pgprot_t ref_prot)
{
- pgd_t *pgd;
- pmd_t *pmd;
- pte_t large_pte;
-
- pgd = pgd_offset_k(address);
- pmd = pmd_offset(pgd, address);
- BUG_ON(pmd_val(*pmd) & _PAGE_PSE);
- pgprot_val(ref_prot) |= _PAGE_PSE;
- large_pte = mk_pte_phys(__pa(address) & LARGE_PAGE_MASK, ref_prot);
- set_pte((pte_t *)pmd, large_pte);
+ pgd_t *pgd;
+ pud_t *pud;
+ pmd_t *pmd;
+ pte_t large_pte;
+
+ pgd = pgd_offset_k(address);
+ BUG_ON(pgd_none(*pgd));
+ pud = pud_offset(pgd,address);
+ BUG_ON(pud_none(*pud));
+ pmd = pmd_offset(pud, address);
+ BUG_ON(pmd_val(*pmd) & _PAGE_PSE);
+ pgprot_val(ref_prot) |= _PAGE_PSE;
+ large_pte = mk_pte_phys(__pa(address) & LARGE_PAGE_MASK, ref_prot);
+ set_pte((pte_t *)pmd, large_pte);
}
static int
diff -puN include/asm-x86_64/e820.h~4level-x86-64 include/asm-x86_64/e820.h
--- linux-2.6/include/asm-x86_64/e820.h~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/include/asm-x86_64/e820.h 2004-12-22 20:33:05.000000000 +1100
@@ -26,9 +26,6 @@
#define LOWMEMSIZE() (0x9f000)
-#define MAXMEM (120UL * 1024 * 1024 * 1024 * 1024) /* 120TB */
-
-
#ifndef __ASSEMBLY__
struct e820entry {
u64 addr; /* start of memory segment */
diff -puN include/asm-x86_64/mmu_context.h~4level-x86-64 include/asm-x86_64/mmu_context.h
--- linux-2.6/include/asm-x86_64/mmu_context.h~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/include/asm-x86_64/mmu_context.h 2004-12-22 20:33:05.000000000 +1100
@@ -40,10 +40,7 @@ static inline void switch_mm(struct mm_s
write_pda(active_mm, next);
#endif
set_bit(cpu, &next->cpu_vm_mask);
- /* Re-load page tables */
- *read_pda(level4_pgt) = __pa(next->pgd) | _PAGE_TABLE;
- __flush_tlb();
-
+ asm volatile("movq %0,%%cr3" :: "r" (__pa(next->pgd)) : "memory");
if (unlikely(next->context.ldt != prev->context.ldt))
load_LDT_nolock(&next->context, cpu);
}
diff -puN include/asm-x86_64/page.h~4level-x86-64 include/asm-x86_64/page.h
--- linux-2.6/include/asm-x86_64/page.h~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/include/asm-x86_64/page.h 2004-12-22 20:33:05.000000000 +1100
@@ -43,22 +43,22 @@ void copy_page(void *, void *);
*/
typedef struct { unsigned long pte; } pte_t;
typedef struct { unsigned long pmd; } pmd_t;
+typedef struct { unsigned long pud; } pud_t;
typedef struct { unsigned long pgd; } pgd_t;
-typedef struct { unsigned long pml4; } pml4_t;
#define PTE_MASK PHYSICAL_PAGE_MASK
typedef struct { unsigned long pgprot; } pgprot_t;
#define pte_val(x) ((x).pte)
#define pmd_val(x) ((x).pmd)
+#define pud_val(x) ((x).pud)
#define pgd_val(x) ((x).pgd)
-#define pml4_val(x) ((x).pml4)
#define pgprot_val(x) ((x).pgprot)
#define __pte(x) ((pte_t) { (x) } )
#define __pmd(x) ((pmd_t) { (x) } )
+#define __pud(x) ((pud_t) { (x) } )
#define __pgd(x) ((pgd_t) { (x) } )
-#define __pml4(x) ((pml4_t) { (x) } )
#define __pgprot(x) ((pgprot_t) { (x) } )
extern unsigned long vm_stack_flags, vm_stack_flags32;
@@ -67,19 +67,19 @@ extern unsigned long vm_force_exec32;
#define __START_KERNEL 0xffffffff80100000UL
#define __START_KERNEL_map 0xffffffff80000000UL
-#define __PAGE_OFFSET 0x0000010000000000UL /* 1 << 40 */
+#define __PAGE_OFFSET 0xffff810000000000UL
#else
#define __START_KERNEL 0xffffffff80100000
#define __START_KERNEL_map 0xffffffff80000000
-#define __PAGE_OFFSET 0x0000010000000000 /* 1 << 40 */
+#define __PAGE_OFFSET 0xffff810000000000
#endif /* !__ASSEMBLY__ */
/* to align the pointer to the (next) page boundary */
#define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK)
/* See Documentation/x86_64/mm.txt for a description of the memory map. */
-#define __PHYSICAL_MASK_SHIFT 40
+#define __PHYSICAL_MASK_SHIFT 46
#define __PHYSICAL_MASK ((1UL << __PHYSICAL_MASK_SHIFT) - 1)
#define __VIRTUAL_MASK_SHIFT 48
#define __VIRTUAL_MASK ((1UL << __VIRTUAL_MASK_SHIFT) - 1)
diff -puN include/asm-x86_64/pda.h~4level-x86-64 include/asm-x86_64/pda.h
--- linux-2.6/include/asm-x86_64/pda.h~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/include/asm-x86_64/pda.h 2004-12-22 20:33:05.000000000 +1100
@@ -17,7 +17,6 @@ struct x8664_pda {
int irqcount; /* Irq nesting counter. Starts with -1 */
int cpunumber; /* Logical CPU number */
char *irqstackptr; /* top of irqstack */
- unsigned long volatile *level4_pgt; /* Per CPU top level page table */
unsigned int __softirq_pending;
unsigned int __nmi_count; /* number of NMI on this CPUs */
struct mm_struct *active_mm;
diff -puN include/asm-x86_64/pgalloc.h~4level-x86-64 include/asm-x86_64/pgalloc.h
--- linux-2.6/include/asm-x86_64/pgalloc.h~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/include/asm-x86_64/pgalloc.h 2004-12-22 20:33:05.000000000 +1100
@@ -9,8 +9,10 @@
#define pmd_populate_kernel(mm, pmd, pte) \
set_pmd(pmd, __pmd(_PAGE_TABLE | __pa(pte)))
-#define pgd_populate(mm, pgd, pmd) \
- set_pgd(pgd, __pgd(_PAGE_TABLE | __pa(pmd)))
+#define pud_populate(mm, pud, pmd) \
+ set_pud(pud, __pud(_PAGE_TABLE | __pa(pmd)))
+#define pgd_populate(mm, pgd, pud) \
+ set_pgd(pgd, __pgd(_PAGE_TABLE | __pa(pud)))
static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd, struct page *pte)
{
@@ -33,12 +35,37 @@ static inline pmd_t *pmd_alloc_one (stru
return (pmd_t *)get_zeroed_page(GFP_KERNEL|__GFP_REPEAT);
}
-static inline pgd_t *pgd_alloc (struct mm_struct *mm)
+static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
{
- return (pgd_t *)get_zeroed_page(GFP_KERNEL|__GFP_REPEAT);
+ return (pud_t *)get_zeroed_page(GFP_KERNEL|__GFP_REPEAT);
}
-static inline void pgd_free (pgd_t *pgd)
+static inline void pud_free (pud_t *pud)
+{
+ BUG_ON((unsigned long)pud & (PAGE_SIZE-1));
+ free_page((unsigned long)pud);
+}
+
+static inline pgd_t *pgd_alloc(struct mm_struct *mm)
+{
+ unsigned boundary;
+ pgd_t *pgd = (pgd_t *)__get_free_page(GFP_KERNEL|__GFP_REPEAT);
+ if (!pgd)
+ return NULL;
+ /*
+ * Copy kernel pointers in from init.
+ * Could keep a freelist or slab cache of those because the kernel
+ * part never changes.
+ */
+ boundary = pgd_index(__PAGE_OFFSET);
+ memset(pgd, 0, boundary * sizeof(pgd_t));
+ memcpy(pgd + boundary,
+ init_level4_pgt + boundary,
+ (PTRS_PER_PGD - boundary) * sizeof(pgd_t));
+ return pgd;
+}
+
+static inline void pgd_free(pgd_t *pgd)
{
BUG_ON((unsigned long)pgd & (PAGE_SIZE-1));
free_page((unsigned long)pgd);
@@ -73,5 +100,6 @@ extern inline void pte_free(struct page
#define __pte_free_tlb(tlb,pte) tlb_remove_page((tlb),(pte))
#define __pmd_free_tlb(tlb,x) pmd_free(x)
+#define __pud_free_tlb(tlb,x) pud_free(x)
#endif /* _X86_64_PGALLOC_H */
diff -puN include/asm-x86_64/pgtable.h~4level-x86-64 include/asm-x86_64/pgtable.h
--- linux-2.6/include/asm-x86_64/pgtable.h~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/include/asm-x86_64/pgtable.h 2004-12-22 20:34:25.000000000 +1100
@@ -1,17 +1,9 @@
#ifndef _X86_64_PGTABLE_H
#define _X86_64_PGTABLE_H
-#include <asm-generic/4level-fixup.h>
-
/*
* This file contains the functions and defines necessary to modify and use
* the x86-64 page table tree.
- *
- * x86-64 has a 4 level table setup. Generic linux MM only supports
- * three levels. The fourth level is currently a single static page that
- * is shared by everybody and just contains a pointer to the current
- * three level page setup on the beginning and some kernel mappings at
- * the end. For more details see Documentation/x86_64/mm.txt
*/
#include <asm/processor.h>
#include <asm/fixmap.h>
@@ -19,15 +11,14 @@
#include <linux/threads.h>
#include <asm/pda.h>
-extern pgd_t level3_kernel_pgt[512];
-extern pgd_t level3_physmem_pgt[512];
-extern pgd_t level3_ident_pgt[512];
+extern pud_t level3_kernel_pgt[512];
+extern pud_t level3_physmem_pgt[512];
+extern pud_t level3_ident_pgt[512];
extern pmd_t level2_kernel_pgt[512];
-extern pml4_t init_level4_pgt[];
-extern pgd_t boot_vmalloc_pgt[];
+extern pgd_t init_level4_pgt[];
extern unsigned long __supported_pte_mask;
-#define swapper_pg_dir NULL
+#define swapper_pg_dir init_level4_pgt
extern void paging_init(void);
extern void clear_kernel_mapping(unsigned long addr, unsigned long size);
@@ -41,16 +32,19 @@ extern unsigned long pgkern_mask;
extern unsigned long empty_zero_page[PAGE_SIZE/sizeof(unsigned long)];
#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))
-#define PML4_SHIFT 39
-#define PTRS_PER_PML4 512
-
/*
* PGDIR_SHIFT determines what a top-level page table entry can map
*/
-#define PGDIR_SHIFT 30
+#define PGDIR_SHIFT 39
#define PTRS_PER_PGD 512
/*
+ * 3rd level page
+ */
+#define PUD_SHIFT 30
+#define PTRS_PER_PUD 512
+
+/*
* PMD_SHIFT determines the size of the area a middle-level
* page table can map
*/
@@ -66,14 +60,13 @@ extern unsigned long empty_zero_page[PAG
printk("%s:%d: bad pte %p(%016lx).\n", __FILE__, __LINE__, &(e), pte_val(e))
#define pmd_ERROR(e) \
printk("%s:%d: bad pmd %p(%016lx).\n", __FILE__, __LINE__, &(e), pmd_val(e))
+#define pud_ERROR(e) \
+ printk("%s:%d: bad pud %p(%016lx).\n", __FILE__, __LINE__, &(e), pud_val(e))
#define pgd_ERROR(e) \
printk("%s:%d: bad pgd %p(%016lx).\n", __FILE__, __LINE__, &(e), pgd_val(e))
-
-#define pml4_none(x) (!pml4_val(x))
#define pgd_none(x) (!pgd_val(x))
-
-extern inline int pgd_present(pgd_t pgd) { return !pgd_none(pgd); }
+#define pud_none(x) (!pud_val(x))
static inline void set_pte(pte_t *dst, pte_t val)
{
@@ -85,6 +78,16 @@ static inline void set_pmd(pmd_t *dst, p
pmd_val(*dst) = pmd_val(val);
}
+static inline void set_pud(pud_t *dst, pud_t val)
+{
+ pud_val(*dst) = pud_val(val);
+}
+
+extern inline void pud_clear (pud_t *pud)
+{
+ set_pud(pud, __pud(0));
+}
+
static inline void set_pgd(pgd_t *dst, pgd_t val)
{
pgd_val(*dst) = pgd_val(val);
@@ -95,45 +98,30 @@ extern inline void pgd_clear (pgd_t * pg
set_pgd(pgd, __pgd(0));
}
-static inline void set_pml4(pml4_t *dst, pml4_t val)
-{
- pml4_val(*dst) = pml4_val(val);
-}
-
-#define pgd_page(pgd) \
-((unsigned long) __va(pgd_val(pgd) & PHYSICAL_PAGE_MASK))
+#define pud_page(pud) \
+((unsigned long) __va(pud_val(pud) & PHYSICAL_PAGE_MASK))
#define ptep_get_and_clear(xp) __pte(xchg(&(xp)->pte, 0))
#define pte_same(a, b) ((a).pte == (b).pte)
-#define PML4_SIZE (1UL << PML4_SHIFT)
-#define PML4_MASK (~(PML4_SIZE-1))
#define PMD_SIZE (1UL << PMD_SHIFT)
#define PMD_MASK (~(PMD_SIZE-1))
+#define PUD_SIZE (1UL << PUD_SHIFT)
+#define PUD_MASK (~(PUD_SIZE-1))
#define PGDIR_SIZE (1UL << PGDIR_SHIFT)
#define PGDIR_MASK (~(PGDIR_SIZE-1))
#define USER_PTRS_PER_PGD (TASK_SIZE/PGDIR_SIZE)
#define FIRST_USER_PGD_NR 0
-#define USER_PGD_PTRS (PAGE_OFFSET >> PGDIR_SHIFT)
-#define KERNEL_PGD_PTRS (PTRS_PER_PGD-USER_PGD_PTRS)
-
-#define TWOLEVEL_PGDIR_SHIFT 20
-#define BOOT_USER_L4_PTRS 1
-#define BOOT_KERNEL_L4_PTRS 511 /* But we will do it in 4rd level */
-
-
-
#ifndef __ASSEMBLY__
-#define VMALLOC_START 0xffffff0000000000UL
-#define VMALLOC_END 0xffffff7fffffffffUL
-#define MODULES_VADDR 0xffffffffa0000000UL
-#define MODULES_END 0xffffffffafffffffUL
+#define MAXMEM 0x3fffffffffffUL
+#define VMALLOC_START 0xffffc20000000000UL
+#define VMALLOC_END 0xffffe1ffffffffffUL
+#define MODULES_VADDR 0xffffffff88000000
+#define MODULES_END 0xfffffffffff00000
#define MODULES_LEN (MODULES_END - MODULES_VADDR)
-#define IOMAP_START 0xfffffe8000000000UL
-
#define _PAGE_BIT_PRESENT 0
#define _PAGE_BIT_RW 1
#define _PAGE_BIT_USER 2
@@ -224,6 +212,14 @@ static inline unsigned long pgd_bad(pgd_
return val & ~(_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED);
}
+static inline unsigned long pud_bad(pud_t pud)
+{
+ unsigned long val = pud_val(pud);
+ val &= ~PTE_MASK;
+ val &= ~(_PAGE_USER | _PAGE_DIRTY);
+ return val & ~(_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED);
+}
+
#define pte_none(x) (!pte_val(x))
#define pte_present(x) (pte_val(x) & (_PAGE_PRESENT | _PAGE_PROTNONE))
#define pte_clear(xp) do { set_pte(xp, __pte(0)); } while (0)
@@ -302,54 +298,32 @@ static inline int pmd_large(pmd_t pte) {
/*
* Level 4 access.
- * Never use these in the common code.
*/
-#define pml4_page(pml4) ((unsigned long) __va(pml4_val(pml4) & PTE_MASK))
-#define pml4_index(address) ((address >> PML4_SHIFT) & (PTRS_PER_PML4-1))
-#define pml4_offset_k(address) (init_level4_pgt + pml4_index(address))
-#define pml4_present(pml4) (pml4_val(pml4) & _PAGE_PRESENT)
-#define mk_kernel_pml4(address) ((pml4_t){ (address) | _KERNPG_TABLE })
-#define level3_offset_k(dir, address) ((pgd_t *) pml4_page(*(dir)) + pgd_index(address))
+#define pgd_page(pgd) ((unsigned long) __va((unsigned long)pgd_val(pgd) & PTE_MASK))
+#define pgd_index(address) (((address) >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))
+#define pgd_offset(mm, addr) ((mm)->pgd + pgd_index(addr))
+#define pgd_offset_k(address) (init_level4_pgt + pgd_index(address))
+#define pgd_present(pgd) (pgd_val(pgd) & _PAGE_PRESENT)
+#define mk_kernel_pgd(address) ((pgd_t){ (address) | _KERNPG_TABLE })
-/* PGD - Level3 access */
+/* PUD - Level3 access */
/* to find an entry in a page-table-directory. */
-#define pgd_index(address) (((address) >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))
-static inline pgd_t *__pgd_offset_k(pgd_t *pgd, unsigned long address)
+#define pud_index(address) (((address) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
+#define pud_offset(pgd, address) ((pud_t *) pgd_page(*(pgd)) + pud_index(address))
+#define pud_offset_k(pgd, addr) pud_offset(pgd, addr)
+#define pud_present(pud) (pud_val(pud) & _PAGE_PRESENT)
+
+static inline pud_t *__pud_offset_k(pud_t *pud, unsigned long address)
{
- return pgd + pgd_index(address);
+ return pud + pud_index(address);
}
-/* Find correct pgd via the hidden fourth level page level: */
-
-/* This accesses the reference page table of the boot cpu.
- Other CPUs get synced lazily via the page fault handler. */
-static inline pgd_t *pgd_offset_k(unsigned long address)
-{
- unsigned long addr;
-
- addr = pml4_val(init_level4_pgt[pml4_index(address)]);
- addr &= PHYSICAL_PAGE_MASK;
- return __pgd_offset_k((pgd_t *)__va(addr), address);
-}
-
-/* Access the pgd of the page table as seen by the current CPU. */
-static inline pgd_t *current_pgd_offset_k(unsigned long address)
-{
- unsigned long addr;
-
- addr = read_pda(level4_pgt)[pml4_index(address)];
- addr &= PHYSICAL_PAGE_MASK;
- return __pgd_offset_k((pgd_t *)__va(addr), address);
-}
-
-#define pgd_offset(mm, address) ((mm)->pgd+pgd_index(address))
-
/* PMD - Level 2 access */
#define pmd_page_kernel(pmd) ((unsigned long) __va(pmd_val(pmd) & PTE_MASK))
#define pmd_page(pmd) (pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT))
#define pmd_index(address) (((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1))
-#define pmd_offset(dir, address) ((pmd_t *) pgd_page(*(dir)) + \
+#define pmd_offset(dir, address) ((pmd_t *) pud_page(*(dir)) + \
pmd_index(address))
#define pmd_none(x) (!pmd_val(x))
#define pmd_present(x) (pmd_val(x) & _PAGE_PRESENT)
diff -puN include/asm-x86_64/processor.h~4level-x86-64 include/asm-x86_64/processor.h
--- linux-2.6/include/asm-x86_64/processor.h~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/include/asm-x86_64/processor.h 2004-12-22 20:33:05.000000000 +1100
@@ -165,9 +165,9 @@ static inline void clear_in_cr4 (unsigne
/*
- * User space process size: 512GB - 1GB (default).
+ * User space process size: 47 bits.
*/
-#define TASK_SIZE (0x0000007fc0000000UL)
+#define TASK_SIZE (0x800000000000)
/* This decides where the kernel will search for a free chunk of vm
* space during mmap's.
diff -puN arch/x86_64/kernel/reboot.c~4level-x86-64 arch/x86_64/kernel/reboot.c
--- linux-2.6/arch/x86_64/kernel/reboot.c~4level-x86-64 2004-12-22 20:33:05.000000000 +1100
+++ linux-2.6-npiggin/arch/x86_64/kernel/reboot.c 2004-12-22 20:33:05.000000000 +1100
@@ -74,7 +74,7 @@ static void reboot_warm(void)
local_irq_disable();
/* restore identity mapping */
- init_level4_pgt[0] = __pml4(__pa(level3_ident_pgt) | 7);
+ init_level4_pgt[0] = __pgd(__pa(level3_ident_pgt) | 7);
__flush_tlb_all();
/* Move the trampoline to low memory */
_
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 0/11] alternate 4-level page tables patches (take 2)
2004-12-22 9:50 [PATCH 0/11] alternate 4-level page tables patches (take 2) Nick Piggin
2004-12-22 9:52 ` [PATCH 1/11] parentheses to x86-64 macro Nick Piggin
@ 2004-12-22 10:18 ` Andi Kleen
1 sibling, 0 replies; 13+ messages in thread
From: Andi Kleen @ 2004-12-22 10:18 UTC (permalink / raw)
To: Nick Piggin
Cc: Linus Torvalds, Andrew Morton, Andi Kleen, Hugh Dickins,
Linux Memory Management
> Comments? Any consensus as to which way we want to go? I don't want to
> inflame tempers by continuing this line of work, just provoke discussion.
Personally I think it's still better to just convert the architectures
over like I did. It has to be done anyway, since you can't leave
the warnings in.
When that is done it doesn't matter much which level you change.
I offer my tested patchkit for that :) The main advantage is that,
since it's already been tested for quite some time, it could be
merged much faster. And Nick would save some work someone else has
already done ;-)
If it helps I can do a global s/pml4_t/p<whatevernamelinusprefers>_t/ too.
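(Such a rename is mechanical; a sketch with GNU tools, where pXXX
stands in for whatever name gets picked:

	find . -name '*.[chS]' -print0 | xargs -0 sed -i 's/pml4/pXXX/g'

plus a second pass for the uppercase PML4 spellings in comments and
Documentation.)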
-Andi
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: aart@kvack.org
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2004-12-22 10:18 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-12-22 9:50 [PATCH 0/11] alternate 4-level page tables patches (take 2) Nick Piggin
2004-12-22 9:52 ` [PATCH 1/11] parentheses to x86-64 macro Nick Piggin
2004-12-22 9:53 ` [PATCH 2/11] generic 3-level nopmd folding header Nick Piggin
2004-12-22 9:54 ` [PATCH 3/11] convert i386 to generic nopmd header Nick Piggin
2004-12-22 9:54 ` [PATCH 4/11] split copy_page_range Nick Piggin
2004-12-22 9:55 ` [PATCH 5/11] replace clear_page_tables with clear_page_range Nick Piggin
2004-12-22 9:56 ` [PATCH 6/11] introduce 4-level nopud folding header Nick Piggin
2004-12-22 9:57 ` [PATCH 7/11] convert Linux to 4-level page tables Nick Piggin
2004-12-22 9:59 ` [PATCH 8/11] introduce fallback header Nick Piggin
2004-12-22 10:00 ` [PATCH 9/11] convert i386 to generic nopud header Nick Piggin
2004-12-22 10:00 ` [PATCH 10/11] convert ia64 " Nick Piggin
2004-12-22 10:01 ` [PATCH 11/11] convert x86_64 to 4 level page tables Nick Piggin
2004-12-22 10:18 ` [PATCH 0/11] alternate 4-level page tables patches (take 2) Andi Kleen