From: Hugh Dickins <hugh.dickins@tiscali.co.uk>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Nick Piggin <npiggin@suse.de>, Rik van Riel <riel@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 7/8] mm: reinstate ZERO_PAGE
Date: Mon, 7 Sep 2009 22:39:34 +0100 (BST) [thread overview]
Message-ID: <Pine.LNX.4.64.0909072238320.15430@sister.anvils> (raw)
In-Reply-To: <Pine.LNX.4.64.0909072222070.15424@sister.anvils>
KAMEZAWA Hiroyuki has observed customers of earlier kernels taking
advantage of the ZERO_PAGE: which we stopped do_anonymous_page() from
using in 2.6.24. And there were a couple of regression reports on LKML.
Following suggestions from Linus, reinstate do_anonymous_page() use of
the ZERO_PAGE; but this time avoid dirtying its struct page cacheline
with (map)count updates - let vm_normal_page() regard it as abnormal.
Use it only on arches which __HAVE_ARCH_PTE_SPECIAL (x86, s390, sh32,
most powerpc): that's not essential, but minimizes additional branches
(keeping them in the unlikely pte_special case); and incidentally
excludes mips (some models of which needed eight colours of ZERO_PAGE
to avoid costly exceptions).
Don't be fanatical about avoiding ZERO_PAGE updates: get_user_pages()
callers won't want to make exceptions for it, so increment its count
there. Changes to mlock and migration? happily seems not needed.
In most places it's quicker to check pfn than struct page address:
prepare a __read_mostly zero_pfn for that. Does get_dump_page()
still need its ZERO_PAGE check? probably not, but keep it anyway.
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
---
I have not studied the performance of this at all: I'd rather it go
into mmotm where others may decide whether it's a good thing or not.
mm/memory.c | 53 +++++++++++++++++++++++++++++++++++++++++---------
1 file changed, 44 insertions(+), 9 deletions(-)
--- mm6/mm/memory.c 2009-09-07 13:16:53.000000000 +0100
+++ mm7/mm/memory.c 2009-09-07 13:17:01.000000000 +0100
@@ -107,6 +107,17 @@ static int __init disable_randmaps(char
}
__setup("norandmaps", disable_randmaps);
+static unsigned long zero_pfn __read_mostly;
+
+/*
+ * CONFIG_MMU architectures set up ZERO_PAGE in their paging_init()
+ */
+static int __init init_zero_pfn(void)
+{
+ zero_pfn = page_to_pfn(ZERO_PAGE(0));
+ return 0;
+}
+core_initcall(init_zero_pfn);
/*
* If a p?d_bad entry is found while walking page tables, report
@@ -499,7 +510,9 @@ struct page *vm_normal_page(struct vm_ar
if (HAVE_PTE_SPECIAL) {
if (likely(!pte_special(pte)))
goto check_pfn;
- if (!(vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP)))
+ if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
+ return NULL;
+ if (pfn != zero_pfn)
print_bad_pte(vma, addr, pte, NULL);
return NULL;
}
@@ -1144,9 +1157,14 @@ struct page *follow_page(struct vm_area_
goto no_page;
if ((flags & FOLL_WRITE) && !pte_write(pte))
goto unlock;
+
page = vm_normal_page(vma, address, pte);
- if (unlikely(!page))
- goto bad_page;
+ if (unlikely(!page)) {
+ if ((flags & FOLL_DUMP) ||
+ pte_pfn(pte) != zero_pfn)
+ goto bad_page;
+ page = pte_page(pte);
+ }
if (flags & FOLL_GET)
get_page(page);
@@ -2085,10 +2103,19 @@ gotten:
if (unlikely(anon_vma_prepare(vma)))
goto oom;
- VM_BUG_ON(old_page == ZERO_PAGE(0));
- new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address);
- if (!new_page)
- goto oom;
+
+ if (pte_pfn(orig_pte) == zero_pfn) {
+ new_page = alloc_zeroed_user_highpage_movable(vma, address);
+ if (!new_page)
+ goto oom;
+ } else {
+ new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address);
+ if (!new_page)
+ goto oom;
+ cow_user_page(new_page, old_page, address, vma);
+ }
+ __SetPageUptodate(new_page);
+
/*
* Don't let another task, with possibly unlocked vma,
* keep the mlocked page.
@@ -2098,8 +2125,6 @@ gotten:
clear_page_mlock(old_page);
unlock_page(old_page);
}
- cow_user_page(new_page, old_page, address, vma);
- __SetPageUptodate(new_page);
if (mem_cgroup_newpage_charge(new_page, mm, GFP_KERNEL))
goto oom_free_new;
@@ -2594,6 +2619,15 @@ static int do_anonymous_page(struct mm_s
spinlock_t *ptl;
pte_t entry;
+ if (HAVE_PTE_SPECIAL && !(flags & FAULT_FLAG_WRITE)) {
+ entry = pte_mkspecial(pfn_pte(zero_pfn, vma->vm_page_prot));
+ ptl = pte_lockptr(mm, pmd);
+ spin_lock(ptl);
+ if (!pte_none(*page_table))
+ goto unlock;
+ goto setpte;
+ }
+
/* Allocate our own private page. */
pte_unmap(page_table);
@@ -2617,6 +2651,7 @@ static int do_anonymous_page(struct mm_s
inc_mm_counter(mm, anon_rss);
page_add_new_anon_rmap(page, vma, address);
+setpte:
set_pte_at(mm, address, page_table, entry);
/* No need to invalidate - it was non-present before */
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2009-09-07 21:40 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-09-07 21:26 [PATCH 0/8] mm: around get_user_pages flags Hugh Dickins
2009-09-07 21:29 ` [PATCH 1/8] mm: munlock use follow_page Hugh Dickins
2009-09-08 2:58 ` KAMEZAWA Hiroyuki
2009-09-08 11:30 ` Hugh Dickins
2009-09-08 17:10 ` Rik van Riel
2009-09-09 15:59 ` Minchan Kim
2009-09-11 11:07 ` Hiroaki Wakabayashi
2009-09-07 21:31 ` [PATCH 2/8] mm: remove unused GUP flags Hugh Dickins
2009-09-08 17:27 ` Rik van Riel
2009-09-07 21:33 ` [PATCH 3/8] mm: add get_dump_page Hugh Dickins
2009-09-08 18:57 ` Rik van Riel
2009-09-07 21:35 ` [PATCH 4/8] mm: FOLL_DUMP replace FOLL_ANON Hugh Dickins
2009-09-08 18:59 ` Rik van Riel
2009-09-09 11:14 ` Mel Gorman
2009-09-09 16:16 ` Minchan Kim
2009-09-13 15:46 ` Hugh Dickins
2009-09-13 23:05 ` Minchan Kim
2009-09-07 21:37 ` [PATCH 5/8] mm: follow_hugetlb_page flags Hugh Dickins
2009-09-08 22:21 ` Rik van Riel
2009-09-09 11:31 ` Mel Gorman
2009-09-13 15:35 ` Hugh Dickins
2009-09-14 13:27 ` Mel Gorman
2009-09-15 20:26 ` Hugh Dickins
2009-09-07 21:38 ` [PATCH 6/8] mm: fix anonymous dirtying Hugh Dickins
2009-09-08 22:23 ` Rik van Riel
2009-09-07 21:39 ` Hugh Dickins [this message]
2009-09-08 2:37 ` [PATCH 7/8] mm: reinstate ZERO_PAGE KAMEZAWA Hiroyuki
2009-09-08 11:56 ` Hugh Dickins
2009-09-09 1:44 ` KAMEZAWA Hiroyuki
2009-09-15 20:15 ` Hugh Dickins
2009-09-08 7:31 ` Nick Piggin
2009-09-08 12:17 ` Hugh Dickins
2009-09-08 15:34 ` Nick Piggin
2009-09-08 16:40 ` Hugh Dickins
2009-09-08 14:13 ` Linus Torvalds
2009-09-08 23:35 ` Rik van Riel
2009-09-07 21:40 ` [PATCH 8/8] mm: FOLL flags for GUP flags Hugh Dickins
2009-09-07 23:51 ` [PATCH 0/8] mm: around get_user_pages flags Linus Torvalds
2009-09-07 23:52 ` KAMEZAWA Hiroyuki
2009-09-08 0:00 ` KOSAKI Motohiro
2009-09-10 0:33 ` KOSAKI Motohiro
2009-09-15 20:16 ` Hugh Dickins
2009-09-15 20:30 ` [PATCH 0/4] mm: mlock, hugetlb, zero followups Hugh Dickins
2009-09-15 20:31 ` [PATCH 1/4] mm: m(un)lock avoid ZERO_PAGE Hugh Dickins
2009-09-16 0:08 ` KOSAKI Motohiro
2009-09-16 9:35 ` Mel Gorman
2009-09-16 11:40 ` Hugh Dickins
2009-09-16 12:47 ` Mel Gorman
2009-09-15 20:33 ` [PATCH 2/4] mm: hugetlbfs_pagecache_present Hugh Dickins
2009-09-15 20:37 ` [PATCH 3/4] mm: ZERO_PAGE without PTE_SPECIAL Hugh Dickins
2009-09-16 6:20 ` KAMEZAWA Hiroyuki
2009-09-15 20:38 ` [PATCH 4/4] mm: move highest_memmap_pfn Hugh Dickins
2009-09-17 0:33 ` [PATCH 0/4] mm: mlock, hugetlb, zero followups KOSAKI Motohiro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Pine.LNX.4.64.0909072238320.15430@sister.anvils \
--to=hugh.dickins@tiscali.co.uk \
--cc=akpm@linux-foundation.org \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=kosaki.motohiro@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=npiggin@suse.de \
--cc=riel@redhat.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox