linux-mm.kvack.org archive mirror
From: Hugh Dickins <hugh@veritas.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Manfred Spraul <manfred@colorfullife.com>,
	Nick Piggin <nickpiggin@yahoo.com.au>,
	Dave Jones <davej@redhat.com>,
	Arjan van de Ven <arjan@infradead.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 4/8] badpage: vm_normal_page use print_bad_pte
Date: Mon, 1 Dec 2008 00:43:37 +0000 (GMT)
Message-ID: <Pine.LNX.4.64.0812010042430.11401@blonde.site>
In-Reply-To: <Pine.LNX.4.64.0812010032210.10131@blonde.site>

print_bad_pte() is so far being called only when zap_pte_range() finds
negative page_mapcount, or there's a fault on a pte_file where it does
not belong.  That's weak coverage when we suspect pagetable corruption.

Originally, it was called when vm_normal_page() found an invalid pfn:
but pfn_valid is expensive on some architectures and configurations, so
2.6.24 put that under CONFIG_DEBUG_VM (which doesn't help in the field),
then 2.6.26 replaced it by a VM_BUG_ON (likewise).

Reinstate the print_bad_pte() in vm_normal_page(), but use a cheaper
test than pfn_valid(): have memmap_init_zone() (used in bootup and
hotplug) keep a __read_mostly note of the highest_memmap_pfn, and have
vm_normal_page() check pfn against that.  We could call this
pfn_plausible() or pfn_sane(), but I doubt we'll need it elsewhere: of
course it's not reliable, but it gives much stronger pagetable
validation on many boxes.
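
For illustration only, here is a minimal userspace sketch of the
high-water-mark idea described above.  It is not kernel code: the helper
name note_memmap_range() is hypothetical, pfn_plausible() borrows the
name floated above for a helper the patch does not actually add, and the
fprintf() merely stands in for print_bad_pte().

```c
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-in; in the patch this is a __read_mostly kernel global. */
static unsigned long highest_memmap_pfn;

/*
 * Hypothetical helper modelling the hunk added to memmap_init_zone():
 * remember the highest pfn covered by any range [start_pfn, end_pfn)
 * initialised so far.  The mark only ever moves upwards.
 */
static void note_memmap_range(unsigned long start_pfn, unsigned long end_pfn)
{
	(void)start_pfn;	/* only the end of the range matters here */
	if (highest_memmap_pfn < end_pfn - 1)
		highest_memmap_pfn = end_pfn - 1;
}

/*
 * Models the check added to vm_normal_page(): a pfn above the mark
 * cannot have a struct page, so the pte would be reported as bad.
 * A pfn at or below the mark may still fall in a hole between ranges,
 * which is why the result is only "plausible", not "valid".
 */
static bool pfn_plausible(unsigned long pfn)
{
	if (pfn > highest_memmap_pfn) {
		fprintf(stderr, "bad pte: pfn %#lx > highest_memmap_pfn %#lx\n",
			pfn, highest_memmap_pfn);
		return false;
	}
	return true;
}
```

After noting the ranges [0, 0x1000) and [0x100000, 0x140000), a pfn of
0x140000 is rejected, while 0x50000 (in the hole between the two ranges)
still passes: the test is a single compare against one variable, cheaper
than pfn_valid() but weaker.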

Also use print_bad_pte() when the pte_special bit is found outside a
VM_PFNMAP or VM_MIXEDMAP area, instead of VM_BUG_ON.
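
As a rough userspace model of that change (report and carry on, rather
than assert), consider the sketch below.  The struct layouts, the flag
values, and the names report_bad_pte() and special_pte_check() are all
assumptions made for illustration; only the shape of the test mirrors
the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative flag values only; not the kernel's definitions. */
#define VM_PFNMAP	0x1UL
#define VM_MIXEDMAP	0x2UL

/* Hypothetical, heavily simplified stand-ins for pte_t and the vma. */
struct model_pte { unsigned long pfn; bool special; };
struct model_vma { unsigned long vm_flags; };

/* Counts reports so that a caller can observe them. */
static unsigned int bad_pte_reports;

/* Stand-in for print_bad_pte(): log the problem, do not abort. */
static void report_bad_pte(unsigned long addr, const struct model_pte *pte)
{
	bad_pte_reports++;
	fprintf(stderr, "bad pte at %#lx: pfn %#lx special %d\n",
		addr, pte->pfn, (int)pte->special);
}

/*
 * Returns true when the pte is not special, so the caller should go on
 * to the pfn check.  Returns false for a special pte (the patch returns
 * NULL there), reporting it first if the vma is neither VM_PFNMAP nor
 * VM_MIXEDMAP.
 */
static bool special_pte_check(const struct model_vma *vma, unsigned long addr,
			      const struct model_pte *pte)
{
	if (!pte->special)
		return true;
	if (!(vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP)))
		report_bad_pte(addr, pte);
	return false;
}
```

A special pte is treated as having no struct page in both cases; the
only difference from the VM_BUG_ON version is that an unexpected one
leaves a report instead of stopping the machine, and does so without
needing CONFIG_DEBUG_VM.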

Signed-off-by: Hugh Dickins <hugh@veritas.com>
---

 mm/internal.h   |    1 +
 mm/memory.c     |   20 ++++++++++----------
 mm/page_alloc.c |    4 ++++
 3 files changed, 15 insertions(+), 10 deletions(-)

--- badpage3/mm/internal.h	2008-11-10 11:27:02.000000000 +0000
+++ badpage4/mm/internal.h	2008-11-28 20:40:42.000000000 +0000
@@ -49,6 +49,7 @@ extern void putback_lru_page(struct page
 /*
  * in mm/page_alloc.c
  */
+extern unsigned long highest_memmap_pfn;
 extern void __free_pages_bootmem(struct page *page, unsigned int order);
 
 /*
--- badpage3/mm/memory.c	2008-11-28 20:40:40.000000000 +0000
+++ badpage4/mm/memory.c	2008-11-28 20:40:42.000000000 +0000
@@ -467,21 +467,18 @@ static inline int is_cow_mapping(unsigne
 struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 				pte_t pte)
 {
-	unsigned long pfn;
+	unsigned long pfn = pte_pfn(pte);
 
 	if (HAVE_PTE_SPECIAL) {
-		if (likely(!pte_special(pte))) {
-			VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
-			return pte_page(pte);
-		}
-		VM_BUG_ON(!(vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP)));
+		if (likely(!pte_special(pte)))
+			goto check_pfn;
+		if (!(vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP)))
+			print_bad_pte(vma, addr, pte, NULL);
 		return NULL;
 	}
 
 	/* !HAVE_PTE_SPECIAL case follows: */
 
-	pfn = pte_pfn(pte);
-
 	if (unlikely(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
 		if (vma->vm_flags & VM_MIXEDMAP) {
 			if (!pfn_valid(pfn))
@@ -497,11 +494,14 @@ struct page *vm_normal_page(struct vm_ar
 		}
 	}
 
-	VM_BUG_ON(!pfn_valid(pfn));
+check_pfn:
+	if (unlikely(pfn > highest_memmap_pfn)) {
+		print_bad_pte(vma, addr, pte, NULL);
+		return NULL;
+	}
 
 	/*
 	 * NOTE! We still have PageReserved() pages in the page tables.
-	 *
 	 * eg. VDSO mappings can cause them to exist.
 	 */
 out:
--- badpage3/mm/page_alloc.c	2008-11-28 20:40:40.000000000 +0000
+++ badpage4/mm/page_alloc.c	2008-11-28 20:40:42.000000000 +0000
@@ -69,6 +69,7 @@ EXPORT_SYMBOL(node_states);
 
 unsigned long totalram_pages __read_mostly;
 unsigned long totalreserve_pages __read_mostly;
+unsigned long highest_memmap_pfn __read_mostly;
 int percpu_pagelist_fraction;
 
 #ifdef CONFIG_HUGETLB_PAGE_SIZE_VARIABLE
@@ -2597,6 +2598,9 @@ void __meminit memmap_init_zone(unsigned
 	unsigned long pfn;
 	struct zone *z;
 
+	if (highest_memmap_pfn < end_pfn - 1)
+		highest_memmap_pfn = end_pfn - 1;
+
 	z = &NODE_DATA(nid)->node_zones[zone];
 	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
 		/*

