From: Andi Kleen <andi@firstfloor.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Subject: Re: [PATCH] [0/16] POISON: Intro
Date: Wed, 8 Apr 2009 08:21:07 +0200
Message-ID: <20090408062107.GE17934@one.firstfloor.org>
In-Reply-To: <20090407224709.742376ff.akpm@linux-foundation.org>
On Tue, Apr 07, 2009 at 10:47:09PM -0700, Andrew Morton wrote:
> On Tue, 7 Apr 2009 17:09:56 +0200 (CEST) Andi Kleen <andi@firstfloor.org> wrote:
>
> > Upcoming Intel CPUs have support for recovering from some memory errors. This
> > requires the OS to declare a page "poisoned", kill the processes associated
> > with it and avoid using it in the future. This patchkit implements
> > the necessary infrastructure in the VM.
>
> Seems that this feature is crying out for a testing framework (perhaps
> it already has one?).
Several, in fact.
One of them is
git://git.kernel.org/pub/scm/utils/cpu/mce/mce-test.git
(test suite covering various cases)
together with
git://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git
(injector using the x86-specific error injection hooks I posted
earlier)
Then I have some tests using the madvise MADV_POISON hook
(which exercise the various cases from a process's point of view
and recover). They are still a little hackish, but if there's
interest I can put them out. At least one test case is known
to hang (nonlinear mappings); I'm still looking at that.
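For illustration only, here is a minimal sketch of what such a
process-level test could look like. It is not taken from the actual
test code. It assumes that MADV_POISON has the value 100 (the fallback
define below is a guess for when the headers don't provide it), that
the call needs a kernel with the patchkit applied and probably root,
and that touching a poisoned page raises SIGBUS. The helper names are
made up.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* ASSUMPTION: advice value used by the injector patch; not verified here. */
#ifndef MADV_POISON
#define MADV_POISON 100
#endif

static sigjmp_buf probe_env;

static void on_sigbus(int sig)
{
	(void)sig;
	siglongjmp(probe_env, 1);	/* unwind out of the faulting read */
}

/*
 * Read one byte at p with a temporary SIGBUS handler installed.
 * Returns 1 if the read raised SIGBUS, 0 if it succeeded,
 * -1 if the handler could not be installed.
 */
int read_raises_sigbus(const volatile char *p)
{
	struct sigaction sa, old;
	int hit;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigbus;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGBUS, &sa, &old) < 0)
		return -1;

	if (sigsetjmp(probe_env, 1) == 0) {
		(void)*p;	/* volatile read, may fault */
		hit = 0;
	} else {
		hit = 1;	/* came back through on_sigbus() */
	}

	sigaction(SIGBUS, &old, NULL);	/* restore previous handler */
	return hit;
}

/*
 * Map one anonymous page, fault it in, ask the kernel to poison it
 * and then touch it again.
 * Returns 1 if the access raised SIGBUS, 0 if it did not,
 * -1 if setup or madvise() failed (no support, no privileges, ...).
 */
int poison_anon_page_and_probe(void)
{
	long psz = sysconf(_SC_PAGESIZE);
	char *p;
	int ret;

	if (psz <= 0)
		return -1;
	p = mmap(NULL, (size_t)psz, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	p[0] = 1;	/* make sure a real page backs the mapping */

	if (madvise(p, (size_t)psz, MADV_POISON) < 0)
		ret = -1;
	else
		ret = read_raises_sigbus(p);

	munmap(p, (size_t)psz);
	return ret;
}
```

A caller would treat a return of 1 from poison_anon_page_and_probe()
as "poisoned and recovered" and -1 as "cannot test on this system".
Other cases (file-backed, shared, nonlinear mappings) would need their
own setup in place of the anonymous mmap().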
The long-term plan was to put both mce-test above and the
MADV_POISON tests into LTP.
Plus a few random hacks. Coverage is still not 100%, though.
> A simplistic approach would be
Random kills anywhere are hard to test because your system will
die regularly and randomly. mce-test.git does some automated
testing of fatal errors by catching them using kexec, but we haven't
tried that for full recovery.
>
> echo some-pfn > /proc/bad-pfn-goes-here
>
> A slightly more sophisticated version might do the deed from within a
> timer interrupt, just to get a bit more coverage.
mce-test/inject does it from other CPUs with smp_call_function_single,
so it's really relatively random. I've considered using NMIs too,
but at least the high-level recovery code synchronizes to
work queue context first anyway, so it doesn't buy us much for that.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.