From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>,
"ksummit-discuss@lists.linuxfoundation.org"
<ksummit-discuss@lists.linuxfoundation.org>
Subject: Re: [Ksummit-discuss] [CORE TOPIC] Device error handling / reporting / isolation
Date: Wed, 14 May 2014 11:24:54 +1000
Message-ID: <1400030694.17624.206.camel@pasglop>
In-Reply-To: <1399552623.17118.22.camel@i7.infradead.org>
On Thu, 2014-05-08 at 13:37 +0100, David Woodhouse wrote:
> I'd like to have a discussion about handling device errors.
>
> IOMMUs are becoming more common, and we've seen some failure modes where
> we just end up with an endless stream of fault reports from a given
> device, and the kernel can do nothing else.
.../...
I'm definitely interested in this, and would also nominate Gavin Shan
from IBM, who is our EEH expert for the kernel.
To cut a long story short, we have an extensive set of HW facilities
in our PCI host bridges to detect errors and freeze all operations
in and out of devices upon detection of errors, in order to prevent
propagation of bad data.
In addition, we have a recovery process involving the few drivers
that support the corresponding hooks. We could describe the process,
though it can be fairly convoluted.
For devices whose drivers don't have the hooks, we fall back to
simulating an unplug of the device (unbinding the driver), followed by
a reset and a re-bind.
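As a rough illustration of the two paths above (hook-based recovery
versus the unbind/reset/re-bind fallback), here is a minimal user-space
sketch. It is not the EEH code and not the kernel API: every type and
function name in it (device_model, err_hooks, handle_error, the demo_*
hooks) is hypothetical and invented for this example. The hook names
only loosely echo the kernel's struct pci_error_handlers, and the real
process has more steps than shown here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model for illustration only. */
enum recovery_result { RECOVERED, RECOVERED_BY_REBIND, FAILED };

struct device_model;

/* Hypothetical driver hooks; a driver without recovery support has none. */
struct err_hooks {
    void (*error_detected)(struct device_model *dev); /* stop using the device */
    bool (*slot_reset)(struct device_model *dev);     /* re-init after the reset */
    void (*resume)(struct device_model *dev);         /* restart normal operation */
};

struct device_model {
    const struct err_hooks *hooks; /* NULL when the driver lacks the hooks */
    bool frozen;                   /* traffic in and out is blocked */
    bool bound;                    /* a driver is attached */
    int resets;                    /* number of resets performed */
};

/* Reset the device and lift the freeze. */
static void reset_device(struct device_model *d)
{
    d->resets++;
    d->frozen = false;
}

enum recovery_result handle_error(struct device_model *d)
{
    /* Freeze first, so that bad data cannot propagate. */
    d->frozen = true;

    if (d->hooks && d->hooks->error_detected && d->hooks->slot_reset) {
        /* Driver-assisted path: notify, reset, let the driver re-init. */
        d->hooks->error_detected(d);
        reset_device(d);
        if (!d->hooks->slot_reset(d))
            return FAILED;
        if (d->hooks->resume)
            d->hooks->resume(d);
        return RECOVERED;
    }

    /* Fallback path: simulate an unplug, reset, then re-bind the driver. */
    d->bound = false;
    reset_device(d);
    d->bound = true;
    return RECOVERED_BY_REBIND;
}

/* Trivial example hooks for a recovery-aware driver. */
static void demo_error_detected(struct device_model *d) { (void)d; }
static bool demo_slot_reset(struct device_model *d) { return !d->frozen; }
static void demo_resume(struct device_model *d) { (void)d; }

static const struct err_hooks demo_hooks = {
    demo_error_detected, demo_slot_reset, demo_resume
};
```

A device with hooks set to NULL takes the fallback path and comes back
re-bound after one reset; a device pointing at demo_hooks keeps its
driver bound throughout.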
Cheers,
Ben.