From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Roland Dreier <roland@kernel.org>
Cc: "ksummit-discuss@lists.linuxfoundation.org"
<ksummit-discuss@lists.linuxfoundation.org>
Subject: Re: [Ksummit-discuss] [CORE TOPIC] Device error handling / reporting / isolation
Date: Wed, 14 May 2014 11:40:46 +1000
Message-ID: <1400031646.17624.215.camel@pasglop>
In-Reply-To: <CAG4TOxNJxWLWSZYW313XATtCpV+SM9WrYFJ-V+0Pf-yryLh67g@mail.gmail.com>
On Fri, 2014-05-09 at 10:48 -0700, Roland Dreier wrote:
>
> I think there's a more general problem that's worth talking about
> here. In addition to IOMMU faults, there are lots of other PCI errors
> that can happen, and we have some small number of drivers that have
> been "hardened" to try and recover from these errors. However even
> for these "hardened" drivers it seems pretty easy to hit deadlocks
> when the driver tries to tear down and reinitialize things.
Right. We hit that every time we test EEH on a new round of machines /
FW / distros on POWER. The error paths in the drivers are very badly
tested.
For example, when our HW "isolates" a device, all MMIO reads start
returning ff's (all ones). Plenty of drivers will have either infinite or
very-long-timeout loops waiting for a bit to clear...
Also, when our HW decides to fence the entire PCI Express controller
(which can happen for example if it took a parity error in an internal
cache), subsequent MMIOs return ff's but also take a long time (hundreds
of microseconds or more).
We had issues where drivers implement timeouts like this:
	for (i = 0; i < 10000; i++) {
		foo = readl(bar);
		if ((foo & my_bit) == 0)
			break;
		udelay(1);
	}
and expect this to be a 10ms timeout. In fenced situations, where each
readl() can itself take hundreds of microseconds, the reads dominate the
loop and it ends up being a 100ms or 1s timeout (we've seen much longer
ones).
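As a minimal sketch of the effect described above, the following userspace C model compares the counted loop with a loop bounded by a deadline that also bails out on an all-ones read. Everything here is hypothetical and for illustration only: struct fake_dev, fake_readl(), fake_udelay() and the simulated clock stand in for a real device and real kernel accessors, and treating 0xffffffff as "device gone" assumes the register can never legitimately read all ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one device register; not a kernel API. */
struct fake_dev {
	uint32_t value;        /* value a healthy device would return */
	bool     isolated;     /* when true, reads return all ones */
	uint64_t read_cost_us; /* simulated latency of one MMIO read */
	uint64_t now_us;       /* simulated clock, in microseconds */
};

/* Simulated readl(): advances the clock by the read latency. */
static uint32_t fake_readl(struct fake_dev *d)
{
	d->now_us += d->read_cost_us;
	return d->isolated ? 0xffffffffu : d->value;
}

/* Simulated udelay(): only advances the clock. */
static void fake_udelay(struct fake_dev *d, uint64_t us)
{
	d->now_us += us;
}

enum poll_result { POLL_OK = 0, POLL_TIMEOUT = -1, POLL_DEAD = -2 };

/*
 * The counted loop from the message. Returns the simulated time spent.
 * With a 100us read cost and a bit that never clears this is
 * 10000 * (100 + 1) us, about 1.01 s rather than the intended 10 ms.
 */
static uint64_t counted_poll(struct fake_dev *d, uint32_t bit)
{
	uint64_t start = d->now_us;
	int i;

	for (i = 0; i < 10000; i++) {
		if ((fake_readl(d) & bit) == 0)
			break;
		fake_udelay(d, 1);
	}
	return d->now_us - start;
}

/*
 * Deadline-based variant: the bound is elapsed time, not iteration
 * count, and an all-ones read is reported as a dead device at once.
 */
static enum poll_result deadline_poll(struct fake_dev *d, uint32_t bit,
				      uint64_t timeout_us)
{
	uint64_t deadline;

	assert(timeout_us > 0);
	deadline = d->now_us + timeout_us;

	for (;;) {
		uint32_t v = fake_readl(d);

		if (v == 0xffffffffu)
			return POLL_DEAD;
		if ((v & bit) == 0)
			return POLL_OK;
		if (d->now_us >= deadline)
			return POLL_TIMEOUT;
		fake_udelay(d, 1);
	}
}
```

With slow reads the deadline variant can overshoot its timeout by at most one read plus one delay, instead of scaling with the read latency.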
One way to help find/fix these would be a better error injection
capability to "isolate" devices, for example by remapping their
MMIOs to something that returns ff's :-)
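The remapping idea can be illustrated in userspace with a register window whose base pointer is redirected to a buffer filled with 0xff. This is only a sketch under stated assumptions: struct mmio_window and the window_*() helpers are hypothetical names, not an existing kernel or EEH interface, and dropping writes while isolated is an assumption about how an isolated device behaves.

```c
#include <stdint.h>
#include <string.h>

#define WINDOW_WORDS 16

/* Hypothetical register window; accesses go through 'base'. */
struct mmio_window {
	volatile uint32_t *base;        /* where reads currently land */
	uint32_t real[WINDOW_WORDS];    /* stands in for device registers */
	uint32_t poison[WINDOW_WORDS];  /* all ones, used while isolated */
};

/* Point the window at the "real" registers and prepare the poison page. */
static void window_init(struct mmio_window *w)
{
	memset(w->real, 0, sizeof(w->real));
	memset(w->poison, 0xff, sizeof(w->poison));
	w->base = w->real;
}

/* Inject the error: every later read returns 0xffffffff. */
static void window_isolate(struct mmio_window *w)
{
	w->base = w->poison;
}

/* Undo the injection. */
static void window_restore(struct mmio_window *w)
{
	w->base = w->real;
}

static uint32_t window_readl(const struct mmio_window *w, unsigned int idx)
{
	return w->base[idx % WINDOW_WORDS];
}

/* Assumption: writes to an isolated device are silently dropped. */
static void window_writel(struct mmio_window *w, unsigned int idx, uint32_t v)
{
	if (w->base == w->real)
		w->real[idx % WINDOW_WORDS] = v;
}
```

A driver test built on this kind of switch can flip the window to the poison buffer in the middle of an operation and check that every polling loop still terminates.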
> So I wonder if we can do better without proliferating error handling
> tentacles into all sorts of low-level drivers ("did we just read
> 0xffffffff here? how about here? are we in the middle of error
> recovery? how about now?").
We can't, because ultimately that is what the HW will return when it's
broken, disconnected, has lost its link, or has been isolated by EEH.
> One context where this is becoming a real concern is with NVMe drives.
> These are SSDs that (may) look like normal 2.5" drives, but use PCIe
> rather than SATA or SAS to connect to the host. Since they look like
> normal drives, it's natural to put them into hot-pluggable JBODs, but
> it turns out we react much worse to PCIe surprise removal than, say,
> SAS hotplug.
Cheers,
Ben.