From: Daniel Vetter <daniel.vetter@ffwll.ch>
To: Andy Lutomirski <luto@amacapital.net>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>,
	"ksummit-discuss@lists.linuxfoundation.org"
	<ksummit-discuss@lists.linuxfoundation.org>
Subject: Re: [Ksummit-discuss] [CORE TOPIC] Device error handling / reporting / isolation
Date: Mon, 12 May 2014 19:04:45 +0200
Message-ID: <CAKMK7uHgQKgHoeGDAkTBVwzSA7X+XrCRO9zXYwKsGUCJZOk-Dw@mail.gmail.com>
In-Reply-To: <CALCETrXo7Zqg9EadLHTniLhAB9f13C9wzFWZAqTSG=4z0ocgQg@mail.gmail.com>

On Mon, May 12, 2014 at 6:16 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Mon, May 12, 2014 at 8:35 AM, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>> On Mon, May 12, 2014 at 5:07 PM, Joerg Roedel <joro@8bytes.org> wrote:
>>> On Mon, May 12, 2014 at 12:43:09AM +0200, Daniel Vetter wrote:
>>>> So I think having some iommu storm handling (like we have for
>>>> interrupts in general and a lot of other things) would go a long way
>>>> towards the goal of enabling iommus everywhere.
>>>
>>> Right, the developer use-case also needs to be taken into account. We could
>>> easily ignore a device after it did something wrong to get rid of
>>> io-page-fault or interrupt storms. But we also need a way to tell the
>>> kernel to unignore the device later :)
>>
>> A disable/enable cycle of the pci bus master setting should be a good
>> enough signal? Presuming you can say for sure which device is doing
>> the offending dma transactions ofc ... Or maybe we should just be
>> optimists and re-enable the IOMMU if _any_ child device gets
>> re-enabled (or bus master re-enabled for pci) in the hopes that the
>> developers just reloaded the driver. Worst case the storm handling
>> will kick in again shortly.
>
> Just to check: are you talking about disabling the IOMMU if there's a
> fault storm or disabling reporting of IOMMU faults?

Re-enabling the IOMMU after it was completely shut off to isolate a
fault storm from a rogue device. If I, as a developer, still have to
reboot after wreaking havoc in my driver, that's only marginally better
than a box that went down in an IOMMU page fault storm. But if I can
just reload the driver (with the bug fixed) and get back a working
device because the IOMMU was re-enabled, then that would help. Not
sure yet how feasible this really is.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
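
The storm handling discussed in this message can be illustrated with a
small userspace toy model. This is only a sketch under stated assumptions,
not kernel code and not anything proposed in the thread: the class name
StormGuard, its methods, and the threshold and window values are all
hypothetical. It counts faults per device in a fixed time window, isolates
the device once the count exceeds a threshold, and lifts the isolation when
bus mastering is re-enabled (for example after a driver reload). If the
device is still misbehaving, the storm handling simply kicks in again.

```python
from dataclasses import dataclass


@dataclass
class StormGuard:
    """Toy per-device fault-storm tracker (hypothetical, for illustration)."""

    threshold: int = 100       # faults tolerated per window (assumed value)
    window: float = 1.0        # window length in seconds (assumed value)
    window_start: float = 0.0  # start time of the current counting window
    count: int = 0             # faults seen in the current window
    isolated: bool = False     # True once the device has been cut off

    def report_fault(self, now: float) -> bool:
        """Record one fault at time `now`; return True if the device is isolated."""
        if self.isolated:
            # Already cut off: nothing more to count or report.
            return True
        if now - self.window_start >= self.window:
            # The previous window has expired, so start a fresh one.
            self.window_start = now
            self.count = 0
        self.count += 1
        if self.count > self.threshold:
            self.isolated = True
        return self.isolated

    def bus_master_enabled(self) -> None:
        """Model a bus-master re-enable: optimistically lift the isolation."""
        self.isolated = False
        self.count = 0


guard = StormGuard(threshold=3, window=1.0)
for t in (0.0, 0.1, 0.2, 0.3):
    guard.report_fault(t)
print(guard.isolated)       # fourth fault within one window exceeds the threshold

guard.bus_master_enabled()  # driver reloaded, bus mastering switched back on
print(guard.isolated)
```

Whether a real IOMMU driver can attribute faults to a single device and
restore translation this cheaply is exactly the open feasibility question
raised above.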

