On Thu, 2014-05-08 at 12:56 -0700, James Bottomley wrote:
> On Thu, 2014-05-08 at 13:37 +0100, David Woodhouse wrote:
> > I'd like to have a discussion about handling device errors.
> > 
> > IOMMUs are becoming more common, and we've seen some failure modes where
> > we just end up with an endless stream of fault reports from a given
> > device, and the kernel can do nothing else.
> 
> This is when the addresses being sent by the bus don't have IOTLB
> entries?

You speak as if you have a software-filled IOTLB. I'd have phrased that
as "don't have page table entries". But yes, that.

Or they have read-only page table entries, and the device is trying to write.

And as I said, once we start looking at it I suspect we'll end up
finding other offences that need to be taken into consideration. Which
is why I think this warrants a wider discussion rather than the IOMMU
owners sitting in a darkened room doing it amongst themselves.

> > But I absolutely don't want us to be implementing policies like that in
> > an individual IOMMU driver; this needs to be handled by generic device
> > code. Once upon a time I might have said PCI code, but this is actually
> > relevant for non-PCI devices too.
> 
> Right, with my PARISC hat on, our IOMMUs sit adjacent to the CPUs.  The
> PCI busses (if we have any) are a couple of layers down.

Even the Intel IOMMU can do mappings (and take faults) for ACPI devices,
these days.

> > I want the IOMMU to report errors, and let the system do the appropriate
> > thing. Which requires some discussion about what the "appropriate thing"
> > can be in various circumstances, and indeed what options are available
> > to us on various platforms.
> > 
> > Participants would be those working with IOMMUs on various platforms,
> > including Jörg Rödel, myself, and hopefully someone with a fairly
> > intimate knowledge of EEH as used on POWER systems.

I note that Jörg isn't actually on the nominations list. I think he
should be...

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation