linux-mm.kvack.org archive mirror
* IOMMU setup vs DAC (PCI)
@ 2001-02-09 19:39 Grant Grundler
  2001-02-09 19:42 ` David S. Miller
  2001-02-09 19:47 ` Kanoj Sarcar
  0 siblings, 2 replies; 8+ messages in thread
From: Grant Grundler @ 2001-02-09 19:39 UTC (permalink / raw)
  To: davem; +Cc: linux-mm

Dave (Miller),

Matthew Wilcox and I had the following conversation:

<ggg> willy: how do systems which support dual address cycle PCI (ie 64-bit
  addressing) access hi-mem (>4GB)?  bounce-buffers?
* ggg is wondering if any recent changes define a 64-bit type for dma_addr_t
<willy> ggg: the IOMMU is used to map a 32-bit PCI address to a 64-bit
  address-bus address
<ggg> willy: what if I design a board that doesn't *have* an IOMMU?
<willy> except on x86 where bounce buffers get used.
<ggg> IOMMU has a performance cost.
<willy> so does DAC.
<ggg> DAC is cheap compared to IOMMU overhead.
<willy> i'll have to take your word for that.
<ggg> DAC doesn't cost the CPU anything and IOMMU mgt does.
<willy> but how much mgt needs to be done?  if you're doing a 4k read from
  disc, it's surely cheaper?
<ggg> IOMMU also has to R/W TLB and get flushed in certain circumstances - ie
  extra PIO to the IOMMU
<ggg> willy: no way.
<ggg> willy: setup time on IOMMU kills you.
<ggg> try bigger reads and we can argue. I don't know where the tradeoff is for
  parisc IOMMU's.
<willy> i'm the wrong person to be arguing with.  davem/rth/linus/sct/mingo
  are the people.
<ggg> willy: ok.
* ggg sends mail to davem
<willy> linux-mm might be the right list to argue this on.
<ggg> ok. what's the full email addr?
<ggg> vger.kernel.org?
<willy> @kvack.org
<ggg> ok tnx.

My original quest was for an architecturally neutral way to pass
64-bit physical memory addresses back to a 64-bit capable card.

pci_dma_supported() interface provides the right hook for the
driver to advertise device capabilities. dma_addr_t is defined
in most arches (read x86) to be 32-bit. But IA64 (u64) and mips*
(unsigned long) have broken ground here already. I'll explore
further to see if parisc*-linux can in fact use "unsigned long".

But I'm still interested in any comments or insights.
(ie am I out to lunch? ;^)

thanks,
grant

Grant Grundler
parisc-linux {PCI|IOMMU|SMP} hacker
+1.408.447.7253
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/


* Re: IOMMU setup vs DAC (PCI)
  2001-02-09 19:39 IOMMU setup vs DAC (PCI) Grant Grundler
@ 2001-02-09 19:42 ` David S. Miller
  2001-02-09 20:04   ` Grant Grundler
  2001-02-09 19:47 ` Kanoj Sarcar
  1 sibling, 1 reply; 8+ messages in thread
From: David S. Miller @ 2001-02-09 19:42 UTC (permalink / raw)
  To: Grant Grundler; +Cc: linux-mm

Grant Grundler writes:
 > My original quest was for an architecturally neutral way to pass
 > 64-bit physical memory addresses back to a 64-bit capable card.
 > 
 > pci_dma_supported() interface provides the right hook for the
 > driver to advertise device capabilities. dma_addr_t is defined
 > in most arches (read x86) to be 32-bit. But IA64 (u64) and mips*
 > (unsigned long) have broken ground here already. I'll explore
 > further to see if parisc*-linux can in fact use "unsigned long".
 > 
 > But I'm still interested in any comments or insights.
 > (ie am I out to lunch? ;^)

You are going into uncharted territory.  IA64 supports 64-bit
DMA as a HACK at best (see qla20xx driver) and uses a software
iommu implementation to handle all normal drivers using the
supported 32-bit pci_*() interfaces.

The 64-bit support API will appear in 2.5.x, no sooner.

And all that talk of IOMMU overhead assumes a shit implementation of
TLB flushing.  With a sane setup you only flush once per circle walk
of the page tables, see the tricks in sparc64/kernel/pci_iommu.c to
see what I'm talking about.  I can push 2 gigabytes to a disk over
SCSI and only take 18 PIOs to the IOMMU.  Also, many IOMMU based PCI
implementations do not offer the software managed
prefetching/write-behind facilities when 64-bit DAC is used.

I say stay at 32-bit IOMMU based stuff for now.

Later,
David S. Miller
davem@redhat.com

* Re: IOMMU setup vs DAC (PCI)
  2001-02-09 19:39 IOMMU setup vs DAC (PCI) Grant Grundler
  2001-02-09 19:42 ` David S. Miller
@ 2001-02-09 19:47 ` Kanoj Sarcar
  2001-02-09 19:52   ` David S. Miller
  1 sibling, 1 reply; 8+ messages in thread
From: Kanoj Sarcar @ 2001-02-09 19:47 UTC (permalink / raw)
  To: Grant Grundler; +Cc: davem, linux-mm

> 
> My original quest was for an architecturally neutral way to pass
> 64-bit physical memory addresses back to a 64-bit capable card.
> 
> pci_dma_supported() interface provides the right hook for the
> driver to advertise device capabilities. dma_addr_t is defined
> in most arches (read x86) to be 32-bit. But IA64 (u64) and mips*
> (unsigned long) have broken ground here already. I'll explore
> further to see if parisc*-linux can in fact use "unsigned long".
> 
> But I'm still interested in any comments or insights.
> (ie am I out to lunch? ;^)

dma_addr_t should be unsigned long, which is 64 bits on 64 bit
architectures, so things are fine there.

On regular x86, dma_addr_t is u32, which still works.

The problem is really on x86 PAE. I think Alan also pointed out
that other architectures might have similar issues (ARM?). For
x86-PAE, dma_addr_t should really be u64/unsigned long long. The
only issue is that there are gcc bugs while dealing with 64 bit
quantities on x86, and performance implications.

Additionally, we have also talked in the past of making a typedef
for representing physical addresses. This typedef would be the 
same as the one to represent dma_addr_t.

Kanoj


* Re: IOMMU setup vs DAC (PCI)
  2001-02-09 19:47 ` Kanoj Sarcar
@ 2001-02-09 19:52   ` David S. Miller
  2001-02-09 20:07     ` Kanoj Sarcar
  0 siblings, 1 reply; 8+ messages in thread
From: David S. Miller @ 2001-02-09 19:52 UTC (permalink / raw)
  To: Kanoj Sarcar; +Cc: Grant Grundler, linux-mm

Kanoj Sarcar writes:
 > dma_addr_t should be unsigned long, which is 64 bits on 64 bit
 > architectures, so things are fine there.
 > 
 > On regular x86, dma_addr_t is u32, which still works.

It's 32-bit on sparc64 since 32-bit DMA addresses are all
we need since the IOMMU is used for everything.

In fact, if your architecture is doing nothing other
than PCI, you _OUGHT_ to make it 32-bit even on 64-bit
platforms because the PCI dma interface does not support
64-bit DACs in any way, shape, or form until 2.5.x, and then
a new dma64_addr_t type will be used to denote a DAC
address.

Later,
David S. Miller
davem@redhat.com

* Re: IOMMU setup vs DAC (PCI)
  2001-02-09 19:42 ` David S. Miller
@ 2001-02-09 20:04   ` Grant Grundler
  0 siblings, 0 replies; 8+ messages in thread
From: Grant Grundler @ 2001-02-09 20:04 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-mm

"David S. Miller" wrote:
> You are going into uncharted territory.  IA64 supports 64-bit
> DMA as a HACK at best (see qla20xx driver) and uses a software
> iommu implementation to handle all normal drivers using the
> supported 32-bit pci_*() interfaces.

Ah ok. that's what I was afraid of.

> The 64-bit support API will appear in 2.5.x, no sooner.
> 
> And all that talk of IOMMU overhead assumes a shit implementation of
> TLB flushing.

Yup. That's us. :^(
For "SBA" IOMMU, CPU has to invalidate TLB entries since the board
designers *botched* the implementation. TLB and IOPdir were *supposed*
to be coherent.

Future implementations are promised to work correctly.
I'll believe it when I see it.

> With a sane setup you only flush once per circle walk
> of the page tables, see the tricks in sparc64/kernel/pci_iommu.c to
> see what I'm talking about.  I can push 2 gigabytes to a disk over
> SCSI and only take 18 PIOs to the IOMMU.

I've been able to reduce PIO *read* overhead a bunch.
Normally every PIO write to TLB invalidate has to be followed by a read
to guarantee the IOMMU sees the write.  I bundle several (32 or more)
PIO writes together and follow with one read. The read overhead
disappears in mix with other overhead. So while it sucks, it's not
*really* bad...
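
A minimal sketch of the batching described above, in C. Everything here is
a hypothetical stand-in: struct fake_iommu and the pio_* helpers only count
accesses and are not the parisc SBA code. It compares one flushing read per
invalidate write against one read per batch of writes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an IOMMU register interface; it only counts PIOs. */
struct fake_iommu {
    unsigned writes;  /* posted PIO writes (TLB invalidates) issued */
    unsigned reads;   /* PIO reads issued to force the writes out */
};

/* Stand-in for a posted PIO write that invalidates one TLB entry. */
static void pio_write_invalidate(struct fake_iommu *io, uint64_t iova)
{
    (void)iova;
    io->writes++;
}

/* Stand-in for the PIO read that guarantees earlier writes were seen. */
static void pio_read_flush(struct fake_iommu *io)
{
    io->reads++;
}

/* Naive scheme: every invalidate write is followed by its own read. */
static void invalidate_naive(struct fake_iommu *io,
                             const uint64_t *iova, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        pio_write_invalidate(io, iova[i]);
        pio_read_flush(io);
    }
}

/* Batched scheme: issue up to `batch` writes, then a single read. */
static void invalidate_batched(struct fake_iommu *io,
                               const uint64_t *iova, size_t n, size_t batch)
{
    size_t pending = 0;

    for (size_t i = 0; i < n; i++) {
        pio_write_invalidate(io, iova[i]);
        if (++pending == batch) {
            pio_read_flush(io);
            pending = 0;
        }
    }
    if (pending)            /* flush a final partial batch */
        pio_read_flush(io);
}
```

With 64 invalidates and a batch size of 32, the naive loop issues 64 reads
and the batched loop issues 2; the number of writes is the same in both.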

> Also, many IOMMU based PCI
> implementations do not offer the software managed
> prefetching/write-behind facilities when 64-bit DAC is used.
>
> I say stay at 32-bit IOMMU based stuff for now.

ok - tnx!

grant

Grant Grundler
parisc-linux {PCI|IOMMU|SMP} hacker
+1.408.447.7253

* Re: IOMMU setup vs DAC (PCI)
  2001-02-09 19:52   ` David S. Miller
@ 2001-02-09 20:07     ` Kanoj Sarcar
  2001-02-09 20:23       ` David S. Miller
  0 siblings, 1 reply; 8+ messages in thread
From: Kanoj Sarcar @ 2001-02-09 20:07 UTC (permalink / raw)
  To: David S. Miller; +Cc: Grant Grundler, linux-mm

> 
> 
> Kanoj Sarcar writes:
>  > dma_addr_t should be unsigned long, which is 64 bits on 64 bit
>  > architectures, so things are fine there.
>  > 
>  > On regular x86, dma_addr_t is u32, which still works.
> 
> It's 32-bit on sparc64 since 32-bit DMA addresses are all
> we need since the IOMMU is used for everything.

Ok.

> 
> In fact, if your architecture is doing nothing other
> than PCI, you _OUGHT_ to make it 32-bit even on 64-bit
> platforms because the PCI dma interface does not support
> 64-bit DACs in any way, shape, or form until 2.5.x, and then
> a new dma64_addr_t type will be used to denote a DAC
> address.

Way I look at it, if you have a 64 bit platform which has
hardware to send PCI64 data to any piece of memory, then
it would be sad if software were to limit you and say "No,
PCI64 dma data must go within this piece of (low) memory
which the kernel can address with 32 bits". Because, this
assumes usage of bounce buffers, which is not pretty 
performance wise.

In some cases (in 2.4, prior to dma64_addr_t), if arch 
code can figure out a device is A64, the driver does support
A64, then it can privately decide to use A64 style mapping
and pci_dma operations for that pci_dev. Is there a problem
with this approach?

Kanoj

> 
> Later,
> David S. Miller
> davem@redhat.com
> 


* Re: IOMMU setup vs DAC (PCI)
  2001-02-09 20:07     ` Kanoj Sarcar
@ 2001-02-09 20:23       ` David S. Miller
  2001-02-09 21:11         ` Kanoj Sarcar
  0 siblings, 1 reply; 8+ messages in thread
From: David S. Miller @ 2001-02-09 20:23 UTC (permalink / raw)
  To: Kanoj Sarcar; +Cc: Grant Grundler, linux-mm

Kanoj Sarcar writes:
 > In some cases (in 2.4, prior to dma64_addr_t), if arch 
 > code can figure out a device is A64, the driver does support
 > A64, then it can privately decide to use A64 style mapping
 > and pci_dma operations for that pci_dev. Is there a problem
 > with this approach?

Only device code can determine if a device is A64 and will
actually spit out DAC addressing.

Let me give you one example.  On the Syskonnect Gigabit cards,
if any of the top 32-bits of an address are non-zero, DAC will
be used else a SAC cycle will be used for the address.
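
As a rough illustration of that rule in C: the function name needs_dac is
hypothetical and is not taken from any driver. It only encodes the test
"any of the top 32 bits set".

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when a 64-bit bus address has any of
 * its upper 32 bits set, i.e. when a card behaving as described above
 * would issue a dual address cycle (DAC) instead of a single one (SAC).
 */
static bool needs_dac(uint64_t bus_addr)
{
    return (bus_addr >> 32) != 0;
}
```

An address of 0xffffffff still goes out as SAC under this rule, while
0x100000000 is the first address that needs DAC.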

Alpha and Sparc64 PCI controllers interpret DAC and SAC addresses
differently.  For example, on sparc64, a DAC address to physical
memory should be formed by software with this equation:

	DAC_ADDR = (0x03fff00000000000 + PHYS_ADDR)

Alpha, if I remember correctly, uses a different upper constant.
For these two platforms, if SAC is used by the device then
normal IOMMU translation occurs (unless the IOMMU is disabled
thus putting the PCI controller into a bypass mode).
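
The sparc64 equation above, written out as a small C sketch. The macro and
function names are hypothetical; only the constant comes from this message.
The Alpha constant is not given here, so it is left out.

```c
#include <stdint.h>

/* Upper constant quoted above for sparc64; the macro name is made up. */
#define SPARC64_DAC_BASE 0x03fff00000000000ULL

/* Hypothetical helper: form a DAC bus address from a physical address. */
static uint64_t sparc64_dac_addr(uint64_t phys_addr)
{
    return SPARC64_DAC_BASE + phys_addr;
}
```

For example, physical address 0x1000 becomes DAC address
0x03fff00000001000.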

So it is not just "A64 capable", it is "will spit out DAC for
_this_ PCI dma address" and "can arch handle DACs appropriately."

You have to use a different type due to all of these variables.
So we will have dma64_addr_t and pci64_map_single et al.
The driver has to make a conscious decision to use 64-bit
DACs, and all devices I know of supporting DAC must be specifically
told to use DACs.  See things like SCSI_NCR_USE_64BIT_DAC in the
sym53c8xx driver.

The reason these interfaces don't and will not exist in 2.4.x is
precisely because I've had to track down and figure out all of these
arch and device specific details before deciding on an interface
that can work for everyone.  The PCI dma API in 2.4.x is frozen.

In short trying to get 64-bit DAC'able addresses with pci_map_single()
is illegal and any driver doing it is flat out non-portable.

Later,
David S. Miller
davem@redhat.com

* Re: IOMMU setup vs DAC (PCI)
  2001-02-09 20:23       ` David S. Miller
@ 2001-02-09 21:11         ` Kanoj Sarcar
  0 siblings, 0 replies; 8+ messages in thread
From: Kanoj Sarcar @ 2001-02-09 21:11 UTC (permalink / raw)
  To: David S. Miller; +Cc: Grant Grundler, linux-mm

> 
> 
> Kanoj Sarcar writes:
>  > In some cases (in 2.4, prior to dma64_addr_t), if arch 
>  > code can figure out a device is A64, the driver does support
>  > A64, then it can privately decide to use A64 style mapping
>  > and pci_dma operations for that pci_dev. Is there a problem
>  > with this approach?
> 
> Only device code can determine if a device is A64 and will
> actually spit out DAC addressing.
> 
> Let me give you one example.  On the Syskonnect Gigabit cards,
> if any of the top 32-bits of an address are non-zero, DAC will
> be used else a SAC cycle will be used for the address.
> 
> Alpha and Sparc64 PCI controllers interpret DAC and SAC addresses
> differently.  For example, on sparc64, a DAC address to physical
> memory should be formed by software with this equation:
> 
> 	DAC_ADDR = (0x03fff00000000000 + PHYS_ADDR)
> 
> Alpha, if I remember correctly, uses a different upper constant.
> For these two platforms, if SAC is used by the device then
> normal IOMMU translation occurs (unless the IOMMU is disabled
> thus putting the PCI controller into a bypass mode).
> 
> So it is not just "A64 capable", it is "will spit out DAC for
> _this_ PCI dma address" and "can arch handle DACs appropriately."
> 

As a counter example, see the much simpler-to-handle qlogicisp.c
driver, which is programmed at start to use DAC or SAC (via
config option CONFIG_QL_ISP_A64). Also, qlogicfc.c is quite
similar (PCI64_DMA_BITS).

So, if your arch can handle A64, it would build this driver in 
CONFIG_QL_ISP_A64 mode, and the pci_dma implementations would know
that this device/driver can do A64.

> You have to use a different type due to all of these variables.
> So we will have dma64_addr_t and pci64_map_single et al.
> The driver has to make a conscious decision to use 64-bit
> DACs, and all devices I know of supporting DAC must be specifically
> told to use DACs.  See things like SCSI_NCR_USE_64BIT_DAC in the
> sym53c8xx driver.
> 

If the Symbios chips behave similarly to Qlogic chips, then 
SCSI_NCR_USE_64BIT_DAC should really be a config option. 

> The reason these interfaces don't and will not exist in 2.4.x is
> precisely because I've had to track down and figure out all of these
> arch and device specific details before deciding on an interface
> that can work for everyone.  The PCI dma API in 2.4.x is frozen.
> 
> In short trying to get 64-bit DAC'able addresses with pci_map_single()
> is illegal and any driver doing it is flat out non-portable.
> 

Yes, understood. As you point out, the Syskonnect Gigabit card is
probably best operated in A32 mode. 

All I am trying to say is that performance of certain drivers on 
certain architectures might be improvable by certain tricks, even in
2.4.

Kanoj

> Later,
> David S. Miller
> davem@redhat.com
> 

