* [LSF/MM/BPF TOPIC] Enabling Smart Data Stream Accelerator Support for Linux
@ 2025-01-31 17:53 Wei Huang
2025-02-03 10:13 ` Jonathan Cameron
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Wei Huang @ 2025-01-31 17:53 UTC (permalink / raw)
To: lsf-pc
Cc: linux-mm, Don Dutile, Joel Savitz, Moyes, William, Iyer, Shyam,
Lynch, Nathan, mel.gorman, santosh.shukla, Suthikulpanit,
Suravee, shivankg, Michael.Day
Hi All,
I want to propose a talk for the LSFMMBPF conference: Enabling Smart
Data Stream Accelerator (SDXI) Support for Linux.
The smart data stream accelerator (SDXI) is an industry standard [1]
that provides various advanced capabilities, such as offloading DMA
operations, supporting user-space addresses, and offering other advanced
data processing features. With the integration of SDXI into a SoC, DMA
offloading can now be supported across different address spaces. This
talk focuses on a software design which enables comprehensive SDXI
support across multiple software layers in the Linux kernel. These
interfaces not only facilitate SDXI hardware management but also allow
kernel space subsystems and user space applications to directly own and
control SDXI hardware under the protection of the IOMMU.
To illustrate the practical applications of SDXI, Red Hat and AMD
developed a user-space library that leverages the SDXI driver interface,
demonstrating various use cases, such as memory operation offloading, in
both bare-metal and virtual environments.
The prototype device driver [2] and user-space library are available for
testing. We continue to work on the improvement of both components and
plan to upstream the device driver soon.
== DISCUSSION ==
At this conference, we plan to discuss the following with the community:
1) Use Cases
* Linux DMA engine
* Kernel task offloading (e.g., bulk copying)
* QoS and kernel perf integration
* New use cases
2) User-Space API Interface
* IOCTL proposal
* Security control
* User-space app integration
3) Virtualization Support
* Progress & current status
* Challenges
== REFERENCES ==
[1] SDXI 1.0 specification, https://www.snia.org/sdxi
[2] SDXI device driver, https://github.com/AMDESE/linux-sdxi
Thanks,
-Wei
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Enabling Smart Data Stream Accelerator Support for Linux
2025-01-31 17:53 [LSF/MM/BPF TOPIC] Enabling Smart Data Stream Accelerator Support for Linux Wei Huang
@ 2025-02-03 10:13 ` Jonathan Cameron
2025-02-03 14:23 ` Jason Gunthorpe
2025-02-04 0:52 ` Wei Huang
2025-02-04 4:19 ` [Lsf-pc] " Dan Williams
2025-02-04 7:59 ` Christoph Hellwig
2 siblings, 2 replies; 7+ messages in thread
From: Jonathan Cameron @ 2025-02-03 10:13 UTC (permalink / raw)
To: Wei Huang
Cc: lsf-pc, linux-mm, Don Dutile, Joel Savitz, Moyes, William, Iyer, Shyam,
Lynch, Nathan, mel.gorman, santosh.shukla, Suthikulpanit,
Suravee, shivankg, Michael.Day, Zhangfei Gao, Zhou Wang,
Shameer Kolothum, Jason Gunthorpe

On Fri, 31 Jan 2025 11:53:07 -0600
Wei Huang <wei.huang2@amd.com> wrote:

> Hi All,
>
> I want to propose a talk for the LSFMMBPF conference: Enabling Smart
> Data Stream Accelerator (SDXI) Support for Linux.
>
> The smart data stream accelerator (SDXI) is an industry standard [1]
> that provides various advanced capabilities, such as offloading DMA
> operations, supporting user-space addresses, and offering other advanced
> data processing features. With the integration of SDXI into a SoC, DMA
> offloading can now be supported across different address spaces. This
> talk focuses on a software design which enables comprehensive SDXI
> support across multiple software layers in the Linux kernel. These
> interfaces not only facilitate SDXI hardware management but also allow
> kernel space subsystems and user space applications to directly own and
> control SDXI hardware under the protection of the IOMMU.
>
> To illustrate the practical applications of SDXI, Red Hat and AMD
> developed a user-space library that leverages the SDXI driver interface,
> demonstrating various use cases, such as memory operation offloading, in
> both bare-metal and virtual environments.
>
> The prototype device driver [2] and user-space library are available for
> testing. We continue to work on the improvement of both components and
> plan to upstream the device driver soon.
>
> == DISCUSSION ==
> At this conference, we plan to discuss the following with the community:

Hi Wei,

Lots of topics and hints at interesting areas, but I'd like to see more
details to understand how this maps to other data moving / reorganizing
accelerators. Whilst SDXI looks like a good and feature rich spec,
I'm curious what is fundamentally new? Perhaps it is just the
right time to improve functionality for DMA engines in general.

>
> 1) Use Cases
> * Linux DMA engine
> * Kernel task offloading (e.g., bulk copying)
> * QoS and kernel perf integration
> * New use cases

All interesting topics across this particular DMA engine and many others.
For new use cases are you planning to bring some, or is this a request
for suggestions?

>
> 2) User-Space API Interface
> * IOCTL proposal

I'm curious about this aspect and how it compares with previous approaches.
Obviously it brings some new operators and possibly needs to target remote
memory. However, we have existing support for userspace access to accelerators
for crypto, compression, etc. (and much broader).

We went through a similar process finding a path to support those a few
years ago and ended up with UACCE (drivers/misc/uacce, with lots of stuff
under drivers/crypto). If there is overlap, it would be good to figure
out a path that reduces duplication / complexity of interfacing with
the various userspace projects we all care about. I won't tell the stories
of the pain and redesigns it took to get UACCE upstream, but if you are doing
another new thing, good luck! (+CC some folk more familiar and active
in this space than I am).

> * Security control
> * User-space app integration
>
> 3) Virtualization Support
> * Progress & current status

Good to have some more detail on this in particular. Is this mostly blocked
on vSVA, IOMMUFD etc progress or is there something new?

> * Challenges
>
> == REFERENCES ==
> [1] SDXI 1.0 specification, https://www.snia.org/sdxi
> [2] SDXI device driver, https://github.com/AMDESE/linux-sdxi
>
> Thanks,
> -Wei

Thanks,

Jonathan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Enabling Smart Data Stream Accelerator Support for Linux
2025-02-03 10:13 ` Jonathan Cameron
@ 2025-02-03 14:23 ` Jason Gunthorpe
2025-02-04 0:59 ` Wei Huang
2025-02-04 0:52 ` Wei Huang
1 sibling, 1 reply; 7+ messages in thread
From: Jason Gunthorpe @ 2025-02-03 14:23 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Wei Huang, lsf-pc, linux-mm, Don Dutile, Joel Savitz, Moyes, William,
Iyer, Shyam, Lynch, Nathan, mel.gorman, santosh.shukla,
Suthikulpanit, Suravee, shivankg, Michael.Day, Zhangfei Gao,
Zhou Wang, Shameer Kolothum

On Mon, Feb 03, 2025 at 10:13:23AM +0000, Jonathan Cameron wrote:

> Lots of topics and hints at interesting areas, but I'd like to see more
> details to understand how this maps to other data moving / reorganizing
> accelerators. Whilst SDXI looks like a good and feature rich spec,
> I'm curious what is fundamentally new? Perhaps it is just the
> right time to improve functionality for DMA engines in general.

It looks quite a lot like Intel's IDXD to me, which seems to have a lot
of overlap.

> > [2] SDXI device driver, https://github.com/AMDESE/linux-sdxi

Sorry, but a never-posted branch based on a v6.6 kernel using a lot of
obsoleted and removed iommu APIs doesn't seem like LSF/MM content to
me. LSF/MM is supposed to be a problem-solving conference; you should
ideally have something in active discussion on the mailing list with
an unresolved problem to talk about.

Jason
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Enabling Smart Data Stream Accelerator Support for Linux
2025-02-03 14:23 ` Jason Gunthorpe
@ 2025-02-04 0:59 ` Wei Huang
0 siblings, 0 replies; 7+ messages in thread
From: Wei Huang @ 2025-02-04 0:59 UTC (permalink / raw)
To: Jason Gunthorpe, Jonathan Cameron
Cc: lsf-pc, linux-mm, Don Dutile, Joel Savitz, Moyes, William, Iyer, Shyam,
Lynch, Nathan, mel.gorman, santosh.shukla, Suthikulpanit,
Suravee, shivankg, Michael.Day, Zhangfei Gao, Zhou Wang,
Shameer Kolothum

On 2/3/25 8:23 AM, Jason Gunthorpe wrote:
> On Mon, Feb 03, 2025 at 10:13:23AM +0000, Jonathan Cameron wrote:
>
>> Lots of topics and hints at interesting areas, but I'd like to see more
>> details to understand how this maps to other data moving / reorganizing
>> accelerators. Whilst SDXI looks like a good and feature rich spec,
>> I'm curious what is fundamentally new? Perhaps it is just the
>> right time to improve functionality for DMA engines in general.
>
> It looks quite a lot like Intel's IDXD to me, which seems to have a lot
> of overlap.

It does overlap with IDXD in certain areas of functionality. However, the
two take different design approaches that might be worth discussing.

>
>>> [2] SDXI device driver, https://github.com/AMDESE/linux-sdxi
>
> Sorry, but a never-posted branch based on a v6.6 kernel using a lot of
> obsoleted and removed iommu APIs doesn't seem like LSF/MM content to
> me. LSF/MM is supposed to be a problem-solving conference; you should
> ideally have something in active discussion on the mailing list with
> an unresolved problem to talk about.

Internally we have a rebase onto kernel 6.12, which can be shared on
GitHub immediately. Regarding upstreaming, Nathan Lynch and I are working
on an SDXI upstream patchset.

>
> Jason
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Enabling Smart Data Stream Accelerator Support for Linux
2025-02-03 10:13 ` Jonathan Cameron
2025-02-03 14:23 ` Jason Gunthorpe
@ 2025-02-04 0:52 ` Wei Huang
1 sibling, 0 replies; 7+ messages in thread
From: Wei Huang @ 2025-02-04 0:52 UTC (permalink / raw)
To: Jonathan Cameron
Cc: lsf-pc, linux-mm, Don Dutile, Joel Savitz, Moyes, William, Iyer, Shyam,
Lynch, Nathan, mel.gorman, santosh.shukla, Suthikulpanit,
Suravee, shivankg, Michael.Day, Zhangfei Gao, Zhou Wang,
Shameer Kolothum, Jason Gunthorpe

On 2/3/25 4:13 AM, Jonathan Cameron wrote:
> On Fri, 31 Jan 2025 11:53:07 -0600
> Wei Huang <wei.huang2@amd.com> wrote:
>
>> Hi All,
>>
>> I want to propose a talk for the LSFMMBPF conference: Enabling Smart
>> Data Stream Accelerator (SDXI) Support for Linux.
>>
>> The smart data stream accelerator (SDXI) is an industry standard [1]
>> that provides various advanced capabilities, such as offloading DMA
>> operations, supporting user-space addresses, and offering other advanced
>> data processing features. With the integration of SDXI into a SoC, DMA
>> offloading can now be supported across different address spaces. This
>> talk focuses on a software design which enables comprehensive SDXI
>> support across multiple software layers in the Linux kernel. These
>> interfaces not only facilitate SDXI hardware management but also allow
>> kernel space subsystems and user space applications to directly own and
>> control SDXI hardware under the protection of the IOMMU.
>>
>> To illustrate the practical applications of SDXI, Red Hat and AMD
>> developed a user-space library that leverages the SDXI driver interface,
>> demonstrating various use cases, such as memory operation offloading, in
>> both bare-metal and virtual environments.
>>
>> The prototype device driver [2] and user-space library are available for
>> testing. We continue to work on the improvement of both components and
>> plan to upstream the device driver soon.
>>
>> == DISCUSSION ==
>> At this conference, we plan to discuss the following with the community:
>
> Hi Wei,
>
> Lots of topics and hints at interesting areas, but I'd like to see more
> details to understand how this maps to other data moving / reorganizing
> accelerators. Whilst SDXI looks like a good and feature rich spec,
> I'm curious what is fundamentally new? Perhaps it is just the

Compared with existing implementations, I think the following (combined)
features are interesting:
* An open industry standard that is architecture-agnostic
* Support for various address spaces, including user-mode ones
* Designed with virtualization in mind; easy passthrough and migration
* Easily extensible for future functionality

> right time to improve functionality for DMA engines in general.
>
>>
>> 1) Use Cases
>> * Linux DMA engine
>> * Kernel task offloading (e.g., bulk copying)
>> * QoS and kernel perf integration
>> * New use cases
>
> All interesting topics across this particular DMA engine and many others.
> For new use cases are you planning to bring some, or is this a request
> for suggestions?

Both. Some use cases we tested:
* Serve as a DMA engine in Linux
* AutoNUMA offloading
* Memory zeroing for large VM memory initialization
* Batching folio copy operations

We do expect more use cases, and want to solicit ideas from the community.

>
>>
>> 2) User-Space API Interface
>> * IOCTL proposal
>
> I'm curious about this aspect and how it compares with previous approaches.
> Obviously it brings some new operators and possibly needs to target remote
> memory. However, we have existing support for userspace access to accelerators
> for crypto, compression, etc. (and much broader).
>
> We went through a similar process finding a path to support those a few
> years ago and ended up with UACCE (drivers/misc/uacce, with lots of stuff
> under drivers/crypto). If there is overlap, it would be good to figure
> out a path that reduces duplication / complexity of interfacing with
> the various userspace projects we all care about. I won't tell the stories
> of the pain and redesigns it took to get UACCE upstream, but if you are doing
> another new thing, good luck! (+CC some folk more familiar and active
> in this space than I am).

Thanks for the pointer. Right now, as a prototype, we don't take the UACCE
approach, but we do see that UACCE could be utilized in this space. This can
be part of the discussion.

>
>> * Security control
>> * User-space app integration
>>
>> 3) Virtualization Support
>> * Progress & current status
>
> Good to have some more detail on this in particular. Is this mostly blocked
> on vSVA, IOMMUFD etc progress or is there something new?

It is blocked by vIOMMU support in both the kernel and QEMU/KVM. To support
SVA inside VMs, we have to present a virtual IOMMU to guest VMs. There are
various ways of implementing a virtual IOMMU, but the AMD hardware vIOMMU is
supposed to perform better than an emulated vIOMMU (there was a KVM Forum
talk by Suravee and me for reference). We have a prototype implementation.
Right now Suravee is working on cleaning up and finishing the hardware vIOMMU
patches for upstream.

>
>> * Challenges
>>
>> == REFERENCES ==
>> [1] SDXI 1.0 specification, https://www.snia.org/sdxi
>> [2] SDXI device driver, https://github.com/AMDESE/linux-sdxi
>>
>> Thanks,
>> -Wei
>>
> Thanks,
>
> Jonathan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Enabling Smart Data Stream Accelerator Support for Linux
2025-01-31 17:53 [LSF/MM/BPF TOPIC] Enabling Smart Data Stream Accelerator Support for Linux Wei Huang
2025-02-03 10:13 ` Jonathan Cameron
@ 2025-02-04 4:19 ` Dan Williams
2025-02-04 7:59 ` Christoph Hellwig
2 siblings, 0 replies; 7+ messages in thread
From: Dan Williams @ 2025-02-04 4:19 UTC (permalink / raw)
To: Wei Huang via Lsf-pc
Cc: linux-mm, Don Dutile, Joel Savitz, Moyes, William, Iyer, Shyam,
Lynch, Nathan, mel.gorman, santosh.shukla, Suthikulpanit,
Suravee, shivankg, Michael.Day

Wei Huang via Lsf-pc wrote:
> Hi All,
>
> I want to propose a talk for the LSFMMBPF conference: Enabling Smart
> Data Stream Accelerator (SDXI) Support for Linux.
>
> The smart data stream accelerator (SDXI) is an industry standard [1]
> that provides various advanced capabilities, such as offloading DMA
> operations, supporting user-space addresses, and offering other advanced
> data processing features. With the integration of SDXI into a SoC, DMA
> offloading can now be supported across different address spaces. This
> talk focuses on a software design which enables comprehensive SDXI
> support across multiple software layers in the Linux kernel. These
> interfaces not only facilitate SDXI hardware management but also allow
> kernel space subsystems and user space applications to directly own and
> control SDXI hardware under the protection of the IOMMU.
>
> To illustrate the practical applications of SDXI, Red Hat and AMD
> developed a user-space library that leverages the SDXI driver interface,
> demonstrating various use cases, such as memory operation offloading, in
> both bare-metal and virtual environments.
>
> The prototype device driver [2] and user-space library are available for
> testing. We continue to work on the improvement of both components and
> plan to upstream the device driver soon.
>
> == DISCUSSION ==
> At this conference, we plan to discuss the following with the community:
>
> 1) Use Cases
> * Linux DMA engine

Is this a use case? In other words, copy-offload engines have struggled
for more than a decade to impact kernel use cases due to the maintenance
burden of split async / synchronous paths for
kernel-buffer-to-kernel-buffer copies. The Linux dmaengine subsystem
mainly stayed relevant due to device-DMA use cases.

> * Kernel task offloading (e.g., bulk copying)
> * QoS and kernel perf integration
> * New use cases

I think for this effort to be successful it needs to focus on one
embarrassingly clear use case where CPU copy hits a scaling wall, not a
gallery of potential use cases.

> 2) User-Space API Interface
> * IOCTL proposal
> * Security control
> * User-space app integration

The best API for copy-offload is no new API, i.e. transparent
acceleration of existing software. For example, io_uring is already
asking applications to rewrite and submit bulk work to the kernel. It
would be lovely if the applications got copy offload for free after
paying the io_uring conversion cost.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [LSF/MM/BPF TOPIC] Enabling Smart Data Stream Accelerator Support for Linux
2025-01-31 17:53 [LSF/MM/BPF TOPIC] Enabling Smart Data Stream Accelerator Support for Linux Wei Huang
2025-02-03 10:13 ` Jonathan Cameron
2025-02-04 4:19 ` [Lsf-pc] " Dan Williams
@ 2025-02-04 7:59 ` Christoph Hellwig
2 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2025-02-04 7:59 UTC (permalink / raw)
To: Wei Huang
Cc: lsf-pc, linux-mm, Don Dutile, Joel Savitz, Moyes, William, Iyer, Shyam,
Lynch, Nathan, mel.gorman, santosh.shukla, Suthikulpanit,
Suravee, shivankg, Michael.Day

This is a lot of mumbling for something that should have had a driver
for the basic dmaengine functionality upstream for a long time. I'd
suggest you spend your time on upstreaming that driver first, and then
send actual code proposing anything beyond that to the list first and
start a discussion if that doesn't get anywhere.
^ permalink raw reply [flat|nested] 7+ messages in thread