Re: [MAINTAINER SUMMIT] User-space requirements for accelerator drivers

ksummit.lists.linux.dev archive mirror

From: Dave Airlie <airlied@gmail.com>
To: Linus Walleij <linus.walleij@linaro.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>, Greg KH <greg@kroah.com>,
	 Leon Romanovsky <leon@kernel.org>,
	Laurent Pinchart <laurent.pinchart@ideasonboard.com>,
	 Thomas Gleixner <tglx@linutronix.de>,
	Josh Triplett <josh@joshtriplett.org>,
	 Mauro Carvalho Chehab <mchehab@kernel.org>,
	Jonathan Corbet <corbet@lwn.net>,
	ksummit@lists.linux.dev,  dev@tvm.apache.org
Subject: Re: [MAINTAINER SUMMIT] User-space requirements for accelerator drivers
Date: Mon, 13 Sep 2021 09:15:05 +1000
Message-ID: <CAPM=9txX_f2iRPTGEeqbHqbPxZ2X-e4RStGXO-rz3KC2n4yyiw@mail.gmail.com>
In-Reply-To: <CACRpkdZRy8b3B8chCnpEHV3_qfBS6kCqMNmCPy4MV0vf0-AsAw@mail.gmail.com>

On Mon, 13 Sept 2021 at 08:52, Linus Walleij <linus.walleij@linaro.org> wrote:
>
> On Sun, Sep 12, 2021 at 11:13 PM Dave Airlie <airlied@gmail.com> wrote:
>
> > For userspace components as well these communities of experts need to
> > exist for each domain, and we need to encourage upstream first
> > processes across the board for these split kernel/userspace stacks.
> >
> > The habanalabs compiler backend is an LLVM fork, I'd like to see the
> > effort to upstream that LLVM backend into LLVM proper.
>
> I couldn't agree more.
>
> A big part of the problem with inference engines / NPU:s is that of no
> standardized userspace. Several of the machine learning initiatives
> from some years back now have stale git repositories and are
> visibly unmaintained, c.f. Caffe https://github.com/BVLC/caffe
> last commit 2 years ago.
>
> In a discussion thread at LWN I raised Apache TVM as a currently
> quite obviously alive and kicking community, and these people have
> the ambition to provide "an open source machine learning compiler
> framework for CPUs, GPUs, and machine learning accelerators".
> https://tvm.apache.org/
> At least they have all relevant companies logotypes on their homepage,
> so there is some kind of commitment.
> You can find for example from Arm an RFC for real HW accelerator code
> support using (out of tree) Linux kernel drivers with Apache TVM:
> https://discuss.tvm.apache.org/t/rfc-ethosn-arm-ethos-n-integration/6680
>
> Then there is Google's TensorFlow. How open is that for a random
> HW vendor who want to integrate their accelerator and how open is
> it to working with the kernel community? Then there is PyTorch.
> All of these apparently active. Well CPU vendors often support
> two different compilers so I guess they could very well support
> three machine learning userspaces, why not.
>
> What confuses me is what kind of time horizon and longevity these
> projects have, and what level of commitment is involved and
> what ambition. Especially to what extent they would care about
> working with the Linux kernel community. (TVM have a mail
> address so I added them on CC.)
>
> Habanalabs propose an LLVM fork as compiler, yet the Intel
> logo is on the Apache TVM website, and no sign of integrating with
> that project. They claim to support also TensorFlow.
>
> The way I perceive it is that there simply isn't any GCC/LLVM or
> Gallium 3D of NPU:s, these people haven't yet decided that "here
> is that userspace we are all going to use". Or have they?
>
> LLVM? TVM? TensorFlow? PyTorch? Some other one?

Yeah, I've been doing the same research, and I think the Glow project
also belongs on the list.

The thing is control: everyone wants to run it. When it comes to Linux,
nearly all the vendors have realised they've lost their control and
learned to live with it, but the second they are in userspace, it's
like "hey, we need to be in charge of every single piece of this", and
so they lose the Linux kernel advantage of pooling engineering
expertise cross-vendor.

I certainly don't want to be the distro packager having to package 30
forks of LLVM for 20 different vendor accelerators with 20 runtime
APIs and 20 forks of TVM/TensorFlow/PyTorch.

To me, enabling that behaviour by just merging kernel drivers and
washing our hands of it seems like a large misstep for the future
maintainability of the kernel, especially as these devices start
interacting with GPUs or RDMA and we get locked into unmovable
interfaces that we can't even analyse for deadlocks etc.

Dave.
