From: "Luis R. Rodriguez" <mcgrof@kernel.org>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: "Oded Gabbay" <oded.gabbay@gmail.com>,
"Jörg Rödel" <jroedel@suse.de>,
ksummit-discuss@lists.linuxfoundation.org,
"Mauro Carvalho Chehab" <mchehab@osg.samsung.com>,
"vegard.nossum@gmail.com" <vegard.nossum@gmail.com>,
"rafael.j.wysocki" <rafael.j.wysocki@intel.com>,
"Cristina Moraru" <cristina.moraru09@gmail.com>,
"Roberto Di Cosmo" <roberto@dicosmo.org>,
"Marek Szyprowski" <m.szyprowski@samsung.com>,
"Stefano Zacchiroli" <zack@upsilon.cc>,
"Valentin Rothberg" <valentinrothberg@gmail.com>
Subject: Re: [Ksummit-discuss] [TECH TOPIC] Addressing complex dependencies and semantics (v2)
Date: Tue, 2 Aug 2016 02:56:50 +0200
Message-ID: <20160802005650.GF3296@wotan.suse.de>
In-Reply-To: <1555444.RIYpEbA2F7@vostro.rjw.lan>

On Tue, Aug 02, 2016 at 02:01:56AM +0200, Rafael J. Wysocki wrote:
> On Monday, August 01, 2016 09:03:09 PM Luis R. Rodriguez wrote:
> > On Fri, Jul 29, 2016 at 12:13:03PM +0100, Mark Brown wrote:
> > > On Fri, Jul 29, 2016 at 09:45:55AM +0200, Hans Verkuil wrote:
> > >
> > > > My main problem is not so much with deferred probe (esp. for cyclic dependencies
> > > > it is a simple method of solving this, and simple is good). My main problem is
> > > > that you can't tell the system that driver A needs to be probed after drivers B,
> > > > C and D are probed first.
> > >
> > > > That would allow us to get rid of v4l2-async.c which is a horrible hack.
> >
> > I'd like to understand the requirement for this a bit better, so someone explaining
> > this would be good if this moves forward as a tech session.
>
> One case I'm familiar with is when a device has two (or more) device IDs, where
> one is more specific than the other. The idea being that if the OS has a driver
> for that particular device, it will use the more specific device ID (say A) to
> look for it, but otherwise it will use the other "generic" ID (say B) to match
> against a "generic" driver.
>
> Of course, that only works if the driver for A is probed before the driver for B.

That's a good case to keep in mind, and it is indeed complex.

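To make the ordering problem concrete, here is a toy model in Python (not
kernel code). It assumes a hypothetical bind() helper that walks drivers in
probe order and picks the first one matching any of the device's IDs; the
driver names and the IDs "A" and "B" are made up for illustration.

```python
def bind(device_ids, drivers):
    """Return the name of the first driver, in probe order, matching any device ID."""
    ids = set(device_ids)
    for name, supported in drivers:
        if supported & ids:
            return name
    return None


# Hypothetical device exposing a specific ID "A" and a generic fallback ID "B".
device_ids = ["A", "B"]
specific = ("driver_for_A", {"A"})
generic = ("driver_for_B", {"B"})

# The specific driver is probed first and claims the device.
good_order = bind(device_ids, [specific, generic])
# The generic driver is probed first and claims the device instead.
bad_order = bind(device_ids, [generic, specific])
```

In this model the outcome depends only on the order of the driver list, which
is the situation described above.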
> > > > That code allows a bridge driver to wait until all dependent drivers are probed.
> > > > This really should be core functionality.
> > >
> > > > Do other subsystems do something similar like drivers/media/v4l2-core/v4l2-async.c?
> > > > Does anyone know?
> > >
> > > ASoC does, it has an explicit card driver to join things together and
> > > that just defers probe until everything it needs is present. This was
> > > originally open coded in ASoC but once deferred probe was implemented we
> > > converted to that.
> >
> > OK that's 2.
> >
> > Another piece of code that deals with complex dependencies I've recently
> > ventured into was the x86 IOMMU stuff. The complexities here are both at
> > the built-in init level and also later at probe.
> >
> > For the built-in code we have a run time sort: based on a simple set of
> > semantics used to declare dependencies, the code that was compiled in is
> > sorted so that the code that needs to initialize first runs first. hpa
> > had suggested long ago that generalizing this was desirable, so I've taken on
> > that task and have some basic building blocks which are now being proposed in
> > the RFC v3 series [0]. If it seems that the run time sort mechanism is left
> > out, that's correct: upon review with Andy on RFC v2 we don't *yet* need a
> > run time sort, even though I expanded on the existing one for IOMMU and
> > strengthened the semantics there. It's still available in the userspace
> > mockup solution for linker tables though [1]. I should note that I did at
> > least determine that generalizing a sort for dependency maps which shuffles
> > code at run time did not make sense, because you'd need to generalize the
> > building blocks used for defining a dependency map: if you move one item used
> > for init from the end to the front, your iteration function must use the same
> > structure in the same way. Run time sorts for code sections are then left up
> > to each subsystem to implement. On x86, I evaluated sorting out dependencies
> > further, for instance in setup_arch() -- however it just did not seem needed
> > at this point. For other subsystems this may make more sense.
> >
> > At the probe level IOMMUs face another issue when dealing with GPU drivers.
> > The kernel has 7 levels of initialization possible (pure_initcall() == 0,
> > core_initcall() == 1, ..., fs_initcall() == 5, device_initcall() == 6,
> > late_initcall() == 7), module_init() maps to device_initcall(), but modules may
> > also use late_initcall() if they know a driver built-in needs to definitely run
> > very late. Naturally if you're not a piece of code running very early on, and
> > you want the option to be a module as well (device_initcall == 6) then you
> > really only have at your disposal two levels available before you get to an end
> > device driver. It turns out that for GPU drivers this is not enough to map out
> > enough dependencies at the higher level, so the next best solution
> > available is to implicitly rely on linker order. This is the case for the
> > ordering between AMD IOMMU v1, AMD IOMMU v2 (depends on AMD IOMMU v1), the
> > AMD KFD driver, and the AMD radeon driver:
> >
> > 0. AMD IOMMUv1: arch/x86/kernel/pci-dma.c
> > 1. AMD IOMMUv2: drivers/iommu/amd_iommu_v2.c
> > 2. AMD KFD: drivers/gpu/drm/amd/amdkfd/kfd_module.c
> > 3. AMD Radeon: drivers/gpu/drm/radeon/radeon_drv.c
> >
> > For details refer to a recent discussion on this [2].
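A toy model in Python (not kernel code) of what "implicitly rely on linker
order" means. It assumes entries run sorted by level, with ties broken by
their position in the link; the component names and the choice to put all
three at device_initcall level are assumptions for illustration only.

```python
# Numeric levels as listed above; only the ones used here are named.
FS_INITCALL, DEVICE_INITCALL, LATE_INITCALL = 5, 6, 7


def run_order(link_order):
    """Sort by level only; sorted() is stable, so link order breaks ties."""
    return [name for name, _level in sorted(link_order, key=lambda e: e[1])]


# (name, level) pairs in a hypothetical link order, all at the same level.
linked = [
    ("amd_iommu_v2", DEVICE_INITCALL),
    ("amdkfd", DEVICE_INITCALL),
    ("radeon", DEVICE_INITCALL),
]
order = run_order(linked)

# Swapping two entries in the link changes the run order, even though no
# level changed: the dependency lives only in the link order.
swapped = run_order([linked[1], linked[0], linked[2]])
```

Nothing in the data says that one entry needs another; the ordering is
correct only as long as the link order happens to stay that way.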
> >
> > Using linker tables for initialization should enable us to scale initialization
> > well beyond 7 levels. In fact, the number of levels is limited only by how
> > long an ASCII string is acceptable in C, as the order level is stored as a
> > string (part of the section name). It would also enable each subsystem to
> > have its own set of levels / heuristics for initialization. Ordering would
> > actually be done at link time, so the code is already ordered *iff* all
> > dependency heuristics are available at compilation time. If dependency
> > information is only available at run time, a run time sort is an option:
> > using the existing x86 IOMMU code as an example and precedent (and
> > considering my expanded, stricter semantics), in the future this can be
> > available to any subsystem as a supplement.
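A toy model in Python of ordering by a string embedded in the section name.
The ".tbl.<table>.<order>" naming, the helper names and the entry names are
hypothetical and only meant to show the idea of a name-based sort; they are
not taken from the RFC series.

```python
def section_name(table, order):
    """Build a hypothetical section name carrying a zero-padded order level."""
    return f".tbl.{table}.{order:04d}"


def link(entries):
    """Mimic a name-based sort: order entries by their section name."""
    return [fn for _section, fn in sorted(entries, key=lambda e: e[0])]


# Entries listed in arbitrary source order; the section name decides the rest.
entries = [
    (section_name("init", 300), "gpu_init"),
    (section_name("init", 20), "iommu_v2_init"),
    (section_name("init", 1), "iommu_v1_init"),
]
linked = link(entries)
```

One design detail this sketch makes visible: the comparison is on strings, so
unpadded numbers would sort "10" before "2". A fixed-width order field avoids
that.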
> >
> > My interest in the media controller and the SAT solver is that I consider
> > the run time sort nothing but a sloppy hack (see my userspace tree for tons of
> > semantics fixes), and formalizing this into something more concrete that can
> > be generic should be more useful to other parts of the kernel. This is why I
> > wanted to discuss these things. Is there a generic common goal here and
> > can we share code? If so, what would that look like?
> >
> > My impression at first is that linker-time dependencies should suffice to cover
> > a lot of cases, but that still means binutils ld SORT() could potentially be
> > enhanced further in the future, especially if a simple dependency map could
> > be expressed. For instance, can we make it run faster, or would having it
> > interpret certain other criteria enable us to use a SAT solver to optimize
> > dependency resolution / ordering? Then there is run time sorting and
> > optimization -- but the same questions apply here.
> >
> > I think a fruitful session would be to:
> >
> >   o review / explain each of the complex ordering problems from different
> >     kernel subsystems
> >   o review / explain existing solutions to these problems
> >   o finally try to figure out if a common solution is possible, both
> >     short term and long term
> >
> > [0] https://lkml.kernel.org/r/1469222687-1600-1-git-send-email-mcgrof@kernel.org
> > [1] https://git.kernel.org/cgit/linux/kernel/git/mcgrof/linker-tables.git/
> > [2] https://lkml.kernel.org/r/1464311916-10065-1-git-send-email-mcgrof@kernel.org
>
> I'm still wondering why probe ordering is special and suspend-resume and runtime
> PM ordering isn't (and shutdown ordering too, for that matter).
>
> In the majority of cases when probe ordering matters, suspend/resume and runtime
> PM ordering matters too, so considering one of them alone doesn't seem very
> useful to me.
>
> Now, even if you can address probe ordering issues by using some link-time
> methods and similar, runtime PM is async by nature and suspend/resume is
> async for many devices too (for efficiency reasons) and the only approach
> that works here is to have the dependencies represented as data somehow
> and use those when you need to carry out the operation.

Apologies, that's why I Cc'd you -- I did mean to cover both. Having also
reviewed your work, I figured similar issues were being dealt with there. So
indeed, what I am suggesting is that perhaps we can kill a few birds with one
stone here.

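As a rough sketch of the "dependencies represented as data" idea, here is a
toy model in Python (not kernel code). It assumes a hypothetical dependency
map, chained in the order of the AMD listing quoted above, and derives both a
probe order and a suspend order from the same data. Suspending in the reverse
of probe order is an assumption of this sketch.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency data: each device maps to the devices it needs first.
deps = {
    "iommu_v1": [],
    "iommu_v2": ["iommu_v1"],
    "kfd": ["iommu_v2"],
    "radeon": ["kfd"],
}


def probe_order(deps):
    """Suppliers first: every device comes after the devices it depends on."""
    return list(TopologicalSorter(deps).static_order())


def suspend_order(deps):
    """Consumers first: here simply the reverse of the probe order."""
    return list(reversed(probe_order(deps)))
```

The point is that one description serves several operations; nothing here
depends on link order or on initcall levels.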
Luis