ksummit.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
To: Greg KH <greg@kroah.com>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>,
	Mark Brown <broonie@kernel.org>,
	Linus Walleij <linus.walleij@linaro.org>,
	Roland Dreier <roland@kernel.org>,
	Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>,
	ksummit@lists.linux.dev, Daniel Vetter <daniel@ffwll.ch>
Subject: Re: cdev/devm_* issues (was Re: [TECH TOPIC] Rust for Linux)
Date: Thu, 8 Jul 2021 11:23:30 +0300	[thread overview]
Message-ID: <YOa2Au2TEjCpm1xX@pendragon.ideasonboard.com> (raw)
In-Reply-To: <YOagA4bgdGYos5aa@kroah.com>

Hi Greg,

On Thu, Jul 08, 2021 at 08:49:39AM +0200, Greg KH wrote:
> On Wed, Jul 07, 2021 at 08:17:08PM +0300, Laurent Pinchart wrote:
> > On Wed, Jul 07, 2021 at 02:45:04PM +0200, Greg KH wrote:
> > > On Wed, Jul 07, 2021 at 01:38:44PM +0100, James Bottomley wrote:
> > > > On Wed, 2021-07-07 at 14:20 +0200, Greg KH wrote:
> > > > > On Wed, Jul 07, 2021 at 12:34:31PM +0100, James Bottomley wrote:
> > > > > > On Wed, 2021-07-07 at 11:50 +0100, Mark Brown wrote:
> > > > > > > On Tue, Jul 06, 2021 at 10:36:14PM +0200, Linus Walleij wrote:
> > > > > > > > On Tue, Jul 6, 2021 at 10:00 PM Roland Dreier wrote:
> > > > > > > > > "devres" / devm_xxx was an attempt to deal with this in C,
> > > > > > > > > but it only solves some cases of this and has not received a
> > > > > > > > > lot of adoption (we can argue about the reasons).
> > > > > > > >  
> > > > > > > > Really? From my point of view that is adopted all over the map.
> > > > > > > > I add new users all the time and use it as much as I can when
> > > > > > > > writing new drivers.
> > > > > > > 
> > > > > > > Yes, it's *super* widely used in most of the kernel.  Perhaps
> > > > > > > there's some subsystems that reject it for some reason.
> > > > > > > 
> > > > > > > > I think it's a formidable success, people just need to learn to
> > > > > > > > do it more.
> > > > > > > 
> > > > > > > There *are* issues with people adopting it too enthusiastically -
> > > > > > > as well as the memory lifetime issues that Laurent mentioned it's
> > > > > > > easy for it to cause problems with interrupt handlers that are
> > > > > > > left live longer than they should be and try to use things that
> > > > > > > were already deallocated.
> > 
> > (CC'ing Daniel Vetter as the author of the DRM devres-based resource
> > management)
> > 
> > I've given this lots of thoughts lately, in the context of V4L2, but it
> > should be roughly the same for character devices in general. Here's what
> > I think should be done for drivers that expose character devices.
> > 
> > - Drivers must stop allocating their internal data structure (the one
> >   they set as device data with dev_set_drvdata()) with devm_kzalloc().
> >   The data structure must instead be allocated with a plain kzalloc()
> >   and reference-counted.
> > 
> >   Most drivers will register a single character device using a
> >   subsystem-specific API (e.g. video_register_device() in V4L2). The
> >   subsystem needs to provide a .release() callback, called when the
> >   last reference to the character device is released. Drivers must
> >   implement this, and can simply free their internal data structure at
> >   this point.
> > 
> >   For drivers that register multiple character devices, or in general
> >   expose multiple interfaces to userspace or other parts of the kernel,
> >   the internal data structure must be properly reference-counted, with a
> >   reference released in each .release() callback. There may be ways to
> >   simplify this.
> > 
> >   This can be seen as going back to the pre-devm_kzalloc() era, but it's
> >   only about undoing a mistake that was done way too often (to be fair,
> >   many drivers used to just kfree() the data in the driver's .remove()
> >   operation, so it wasn't always a regression, only enshrining a
> >   preexisting bad practice).
> > 
> >   This is only part of the puzzle though. There's a remove/use race that
> >   still needs to be solved.
> > 
> > - In .remove(), drivers must inform the character device that new access
> >   from userspace are not allowed anymore. This would set a flag in
> >   struct cdev that would result in all new calls from userspace through
> >   file operations to be rejected (with -ENXIO for instance). This should
> >   be wrapped in subsystem-specific functions (e.g.
> >   video_device_disconnect() wrapping cdev_disconnect()). From now on, no
> >   new calls are possible, but existing calls may be in progress.
> > 
> > - Drivers then need to cancel all pending I/O and wait for completion.
> >   I/O (either direct memory-mapped I/O or through bus APIs, such as I2C
> >   or USB transactions) are not allowed anymore after .remove() returns.
> >   This will have the side effect of waking up userspace contexts that
> >   are waiting for I/O completion (through a blocking file I/O operation
> >   for instance). Existing calls from userspace will start completing.
> > 
> >   This is also a good place to start shutting down the device, for
> >   instance disabling interrupts.
> > 
> > - The next step is for drivers to wait until all calls from userspace
> >   complete. This should be done with a call count in struct cdev that is
> >   updated upon entry and exit of calls from userspace, and a wait queue
> >   to wait for that count to go to 0. This should be wrapped in
> >   subsystem-specific APIs. As the flag that indicates device removal is
> >   set, no new calls are allowed so the counter can only decrease, and as
> >   all pending I/O have terminated or have been cancelled, no pending
> >   calls should be blocked.
> > 
> > - Finally, drivers should unregister the character device, through the
> >   appropriate subsystem API.
> > 
> >   At this point, memory mappings and file handles referencing the device
> >   may still exist, but no file operation is in progress. The device is
> >   quiescent.
> > 
> >   Care needs to be taken in drivers and subsystems to not start any I/O
> >   operation when handling the file .release() operation or the
> >   destruction of memory mappings. Overall I don't expect much issues,
> >   but I'm sure some drivers do strange things in those code paths.
> > 
> > - When the least memory mapping is gone and the last file handle is
> >   closed, the subsystem will call the driver's .release() callback. At
> >   this point, the driver will perform the operations listed in the first
> >   item of this list.
> > 
> > 
> > The challenge here will be to make this as easy as possible for drivers,
> > to avoid risk of drivers getting it wrong. The DRM/KMS subsystem has
> > created a devres-based system to handle this, with the devres associated
> > with the drm_device (the abstraction of the cdevs exposed by DRM to
> > userspace), *not* the physical device. It has a drawback though, it
> > assumes that a DRM driver will only ever want to register a drm_device
> > and nothing else, and hardcodes that assumption in the way it releases
> > resources. That's fine for most DRM drivers, but if a driver was to
> > register a drm_device and something else (such as a V4L2 video_device,
> > an input device, ...), the DRM subsystem will get in the way.
> > 
> > I have two questions:
> > 
> > - Does the above make sense ?
> 
> Yes, totally, thank you for taking the time to write this all out.  It's
> been in the back of my mind for over a decade that we need to work on
> these issues, but have not had any time to sit down and write it all
> down like you have.
> 
> > - Assuming it does, how do we get from the current mess to a situation
> >   where writing a driver will be a pleasure, not a punishment ? :-)
> 
> That's the real question.  Thanks to a lot of cleanups that Christoph
> has recently done to the lower-level cdev code, the lower levels are now
> in a shape where we can work on them better.  I'm going to try to carve
> out some time in the next few months to start to work on these things.
> I think that once we get the ideas of what needs to be done, and a
> working core change, I can unleash some interns on doing tree-wide
> cleanups/changes to help bring everything into alignment.
> 
> I'm going to save this email off for reference for later.  But of
> course, if others want to start to attack this earlier, all the better :)

My too long todo list contains an entry to address this in the V4L2
subsystem to start with (see
https://lore.kernel.org/linux-media/20200816003315.GA13826@roeck-us.net/).
It may be a good prototype to then extend it to cdev. If I get a chance
to work on it, I'll CC you.

> > > > > > Fixing this should not be beyond the wit of humankind, though.  If
> > > > > > we delayed deallocation to module release, that would fix the
> > > > > > interrupt issue, wouldn't it?  Perhaps all devres memory for
> > > > > > devices should live until then anyway and thus be automatically
> > > > > > deallocated instead of needing an explicit free ... the problem
> > > > > > with that being compiled in devices currently optimize away the
> > > > > > module refcounting, but that should be fixable.
> > > > > 
> > > > > module code lifespans are different than device structure lifespans,
> > > > > it's when people get them confused, as here, that we have problems.
> > > > 
> > > > I'm not claiming they are.  What I am claiming is that module lifetime
> > > > must always encompass the device lifetimes.  Therefore, you can never
> > > > be incorrect by using a module lifetime for anything attached to a
> > > > device, just inefficient for using memory longer than potentially
> > > > needed.  However, in a lot of use cases, the device is created on
> > > > module init and destroyed on module exit, so the inefficiency is barely
> > > > noticeable.
> > > 
> > > In almost no use case is the device created on module init of a driver.
> > > devices are created by busses, look at the USB code as an example.  The
> > > usb bus creates the devices and then individual modules bind to that
> > > device as needed.  If the device is removed from the system, wonderful,
> > > the device is unbound, but the module is still loaded.  So if you really
> > > wanted to, with your change, just do a insert/remove of a USB device a
> > > few zillion times and then memory is all gone :(
> > > 
> > > > The question I'm asking is shouldn't we optimize for this?
> > > 
> > > No.
> > > 
> > > > so let people allocate devm memory safe in the knowledge it will be
> > > > freed on module release?
> > > 
> > > No, see above.  Modules are never removed in a normal system.  devices
> > > are.
> > > 
> > > And the drm developers have done great work in unwinding some of these
> > > types of mistakes in their drivers, let's not go backwards please.

-- 
Regards,

Laurent Pinchart

  reply	other threads:[~2021-07-08  8:24 UTC|newest]

Thread overview: 200+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-25 22:09 [TECH TOPIC] Rust for Linux Miguel Ojeda
2021-07-05 23:51 ` Linus Walleij
2021-07-06  4:30   ` Leon Romanovsky
2021-07-06  9:55     ` Linus Walleij
2021-07-06 10:16       ` Geert Uytterhoeven
2021-07-06 17:59         ` Linus Walleij
2021-07-06 18:36           ` Miguel Ojeda
2021-07-06 19:12             ` Linus Walleij
2021-07-06 21:32               ` Miguel Ojeda
2021-07-07 14:10             ` Arnd Bergmann
2021-07-07 15:28               ` Miguel Ojeda
2021-07-07 15:50                 ` Andrew Lunn
2021-07-07 16:34                   ` Miguel Ojeda
2021-07-07 16:55                 ` Arnd Bergmann
2021-07-07 17:54                   ` Miguel Ojeda
2021-07-06 10:22       ` Leon Romanovsky
2021-07-06 14:30       ` Miguel Ojeda
2021-07-06 14:32         ` Miguel Ojeda
2021-07-06 15:03         ` Sasha Levin
2021-07-06 15:33           ` Miguel Ojeda
2021-07-06 15:42             ` Laurent Pinchart
2021-07-06 16:09               ` Mike Rapoport
2021-07-06 18:29               ` Miguel Ojeda
2021-07-06 18:38                 ` Laurent Pinchart
2021-07-06 19:45                   ` Steven Rostedt
2021-07-06 19:59                   ` Miguel Ojeda
2021-07-06 18:53             ` Sasha Levin
2021-07-06 21:50               ` Miguel Ojeda
2021-07-07  4:57                 ` Leon Romanovsky
2021-07-07 13:39                 ` Alexandre Belloni
2021-07-07 13:50                   ` Miguel Ojeda
2021-07-06 18:26         ` Linus Walleij
2021-07-06 19:11           ` Miguel Ojeda
2021-07-06 19:13         ` Johannes Berg
2021-07-06 19:43           ` Miguel Ojeda
2021-07-06 10:20     ` James Bottomley
2021-07-06 14:55       ` Miguel Ojeda
2021-07-06 15:01         ` Sasha Levin
2021-07-06 15:36           ` Miguel Ojeda
2021-07-09 10:02         ` Marco Elver
2021-07-09 16:02           ` Miguel Ojeda
2021-07-06 18:09       ` Linus Walleij
2021-07-06 14:24     ` Miguel Ojeda
2021-07-06 14:33       ` Laurent Pinchart
2021-07-06 14:56       ` Leon Romanovsky
2021-07-06 15:29         ` Miguel Ojeda
2021-07-07  4:38           ` Leon Romanovsky
2021-07-06 20:00   ` Roland Dreier
2021-07-06 20:36     ` Linus Walleij
2021-07-06 22:00       ` Laurent Pinchart
2021-07-07  7:27         ` Julia Lawall
2021-07-07  7:45           ` Greg KH
2021-07-07  7:52             ` James Bottomley
2021-07-07 13:49               ` Miguel Ojeda
2021-07-07 14:08                 ` James Bottomley
2021-07-07 15:15                   ` Miguel Ojeda
2021-07-07 15:44                     ` Greg KH
2021-07-07 17:01                       ` Wedson Almeida Filho
2021-07-07 17:20                         ` Greg KH
2021-07-07 19:19                           ` Wedson Almeida Filho
2021-07-07 20:38                             ` Jan Kara
2021-07-07 23:09                               ` Wedson Almeida Filho
2021-07-08  6:11                                 ` Greg KH
2021-07-08 13:36                                   ` Wedson Almeida Filho
2021-07-08 18:51                                     ` Greg KH
2021-07-08 19:31                                       ` Andy Lutomirski
2021-07-08 19:35                                         ` Geert Uytterhoeven
2021-07-08 21:56                                           ` Andy Lutomirski
2021-07-08 19:49                                       ` Linus Walleij
2021-07-08 20:34                                         ` Miguel Ojeda
2021-07-08 22:13                                           ` Linus Walleij
2021-07-09  7:24                                             ` Geert Uytterhoeven
2021-07-19 12:24                                             ` Wedson Almeida Filho
2021-07-19 13:15                                               ` Wedson Almeida Filho
2021-07-19 14:02                                                 ` Arnd Bergmann
2021-07-19 14:13                                                   ` Linus Walleij
2021-07-19 21:32                                                     ` Arnd Bergmann
2021-07-19 21:33                                                     ` Arnd Bergmann
2021-07-20  1:46                                                       ` Miguel Ojeda
2021-07-20  6:43                                                         ` Johannes Berg
2021-07-19 14:43                                                   ` Geert Uytterhoeven
2021-07-19 18:24                                                     ` Miguel Ojeda
2021-07-19 18:47                                                       ` Steven Rostedt
2021-07-19 14:54                                                   ` Miguel Ojeda
2021-07-19 17:32                                                   ` Wedson Almeida Filho
2021-07-19 21:31                                                     ` Arnd Bergmann
2021-07-19 17:37                                                   ` Miguel Ojeda
2021-07-19 16:02                                                 ` Vegard Nossum
2021-07-19 17:45                                                   ` Miguel Ojeda
2021-07-19 17:54                                                     ` Miguel Ojeda
2021-07-19 18:06                                                   ` Wedson Almeida Filho
2021-07-19 19:37                                                     ` Laurent Pinchart
2021-07-19 21:09                                                       ` Wedson Almeida Filho
2021-07-20 23:54                                                         ` Laurent Pinchart
2021-07-21  1:33                                                           ` Andy Lutomirski
2021-07-21  1:42                                                             ` Laurent Pinchart
2021-07-21 13:54                                                               ` Linus Walleij
2021-07-21 14:13                                                                 ` Wedson Almeida Filho
2021-07-21 14:19                                                                   ` Linus Walleij
2021-07-22 11:33                                                                     ` Wedson Almeida Filho
2021-07-23  0:45                                                                       ` Linus Walleij
2021-07-21  4:39                                                             ` Wedson Almeida Filho
2021-07-23  1:04                                                               ` Laurent Pinchart
2021-07-21  4:23                                                           ` Wedson Almeida Filho
2021-07-23  1:13                                                             ` Laurent Pinchart
2021-07-19 22:57                                                 ` Alexandre Belloni
2021-07-20  7:15                                                   ` Miguel Ojeda
2021-07-20  9:39                                                     ` Alexandre Belloni
2021-07-20 12:10                                                       ` Miguel Ojeda
2021-07-19 13:53                                               ` Linus Walleij
2021-07-19 14:42                                                 ` Wedson Almeida Filho
2021-07-19 22:16                                                   ` Linus Walleij
2021-07-20  1:20                                                     ` Wedson Almeida Filho
2021-07-20 13:21                                                       ` Andrew Lunn
2021-07-20 13:38                                                         ` Miguel Ojeda
2021-07-20 14:04                                                           ` Andrew Lunn
2021-07-20 13:55                                                         ` Greg KH
2021-07-20  1:21                                                     ` Miguel Ojeda
2021-07-20 16:00                                                       ` Mark Brown
2021-07-20 22:42                                                       ` Linus Walleij
2021-07-19 14:43                                                 ` Miguel Ojeda
2021-07-19 15:15                                                   ` Andrew Lunn
2021-07-19 15:43                                                     ` Miguel Ojeda
2021-07-09  7:03                                         ` Viresh Kumar
2021-07-09 17:06                                         ` Mark Brown
2021-07-09 17:43                                           ` Miguel Ojeda
2021-07-10  9:53                                             ` Jonathan Cameron
2021-07-10 20:09                                         ` Kees Cook
2021-07-08 13:55                                   ` Miguel Ojeda
2021-07-08 14:58                                     ` Greg KH
2021-07-08 15:02                                       ` Mark Brown
2021-07-08 16:38                                       ` Andy Lutomirski
2021-07-08 18:01                                         ` Greg KH
2021-07-08 18:00                                       ` Miguel Ojeda
2021-07-08 18:44                                         ` Greg KH
2021-07-08 23:09                                           ` Miguel Ojeda
2021-07-08  7:20                                 ` Geert Uytterhoeven
2021-07-08 13:41                                   ` Wedson Almeida Filho
2021-07-08 13:43                                     ` Geert Uytterhoeven
2021-07-08 13:54                                       ` Wedson Almeida Filho
2021-07-08 14:16                                         ` Geert Uytterhoeven
2021-07-08 14:24                                           ` Wedson Almeida Filho
2021-07-09  7:04                                             ` Jerome Glisse
2021-07-08 14:04                                       ` Miguel Ojeda
2021-07-08 14:18                                         ` Geert Uytterhoeven
2021-07-08 14:28                                           ` Miguel Ojeda
2021-07-08 14:33                                             ` Geert Uytterhoeven
2021-07-08 14:35                                               ` Miguel Ojeda
2021-07-09 11:55                                                 ` Geert Uytterhoeven
2021-07-08 16:07                                               ` Andy Lutomirski
2021-07-07 20:58                           ` Miguel Ojeda
2021-07-07 21:47                             ` Laurent Pinchart
2021-07-07 22:44                               ` Miguel Ojeda
2021-07-07 17:01           ` Miguel Ojeda
2021-07-07 10:50       ` Mark Brown
2021-07-07 10:56         ` Julia Lawall
2021-07-07 11:27           ` James Bottomley
2021-07-07 11:34         ` James Bottomley
2021-07-07 12:20           ` Greg KH
2021-07-07 12:38             ` James Bottomley
2021-07-07 12:45               ` Greg KH
2021-07-07 17:17                 ` Laurent Pinchart
2021-07-08  6:49                   ` cdev/devm_* issues (was Re: [TECH TOPIC] Rust for Linux) Greg KH
2021-07-08  8:23                     ` Laurent Pinchart [this message]
2021-07-08 23:06                     ` Linus Walleij
2021-07-09  0:02                       ` Dan Williams
2021-07-09 16:53                       ` Wedson Almeida Filho
2021-07-13  8:59                         ` Linus Walleij
     [not found]                           ` <CAHp75VfW7PxAyU=eYPNWFU_oUY=aStz-4W5gX87KSo402YhMXQ@mail.gmail.com>
2021-07-21 13:46                             ` Linus Walleij
2021-07-21 15:49                               ` Andy Shevchenko
2021-07-10  7:09                     ` Dan Carpenter
2021-07-12 13:42                       ` Jason Gunthorpe
2021-07-15  9:54                     ` Daniel Vetter
2021-07-21  9:08                       ` Dan Carpenter
2021-07-22  9:56                         ` Daniel Vetter
2021-07-22 10:09                           ` Dan Carpenter
2021-07-08  9:08                   ` [TECH TOPIC] Rust for Linux Mauro Carvalho Chehab
2021-07-10 16:42                     ` Laurent Pinchart
2021-07-10 17:18                       ` Andy Lutomirski
2021-07-07 15:17           ` Mark Brown
2021-07-06 21:45     ` Bart Van Assche
2021-07-06 23:08       ` Stephen Hemminger
2021-07-07  2:41         ` Bart Van Assche
2021-07-07 18:57           ` Linus Torvalds
2021-07-07 20:32             ` Bart Van Assche
2021-07-07 20:39               ` Linus Torvalds
2021-07-07 21:40                 ` Laurent Pinchart
2021-07-08  7:22                 ` Geert Uytterhoeven
2021-07-07 21:02               ` Laurent Pinchart
2021-07-07 22:11               ` Miguel Ojeda
2021-07-07 22:43                 ` Laurent Pinchart
2021-07-07 23:21                   ` Miguel Ojeda
2021-07-07 23:40                     ` Laurent Pinchart
2021-07-08  0:27                       ` Miguel Ojeda
2021-07-08  0:56                         ` Laurent Pinchart
2021-07-08  6:26             ` Alexey Dobriyan
2021-07-06 19:05 ` Bart Van Assche
2021-07-06 19:27   ` Miguel Ojeda
2021-07-07 15:48 ` Steven Rostedt
2021-07-07 16:44   ` Miguel Ojeda

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YOa2Au2TEjCpm1xX@pendragon.ideasonboard.com \
    --to=laurent.pinchart@ideasonboard.com \
    --cc=James.Bottomley@hansenpartnership.com \
    --cc=broonie@kernel.org \
    --cc=daniel@ffwll.ch \
    --cc=greg@kroah.com \
    --cc=ksummit@lists.linux.dev \
    --cc=linus.walleij@linaro.org \
    --cc=miguel.ojeda.sandonis@gmail.com \
    --cc=roland@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox