From: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
To: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Tejun Heo <tj@kernel.org>, Shuah Khan <shuah.kh@samsung.com>,
ksummit-discuss@lists.linuxfoundation.org
Subject: Re: [Ksummit-discuss] [TECH TOPIC] Fix devm_kzalloc, its users, or both
Date: Sat, 01 Aug 2015 13:55:54 +0300 [thread overview]
Message-ID: <65396948.VLxKPYsgi9@avalon> (raw)
In-Reply-To: <20150731172732.GB7557@n2100.arm.linux.org.uk>
Hi Russell,
On Friday 31 July 2015 18:27:32 Russell King - ARM Linux wrote:
> On Fri, Jul 31, 2015 at 06:04:16PM +0100, Mark Brown wrote:
> > On Fri, Jul 31, 2015 at 06:14:14PM +0300, Laurent Pinchart wrote:
> > > It recently came to my attention that the way devm_kzalloc() is used by
> > > most drivers is broken. I've raised the topic on LKML (see
> > > http://lkml.org/lkml/2015/7/14/741) in the hope that my findings were
> > > simply>
> > lkml.org is down (again) - can you please provide a subject line?
>
> I wish people would stop using lkml.org
>
> > > The issue occurs when drivers use devm_kzalloc() to allocate data
> > > structures that can be accessed through file operations on a device
> > > node. The following sequence of events will then lead to a crash.
> >
> > Like Julia says I'm not sure this is really related to devm_ - I would
> > really expect that the majority of users were already broken prior to
> > the conversion to devm_ since the natural thing is to free things in the
> > remove() function which has exactly the same issues, the main problem
> > here is that the file open after device is removed case is rare for most
> > devices and requires somewhat obscure handling.
>
> I completely agree - this has *nothing* to do with devm_ at all. Any
> bugs that are there as a result of converting to devm_kzalloc() where
> there before. 99.9% of the devm_kzalloc() conversions are merely
> replacing the kmalloc()/kzalloc() in the probe function with devm_kzalloc()
> and removing the free() in the remove function.
In those cases I agree that the bug is not introduced by the conversion.
However, as we're selling devm_kzalloc() as solving all problems (or, at
least, as many developers believe it will), many new drivers get submitted
with blind devm_kzalloc() usage without any thought about proper lifetime
management for objects allocated by the driver. This gets past (at least some)
subsystem maintainers, showing that even if the devm_kzalloc() API forbids
this kind of usage, the message is not conveyed properly. My main concern here
is that driver developers do the wrong things, and maintainers don't catch the
problem. In addition to fixing existing code we need to fix that problem.
> If _anything_, converting to devm_kzalloc() means that the lifetime of
> the data structure is _slightly_ longer - because rather than the data
> structure being freed in the remove() callback, it's freed after the
> remove() callback has returned.
>
> However, both are just as buggy.
>
> The devm_* aspect of this thread is just an anti-devm_* smoke-screen.
> It's completely irrelevant.
>
> > Tejun's suggestion seems like the most robust thing here - allocation
> > issues are only going to be one of the problems with userspace accessing
> > devices that are going away and there's a complexity cost from having
> > the partially destroyed cases around. Off the top of my head there's
> > anything that attempts to access the hardware if it's genuinely gone
> > away rather than just been soft unbound for example. If the device can
> > just invalidate all open files on the way out then that's going to be
> > exactly what most things want.
>
> Well, accessing hardware is even more of a problem. Consider that
> ioremap()s will also be cleaned up at the same time (whether in
> ->remove() or in devm cleanup processing - again, not a devm problem)
> thereby removing the mapping for accessing the hardware.
Drivers are usually in a bit of a better shape when it comes to hardware
access. Most developers understand that the remove() function is supposed to
stop the hardware, and most subsystems will prevent operations that touch the
hardware from reaching drivers after the remove() function unregisters the
subsystem "class" (for lack of a better word) device. This can still be racy
if not implemented correctly, but shouldn't be subject to the same issue of
userspace keeping file handles open long after the device is gone.
This being said, if unregistration/removal/cleanup racing with userspace is a
problem for which we have solutions, it's still too easy today to implement it
incorrectly in drivers. That's also something we might want to discuss to
share best practices among subsystems and come up with a common message we can
send to driver developers.
--
Regards,
Laurent Pinchart
next prev parent reply other threads:[~2015-08-01 10:55 UTC|newest]
Thread overview: 62+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-31 15:14 Laurent Pinchart
2015-07-31 15:56 ` Russell King - ARM Linux
2015-07-31 16:34 ` Julia Lawall
2015-07-31 16:51 ` Dmitry Torokhov
2015-07-31 16:57 ` Julia Lawall
2015-07-31 17:03 ` Dmitry Torokhov
2015-07-31 16:53 ` Christoph Hellwig
2015-07-31 17:02 ` James Bottomley
2015-07-31 17:05 ` Dmitry Torokhov
2015-07-31 17:13 ` James Bottomley
2015-07-31 17:33 ` Dmitry Torokhov
2015-07-31 17:36 ` James Bottomley
2015-07-31 18:28 ` Dmitry Torokhov
2015-07-31 18:40 ` James Bottomley
2015-07-31 19:41 ` Dmitry Torokhov
2015-08-01 10:57 ` Mark Brown
2015-08-02 14:05 ` Russell King - ARM Linux
2015-08-02 14:21 ` Julia Lawall
2015-08-01 11:04 ` Laurent Pinchart
2015-08-01 11:21 ` Julia Lawall
2015-08-04 12:55 ` Dan Carpenter
2015-08-04 14:01 ` Geert Uytterhoeven
2015-08-04 17:55 ` Dmitry Torokhov
2015-08-04 18:03 ` Julia Lawall
2015-08-04 18:07 ` Dmitry Torokhov
2015-08-04 19:49 ` Russell King - ARM Linux
2015-07-31 17:04 ` Mark Brown
2015-07-31 17:27 ` Russell King - ARM Linux
2015-08-01 10:55 ` Laurent Pinchart [this message]
2015-08-01 16:30 ` Russell King - ARM Linux
2015-08-02 23:33 ` Laurent Pinchart
2015-08-01 10:47 ` Laurent Pinchart
2015-08-01 10:55 ` Julia Lawall
2015-08-01 11:01 ` Laurent Pinchart
2015-08-01 15:18 ` Tejun Heo
2015-08-02 0:48 ` Guenter Roeck
2015-08-02 14:30 ` Russell King - ARM Linux
2015-08-02 16:04 ` Guenter Roeck
2015-08-04 10:40 ` Daniel Vetter
2015-08-04 11:18 ` Russell King - ARM Linux
2015-08-04 11:56 ` Daniel Vetter
2015-08-04 11:59 ` Daniel Vetter
2015-08-04 14:48 ` Tejun Heo
2015-08-04 22:44 ` Laurent Pinchart
2015-08-05 9:41 ` Daniel Vetter
2015-08-04 10:49 ` Takashi Iwai
2015-08-10 7:58 ` Linus Walleij
2015-08-10 10:23 ` Russell King - ARM Linux
2015-08-11 11:35 ` Takashi Iwai
2015-08-11 15:19 ` Daniel Vetter
2015-08-21 2:19 ` Dmitry Torokhov
2015-08-21 15:07 ` Julia Lawall
2015-08-21 16:14 ` Dmitry Torokhov
2015-08-21 16:58 ` Mark Brown
2015-08-21 17:30 ` Dmitry Torokhov
2015-08-21 17:41 ` Mark Brown
2015-08-21 17:52 ` Mark Brown
2015-08-21 18:05 ` Dmitry Torokhov
2015-08-21 18:18 ` Mark Brown
2015-10-12 18:36 ` Theodore Ts'o
2015-10-12 18:44 ` Mark Brown
2015-10-14 15:58 ` Theodore Ts'o
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=65396948.VLxKPYsgi9@avalon \
--to=laurent.pinchart@ideasonboard.com \
--cc=ksummit-discuss@lists.linuxfoundation.org \
--cc=linux@arm.linux.org.uk \
--cc=shuah.kh@samsung.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox