From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id 3D62D279
	for ; Fri, 31 Jul 2015 18:28:21 +0000 (UTC)
Received: from mail-pd0-f178.google.com (mail-pd0-f178.google.com [209.85.192.178])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 1FE58E0
	for ; Fri, 31 Jul 2015 18:28:20 +0000 (UTC)
Received: by pdjr16 with SMTP id r16so47746043pdj.3
	for ; Fri, 31 Jul 2015 11:28:19 -0700 (PDT)
Date: Fri, 31 Jul 2015 11:28:15 -0700
From: Dmitry Torokhov
To: James Bottomley
Message-ID: <20150731182815.GK5613@dtor-ws>
References: <2111196.TG1k3f53YQ@avalon>
	<20150731165346.GA18984@infradead.org>
	<1438362159.2179.42.camel@HansenPartnership.com>
	<20150731170523.GF5613@dtor-ws>
	<1438362797.2179.45.camel@HansenPartnership.com>
	<20150731173335.GI5613@dtor-ws>
	<1438364188.2179.53.camel@HansenPartnership.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1438364188.2179.53.camel@HansenPartnership.com>
Cc: Christoph Hellwig, Tejun Heo, Russell King,
	ksummit-discuss@lists.linuxfoundation.org, Shuah Khan
Subject: Re: [Ksummit-discuss] [TECH TOPIC] Fix devm_kzalloc, its users, or both

On Fri, Jul 31, 2015 at 10:36:28AM -0700, James Bottomley wrote:
> On Fri, 2015-07-31 at 10:33 -0700, Dmitry Torokhov wrote:
> > On Fri, Jul 31, 2015 at 10:13:17AM -0700, James Bottomley wrote:
> > > On Fri, 2015-07-31 at 10:05 -0700, Dmitry Torokhov wrote:
> > > > On Fri, Jul 31, 2015 at 10:02:39AM -0700, James Bottomley wrote:
> > > > > On Fri, 2015-07-31 at 09:53 -0700, Christoph Hellwig wrote:
> > > > > > On Fri, Jul 31, 2015 at 06:34:21PM +0200, Julia Lawall wrote:
> > > > > > > How is this different from the free happening explicitly in the remove
> > > > > > > function?
> > > > > >
> > > > > > It's not.
> > > > > > The real problem is that people don't understand life time
> > > > > > rules and expect magic interfaces to fix it for them.
> > > > >
> > > > > So surely the rule is that we do this in module removal. That doesn't
> > > > > get called until last put on the module and that can't happen (or
> > > > > shouldn't happen) while userspace is holding open one of the /sys
> > > > > or /proc interfaces (usually those objects hold a reference on something
> > > > > within the driver to prevent this).
> > > > >
> > > > > There is an alternative way of handling this: that would be to detach
> > > > > the file from the backing interface at _del time, so sysfs/kernfs would
> > > > > take over the interface and return -ENODEV or something meaning we
> > > > > could tear down the module even if there was an open interface file.
> > > > > I'm not advocating this as a solution because I can already see the
> > > > > problems (like how do you switch interfaces atomically) but if this
> > > > > really is a serious problem, we should explore alternative solutions.
> > > >
> > > > Tejun already done such "switching" for sysfs so it should be possible,
> > > > however (blasphemy!) there are other entities than files that also may
> > > > have different lifetime rules that live past device unbinding.
> > >
> > > By unbinding do you mean when the unbind is called, which is fine, the
> > > interface handler is still there, or when the final module put is
> > > called, which is not fine because that's a lifetime problem. In the
> > > latter case we need a hunting expedition to have them all caught and
> > > shot.
> >
> > I was talking about former because module is normally stays pinned if it
> > implements file_operations and will not be unpinned until last release
> > on file (device) is called.
>
> Yes, I thought so, so that tells us the problem isn't really lifetime
> rules per-se (the handlers and module are still present and won't be
> released until the pinned object is), it's about the way we handle the
> teardown ... effectively the devm_ memory is released at the wrong point
> during teardown. Effectively, if this happens a lot, it's saying our
> rules and best practises for this are hard to follow.

I am not sure why you are making a distinction between module lifetime
rules and driver data lifetime rules. Both are objects that have certain
lifetimes, and when there is a mismatch in lifetimes between an object
and the objects that use it, havoc ensues.

Thanks.

-- 
Dmitry
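
For illustration only, the lifetime mismatch discussed in this thread can
be sketched in userspace C. This is not kernel code and not something
proposed by anyone in the thread: struct drv_data, drv_data_new(),
drv_data_get(), drv_data_put() and demo_unbind_with_open_file() are
hypothetical names, and a plain integer counter stands in for a real
reference count. The assumption is that driver data is freed on the last
put rather than at unbind, which is the point where devm_ memory would
be released.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for driver private data. */
struct drv_data {
	int refcount;	/* one ref for the bound device, one per open file */
	int value;
	int *freed;	/* set to 1 when the memory is actually released */
};

/* "probe": allocate the data; the bound device holds the first reference. */
static struct drv_data *drv_data_new(int value, int *freed)
{
	struct drv_data *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->refcount = 1;
	d->value = value;
	d->freed = freed;
	*freed = 0;
	return d;
}

/* "open": a user of the data takes its own reference. */
static void drv_data_get(struct drv_data *d)
{
	assert(d->refcount > 0);
	d->refcount++;
}

/* "unbind" or "release": drop a reference, free on the last one. */
static void drv_data_put(struct drv_data *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0) {
		*d->freed = 1;
		free(d);
	}
}

/* Returns 1 if the data outlives unbind and is freed on the last release. */
static int demo_unbind_with_open_file(void)
{
	int freed;
	struct drv_data *d = drv_data_new(42, &freed);
	int freed_after_unbind, value_seen;

	if (!d)
		return 0;
	drv_data_get(d);		/* file opened while device is bound */
	drv_data_put(d);		/* unbind: devm_ memory would go away here */
	freed_after_unbind = freed;	/* still 0: the open file holds a ref */
	value_seen = d->value;		/* safe, the object is still alive */
	drv_data_put(d);		/* last release on the file frees it */
	return !freed_after_unbind && value_seen == 42 && freed;
}
```

In this sketch the object's lifetime is tied to its users rather than to
the bind/unbind cycle, so an access between unbind and the final release
never touches freed memory. A real driver would need an atomic reference
count and proper locking, both of which are left out here.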