From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 9 Jul 2016 17:21:30 -0400
From: Theodore Ts'o
To: Guenter Roeck
Message-ID: <20160709212130.GC26097@thunk.org>
References: <20160709000631.GB8989@io.lakedaemon.net> <1468024946.2390.21.camel@HansenPartnership.com> <20160709093626.GA6247@sirena.org.uk> <5781148F.1010102@roeck-us.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5781148F.1010102@roeck-us.net>
Cc: James Bottomley, ksummit-discuss@lists.linux-foundation.org, Jason Cooper
Subject: Re: [Ksummit-discuss] [CORE TOPIC] stable workflow

On Sat, Jul 09, 2016 at 08:13:19AM -0700, Guenter Roeck wrote:
> The proliferation of stable trees (or rather, how to avoid it) might be
> one of the parts of the puzzle.

Yes, there are way too many right now.

Something that would really help is if there was some way to know who
is actually *using* each of the various stable trees. It would
certainly help prioritize my work. I started paying more attention to
3.10 and 3.18 kernels because I was directly involved with projects
which had product kernels based on 3.10 and 3.18.

I do perform "fire and forget" gce-xfstests runs on 3.10, 3.14, 3.18,
4.1, and 4.4 because it doesn't take much effort. But actually going
through and figuring out why we have a lot of test failures on a
particular test kernel (generally because there were patches that were
too dangerous or too complicated to backport via the cc:stable route),
and then trying to get the fixes to a specific stable kernel takes a
lot more time, and that I don't have.

So as a result, 3.10.102 looks like this:

BEGIN TEST 4k: Ext4 4k block Wed Jul 6 12:33:11 EDT 2016
Failures: ext4/308 generic/067 generic/092 generic/135 generic/323 generic/324

... and 3.18.36 looks like this:

BEGIN TEST 4k: Ext4 4k block Tue Jul 5 16:17:29 EDT 2016
Failures: ext4/001 generic/313

...
and 3.14.73 looks like *this*:

BEGIN TEST 4k: Ext4 4k block Tue Jul 5 15:52:48 EDT 2016
Failures: ext4/308 generic/034 generic/039 generic/040 generic/041
  generic/056 generic/057 generic/059 generic/065 generic/066
  generic/073 generic/090 generic/101 generic/104 generic/106
  generic/107 generic/135 generic/177 generic/313 generic/321
  generic/322 generic/324 generic/325 generic/335 generic/336
  generic/341 generic/342 generic/343 generic/348

The other thing that I'll note which is very discouraging as an
upstream maintainer trying to get backports and fixes into stable
kernels is that I don't have any proof that it actually helps. I've
lost count of the number of times when someone has asked me about a
bug or a test failure with a particular device kernel based on 3.10 or
3.18, and it will turn out that device kernels generally don't take
updates from the stable kernels, and it's not obvious to me whether or
not SOC vendors update their BSP kernels to take into account fixes
from the latest stable kernel. (But even if they do, apparently many
device vendors aren't bothering to merge in changes from the SOC's BSP
kernel, even if the BSP kernel is getting -stable updates.)

So if I'm going to invest more time into getting fixes into the many,
MANY stable kernels, and/or try to invest time in recruiting
volunteers and training them to do this task, can someone please tell
me how much difference it actually makes?

Thanks,

					- Ted

P.S. For the record, the newer stable kernels are in much better
shape. For example, 4.4.14 looks like this:

BEGIN TEST 4k: Ext4 4k block Tue Jul 5 23:02:09 EDT 2016
Passed all 223 tests

Of course, as far as I know there are **no** devices based on 4.4
yet.... for devices shipping for this Christmas season, I suspect
we'll be *lucky* if they are using 3.18....