* [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
@ 2014-05-15 23:13 Dan Williams
  2014-05-16  2:56 ` NeilBrown
  0 siblings, 1 reply; 38+ messages in thread

From: Dan Williams @ 2014-05-15 23:13 UTC (permalink / raw)
To: ksummit-discuss; +Cc: Dan J Williams

What would it take, and would we even consider, moving 2x faster than we are now?  A cursory glance at Facebook's development statistics [1] shows them handling more developers, more commits, and a higher rate of code growth than kernel.org [2].  As mentioned in their development-process presentation, "tools" and "culture" enable such a pace of development without the project flying apart.  Assuming the response to the initial question is not "we're moving fast enough, thank you very much, go away", what would help us move faster?  I submit that there are three topics in this space with aspects that can only be productively discussed in a forum like kernel summit:

1/ Merge Karma: Collect patch review and velocity data for a maintainer to answer questions like "am I pushing too much risk upstream?", "from cycle to cycle am I maintaining a consistent velocity?", and "should I modulate the scope of the review feedback I trust?".  I think where proposals like this have fallen over in the past was with the thought that this data could be used as a weapon by toxic contributors, or used to injure someone's reputation.  Instead, this would be collected consistently (individually?), for private use, and shared in a limited fashion at forums like kernel summit to have data to drive "how are we doing as a community?" discussions.

2/ Gatekeeper: Saying "no" is how we as a kernel community mitigate risk, and it is healthy for us to say "no" early and often.  However, the only real dimension we currently have to say "no" is "no, I won't merge your code".  The staging tree opened up a way to give a qualified "no" by allowing new drivers a sandbox to get in shape for moving into the kernel tree proper while still being available to end users.  The idea with a Facebook-inspired Gatekeeper system is to have another way to say "no" while still merging code.  Consider a facility more fine-grained than the recently deprecated CONFIG_EXPERIMENTAL, with run-time modification capability added.  Similar to loading a staging driver, overriding a Gatekeeper variable (i.e. where a maintainer has explicitly said "no") taints the kernel.  This then becomes a tool for those situations where there is value / need in distributing the code, while still saying "no" to its acceptability in its current state.

3/ LKP and Testing: If there were a generic way for tools like LKP to discover and run per-subsystem / per-driver unit tests, I am fairly confident LKP would already be sending the community test results.  LKP is the closest we have to Facebook's Perflab (an automated regression-testing environment), and it's one of the best tools we have for moving development faster without increasing risk in the code we deliver.  Has the time come for a coordinated unit-test culture in Linux kernel development?

This topic proposal is a self-nomination (dan.j.williams@intel.com) for attending Kernel Summit, and I also nominate Fengguang Wu (fengguang.wu@intel.com) to participate in any discussions that involve LKP.

[1]: http://www.infoq.com/presentations/Facebook-Release-Process
[2]: http://www.linuxfoundation.org/publications/linux-foundation/who-writes-linux-2013

^ permalink raw reply	[flat|nested] 38+ messages in thread
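To sketch how a Gatekeeper knob might differ from CONFIG_EXPERIMENTAL, imagine a per-feature option along these lines; the option name and help text are purely illustrative, and nothing like this exists in the tree:

	config FOO_AWESOME_SAUCE
		bool "foo: provisional awesome-sauce support (gatekept)"
		depends on FOO
		default n
		help
		  This code is merged for wider exposure, but the maintainer
		  has said "no" to its current form.  Enabling it, or flipping
		  the matching run-time knob, taints the kernel, and the
		  functionality may change or disappear in a future release.

The run-time half of the idea - flipping the gate on a booted kernel and taking the taint - is sketched further down-thread.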
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
  2014-05-15 23:13 [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things Dan Williams
@ 2014-05-16  2:56 ` NeilBrown
  2014-05-16 15:04   ` Chris Mason
  2014-05-21  7:22   ` Dan Williams
  0 siblings, 2 replies; 38+ messages in thread

From: NeilBrown @ 2014-05-16 2:56 UTC (permalink / raw)
To: Dan Williams; +Cc: Dan J Williams, ksummit-discuss

On Thu, 15 May 2014 16:13:58 -0700 Dan Williams <dan.j.williams@gmail.com> wrote:

> What would it take, and would we even consider, moving 2x faster than we are now?

Hi Dan,
you seem to be suggesting that there is some limit other than "competent engineering time" which is slowing Linux "progress" down.

Are you really suggesting that?  What might these other limits be?

Certainly there is a limit to the minimum gap between conceptualisation and release (at least one release cycle), but is there really a limit to the parallelism that can be achieved?

NeilBrown

> [...]

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
  2014-05-16  2:56 ` NeilBrown
@ 2014-05-16 15:04   ` Chris Mason
  2014-05-16 17:09     ` Andy Grover
                       ` (2 more replies)
  2014-05-21  7:22   ` Dan Williams
  1 sibling, 3 replies; 38+ messages in thread

From: Chris Mason @ 2014-05-16 15:04 UTC (permalink / raw)
To: NeilBrown, Dan Williams; +Cc: Dan J Williams, ksummit-discuss

On 05/15/2014 10:56 PM, NeilBrown wrote:
> On Thu, 15 May 2014 16:13:58 -0700 Dan Williams <dan.j.williams@gmail.com> wrote:
>
>> What would it take, and would we even consider, moving 2x faster than we are now?
>
> Hi Dan, you seem to be suggesting that there is some limit other than "competent engineering time" which is slowing Linux "progress" down.
>
> Are you really suggesting that?  What might these other limits be?
>
> Certainly there is a limit to the minimum gap between conceptualisation and release (at least one release cycle), but is there really a limit to the parallelism that can be achieved?

I haven't compared the FB commit rates with the kernel's, but I'll pretend Dan's basic thesis is right and talk about which parts of the Facebook model may move faster than the kernel.

The Facebook process is pretty similar to the way the kernel works.  The merge window lasts a few days and the major releases are every week, but overall it isn't too far away.

One big difference is that we have a centralized tool for reviewing the patches, and once a patch has been reviewed by a specific number of people, you push it in.

The patch submission tool runs the patch through lint and various static analyses to make sure it follows proper coding style and doesn't include patterns of known bugs.  This cuts down on the review work because the silly coding style mistakes are gone before the patch gets to the tool.

When you put in a patch, you have to put in reviewers, and they get a little notification that your patch needs review.  Once the reviewers are happy, you push the patch in.

The biggest difference: there are no maintainers.  If I want to go change the calendar tool to fix a bug, I patch it, get someone else to sign off, and push.

All of which is my way of saying the maintainers (me included) are the biggest bottleneck.  There are a lot of reasons I think the maintainer model fits the kernel better, but at least for btrfs I'm trying to speed up the patch review process and use patchwork more effectively.

Facebook controls risk with new features using gatekeepers in the code.  That way we can beta test larger changes against an expanding group of unsuspecting people without turning a feature on for everyone at once.

It's also much easier to accept risk when you have complete control over deployment.  facebook.com rolls out twice a day, which is basically a stable tree cherry-picked from tip, and there are plenty of chances to fix problems.

Android/iPhone releases can't be controlled the same way, and so they have longer testing periods.
-chris

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
  2014-05-16 15:04 ` Chris Mason
@ 2014-05-16 17:09   ` Andy Grover
  2014-05-23  8:11     ` Dan Carpenter
  0 siblings, 1 reply; 38+ messages in thread

From: Andy Grover @ 2014-05-16 17:09 UTC (permalink / raw)
To: Chris Mason, NeilBrown, Dan Williams; +Cc: Dan J Williams, ksummit-discuss

On 05/16/2014 08:04 AM, Chris Mason wrote:
> The biggest difference: there are no maintainers.  If I want to go change the calendar tool to fix a bug, I patch it, get someone else to sign off, and push.
>
> All of which is my way of saying the maintainers (me included) are the biggest bottleneck.  There are a lot of reasons I think the maintainer model fits the kernel better, but at least for btrfs I'm trying to speed up the patch review process and use patchwork more effectively.

Dan and Chris, you talked about some technical differences and solutions, but here are some thoughts/questions I had just on the non-technical side of things for the ksummit:

* Big differences vs. corporate development:
  - No one can be told what to do by a common boss.
  - There is no assumption that co-contributors have a basic level of competence, so sign-offs may not mean much.
  - A co-contributor's area of development may have no, or negative, value for the maintainer (see "tinification" for an example).
  - Co-contributors may work for competing companies.

* Forking the project, the traditional FOSS remedy for bad-maintainer / moving-too-slow, is not realistically available for the kernel or per subsystem, due to massive momentum.

* If the maintainer is unresponsive, what recourse does a submitter have?  (Is this written down somewhere?)  Is taking recourse actually culturally acceptable?  How would the gone-around maintainer treat future submissions?

* At what point does it make sense to field a sub- or co-maintainer?

* Would more maintainer delegation help contributor recruitment and continued involvement?  Versus the efficiency of highly optimized patch flows by fewer maintainers.

* Do current maintainers feel they cannot delegate or relinquish maintainership?  Maintainership-as-a-burden vs. maintainership-as-lead-developer vs. maintainership-as-a-career-goal.

* Are there other large-scale FOSS projects that may have development flows worth drawing lessons from?

Thanks -- Andy

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
  2014-05-16 17:09 ` Andy Grover
@ 2014-05-23  8:11   ` Dan Carpenter
  0 siblings, 0 replies; 38+ messages in thread

From: Dan Carpenter @ 2014-05-23 8:11 UTC (permalink / raw)
To: Andy Grover; +Cc: ksummit-discuss

On Fri, May 16, 2014 at 10:09:51AM -0700, Andy Grover wrote:
> * If the maintainer is unresponsive, what recourse does a submitter have?  (Is this written down somewhere?)  Is taking recourse actually culturally acceptable?  How would the gone-around maintainer treat future submissions?

You ping the maintainer after a month of no response.  If there is still no response after month two, then you try to send it through Andrew.  If the maintainer responds but requests changes, then you have to do what he says.  If the maintainer responds but just NAKs your patch, then you're pretty much screwed.  If you really care, then you have to carry those patches out of tree.

> * At what point does it make sense to field a sub- or co-maintainer?

I don't maintain a git tree, but the mechanical bits don't seem that hard to me.  It's reviewing the code which is tricky.  All patches should be going to a list (LKML doesn't count) so anyone can help review patches.  But how do we get people to review each other's patches?

regards,
dan carpenter

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
  2014-05-16 15:04 ` Chris Mason
  2014-05-16 17:09   ` Andy Grover
@ 2014-05-16 18:31   ` Randy Dunlap
  2014-05-21  7:48   ` Dan Williams
  2 siblings, 0 replies; 38+ messages in thread

From: Randy Dunlap @ 2014-05-16 18:31 UTC (permalink / raw)
To: Chris Mason, NeilBrown, Dan Williams; +Cc: Dan J Williams, ksummit-discuss

On 05/16/2014 08:04 AM, Chris Mason wrote:
> I haven't compared the FB commit rates with the kernel's, but I'll pretend Dan's basic thesis is right and talk about which parts of the Facebook model may move faster than the kernel.

Thanks for the summary.

> The patch submission tool runs the patch through lint and various static analyses to make sure it follows proper coding style and doesn't include patterns of known bugs.  This cuts down on the review work because the silly coding style mistakes are gone before the patch gets to the tool.

Yes, this is very nice.  Reviewers should not be burdened with checking coding style or common issues or build problems or kconfig problems.  They should just be able to review the merits and correctness of the patch.  (Yes, that would mean that I would find something different to do on most days.  :)

> The biggest difference: there are no maintainers.  If I want to go change the calendar tool to fix a bug, I patch it, get someone else to sign off, and push.
>
> All of which is my way of saying the maintainers (me included) are the biggest bottleneck.

I have to agree (me included).

> There are a lot of reasons I think the maintainer model fits the kernel better, but at least for btrfs I'm trying to speed up the patch review process and use patchwork more effectively.
>
> [...]

--
~Randy

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
  2014-05-16 15:04 ` Chris Mason
  2014-05-16 17:09   ` Andy Grover
  2014-05-16 18:31   ` Randy Dunlap
@ 2014-05-21  7:48   ` Dan Williams
  2014-05-21  7:55     ` Greg KH
  2014-05-21  8:25     ` NeilBrown
  2 siblings, 2 replies; 38+ messages in thread

From: Dan Williams @ 2014-05-21 7:48 UTC (permalink / raw)
To: Chris Mason; +Cc: ksummit-discuss

On Fri, May 16, 2014 at 8:04 AM, Chris Mason <clm@fb.com> wrote:
> [...]
>
> The biggest difference: there are no maintainers.  If I want to go change the calendar tool to fix a bug, I patch it, get someone else to sign off, and push.
>
> All of which is my way of saying the maintainers (me included) are the biggest bottleneck.  There are a lot of reasons I think the maintainer model fits the kernel better, but at least for btrfs I'm trying to speed up the patch review process and use patchwork more effectively.

To be clear, I'm not arguing for a maintainer-less model.  We don't have the tooling or operational data to support that.  We need maintainers to say "no".  But what I think we can do is give maintainers more varied ways to say it.  The goal: de-escalate the merge event as a declaration that the code-quality/architecture conversation is over.

Release early, release often, and, with care, merge often.

With regards to saying "no" faster, it seems kernel code rarely comes with tests.  However, maintainers today are already able to reduce the latency to "no" when the 0-day kbuild robot emits a negative test result.  Why not arm that system with tests it can autodiscover?  What has held back a unit-test culture in the kernel?

^ permalink raw reply	[flat|nested] 38+ messages in thread
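To make "tests it can autodiscover" concrete, here is a minimal sketch of the kind of test a robot could find and run mechanically, loosely following the tools/testing/selftests convention that a test is a standalone program whose exit status is the verdict.  The file name, function, and checked property below are all hypothetical:

	/* tools/testing/selftests/foo/foo-basic.c (hypothetical) */
	#include <stdio.h>
	#include <stdlib.h>

	/* Stand-in for whatever property the subsystem wants verified. */
	static int foo_roundtrip_ok(void)
	{
		return 1;
	}

	int main(void)
	{
		if (!foo_roundtrip_ok()) {
			printf("selftests: foo-basic [FAIL]\n");
			return EXIT_FAILURE;	/* non-zero exit = regression */
		}
		printf("selftests: foo-basic [PASS]\n");
		return EXIT_SUCCESS;
	}

A robot like the 0-day builder could then walk tools/testing/selftests, build every directory it finds, and report any non-zero exit status without knowing anything subsystem-specific.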
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
  2014-05-21  7:48 ` Dan Williams
@ 2014-05-21  7:55   ` Greg KH
  2014-05-21  9:05     ` Matt Fleming
  1 sibling, 1 reply; 38+ messages in thread

From: Greg KH @ 2014-05-21 7:55 UTC (permalink / raw)
To: Dan Williams; +Cc: ksummit-discuss

On Wed, May 21, 2014 at 12:48:48AM -0700, Dan Williams wrote:
> With regards to saying "no" faster, it seems kernel code rarely comes with tests.  However, maintainers today are already able to reduce the latency to "no" when the 0-day kbuild robot emits a negative test result.  Why not arm that system with tests it can autodiscover?  What has held back a unit-test culture in the kernel?

The fact that no one has stepped up and taken maintainership of the tests to ensure that they continue to work.

I'm working on solving that issue by getting funding for someone to do this and focus on tests that all developers and maintainers can use to help ensure that nothing breaks when they make a change.

Give me a few months; hopefully there will be something I can talk about soon in this area.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
  2014-05-21  7:55 ` Greg KH
@ 2014-05-21  9:05   ` Matt Fleming
  2014-05-21 12:52     ` Greg KH
  0 siblings, 1 reply; 38+ messages in thread

From: Matt Fleming @ 2014-05-21 9:05 UTC (permalink / raw)
To: Greg KH; +Cc: ksummit-discuss

On Wed, 21 May, at 04:55:47PM, Greg KH wrote:
> On Wed, May 21, 2014 at 12:48:48AM -0700, Dan Williams wrote:
>> With regards to saying "no" faster, it seems kernel code rarely comes with tests.  However, maintainers today are already able to reduce the latency to "no" when the 0-day kbuild robot emits a negative test result.  Why not arm that system with tests it can autodiscover?  What has held back a unit-test culture in the kernel?
>
> The fact that no one has stepped up and taken maintainership of the tests to ensure that they continue to work.

That's not usually how unit tests work.  They're supposed to be owned by everyone, i.e. if your change breaks the test, you are responsible for fixing your change, or the test, or both.  Everyone needs to ensure the tests continue to work.  Likewise, the person implementing a new feature is the best equipped to write tests for it.  Unfortunately that does require a certain amount of "buy-in" from the community.

However, a maintainer role might make sense for collating test results, reporting failures, or running the tests on a large number of hardware configurations - like how Fengguang Wu says "The 0-day infrastructure shows your commit introduced a regression" or Stephen Rothwell says "A merge of your tree causes these conflicts".

For anything other than trivial cases I wouldn't expect these guys to have to fix up the breakage to ensure the tests continue working - that kind of never-ending battle would make a person's head explode.

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
  2014-05-21  9:05 ` Matt Fleming
@ 2014-05-21 12:52   ` Greg KH
  2014-05-21 13:23     ` Matt Fleming
  0 siblings, 1 reply; 38+ messages in thread

From: Greg KH @ 2014-05-21 12:52 UTC (permalink / raw)
To: Matt Fleming; +Cc: ksummit-discuss

On Wed, May 21, 2014 at 10:05:13AM +0100, Matt Fleming wrote:
> That's not usually how unit tests work.  They're supposed to be owned by everyone, i.e. if your change breaks the test, you are responsible for fixing your change, or the test, or both.  Everyone needs to ensure the tests continue to work.

Ideal world, meet the real world.

Today, the in-kernel tests are broken.  Whether that is a kernel problem or a test problem, no one seems to be stepping up to figure it out.  Someone needs to be on top of it to do that.  Given that no one has done that for, well, ever, this needs to be fixed.

> Likewise, the person implementing a new feature is the best equipped to write tests for it.  Unfortunately that does require a certain amount of "buy-in" from the community.

We have that "buy-in", and have had it for a long time.  I've been asking for this for years, and finally have the ear of people who are able to allocate resources for it.  Which is a very nice chance that I do not want to blow.

> However, a maintainer role might make sense for collating test results, reporting failures, or running the tests on a large number of hardware configurations [...]
>
> For anything other than trivial cases I wouldn't expect these guys to have to fix up the breakage to ensure the tests continue working - that kind of never-ending battle would make a person's head explode.

Agreed, but given that no one has even tried, and that I know of someone who has agreed to do this work, let's give it a chance.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
  2014-05-21 12:52 ` Greg KH
@ 2014-05-21 13:23   ` Matt Fleming
  0 siblings, 0 replies; 38+ messages in thread

From: Matt Fleming @ 2014-05-21 13:23 UTC (permalink / raw)
To: Greg KH; +Cc: ksummit-discuss

On Wed, 21 May, at 09:52:18PM, Greg KH wrote:
>
> Ideal world, meet the real world.
>
> Today, the in-kernel tests are broken.  Whether that is a kernel problem or a test problem, no one seems to be stepping up to figure it out.  Someone needs to be on top of it to do that.  Given that no one has done that for, well, ever, this needs to be fixed.

Which tests are broken?  FWIW, the EFI tests in tools/testing/selftests work fine, because I regularly run them against any changes I merge.

> Agreed, but given that no one has even tried, and that I know of someone who has agreed to do this work, let's give it a chance.

Go for it!

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 38+ messages in thread
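For readers unfamiliar with the selftests layout: runs like the one Matt describes are normally driven from the top of the tree with something like the command below.  The exact TARGETS value depends on how a given test directory is wired into tools/testing/selftests/Makefile, so treat the target name as illustrative.

	$ make -C tools/testing/selftests TARGETS=efi run_tests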
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
  2014-05-21  7:48 ` Dan Williams
  2014-05-21  7:55   ` Greg KH
@ 2014-05-21  8:25   ` NeilBrown
  2014-05-21  8:36     ` Dan Williams
  1 sibling, 1 reply; 38+ messages in thread

From: NeilBrown @ 2014-05-21 8:25 UTC (permalink / raw)
To: Dan Williams; +Cc: ksummit-discuss

On Wed, 21 May 2014 00:48:48 -0700 Dan Williams <dan.j.williams@intel.com> wrote:

> [...]
>
> To be clear, I'm not arguing for a maintainer-less model.  We don't have the tooling or operational data to support that.  We need maintainers to say "no".  But what I think we can do is give maintainers more varied ways to say it.  The goal: de-escalate the merge event as a declaration that the code-quality/architecture conversation is over.
>
> Release early, release often, and, with care, merge often.

I think this falls foul of the "no regressions" rule.

The kernel policy is that once functionality gets to users, it cannot be taken away.  Individual drivers in 'staging' manage to avoid this rule because they are clearly separate things.  New system calls and attributes in sysfs etc. seem to be much harder to "partially" release.

To quote from Linus in a recent interview:

  "I personally think the stable development model is not one of continual incremental improvements, but a succession of overshooting and crashing."

Yet Linux is stuck in "incremental improvement" mode and is not in a position to "overshoot and crash" much.

I agree there is a problem.  I can't see a solution.

NeilBrown

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
  2014-05-21  8:25 ` NeilBrown
@ 2014-05-21  8:36   ` Dan Williams
  2014-05-21  8:53     ` Matt Fleming
  2014-05-21 10:11     ` NeilBrown
  0 siblings, 2 replies; 38+ messages in thread

From: Dan Williams @ 2014-05-21 8:36 UTC (permalink / raw)
To: NeilBrown; +Cc: ksummit-discuss

On Wed, May 21, 2014 at 1:25 AM, NeilBrown <neilb@suse.de> wrote:
> [...]
>
>> Release early, release often, and, with care, merge often.
>
> I think this falls foul of the "no regressions" rule.
>
> The kernel policy is that once functionality gets to users, it cannot be taken away.  Individual drivers in 'staging' manage to avoid this rule because they are clearly separate things.
> New system calls and attributes in sysfs etc. seem to be much harder to "partially" release.

My straw man is something like the following for driver "foo":

	if (gatekeeper_foo_new_awesome_sauce)
		do_new_thing();

Where setting gatekeeper_foo_new_awesome_sauce taints the kernel and warns that there is no guarantee of this functionality being present in the same form, or at all, going forward.

^ permalink raw reply	[flat|nested] 38+ messages in thread
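Fleshing the straw man out a little: if the knob were exposed as a module parameter, the enable path might look roughly like the sketch below.  All names are hypothetical, and add_taint(TAINT_USER, ...) is just one plausible way to record the override; a real implementation might well want a dedicated taint flag.

	#include <linux/module.h>
	#include <linux/kernel.h>

	/* Hypothetical gatekeeper knob; default off because the maintainer said "no". */
	static bool new_awesome_sauce;
	module_param(new_awesome_sauce, bool, 0444);
	MODULE_PARM_DESC(new_awesome_sauce,
			 "Enable provisional foo code paths (taints the kernel)");

	static void do_new_thing(void)
	{
		/* The provisional feature would live here. */
	}

	static int __init foo_init(void)
	{
		if (new_awesome_sauce) {
			pr_warn("foo: provisional feature enabled; it may change or vanish in any release\n");
			add_taint(TAINT_USER, LOCKDEP_STILL_OK);
			do_new_thing();
		}
		return 0;
	}
	module_init(foo_init);

	MODULE_LICENSE("GPL");

Booting with foo.new_awesome_sauce=1 (or writing the sysfs parameter, if it were made writable) would then be the explicit, logged, kernel-tainting act of overriding the maintainer's "no".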
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
  2014-05-21  8:36 ` Dan Williams
@ 2014-05-21  8:53   ` Matt Fleming
  0 siblings, 0 replies; 38+ messages in thread

From: Matt Fleming @ 2014-05-21 8:53 UTC (permalink / raw)
To: Dan Williams; +Cc: ksummit-discuss

On Wed, 21 May, at 01:36:55AM, Dan Williams wrote:
>
> My straw man is something like the following for driver "foo":
>
> 	if (gatekeeper_foo_new_awesome_sauce)
> 		do_new_thing();
>
> Where setting gatekeeper_foo_new_awesome_sauce taints the kernel and warns that there is no guarantee of this functionality being present in the same form, or at all, going forward.

This kind of thing is done all the time for web development - I think it's given the name "feature bit".  It makes sense when you control the execution environment, like a web server, and if things explode you can detect that on the web-server end, and not necessarily require your user to report the problem.

It also makes a lot of sense for continuous deployment, where the master branch is always the branch used in production.

When a user needs to actively enable a feature and report problems, it's just like another CONFIG_* option, and I'm not sure that's an improvement.

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
  2014-05-21  8:36 ` Dan Williams
  2014-05-21  8:53   ` Matt Fleming
@ 2014-05-21 10:11   ` NeilBrown
  2014-05-21 15:35     ` Dan Williams
  1 sibling, 1 reply; 38+ messages in thread

From: NeilBrown @ 2014-05-21 10:11 UTC (permalink / raw)
To: Dan Williams; +Cc: ksummit-discuss

On Wed, 21 May 2014 01:36:55 -0700 Dan Williams <dan.j.williams@intel.com> wrote:
> [...]
>
> My straw man is something like the following for driver "foo":
>
> 	if (gatekeeper_foo_new_awesome_sauce)
> 		do_new_thing();
>
> Where setting gatekeeper_foo_new_awesome_sauce taints the kernel and warns that there is no guarantee of this functionality being present in the same form, or at all, going forward.

Interesting idea.  Trying to imagine how this might play out in practice....

You talk about "value delivered to users".  But users tend to use applications, and applications are the users of kernel features.

Will anyone bother writing or adapting an application to use a feature which is not guaranteed to hang around?  Maybe they will, but will the users of the application know that it might stop working after a kernel upgrade?  Maybe...

Maybe it would help if we had some concrete examples of features that could have been delayed using a gatekeeper.

The one that springs to my mind is cgroups.  Clearly useful, but clearly controversial.  It appears that the original implementation was seriously flawed, and Tejun is doing a massive amount of work to "fix" it, and this apparently will lead to API changes.  And this is happening without any gatekeepers.  Would it have been easier in some way with gatekeepers?  ... I don't see how it would be, except that fewer people would have used cgroups, and then maybe we wouldn't have as much collective experience to know what the real problems were(?).

I think that is the key.  With a user-facing option, people will try it and probably cope if it disappears (though they might complain loudly and sign petitions declaring facebook to be the anti-$DEITY).  However, with kernel-internal options, applications are unlikely to use them without some expectation of stability.  So finding the problems would be a lot harder.

Which doesn't mean that it can't work, but it would be nice to create some real-life examples to see how it plays out in practice.

NeilBrown

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
  2014-05-21 10:11 ` NeilBrown
@ 2014-05-21 15:35   ` Dan Williams
  2014-05-21 23:06     ` Rafael J. Wysocki
  2014-05-21 23:48     ` NeilBrown
  0 siblings, 2 replies; 38+ messages in thread

From: Dan Williams @ 2014-05-21 15:35 UTC (permalink / raw)
To: NeilBrown; +Cc: ksummit-discuss

On Wed, May 21, 2014 at 3:11 AM, NeilBrown <neilb@suse.de> wrote:
> [...]
>
> I think that is the key.  With a user-facing option, people will try it and probably cope if it disappears (though they might complain loudly and sign petitions declaring facebook to be the anti-$DEITY).  However, with kernel-internal options, applications are unlikely to use them without some expectation of stability.  So finding the problems would be a lot harder.
>
> Which doesn't mean that it can't work, but it would be nice to create some real-life examples to see how it plays out in practice.

Biased by my background of course, but I think driver development is more amenable to this sort of approach.  For drivers, the kernel is in many instances the application.  For example, I currently have in my review queue a patch set to add SATA port multiplier support to libsas.  I hope I get the review done in time for merging it in 3.16.  But what if I also had the option of saying "let's gatekeeper this for a cycle"?  Users that care could start using it and reporting bugs, and it would be clear that the implementation is provisional.  My opinion is that bug reports would attract deeper code review that otherwise would not occur if the feature were simply delayed for a cycle.

I think I also would have liked to use a gatekeeper to stage the deletion of NET_DMA from the kernel.  Mark it for removal, see who screams, but still make it straightforward for such people to make their case, with data, for why the value should stay.
For the core kernel, which I admittedly have not touched much: are there cases where an application wants to make a value argument to users, but needs some kernel infrastructure to stand on?  Do we inadvertently stifle otherwise promising experiments by forcing upstream acceptance before the experiment gets the exposure it needs?

^ permalink raw reply	[flat|nested] 38+ messages in thread
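The staged-removal case could be sketched the same way, with the gate defaulting to "removed" so that anyone who flips it self-identifies; again, every name here is hypothetical:

	#include <linux/module.h>
	#include <linux/kernel.h>

	/* Hypothetical gate for code already marked for removal (NET_DMA-style). */
	static bool legacy_offload;
	module_param(legacy_offload, bool, 0444);
	MODULE_PARM_DESC(legacy_offload,
			 "Keep the deprecated offload path alive one more cycle");

	static int do_legacy_offload_copy(void)
	{
		return 0;	/* stand-in for the deprecated fast path */
	}

	static int maybe_offload_copy(void)
	{
		if (!legacy_offload)
			return -ENODEV;	/* default: behave as if already removed */

		/* Whoever flips the gate is the "who screams?" data point. */
		pr_warn_once("foo: deprecated offload enabled; scheduled for removal\n");
		add_taint(TAINT_USER, LOCKDEP_STILL_OK);
		return do_legacy_offload_copy();
	}

	static int __init foo_init(void)
	{
		maybe_offload_copy();	/* probe the gate once at load time */
		return 0;
	}
	module_init(foo_init);

	MODULE_LICENSE("GPL");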
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things
  2014-05-21 15:35 ` Dan Williams
@ 2014-05-21 23:06   ` Rafael J. Wysocki
  2014-05-21 23:03     ` Dan Williams
  0 siblings, 1 reply; 38+ messages in thread

From: Rafael J. Wysocki @ 2014-05-21 23:06 UTC (permalink / raw)
To: Dan Williams; +Cc: ksummit-discuss

On Wednesday, May 21, 2014 08:35:55 AM Dan Williams wrote:
> [...]
>
> Biased by my background of course, but I think driver development is more amenable to this sort of approach.  For drivers, the kernel is in many instances the application.  For example, I currently have in my review queue a patch set to add SATA port multiplier support to libsas.  I hope I get the review done in time for merging it in 3.16.  But what if I also had the option of saying "let's gatekeeper this for a cycle"?  Users that care could start using it and reporting bugs, and it would be clear that the implementation is provisional.  My opinion is that bug reports would attract deeper code review that otherwise would not occur if the feature were simply delayed for a cycle.

There's more to that.

The model you're referring to is only possible if all participants are employees of one company, or otherwise members of one organization that has some kind of control over them.  Kernel development is not done like that, though, so I'm afraid that the Facebook experience is not applicable here directly.

For example, we take patches from pretty much everyone on the Internet.  Does Facebook do that too?  I don't think so.

Thanks!

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-21 23:06 ` Rafael J. Wysocki @ 2014-05-21 23:03 ` Dan Williams 2014-05-21 23:40 ` Laurent Pinchart ` (2 more replies) 0 siblings, 3 replies; 38+ messages in thread From: Dan Williams @ 2014-05-21 23:03 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ksummit-discuss On Wed, May 21, 2014 at 4:06 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > On Wednesday, May 21, 2014 08:35:55 AM Dan Williams wrote: >> On Wed, May 21, 2014 at 3:11 AM, NeilBrown <neilb@suse.de> wrote: >> > On Wed, 21 May 2014 01:36:55 -0700 Dan Williams <dan.j.williams@intel.com> >> > wrote: >> > >> >> On Wed, May 21, 2014 at 1:25 AM, NeilBrown <neilb@suse.de> wrote: >> >> > On Wed, 21 May 2014 00:48:48 -0700 Dan Williams <dan.j.williams@intel.com> >> >> > wrote: >> >> > >> >> >> On Fri, May 16, 2014 at 8:04 AM, Chris Mason <clm@fb.com> wrote: >> >> >> > -----BEGIN PGP SIGNED MESSAGE----- >> >> >> > Hash: SHA1 >> >> >> > >> >> >> > On 05/15/2014 10:56 PM, NeilBrown wrote: >> >> >> >> On Thu, 15 May 2014 16:13:58 -0700 Dan Williams >> >> >> >> <dan.j.williams@gmail.com> wrote: >> >> >> >> >> >> >> >>> What would it take and would we even consider moving 2x faster >> >> >> >>> than we are now? >> >> >> >> >> >> >> >> Hi Dan, you seem to be suggesting that there is some limit other >> >> >> >> than "competent engineering time" which is slowing Linux "progress" >> >> >> >> down. >> >> >> >> >> >> >> >> Are you really suggesting that? What might these other limits be? >> >> >> >> >> >> >> >> Certainly there are limits to minimum gap between conceptualisation >> >> >> >> and release (at least one release cycle), but is there really a >> >> >> >> limit to the parallelism that can be achieved? >> >> >> > >> >> >> > I haven't compared the FB commit rates with the kernel, but I'll >> >> >> > pretend Dan's basic thesis is right and talk about which parts of the >> >> >> > facebook model may move faster than the kernel. >> >> >> > >> >> >> > The facebook is pretty similar to the way the kernel works. The merge >> >> >> > window lasts a few days and the major releases are every week, but >> >> >> > overall it isn't too far away. >> >> >> > >> >> >> > The biggest difference is that we have a centralized tool for >> >> >> > reviewing the patches, and once it has been reviewed by a specific >> >> >> > number of people, you push it in. >> >> >> > >> >> >> > The patch submission tool runs the patch through lint and various >> >> >> > static analysis to make sure it follows proper coding style and >> >> >> > doesn't include patterns of known bugs. This cuts down on the review >> >> >> > work because the silly coding style mistakes are gone before it gets >> >> >> > to the tool. >> >> >> > >> >> >> > When you put in a patch, you have to put in reviewers, and they get a >> >> >> > little notification that your patch needs review. Once the reviewers >> >> >> > are happy, you push the patch in. >> >> >> > >> >> >> > The biggest difference: there are no maintainers. If I want to go >> >> >> > change the calendar tool to fix a bug, I patch it, get someone else to >> >> >> > sign off and push. >> >> >> > >> >> >> > All of which is my way of saying the maintainers (me included) are the >> >> >> > biggest bottleneck. There are a lot of reasons I think the maintainer >> >> >> > model fits the kernel better, but at least for btrfs I'm trying to >> >> >> > speed up the patch review process and use patchwork more effectively. 
>> >> >> >> >> >> To be clear, I'm not arguing for a maintainer-less model. We don't >> >> >> have the tooling or operational-data to support that. We need >> >> >> maintainers to say "no". But, what I think we can do is give >> >> >> maintainers more varied ways to say it. The goal, de-escalate the >> >> >> merge event as a declaration that the code quality/architecture >> >> >> conversation is over. >> >> >> >> >> >> Release early, release often, and with care merge often. >> >> > >> >> > I think this falls foul of the "no regressions" rule. >> >> > >> >> > The kernel policy is that once the functionality gets to users, it cannot be >> >> > taken away. Individual drivers in 'staging' manage to avoid this rule >> >> > because that are clearly separate things. >> >> > New system calls and attributes in sysfs etc seem to be much harder to >> >> > "partially" release. >> >> >> >> My straw man is something like the following for driver "foo" >> >> >> >> if (gatekeeper_foo_new_awesome_sauce) >> >> do_new_thing(); >> >> >> >> Where setting gatekeeper_foo_new_awesome_sauce taints the kernel and >> >> warns that there is no guarantee of this functionality being present >> >> in the same form or at all going forward. >> > >> > Interesting idea. >> > Trying to imagine how this might play out in practice.... >> > >> > You talk about "value delivered to users". But users tend to use >> > applications, and applications are the users of kernel features. >> > >> > Will anyone bother writing or adapting an application to use a feature which >> > is not guaranteed to hang around? >> > Maybe they will, but will the users of the application know that it might >> > stop working after a kernel upgrade? Maybe... >> > >> > Maybe if we had some concrete examples of features that could have been >> > delayed using a gatekeeper. >> > >> > The one that springs to my mind is cgroups. Clearly useful, but clearly >> > controversial. It appears that the original implementation was seriously >> > flawed and Tejun is doing a massive amount of work to "fix" it, and this >> > apparently will lead to API changes. And this is happening without any >> > gatekeepers. Would it have been easier in some way with gatekeepers? >> > ... I don't see how it would be, except that fewer people would have used >> > cgroups, and then maybe we wouldn't have as much collective experience to >> > know what the real problems were(?). >> > >> > I think that is the key. With a user-facing option, people will try it and >> > probably cope if it disappears (though they might complain loudly and sign >> > petitions declaring facebook to be the anti-$DEITY). However with kernel >> > internal options, applications are unlikely to use them without some >> > expectation of stability. So finding the problems would be a lot harder. >> > >> > Which doesn't mean that it can't work, but it would be nice if create some >> > real life examples to see how it plays out in practice. >> > >> >> Biased by my background of course, but I think driver development is >> more amenable to this sort of approach. For drivers the kernel is in >> many instances the application. For example, I currently have in my >> review queue a patch set to add sata port multiplier support to >> libsas. I hope I get the review done in time for merging it in 3.16. >> But, what if I also had the option of saying "let's gatekeeper this >> for a cycle". Users that care could start using it and reporting >> bugs, and it would be clear that the implementation is provisional. 
>> My opinion is that bug reports would attract deeper code review that >> otherwise would not occur if the feature was simply delayed for a >> cycle. > > There's more to that. > > The model you're referring to is only possible if all participants are > employees of one company or otherwise members of one organization that > has some kind of control over them. The kernel development is not done > like that, though, so I'm afraid that the Facebook experience is not > applicable here directly. > > For example, we take patches from pretty much everyone on the Internet. > Does Facebook do that too? I don't think so. > I'm struggling to see how this addresses my new libsas feature example? Simply put, if an end user knows how to override a "gatekeeper", that user can test features that we are otherwise still debating upstream. They can of course also apply the patches directly, but I am proposing we formalize a mechanism to encourage more experimentation in-tree. I'm fully aware we do not have the tactical data nor operational control to run the kernel like a website; that's not my concern. My concern is with expanding a maintainer's options for mitigating risk. ^ permalink raw reply [flat|nested] 38+ messages in thread
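For concreteness, a minimal sketch of what such a gatekeeper override could look like for a hypothetical driver "foo". The gatekeeper_* and foo_* names are invented for illustration (this is not an existing facility); module_param_cb(), param_set_bool(), and add_taint() are existing kernel APIs, though whether TAINT_USER or a dedicated taint flag is appropriate would itself be part of the debate:

    #include <linux/module.h>
    #include <linux/moduleparam.h>
    #include <linux/kernel.h>

    /* Default off: the maintainer has said "no" for now. */
    static bool foo_new_awesome_sauce;

    static int gatekeeper_set(const char *val, const struct kernel_param *kp)
    {
            int rc = param_set_bool(val, kp);

            /* Overriding the gatekeeper taints the kernel and warns that
             * the feature may change form, or disappear, in a later cycle. */
            if (rc == 0 && foo_new_awesome_sauce) {
                    pr_warn("foo: provisional feature enabled, no stability guarantee\n");
                    add_taint(TAINT_USER, LOCKDEP_STILL_OK);
            }
            return rc;
    }

    static const struct kernel_param_ops gatekeeper_ops = {
            .set = gatekeeper_set,
            .get = param_get_bool,
    };
    module_param_cb(new_awesome_sauce, &gatekeeper_ops,
                    &foo_new_awesome_sauce, 0644);

The driver's code path then reads exactly as in the straw man: if (foo_new_awesome_sauce) do_new_thing();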
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-21 23:03 ` Dan Williams @ 2014-05-21 23:40 ` Laurent Pinchart 2014-05-22 0:10 ` Rafael J. Wysocki 2014-05-22 15:48 ` Theodore Ts'o 2 siblings, 0 replies; 38+ messages in thread From: Laurent Pinchart @ 2014-05-21 23:40 UTC (permalink / raw) To: ksummit-discuss On Wednesday 21 May 2014 16:03:49 Dan Williams wrote: > On Wed, May 21, 2014 at 4:06 PM, Rafael J. Wysocki wrote: > > On Wednesday, May 21, 2014 08:35:55 AM Dan Williams wrote: > >> On Wed, May 21, 2014 at 3:11 AM, NeilBrown wrote: > >> > On Wed, 21 May 2014 01:36:55 -0700 Dan Williams wrote: > >> >> On Wed, May 21, 2014 at 1:25 AM, NeilBrown wrote: > >> >> > On Wed, 21 May 2014 00:48:48 -0700 Dan Williams wrote: > >> >> >> On Fri, May 16, 2014 at 8:04 AM, Chris Mason <clm@fb.com> wrote: > >> >> >> > -----BEGIN PGP SIGNED MESSAGE----- > >> >> >> > Hash: SHA1 > >> >> >> > > >> >> >> > On 05/15/2014 10:56 PM, NeilBrown wrote: > >> >> >> >> On Thu, 15 May 2014 16:13:58 -0700 Dan Williams wrote: > >> >> >> >>> What would it take and would we even consider moving 2x faster > >> >> >> >>> than we are now? > >> >> >> >> > >> >> >> >> Hi Dan, you seem to be suggesting that there is some limit other > >> >> >> >> than "competent engineering time" which is slowing Linux > >> >> >> >> "progress" down. > >> >> >> >> > >> >> >> >> Are you really suggesting that? What might these other limits > >> >> >> >> be? > >> >> >> >> > >> >> >> >> Certainly there are limits to minimum gap between > >> >> >> >> conceptualisation and release (at least one release cycle), but > >> >> >> >> is there really a limit to the parallelism that can be achieved? > >> >> >> > > >> >> >> > I haven't compared the FB commit rates with the kernel, but I'll > >> >> >> > pretend Dan's basic thesis is right and talk about which parts of > >> >> >> > the facebook model may move faster than the kernel. > >> >> >> > > >> >> >> > The facebook is pretty similar to the way the kernel works. The > >> >> >> > merge window lasts a few days and the major releases are every > >> >> >> > week, but overall it isn't too far away. > >> >> >> > > >> >> >> > The biggest difference is that we have a centralized tool for > >> >> >> > reviewing the patches, and once it has been reviewed by a > >> >> >> > specific number of people, you push it in. > >> >> >> > > >> >> >> > The patch submission tool runs the patch through lint and various > >> >> >> > static analysis to make sure it follows proper coding style and > >> >> >> > doesn't include patterns of known bugs. This cuts down on the > >> >> >> > review work because the silly coding style mistakes are gone > >> >> >> > before it gets to the tool. > >> >> >> > > >> >> >> > When you put in a patch, you have to put in reviewers, and they > >> >> >> > get a little notification that your patch needs review. Once the > >> >> >> > reviewers are happy, you push the patch in. > >> >> >> > > >> >> >> > The biggest difference: there are no maintainers. If I want to > >> >> >> > go change the calendar tool to fix a bug, I patch it, get someone > >> >> >> > else to sign off and push. > >> >> >> > > >> >> >> > All of which is my way of saying the maintainers (me included) > >> >> >> > are the biggest bottleneck. There are a lot of reasons I think > >> >> >> > the maintainer model fits the kernel better, but at least for > >> >> >> > btrfs I'm trying to speed up the patch review process and use > >> >> >> > patchwork more effectively. 
> >> >> >> > >> >> >> To be clear, I'm not arguing for a maintainer-less model. We don't > >> >> >> have the tooling or operational-data to support that. We need > >> >> >> maintainers to say "no". But, what I think we can do is give > >> >> >> maintainers more varied ways to say it. The goal, de-escalate the > >> >> >> merge event as a declaration that the code quality/architecture > >> >> >> conversation is over. > >> >> >> > >> >> >> Release early, release often, and with care merge often. > >> >> > > >> >> > I think this falls foul of the "no regressions" rule. > >> >> > > >> >> > The kernel policy is that once the functionality gets to users, it > >> >> > cannot be taken away. Individual drivers in 'staging' manage to > >> >> > avoid this rule because that are clearly separate things. > >> >> > New system calls and attributes in sysfs etc seem to be much harder > >> >> > to "partially" release. > >> >> > >> >> My straw man is something like the following for driver "foo" > >> >> > >> >> if (gatekeeper_foo_new_awesome_sauce) > >> >> > >> >> do_new_thing(); > >> >> > >> >> Where setting gatekeeper_foo_new_awesome_sauce taints the kernel and > >> >> warns that there is no guarantee of this functionality being present > >> >> in the same form or at all going forward. > >> > > >> > Interesting idea. > >> > Trying to imagine how this might play out in practice.... > >> > > >> > You talk about "value delivered to users". But users tend to use > >> > applications, and applications are the users of kernel features. > >> > > >> > Will anyone bother writing or adapting an application to use a feature > >> > which is not guaranteed to hang around? > >> > Maybe they will, but will the users of the application know that it > >> > might stop working after a kernel upgrade? Maybe... > >> > > >> > Maybe if we had some concrete examples of features that could have been > >> > delayed using a gatekeeper. > >> > > >> > The one that springs to my mind is cgroups. Clearly useful, but > >> > clearly controversial. It appears that the original implementation was > >> > seriously flawed and Tejun is doing a massive amount of work to "fix" > >> > it, and this apparently will lead to API changes. And this is > >> > happening without any gatekeepers. Would it have been easier in some > >> > way with gatekeepers? ... I don't see how it would be, except that > >> > fewer people would have used cgroups, and then maybe we wouldn't have > >> > as much collective experience to know what the real problems were(?). > >> > > >> > I think that is the key. With a user-facing option, people will try it > >> > and probably cope if it disappears (though they might complain loudly > >> > and sign petitions declaring facebook to be the anti-$DEITY). However > >> > with kernel internal options, applications are unlikely to use them > >> > without some expectation of stability. So finding the problems would > >> > be a lot harder. > >> > > >> > Which doesn't mean that it can't work, but it would be nice if create > >> > some real life examples to see how it plays out in practice. > >> > >> Biased by my background of course, but I think driver development is > >> more amenable to this sort of approach. For drivers the kernel is in > >> many instances the application. For example, I currently have in my > >> review queue a patch set to add sata port multiplier support to > >> libsas. I hope I get the review done in time for merging it in 3.16. 
> >> But, what if I also had the option of saying "let's gatekeeper this > >> for a cycle". Users that care could start using it and reporting > >> bugs, and it would be clear that the implementation is provisional. > >> My opinion is that bug reports would attract deeper code review that > >> otherwise would not occur if the feature was simply delayed for a > >> cycle. > > > > There's more to that. > > > > The model you're referring to is only possible if all participants are > > employees of one company or otherwise members of one organization that > > has some kind of control over them. The kernel development is not done > > like that, though, so I'm afraid that the Facebook experience is not > > applicable here directly. > > > > For example, we take patches from pretty much everyone on the Internet. > > Does Facebook do that too? I don't think so. > > I'm struggling to see how this addresses my new libsas feature example? > > Simply, if an end user knows how to override a "gatekeeper" that user > can test features that we are otherwise still debating upstream. They > can of course also apply the patches directly, but I am proposing we > formalize a mechanism to encourage more experimentation in-tree. Isn't that what CONFIG_EXPERIMENTAL was for? Putting a similar mechanism in place would likely be abused the same way, and end up being enabled by default by distros at the end of the day. http://lwn.net/Articles/520867/ explains how experimental items should be handled, possibly depending on CONFIG_BROKEN (hopefully distros won't enable that one). Let's not forget that the kernel carries security implications. We might want to make it easier for users to enable experimental features, but not so easy that they could enable dangerous features without knowing it, or without realizing what they're doing. Out-of-tree patches should be pretty safe in that regard; an in-tree mechanism should take those constraints into account. We also need to decide on where to put the limit. Experimental features that haven't been properly reviewed can have side effects. They might make build robots fail even when the feature is disabled, because the implementation doesn't properly handle the disabled case. We would need to review experimental patches to prevent that from happening, and that could just put more burden on maintainers instead of helping them. > I'm fully aware we do not have the tactical data nor operational > control to run the kernel like a website, that's not my concern. My > concern is with expanding a maintainer's options for mitigating risk. -- Regards, Laurent Pinchart ^ permalink raw reply [flat|nested] 38+ messages in thread
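One note on the "disabled case" point: the usual idiom for keeping build robots happy is to pair the experimental config symbol with static inline stubs, so call sites compile whether the feature is on or off. A minimal sketch, assuming a hypothetical CONFIG_FOO_EXPERIMENTAL symbol (which could depend on CONFIG_BROKEN, per the LWN article above):

    /* foo.h */
    #include <linux/errno.h>

    struct foo_device;

    #ifdef CONFIG_FOO_EXPERIMENTAL
    int foo_do_new_thing(struct foo_device *dev);
    #else
    /* Feature compiled out: callers build unchanged and simply
     * see "not supported" at run time. */
    static inline int foo_do_new_thing(struct foo_device *dev)
    {
            return -EOPNOTSUPP;
    }
    #endif

This only covers the compile-time half of the concern, of course; run-time side effects of half-reviewed code would still need review.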
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-21 23:03 ` Dan Williams 2014-05-21 23:40 ` Laurent Pinchart @ 2014-05-22 0:10 ` Rafael J. Wysocki 2014-05-22 15:48 ` Theodore Ts'o 2 siblings, 0 replies; 38+ messages in thread From: Rafael J. Wysocki @ 2014-05-22 0:10 UTC (permalink / raw) To: Dan Williams; +Cc: ksummit-discuss On Wednesday, May 21, 2014 04:03:49 PM Dan Williams wrote: > On Wed, May 21, 2014 at 4:06 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > On Wednesday, May 21, 2014 08:35:55 AM Dan Williams wrote: > >> On Wed, May 21, 2014 at 3:11 AM, NeilBrown <neilb@suse.de> wrote: > >> > On Wed, 21 May 2014 01:36:55 -0700 Dan Williams <dan.j.williams@intel.com> > >> > wrote: [cut] > > > > There's more to that. > > > > The model you're referring to is only possible if all participants are > > employees of one company or otherwise members of one organization that > > has some kind of control over them. The kernel development is not done > > like that, though, so I'm afraid that the Facebook experience is not > > applicable here directly. > > > > For example, we take patches from pretty much everyone on the Internet. > > Does Facebook do that too? I don't think so. > > > > I'm struggling to see how this addresses my new libsas feature example? What about security? What about preventing distros from shipping code that won't be accepted eventually? > Simply, if an end user knows how to override a "gatekeeper" that user > can test features that we are otherwise still debating upstream. They > can of course also apply the patches directly, but I am proposing we > formalize a mechanism to encourage more experimentation in-tree. So is staging not sufficient any more? > I'm fully aware we do not have the tactical data nor operational > control to run the kernel like a website, that's not my concern. My > concern is with expanding a maintainer's options for mitigating risk. What risk exactly do you mean? Rafael ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-21 23:03 ` Dan Williams 2014-05-21 23:40 ` Laurent Pinchart 2014-05-22 0:10 ` Rafael J. Wysocki @ 2014-05-22 15:48 ` Theodore Ts'o 2014-05-22 16:31 ` Dan Williams 2 siblings, 1 reply; 38+ messages in thread From: Theodore Ts'o @ 2014-05-22 15:48 UTC (permalink / raw) To: Dan Williams; +Cc: ksummit-discuss On Wed, May 21, 2014 at 04:03:49PM -0700, Dan Williams wrote: > Simply, if an end user knows how to override a "gatekeeper" that user > can test features that we are otherwise still debating upstream. They > can of course also apply the patches directly, but I am proposing we > formalize a mechanism to encourage more experimentation in-tree. > > I'm fully aware we do not have the tactical data nor operational > control to run the kernel like a website, that's not my concern. My > concern is with expanding a maintainer's options for mitigating risk. Various maintainers are doing this sort of thing already. For example, file system developers stage new file system features in precisely this way. Both xfs and ext4 have done this sort of thing, and certainly SuSE has used this technique with btrfs to only support those file system features which they are prepared to support. The problem is that this sort of gatekeeper is something a maintainer has to use in combination with existing techniques, and it doesn't necessarily accelerate development by all that much. In particular, if it has any kind of kernel ABI or file system format implications, we need to make sure the interfaces are set in stone before we can let it into the mainline kernel, even if it is not enabled by default. (Consider the avidity that userspace application developers can sometimes have for using even debugging interfaces such as ftrace, and the "no userspace breakages" rule. So not only do you have to worry about userspace applications not using a feature which is protected by a gatekeeper, you also have to worry about premature pervasive use of a feature such that you can't change the interface any more.) That by the way is the singular huge advantage that centralized code bases such as those found at Google and Facebook have --- if I need to make a kernel change for some feature that hasn't made it upstream yet, all of the users of some particular Google-specific kernel<->user space interface are under a single source tree, and while I do need to worry about staged deployments, I can be extremely confident that I can identify all of the users of a particular interface, and put in appropriate measures to update an interface. It still might take several release cadences, but that's typically far shorter than what it would take to obsolete a published upstream interface. As a result, I am much more willing to let an ugly, but operationally necessary new feature (such as, say, a netlink interface to export information about file system errors) into an internal Google kernel interface, but I'd be much less willing to let something like that go upstream, because while it's annoying to have to forward port such an out-of-tree patch, having to deal with fixing or upgrading a published interface is at least an order of magnitude or two more work. In addition, both Google and Facebook can afford to make changes that only need to worry about their data center environment, whereas an upstream change has to work in a much larger variety of situations and circumstances.
The bottom line is that just because you can do something at Facebook or Google does not necessarily mean that the same technique will port over easily into the upstream development model. - Ted ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-22 15:48 ` Theodore Ts'o @ 2014-05-22 16:31 ` Dan Williams 2014-05-22 17:38 ` Theodore Ts'o ` (3 more replies) 0 siblings, 4 replies; 38+ messages in thread From: Dan Williams @ 2014-05-22 16:31 UTC (permalink / raw) To: Theodore Ts'o; +Cc: ksummit-discuss On Thu, May 22, 2014 at 8:48 AM, Theodore Ts'o <tytso@mit.edu> wrote: > On Wed, May 21, 2014 at 04:03:49PM -0700, Dan Williams wrote: >> Simply, if an end user knows how to override a "gatekeeper" that user >> can test features that we are otherwise still debating upstream. They >> can of course also apply the patches directly, but I am proposing we >> formalize a mechanism to encourage more experimentation in-tree. >> >> I'm fully aware we do not have the tactical data nor operational >> control to run the kernel like a website, that's not my concern. My >> concern is with expanding a maintainer's options for mitigating risk. > > Various maintainers are doing this sort of thing already. For > example, file system developers stage new file system features in > precisely this way. Both xfs and ext4 have done this sort of thing, > and certainly SuSE has used this technique with btrfs to only support > those file system features which they are prepared to support. > > The problem is using this sort of gatekeeper is something that a > maintainer has to use in combination with existing techniques, and it > doesn't necessarliy accelerate development by all that much. In > particular, if it has any kind of kernel ABI or file system format > implications, we need to make sure the interfaces are set in stone > before we can let it into the mainline kernel, even if it is not > enabled by default. (Consider the avidity that userspace application > developers can sometimes have for using even debugging interfaces such > as ftrace, and the "no userspace breakages" rule. So not only do you > have to worry about userspace applicaitons not using a feature which > is protected by a gatekeeper, you also have to worry about premature > pervasive use of a feature such that you can't change the interface > any more.) I agree that something like this is prickly once it gets entangled with ABI concerns. But, I disagree with the speed argument... unless you believe -staging has not increased the velocity of kernel development? > That by the way is the singular huge advangtage that centralized code > bases such as those found at Google and Facebook have --- if I need to > make a kernel change for some feature that hasn't made it upstream > yet, all of the users of some particular Google-specific kernel<->user > space interface is under a single source tree, and while I do need to > worry about staged deployments, I can be extremely confident that I > can identify all of the users of a particular interface, and put in > appropriate measures to update an interface. It still might take > several release candences, but that's typically far shorter than what > it would take to obsolete a published upstream interface. Understood, but I'm not advocating that a system like this be used to support the Facebook/Google style kernel hacks to do things that only mega-datacenters care about. 
> As a result, I am much more willing to let a ugly, but operationally > necessary new feature (such as say a netlink interface to export > information about file system errors, for example) into an internal > Google kernel interface, but I'd be much less willing to let something > like that go upstream, because while it's annoying to have to forward > port such an out-of-tree patch, having to deal with fixing or > upgrading a published interface is at least an order or two more work. > > In addition, both Google and Facebook can afford to make changes that > only need to worry about their data center environment, where as an > upstream change has to work in a much larger variety of situations and > circumstances. > > The bottom line is just because you can do something at Facebook or > Google does not necessarily mean that the same technique will port > over easily into the upstream development model. Neil already disabused me of the idea that a "gatekeeper" could be used to beneficial effect in the core kernel, and I can see it's equally difficult to use this in filesystems that need to be careful of ABI changes. However, nothing presented so far has swayed me from my top of mind concern which is the ability to ship pre-production driver features in the upstream kernel. I'm thinking of it as "-staging for otherwise established drivers". ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-22 16:31 ` Dan Williams @ 2014-05-22 17:38 ` Theodore Ts'o 2014-05-22 18:42 ` Dan Williams ` (2 subsequent siblings) 3 siblings, 0 replies; 38+ messages in thread From: Theodore Ts'o @ 2014-05-22 17:38 UTC (permalink / raw) To: Dan Williams; +Cc: ksummit-discuss On Thu, May 22, 2014 at 09:31:44AM -0700, Dan Williams wrote: > Neil already disabused me of the idea that a "gatekeeper" could be > used to beneficial effect in the core kernel, and I can see it's > equally difficult to use this in filesystems that need to be careful > of ABI changes. However, nothing presented so far has swayed me from > my top of mind concern which is the ability to ship pre-production > driver features in the upstream kernel. I'm thinking of it as > "-staging for otherwise established drivers". In the case where you are just adding some additional hardware enablement for some newer version of some chipset, I can see the applicability. But if the new feature also requires new core code functionality (for example, some smarter way of handling interrupt mitigation or interrupt steering), the "gatekeeper" approach can also get problematic, for the reasons Neil outlined. For example, I can remember lots of serial driver enhancements that required core tty layer changes in order to be effective. (In fact I had a friendly competition with the FreeBSD tty maintainer many years ago, but one of the reasons why I was able to get significantly better improvements with Linux was because the FreeBSD core team back then viewed the architecture from BSD 4.3 as having been handed down from the mountain top as if from Moses....) So this is why I'm wondering how commonly applicable this particular technique might be, and if it's restricted to individual driver code, is there anything special we really need to do to encourage this. After all, device driver authors could use a sysfs file to do this sort of thing today, right? - Ted ^ permalink raw reply [flat|nested] 38+ messages in thread
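As a sketch of the sysfs route Ted mentions: an established driver "foo" could gate an experimental code path behind a per-device boolean attribute today. The foo_experimental flag and the attribute name are invented for illustration; DEVICE_ATTR_RW(), strtobool(), and device_create_file() are existing kernel APIs:

    #include <linux/device.h>
    #include <linux/kernel.h>
    #include <linux/string.h>

    /* Gate checked by the driver before taking the experimental path. */
    static bool foo_experimental;

    static ssize_t experimental_show(struct device *dev,
                                     struct device_attribute *attr, char *buf)
    {
            return sprintf(buf, "%d\n", foo_experimental);
    }

    static ssize_t experimental_store(struct device *dev,
                                      struct device_attribute *attr,
                                      const char *buf, size_t count)
    {
            bool val;

            if (strtobool(buf, &val) < 0)
                    return -EINVAL;
            foo_experimental = val;
            return count;
    }
    static DEVICE_ATTR_RW(experimental);

    /* In the driver's probe routine:
     *      device_create_file(dev, &dev_attr_experimental);
     */

An administrator opts in with "echo 1 > /sys/devices/.../experimental", and the driver checks foo_experimental before calling into the new code -- no new kernel infrastructure required, which is Ted's point.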
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-22 16:31 ` Dan Williams 2014-05-22 17:38 ` Theodore Ts'o @ 2014-05-22 18:42 ` Dan Williams 2014-05-22 19:06 ` Chris Mason 2014-05-22 20:31 ` Dan Carpenter 2014-05-23 2:13 ` Greg KH 3 siblings, 1 reply; 38+ messages in thread From: Dan Williams @ 2014-05-22 18:42 UTC (permalink / raw) To: Theodore Ts'o; +Cc: ksummit-discuss On Thu, May 22, 2014 at 9:31 AM, Dan Williams <dan.j.williams@intel.com> wrote: > On Thu, May 22, 2014 at 8:48 AM, Theodore Ts'o <tytso@mit.edu> wrote: >> On Wed, May 21, 2014 at 04:03:49PM -0700, Dan Williams wrote: >>> Simply, if an end user knows how to override a "gatekeeper" that user >>> can test features that we are otherwise still debating upstream. They >>> can of course also apply the patches directly, but I am proposing we >>> formalize a mechanism to encourage more experimentation in-tree. >>> >>> I'm fully aware we do not have the tactical data nor operational >>> control to run the kernel like a website, that's not my concern. My >>> concern is with expanding a maintainer's options for mitigating risk. >> >> Various maintainers are doing this sort of thing already. For >> example, file system developers stage new file system features in >> precisely this way. Both xfs and ext4 have done this sort of thing, >> and certainly SuSE has used this technique with btrfs to only support >> those file system features which they are prepared to support. >> >> The problem is using this sort of gatekeeper is something that a >> maintainer has to use in combination with existing techniques, and it >> doesn't necessarliy accelerate development by all that much. In >> particular, if it has any kind of kernel ABI or file system format >> implications, we need to make sure the interfaces are set in stone >> before we can let it into the mainline kernel, even if it is not >> enabled by default. (Consider the avidity that userspace application >> developers can sometimes have for using even debugging interfaces such >> as ftrace, and the "no userspace breakages" rule. So not only do you >> have to worry about userspace applicaitons not using a feature which >> is protected by a gatekeeper, you also have to worry about premature >> pervasive use of a feature such that you can't change the interface >> any more.) > > I agree that something like this is prickly once it gets entangled > with ABI concerns. But, I disagree with the speed argument... unless > you believe -staging has not increased the velocity of kernel > development? > >> That by the way is the singular huge advangtage that centralized code >> bases such as those found at Google and Facebook have --- if I need to >> make a kernel change for some feature that hasn't made it upstream >> yet, all of the users of some particular Google-specific kernel<->user >> space interface is under a single source tree, and while I do need to >> worry about staged deployments, I can be extremely confident that I >> can identify all of the users of a particular interface, and put in >> appropriate measures to update an interface. It still might take >> several release candences, but that's typically far shorter than what >> it would take to obsolete a published upstream interface. > > Understood, but I'm not advocating that a system like this be used to > support the Facebook/Google style kernel hacks to do things that only > mega-datacenters care about. 
> >> As a result, I am much more willing to let a ugly, but operationally >> necessary new feature (such as say a netlink interface to export >> information about file system errors, for example) into an internal >> Google kernel interface, but I'd be much less willing to let something >> like that go upstream, because while it's annoying to have to forward >> port such an out-of-tree patch, having to deal with fixing or >> upgrading a published interface is at least an order or two more work. >> >> In addition, both Google and Facebook can afford to make changes that >> only need to worry about their data center environment, where as an >> upstream change has to work in a much larger variety of situations and >> circumstances. >> >> The bottom line is just because you can do something at Facebook or >> Google does not necessarily mean that the same technique will port >> over easily into the upstream development model. > > Neil already disabused me of the idea that a "gatekeeper" could be > used to beneficial effect in the core kernel, and I can see it's > equally difficult to use this in filesystems that need to be careful > of ABI changes. However, nothing presented so far has swayed me from > my top of mind concern which is the ability to ship pre-production > driver features in the upstream kernel. I'm thinking of it as > "-staging for otherwise established drivers". Interesting quote / counterpoint from Dave Chinner that supports the "don't do this for filesystems!" sentiment: "The development of btrfs has shown that moving prototype filesystems into the main kernel tree does not lead to stability, performance or production readiness any faster than if they stayed as an out-of-tree module until most of the development was complete. If anything, merging into mainline reduces the speed at which a filesystem can be brought to being feature complete and production ready." The care that must be taken with merging experiments is to avoid accidentally leaking promises to users that you don't intend to keep. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-22 18:42 ` Dan Williams @ 2014-05-22 19:06 ` Chris Mason 0 siblings, 0 replies; 38+ messages in thread From: Chris Mason @ 2014-05-22 19:06 UTC (permalink / raw) To: Dan Williams, Theodore Ts'o; +Cc: ksummit-discuss On 05/22/2014 02:42 PM, Dan Williams wrote: > On Thu, May 22, 2014 at 9:31 AM, Dan Williams <dan.j.williams@intel.com> wrote: > > Interesting quote / counterpoint from Dave Chinner that supports the > "don't do this for filesystems!" sentiment: > > "The development of btrfs has shown that moving prototype filesystems > into the main kernel tree does not lead stability, performance or > production readiness any faster than if they stayed as an out-of-tree > module until most of the development was complete. If anything, > merging into mainline reduces the speed at which a filesystem can be > brought to being feature complete and production ready." > > The care that must be taken with merging experiments is accidentally > leaking promises that you don't intend to keep to users. Not too surprising, but I disagree with Dave here. Having things upstream earlier increases community ownership, and it helps reduce silos of private code in the project. Btrfs does have its warts, but it also looks like a Linux filesystem. Out of tree, it would be something different, and certainly less than it is now. -chris ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-22 16:31 ` Dan Williams 2014-05-22 17:38 ` Theodore Ts'o 2014-05-22 18:42 ` Dan Williams @ 2014-05-22 20:31 ` Dan Carpenter 2014-05-22 20:56 ` Geert Uytterhoeven 2014-05-23 2:13 ` Greg KH 3 siblings, 1 reply; 38+ messages in thread From: Dan Carpenter @ 2014-05-22 20:31 UTC (permalink / raw) To: Dan Williams; +Cc: ksummit-discuss On Thu, May 22, 2014 at 09:31:44AM -0700, Dan Williams wrote: > I agree that something like this is prickly once it gets entangled > with ABI concerns. But, I disagree with the speed argument... unless > you believe -staging has not increased the velocity of kernel > development? Staging is good because it brings more developers, but in many cases it is a slowdown. Merged code has stricter rules where you have to write reviewable patches. If there is a bug early in a patch series then you can't just fix it in a later patch; you need to redo the whole series. Porting a wifi driver to a different wireless stack is difficult/impossible when you have to write bisectable code. I often think that developers would be better off just working like mad to fix things up outside the tree. The good thing about staging is that, before it existed, there were all these drivers out there which people were using but which were never going to be merged into the kernel. Now we merge them and try to clean them up, so it is a step in the right direction. regards, dan carpenter ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-22 20:31 ` Dan Carpenter @ 2014-05-22 20:56 ` Geert Uytterhoeven 2014-05-23 6:21 ` James Bottomley 0 siblings, 1 reply; 38+ messages in thread From: Geert Uytterhoeven @ 2014-05-22 20:56 UTC (permalink / raw) To: Dan Carpenter; +Cc: ksummit-discuss On Thu, May 22, 2014 at 10:31 PM, Dan Carpenter <dan.carpenter@oracle.com> wrote: > On Thu, May 22, 2014 at 09:31:44AM -0700, Dan Williams wrote: >> I agree that something like this is prickly once it gets entangled >> with ABI concerns. But, I disagree with the speed argument... unless >> you believe -staging has not increased the velocity of kernel >> development? > > Staging is good because it brings more developers, but in many cases it > is a slow down. Merged codes has stricter rules where you have to write > reviewable patches. If there is a bug early in a patch series then you > can't just fix it in a later patch, you need to redo the whole series. In theory... These days many fixes end up as separate commits in various subsystem trees, due to "no rebase" rules and other regulations. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-22 20:56 ` Geert Uytterhoeven @ 2014-05-23 6:21 ` James Bottomley 2014-05-23 14:11 ` John W. Linville 0 siblings, 1 reply; 38+ messages in thread From: James Bottomley @ 2014-05-23 6:21 UTC (permalink / raw) To: Geert Uytterhoeven; +Cc: ksummit-discuss, Dan Carpenter On Thu, 2014-05-22 at 22:56 +0200, Geert Uytterhoeven wrote: > On Thu, May 22, 2014 at 10:31 PM, Dan Carpenter > <dan.carpenter@oracle.com> wrote: > > On Thu, May 22, 2014 at 09:31:44AM -0700, Dan Williams wrote: > >> I agree that something like this is prickly once it gets entangled > >> with ABI concerns. But, I disagree with the speed argument... unless > >> you believe -staging has not increased the velocity of kernel > >> development? > > > > Staging is good because it brings more developers, but in many cases it > > is a slow down. Merged codes has stricter rules where you have to write > > reviewable patches. If there is a bug early in a patch series then you > > can't just fix it in a later patch, you need to redo the whole series. > > In theory... > > These days many fixes end up as separate commits in various subsystem > trees, due to "no rebase" rules and other regulations. No, pretty much in practice. I've no qualms about dropping a patch series if one of the git tree tests shows problems and, since I have a mostly linear tree, that means a rebase. I also don't believe in "preserving" history that is simply bug fixes that should have been in the series. Sometimes, if the fix took a while to track down, I might keep the separate patch for credit + learning, but most of the time I'd fold it into a commit and annotate the commit. James ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-23 6:21 ` James Bottomley @ 2014-05-23 14:11 ` John W. Linville 2014-05-24 9:14 ` James Bottomley 0 siblings, 1 reply; 38+ messages in thread From: John W. Linville @ 2014-05-23 14:11 UTC (permalink / raw) To: James Bottomley; +Cc: Dan Carpenter, ksummit-discuss On Thu, May 22, 2014 at 11:21:35PM -0700, James Bottomley wrote: > On Thu, 2014-05-22 at 22:56 +0200, Geert Uytterhoeven wrote: > > On Thu, May 22, 2014 at 10:31 PM, Dan Carpenter > > <dan.carpenter@oracle.com> wrote: > > > On Thu, May 22, 2014 at 09:31:44AM -0700, Dan Williams wrote: > > >> I agree that something like this is prickly once it gets entangled > > >> with ABI concerns. But, I disagree with the speed argument... unless > > >> you believe -staging has not increased the velocity of kernel > > >> development? > > > > > > Staging is good because it brings more developers, but in many cases it > > > is a slow down. Merged codes has stricter rules where you have to write > > > reviewable patches. If there is a bug early in a patch series then you > > > can't just fix it in a later patch, you need to redo the whole series. > > > > In theory... > > > > These days many fixes end up as separate commits in various subsystem > > trees, due to "no rebase" rules and other regulations. > > No, pretty much in practise. I've no qualms about dropping a patch > series if one of the git tree tests shows problems and, since I have a > mostly linear tree, that means a rebase. > > I also don't believe in "preserving" history which is simply bug fixes > that should have been in the series. Sometimes, if the fix took a while > to track down, I might keep the separate patch for credit + learning, > but most of the time I'd fold it into a commit and annotate the commit. That's all well and good, but rebasing causes a lot of pain. This is particularly true when you have downstream trees. In any case, bugs will eventually show up -- probably on the day after you merge the 'final' series. Hopefully those are not 'brown paper bag' bugs, but you can only stall a series so long in hopes of shaking those out. You can only extend yourself so far in pursuit of bisectability. John -- John W. Linville Someday the world will need a hero, and you linville@tuxdriver.com might be all we have. Be ready. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-23 14:11 ` John W. Linville @ 2014-05-24 9:14 ` James Bottomley 2014-05-24 19:19 ` Geert Uytterhoeven 0 siblings, 1 reply; 38+ messages in thread From: James Bottomley @ 2014-05-24 9:14 UTC (permalink / raw) To: John W. Linville; +Cc: ksummit-discuss, Dan Carpenter On Fri, 2014-05-23 at 10:11 -0400, John W. Linville wrote: > On Thu, May 22, 2014 at 11:21:35PM -0700, James Bottomley wrote: > > On Thu, 2014-05-22 at 22:56 +0200, Geert Uytterhoeven wrote: > > > On Thu, May 22, 2014 at 10:31 PM, Dan Carpenter > > > <dan.carpenter@oracle.com> wrote: > > > > On Thu, May 22, 2014 at 09:31:44AM -0700, Dan Williams wrote: > > > >> I agree that something like this is prickly once it gets entangled > > > >> with ABI concerns. But, I disagree with the speed argument... unless > > > >> you believe -staging has not increased the velocity of kernel > > > >> development? > > > > > > > > Staging is good because it brings more developers, but in many cases it > > > > is a slow down. Merged codes has stricter rules where you have to write > > > > reviewable patches. If there is a bug early in a patch series then you > > > > can't just fix it in a later patch, you need to redo the whole series. > > > > > > In theory... > > > > > > These days many fixes end up as separate commits in various subsystem > > > trees, due to "no rebase" rules and other regulations. > > > > No, pretty much in practise. I've no qualms about dropping a patch > > series if one of the git tree tests shows problems and, since I have a > > mostly linear tree, that means a rebase. > > > > I also don't believe in "preserving" history which is simply bug fixes > > that should have been in the series. Sometimes, if the fix took a while > > to track down, I might keep the separate patch for credit + learning, > > but most of the time I'd fold it into a commit and annotate the commit. > > That's all well and good, but rebasing causes a lot of pain. Not usually if you manage it right. > This is particularly true when you have downstream trees. What I find is that people rarely actually need to base development on my tree as upstream. We do sometimes get the odd entangled patch (code that changes something that changed in my tree), but we haven't had that for a while now. The rule therefore is use an upstream Linus tree to develop unless you specifically have entangled patches. If you need to test with my tree, you can still pull it in as a merge. I also have specific methodologies where I keep head and tail branches of my trees, so for <x> development branch I have an <x>-base branch as well, so I can simply do a git checkout <x> git rebase --onto origin/master <x>-base git branch -f <x>-base origin/master > In any case, bugs will eventually show-up -- probably on the day after > you merge the 'final' series. Hopefully those are not 'brown paper bag' > bugs, but you can only stall a series so long in hopes of shaking > those out. You can only extend yourself so far in pursuit of bisectability. Right, you have to have a "history commit" point ... for me that's when I send the tree to Linus ... then the history becomes immutable and any breakage discovered afterwards has to be fixed by separate patches. James ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-24 9:14 ` James Bottomley @ 2014-05-24 19:19 ` Geert Uytterhoeven 0 siblings, 0 replies; 38+ messages in thread From: Geert Uytterhoeven @ 2014-05-24 19:19 UTC (permalink / raw) To: James Bottomley; +Cc: Dan Carpenter, ksummit-discuss Hi James, On Sat, May 24, 2014 at 11:14 AM, James Bottomley <James.Bottomley@hansenpartnership.com> wrote: > I also have specific methodologies where I keep head and tail branches > of my trees, so for <x> development branch I have an <x>-base branch as > well, so I can simply do a > > git checkout <x> > git rebase --onto origin/master <x>-base > git branch -f <x>-base origin/master If your origin/master is only forwarding (i.e. never rebased), you can do without the <x>-base branch, as it will always point somewhere into the history of origin/master. Git is smart enough so "git rebase origin/master <x>" will do the right thing. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-22 16:31 ` Dan Williams ` (2 preceding siblings ...) 2014-05-22 20:31 ` Dan Carpenter @ 2014-05-23 2:13 ` Greg KH 2014-05-23 3:03 ` Dan Williams 2014-05-23 14:02 ` Josh Boyer 3 siblings, 2 replies; 38+ messages in thread From: Greg KH @ 2014-05-23 2:13 UTC (permalink / raw) To: Dan Williams; +Cc: ksummit-discuss On Thu, May 22, 2014 at 09:31:44AM -0700, Dan Williams wrote: > On Thu, May 22, 2014 at 8:48 AM, Theodore Ts'o <tytso@mit.edu> wrote: > > On Wed, May 21, 2014 at 04:03:49PM -0700, Dan Williams wrote: > >> Simply, if an end user knows how to override a "gatekeeper" that user > >> can test features that we are otherwise still debating upstream. They > >> can of course also apply the patches directly, but I am proposing we > >> formalize a mechanism to encourage more experimentation in-tree. > >> > >> I'm fully aware we do not have the tactical data nor operational > >> control to run the kernel like a website, that's not my concern. My > >> concern is with expanding a maintainer's options for mitigating risk. > > > > Various maintainers are doing this sort of thing already. For > > example, file system developers stage new file system features in > > precisely this way. Both xfs and ext4 have done this sort of thing, > > and certainly SuSE has used this technique with btrfs to only support > > those file system features which they are prepared to support. > > > > The problem is using this sort of gatekeeper is something that a > > maintainer has to use in combination with existing techniques, and it > > doesn't necessarliy accelerate development by all that much. In > > particular, if it has any kind of kernel ABI or file system format > > implications, we need to make sure the interfaces are set in stone > > before we can let it into the mainline kernel, even if it is not > > enabled by default. (Consider the avidity that userspace application > > developers can sometimes have for using even debugging interfaces such > > as ftrace, and the "no userspace breakages" rule. So not only do you > > have to worry about userspace applicaitons not using a feature which > > is protected by a gatekeeper, you also have to worry about premature > > pervasive use of a feature such that you can't change the interface > > any more.) > > I agree that something like this is prickly once it gets entangled > with ABI concerns. But, I disagree with the speed argument... unless > you believe -staging has not increased the velocity of kernel > development? As the maintainer of drivers/staging/ I don't think it has increased the speed of the development of other parts of the kernel at all. Do you have numbers that show otherwise? > Neil already disabused me of the idea that a "gatekeeper" could be > used to beneficial effect in the core kernel, and I can see it's > equally difficult to use this in filesystems that need to be careful > of ABI changes. However, nothing presented so far has swayed me from > my top of mind concern which is the ability to ship pre-production > driver features in the upstream kernel. I'm thinking of it as > "-staging for otherwise established drivers". The thing you need to realize is that the large majority of people who would ever use that new "feature" will not until it ends up in an "enterprise" kernel release. And that will not be for another few years, so while you think you got it all right, we really don't know who is using it, or how well it works, for a few years. 
But feel free to try to do this in your subsystem; as Ted points out, it can be done for some things, but be careful about thinking things are ok when you don't have many real users :) thanks, greg k-h ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-23 2:13 ` Greg KH @ 2014-05-23 3:03 ` Dan Williams 2014-05-23 7:44 ` Greg KH 1 sibling, 1 reply; 38+ messages in thread From: Dan Williams @ 2014-05-23 3:03 UTC (permalink / raw) To: Greg KH; +Cc: ksummit-discuss On Thu, May 22, 2014 at 7:13 PM, Greg KH <greg@kroah.com> wrote: > On Thu, May 22, 2014 at 09:31:44AM -0700, Dan Williams wrote: >> On Thu, May 22, 2014 at 8:48 AM, Theodore Ts'o <tytso@mit.edu> wrote: >> > On Wed, May 21, 2014 at 04:03:49PM -0700, Dan Williams wrote: >> >> Simply, if an end user knows how to override a "gatekeeper" that user >> >> can test features that we are otherwise still debating upstream. They >> >> can of course also apply the patches directly, but I am proposing we >> >> formalize a mechanism to encourage more experimentation in-tree. >> >> >> >> I'm fully aware we do not have the tactical data nor operational >> >> control to run the kernel like a website, that's not my concern. My >> >> concern is with expanding a maintainer's options for mitigating risk. >> > >> > Various maintainers are doing this sort of thing already. For >> > example, file system developers stage new file system features in >> > precisely this way. Both xfs and ext4 have done this sort of thing, >> > and certainly SuSE has used this technique with btrfs to only support >> > those file system features which they are prepared to support. >> > >> > The problem is using this sort of gatekeeper is something that a >> > maintainer has to use in combination with existing techniques, and it >> > doesn't necessarliy accelerate development by all that much. In >> > particular, if it has any kind of kernel ABI or file system format >> > implications, we need to make sure the interfaces are set in stone >> > before we can let it into the mainline kernel, even if it is not >> > enabled by default. (Consider the avidity that userspace application >> > developers can sometimes have for using even debugging interfaces such >> > as ftrace, and the "no userspace breakages" rule. So not only do you >> > have to worry about userspace applicaitons not using a feature which >> > is protected by a gatekeeper, you also have to worry about premature >> > pervasive use of a feature such that you can't change the interface >> > any more.) >> >> I agree that something like this is prickly once it gets entangled >> with ABI concerns. But, I disagree with the speed argument... unless >> you believe -staging has not increased the velocity of kernel >> development? > > As the maintainer of drivers/staging/ I don't think it has increased the > speed of the development of other parts of the kernel at all. Do you > have numbers that show otherwise? Well, I'm defining velocity as value delivered to end users and amount of testing that can be distributed by being upstream. By that definition -staging does make us faster simply because mainline releases have more drivers than they would otherwise, and it attracts more developers to test and clean up the code. >> Neil already disabused me of the idea that a "gatekeeper" could be >> used to beneficial effect in the core kernel, and I can see it's >> equally difficult to use this in filesystems that need to be careful >> of ABI changes. However, nothing presented so far has swayed me from >> my top of mind concern which is the ability to ship pre-production >> driver features in the upstream kernel.
I'm thinking of it as >> "-staging for otherwise established drivers". > > The thing you need to realize is that the large majority of people who > would ever use that new "feature" will not until it ends up in an > "enterprise" kernel release. And that will not be for another few > years, so while you think you got it all right, we really don't know who > is using it, or how well it works, for a few years. > > But feel free to try to do this in your subsystem; as Ted points out, it > can be done for some things, but be careful about thinking things are ok > when you don't have many real users :) > Point taken. However, if this is the case, why is there so much tension around some merge events, especially in cases where there is low risk of regression? We seem to aim for perfection in merging, and that is specifically the latency I am targeting: a "this feature is behind a gatekeeper" release valve for the pressure to not merge. If things stay behind a gatekeeper too long they get reverted. Would that modulate the latency to "ack" in any meaningful way? ^ permalink raw reply [flat|nested] 38+ messages in thread
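For concreteness, here is one hypothetical shape the per-driver gatekeeper Dan describes could take. This is a sketch only: the "new_awesome_sauce" name comes from Dan's straw man elsewhere in the thread, while the module parameter, the do_new_thing() helper, and the choice of TAINT_USER as the taint flag are invented for illustration and are not existing kernel code.

	/*
	 * Hypothetical gatekeeper for a provisional feature in driver "foo".
	 * The feature defaults to off; turning it on taints the kernel, much
	 * as loading a staging driver does.
	 */
	#include <linux/kernel.h>
	#include <linux/module.h>

	static bool gatekeeper_new_awesome_sauce;	/* maintainer's default: "no" */
	module_param_named(new_awesome_sauce, gatekeeper_new_awesome_sauce,
			   bool, 0444);
	MODULE_PARM_DESC(new_awesome_sauce,
			 "Provisional feature; may change form or disappear");

	static void do_new_thing(void)
	{
		/* the provisional behavior under review goes here */
	}

	static void foo_handle_request(void)
	{
		if (gatekeeper_new_awesome_sauce) {
			/* make the opt-in visible in any subsequent oops */
			pr_warn_once("foo: provisional feature enabled, tainting kernel\n");
			add_taint(TAINT_USER, LOCKDEP_STILL_OK);
			do_new_thing();
		}
	}

Booting with foo.new_awesome_sauce=1 would then be the moral equivalent of opting into a staging driver: the user gets the feature early, the maintainer gets bug reports, and the taint makes the experiment visible in any crash that follows.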
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-23 3:03 ` Dan Williams @ 2014-05-23 7:44 ` Greg KH 0 siblings, 0 replies; 38+ messages in thread From: Greg KH @ 2014-05-23 7:44 UTC (permalink / raw) To: Dan Williams; +Cc: ksummit-discuss On Thu, May 22, 2014 at 08:03:32PM -0700, Dan Williams wrote: > >> Neil already disabused me of the idea that a "gatekeeper" could be > >> used to beneficial effect in the core kernel, and I can see it's > >> equally difficult to use this in filesystems that need to be careful > >> of ABI changes. However, nothing presented so far has swayed me from > >> my top-of-mind concern, which is the ability to ship pre-production > >> driver features in the upstream kernel. I'm thinking of it as > >> "-staging for otherwise established drivers". > > > > The thing you need to realize is that the large majority of people who > > would ever use that new "feature" will not until it ends up in an > > "enterprise" kernel release. And that will not be for another few > > years, so while you think you got it all right, we really don't know who > > is using it, or how well it works, for a few years. > > > > But feel free to try to do this in your subsystem; as Ted points out, it > > can be done for some things, but be careful about thinking things are ok > > when you don't have many real users :) > > > > Point taken. > > However, if this is the case, why is there so much tension around some > merge events, especially in cases where there is low risk of > regression? What "tension" are you speaking of? Getting new APIs correct before we do a release? Or something else? I didn't see any specific examples mentioned in this thread, but I might have missed it. > We seem to aim for perfection in merging, and that is > specifically the latency I am targeting: a "this feature is behind > a gatekeeper" release valve for the pressure to not merge. If things > stay behind a gatekeeper too long they get reverted. Would that > modulate the latency to "ack" in any meaningful way? For a filesystem, or a driver, as stated, this might work. For a syscall, or a new subsystem API to userspace, that isn't going to work for the above-mentioned reasons. See the cgroups interface for one example of how long it took for people to actually start to use it (years), and then, once we realized just how bad the interface really was for real-world usages, it was too late, as people were already using it, so we have to keep those interfaces around for an indefinite time before they can be removed, if ever. thanks, greg k-h ^ permalink raw reply [flat|nested] 38+ messages in thread
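The filesystem technique Ted and Greg refer to looks roughly like the following sketch, modeled loosely on the ext4/xfs incompat-feature-flag pattern. The names are invented and this is not code from either filesystem: a feature baked into the on-disk format is refused at mount time unless the implementation explicitly claims support for it, so only users who opted in at mkfs time ever see the new behavior.

	#include <linux/fs.h>
	#include <linux/printk.h>
	#include <linux/types.h>

	/* invented flag values; real filesystems keep these in the superblock */
	#define MYFS_FEATURE_INCOMPAT_NEW_FORMAT	0x0400
	#define MYFS_SUPPORTED_INCOMPAT \
		(MYFS_FEATURE_INCOMPAT_NEW_FORMAT /* | ...older, stable flags... */)

	static int myfs_check_features(struct super_block *sb, u32 incompat)
	{
		u32 unsupported = incompat & ~MYFS_SUPPORTED_INCOMPAT;

		if (unsupported) {
			/* a qualified "no": the code ships, the format does not mount */
			pr_err("myfs: unsupported incompat features (0x%x)\n",
			       unsupported);
			return -EINVAL;
		}
		return 0;
	}

Dropping a flag from MYFS_SUPPORTED_INCOMPAT is then a one-line way for a distro or a maintainer to keep saying "no" to an on-disk feature while the implementation itself ships, which is essentially what SuSE did with btrfs.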
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-23 2:13 ` Greg KH 2014-05-23 3:03 ` Dan Williams @ 2014-05-23 14:02 ` Josh Boyer 1 sibling, 0 replies; 38+ messages in thread From: Josh Boyer @ 2014-05-23 14:02 UTC (permalink / raw) To: Greg KH; +Cc: ksummit-discuss On Thu, May 22, 2014 at 10:13 PM, Greg KH <greg@kroah.com> wrote: > On Thu, May 22, 2014 at 09:31:44AM -0700, Dan Williams wrote: >> On Thu, May 22, 2014 at 8:48 AM, Theodore Ts'o <tytso@mit.edu> wrote: >> > On Wed, May 21, 2014 at 04:03:49PM -0700, Dan Williams wrote: >> >> Simply, if an end user knows how to override a "gatekeeper" that user >> >> can test features that we are otherwise still debating upstream. They >> >> can of course also apply the patches directly, but I am proposing we >> >> formalize a mechanism to encourage more experimentation in-tree. >> >> >> >> I'm fully aware we do not have the tactical data nor operational >> >> control to run the kernel like a website; that's not my concern. My >> >> concern is with expanding a maintainer's options for mitigating risk. >> > >> > Various maintainers are doing this sort of thing already. For >> > example, file system developers stage new file system features in >> > precisely this way. Both xfs and ext4 have done this sort of thing, >> > and certainly SuSE has used this technique with btrfs to only support >> > those file system features which they are prepared to support. >> > >> > The problem is using this sort of gatekeeper is something that a >> > maintainer has to use in combination with existing techniques, and it >> > doesn't necessarily accelerate development by all that much. In >> > particular, if it has any kind of kernel ABI or file system format >> > implications, we need to make sure the interfaces are set in stone >> > before we can let it into the mainline kernel, even if it is not >> > enabled by default. (Consider the avidity that userspace application >> > developers can sometimes have for using even debugging interfaces such >> > as ftrace, and the "no userspace breakages" rule. So not only do you >> > have to worry about userspace applications not using a feature which >> > is protected by a gatekeeper, you also have to worry about premature >> > pervasive use of a feature such that you can't change the interface >> > any more.) >> >> I agree that something like this is prickly once it gets entangled >> with ABI concerns. But, I disagree with the speed argument... unless >> you believe -staging has not increased the velocity of kernel >> development? > > As the maintainer of drivers/staging/ I don't think it has increased the > speed of the development of other parts of the kernel at all. Do you > have numbers that show otherwise? > >> Neil already disabused me of the idea that a "gatekeeper" could be >> used to beneficial effect in the core kernel, and I can see it's >> equally difficult to use this in filesystems that need to be careful >> of ABI changes. However, nothing presented so far has swayed me from >> my top-of-mind concern, which is the ability to ship pre-production >> driver features in the upstream kernel. I'm thinking of it as >> "-staging for otherwise established drivers". > > The thing you need to realize is that the large majority of people who > would ever use that new "feature" will not until it ends up in an > "enterprise" kernel release.
And that will not be for another few > years, so while you think you got it all right, we really don't know who > is using it, or how well it works, for a few years. I don't entirely agree with that. Many of the non-enterprise distros are rebasing more frequently, and collectively their user bases are pretty large. Fedora, Arch, Ubuntu, and OpenSuSE get requests to enable new features all the time. If you consider the distros that have an enterprise downstream (e.g. Fedora, OpenSuSE), you even get people picking those up and using them as previews for the next EL release. So yes, EL kernels have massive user bases and they tend to adopt very slowly. However, as soon as code is in a released upstream kernel, a non-trivial number of people are going to be able to use it. If you factor in hot-topic things like containers (docker docker docker), those features are requested in the non-EL distros very rapidly (sometimes even before they're merged). Maybe Dan's case isn't hot-topic enough to match this, but there is certainly the possibility of early adoption and usage by a large number of users as soon as code lands. josh ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-21 15:35 ` Dan Williams 2014-05-21 23:06 ` Rafael J. Wysocki @ 2014-05-21 23:48 ` NeilBrown 2014-05-22 4:04 ` Dan Williams 1 sibling, 1 reply; 38+ messages in thread From: NeilBrown @ 2014-05-21 23:48 UTC (permalink / raw) To: Dan Williams; +Cc: ksummit-discuss [-- Attachment #1: Type: text/plain, Size: 8015 bytes --] On Wed, 21 May 2014 08:35:55 -0700 Dan Williams <dan.j.williams@intel.com> wrote: > On Wed, May 21, 2014 at 3:11 AM, NeilBrown <neilb@suse.de> wrote: > > On Wed, 21 May 2014 01:36:55 -0700 Dan Williams <dan.j.williams@intel.com> > > wrote: > > > >> On Wed, May 21, 2014 at 1:25 AM, NeilBrown <neilb@suse.de> wrote: > >> > On Wed, 21 May 2014 00:48:48 -0700 Dan Williams <dan.j.williams@intel.com> > >> > wrote: > >> > > >> >> On Fri, May 16, 2014 at 8:04 AM, Chris Mason <clm@fb.com> wrote: > >> >> > -----BEGIN PGP SIGNED MESSAGE----- > >> >> > Hash: SHA1 > >> >> > > >> >> > On 05/15/2014 10:56 PM, NeilBrown wrote: > >> >> >> On Thu, 15 May 2014 16:13:58 -0700 Dan Williams > >> >> >> <dan.j.williams@gmail.com> wrote: > >> >> >> > >> >> >>> What would it take and would we even consider moving 2x faster > >> >> >>> than we are now? > >> >> >> > >> >> >> Hi Dan, you seem to be suggesting that there is some limit other > >> >> >> than "competent engineering time" which is slowing Linux "progress" > >> >> >> down. > >> >> >> > >> >> >> Are you really suggesting that? What might these other limits be? > >> >> >> > >> >> >> Certainly there are limits to minimum gap between conceptualisation > >> >> >> and release (at least one release cycle), but is there really a > >> >> >> limit to the parallelism that can be achieved? > >> >> > > >> >> > I haven't compared the FB commit rates with the kernel, but I'll > >> >> > pretend Dan's basic thesis is right and talk about which parts of the > >> >> > Facebook model may move faster than the kernel. > >> >> > > >> >> > The Facebook model is pretty similar to the way the kernel works. The merge > >> >> > window lasts a few days and the major releases are every week, but > >> >> > overall it isn't too far off. > >> >> > > >> >> > One big difference is that we have a centralized tool for > >> >> > reviewing the patches, and once it has been reviewed by a specific > >> >> > number of people, you push it in. > >> >> > > >> >> > The patch submission tool runs the patch through lint and various > >> >> > static analysis to make sure it follows proper coding style and > >> >> > doesn't include patterns of known bugs. This cuts down on the review > >> >> > work because the silly coding style mistakes are gone before it gets > >> >> > to the tool. > >> >> > > >> >> > When you put in a patch, you have to put in reviewers, and they get a > >> >> > little notification that your patch needs review. Once the reviewers > >> >> > are happy, you push the patch in. > >> >> > > >> >> > The biggest difference: there are no maintainers. If I want to go > >> >> > change the calendar tool to fix a bug, I patch it, get someone else to > >> >> > sign off and push. > >> >> > > >> >> > All of which is my way of saying the maintainers (me included) are the > >> >> > biggest bottleneck. There are a lot of reasons I think the maintainer > >> >> > model fits the kernel better, but at least for btrfs I'm trying to > >> >> > speed up the patch review process and use patchwork more effectively. > >> >> > >> >> To be clear, I'm not arguing for a maintainer-less model.
We don't > >> >> have the tooling or operational data to support that. We need > >> >> maintainers to say "no". But, what I think we can do is give > >> >> maintainers more varied ways to say it. The goal: de-escalate the > >> >> merge event as a declaration that the code quality/architecture > >> >> conversation is over. > >> >> > >> >> Release early, release often, and with care merge often. > >> > > >> > I think this falls foul of the "no regressions" rule. > >> > > >> > The kernel policy is that once the functionality gets to users, it cannot be > >> > taken away. Individual drivers in 'staging' manage to avoid this rule > >> > because they are clearly separate things. > >> > New system calls and attributes in sysfs etc seem to be much harder to > >> > "partially" release. > >> > >> My straw man is something like the following for driver "foo": > >> > >> if (gatekeeper_foo_new_awesome_sauce) > >> do_new_thing(); > >> > >> Where setting gatekeeper_foo_new_awesome_sauce taints the kernel and > >> warns that there is no guarantee of this functionality being present > >> in the same form or at all going forward. > > > > Interesting idea. > > Trying to imagine how this might play out in practice.... > > > > You talk about "value delivered to users". But users tend to use > > applications, and applications are the users of kernel features. > > > > Will anyone bother writing or adapting an application to use a feature which > > is not guaranteed to hang around? > > Maybe they will, but will the users of the application know that it might > > stop working after a kernel upgrade? Maybe... > > > > Maybe if we had some concrete examples of features that could have been > > delayed using a gatekeeper. > > > > The one that springs to my mind is cgroups. Clearly useful, but clearly > > controversial. It appears that the original implementation was seriously > > flawed and Tejun is doing a massive amount of work to "fix" it, and this > > apparently will lead to API changes. And this is happening without any > > gatekeepers. Would it have been easier in some way with gatekeepers? > > ... I don't see how it would be, except that fewer people would have used > > cgroups, and then maybe we wouldn't have as much collective experience to > > know what the real problems were(?). > > > > I think that is the key. With a user-facing option, people will try it and > > probably cope if it disappears (though they might complain loudly and sign > > petitions declaring Facebook to be the anti-$DEITY). However, with kernel-internal > > options, applications are unlikely to use them without some > > expectation of stability. So finding the problems would be a lot harder. > > > > Which doesn't mean that it can't work, but it would be nice if we created some > > real-life examples to see how it plays out in practice. > > > > Biased by my background of course, but I think driver development is > more amenable to this sort of approach. For drivers the kernel is in > many instances the application. For example, I currently have in my > review queue a patch set to add SATA port multiplier support to > libsas. I hope I get the review done in time for merging it in 3.16. > But, what if I also had the option of saying "let's gatekeeper this > for a cycle". Users that care could start using it and reporting > bugs, and it would be clear that the implementation is provisional. > My opinion is that bug reports would attract deeper code review that > otherwise would not occur if the feature was simply delayed for a > cycle.
I can certainly see how this could work for driver features. We sometimes do that sort of incremental release with CONFIG options, but those are clumsy to work with. Having run-time enablement is appealing. What might the control interface look like? I imagine something like dynamic_debug, with a file that lists all the dynamic_config options. Writing some message to the file would enable the selected options, and so trigger the dynamic code editing required to enable them. I think it is probably worth trying - see what sort of take-up it gets. NeilBrown > > I think I also would have liked to use a gatekeeper to stage the > deletion of NET_DMA from the kernel. Mark it for removal, see who > screams, but still make it straightforward for such people to make > their case, with data, for why the value should stay. > > For the core kernel, which I admittedly have not touched much, are > there cases where an application wants to make a value argument to > users, but needs some kernel infrastructure to stand on? Do we > inadvertently stifle otherwise promising experiments by forcing > upstream acceptance before the experiment gets the exposure it needs? [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 38+ messages in thread
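As a rough illustration of the dynamic_debug-style interface Neil sketches, here is what a hypothetical debugfs "gatekeeper" file could look like. None of these names or entries exist in the tree, and a real implementation would also need locking and per-feature enable hooks; this is a sketch of the idea, not a proposal-quality patch.

	#include <linux/debugfs.h>
	#include <linux/kernel.h>
	#include <linux/module.h>
	#include <linux/seq_file.h>
	#include <linux/string.h>
	#include <linux/uaccess.h>

	/* one entry per maintainer-gated feature; both names are invented */
	static struct gk_entry {
		const char *name;
		bool enabled;
	} gk_entries[] = {
		{ "libsas.sata_pmp",		false },
		{ "foo.new_awesome_sauce",	false },
	};

	/* reading the file lists every option and its current state */
	static int gk_show(struct seq_file *s, void *unused)
	{
		int i;

		for (i = 0; i < ARRAY_SIZE(gk_entries); i++)
			seq_printf(s, "%s %s\n", gk_entries[i].name,
				   gk_entries[i].enabled ? "on" : "off");
		return 0;
	}

	static int gk_open(struct inode *inode, struct file *file)
	{
		return single_open(file, gk_show, NULL);
	}

	/* writing an option name enables it and taints the kernel */
	static ssize_t gk_write(struct file *file, const char __user *ubuf,
				size_t len, loff_t *ppos)
	{
		char buf[64], *name;
		int i;

		if (len >= sizeof(buf))
			return -EINVAL;
		if (copy_from_user(buf, ubuf, len))
			return -EFAULT;
		buf[len] = '\0';
		name = strim(buf);

		for (i = 0; i < ARRAY_SIZE(gk_entries); i++) {
			if (strcmp(name, gk_entries[i].name))
				continue;
			gk_entries[i].enabled = true;
			/* overriding the maintainer's "no" is recorded as a taint */
			add_taint(TAINT_USER, LOCKDEP_STILL_OK);
			return len;
		}
		return -EINVAL;
	}

	static const struct file_operations gk_fops = {
		.owner		= THIS_MODULE,
		.open		= gk_open,
		.read		= seq_read,
		.write		= gk_write,
		.llseek		= seq_lseek,
		.release	= single_release,
	};

	static int __init gk_init(void)
	{
		debugfs_create_file("gatekeeper", 0600, NULL, NULL, &gk_fops);
		return 0;
	}
	late_initcall(gk_init);

With something like this, "cat /sys/kernel/debug/gatekeeper" would show what is staged, and "echo foo.new_awesome_sauce > /sys/kernel/debug/gatekeeper" would be the run-time analogue of loading a staging module.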
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-21 23:48 ` NeilBrown @ 2014-05-22 4:04 ` Dan Williams 0 siblings, 0 replies; 38+ messages in thread From: Dan Williams @ 2014-05-22 4:04 UTC (permalink / raw) To: NeilBrown; +Cc: ksummit-discuss On Wed, May 21, 2014 at 4:48 PM, NeilBrown <neilb@suse.de> wrote: [..] > What might the control interface look like? > I imagine something like dynamic_debug, with a file that lists all the > dynamic_config options. Writing some message to the file would enable the > selected options, and so trigger the dynamic code editing required to enable them. Ooh, yes, an interface similar to dynamic debug control seems like a good fit. > I think it is probably worth trying - see what sort of take-up it gets. Thanks Neil! ...as everyone else moans, "don't encourage him". ;-) ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things 2014-05-16 2:56 ` NeilBrown 2014-05-16 15:04 ` Chris Mason @ 2014-05-21 7:22 ` Dan Williams 1 sibling, 0 replies; 38+ messages in thread From: Dan Williams @ 2014-05-21 7:22 UTC (permalink / raw) To: NeilBrown; +Cc: ksummit-discuss [ speaking for myself ] On Thu, May 15, 2014 at 7:56 PM, NeilBrown <neilb@suse.de> wrote: > On Thu, 15 May 2014 16:13:58 -0700 Dan Williams <dan.j.williams@gmail.com> > wrote: > >> What would it take and would we even consider moving 2x faster than we >> are now? > > Hi Dan, > you seem to be suggesting that there is some limit other than "competent > engineering time" which is slowing Linux "progress" down. Where "progress" is "value delivered to users", yes. > Are you really suggesting that? Yes, look at -staging as the first step down this path. Functionality delivered to users while "upstream acceptance" happens in parallel. I'm arguing for a finer-grained mechanism for staging functionality out to users. > What might these other limits be? Testing and audience. A simplistic example of moving slowly is merging a feature only after it has proven to have a large enough audience. Or the opposite, spending development resources to polish and merge a dead-on-arrival solution, but only discovering that fact once it is exposed to wider distribution. > Certainly there are limits to minimum gap between conceptualisation and > release (at least one release cycle), but is there really a limit to the > parallelism that can be achieved? Again, in general, I think there are aspects of "upstream acceptance" that can be done in parallel with delivering value to end users. ^ permalink raw reply [flat|nested] 38+ messages in thread
end of thread, other threads:[~2014-05-24 19:19 UTC | newest] Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-05-15 23:13 [Ksummit-discuss] [CORE TOPIC] [nomination] Move Fast and Oops Things Dan Williams 2014-05-16 2:56 ` NeilBrown 2014-05-16 15:04 ` Chris Mason 2014-05-16 17:09 ` Andy Grover 2014-05-23 8:11 ` Dan Carpenter 2014-05-16 18:31 ` Randy Dunlap 2014-05-21 7:48 ` Dan Williams 2014-05-21 7:55 ` Greg KH 2014-05-21 9:05 ` Matt Fleming 2014-05-21 12:52 ` Greg KH 2014-05-21 13:23 ` Matt Fleming 2014-05-21 8:25 ` NeilBrown 2014-05-21 8:36 ` Dan Williams 2014-05-21 8:53 ` Matt Fleming 2014-05-21 10:11 ` NeilBrown 2014-05-21 15:35 ` Dan Williams 2014-05-21 23:06 ` Rafael J. Wysocki 2014-05-21 23:03 ` Dan Williams 2014-05-21 23:40 ` Laurent Pinchart 2014-05-22 0:10 ` Rafael J. Wysocki 2014-05-22 15:48 ` Theodore Ts'o 2014-05-22 16:31 ` Dan Williams 2014-05-22 17:38 ` Theodore Ts'o 2014-05-22 18:42 ` Dan Williams 2014-05-22 19:06 ` Chris Mason 2014-05-22 20:31 ` Dan Carpenter 2014-05-22 20:56 ` Geert Uytterhoeven 2014-05-23 6:21 ` James Bottomley 2014-05-23 14:11 ` John W. Linville 2014-05-24 9:14 ` James Bottomley 2014-05-24 19:19 ` Geert Uytterhoeven 2014-05-23 2:13 ` Greg KH 2014-05-23 3:03 ` Dan Williams 2014-05-23 7:44 ` Greg KH 2014-05-23 14:02 ` Josh Boyer 2014-05-21 23:48 ` NeilBrown 2014-05-22 4:04 ` Dan Williams 2014-05-21 7:22 ` Dan Williams