From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35])
        by mail.linuxfoundation.org (Postfix) with ESMTPS id 68F72D5A
        for ; Mon, 10 Sep 2018 23:38:07 +0000 (UTC)
Received: from NAM02-SN1-obe.outbound.protection.outlook.com (mail-sn1nam02on0103.outbound.protection.outlook.com [104.47.36.103])
        by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 99F017D2
        for ; Mon, 10 Sep 2018 23:38:06 +0000 (UTC)
From: Sasha Levin
To: Steven Rostedt
Date: Mon, 10 Sep 2018 23:38:04 +0000
Message-ID: <20180910233803.GW16300@sasha-vm>
References: <20180905101710.73137669@gandalf.local.home>
        <20180907004944.GD16300@sasha-vm>
        <20180907014930.GE16300@sasha-vm>
        <20180907145437.GF16300@sasha-vm>
        <20180910194310.GV16300@sasha-vm>
        <20180910164519.6cbcc116@vmware.local.home>
In-Reply-To: <20180910164519.6cbcc116@vmware.local.home>
Content-Language: en-US
Content-Type: text/plain; charset="us-ascii"
Content-ID: <2CFD5838CD39D54EA10916A19C55A895@namprd21.prod.outlook.com>
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Cc: ksummit
Subject: Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] Bug-introducing patches

On Mon, Sep 10, 2018 at 04:45:19PM -0400, Steven Rostedt wrote:
>On Mon, 10 Sep 2018 19:43:11 +0000
>Sasha Levin wrote:
>
>> On Fri, Sep 07, 2018 at 08:52:40AM -0700, Linus Torvalds wrote:
>> >So this is what my argument really boils down to: the more critical a
>> >patch is, the more likely it is to be pushed more aggressively, which
>> >in turn makes it statistically much more likely to show up not only
>> >during the latter part of the development cycle, but it will directly
>> >mean that it looks "less tested".
>> >
>> >And AT THE SAME TIME, the more critical a patch is, the more likely it
>> >is to also show up as a problem spot for distros.
>> >Because, by definition, it touched something critical and likely subtle.
>> >
>> >End result: BY DEFINITION you'll see a correlation between "less
>> >testing" and "more problems".
>> >
>> >But THAT is correlation. That's not the fundamental causation.
>> >
>> >Now, I agree that it's correlation that makes sense to treat as
>> >causation. It just is very tempting to say: "less testing obviously
>> >means more problems". And I do think that it's very possibly a real
>> >causal property as well, but my argument has been that it's not at all
>> >obviously so, exactly because I would expect that correlation to exist
>> >even if there was absolutely ZERO causality.
>> >
>> >See what my argument is? You're arguing from correlation. And I think
>> >there is a much more direct causal argument that explains a lot of the
>> >correlation.
>>
>> Both of us agree that patches in later -rc cycles are buggier. We don't
>> agree on why, but I think that it actually doesn't matter much. For the
>> sake of the argument, let's go with what you're saying and assume that
>> they're buggier because they are more critical, tricky and subtle.
>>
>> So we have this time period of a few weeks where we know that we're
>> going to see tricky patches. What can we do to better deal with it?
>> Saying that we'll just see more bugs and we should just live with it
>> because it's "BY DEFINITION" is not really a good answer IMO.
>>
>> For stable trees, we can address that by waiting even longer before
>> picking up -rc5+ stuff, but that will move us further away from your
>> tree which is an undesirable effect.
>>
>> I don't have anything beyond guesses, but I don't think the
>> solution here is WONTFIX.
>>
>
>I think it may be more of CANTFIX.
>
>The bugs introduced after -rc5 are more subtle and harder to trigger. I
>(and I presume Linus, but he can talk for himself) don't believe that
>keeping it in linux-next any longer will help find them, unless the
>bots get better to do so.
>The problem is that these bugs are not going
>to be triggered until they get into the mainline kernel and perhaps not
>even until they get into the distros. We want to find them before that,
>but it's not until they are used in production environments that they
>will get found.

If you're fixing something in -rc8, which is, according to Linus, only
for *critical* fixes that are usually complex, you better have tested
that code before pushing it in. Is it on obscure hardware no one has
access to? I can't imagine what makes that bug critical then.

Otherwise, yes, it should be a requirement that a patch was reasonably
tested before being merged; this is even more true for those late -rc
critical fixes.

>The best we can do is make the automated testing of linux-next better
>such that there's less -rc5 patches that need to go in in the first
>place.

Being in -next is not only about running it through automatic bots.
Being on 0day means, in practice, "amount of days humans had to
review/test that code".

I didn't want to count days-in-next just to credit automatic testing,
but also as an indicator of how many eyeballs a commit attracted before
being merged.

>I do think that anything that goes into -rc5 or later should be tested
>by the developer and the 0day bot, to make sure they don't introduce
>some silly bug. But linux-next was mainly to deal with bugs caused by
>integration of various sub systems. But -rc5 fixes only care about
>integrating with mainline. And as Linus pointed out, when it gets into
>mainline, it will then be pulled into linux-next where it gets
>integrated with new code coming into the next merge window.

It would be nice if every bug fix coming in that late had a Tested-by:
tag. Isn't it a requirement that patches should be tested anyway?

Require that every patch be sent to lkml? Is it a big ask?

If the patches are so complex and subtle, require at least one
reviewed-by/acked-by?

--
Thanks,
Sasha